Large data tables missing?

mlauber71 · January 8, 2019, 6:34pm

Does the large dataset import fine and the problems only occur later, or do you also have problems when loading the data? If the import itself is the problem, it might help to read the file with the R library readr instead of the KNIME File Reader.

I know of no formal limit on the amount of data in KNIME, but what you can process depends on how powerful your system is. These points might help:

check out the links below with hints from KNIME and other information about performance
with large datasets from CSV it might be worth having a separate workflow that stores them in a KNIME table, or splitting up tasks into several workflows
check the memory handling of critical nodes
try to use the Parquet internal compression and see if that helps
then, when it comes to execution, you might check whether streaming is an option for you, so that only a few rows are processed at a time (of course, if you need to do joins that might not be possible)
(https://nodepit.com/workflow/public-server.knime.com%3A80%2F06_Control_Structures%2F01_Meta_Nodes_and_Wrapped_Nodes%2F01_Simple_Streaming_and_Wrapped_Nodes)
if you have a lot of large strings it could be possible to replace them with a dictionary and only deal with (long) integers - only if that fits your task

KNIME performance