Large data tables missing?

Is the large dataset imported first and the problems only occur later, or do you already experience problems when loading the data? If the import itself is the problem, it might help to use the R library readr instead of the KNIME File Reader.
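
For illustration, a minimal sketch of a fast CSV import with readr, for example inside a KNIME "R Source (Table)" node (the file path is a placeholder for your actual file):

```r
library(readr)

# readr's parser is typically much faster than base R for large files;
# the path is a placeholder
df <- read_csv("/path/to/large_file.csv")

# when used in a KNIME "R Source (Table)" node, hand the result back to KNIME
knime.out <- as.data.frame(df)
```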

I know of no formal limit on data size in KNIME, but in practice it depends on how much power your system can bring. These points might help:

  • check out the links below for hints from KNIME and other information about performance
  • with large datasets from CSV it might be worth having a separate workflow that stores them in a KNIME table, or splitting the tasks up into several workflows
  • check the memory policy of critical nodes (in a node's configuration you can choose whether tables are kept in memory or written to disk)
  • try using KNIME's internal Parquet compression and see if that brings any improvement
  • when it comes to execution, you might check whether streaming is an option for you, so that only a batch of rows is processed at a time (of course, if you need to do joins that might not be possible) - see the chunked-reading sketch after this list
    (https://nodepit.com/workflow/public-server.knime.com%3A80%2F06_Control_Structures%2F01_Meta_Nodes_and_Wrapped_Nodes%2F01_Simple_Streaming_and_Wrapped_Nodes)
  • if you have a lot of large, repetitive strings, it could be possible to replace them with a dictionary and deal only with (long) integers - only if that fits your task (a small sketch follows below)
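
As a rough illustration of the streaming idea outside KNIME, readr can process a CSV in chunks so that only a slice of the rows is in memory at any time (the path, the chunk size and the aggregated column "value" are placeholders):

```r
library(readr)

# runs once per chunk; only chunk_size rows are held in memory at a time;
# the column name "value" is a placeholder
summarise_chunk <- function(chunk, pos) {
  data.frame(rows = nrow(chunk), total = sum(chunk$value, na.rm = TRUE))
}

per_chunk <- read_csv_chunked(
  "/path/to/large_file.csv",
  DataFrameCallback$new(summarise_chunk),
  chunk_size = 100000
)

# combine the per-chunk partial results into the final aggregate
data.frame(rows = sum(per_chunk$rows), total = sum(per_chunk$total))
```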

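And a toy sketch of the dictionary idea: keep one lookup table of the distinct strings and work with integer codes instead (the example strings are made up):

```r
# replace repeated long strings with integer codes plus a lookup table
strings <- c("some-very-long-label-A", "some-very-long-label-B",
             "some-very-long-label-A")

dictionary <- unique(strings)            # one entry per distinct string
codes      <- match(strings, dictionary) # integer code for every row

codes               # 1 2 1 - cheap to store, join and compare
dictionary[codes]   # maps the codes back to the original strings
```
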
KNIME performance