Let’s say, hypothetically, I have 1 TB of data files and I want to mine the information in them. I am only interested in the first 50 GB of that 1 TB. Is there a stop function or setting that keeps the workflow from reading the full hard drive of data?
How is your data stored? There are different options depending on the file or DB format you are accessing. Those limits are generally set as a number of rows to read, not as a size of the imported file.
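If no built-in setting fits, the general idea can be sketched outside the workflow tool. This is only an illustration in plain Python, assuming the data is line-oriented text files read in a fixed order; the function name `read_lines_up_to` and its parameters are made up for this example and are not part of any particular product.

```python
from typing import Iterable, Iterator


def read_lines_up_to(paths: Iterable[str], byte_budget: int) -> Iterator[bytes]:
    """Yield whole lines from the files in order, stopping at a byte budget.

    Hypothetical helper: a line is only yielded if it fits entirely within
    the remaining budget, so the output never exceeds byte_budget bytes.
    """
    consumed = 0
    for path in paths:
        # Binary mode so len(line) is the real number of bytes on disk.
        with open(path, "rb") as handle:
            for line in handle:
                if consumed + len(line) > byte_budget:
                    # Budget exhausted: stop here, later files are never opened.
                    # (The OS may still have buffered a little past this point.)
                    return
                consumed += len(line)
                yield line
```

For the 50 GB case the budget would be `50 * 1024**3`. To cap by rows instead of bytes, which is closer to what most reader settings offer, the same generator can be wrapped in `itertools.islice(lines, max_rows)`.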