I’m using the Microsoft Authentication node to connect to blob storage, and I need to loop through files in directories, so I’m using 5 loops at the start.
The question is: once the loop starts, the CSV Reader needs the same authentication I used before the loops to read the files. What is the best way to pass the authentication (SAS) through the workflow, or do I need a separate auth node at each loop point?
Hi,
I hope I understand your problem correctly but you can simply connect the output port of the Azure Blob Storage Connector with the input port of the CSV Reader to use the same connection.
One more suggestion: you do not need to use a loop to read multiple CSV files. The new reader nodes can read multiple files out of the box, which is usually faster than using a loop construct in KNIME. For more details see the File Handling Guide.
Bye
Tobias
I think that was exactly what I was looking for, and thanks for the reminder that the CSV Reader can now read all files in a directory. Old habits die hard.
I’m always curious about how to lay out workflows better and about better ways of doing things in general. Looking forward to the next KNIME event I can attend to learn more as well.
@tobias.koetter quick follow-up if you don’t mind -
Ran into an issue where the amount of data involved was just too much for the system, so I need to filter down the CSV Reader. I’ve been searching for information on using a variable in the CSV File Filter area, but coming up blank.
I have file names like:
2022-04-22___0438d7a8e8f3.csv
And there will be 50+ per day.
I want to use the Filter options in CSV Reader to look for files with “Today’s Date - 1 day” as a filter. So that on 4/22, it will read all files with 4/21 in the title, essentially working with only yesterday’s data.
I have a feeling I’ll need to go back to a loop, so that I can use the List Files node, filter that table down to the date I want, and feed those records back to the CSV Reader?
Hello @serendipitytech,
not sure whether or not you figured out how to solve it, so I’ll just post it:
Step 1: Create a suitable needle. I’ll assume you always want yesterday’s date; otherwise you’ll need to adapt this:
Create date “range”, using current execution date
Shift it to yesterday
Turn it into string
I’m already appending the catch-all RegEx bit at this point; you can also use a separate String Manipulator.
Using Wildcards works as well.
Change to nice column names (optional) and turn the needle into a Flow Variable
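For anyone who prefers to see the logic of Step 1 written out, here is a minimal Python sketch of the same idea. It is not what the KNIME nodes do internally, just an illustration. The function name `yesterday_needle` and the `.*` catch-all suffix are my own assumptions; the date format follows the file names quoted above.

```python
from datetime import date, timedelta


def yesterday_needle(today=None, suffix=".*"):
    """Build a regex needle such as '2022-04-21.*' for yesterday's files.

    `today` defaults to the current execution date. `suffix` is an assumed
    catch-all regex part that matches the rest of the file name.
    """
    if today is None:
        today = date.today()
    # Shift the date back by one day.
    yesterday = today - timedelta(days=1)
    # Turn it into a string (ISO format: YYYY-MM-DD) and append the catch-all.
    return yesterday.isoformat() + suffix


if __name__ == "__main__":
    # On 2022-04-22 the needle targets files from 2022-04-21.
    print(yesterday_needle(date(2022, 4, 22)))
```

Month and year boundaries are handled by the date arithmetic, so 1 March rolls back to the last day of February.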
Step 2: Configure the CSV Reader
Just use the Flow Variable as shown in the screenshot (I’m not making you read an essay on where to find it). Things I noticed:
You have to enter a valid dummy needle
The preview in the configuration window uses the dummy needle, not the value from the Flow Variable
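To sanity-check a needle before wiring it into the reader, something like the following Python sketch can help. It only mimics the filtering idea and makes no claim about how the CSV Reader matches names internally; in particular, matching the whole file name is my assumption. The helper names and the second file name in the example are hypothetical.

```python
import fnmatch
import re


def filter_by_regex(names, needle):
    """Keep file names that match the regex needle as a whole (assumed behaviour)."""
    pattern = re.compile(needle)
    return [name for name in names if pattern.fullmatch(name)]


def filter_by_wildcard(names, needle):
    """Keep file names that match a wildcard needle such as '2022-04-21*'."""
    return [name for name in names if fnmatch.fnmatchcase(name, needle)]


if __name__ == "__main__":
    files = [
        "2022-04-22___0438d7a8e8f3.csv",  # file name from the question
        "2022-04-21___aaaaaaaaaaaa.csv",  # hypothetical file from the day before
    ]
    print(filter_by_regex(files, r"2022-04-21.*"))
    print(filter_by_wildcard(files, "2022-04-21*"))
```

Both calls should keep only the file dated 2022-04-21, which is a quick way to confirm that the regex and the wildcard variant of the needle select the same files.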