Combining source files & Row Filters

I have two problems I need help with:

1.  Multiple source files.  I have a number of source files (e.g. 36 text files, one per month, "|" delimited, approximately 4 GB each) that all contain the same columns (variables).  Is there a way to bring many or all of them into the workspace without using "File Reader" for each file?  I tried "List Files" and was able to bring all the files into the List Files function; however, I was not able to figure out how to complete the process and bring the data into my workspace.

Is there a way to bring multiple files with the same columns into a workspace at the same time?

 

2.  Row Filters.  I would like to select a number of records from a large dataset based on the values within a given column.  I was able to successfully use "Row Filters" when there was only one value or the values were in a range.  However, I would like to perform the filter/query on approximately 20 values at the same time, and most are in a series/range.  Is there a way to do this within "Row Filters" and/or another function?

The data I am using is a very large text file.

 

Thanks,

 

 

1) You will need to read these files using a loop.  The workflow should look like:

List Files > TableRow to Variable Loop Start > File Reader > Loop End

The trick is to configure the File Reader with the file URL flow variable, so that each loop iteration reads a different file.
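
For readers who think in code, the loop above is roughly equivalent to the following Python sketch. It only illustrates the pattern (list the files, read each one with the same settings, append the rows) and is not what KNIME runs internally. The function name, the `*.txt` pattern, the UTF-8 encoding and the assumption that every file starts with the same header row are all hypothetical.

```python
import csv
import glob
import os


def read_all(directory, pattern="*.txt", delimiter="|"):
    """Yield rows (as dicts) from every matching file, one file after another."""
    # "List Files": collect the paths; sorting keeps the monthly files in a stable order.
    paths = sorted(glob.glob(os.path.join(directory, pattern)))
    for path in paths:  # "Loop Start": one iteration per file
        with open(path, newline="", encoding="utf-8") as handle:
            # "File Reader": identical settings each time, only the path changes.
            reader = csv.DictReader(handle, delimiter=delimiter)
            for row in reader:
                # "Loop End": rows from all files end up in one stream.
                yield row
```

Because the function is a generator, rows are produced one at a time, so files of several GB are never held in memory in full.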

 

2) Have you seen the Rule-based Row Filter?  It lets you filter on multiple criteria in a single scan of your data.

 

Thank you for your help.  I apologize for being unfamiliar with boolean search.  I tried several different iterations of the statement below:

Diagnosis Code = 366.10 OR 366.16 OR 366.17 OR 366.19 OR V43.1

"Diagnosis Code" is the column and the values to the right are some of the positive values I am looking for...

and I kept getting the following error message, "Invalid settings: line 5, col o: Expected a number, boolean, string, column, a table property or flow variable reference."

 

I am sure I am doing something very dumb...

Hello,

The syntax you are looking for is the following (if you are using KNIME 2.9 or above):

$Diagnosis Code$ = 366.10 OR $Diagnosis Code$ = 366.16 OR $Diagnosis Code$ = 366.17 OR $Diagnosis Code$ = 366.19 OR $Diagnosis Code$ = $V43.1$ => TRUE

(Here I am assuming the last part is a reference to a column named V43.1; if it is a flow variable, it should be $${DV43.1}$$.) Or use the following expression, which is IMHO easier to read:

$Diagnosis Code$ IN (366.10, 366.16, 366.17, 366.19, $V43.1$) => TRUE

(Same assumption.)

Be careful, though: if the table shows only the first 2-3 digits of each number, the stored values might not be exactly the numbers you are seeing. It is better to copy-paste the values into the expression, or to convert the column to Strings before filtering and use them like this:

$Diagnosis Code$ IN ("366.1", "366.16", "366.17", "366.19", $V43.1$) => TRUE
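
The string-based variant amounts to a set membership test. Here is a minimal Python sketch of that logic, only to show why comparing text is more predictable than comparing floating-point numbers. The function name and the sample rows are made up, and V43.1 is treated here as a literal code value rather than a column reference, which is an assumption.

```python
# Wanted values written as text, exactly as they would appear after conversion to String.
WANTED = {"366.1", "366.16", "366.17", "366.19", "V43.1"}


def keep_rows(rows, column="Diagnosis Code", wanted=WANTED):
    """Return only the rows whose value in `column` is one of the wanted strings."""
    # String comparison is exact: "366.1" matches only the text "366.1".
    return [row for row in rows if row.get(column) in wanted]


# Hypothetical sample data for illustration only.
sample = [
    {"Diagnosis Code": "366.16"},
    {"Diagnosis Code": "250.00"},
    {"Diagnosis Code": "V43.1"},
]
print(keep_rows(sample))
```

Note that the text "366.10" would not match "366.1" in such a comparison, so the values in the list have to be written exactly as they appear in the converted column.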

Hope this helps. Cheers, gabor

PS.: If you hit Ctrl+Space, you will get code completion in the editor.