File-Joining + Defining Criteria

Hello everyone,

I am not sure whether KNIME is the right application to solve my problem, since I am completely new to data mining.

I would like to evaluate several txt files (over 100) against specific criteria. Rows that match one of those criteria (e.g. specific words) need to be exported to an Excel file.

I thought of combining those txt files into a database and then defining a set of criteria to search and explore.

Is KNIME usable for this application? And if so, does anyone know of suitable nodes for it?

Sorry for my newbie question, but I am completely new to this field and need the data to start my thesis.

Many thanks in advance!

Kind regards

Enrique 

Hi Enrique,

I believe this task can be achieved using some of the Text Processing nodes, part of the KNIME Labs extensions, but a lot depends on the complexity of your criteria and also on the size of the TXT files you need to ingest. In general there is always a way, if not many different alternatives, to accomplish a task with KNIME.

Have a look here as a starting point: https://tech.knime.org/knime-text-processing

If you could be more specific on your use case it would become easier to help you out or at least point you in the right direction.

Cheers,
Marco.

Hi Marco,

Thank you very much for your reply! The criteria will consist of certain words and commands. I want to search those files for certain tasks in a cockpit, so I want to define two groups of keywords: the first to indicate a manual task (like “push, press…”) and the second with typical cockpit devices such as the yoke, certain buttons, etc.

If a manual task within the cockpit is found, it should be extracted to an Excel file together with the document number.

The files are .pav files, each between 1 and 140 KB in size.

Thanks again for your help! 

Kind regards,

Enrique 

I am not familiar with .pav files, but as long as you can convert them to text and load them into KNIME you should be able to accomplish your task.

You may not even need the Text Processing nodes for it. Regular filter nodes with smart use of Regular Expressions/Pattern Matching will probably do it.
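To make the pattern-matching idea concrete outside KNIME, here is a minimal Python sketch. It assumes a line counts as a hit only when it contains a word from both keyword groups, and that each document is available as plain text keyed by its document number. The `documents` mapping, the function names, and the prefix-style matching are illustrative assumptions, not anything KNIME provides.

```python
import re

# Keyword groups taken from the examples in this thread; extend as needed.
MANUAL_TASKS = ["push", "press"]
COCKPIT_DEVICES = ["yoke", "button"]


def build_pattern(words):
    """Compile a case-insensitive regex matching words that start with any keyword."""
    alternatives = "|".join(re.escape(w) for w in words)
    return re.compile(r"\b(?:" + alternatives + r")\w*", re.IGNORECASE)


TASK_RE = build_pattern(MANUAL_TASKS)
DEVICE_RE = build_pattern(COCKPIT_DEVICES)


def find_manual_tasks(documents):
    """Return (document_id, line) pairs where a line names both a task and a device.

    `documents` is a hypothetical mapping of document number -> full text.
    """
    hits = []
    for doc_id, text in documents.items():
        for line in text.splitlines():
            # Assumption: a hit needs a keyword from BOTH groups on the same line.
            if TASK_RE.search(line) and DEVICE_RE.search(line):
                hits.append((doc_id, line.strip()))
    return hits
```

Because the keywords are matched as word prefixes, "press" also catches "pressing" but would catch "pressure" too, so the lists may need tuning. The same regular expressions could be tried in a KNIME filter node that supports regex matching.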

Just give it a try and post here again if you need additional help.

Cheers,
Marco.

 

Thanks Marco! I did it with some filters and it worked perfectly fine. 

Since I used several "Index Query" nodes, I want to join their outputs together and then export the result to Excel.

I think I can’t use the "Joiner" node because the RowIDs of the “Index Query” output tables no longer match.

I just want to have the columns of the different output tables next to each other in one file. Do you have a suggestion for a suitable node?

Thanks in advance! 

Look at the Column Appender node; that should do it.
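For intuition, appending columns by position (rather than joining on RowIDs) can be sketched in plain Python as below. This only illustrates the general idea and is not how the KNIME node is implemented. The `append_columns` helper and the list-of-rows table layout are assumptions made for the example.

```python
from itertools import zip_longest


def append_columns(*tables):
    """Place tables side by side by row position, ignoring any row identifiers.

    Each table is a list of rows, and each row is a list of cell values.
    Shorter tables are padded with None so every output row has the same width.
    """
    # Column count of each table, taken from its first row (0 if the table is empty).
    widths = [len(t[0]) if t else 0 for t in tables]
    combined = []
    for rows in zip_longest(*tables):
        out = []
        for row, width in zip(rows, widths):
            # A missing row means this table ran out; pad with empty cells.
            out.extend(row if row is not None else [None] * width)
        combined.append(out)
    return combined
```

Matching by position only makes sense when the rows of each table are in a meaningful, consistent order. If they are not, a join on a shared column such as the document number would be the safer choice.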

Cheers,
Marco.