Data preparation for text processing

Good day everybody,

I am using KNIME for the first time. I'm exploring text processing, and I want to process some Spanish Facebook and Twitter posts that I previously downloaded using some extraction techniques. I have a file with my data, but I can't even start to process it, because I don't know what the correct file type and format are.

Currently, I have an xls file with the following columns: code (the number of the post), the text of the post, the name of the Facebook page or Twitter account the data comes from, user ID, author nickname, number of likes, number of comments, number of shares, number of retweets, date, and type of post.

Is it possible to process a file in this format (and if so, how?), or do I need to follow some data requirements? If so, which ones?

Thank you in advance for your cooperation

Hi David,

You can use the XLS Reader node to read xls files. Then use the Strings To Document node to convert selected string columns (text, title, etc.) into documents. These document cells can then be processed with the text processing nodes. Make sure that all of the text you want to process is contained in the documents. For CSV-formatted files, use the File Reader node instead.
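If you want to sanity-check the file before loading it into KNIME, a small script can help. The following is only a minimal Python sketch that runs outside KNIME and is not part of the workflow. It assumes the sheet has been exported to CSV, and the column names (`code`, `text`, `source`, and so on) are hypothetical labels for the columns described in the question. It checks that the expected columns are present and drops rows without text, since those cannot be turned into documents.

```python
import csv
import io

# Hypothetical column names mirroring the layout described in the question.
COLUMNS = ["code", "text", "source", "user_id", "author",
           "likes", "comments", "shares", "retweets", "date", "type"]


def load_posts(csv_text, text_column="text"):
    """Parse CSV text and keep only rows whose text column is non-empty."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    posts = []
    for row in reader:
        text = (row[text_column] or "").strip()
        if text:  # rows without text cannot become documents
            row[text_column] = text
            posts.append(row)
    return posts


# Made-up sample data, for illustration only.
SAMPLE = (
    "code,text,source,user_id,author,likes,comments,shares,retweets,date,type\n"
    '1,"Hola, buenos días",PaginaEjemplo,u1,ana,3,1,0,0,2016-01-01,status\n'
    "2,,PaginaEjemplo,u2,luis,0,0,0,0,2016-01-02,photo\n"
)

posts = load_posts(SAMPLE)
print(len(posts), posts[0]["text"])
```

With real data you would read the exported file's contents and pass them to `load_posts`. The metadata columns (likes, shares, date, and so on) stay in each row, so they remain available alongside the text.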

Cheers, Kilian
