I want to know which preprocessing nodes are typically used in KNIME. I need it for my thesis, and it's very urgent. I would be glad if someone could share their experience with that.
I know Column Filter, Row Filter and discretization, but I don't know about further preprocessing.
Obviously I cannot tell you what’s typical or atypical in any general way, but for my uses it’s the following:
Text files “polluted” with tabs, commas and semicolons, especially ones saved from Excel, are first cleaned up with batch replacements in PSPad or Notepad++. They are then read into KNIME as plain strings with no delimiter set and split up using the Cell Splitter and Row Splitter nodes. After that come string replacements and format conversions (to numbers and/or dates). GroupBy, Value Counter and Sorter nodes are then used both to analyse patterns and to create the final aggregates.
Those are just off the top of my head, though, and pretty much business data-oriented. If you’re into life sciences your needs may differ…
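If it helps to see the same sequence of steps outside KNIME, here is a rough Python sketch of it: read lines as plain strings, split on mixed delimiters, clean and convert the cells, then group, count and sort. This is only an analogy, not what the KNIME nodes do internally. The sample lines, the three-column layout (date, region, amount) and the function names are all made up for illustration.

```python
import re
from collections import defaultdict
from datetime import datetime

# Hypothetical raw lines, read as plain strings with mixed delimiters
# (tabs, commas, semicolons), roughly like a messy spreadsheet export.
RAW_LINES = [
    "2021-01-05;North\t12.5",
    "2021-01-06,south;7",
    "2021-01-07\tnorth, 3.5",
]


def split_cells(line):
    """Split one raw line on tab, semicolon or comma and trim each cell."""
    return [cell.strip() for cell in re.split(r"[\t;,]", line)]


def parse_row(line):
    """Convert the assumed (date, region, amount) cells to typed values."""
    date_text, region, amount_text = split_cells(line)
    return (
        datetime.strptime(date_text, "%Y-%m-%d").date(),  # string -> date
        region.lower(),                                   # normalise spelling
        float(amount_text),                               # string -> number
    )


def aggregate(rows):
    """Count rows and sum amounts per region, largest total first."""
    counts = defaultdict(int)
    totals = defaultdict(float)
    for _date, region, amount in rows:
        counts[region] += 1
        totals[region] += amount
    summary = [(region, counts[region], totals[region]) for region in counts]
    return sorted(summary, key=lambda item: item[2], reverse=True)


if __name__ == "__main__":
    rows = [parse_row(line) for line in RAW_LINES]
    for region, count, total in aggregate(rows):
        print(region, count, total)
```

Very roughly, `split_cells` plays the role of the cell splitting step, `parse_row` covers the string replacement and format conversion, and `aggregate` combines what the grouping, counting and sorting nodes would give you.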
Hm, this information is useful to me. Thanks a lot for your support. Your preprocessing is more on the raw-data side; I will keep that in mind. So far I have used things like discretization.