Hi, I am new to KNIME and I have a question about data preprocessing. If I want to count ‘new york’ and ‘new york city’ as one city in a column, which node should I use in KNIME? Thank you.
You can use the Cell Replacer node to replace all “new york city” and similar values with “new york”, or vice versa.
The first step should be cleaning your data. The Cell Replacer node is indeed a good option. You could also use the String Manipulation node, which lets you use regular expressions to match the different variations of one city's name and replace them with the desired name.
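To make the regular-expression idea concrete, here is a minimal sketch in plain Python rather than in KNIME itself. The pattern, the function name `normalize_city`, and the sample values are assumptions for illustration only; inside KNIME the same pattern would go into a regex-replace function of the String Manipulation node (I believe it is called `regexReplace`, but check the node's function list).

```python
import re

# Hypothetical pattern: "new york", optionally followed by "city",
# ignoring case and surrounding whitespace.
NYC_PATTERN = re.compile(r"^\s*new\s+york(\s+city)?\s*$", re.IGNORECASE)

def normalize_city(name: str) -> str:
    """Return the canonical name for New York variants, else the trimmed input."""
    if NYC_PATTERN.match(name):
        return "new york"
    return name.strip()

# Made-up sample values standing in for the city column.
cities = ["new york", "New York City", " new york city ", "boston"]
print([normalize_city(c) for c in cities])
# ['new york', 'new york', 'new york', 'boston']
```

Anchoring the pattern with `^` and `$` keeps it from touching other values that merely start with "new york".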
If I’m not mistaken, you want to count the occurrences of each city afterwards. For that, the GroupBy node is a good choice. Just group by the city column and select Count as the aggregation method for an arbitrary column.
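For anyone who wants to see what the group-and-count step computes, here is a rough Python equivalent. It is only a sketch of the logic, not the GroupBy node itself, and the sample column is made up and assumed to be cleaned already.

```python
from collections import Counter

# Made-up city column, assumed to be cleaned already.
cities = ["new york", "new york", "boston", "new york"]

# One entry per distinct city with its row count,
# like grouping by the city column with Count as aggregation.
counts = Counter(cities)

print(counts["new york"])  # 3
print(counts["boston"])    # 1
```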