Hi All,
I have a dataset (~40 million rows) with a column that contains a list of string values separated by semicolons. For each possible value there is a corresponding blank column (496 blank columns in total). The values in the list and the names of the columns are identical. Unfortunately, the values in these lists are names, so I will not post them here.
I am trying to parse through the list, and mark each corresponding column with a 1 if the name is present in the list, or a 0 if it is not.
I am fairly new to KNIME, having used it very infrequently over the last year or so. Apart from using 496 Rule Engine nodes, each hard-coded to look for a specific value, I cannot find a way to do this. My hard-coded approach works for subsets of the data, but is not practical on the whole set due to time and memory constraints.
Is there some way I could use a Python or Java Snippet node to accomplish this? E.g. parse the list into an array of values, then, for each value, assign a 1 to the column with the same name?
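To make the idea concrete, here is a rough pandas sketch of what I imagine the Python node doing. It assumes the table reaches the script as a pandas DataFrame, that there is no whitespace around the semicolons, and it uses made-up names throughout ("names" for the list column, "Alice", "Bob", etc. for the target columns, and mark_columns for the helper), since I cannot share the real ones.

```python
import pandas as pd

def mark_columns(df, list_col, target_cols, sep=";"):
    """Fill each target column with 1 if its name appears in list_col, else 0."""
    # One indicator column per distinct value found in the separated lists.
    dummies = df[list_col].str.get_dummies(sep=sep)
    # Keep only the expected columns; names that never occur become all zeros.
    dummies = dummies.reindex(columns=target_cols, fill_value=0).astype("int8")
    # Replace the existing blank columns with the filled-in ones.
    out = df.drop(columns=[c for c in target_cols if c in df.columns])
    return pd.concat([out, dummies], axis=1)

# Tiny made-up example (placeholder names, not my real data).
sample = pd.DataFrame({
    "names": ["Alice;Bob", "Carol", None],
    "Alice": None, "Bob": None, "Carol": None, "Dave": None,
})
result = mark_columns(sample, "names", ["Alice", "Bob", "Carol", "Dave"])
print(result)
```

On the full 40 million rows this would presumably have to run in chunks rather than on the whole table at once, which is part of what I am unsure how to set up.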
Thank you in advance,
Chris