I am looking to group string values in a column so that I have an additional column containing a parent string. For example:
Column 1 New Column
Does "GroupBy" work for this?
What precisely are you aiming to do?
GroupBy provides you with summarized statistics for any given group based on your observations.
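To illustrate the point outside KNIME, here is a minimal Python sketch of what a group-by style aggregation produces. The sample rows and the `group_summary` helper are made up for illustration and are not part of KNIME. Note that the result has one summary row per existing value; it does not assign a parent label to each row.

```python
from collections import defaultdict

# Hypothetical sample rows: (product, units_sold)
rows = [
    ("Hair_Brush", 3),
    ("Toothbrush", 5),
    ("Hair_Brush", 2),
]

def group_summary(rows):
    """Return {key: (row_count, total)} for each distinct key."""
    counts = defaultdict(int)
    totals = defaultdict(int)
    for key, value in rows:
        counts[key] += 1
        totals[key] += value
    return {key: (counts[key], totals[key]) for key in counts}

summary = group_summary(rows)
print(summary)
```

The output is keyed by the values already present in the column, which is why an aggregation alone will not create the new parent column.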
Do you want to predict the parent class from an existing set of parent-child relationships? (See the data mining nodes.)
Another way to model relationships is to build a tree or a graph (edges and vertices); see the network-related nodes.
Thank you for your response.
I have a very large dataset I'm working with (too big to manipulate in Excel). One character string variable has about 30 unique values. I want to create a new variable where the 30 values are grouped into only 5 values (e.g., "Hair_Brush" and "Toothbrush" would both map to "Brushes" in the new variable). This would allow me to look at my data at a higher level, should I choose to do so.
In Excel I would just add a new column, create a mapping file, then use VLOOKUP to map the 30 unique values to the 5 parent values.
So I'm looking for a node where I can set up mapping rules to create this new variable.
Rule engine maybe?
The equivalent of VLOOKUP is the Joiner node with a Left or Right Outer Join.
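For readers who want to see the left-join lookup logic spelled out, here is a rough Python sketch. It is not KNIME code; the `left_join` helper, the column names `product` and `parent`, and the "Shampoo", "Hair_Care", and "Soap" values are assumptions added for illustration. Only the "Hair_Brush"/"Toothbrush" to "Brushes" mapping comes from this thread.

```python
# Hypothetical mapping table: child value -> parent value
mapping = {
    "Hair_Brush": "Brushes",
    "Toothbrush": "Brushes",
    "Shampoo": "Hair_Care",  # assumed extra entry, for illustration only
}

# Hypothetical data rows, each with a "product" column
data = [
    {"product": "Hair_Brush"},
    {"product": "Toothbrush"},
    {"product": "Soap"},  # has no entry in the mapping table
]

def left_join(rows, lookup, key, new_column):
    """Append new_column to every row; unmatched rows get None."""
    joined = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows stay unchanged
        new_row[new_column] = lookup.get(row[key])
        joined.append(new_row)
    return joined

result = left_join(data, mapping, key="product", new_column="parent")
for row in result:
    print(row)
```

As with a left join, every original row is kept, and rows without a match in the mapping table end up with a missing value in the new column.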
You should have a look at the Cell Replacer node:
Create a table with your groupings (which is what you did in your first post) and then apply the Cell Replacer to your real data with the "Append new column" option checked.