Convert a list of coordinates to useful data for ML

Hi Community,

I need some ideas on how to turn this long list of coordinates (x, y, z) into data that I can use for ML. I have attached a screenshot. Thanks for your help.


Hi,
I would first split the individual rows in each cell using the Cell Splitter node. Then you can filter out the “2” in the first row. After that, you should have one row per set of coordinates instead of everything bunched together in one cell as it is now. Finally, you can extract the individual numbers using a Regex Split node. The regex should be something like this: (-?[0-9]+)\s+(-?[0-9]+\.[0-9]+)\s+(-?[0-9]+\.[0-9]+)\s+(-?[0-9]+\.[0-9]+)\s+(-?[0-9]+\.[0-9]+). It means "one number without a decimal point, then four numbers with a decimal point, separated by one or more whitespace characters". The Regex Split node takes the text matched by each group in parentheses () and creates a new column from it. Now you only need the String To Number node to convert the numbers from text to a double number type.
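To see what the regex captures outside of KNIME, here is a minimal Python sketch that applies the same pattern to one row. The sample row is made up for illustration (the real data is only visible in the screenshot), and `split_row` is a hypothetical helper name, not a KNIME function.

```python
import re

# Same pattern as above: one integer, then four decimal numbers,
# separated by one or more whitespace characters.
ROW_PATTERN = re.compile(
    r"(-?[0-9]+)\s+(-?[0-9]+\.[0-9]+)\s+(-?[0-9]+\.[0-9]+)"
    r"\s+(-?[0-9]+\.[0-9]+)\s+(-?[0-9]+\.[0-9]+)"
)


def split_row(row):
    """Return [int, float, float, float, float], or None if the row does not match."""
    match = ROW_PATTERN.fullmatch(row.strip())
    if match is None:
        return None
    groups = match.groups()
    # Each parenthesised group becomes one value (one column in KNIME terms).
    return [int(groups[0])] + [float(g) for g in groups[1:]]


# Made-up sample row, assumed to look roughly like the rows in the screenshot.
sample = "1  0.5000 -1.2500  3.0000  0.7500"
print(split_row(sample))  # [1, 0.5, -1.25, 3.0, 0.75]
print(split_row("2"))     # None: a lone header value does not match the pattern
```

Rows that do not fit the pattern come back as None here, which is a rough stand-in for filtering out the “2” row beforehand.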
Kind regards,
Alexander

Hi Alexander,

Thanks for the suggestions. After splitting the cells, I obtained n columns containing the numbers. I normalized the values and filled in the missing ones, using either a constant value of 0 or the mean. I used this data for an RF workflow. The results are not great, but I can work with them. I was just curious about why you suggested using the Regex Split node. Could you explain it to me? Thanks!
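For readers working outside KNIME, the cleaning step described above might look roughly like the following Python sketch. It assumes mean imputation followed by min-max scaling to [0, 1]; the post does not say which normalization or which order was used, so both are assumptions, and the function names and sample column are hypothetical.

```python
def impute_mean(column):
    """Replace missing values (None) with the mean of the values that are present."""
    present = [v for v in column if v is not None]
    # Fall back to 0.0 if the whole column is missing (assumption for this sketch).
    mean = sum(present) / len(present) if present else 0.0
    return [mean if v is None else v for v in column]


def min_max(column):
    """Scale values linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no spread, so map everything to 0.0.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


# Made-up column with one missing value.
column = [1.0, None, 3.0]
filled = impute_mean(column)
scaled = min_max(filled)
print(filled)  # [1.0, 2.0, 3.0]
print(scaled)  # [0.0, 0.5, 1.0]
```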

Hi,
Glad it works! Now that you mention it, I think the Regex Split is a bit overkill. You can just use the Cell Splitter twice: once for the rows in a cell and then another time for the columns in a cell. Much easier!
The Regex Split node is useful for extracting certain parts of a text where the Cell Splitter is not powerful enough.
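As a rough Python analogy for using the Cell Splitter twice: split the cell on line breaks first, then split each line on whitespace. The cell content below is invented for illustration, and `parse_cell` is a hypothetical name. The sketch also assumes the header row (the “2”) is the only line with a single value.

```python
def parse_cell(cell):
    """Split a multi-line cell into rows, then each row into numeric columns."""
    rows = []
    for line in cell.splitlines():   # first split: rows within the cell
        parts = line.split()         # second split: columns within the row
        if len(parts) < 2:
            # Skip blank lines and a single-value header such as "2" (assumption).
            continue
        rows.append([float(p) for p in parts])
    return rows


# Made-up cell content, assumed to resemble the screenshot.
cell = "2\n1 0.5 -1.25 3.0 0.75\n2 1.0 2.0 -3.5 0.25"
print(parse_cell(cell))
# [[1.0, 0.5, -1.25, 3.0, 0.75], [2.0, 1.0, 2.0, -3.5, 0.25]]
```

Unlike the regex version, this does not check the shape of each row, which is why it is simpler but also less strict.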
Kind regards,
Alexander
