I am trying to convert the strings “Female” and “Male” to numbers. I have tried the String to Number node but I get an error “missing value for string”.
I tried Category to Number and the Rule Engine but I can’t seem to get it to convert. Any help appreciated.
The String to Number node may not be the right choice for the problem you want to solve. It is for converting numbers stored in a string type column, not for assigning numbers to categories.
Can you share an example? Then someone can surely help you out.
Welcome to KNIME Forum.
If your string column only contains “Female” and “Male” there should be no issue with the Category to Number node. The Category to Number node fails if there are too many distinct values in your string column. So do you really have a limited number of different values in your string column? If so, please upload a sample workflow with the Category to Number node, so we can actually see what the error is.
An alternative is to use a Rule Engine node like this:
$column1$ = "Male" => 0
$column1$ = "Female" => 1
// and all other values 99
TRUE => 99
But writing out the complete syntax for many values can be quite some work. To avoid this you can use a GroupBy on your string column, add a Counter Generation node to assign a number to each distinct value, and join this result back to your table on the string column.
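For anyone who finds it easier to read as code, here is a minimal plain-Python sketch of the GroupBy, Counter Generation and join idea. It is not KNIME code; the function name `build_category_codes` and the sample values are made up for illustration, and starting the counter at 0 is an assumption.

```python
# Plain-Python sketch of: GroupBy (distinct values) -> counter -> join back.
# The function name and the sample data are hypothetical, for illustration only.

def build_category_codes(values):
    """Assign a running number to each distinct value, in order of first appearance."""
    codes = {}
    for value in values:
        if value not in codes:
            # Assumption: the counter starts at 0 and increases by 1.
            codes[value] = len(codes)
    return codes


gender = ["Female", "Male", "Male", "Female", "Male"]

codes = build_category_codes(gender)   # the "GroupBy + counter" step
encoded = [codes[v] for v in gender]   # the "join back to the table" step

print(codes)
print(encoded)
```

The lookup table is built from the data itself, so no rule has to be written by hand for each value.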
I am just learning KNIME, so I have taken the churn data from Kaggle and am trying to cleanse the data so it can be used to train a machine learning model.
I don’t see anywhere to upload anything, but my workflow is pretty simple so far.
CSV Reader, Column Filter, Missing Value, String to Number. These are the nodes I have set up so far, and the String to Number node fails.
In the parsing options I have tried all the number types.
I get the error below.
Problems in 1322499 rows, first error: “Female” in cell [“Row0”,
column “Gender”, row 1] can not be transformed into a number
What you need to do is go to the configuration dialog of the String to Number node and move any column that does not contain numeric values (in your case the Gender column) to the Exclude list.
I was under the impression the result should be something like male becomes 1 and female becomes 2, so the data can be used to train the ML model, since it couldn’t train on text. Maybe I have missed something.
I think I mentioned in my earlier post that String to Number is not converting normal text (e.g. “Female”) to a number.
It converts, e.g., the string “1,0” to a number format. You may want to do this if numbers are in a string type column and you want to perform calculations on them with downstream nodes. E.g. if a number is in a string type column you can’t use the Math Formula node.
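To make the difference concrete, here is a small plain-Python sketch (not KNIME code) of what parsing a number from a string means. The helper `parse_number` and its `decimal_separator` parameter are hypothetical names for illustration. A numeric string parses fine, while ordinary text such as “Female” cannot be parsed at all, which is the same kind of failure as the error above.

```python
# Sketch: parsing numeric text vs. trying to parse a category label.
# parse_number and decimal_separator are hypothetical names for illustration.

def parse_number(text, decimal_separator="."):
    """Parse a numeric string; raises ValueError for non-numeric text."""
    return float(text.replace(decimal_separator, "."))


# A number stored as text can be parsed.
print(parse_number("1,0", decimal_separator=","))

# A category label is not a number, so parsing fails.
try:
    parse_number("Female")
except ValueError as err:
    print("cannot be transformed into a number:", err)
```

Turning “Female” and “Male” into numbers is therefore a lookup (encoding) problem, not a parsing problem.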
If you select a node and click on the info button (top option in your left hand panel), you can read through the node description.
This is a good starting point to also explore some examples.
To convert categorical values to numbers you can follow what @HansS was proposing in his earlier post.