Issue with Input (CSV or Excel Reader) Type Recognition of Fields -> Cannot parse String to Number for the fields where first 50 rows missing value)

Hello there, i have a huge issue.

I am trying to convert excel type of numbers from STRING like (3009; 3.14; 13.0031) etc to DOUBLE in KNIME.

When I try to read the data either from excel or csv and parse the fields for example:
Field1 -> INTEGER (3009 or 120)
Field2 -> DOUBLE (3.14 or 13.0031)
…FieldN -> either int or double

KNIME fails to parse all fields, so I am unable to use math formulas.

It seems to fail because in the first rows (>50, not known exactly how many rows) of those fields , there are no values basically we have a case of missing values.

Since it varies for each field when values occur, I cannot skip the missing values rows.

I am left to input as STRING.

But when I try String 2 Number I have issue because KNIME can only recognize corectly INTEGER but not DOUBLE. For the DOUBLE case it recognizes wrongly because the digits after decimal point varies for example as mentioned above it can be
3.14 or it can be 3.000014 or also 3.1 etc

I used the parsing as described here: Type from String to Integer
@armingrudd @Jax

So that means that for decimal numbers the String to Number gives me the following wrong parsing:

LEFT Column are the intital numbers RIGHT Column are the wrongly parsed numbers

Can someone clever help me?

I would appreciate any help and suggestion how we could recognize all numbers so that math formulas can be aplied later on.

Thank you a lot in advance!

1 Like

Welcome to the forum. Not 100% sure if I understood the problem correctly, but have you set the correct delimiting character?

I created a quick string table and set the delimiting character to a comma (",") instead of the default period (".") and it worked for me, see screenshot:

image

It turned these strings …

image

to these numbers:

image

I noted that it applied some basic number formatting by default (e.g. using comma as a thousands separator and period as a decimal separator).

Does that help?

7 Likes

Hello @kowisoft

many thanks for your quick response!

It all comes down to the crucial settings of decimal and thousands separator!

As you correctly recognized my data comes with “,” as decimal separator.

So I had to pick “,” but this didnt work and it confused me -> every time I tried it failed.

For me it didnt do automatic formatting for the thousands separator-> It failed.

It was crucial for my case to insert as thousands seperator "."

Then it worked.

I think the automatic formatting did not work as in your case because my table for the first rows has missing values.

Having said the above also changing the type of the fields from the input node (EXCEL or CSV Reader) does not have any chance to work -> I assume it is also related to the missing values.

The workaround as discovered, is if you input as STRING and later use String 2 Number BUT always set in the settings dialog of String 2 Number the decimal AND the thousands seperator. For me it was important because the first rows of my table had missing values!

So people always check your separators and better set them manually

either

  • “,” and “.”

or opossite

  • “.” and “,”

Do you a agree with the conclusions?

In any way thank you a lot @kowisoft! For the quick response and help! :medal_sports:

6 Likes

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.