Find out data type of a specific cell

I have a column say salary but there is an error say a value as abc due to which the salary is reading as string. How can i find exactly all such occurences which are causing my integer field to be read as string?

Often the error message will include a row number. You can also sort that column to find the non numeric cells quickly. Or use Regex to find non-numeric characters. You could then remove the non-numeric characters, or split those rows off to the side so you can review them and change them manually.

1 Like

I have to do this exercise for a lot of columns and since it is right at the start, there is no error, if there is any string in integer field the entire column turns as string so i want to check if a individual cell is suited as a string or a integer

Did you try the string to number node? It should turn stings into missing characters and give a warning instead of an error. You can always loop multiple columns through it to check all of them.

Or you could use the String Manipulation node with the below code. Just put your correct column name in there. Or use the same approach with the String Manipulation (Multi-Column) node to address a larger number of columns.

regexReplace($column_name$, “[^0-9.]”, “”)

The same approach can be used with the Column Expressions node if you prefer to use that over String Manipulation.

5 Likes

I have built a utility where you don’t even need string to number node, it is dynamic and doesn’t assume specific column data types to fix it. Once it is tested, i will share it with the group.

2 Likes

Hi @r_jain ,

As @iCFO knows, I do very occasionally ( :thinking: ) like writing a component, so I thought I’d take this opportunity to see if my Multi-Column “Rule Engine” component would be able to assist with your question.

… and no this isn’t a late April Fool! :wink:


This isn’t intended necessarily to answer your question, although of course I hope it might be of use… but I thought it an interesting use case that could test and demonstrate the component.

I’ve used it here to show which of the string columns in each “bad row” contains non-numerics.

In this example, a demo table contains some letters in place of numbers:

There are also some non-string columns of data types that would make processing this in String Manipulation (Multi Column) problematic.

The non-numerics are detected using a regex which is designed (if chatgpt got it right, lol) to handle numerics including signed and unsigned integers, floating point and scientific notation, and as you can see from the rule below, it is predominantly the Rule Engine syntax (it of course uses a core Rule Engine behind the scenes!), but borrows the “currentcolumn” idea from the String Manipulation (Multi Column) node:

NOT $CURRENTCOLUMN$ MATCHES "[-+]?\d*\.?\d+([eE][-+]?\d+)?" => "NON-NUMERIC"

The demonstration workflow can be found here:

further info

2 Likes