Table validator (reference)

Mitesh_Dama · November 6, 2024, 2:20pm

Hello, I am trying to automate a workflow for cleaning the data.

The first step in cleaning is to check the data types of the columns in the input file.
I want to generate different variable for each column if it differs from the reference file. How can I do it?

takbb · November 6, 2024, 3:11pm

Hi @Mitesh_Dama, welcome to the KNIME community forum.

I’m not sure what you mean by “generate different variable for each column if it differs…”. Can you show us what output you want to achieve, preferably with a simple example?

The Table Validator (Reference) node can be configured to return some table structure differences, whilst the Table Difference Finder will provide information about data differences.
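For readers who think in code, the kind of structural check described above can be sketched in plain Python. This is only an illustration of the idea, not KNIME's implementation or API: the column specs, the type names, and the function `compare_specs` are all hypothetical.

```python
# Hypothetical column specs (column name -> type name); not taken from the thread.
reference_spec = {"name": "string", "width": "double", "height": "double"}
input_spec = {"name": "string", "width": "string", "depth": "double"}


def compare_specs(reference, actual):
    """List (column, reference_type, actual_type) for every structural difference.

    None stands for a column that is absent on that side.
    """
    differences = []
    # dict.fromkeys keeps first-seen order while dropping duplicate names.
    for column in dict.fromkeys([*reference, *actual]):
        ref_type = reference.get(column)
        act_type = actual.get(column)
        if ref_type != act_type:
            differences.append((column, ref_type, act_type))
    return differences


differences = compare_specs(reference_spec, input_spec)
for column, ref_type, act_type in differences:
    print(f"{column}: reference={ref_type}, input={act_type}")
```

Here `width` is reported as a type mismatch, `height` as missing from the input, and `depth` as an extra column. Collecting every difference instead of stopping at the first one mirrors the non-halting behaviour you would want from the validator.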

Take for example two data tables, with differences that I have highlighted:

These tables can be compared with those two nodes resulting in the following outputs:

NB: you need to configure the Table Validator (Reference) so that it doesn’t simply halt when a difference is found:

Mitesh_Dama · November 6, 2024, 3:47pm

Thanks @takbb. What I want to achieve:
I want to clean a table that has a name column and dimension columns.
As a reference, Name should be a string and the dimensions should be numbers (double data type).
When I start cleaning the input data, any column that does not have the correct data type should show up in a separate table, so that I can use a Case Switch later to clean or process this data.
Is there any other way to do this?
Thanks in advance

takbb · November 6, 2024, 8:07pm

Hi @Mitesh_Dama, if your data is just a Name column plus other columns that should all be doubles, can you not make use of the Column Splitter to break it up into two tables?

I’m sure this is over-simplified, but without seeing your data and knowing exactly how you will go about processing it, it feels like it might work. You could also use Extract Table Specification to get a list of the column names and data types and then maybe process that information to split out the columns, but the Column Splitter seems the simplest for what you have described.
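To make the "split on data type" idea concrete, here is a minimal plain-Python sketch. It is not what the Column Splitter or Extract Table Specification nodes do internally; the sample table, the expected types, and the helpers `infer_type` and `split_by_expected_type` are assumptions made up for illustration.

```python
def is_number(value):
    """True for int/float values, excluding booleans."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def infer_type(values):
    """Return 'double' if every non-missing value is a number, else 'string'."""
    present = [v for v in values if v is not None]
    if present and all(is_number(v) for v in present):
        return "double"
    return "string"


def split_by_expected_type(table, expected):
    """Split columns into those matching the expected type and those needing cleaning."""
    ok, to_clean = {}, {}
    for column, values in table.items():
        target = ok if infer_type(values) == expected.get(column) else to_clean
        target[column] = values
    return ok, to_clean


# Hypothetical reference types and input data.
expected = {"name": "string", "width": "double", "height": "double"}
table = {
    "name": ["bolt", "nut"],
    "width": [1.5, 2.0],
    "height": ["3,2", "4.0"],  # read as text, so it does not match 'double'
}

ok, to_clean = split_by_expected_type(table, expected)
print("matches reference:", list(ok))
print("needs cleaning:", list(to_clean))
```

The second table (`to_clean`) plays the role of the separate output that a later switch could route into a cleaning branch.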

Splitting columns on data type.knwf (78.2 KB)
