Merge files with errors

Hello everyone,

I am new to the platform and would like to ask for some help. I need to merge several Excel files that all have the same structure: a first column with names and a second column with numeric values. What is the easiest way to merge them so that all the numeric values sit side by side for each name? (I tried Concatenate, but the resulting table multiplied the rows: first it lists the names with the first value filled in and the second value blank, then it starts over with the same list of names, this time with the first value blank and the second value filled in.)
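The behaviour described above is what concatenation does: it stacks the tables on top of each other. Getting the values side by side per name is a join on the name column (in KNIME, probably the Joiner node with a full outer join, once per extra file). As a rough, language-neutral illustration of that logic, here is a minimal Python sketch. The function name `merge_by_name`, the labels `file_a`/`file_b` and the sample values are all made up for illustration and do not come from the original files.

```python
def merge_by_name(tables):
    """Outer-join several {name: value} tables on the name key.

    `tables` maps a label (e.g. a file name) to a dict of name -> value.
    Returns (header, rows); a name missing from a table gets None there.
    """
    labels = list(tables)
    # Collect every name exactly once, keeping first-seen order.
    names = []
    seen = set()
    for table in tables.values():
        for name in table:
            if name not in seen:
                seen.add(name)
                names.append(name)
    header = ["name"] + labels
    # One row per name, one value column per input table.
    rows = [[name] + [tables[label].get(name) for label in labels]
            for name in names]
    return header, rows


# Hypothetical sample data standing in for two Excel sheets.
tables = {
    "file_a": {"Anna Maria": 10, "St. Anna": 4},
    "file_b": {"Anna Maria": 7, "woman": 3},
}
header, rows = merge_by_name(tables)
```

Each name ends up on a single row, and a blank (`None`) appears only where a file really has no value for that name.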

I have another problem: the first column with the names is not the same in all of the input Excel files. Some columns have different lengths, or the names are not written exactly the same way. For example, instead of Anna Maria there is Maria Anna, or instead of St. Anna there is Anna, St.
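One possible way to deal with reordered names like these is to bring every name into a canonical form before joining, so that both spellings produce the same key. The sketch below is only an assumption about what "the same name" means here: it ignores case, commas and word order. The helper `normalize_name` is hypothetical, and treating Maria Anna and Anna Maria as one entry would be wrong if they can be different entries in the real data.

```python
def normalize_name(name):
    """Return a join key that ignores case, commas and word order.

    Assumption: two names made of the same words refer to the same entry.
    """
    # Lower-case and treat commas as plain separators.
    cleaned = name.lower().replace(",", " ")
    # Split on whitespace and sort the words so their order no longer matters.
    words = sorted(cleaned.split())
    return " ".join(words)


# Both spellings from the question collapse to the same key.
key_a = normalize_name("Anna Maria")
key_b = normalize_name("Maria Anna")
key_c = normalize_name("St. Anna")
key_d = normalize_name("Anna, St.")
```

The normalized key would only be used for matching; the original spelling can be kept in its own column for display.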

About the length: it is because some names are repeated with variations, for example Anna Maria1, or they are sets, for example “woman”.

What is the easiest way to handle these problems?

I had thought of creating a column with the limited set of names I am interested in, but I would not know how to check the input names against it, or how to handle the differences.
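Checking the input names against such a reference list could be done with a string-similarity lookup. The following is a minimal sketch using Python's standard `difflib` module. The reference list, the observed names and the cutoff of 0.8 are illustrative assumptions, and `match_to_reference` is a hypothetical helper, not an existing KNIME or library function.

```python
import difflib


def match_to_reference(name, reference, cutoff=0.8):
    """Return the most similar reference name, or None if nothing is close.

    `cutoff` is the minimum similarity ratio (0..1); 0.8 is an arbitrary
    starting point that would need tuning on real data.
    """
    matches = difflib.get_close_matches(name, reference, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Hypothetical reference list and observed names.
reference = ["Anna Maria", "St. Anna"]
observed = ["Anna Maria1", "St. Anna", "woman"]

# Map each observed name to its reference name (None = not of interest).
mapping = {name: match_to_reference(name, reference) for name in observed}
```

Names that map to `None` can be reviewed by hand or dropped. Character-based similarity copes with small variations such as a trailing digit, but reordered words may score too low, so those cases probably need to be normalized before the lookup.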

If you share some sample data, you are more likely to get some help. Also, can you explain this in more detail:
“About the length: it is because some names are repeated with variations, for example Anna Maria1, or they are sets, for example “woman”.”

@BackupArian I think KNIME should be able to handle your task, although it will most likely involve several steps and some data preparation. You have asked several questions in the forum that seem to be related to this…

Your task will involve many steps from different areas of KNIME, such as string similarity, loops and so on.

In my experience, it would be best if you could put together a collection of sample files that covers all of your challenges (with dummy data, so as not to give away any secrets). The community will then be able to solve the problem, and you can apply the solution to your real-world task.

Of course, you can try to put it together bit by bit, but that can be frustrating for everyone involved. More often than not, after one problem has been solved another complication comes along: "oh well, my column names are not the same", "oh, and I would want this for several years".

If the whole task is contained in the sample files, it will be easier to think through all the possible implications. And often, creating a sample that represents your task forces you to structure what you expect KNIME to do.