Allocate Misspelled country names to correct country names

anon33357744 · May 8, 2019, 9:32am

Hi everyone,
I have a question regarding fuzzy matching. I have a column about countries, and the names in it can contain typos. I also have a file with a single column that contains all country names, spelled correctly. I used an example from the server (09_Fuzzy_String_Matching), but with it I am not able to assign the right name to the misspelled ones. How can I replace a misspelled country name with the correct one?

Thanks,
Canan

armingrudd · May 8, 2019, 2:26pm

Hi Canan,

What do you mean by “a column about countries”?
Is it some text in which the country names are mentioned?
If there is a single country name, the String Matcher node alone will do the matching.
But if you have a string containing the country names, I have to ask whether the text structure is the same for all records.
For example, if all the strings start with:
Country name: This country is located in…
It would be easy to handle.
Would you please provide some sample data?

Best,
Armin

anon33357744 · May 12, 2019, 8:55pm

09_Fuzzy_String_Matching.knwf (86.9 KB)

Hi Armin, here you can find my workflow.
I have added the list with all existing country names in the yellow area. In the blue area you can see the file for which I should do the correct name assignment. Let’s assume that the list in the yellow area contains “Germany” (correctly written). If “Gernamy” appears in the blue area, it should be matched against the names in the correct list and then replaced with the correct spelling: “Gernamy” becomes “Germany”.
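
Outside KNIME, the replacement described above could be sketched in plain Python: each possibly misspelled name is compared against the reference list and replaced by its closest match. This is only a sketch under assumptions. The function name `correct_country`, the sample reference list, and the 0.6 similarity cutoff are made up for illustration. `difflib.get_close_matches` comes from the Python standard library.

```python
from difflib import get_close_matches

# Hypothetical reference list (the "yellow area"): correctly spelled names.
CORRECT_NAMES = ["Germany", "Thailand", "France", "Turkey"]

def correct_country(name, reference=CORRECT_NAMES, cutoff=0.6):
    """Return the closest reference name, or the input unchanged if none is close enough."""
    # Compare case-insensitively so "gernamy" can still match "Germany".
    lowered = {ref.lower(): ref for ref in reference}
    matches = get_close_matches(name.strip().lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[matches[0]] if matches else name

print(correct_country("Gernamy"))
print(correct_country("Tailand"))
```

Names with no sufficiently similar entry in the reference list are returned unchanged, so they can be reviewed by hand instead of being silently mapped to the wrong country.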

I hope that you can help me.
Kind regards,
Canan

armingrudd · May 12, 2019, 11:03pm

Your workflow does not contain any data. Either do not reset the workflow when exporting, or provide the data files as well.

anon33357744 · May 13, 2019, 6:13am

Hi @armingrudd,

Sorry, I thought I had uploaded the whole workflow.

Kind regards,
Canan

armingrudd · May 13, 2019, 11:57pm

The output of the blue area is no different from that of the yellow area. All the country names in both lists are exactly the same.

anon33357744 · May 14, 2019, 7:43am

Hi @armingrudd,

This Excel sheet includes some mistakes in the column “port of Destination”.
For example, when Thailand is written as “Tailand” in the document “TM full file Oct 18 1 (1).xlsx”, we should get “Thailand” from the other uploaded document as output. In other words, “Tailand” should be replaced with “Thailand”.
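
One way to picture this replacement step, sketched in Python rather than as KNIME nodes: compute an edit distance between each value in the “port of Destination” column and every correct name, and replace the value only when the closest name is within a small distance. The helper names, the sample rows, and the `max_distance=2` threshold are all assumptions for illustration and are not taken from the uploaded files. This is not necessarily how any KNIME node works internally.

```python
def levenshtein(a, b):
    """Number of single-character insertions, deletions, or substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def replace_misspelled(rows, column, reference, max_distance=2):
    """Return copies of rows with `column` replaced by the nearest reference name."""
    corrected = []
    for row in rows:
        value = row[column]
        # Pick the reference name with the smallest case-insensitive distance.
        best = min(reference, key=lambda ref: levenshtein(value.lower(), ref.lower()))
        if levenshtein(value.lower(), best.lower()) <= max_distance:
            value = best
        corrected.append({**row, column: value})
    return corrected

# Hypothetical sample rows standing in for the Excel sheet.
rows = [{"port of Destination": "Tailand"}, {"port of Destination": "Thailand"}]
reference = ["Thailand", "Germany"]
print(replace_misspelled(rows, "port of Destination", reference))
```

A small `max_distance` keeps unrelated values from being forced onto a country name. The right threshold depends on how noisy the real data is.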

Do you know how to do this?

Kind regards,
Canan
