Identifying right string embedded in a string and replacing entire string with the right string.

I have a column in which there are multiple terms which are corrupted. For example the Term is Alexia 40 Various persons have entered this in the database and have various mistakes like someone has typed ALE XIA 40, someone has not left a space between ALEXIA and 40 while some one has entered ALEXIA 40 Lube O. I am attaching the excel sheet on which I am working. The requirement is to find the quantity of grades of LO supplied by Vendor and Port of supply. The problem is that the item names are not unique, and for the same item multiple names are there for same item name as mentioned earlier of example of Alexia 40… Any suggestion would be helpful. I tried using string replacer but the I would have to list all the mistakes made by everyone and there is no guarantee2021_Lube_Data.xlsx (141.1 KB)
that a new kind of mistake will not be made. Any help will be appreciated.

Hi @rahulgog,

maybe a suggestion to find all mistakes.
For each item you could check the string similarity and depending on the similarity you apply a fix.

e.g. with:


This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.