NER / NLP / correct assignment of names / Fuzzy matching

Hi guys,
I have a list of transaction records. I would like to find patterns in the existing data, i.e. create evaluations such as which companies my customer works with the most, where the products are sent, etc. The Beneficiary is our customer and we want to know more about him, e.g. the top 5 companies he has been working with lately.

However, because the records are entered by people, they contain different spellings and also errors in the company names.

Example:

TransactionID | Beneficiary | Benf. Country | Applicant | Appl. Country | Goods | Amount | CCY
100AX12340 | Siemens AG | Germany | Walmart Inc. | USA | Household appliance| 10000 | US
100AX12501 | Siemens Aktiengeselchaft |Germany | Power Gmbh | Germany |Lighteners| 50000 | EUR
348GH54644 | Siemens A G | germany | Fracht GmbH | Deutschland | Beleuchtung| 7800| Eur
150LA56894 | SIMMENS AG | Germany | Eisengießerei GmbH| Germany| Machinery | 20000 | EUR
389XF67134 | Asos Plc. | UK | H &M GmbH & Co. KG | Germany | Textiles | 15500 | EUR
256LF74612 | asos | UK | Zara S.A. | Italy | Home decoration | 9450 | EUR
546HK65145 | ASSOS PLC. | UK | Eisengießerei GmbH | Germany | Rawmaterial comp. | 14000| EUR

The first four records belong to the same customer (Siemens) but are written differently, partly because of spelling mistakes such as SIMMENS instead of SIEMENS.
In the second example (Asos), the first two records are the same company, but the last record is a different company. How can I make sure that fuzzy matching does not assign ASSOS PLC. to ASOS PLC. but treats it as a separate company?

Thanks for your help in advance.

Cheers,
Canan

Hi Canan,

You can find an example of fuzzy string matching on the EXAMPLES Server:

08_Other_Analytics_Types/01_Text_Processing/09_Fuzzy_String_Matching

Maybe this helps already?
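
If it helps to see the idea in code, here is a rough Python sketch using only the standard library. It is not what the example workflow does. It assumes you can maintain a small reference list of known companies (the hypothetical `KNOWN_COMPANIES`), and the legal-form list, the helper names and both thresholds are illustrative choices, not tuned values. The point is that a raw name is cleaned first and then assigned to its closest reference entry, so ASSOS can only stay separate from ASOS if both exist as entries in that list.

```python
import re
from difflib import SequenceMatcher

# Hypothetical reference list (assumption: curated master data, maintained by hand).
KNOWN_COMPANIES = ["Siemens", "Asos", "Assos"]

# Illustrative, non-exhaustive legal-form tokens to ignore when comparing names.
LEGAL_FORMS = ["ag", "aktiengesellschaft", "gmbh", "plc", "inc"]


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] based on difflib's matching blocks."""
    return SequenceMatcher(None, a, b).ratio()


def normalize(name: str) -> str:
    """Lower-case, strip punctuation, drop single letters and legal forms.

    Legal forms are matched fuzzily so that a misspelling such as
    'Aktiengeselchaft' is removed as well.
    """
    tokens = re.sub(r"[^\w\s]", " ", name.casefold()).split()
    kept = [
        t for t in tokens
        if len(t) > 1 and all(similarity(t, form) < 0.85 for form in LEGAL_FORMS)
    ]
    return " ".join(kept)


def resolve(raw_name: str, threshold: float = 0.8):
    """Return the closest known company, or None if nothing is close enough."""
    cleaned = normalize(raw_name)
    best = max(KNOWN_COMPANIES, key=lambda c: similarity(cleaned, c.casefold()))
    return best if similarity(cleaned, best.casefold()) >= threshold else None


if __name__ == "__main__":
    for raw in ["Siemens AG", "Siemens Aktiengeselchaft", "Siemens A G",
                "SIMMENS AG", "Asos Plc.", "asos", "ASSOS PLC."]:
        print(f"{raw!r:30} -> {resolve(raw)}")
```

Note that with this measure "assos" vs "asos" scores higher than "simmens" vs "siemens", so a similarity threshold alone cannot tell the typo from the genuinely different company. Without a reference entry for Assos, the other columns (country, goods) would have to be used as additional evidence.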

Cheers,
Kathrin
