address deduplication, string similarity and fingerprinting (a collection)

A few links and resources I collected about address deduplication, string similarity and fingerprinting.

A meta collection of KNIME resources for address deduplication or ‘fingerprinting’
https://forum.knime.com/t/namensabgleich/19232/2?u=mlauber71

---------

Mr. Wiswedel is the man when it comes to address dedupe ...
https://forum.knime.com/u/wiswedel/summary
https://hub.knime.com/knime/spaces/Examples/latest/50_Applications/13_Address_Deduplication/01_Deduplication_of_Address_Data
https://hub.knime.com/knime/spaces/Examples/latest/02_ETL_Data_Manipulation/05_Indexing_Searching/03_Example_for_Fuzzy_Address_Matching
https://forum.knime.com/t/approach-fuzzy-match-or-supervised-learning/10900

Fingerprinting for addresses
https://forum.knime.com/t/rule-based-filter-question/13419/7?u=mlauber71

---------

Compare strings by their similarity
https://forum.knime.com/t/comparing-strings/12939/8?u=mlauber71

You have to install Palladian to do that (it is a special installation)
https://nodepit.com/product/palladian

You need this repository
https://download.nodepit.com/palladian/4.2

---------

Additional Python resources - not yet transferred into a KNIME workflow

Super Fast String Matching in Python
https://bergvca.github.io/2017/10/14/super-fast-string-matching.html

Python - Address matching I
https://github.com/dedupeio/address-matching

Python - Address matching II
https://github.com/RobinL/AddressMatcher

libpostal: international street address NLP
https://github.com/openvenues/libpostal

https://datascience.stackexchange.com/questions/10810/how-to-do-postal-addresses-fuzzy-matching

Fuzzy String Matching in Python
https://marcobonzanini.com/2015/02/25/fuzzy-string-matching-in-python/
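
To make the two ideas in this collection a bit more concrete, here is a minimal Python sketch of address fingerprinting combined with a simple string similarity score. It is not taken from any of the linked workflows or libraries: the function names `fingerprint`, `similarity` and `group_by_fingerprint` are hypothetical, the sample addresses are made up, and the normalization rules (case folding, accent stripping, sorting unique tokens) are just one common assumption about what a fingerprint could look like. It only uses the Python standard library.

```python
import re
import unicodedata
from collections import defaultdict
from difflib import SequenceMatcher


def fingerprint(text: str) -> str:
    """Build a normalized key: fold case, strip accents and punctuation,
    then sort the unique tokens (an assumed, simplified rule set)."""
    # Decompose accented characters and drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # casefold() also maps the German sharp s to "ss".
    folded = no_accents.casefold()
    # Everything that is not a letter or digit becomes a separator.
    cleaned = re.sub(r"[^a-z0-9]+", " ", folded)
    return " ".join(sorted(set(cleaned.split())))


def similarity(a: str, b: str) -> float:
    """Similarity between 0 and 1, computed on the fingerprints."""
    return SequenceMatcher(None, fingerprint(a), fingerprint(b)).ratio()


def group_by_fingerprint(addresses):
    """Collect addresses that share exactly the same fingerprint."""
    groups = defaultdict(list)
    for address in addresses:
        groups[fingerprint(address)].append(address)
    return dict(groups)


# Made-up sample data for illustration only.
addresses = [
    "12 Main Street, Springfield",
    "Main Street 12 Springfield",
    "main street 12, SPRINGFIELD",
    "12 Main Str., Springfield",
    "99 Ocean Avenue, Miami",
]

groups = group_by_fingerprint(addresses)
for key, members in groups.items():
    print(key, "->", members)

# Near-duplicates that do not share a fingerprint can still be scored.
print(similarity("12 Main Street, Springfield", "12 Main Str., Springfield"))
```

Exact fingerprint matches catch reordered or differently punctuated addresses cheaply; the similarity score is a fallback for abbreviations and typos, where a threshold would have to be tuned on real data.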


This is a companion discussion topic for the original entry at https://kni.me/w/MpPcaokcf9j8TF0j