Hi @qqilihq,
I’d like to propose a few optimizations for the Phone Number Formatter Node:
- Some area / city codes are recognized and some are not (I have a few German landline numbers that I can share in private)
- Ability to not assume a Default Region
- Ability to extract the country name / ISO code from the recognized country code, e.g. 0049, +49, (0049)
- Ability to check the validity of the recognized country and area / city code
The primary goal would be to identify parts of the data, such as the country and area / city, or whether it’s a landline or mobile number (or unknown?).
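To illustrate the country lookup without a default region, here is a minimal Python sketch. It is not how the node works internally; the function name `detect_region` and the small `CALLING_CODES` table are hypothetical and only cover a handful of codes for demonstration.

```python
import re

# Hypothetical, deliberately tiny subset of calling codes for illustration.
CALLING_CODES = {"33": "FR", "41": "CH", "43": "AT", "49": "DE", "351": "PT"}


def detect_region(number: str):
    """Return (calling_code, iso_code), or None if no known prefix is found."""
    # Drop whitespace, brackets, dashes and slashes so that +49, 0049 and
    # (0049) all end up in a comparable form.
    compact = re.sub(r"[\s()\-/]", "", number)
    match = re.match(r"^(?:\+|00)(\d+)", compact)
    if not match:
        # No international prefix: do not fall back to a default region.
        return None
    digits = match.group(1)
    # Calling codes are one to three digits long, so try each length.
    for length in (1, 2, 3):
        code = digits[:length]
        if code in CALLING_CODES:
            return code, CALLING_CODES[code]
    return None


print(detect_region("(0049) 30 123456"))
print(detect_region("030 123456"))
```

A number without `+` or `00` returns `None` here, so the caller can decide whether to apply a default region or flag the value as unknown.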
I also implemented a few optimizations to tackle poor data quality, which you might consider adding too:
- Replace the letter o with zero
- Remove HTML entities using
&[^;]+;
- Remove duplicated country codes
^(\+\d+)\s?\1
to $1
- Remove duplicated country codes where the repeat is written with a 00 prefix (e.g. +49 0049)
^\+(\d+)\s?00\1
to +$1
- Harmonize country codes (##) to +##
^\((\d{2})\)\s?
to +$1
- Harmonize country codes (00##) to +##
^\(00(\d{2})\)\s?
to +$1
- Harmonize country codes 00## to +##
^00(\d{2})\s?
to +$1
- Fix country codes that lack the leading + (number starts with neither 0 nor +)
^([^0+])
to +$1
Note: I am not fully confident that this does not cause false positives, as some area / city codes, like in the US, might not start with a zero.
- Remove (0)
\s?\(0\)
by an empty string
- Replace [-/] by space
[-/]
by a single space
- Replace multiple whitespaces by one
\s{2,}
by a single space
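For reference, the rules above could be chained roughly as follows. This is only a Python sketch of the same regexes, not the actual workflow: the name `clean_phone_number`, the `apply_risky` flag, the surrounding `strip()` calls and the handling of an uppercase O are my assumptions. The rule that prepends a + is off by default because of the false-positive concern mentioned in the note.

```python
import re

# (pattern, replacement, risky) in the order listed above.
RULES = [
    (r"[oO]", "0", False),                  # letter o typed instead of zero
    (r"&[^;]+;", "", False),                # HTML entities such as &nbsp;
    (r"^(\+\d+)\s?\1", r"\1", False),       # +49 +49 -> +49
    (r"^\+(\d+)\s?00\1", r"+\1", False),    # +49 0049 -> +49
    (r"^\((\d{2})\)\s?", r"+\1", False),    # (49) -> +49
    (r"^\(00(\d{2})\)\s?", r"+\1", False),  # (0049) -> +49
    (r"^00(\d{2})\s?", r"+\1", False),      # 0049 -> +49
    (r"^([^0+])", r"+\1", True),            # prepend +, may misfire
    (r"\s?\(0\)", "", False),               # drop the optional (0)
    (r"[-/]", " ", False),                  # separators become spaces
    (r"\s{2,}", " ", False),                # collapse repeated whitespace
]


def clean_phone_number(raw: str, apply_risky: bool = False) -> str:
    """Apply the cleanup rules in order and return the normalized string."""
    text = raw.strip()
    for pattern, replacement, risky in RULES:
        if risky and not apply_risky:
            continue
        text = re.sub(pattern, replacement, text)
    return text.strip()


print(clean_phone_number("0049 (0)30 / 123-456"))
print(clean_phone_number("+49 0049 3o 1234&nbsp;567"))
```

Note that the order matters: the harmonization rules consume the optional space after the country code, so the area code ends up directly attached to it.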
If you like, I can send you that part of the workflow.
Best
Mike