2 questions on text processing

ecsdaehn · December 26, 2016, 3:35pm

Hi All,

I have a list of addresses I need to (pre-)process.

Here I have 2 things I need to solve.

1. I want to remove diacritics (as in Czech, German, Swedish and so on) and similar special characters.

I am aware that the String Manipulation node can remove diacritics, but I do not want to nest the function calls, and using it on just one row at a time is quite tedious.

Is there a better way? I tried some dictionary replacer nodes at the character level (e.g. "Ä,A"), but no luck.

Can I remove the diacritics on document level at once in some way?

2. I want to link one table to 2 nodes, a Cell Replacer and a Dictionary Tagger.

I always get the error "WARN Cell Replacer 2:140:112:120:142:145 Duplicate search key "?"".

What is wrong, and what does the error mean?

When I update the table, I want it to be the same for both nodes.

Thanks a lot,

J.

karelman · December 28, 2016, 4:11pm

For your first question: have you tried the "String Manipulation" node?

It has a "Replace" category that could help you.

ecsdaehn · January 1, 2017, 5:47pm

Hello Karelman,

Yes, I know this one, but I need more than one argument, i.e. more than one character to replace.

AFAIK I need to nest the calls and cannot list the characters separated with a " ; ". I always get an error message.

J.

kilian.thiel · January 3, 2017, 3:01pm

Hi,

1. To remove diacritics, simply use the Diacritic Remover node, which was released with 3.3.0.

2. The Cell Replacer has two inputs. One input is a dictionary containing search and replace strings. The column containing the search strings should not contain any duplicates. It seems that you have missing values (= "?") in your data, and several missing values count as duplicates of each other. You should remove or replace the missing values and remove duplicates in the column you want to use as the search column.
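
For anyone who wants to see the two ideas outside of the nodes, here is a minimal Python sketch. It is not what the Diacritic Remover or Cell Replacer do internally; it only illustrates the same logic under simple assumptions. The function names `strip_diacritics` and `clean_dictionary` are made up for this example, and missing values are assumed to arrive as `None`.

```python
import unicodedata


def strip_diacritics(text):
    """Remove combining marks so that e.g. 'Ä' becomes 'A' (illustrative helper)."""
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only the characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def clean_dictionary(rows):
    """Drop rows with a missing search key and keep the first row per key.

    `rows` is a list of (search, replace) pairs; None stands for a missing value.
    """
    seen = set()
    cleaned = []
    for search, replace in rows:
        if search is None or search in seen:
            continue  # skip missing or duplicate search keys
        seen.add(search)
        cleaned.append((search, replace))
    return cleaned


addresses = ["Äpfelstraße 5", "Čechova 12", "Åre torg 3"]
plain = [strip_diacritics(a) for a in addresses]

dictionary = [("Ä", "A"), (None, "x"), ("Ä", "AE"), (None, "y"), ("ö", "o")]
usable = clean_dictionary(dictionary)
```

Note that this approach only handles letters that decompose into a base letter plus accent marks. Characters such as "ß" or "ø" have no such decomposition and pass through unchanged, so they would still need an explicit replacement.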

Cheers, Kilian
