Extract text between two double quotes from data

Could you please share an example of how to extract the text between two double quotes from my data?

The text inside the double quotes varies from row to row.

Sample texts are below.

Example 1: АКЦИОНЕРНОЕ ОБЩЕСТВО “ЗАВОД ЭНЕРГЕТИЧЕСКОГО ОБОРУДОВАНИЯ ЭНЕРГОПОТОК”, Россия, 607328, Нижегородская область, Нижний Новгород, 607328, ОБ
Expected result:
“ЗАВОД ЭНЕРГЕТИЧЕСКОГО ОБОРУДОВАНИЯ ЭНЕРГОПОТОК”

Example 2: Общество с ограниченной ответственностью “АЭРОФИНАНС”, Россия, 633104, Новосибирская область, Обь, 633104, Россия, Новосибирская область,
Expected result:
“АЭРОФИНАНС”

Hi,
Will there be only one section in double quotes? And are they straight double quotes like this: "text", or curly ones as in your example: “text”? The latter might come from our forum software reformatting your double quotes.

If it is the former, you can use the String Manipulation node with the following expression:

regexReplace($columnname$, "[^\\"]*(\\"[^\\"]+\\")[^\\"]*", "$1")

This command looks for a pattern that is basically the following: “first any number of non-double-quote-characters, followed by double quotes, followed by one or more non-double-quote-characters, followed by double quote, followed by any number of non-double-quote-characters”. We capture the relevant part by surrounding it with parentheses () and this captured part is available in the replacement string via the placeholder $1. So we replace the whole string by the part that is in double quotes.
Kind regards,
Alexander
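
For readers who want to try the pattern outside KNIME, here is a minimal Python sketch of the same idea. It is not part of the original answer: the function name `keep_quoted_part` and the sample string are made up for illustration, and it assumes the quotes are straight ASCII double quotes.

```python
import re

# Same pattern as in the expression above, written as a Python raw string.
# Assumption: the quotes are straight ASCII double quotes (").
PATTERN = re.compile(r'[^"]*("[^"]+")[^"]*')


def keep_quoted_part(text: str) -> str:
    """Replace the whole string by its first quoted section, quotes included."""
    # \1 is Python's spelling of the $1 placeholder used in the expression above.
    return PATTERN.sub(r"\1", text, count=1)


# Hypothetical sample row, not taken from the thread.
sample = 'Company "ACME", Russia, 607328'
print(keep_quoted_part(sample))
```

If a row contains no quoted section at all, the pattern does not match and the row is returned unchanged.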


Hi @AlexanderFillbrunn,
Thanks for the answer. Yes, I only want to get the text inside the double quotes “text”.

The curly quotes (“text”) are already in the data set; they do not come from KNIME. I tried the expression you wrote, but it gives an error.
Is there anything I’m missing or doing wrong?

regexReplace($Documents$, “[^\”](\“[^\”]+\“)[^\”]", “$Documents$”)


Example data

АКЦИОНЕРНОЕ ОБЩЕСТВО “ЗАВОД ЭНЕРГЕТИЧЕСКОГО ОБОРУДОВАНИЯ ЭНЕРГОПОТОК”, Россия, 607328, Нижегородская область, Нижний Новгород, 607328, ОБ

Общество с ограниченной ответственностью “АЭРОФИНАНС”, Россия, 633104, Новосибирская область, Обь, 633104, Россия, Новосибирская область,

Desired extraction:
ЗАВОД ЭНЕРГЕТИЧЕСКОГО ОБОРУДОВАНИЯ ЭНЕРГОПОТОК
АЭРОФИНАНС
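
The desired result above (inner text only, with curly quotes in the source data) can be sketched in Python as follows. This is an illustration under assumptions, not the solution from the thread: the helper name `text_between_quotes` is hypothetical, the rows are shortened copies of the example data, and the pattern assumes a section opens with “ or " and closes with ” or ".

```python
import re
from typing import Optional

# Assumption: a quoted section opens with “ or " and closes with ” or ".
# The capture group holds only the inner text, without the quote characters.
QUOTED = re.compile(r'[“"]([^“”"]+)[”"]')


def text_between_quotes(text: str) -> Optional[str]:
    """Return the text inside the first quoted section, or None if there is none."""
    match = QUOTED.search(text)
    return match.group(1) if match else None


# Shortened versions of the two example rows.
rows = [
    "АКЦИОНЕРНОЕ ОБЩЕСТВО “ЗАВОД ЭНЕРГЕТИЧЕСКОГО ОБОРУДОВАНИЯ ЭНЕРГОПОТОК”, Россия, 607328",
    "Общество с ограниченной ответственностью “АЭРОФИНАНС”, Россия, 633104",
]
for row in rows:
    print(text_between_quotes(row))
```

Using a search with a capture group, rather than replacing the whole string, avoids having to match the text before and after the quotes.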

Hi,
Apologies, there should have been only single backslashes in the code. I also think that copy-pasting may have picked up wrongly formatted quote characters (our forum software might render them that way). Please find attached an example workflow. That should clear things up.
Kind regards,
Alexander
Quotes.knwf (7.1 KB)
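
The backslash correction can be illustrated with ordinary string literals. The sketch below uses Python and assumes that the expression strings in the String Manipulation node follow the same Java-style escaping rule, which is consistent with the correction above but is not confirmed anywhere in this thread.

```python
# Single backslash: \" is an escaped quote inside the string literal,
# so the regex engine receives plain " characters.
single = "[^\"]*(\"[^\"]+\")[^\"]*"
print(single)

# Double backslash: \\ is an escaped backslash, so the quote right after it
# closes the literal early. Only the first three characters end up in the string.
truncated = "[^\\"
print(truncated)
```

With double backslashes, everything after the early closing quote is left outside the string, which would explain a syntax error in the original expression.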


@AlexanderFillbrunn
This is great, exactly the solution I wanted. Thanks so much for your support :trophy: :+1:
