Check similarity within a column

Hello,

I am pretty new to KNIME and I am searching for a proper workflow regarding the following requirements:

In the Excel file below there are two columns “Material” and “Serial Number”. I would like to search within the rows of the same Material (e.g. row 2-21) and compare the single Serial Numbers in the second column by similarity.


For instance each Serial Number of the first Material contains “SO4M” as prefix, followed by a seven-digit number.
I want this common part to be written to the next column.

Thank you very much in advance!

Check_Similarity.xlsx (11.3 KB)

Hi Fabio,

Welcome to the KNIME forum! Is the common part of the serial numbers always a 4-character string? That would make extracting it very simple.

Best,
Temesgen


Hi @fabio_sie

Oops, this wasn’t easy. But maybe I overlooked something. See this workflow: check_similarity.knwf (71.2 KB)
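
For readers who cannot open the attached workflow, the requirement itself (group rows by Material, then find what the Serial Numbers in each group have in common) can be sketched in plain Python, for example inside a Python Script node. This is only an illustration under assumptions, not a description of the attached workflow: it treats the "similarity" as the longest shared prefix, and the material names and serial numbers below are made up.

```python
from collections import defaultdict
from os.path import commonprefix  # character-wise longest common prefix


def common_prefix_by_material(rows):
    """Map each material to the longest prefix shared by all its serial numbers."""
    groups = defaultdict(list)
    for material, serial in rows:
        groups[material].append(serial.strip())
    return {material: commonprefix(serials) for material, serials in groups.items()}


# Made-up sample rows (material, serial number); not taken from the Excel file.
rows = [
    ("MAT-A", "SO4M1234567"),
    ("MAT-A", "SO4M7654321"),
    ("MAT-B", "XY91111111"),
    ("MAT-B", "XY92222222"),
]

print(common_prefix_by_material(rows))
```

Two limitations of this sketch: a material with a single row gets its whole serial number back, and if all numeric parts in a group happen to start with the same digits, those digits become part of the prefix as well.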


gr. Hans


Hi Temesgen,

thank you for your response!

The common part of the serial numbers is not always a 4-character string, but it is in most cases.
So I really would appreciate your solution!

Best,
Fabio

Hi @HansS ,

thank you for that great solution!!

Do you know why there are no similarity results next to the material A5E00125721, although it is obvious that it must be “SJTA”?

Best,
Fabio


Hi @fabio_sie, glad I could help

The reason why it doesn’t find any similarity is probably the “+” sign in those serial numbers.
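
One possible mechanism, which the thread does not confirm: if a matching step treats text from the serial number as a regular expression, an unescaped “+” is read as a quantifier ("one or more of the preceding character") rather than as a literal plus sign. The sketch below shows this effect in Python; the helper name and the sample values are made up for illustration.

```python
import re


def starts_with_pattern(prefix, serial, escape):
    """Check whether `serial` starts with `prefix`, using the prefix as a regex.

    With escape=True the prefix is matched literally; with escape=False any
    regex metacharacters in it (such as "+") keep their special meaning.
    """
    pattern = re.escape(prefix) if escape else prefix
    return re.match(pattern, serial) is not None


serial = "AB+123"  # made-up serial number containing a plus sign
prefix = "AB+1"    # literal beginning of that serial number

print(starts_with_pattern(prefix, serial, escape=False))  # "+" acts as a quantifier
print(starts_with_pattern(prefix, serial, escape=True))   # "+" is matched literally
```

If this is what happens, escaping the special character (or using a plain string comparison instead of a pattern) would be the usual remedy.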

gr. Hans

