I have two strings, each made up of several substrings.
I need to check whether the two strings are the same, regardless of the order of the substrings.
String 1: “pl12, 78we, fg45” String 2: “78we, fg45, pl12”
Answer: true (because every substring of String 1 is in String 2)
Use
with list sorted aggregation.
Then use
to mark similar strings based on aggregated columns.
Hi @PankajChaudhary ,
I think regex is an easier solution. Check this: KNIME_project300.knwf (51.9 KB)
GL,
Mehrdad
I tried out using the Column Aggregator but could not get it to turn “e, b, d, c, a” into a sorted list [a, b, c, d, e], which is odd as it feels like it should be capable. Not sure what I was doing wrong.
I also took a look at the following post, which on the face of it is the same problem, and may contain the answer
but in the end, I thought I’d also just have a go. (You can never have too many variations and ideas!! :-))
This workflow also produces the result, in a somewhat convoluted way…
Broadly the task is
(1) Turn both Strings into sets of data A and B that can be compared
(2) Find where an element of set A is not in set B
(3) Mark string A as “is subset” where all of its elements are in set B
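For readers who prefer code to nodes, the three steps above can be sketched in plain Python. This is only an illustration, not what the attached workflow does internally; the helper names `split_items` and `is_subset` are made up for this example, and it assumes the substrings are comma-separated and that surrounding spaces should be ignored.

```python
def split_items(text, sep=","):
    """Step (1): turn a delimited string into a set of trimmed, non-empty items."""
    return {item.strip() for item in text.split(sep) if item.strip()}


def is_subset(string_a, string_b):
    """Return True when every item of string_a also appears in string_b."""
    set_a = split_items(string_a)
    set_b = split_items(string_b)
    # Step (2): find the elements of A that are not in B.
    missing = set_a - set_b
    # Step (3): A "is subset" only when nothing is missing.
    return not missing


print(is_subset("pl12, 78we, fg45", "78we, fg45, pl12"))  # True
print(is_subset("pl12, zz99", "78we, fg45, pl12"))        # False
```

Because sets discard duplicates, this version treats “c,c,d,e” as a subset of “c,d,e”.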
My initial stab at this had a question mark. What is the desired outcome if an element in String A is repeated? Can that occur? I had to assume that “c,c,d,e” was NOT considered a subset of “c,d,e”, and that meant I had to add some more nodes to count the elements and handle that case too. So my result wasn’t as small a workflow as I’d like, but it’s another option …
KNIME_test_is_subset.knwf (51.2 KB)
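The repeated-element case can also be sketched in Python by counting items instead of using sets. This is a rough equivalent of the "count the elements" idea, not a description of the workflow itself; `count_items` and `is_multiset_subset` are hypothetical names, and comma-separated input is assumed.

```python
from collections import Counter


def count_items(text, sep=","):
    """Count how often each trimmed, non-empty item occurs in a delimited string."""
    return Counter(item.strip() for item in text.split(sep) if item.strip())


def is_multiset_subset(string_a, string_b):
    """True when B contains every item of A at least as many times as A does."""
    counts_a = count_items(string_a)
    counts_b = count_items(string_b)
    # A Counter returns 0 for items it has never seen, so missing items fail here too.
    return all(counts_b[item] >= n for item, n in counts_a.items())


print(is_multiset_subset("c,c,d,e", "c,d,e"))  # False: "c" is needed twice
print(is_multiset_subset("c,d,e", "c,c,d,e"))  # True
```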
Hi @PankajChaudhary,
This can be solved with KNIME in very different ways, as nicely shown by @izaychik63, @mehrdad_bgh and @takbb.
Here is my contribution based on aggregation, covering both cases (allowing repetition of substrings or not).
20210501 Pikairos Compare Sets Example.knwf (61.0 KB)
Hope this helps
Best,
Ael
Hi @mehrdad_bgh
In my case, the number of elements in the string is different for every row. How can we generalise it for that?
Hi @mehrdad_bgh
Actually, both columns are in a different table like you have taken in the first workflow.
@Daniel_Weikert Enjoying your python solutions. Am I right in thinking that your equal() function is returning exactly what it says, as in True if the sorted collections are the same?
So if we want to return True if Column A either equals B or is a subset of B, then your method could be changed to this:
    def equal(a, b):
        # True if every element of a is also in b (a equals b or is a subset of b)
        return all(x in b for x in a)
p.s. my mate Google told me this… hope it’s right!.
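One caveat worth spelling out, as a sketch rather than a statement about the original Python solution: `equal()` only behaves as a subset test if `a` and `b` are collections of substrings. If the raw cell strings are passed in, Python iterates over single characters instead. The example below assumes comma-separated cells; `to_items` and `same_items` are hypothetical helpers added for illustration. Because the check runs on one pair of strings at a time, rows with different numbers of elements need no special handling.

```python
def equal(a, b):
    # True if every element of a is also in b (a equals b or is a subset of b).
    return all(x in b for x in a)


def to_items(text, sep=","):
    """Split one cell into a list of trimmed substrings."""
    return [part.strip() for part in text.split(sep)]


def same_items(text_a, text_b):
    """Strict variant: both cells hold the same substrings, in any order."""
    return sorted(to_items(text_a)) == sorted(to_items(text_b))


# Subset check on split lists; the two sides may differ in length.
print(equal(to_items("pl12, 78we"), to_items("78we, fg45, pl12")))  # True

# Pitfall: on raw strings the test runs character by character.
print(equal("a1", "1a, zz"))                      # True, although "a1" is not an item
print(equal(to_items("a1"), to_items("1a, zz")))  # False, as intended

print(same_items("pl12, 78we, fg45", "78we, fg45, pl12"))  # True
```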
I think you are correct.
Speaking of solutions: whenever I visit the forum I find at least 10 new solutions by @takbb.
Excellent work Brian, the only problem is that I can't keep up with reading all of them.
Best regards
… thanks @Daniel_Weikert. I discovered the best way to learn KNIME was to pretend I knew what I was doing and just dive in head first attempting solutions… which might be why some of my solutions are perhaps a little off the wall!
Here’s another thread with a similar theme, looks like this had a few other options to add to the list!