If this were EXCEL I'd use =COUNTIF(range,column header)

I created dummy variables for 583 codes, however there is a possibility that each row could have “1” for more than one column/code. The code is the column header. If this were excel, I would just use COUNTIF function (ok, this isn’t perfect cause I’m looking for an exact match). But an example of the cell I want to search is the following:

A-45, M-20, M-21, M-27, M-30, M-36, M-37, MA-30, MA-36, MA-46, WA-29

Now, if the column header matches any one of these codes that are all contained within one cell within one specific column, I want the cell with that specific header column to be 1, otherwise 0.

Any advice on this would be GREATLY appreciated. Thanks!

Hello @rinaldiinjapan

‘String Manipulation’ node with regexMatcher() function can do this job returning Boolean.

regexMatcher($column1$, 
 	join( 
 		"(.*?("
 		, $${Scolumn_header}$$
 		, ")[^$]*)$"  
 	) 
)

You would probably will need to call your column header as variable, then prepare it in advance.

I’m assuming this is the content of a cell, but if this is arranged in a column, the same code can do the job as well.

BR

For testing intentions you can replace the proposed variable with one hardcoded text code.

regexMatcher($column1$, 
 	join( 
 		"(.*?("
 		, "MA-36"
 		, ")[^$]*)$"  
 	) 
)

I hope this is clear enough.

BR

PS.- A Rule Engine node can do the same work with the following code:

$MA-36$ LIKE "*MA-36*" => 1
TRUE => 0

or optionally…

$MA-36$ MATCHES "(.*?(MA-36)[^$]*)$" => 1
TRUE => 0

BR

Thank you so much gonhaddock. As I look over your response and the configuration page of the regex node am I correct in that this code will need to be entered for every column? Is there are a way to create a loop to avoid having to re-enter code for each column? For clarification, there is one column with anywhere from 1 to 87 codes with the header “Code List”. Then I have 583 columns with a code such as the example above,“MA-36” as a header so the goal is to have either a 1 or 0 entered in the MA-36 column dependent upon whether MA-36 is included under the “Code List” column of that same row. Thanks again for helping me out with this. If there is a KNIME online course that you think will help please let me know.

I can suggest to use

node to get pares
Header_code1, Value1
Header_code1, Value2

Header_code2, Value1
Header_code2, Value2
so on
Using rule engine like suggested by @gonhaddock, you will get
Header_code1, 1
Header_code1, 1

Header_code2, 0
Header_code2, 1

Add RowiID
Use

node with “first” grouping function to get original structure with headers.

2 Likes

Hello @rinaldiinjapan
As your question was in the abstract about a cell case, my answer was pointing to the concept…

Now, in a workflow @izaychik63 is giving you another interesting concept to apply, long format pivoting all the columns and working with just 2 columns $ColumnNames$ and $ColumnValues$; this type of solution will avoid messing up with variables. Said that, the other approach -as you pointed- is the Loop End (Column Append) with variables.

About the training… you have the Level 1 Certification I think it is for free; and It is up to date.

There was another good start called KNIME Beginner’s Luck book , it is some years old, I can see the PDF online. You may get a promotion code contacting the team.

BR

1 Like

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.