dear all,
could you please help me fix this issue.
My table c. 300k raws.
some cells contain question marks, but the number of them are different and sometimes, there are spaces between them.
I’d like to filter these rows in order to correct these strings but do not know the syntax to do it.
thank you for your help
Esmé
Hey there,
can you provide some more information? Do you mean “red question marks” that indicate “missing values”?
I think a screenshot as a minimum - better even some example data can help people to help you.
In general you can handle Missing Values with the Missing Value node:
E.g. you can replace missing with blanks or you could remove the rows…
How to best handle your case will depend on additional details you’d have to provide
Dear Martin,
sorry, I should have been more precise.
It is not the red question mark Knime is putting when the value is missing. It is the value itself I can find in the original table. The hard value entered by the end user.
these values could be :
??? ???
???
?CHICO’S FAS INC
?AND
I would like to filter all the column raws containing this question mark and remove them.
If the raw contained only question marks → remove them all
e.g. ??? ??? → leave the cell blank
if it contained a string and a question mark → remove the question mark and keep the character string
e.g. ?CHICO’S FAS INC → CHICO’S FAS INC
Thank you
I see - thanks for clarifying!
Sounds like a job for string manipulation (multi column):
You can use this “formula” to replace all “?” with “”:
replace($$CURRENTCOLUMN$$,"?" ,"" )
Workflow:
stringreplace.knwf (73.1 KB)
Many thanks Martin, the string manipulation is working well.
Now I want to exclude these rows from the output table.
I used a row filter, selected my column and the “is not missing” operator. But I still have the same number of rows. could you please tell me how to proceed?
many thanks
Esmé
Hi @Esme61.
KNIME considers “” as an empty string, not a missing value.
You can try rules base row filter node with expression
your column = “” => FALSE
TRUE=>TRUE
And exclude FALSE values from output
Br
If you remove the ? from a string like this “??? ???” its still not blank. The resulting string still has the space between the ? marks. Here’s a workflow with two methods to filter the blank rows. The top branch is more flexible. The bottom branch has a deterministic number of spaces although the number of spaces can be changed in the Column Expressions node.
stringreplace rev 1.knwf (99.6 KB)
Input

Output (both methods - assuming max two spaces for Column Expressions)

thank you to all of you for your help.
@rfeigel, I implemented your solution, the one with the string cleaner et the row filter (the second one is far too complex at my stage of expertise of Knime
Many thanks to you all
Esmé
You can wrap toNull() around the “replace” in the string manipulation. Then the empty looking cells should also be exported as logical NULL with the KNIME typical red question mark. If this is too complicated, just add another String Manipulator after the one you already have an call "toNull([The column(s) you want to NULL]) and go for it, e.g. with an “Empty Row Filter”. Happy KNIMEing
This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.