Exclude missing values from the end of a dataset

Any ideas on how I could exclude missing values that occur only at the end of a dataset, while keeping the missing values in between? The variable is a double (numeric with decimal points). I have multiple variables/columns, so manually applying a row filter by row number is not practical.

Could you just determine the last RowID, split the table there, handle the last row, and bring it back together?
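
Outside KNIME, the "find the last row, then split" idea can be sketched in a few lines of Python. This is only an illustration of the logic, not the attached workflow: it assumes a column is a plain list, that missing values are represented as `None`, and the helper names `last_valid_index` and `split_trailing_missing` are made up for this example.

```python
def last_valid_index(values):
    """Return the index of the last non-missing entry, or -1 if all are missing."""
    for i in range(len(values) - 1, -1, -1):
        if values[i] is not None:
            return i
    return -1


def split_trailing_missing(values):
    """Split a column into (head, tail); tail holds only the trailing missings."""
    cut = last_valid_index(values) + 1
    return values[:cut], values[cut:]


column = [1.5, None, 2.0, 3.25, None, None]
head, tail = split_trailing_missing(column)
print(head)  # [1.5, None, 2.0, 3.25]  (the missing value in between is kept)
print(tail)  # [None, None]            (only the trailing missings)
```

With many columns, the same split would be applied once per column in a loop, which is why a per-column loop is more comfortable than a manual row filter.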

kn_forum_last_row_missings.knwf (9.2 KB)

I have clarified my question with a screenshot, and I have included the workflow (basically a dataset). Also, please keep in mind that my original dataset has 124 variables/columns to process.
Hope it helps.
QUESTION_HOW_TO_Exclude_Only_Last_NA_values.knwf (6.6 KB)

Hi @amars

See this workflow: Exclude_Only_Last_NA_values.knwf (66.2 KB). Hope it helps.


I also built something, though it might be too complicated :slight_smile:

Exclude_Only_Last_NA_values2.knwf (48.9 KB)


Under which extension is the “missing table row to variable”? I cannot find it in order to install it.

It is in the KNIME core (since version 4.x). There is a deprecated one from former versions. You might use that one or update to the latest version of KNIME.


Thank you so much, it does work! However, I had to reinstall KNIME to make it work, and I lost my previous workspace. I am an R user and I am used to not losing the workspace. Is there any way that I can get it back?
Any help would be greatly appreciated.

The workspace should be there and you should be able to tell KNIME where it is as long as you did not have it in the same folder as the software.



I did find it at another location. Thanks


Again, thank you, and it does work but with some limitations:
I see that it reads only strings and not numeric values. I worked around this by using a “numeric to string” node and, after finishing, converting the data back with a “string to numeric” node. Perhaps you know how to modify it to work with numeric values as well, without the extra nodes mentioned above.
In addition, is it possible for it not to freeze when it reads columns starting with “?” values? It would be nice to keep the leading “?” values in each column as well.

For strings, the question is whether you would have to modify it to treat blanks as missing. I am not exactly sure; I would have to try.

I am not sure about the ‘freezing’ part. Do you mean that the first value is a ? (that is, missing), or that the column name contains a ? (which might affect the regex)? One could try the column list loop start node, although I am not familiar with it.

Maybe you could provide an example that represents the challenges you mentioned.

Let me revise/amend my previous comment according to new observations and tests:
Your workflow is fabulous!
It does work with numeric or string.
It does not freeze.
It does work with columns starting or ending with “?”.
The problem is when column names contain parentheses, for example: "Temp_C_(mean)"
Is there a way to upgrade this workflow to accommodate those exceptional names?
Below are a screenshot and the workflow with a column name that includes parentheses.
ColumnNameWithParenthesis_issue_MISSINGVALUEStopic.knwf (42.7 KB)
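
A likely reason for this kind of failure is that parentheses are regex metacharacters, so a column name used directly as a pattern no longer matches itself. The following Python snippet only illustrates that general regex behaviour; it does not reproduce what the workflow does internally, and KNIME uses Java regular expressions, where the same principle applies.

```python
import re

column_name = "Temp_C_(mean)"

# Used as-is, the parentheses form a capture group, so the pattern
# matches "Temp_C_mean" but not the literal column name.
raw = re.compile(column_name)
print(raw.fullmatch(column_name) is None)         # True
print(raw.fullmatch("Temp_C_mean") is not None)   # True

# Escaping turns every metacharacter into a literal character.
safe = re.compile(re.escape(column_name))
print(safe.fullmatch(column_name) is not None)    # True
```

Escaping the name before it is inserted into a pattern, or selecting columns without a regex at all, avoids the problem.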


Besides keeping the missing values at the end, is there any way to replace the missing values at the beginning as well?

OK, here comes the latest instalment. It should:

  • handle column names with brackets
  • use @HansS's idea of the column list loop start (much more comfortable)
  • store the Missing Value rules in (non-standard) PMML and apply them to strings and numbers
  • fill with the following value first, then with the previous one if there are still missing values in the non-empty lines

Please check if everything works and check if the order of Previous and Following is the one you want (otherwise switch the order).

Exclude_Only_Last_NA_values3.knwf (46.7 KB)
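
For readers who want to check the Following/Previous order outside KNIME, here is a minimal Python sketch of that two-pass fill. It is an assumption about the logic, not an export of the workflow: missing values are `None`, and `n_rows` is a hypothetical parameter giving the number of non-empty lines; everything after it is left untouched.

```python
def fill_following_then_previous(values, n_rows):
    """Fill missings (None) in the first n_rows entries.

    Pass 1 uses the next non-missing value; pass 2 uses the previous
    non-missing value for anything still missing. Entries from n_rows
    onward are returned unchanged.
    """
    head = list(values[:n_rows])

    # Pass 1: walk backwards, carrying the following value.
    following = None
    for i in range(len(head) - 1, -1, -1):
        if head[i] is None:
            head[i] = following
        else:
            following = head[i]

    # Pass 2: walk forwards, carrying the previous value.
    previous = None
    for i in range(len(head)):
        if head[i] is None:
            head[i] = previous
        else:
            previous = head[i]

    return head + list(values[n_rows:])


column = [None, 2.0, None, 4.0, None, None, None]
print(fill_following_then_previous(column, n_rows=5))
# [2.0, 2.0, 4.0, 4.0, 4.0, None, None]
```

Swapping the two passes gives the Previous-then-Following order, which produces different results for missing values in the middle of a column.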

