Column Expressions Guidance: Dynamically set Column Type

Hi all,

While working with the Column Expression node, I am struggling, maybe because of the lack of documentation, to define the column type (e.g. string, int) dynamically. I overwrite expressions > element 0 > outputType > cell-class via a flow variable, but to no avail.

I tried:

  1. the exact type declaration, like “Number (integer)” or “String”
  2. an array index (starting at 0), like 12 for “Number (integer)” or 20 for “String”
  3. a cleaned version of the type name, like “number_integer” or “string”

None of these worked. I then tried to locate the node’s source code on GitHub or Bitbucket to derive the required values, but could not find it either. I hope someone has an idea.

Many thanks in advance
Mike

Hi @mwiegand

You need to use the actual data cell class. For string it’s org.knime.core.data.def.StringCell, for integer it’s org.knime.core.data.def.IntCell.
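To make the difference between the dialog label and the expected value concrete, here is a minimal Python sketch (outside KNIME, purely illustrative). It maps type labels to fully qualified cell class names. Only the two mappings confirmed in this thread are included; the helper name `cell_class_for` is hypothetical, and other types would have to be looked up the same way.

```python
# Label shown in the node dialog -> string expected by the cell-class flow variable.
# Only the two mappings confirmed in this thread are listed here.
CELL_CLASS_BY_LABEL = {
    "String": "org.knime.core.data.def.StringCell",
    "Number (integer)": "org.knime.core.data.def.IntCell",
}


def cell_class_for(label: str) -> str:
    """Return the fully qualified cell class for a dialog label (hypothetical helper)."""
    try:
        return CELL_CLASS_BY_LABEL[label]
    except KeyError:
        # Unknown or "cleaned" labels such as "number_integer" are rejected explicitly.
        raise KeyError(f"No known cell class for label {label!r}") from None
```

The returned string is the value that would go into the string flow variable; a label variant such as “number_integer” raises a `KeyError` instead of silently producing a value the node does not understand.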

You can derive this information from the Flow Variable section by assigning a temporary field to it. How to do this:

Wherever you create this value, it will work as long as you pass it to the Column Expression node’s cell_class flow variable, which you have already found.

Hope this helps!

Awesome :slight_smile: Thanks a lot @ArjenEX. Do you happen to know a place where I can find all data types? Iteratively going through each individual type is tiring, to say the least.

PS: I’m trying to find that information in the GitHub repo with grep -rin core.data.def . … WIP

I’m not sure about all of them right now; I would need to dive into the SDK for that. This is indeed part of it: knime-core/org.knime.core/src/eclipse/org/knime/core/data/def at master · knime/knime-core · GitHub, but your approach will probably find a few more.

There we go. But judging by the Test Data Generator node (Test Data Generator – KNIME Hub), there are substantially more data types than my search turned up :confused: I’m sure I made a mistake … do you spot it?

Data Types from Test Data Generator

Boolean value
Date and Time
Duration
List (Collection of: Boolean value)
List (Collection of: Date and Time)
List (Collection of: Number (double))
List (Collection of: Number (integer))
List (Collection of: Number (long))
List (Collection of: String)
List (Collection of: String)
List (Collection of: URI)
Local Date
Local Date Time
Local Time
Number (double)
Number (double)
Number (integer)
Number (long)
Period
Set (Collection of: Boolean value)
Set (Collection of: Date and Time)
Set (Collection of: Number (double))
Set (Collection of: Number (integer))
Set (Collection of: Number (long))
Set (Collection of: String)
Set (Collection of: String)
Set (Collection of: URI)
String
String
String
URI
Zoned Date Time

Data Types extracted via grep from the KNIME GitHub repository

core.data.def.BooleanCell
core.data.def.ComplexNumberCell
core.data.def.DefaultCellIterator
core.data.def.DefaultRow
core.data.def.DefaultRowIterator
core.data.def.DefaultTable
core.data.def.DoubleCell
core.data.def.FuzzyIntervalCell
core.data.def.FuzzyNumberCell
core.data.def.IntCell
core.data.def.IntervalCell
core.data.def.JoinedRow
core.data.def.LongCell
core.data.def.StringCell
core.data.def.TimestampCell

Cheers
Mike

PS: I updated the workflow to manually extract all data types from the Column Expression node per your guidance. Interestingly, though, all retrieved values are of type org.knime.core.data.def.StringCell.

Previously it worked. Am I stupid or is this a bug?

I cannot open it due to the bash, so someone else will have to jump in. I’m not sure what you’re using, but it’s giving an error here.

You probably have to enable a repo. I assume it’s:

KNIME Community Extensions (Experimental): https://update.knime.com/community-contributions/4.6
Palladian: https://download.nodepit.com/palladian/4.6

Added Palladian as well since it’s an awesome extension!

Something completely different.

But anyway, the mistake is on your end: you forgot to select the proper output type for the column expressions, so they are indeed all set to String :wink:

:+1: Thanks for pointing that out. My mistake falls into the same category as missing semicolons or quotes xD

Btw, I managed to identify presumably all data type definitions by poking around the repo and realizing that a fuzzier search for “.data.” seems to do the job. I will fine-tune my workflow and post an update later on.
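The search described above can be approximated outside KNIME. The following is a minimal Python sketch, assuming you already have a list of Java source paths from a local checkout of knime-core. The function names are hypothetical. Keeping only files whose name ends in `Cell` is a heuristic: it drops helper classes such as DefaultRow or DefaultCellIterator, but it does not prove that a class is a usable cell type.

```python
from pathlib import PurePosixPath
from typing import Iterable, List, Optional


def qualified_name(java_path: str) -> Optional[str]:
    """Turn a source path into a dotted class name, starting at the org/knime package root."""
    path = PurePosixPath(java_path)
    parts = path.parts
    for i in range(len(parts) - 1):
        if parts[i] == "org" and parts[i + 1] == "knime":
            # Package directories plus the class name without ".java".
            return ".".join(parts[i:-1] + (path.stem,))
    return None  # no org/knime package root found in this path


def candidate_cell_classes(java_paths: Iterable[str]) -> List[str]:
    """Keep classes in a 'data' package whose file name ends in 'Cell.java' (heuristic)."""
    found = set()
    for java_path in java_paths:
        if not java_path.endswith("Cell.java"):
            continue
        name = qualified_name(java_path)
        if name is not None and ".data." in name:
            found.add(name)
    return sorted(found)
```

The input paths could come from something like `pathlib.Path(checkout).rglob("*.java")`, converted with `as_posix()`. The resulting names are only candidates and still need to be checked against what the Column Expression node accepts.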
