Singleton DataCell?

dnaki · April 1, 2021, 9:28pm

Hi. I created a custom CellFactory that can produce many new columns (~100).
Many of the cells will be populated with empty strings (instead of missing values… customer preference).
To reduce memory consumption, is it OK to create a single instance, StringCell emptyStringCell = new StringCell(""), and reuse that instance for all cells that require an empty string?

I implemented it this way and it seems to work fine, but I’m wondering whether there might be any consequences I’m not aware of, or whether this might be discouraged for some reason.

Thanks

izaychik63 · April 1, 2021, 10:33pm

The Missing Value node is designed for the purpose you described.

dnaki · April 1, 2021, 11:51pm

Thank you for the suggestion, @izaychik63. The Missing Value node would indeed be one way to address the issue. However, we would like to avoid having to always follow our custom node with a Missing Value node, so I’d still be interested in knowing the answer. Kind regards, -Don.

qqilihq · April 2, 2021, 6:03am

Hi,

Singleton instances don’t seem to be a problem. The KNIME source code does the same, e.g. org.knime.core.data.MissingCell.INSTANCE or org.knime.core.data.def.BooleanCell.TRUE.
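
For illustration, here is a minimal sketch of the reuse pattern being discussed. It does not depend on KNIME: TextCell is a hypothetical stand-in for an immutable cell type such as StringCell, and buildRow is a made-up helper, not part of any KNIME API. The point is only that every empty value can refer to the same immutable object.

```java
import java.util.ArrayList;
import java.util.List;

class SharedEmptyCellDemo {

    // Hypothetical stand-in for an immutable cell type; NOT the real KNIME StringCell.
    static final class TextCell {
        private final String value;

        TextCell(String value) {
            this.value = value;
        }

        String getStringValue() {
            return value;
        }
    }

    // One shared instance. Reuse is safe here because TextCell has no mutable state.
    static final TextCell EMPTY = new TextCell("");

    // Hypothetical helper: builds one row, mapping null or empty input to the shared cell.
    static List<TextCell> buildRow(String[] rawValues) {
        List<TextCell> row = new ArrayList<>();
        for (String raw : rawValues) {
            row.add(raw == null || raw.isEmpty() ? EMPTY : new TextCell(raw));
        }
        return row;
    }

    public static void main(String[] args) {
        List<TextCell> row = buildRow(new String[] {"a", null, "", "b"});
        // Both empty positions hold the very same object.
        System.out.println(row.get(1) == row.get(2)); // prints true
    }
}
```

With roughly 100 columns per row, this replaces one allocation per empty cell with a single shared object, provided the cell class really is immutable.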

– Philipp

MarcelW · April 2, 2021, 10:46am

Hi @dnaki,

One thing you may want to consider is that, as soon as a table has been written to disk and read back in, your singleton cell will be replaced by ordinary string cell instances. BooleanCell and MissingCell have custom (de)serialization logic in place to prevent that from happening (see here, for example).
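
The effect can be sketched with plain Java serialization. This is only an analogy: KNIME uses its own cell serializers rather than java.io serialization, and PlainCell, CanonicalCell, and roundTrip are hypothetical names invented for this example. A cell without custom deserialization comes back as a new object, while one that maps empty values back to the shared instance keeps its identity.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SingletonRoundTripDemo {

    // Hypothetical cell with default deserialization: identity is lost on read.
    static final class PlainCell implements Serializable {
        private static final long serialVersionUID = 1L;
        static final PlainCell EMPTY = new PlainCell("");
        final String value;

        PlainCell(String value) {
            this.value = value;
        }
    }

    // Hypothetical cell that maps deserialized empty values back to the shared instance.
    static final class CanonicalCell implements Serializable {
        private static final long serialVersionUID = 1L;
        static final CanonicalCell EMPTY = new CanonicalCell("");
        final String value;

        CanonicalCell(String value) {
            this.value = value;
        }

        // Called by Java serialization after the object has been read.
        private Object readResolve() {
            return value.isEmpty() ? EMPTY : this;
        }
    }

    // Writes the object to an in-memory buffer and reads it back.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip(PlainCell.EMPTY) == PlainCell.EMPTY);         // prints false
        System.out.println(roundTrip(CanonicalCell.EMPTY) == CanonicalCell.EMPTY); // prints true
    }
}
```

In both cases the value itself survives the round trip; only the memory saving from sharing is lost in the first case, which is why code should not rely on == comparisons against the singleton.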

Marcel

dnaki · April 2, 2021, 3:16pm

Ah, okay, that’s an important consideration. Thanks. -Don
