Excel Reader: Issue with data types of input columns

wurz · October 26, 2022, 9:31am

Hey everybody,
I have an issue with the configuration of the Excel Reader node and I’m not sure if it’s on me or the node. I want to describe the issue with an example:

I want to import an Excel sheet containing sample data. There is a column called “Sample Name”. The user usually enters a string-type sample name like “my_sample_name”. Now I got an Excel sheet with numeric sample names like “1”, “2”, “3” etc. For me, these are still sample names and they should be extracted as strings (as specified in the Transformation tab of the Excel Reader configuration). But the node extracts them as integers, which leads to failures downstream.

I always have a lot of trouble with the data type transformation in the Excel Reader node. The output data types change every time there is a small difference in the input, although I specifically set the data type to string and check the Enforce types box. The node seems to prioritize its automatic data type detection over the manual data type configuration by the user.

I would like to force the Excel Reader node to extract everything in string format and do the parsing on my own as a workaround. How can I do this?

Thanks in advance.

Best,
Johann

knimediger · October 26, 2022, 4:29pm

@wurz

I’m not aware of a setting for this in the Excel Reader.
As a workaround, you could use the Number To String node (documented on NodePit) after reading the xls file to convert the number column as you like.
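
For anyone who does this step in a script instead, the conversion could look roughly like the Python sketch below. It is only an illustration: the helper name `to_sample_name` is made up, and it assumes that whole numbers may arrive as floats such as `2.0` and should become "2" rather than "2.0".

```python
def to_sample_name(value):
    """Return a cell value as a string sample name (hypothetical helper)."""
    if value is None:
        return None  # keep missing values missing
    if isinstance(value, float) and value.is_integer():
        return str(int(value))  # assumption: 2.0 should read "2", not "2.0"
    return str(value)


# Example column mixing text and numeric sample names
column = ["my_sample_name", 1, 2.0, 3.5, None]
names = [to_sample_name(v) for v in column]
print(names)
```

Whether integer-valued floats should lose their ".0" depends on how the sample names were entered, so that branch may need adjusting.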

mlauber71 · October 26, 2022, 6:58pm

I once built this to force all columns from an Excel file to be imported as strings … using R:
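
The sketch below is not that R workflow. It shows the same idea (read every cell as text and do the parsing afterwards) in Python, using only the standard library. Several things here are assumptions: the function names are made up, it only handles .xlsx files, it expects the sheet to be stored as `xl/worksheets/sheet1.xml` inside the file, and numbers and dates come back exactly as Excel stored them, not as they are displayed.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by .xlsx worksheet and shared-string parts
NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def column_index(cell_ref):
    """Convert a cell reference such as 'B3' to a zero-based column index."""
    letters = re.match(r"[A-Z]+", cell_ref).group(0)
    index = 0
    for ch in letters:
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1


def read_xlsx_as_strings(path, sheet="xl/worksheets/sheet1.xml"):
    """Return the rows of one worksheet as lists of strings (None if empty)."""
    with zipfile.ZipFile(path) as archive:
        shared = []
        if "xl/sharedStrings.xml" in archive.namelist():
            root = ET.fromstring(archive.read("xl/sharedStrings.xml"))
            for item in root.iter(NS + "si"):
                shared.append("".join(t.text or "" for t in item.iter(NS + "t")))
        sheet_root = ET.fromstring(archive.read(sheet))

    rows = []
    for row in sheet_root.iter(NS + "row"):
        values = []
        for cell in row.iter(NS + "c"):
            kind = cell.get("t")
            raw = cell.find(NS + "v")
            if kind == "inlineStr":
                text = "".join(t.text or "" for t in cell.iter(NS + "t"))
            elif raw is None or raw.text is None:
                text = None
            elif kind == "s":
                text = shared[int(raw.text)]  # index into the shared strings
            else:
                text = raw.text  # numbers stay as stored text, e.g. "1"
            ref = cell.get("r")
            col = column_index(ref) if ref else len(values)
            values.extend([None] * (col - len(values)))  # pad skipped cells
            values.append(text)
        rows.append(values)
    return rows


# Usage (hypothetical file name):
# rows = read_xlsx_as_strings("samples.xlsx")
```

Because nothing is type-guessed, a “Sample Name” column holding 1, 2, 3 comes back as "1", "2", "3", and any numeric parsing can then be done explicitly on the columns that need it.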
