Execute failed: Illegal char <:> at index 4 KNIME 5.3.3

Marthael · December 4, 2024, 11:20pm

After updating KNIME to 5.3.3 I’m getting the following error for a Transfer Files node and also for some CSV Reader nodes.

Execute failed: Illegal char <:> at index 4: file:/C:/Users/angel/AppData/Local/Temp/knime_Multiple_Return_17182/knime_container_20241204_12886502894504530348.parquet

This mention a .parquet file but this workflow does not use any parquet (reader or writer).

If I try the same configuration but in a new workflow it works, but if I move the nodes from the new workflow to the original it breaks again. It’s like if the workflow is corrupted or something like that.

If I try to update the Transfer Files node, it won’t let me choose a folder. If I try the same in a new workflow, then it works.

Marthael · December 4, 2024, 11:31pm

Same error for a rule-based row filter node in another workflow:

Execute failed: Illegal char <:> at index 4: file:/C:/Users/angel/AppData/Local/Temp/knime_daily_actum_rec_17184/knime_container_20241204_862886561881133400.parquet

hotzm · December 5, 2024, 9:19am

Hi @Marthael,

are you using “KNIME Column Storage (based on Apache Parquet)” as your Row-based Table Backend implementation? It’s not actively maintained anymore and should not be used. The default Row-based Backend or Columnar Backend (which does not use Parquet), should be used.

Does using the default Row-based Backend implementation or uninstalling the extension outright solve the problem? In both cases, I think you have to reset the full workflow, though .

Don’t worry, the Parquet Reader and Writer are part of a separate extension “KNIME Extension for Big Data File Formats” and are unrelated to your error.

hotzm · December 5, 2024, 10:52am

Also, can you check the KNIME Log for some more info around the error message you are seeing here? This could help to pinpoint the problem quicker.
You can find it via “Menu” → “Show KNIME Log in File Explorer”.

mlauber71 · December 5, 2024, 12:22pm

OK interesting. Mabye there should be some sort of warning if there are workflows that still use it. From my recollection there always have been problems with the parquet based data containers. Parquet itself is a fine format but maybe not all KNIME types and functions are well suited to be stored with that.

hotzm · December 5, 2024, 12:50pm

To be clear, there’s nothing wrong with the extension in terms of correctness. At least we’re not aware of any data integrity/correctness bugs resulting from using it. It’s just that it has no material benefit that I know of over the default KNIME table storage (custom + snappy compressed) for the Row-based backend, so I advised against using it. My intention was to isolate the cause for the error message and to see if the mention of .parquet is coincidental and would also occur with .bin.snappy or .knable (if other Table backends are used).

Marthael · December 5, 2024, 5:52pm

I was using the Columnar Backend. Already tried changing to Row-based Backend but still getting the same error.

Uninstalling the KNIME Column Storage (based on Apache Parquet) solved the problem

Marthael · December 5, 2024, 5:57pm

For anyone who wants to know if you have the extension installed go to Menu > Install Extension > Click Already Installed? > and there you can check.

Thank you @hotzm

hotzm · December 5, 2024, 9:37pm

I think your particular workflow still used the Row-Backend with Parquet storage. It’s accessible via the gear icon when you view the workflow metadata.

The setting you can change via the preferences is only the default backend setting, but not the storage used in the row backend.

That’s why it worked for new workflows but not as soon as you pasted the node into the affected workflow, because that one was still using the parquet storage.

I’m glad it’s resolved, but I’ll still try to figure out what the root cause is. Thank you for responding that it was the problem

To be honest, it’s a bit confusing that the one thing is named „Columnar backend“ and the other is „Column Storage based on Parquet“, where the latter is just the on disk format for the Row-based backend. The on-disk format for the Columnar Backend is actually Arrow (another excellent data format).

system · December 12, 2024, 9:37pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.