DataTableSpec.getName() not finding a column that really is there

Here is the code:

private DataTableSpec getJoinSpec(DataTableSpec leftSpec, DataTableSpec rightSpec) {
	DataColumnSpec[] cols = leftSpec.stream()
		.filter(s -> rightSpec.containsName(s.getName()))
		.toArray(DataColumnSpec[]::new);
	return new DataTableSpec(cols);
}

The left columns are { S#,P#,SNAME }, the right columns are { COLOR,P# }. The function finds no match, so cols is an empty array and the function returns a table spec with no columns. This is wrong!

Debugging right into the Java library routines shows that the P# value really is there, but it is not found by the lookup on the DataTableSpec hash map. I can’t see why; the code is too optimised.

Is there some way the hash map can get damaged? Some oddity I haven’t thought of? Anything?

Here is a screenshot from the debugger:

image

You can see that P# in leftSpec has id=14978 and P# in rcols has id=666. Why is that?

If I read that correctly, it means that these two strings have the same value but are different objects. They will compare equals() but not ==. It means they may (will?) have different hash codes, and hash maps will not work right. Which is exactly what I’m seeing.
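A quick way to test that suspicion outside KNIME is the sketch below. It uses only plain java.util classes, and the class and method names are made up for illustration. In standard Java, String.hashCode() is computed from the characters, so two distinct String objects with equal contents should hash alike and a HashMap lookup should still succeed:

```java
import java.util.HashMap;
import java.util.Map;

public class StringIdentityCheck {
    /** Looks up a key that equals the stored key but is a different object. */
    static boolean lookupWithDistinctObject() {
        String stored = "P#";
        String probe = new String("P#"); // same characters, separate object
        Map<String, Integer> index = new HashMap<>();
        index.put(stored, 1);
        return stored != probe                          // different objects
            && stored.equals(probe)                     // equal contents
            && stored.hashCode() == probe.hashCode()    // same hash code
            && index.containsKey(probe);                // lookup still works
    }

    public static void main(String[] args) {
        System.out.println(lookupWithDistinctObject()); // prints "true"
    }
}
```

If this prints true, differing object ids alone cannot explain the failed lookup, and the two names must differ in some character that the debugger does not show.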

What on earth is going on here? How do I fix it?

Found it!

It’s a bug in the KNIME File Reader, which mishandles a CSV file encoded as UTF-8 with a BOM.

See bug report.
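For anyone hitting the same symptom, here is a minimal sketch of how a stray BOM could produce it. It assumes the BOM character (U+FEFF) ends up at the front of a column name, which is only a guess at how the bug shows up. The stripBom helper and the class name are hypothetical and not part of the KNIME API:

```java
import java.util.HashSet;
import java.util.Set;

public class BomNameCheck {
    static final char BOM = '\uFEFF'; // zero-width, so invisible in most debuggers

    /** Hypothetical helper: drops a single leading BOM character, if present. */
    static String stripBom(String name) {
        return (!name.isEmpty() && name.charAt(0) == BOM) ? name.substring(1) : name;
    }

    public static void main(String[] args) {
        String fromFile = BOM + "P#";       // displays as "P#" but has 3 chars
        Set<String> names = new HashSet<>();
        names.add("P#");

        System.out.println(fromFile.length());                  // prints 3
        System.out.println(names.contains(fromFile));           // prints false
        System.out.println(names.contains(stripBom(fromFile))); // prints true
    }
}
```

Checking length() or dumping the individual characters of a suspicious name is a cheap way to confirm this kind of problem before digging into the hash map internals.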
