Knime 4.4.1 does not work with my 4.3.4 workflow

Hi @mlauber71 and @gab1one

I am importing a CSV file that is the output from a preceding workflow.

As some feedback:

I set up a VM with a clean install of Knime 4.3.4 and ran the same flow with the same CSV file input and the output worked exactly as expected.

I want to do 2 further tests before I call foul.

  1. Set up a new VM with a clean install of Knime 4.4.1 and do the test with the same flow and the same CSV file and confirm the result. This will ensure that there is not some issue with the install on my computer that has had a number of Knime upgrades in the past 2-3 years.
  2. Rebuild the same flow from scratch on a clean 4.4.1 build. It may be that the flow built in 4.3.4 has some issue somewhere and clean 4.4.1 nodes resolve the issue.

If both of the above tests fail I will export the flow with anonymised data and hopefully, with great community and Knime Team support, figure out what is happening and what the solution might be.

Thank you for all the support.

tC/.

3 Likes

@TigerCole CSV typically is an insufficient way to store information (preserving data types), especially if the data is from another KNIME (?) workflow. KNIME .table, Parquet or ORC might be much more stable, as well as a local H2 or SQLite database. If you absolutely must have a TEXT file, ARFF could be another choice that preserves structure and data types.

3 Likes

Hi @mlauber71

Thanks for the suggestion on alternative formats to save data for forward flows. The SQLite idea sounds good because I am using it for another application. I will give it a try.

tC/.

1 Like

@TigerCole SQLite offers a lot of benefits and it does integrate with KNIME. And also with R and Python BTW.

The driver used by KNIME is not the latest one but you can change that easily:

2 Likes

Indeed, CSV is an insufficient format, in a similar way to JSON. On the other hand, CSV is widespread and non-proprietary. Using SDMX or a proprietary format is not always an option, and let us not even mention the unpredictable Excel format.

If type safety is a concern for CSV, one way to deal with it is to insert a “dummy row” just below the header row (so basically row number 1), in which each column holds a value of either abc or 123 to push that column into either string or numeric format. For date columns, using ISO date formatted strings is probably the best choice. After importing, simply filtering out that row does the trick.
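A minimal sketch of the dummy-row idea, assuming a simple comma-separated file with no quoted fields. The class `DummyRowCsv`, the enum `ColType` and both method names are hypothetical and only illustrate the technique; they are not part of KNIME.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a KNIME API): adds a type-hint row below the CSV header.
class DummyRowCsv {

    /** Column kinds the dummy row can signal; the names are illustrative only. */
    enum ColType { STRING, NUMBER }

    /** Returns the hint value for one column: "abc" for strings, "123" for numbers. */
    static String hintFor(ColType type) {
        return type == ColType.STRING ? "abc" : "123";
    }

    /** Returns a copy of the CSV lines with the dummy row inserted right after the header. */
    static List<String> withDummyRow(List<String> csvLines, List<ColType> types) {
        List<String> out = new ArrayList<>();
        if (csvLines.isEmpty()) {
            return out; // nothing to do without a header line
        }
        out.add(csvLines.get(0)); // header stays first

        List<String> hints = new ArrayList<>();
        for (ColType type : types) {
            hints.add(hintFor(type));
        }
        out.add(String.join(",", hints)); // the dummy row

        out.addAll(csvLines.subList(1, csvLines.size())); // original data rows
        return out;
    }

    public static void main(String[] args) {
        List<String> csv = List.of("name,lat", "cape town,-33.9");
        List<ColType> types = List.of(ColType.STRING, ColType.NUMBER);
        withDummyRow(csv, types).forEach(System.out::println);
    }
}
```

After the import, the first data row (the one holding abc and 123) still has to be filtered out in the workflow.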

Congratulations btw for having included more options for enforcing type safety in the readers of the recent KNIME version.

1 Like

Hi @gab1one

I found the problem… I have a java snippet in the Knime flow that takes the values in a Latitude array and Longitude array and creates a WKT polygon.

The values in the arrays are doubles with “.” as the decimal separator. For some reason, in 4.4.1, when the Java Snippet node creates the WKT polygon (which is a string), the “.” in each double is changed to a “,”, which is incorrect.

I don’t know why it does this in Knime 4.4.1; it does not happen in Knime 4.3.4. I have not changed any of the code in the snippet.

Any suggestions why this might be happening?

tC/.

Check the language and regional settings if you are on Windows; you may have “,” set as the decimal separator. It’s possible that 4.3 just used “.” by default and that 4.4 was “fixed” to use what you have as the OS setting.
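What matters for the snippet is the locale the JVM picks up, which may differ from what the OS dialog shows. A small diagnostic sketch, assuming it is run with the same Java runtime that KNIME uses; the class name `JvmLocaleReport` is made up for illustration.

```java
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Diagnostic sketch: reports the number-formatting defaults as the JVM sees them.
class JvmLocaleReport {

    /** Builds a four-line report: Java version, default locale, format locale, separator. */
    static String describe() {
        // The FORMAT category is the one used when formatting numbers and dates.
        Locale formatLocale = Locale.getDefault(Locale.Category.FORMAT);
        char decimalSeparator =
                DecimalFormatSymbols.getInstance(formatLocale).getDecimalSeparator();
        return "java.version      = " + System.getProperty("java.version") + "\n"
             + "default locale    = " + Locale.getDefault() + "\n"
             + "format locale     = " + formatLocale + "\n"
             + "decimal separator = '" + decimalSeparator + "'";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If the reported separator is “,” while the OS dialog shows “.”, the difference is on the Java side rather than in the Windows settings.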

2 Likes

Hi @kienerj

I wish it was that simple; it was the first thing that I checked. The default decimal separator for my computer is definitely “.” … checked, confirmed, and tested.

tC/.

Hello @TigerCole,

Maybe due to change in Java version with KNIME v4.4.0? (see here). Don’t know your Java code and if you can address it there (or is it related to OS settings) but can’t you replace comma with dot using String Manipulation node for example?

And please, don’t open multiple topics for same issue.

Br,
Ivan

1 Like

Hi @TigerCole , not sure why it is happening, but maybe you can force it to be “.” instead of “,”. There are a few ways to do this; I think it could even be done in Java. If you can show what you are doing in your Java snippet, someone can surely help.

Alternatively, you can always manipulate the results as @ipazin suggested via String Manipulation.

1 Like

Hi @ipazin

My apologies about the new topic. I assumed a new issue warranted a new topic.

It does not look like a change in the Java version is an issue.

I probably can change the “,” with a String Manipulation node but it is an extra step in a flow that is processing 12 million rows that I didn’t need in 4.3.4 and I would like to figure out why it is happening. Adding the extra node is like a bandaid on a bleeder.

I don’t think that the Java code in the snippet node is complicated.

// Your custom imports:
import java.util.List;
import java.util.ArrayList;
import java.io.StringWriter;
import java.util.stream.Collectors;
import java.text.NumberFormat;
import java.text.DecimalFormat;
// system variables
public class JSnippet extends AbstractJSnippet {
  // Fields for input columns
  /** Input column: "latArr" */
  public Double[] c_latArr;
  /** Input column: "lngArr" */
  public Double[] c_lngArr;

  // Fields for output columns
  /** Output column: "the_geom" */
  public String out_WKT;

// Your custom variables:


// expression start
    public void snippet() throws TypeException, ColumnException, Abort {
// Enter your code here:

//Target: MULTIPOINT ((LON1 LAT1), (LON2 LAT2), (LON3 LAT3), (LON4 LAT4), (LON5 LAT5), (LON6 LAT6))
NumberFormat fmtr = new DecimalFormat("###.###############");      
List<String> points = new ArrayList<>();
for (int idx=0;idx<c_latArr.length;idx++){
	double lat = c_latArr[idx];
	double lng = c_lngArr[idx];

	StringWriter segment = new StringWriter();

//	segment.append("(");
	segment.append(fmtr.format(lng));
	segment.append(" ");
	segment.append(fmtr.format(lat));
//	segment.append(")");

	points.add(segment.toString());
}

//add the 1st lati and longi  to the tail to close out the multipoint spec.
{
	double lat = c_latArr[0];
	double lng = c_lngArr[0];

	StringWriter segment = new StringWriter();
//	segment.append("(");
	segment.append(fmtr.format(lng));
	segment.append(" ");
	segment.append(fmtr.format(lat));
//	segment.append(")");

	points.add(segment.toString());
}

out_WKT = "POLYGON (("+points.stream().collect(Collectors.joining(", "))+"))";

// expression end

Is there anything above that may be the cause, or a change that may resolve the problem?

If an export of the node will help, I can send it with a data sample.

Thanks.

tC/.

1 Like

Hi @TigerCole , since I don’t have the proper data and column type, I just used a single record such as this:
image

And of course, since I’m not dealing with list/arrays, my code is very simple, but I’m still using the format that you used:

NumberFormat fmtr = new DecimalFormat("###.###############");      

double lat = Double.valueOf(c_column1);
double lng = Double.valueOf(c_column2);

out_WKT1 = fmtr.format(lng);
out_WKT2 = fmtr.format(lat);

out_WKT = "POLYGON (("+out_WKT1.toString()+", "+out_WKT2.toString()+"))";

The decimals stay as “.”:

Running on Knime 4.4.1.

1 Like

Hi @bruno29a … I almost want to say that this is specific to my computer but I have replicated it on other computers as well.

What happens if you change Column1 and Column 2 to double?
Would you mind trying that?

tC/.

Hi @TigerCole , sure I can do that. The reason why I did not use double originally is that I don’t know how much precision Knime will keep.

I will need to modify some of the code, so give me a few mins.

Also, you can try these on your side, just create a table with dummy values and type, and just copy my snippet.

Here you go @TigerCole .

I added 2 more columns:
image

So, that’s what Knime shows. However, the values are there.

Code modified as (basically added the last 2 lines):

NumberFormat fmtr = new DecimalFormat("###.###############");      

double lat = Double.valueOf(c_column1);
double lng = Double.valueOf(c_column2);

out_WKT1 = fmtr.format(lng);
out_WKT2 = fmtr.format(lat);

out_WKT = "POLYGON (("+out_WKT1.toString()+", "+out_WKT2.toString()+"))";

out_WKT3 = fmtr.format(c_column3);
out_WKT4 = fmtr.format(c_column4);

Results:

Hi @bruno29a

Here is my output…

“Broken” output after I did a copy and paste of your code. I think that I am going to have to speak to a developer and find a way to force the “.” when doubles are converted to strings.

It does not happen when I use a “number to string” node so it must be something in this particular snippet.
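One way to force the “.” is to hand the formatter its symbols explicitly instead of letting it use the JVM default locale. The following is a standalone sketch of the snippet's logic under that assumption; the class `WktPolygonBuilder` and the method `toPolygon` are illustrative names, and plain `double[]` arrays stand in for the KNIME input columns.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Sketch only: the polygon-building logic with a formatter pinned to Locale.ROOT.
class WktPolygonBuilder {

    /** Builds "POLYGON ((lng lat, ...))" and repeats the first point to close the ring. */
    static String toPolygon(double[] lat, double[] lng) {
        // Explicit symbols keep "." as the separator whatever the default locale is.
        DecimalFormat fmtr = new DecimalFormat(
                "###.###############", DecimalFormatSymbols.getInstance(Locale.ROOT));

        List<String> points = new ArrayList<>();
        for (int idx = 0; idx < lat.length; idx++) {
            points.add(fmtr.format(lng[idx]) + " " + fmtr.format(lat[idx]));
        }
        if (!points.isEmpty()) {
            points.add(points.get(0)); // close the ring with the first point
        }
        return "POLYGON ((" + String.join(", ", points) + "))";
    }

    public static void main(String[] args) {
        // Simulate a machine whose default locale uses "," as the decimal separator.
        Locale.setDefault(Locale.GERMANY);
        double[] lat = {-33.9, -33.8, -33.7};
        double[] lng = {18.4, 18.5, 18.6};
        System.out.println(toPolygon(lat, lng));
    }
}
```

In the Java Snippet node itself, only the line that creates `fmtr` would need to change, plus imports for `DecimalFormatSymbols` and `Locale`.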

Thanks for all your effort and help.

tC/.

1 Like

Hi @TigerCole,

You need to look at the region and locale settings of your computer. I recall a similar issue on Windows computers that had a US / UK locale but a German number format configured. Java 8 ignored this setting, while Java 11 does not.
https://support.microsoft.com/en-us/office/change-the-windows-regional-settings-to-modify-the-appearance-of-some-data-types-edf41006-f6e2-4360-bc1b-30e9e8a54989

You can force a specific locale during conversion, to ensure consistent results, by setting it explicitly in your code, like in this example from stackoverflow:

DecimalFormat df2 = new DecimalFormat("#.##");           
df2.setDecimalFormatSymbols(DecimalFormatSymbols.getInstance(Locale.ENGLISH));
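
To see the effect in isolation, here is a small self-contained comparison. It is a sketch: the class and method names are made up, and `Locale.GERMANY` is used only as a stand-in for any machine that defaults to a comma.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Sketch: the same pattern formatted with the default locale and with pinned symbols.
class PinnedDecimalFormatDemo {

    /** Formats with whatever locale the JVM currently treats as its default. */
    static String formatWithDefault(double value) {
        return new DecimalFormat("#.##").format(value);
    }

    /** Formats with English symbols set explicitly on the formatter. */
    static String formatPinned(double value) {
        DecimalFormat df2 = new DecimalFormat("#.##");
        df2.setDecimalFormatSymbols(DecimalFormatSymbols.getInstance(Locale.ENGLISH));
        return df2.format(value);
    }

    public static void main(String[] args) {
        Locale.setDefault(Locale.GERMANY); // stand-in for a comma-decimal machine
        System.out.println("default: " + formatWithDefault(1.5));
        System.out.println("pinned:  " + formatPinned(1.5));
    }
}
```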

best,
Gabriel

5 Likes

Hello,

All fine @TigerCole. But doesn’t look like a new issue. More like a cause is found that now needs to be sorted out. And I’m sure it will be :wink:

Br,
Ivan

1 Like

I agree with @gab1one , this kind of behaviour is usually because of the region setting. For example, I know that comma is used instead of dot in French.

@TigerCole I think the setDecimalFormatSymbols(DecimalFormatSymbols.getInstance(Locale.ENGLISH)) suggested by @gab1one should work as it looks like it will force the Locale to English.

3 Likes

My apologies for the delayed response to the comments. I am not sure why, but for some reason having my computer’s regional settings set to “South Africa” seemed to be the problem. I changed to UK and configured it to work for me and “voila” my workflow works.

I have spoken to our developers, who are really experienced Java developers, and they could not figure out why the South African regional setting caused a problem. It may be that Java is reading the default for the region and not picking up the changes in my local configuration.
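
One possible explanation, offered as an assumption rather than something confirmed in this thread: since Java 9 the JDK takes its default locale data from the Unicode CLDR instead of the older built-in tables that Java 8 used, and the two data sets can disagree about the separator for a given region. The sketch below prints what a given Java runtime reports for South Africa and the UK; `RegionSeparatorCheck` is an illustrative name.

```java
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Sketch: shows the decimal separator Java's own locale data reports for a region.
class RegionSeparatorCheck {

    /** Returns the decimal separator for a BCP 47 language tag such as "en-ZA". */
    static char separatorFor(String languageTag) {
        Locale locale = Locale.forLanguageTag(languageTag);
        return DecimalFormatSymbols.getInstance(locale).getDecimalSeparator();
    }

    public static void main(String[] args) {
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("en-ZA -> '" + separatorFor("en-ZA") + "'");
        System.out.println("en-GB -> '" + separatorFor("en-GB") + "'");
    }
}
```

Running it under both Java 8 and Java 11, or under Java 11 with `-Djava.locale.providers=COMPAT`, should show whether the region data itself differs between the two runtimes.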

Thanks for all the support. It is much appreciated.

tC/.

1 Like