Java Snipplet (maybe)

AbbTheresa · May 16, 2020, 6:12am

Hi there,

I am stuck with my data analysis on following problem:

I have one variable “car” and another called StartLongitude (and latitude). I now want to calculate the EndLatitude (and Latitude) for the car variables, at least for as many as possible.

An example of an extract of the variables would look like this:
Car StartLongitude
1 48.3678
1 48.377798
1 48.1009
2 48.000099
2 48.98220
2 48.876232756453
2 48.5367
2 48.729486182375

These are already sorted according to start time. That means that car 1’s second position of StartLongitude is its first position of EndLongitude. This means that for EndLongitude, it is the data of StartLongitude just one row up, expect the last value of each car.

Car StartLongitude EndLongitude
1 48.3678 48.377798
1 48.377798 48.1009
1 48.1009
2 48.000099 48.98220
2 48.98220 48.876232756453
2 48.876232756453 48.5367
2 48.5367 48.729486182375
2 48.729486182375

That means that I can calculate (in theory) all EndLongitude values, except the last end-position of all cars.

I know this is a quite specific problem. I have no idea which nodes to use, and since I am not familiar with java, would have also no idea how to put that into a java code.

I would very much appreciate it, if there is anyone that can help. Thank you so so much to all of you in advance!!

Best
Theresa

pigreco · May 16, 2020, 6:32am

Hi,
if your records are previously sorted, you can use a LagColumn Node to create a new column with same data shifted by one row.
You can repeat two times (lat and long)
Bye!

HansS · May 16, 2020, 7:10am

Hi @AbbTheresa

I don’t know much about JavaSnippets.But this is a way to solve your problem. calc_difference.knwf (31.1 KB) . It uses a Lag Column node inside a Group Loop (car). But to have the Lag Node work properly you need to sort your file in descending order.

gr. Hans

AbbTheresa · May 16, 2020, 8:21am

Hi @pigreco and @HansS,

thank you so much for your answer!

@HansS: I used your workflow and it worked perfectly, thank you!!

Maybe, you could also help me with an additional question:
I tried to calculate the endLat because the endLat column I have has lots of missings. The endLat I calculated now - thanks to your workflow - does also have some missings (1 for each car). If I now want to insert the value from the endLat column I had in the first place into the column that I have just created for the values that are missing, which node would you recommend me to take?

In cases where I don’t know how to proceed, I always think of a javasnippet, but maybe there is another way?

Thank you again so much in advance!

Beste
Theresa

quaeler · May 16, 2020, 3:20pm

You can do this with a Column Expressions node - it lets you do conditional cases without needing to produce real Java code.

calc_diff_v2.knwf (35.2 KB)

ipazin · May 18, 2020, 2:41pm

Hi there @AbbTheresa,

check here options how to simulate lag column with negative value in order to avoid using Sorter node twice (on the larger data set it can take a while): Lag Column with Negative value?

To deal with missing values I would recommend Rule Engine node. Fast and simple to use

Br,
Ivan

AbbTheresa · May 19, 2020, 8:43am

This helped! Thank you!!!

system · November 17, 2020, 8:52pm

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.