Sequence classification by deep learning to predict taxi driver path

natanzi · March 4, 2023, 1:08am

Dear KNIMErs,
I’m going to create a model that can predict which driver is responsible for a given trajectory within a set of GPS data. The data set includes the daily driving trajectories of five taxi drivers over a period of six months. Each trajectory to be classified contains all GPS records of one driver in a single day.
I use data collected from five drivers over five days, which gives 25 records (one daily trajectory per driver per day) that I feed to a neural network. I have already implemented this in PyTorch and obtained results. In parallel, I want to build the same model in KNIME and compare the results to find out which model is better.

Dataset Description

plate	longitute	latitude	time	status
4	114.10437	22.573433	2016-07-02 0:08:45	1
1	114.179665	22.558701	2016-07-02 0:08:52	1
0	114.120682	22.543751	2016-07-02 0:08:51	0
3	113.93055	22.545834	2016-07-02 0:08:55	0
4	114.102051	22.571966	2016-07-02 0:09:01	1
0	114.12072	22.543716	2016-07-02 0:09:01	0
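
To make the "one trajectory per driver per day" idea concrete, here is a minimal sketch in plain Python of how the rows above could be grouped. The function name `group_daily_trajectories` and the tuple layout are only illustrative assumptions (they follow the column order of the table), not part of my actual PyTorch code.

```python
from collections import defaultdict
from datetime import datetime


def group_daily_trajectories(rows):
    """Group (plate, lon, lat, time, status) rows by (plate, calendar day).

    Hypothetical helper: returns a dict mapping (plate, date) to a
    time-ordered list of (timestamp, lon, lat, status) points.
    """
    groups = defaultdict(list)
    for plate, lon, lat, time_str, status in rows:
        # The sample uses non-padded hours such as "0:08:45"; %H accepts them.
        ts = datetime.strptime(time_str, "%Y-%m-%d %H:%M:%S")
        groups[(plate, ts.date())].append((ts, lon, lat, status))
    # Sort each trajectory by timestamp (first tuple element).
    return {key: sorted(points) for key, points in groups.items()}


# The six sample rows from the table above.
rows = [
    (4, 114.10437, 22.573433, "2016-07-02 0:08:45", 1),
    (1, 114.179665, 22.558701, "2016-07-02 0:08:52", 1),
    (0, 114.120682, 22.543751, "2016-07-02 0:08:51", 0),
    (3, 113.93055, 22.545834, "2016-07-02 0:08:55", 0),
    (4, 114.102051, 22.571966, "2016-07-02 0:09:01", 1),
    (0, 114.12072, 22.543716, "2016-07-02 0:09:01", 0),
]
trajectories = group_daily_trajectories(rows)
```

With five drivers and five days, a dictionary built this way would hold the 25 trajectories mentioned above.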

Now I need some tips to get started on this job with KNIME. I would appreciate it if somebody could give me a clue.

jinwei_sun · March 7, 2023, 6:36pm

Hey natanzi,

That’s a great implementation. KNIME currently has more integration for Keras than for PyTorch. To get started, you can view the documentation for Keras in KNIME here. To view a list of nodes and sample workflows, you can click here.

Hope this helps.

Best,
Jinwei

natanzi · March 8, 2023, 4:08am

Dear @jinwei_sun, thanks for your reply. In order to do it, I need to complete the tasks below:

1. Merge the CSV files to create a single dataset. (Done with a KNIME loop and a file reader.)
2. Preprocess the dataset by dividing the GPS locations into grid cells. (How?)
3. Further preprocess the dataset by dividing each trajectory into two sets of sub-trajectories, seeking and service, based on the status column, where 1 means the taxi is occupied and 0 means the taxi is vacant. (How?)

Could you please give me some advice on how to implement 2 and 3?

BR,
Milad

jinwei_sun · March 8, 2023, 7:42pm

Hey Milad,

If you need to convert GPS locations to grid cells, you can use the Column to Grid node. By selecting both longitude and latitude columns, this node will create grid cells according to your specifications. However, some modifications may be necessary depending on your specific use case.
To split your dataset based on a specific column, you can use the Row Splitter node. To use this node, select the “status” column as the basis for the split and specify “0” as the pattern to match.
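
If it helps to check the node output against something simple, the two steps above can be sketched in plain Python. This is only an illustration under assumptions: the helper names, the grid origin (`lon_min`, `lat_min`) and the cell size in degrees are hypothetical and should be adapted to your data. Note also that `split_by_status` cuts a time-ordered trajectory into runs of consecutive rows with the same status, which is a little more than a plain row split on "0".

```python
import math


def to_grid_cell(lon, lat, lon_min, lat_min, cell_size_deg):
    """Map a GPS point to a (row, col) cell of a regular grid.

    lon_min, lat_min and cell_size_deg are assumed values chosen by the user.
    """
    col = math.floor((lon - lon_min) / cell_size_deg)
    row = math.floor((lat - lat_min) / cell_size_deg)
    return row, col


def split_by_status(points):
    """Split a time-ordered trajectory into seeking and service runs.

    Each point is a tuple whose last element is the status
    (1 = occupied, 0 = vacant). Returns (seeking, service), where each
    is a list of sub-trajectories (lists of consecutive points).
    """
    seeking, service = [], []
    current, current_status = [], None
    for point in points:
        status = point[-1]
        # A status change closes the current sub-trajectory.
        if current and status != current_status:
            (service if current_status == 1 else seeking).append(current)
            current = []
        current.append(point)
        current_status = status
    if current:
        (service if current_status == 1 else seeking).append(current)
    return seeking, service


# Example with an assumed grid origin of (113.0, 22.0) and 0.01-degree cells.
cell = to_grid_cell(114.10437, 22.573433, 113.0, 22.0, 0.01)

# Toy trajectory: (lon, lat, status), already ordered by time.
toy = [(114.10, 22.57, 1), (114.11, 22.57, 1), (114.12, 22.54, 0), (114.13, 22.54, 1)]
seeking, service = split_by_status(toy)
```

A smaller cell size gives a finer grid but more distinct cells for the network to handle, so it is worth trying a few values.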

Feel free to leave a reply if you have any further questions.

Best regards,
Jinwei
