Merge two datasets and have new column output

Hi there,

I’m new on KNIME so hopefully I am putting my topic in the right place.

I have two csv. datasets. The data file “[data.csv”] contains the electricity load each 30 minutes in years 2014 and 2015. Each row corresponds to the data of one day. Columns 1, 2 and 3 indicate the year, month and day of the measurements, respectively. The remaining columns show the electricity loads in 30 minutes blocks on that day. More specifically, T1 denotes the load of period 00:00-00:30, T2 00:30-01:00, and so on, and T48 is the load of the last 30 minutes of the day, i.e. period 23:30-00:00.

The file “[Holidays.csv]” contains dates (in year-month-day format) of public holidays of years 2014, 2015 and 2016.

I want to combine these two datasets in KNIME and create a new column. In this column I want to have an output of 1 when the date of the two datasets are the same ( so when one of the days is a holiday) and I want to have an output of 0 when the specific day is not a holiday.

I will add the two datasets within this topic.

I hope someone can help me with this

Two datasets.zip (22 Bytes)

Hi @StudentEt and welcome to the forum.

First of all, the zip file you posted is empty. But that’s OK, as I think the following general approach will work.

You’ll want to load in your CSVs with a File Reader node, and then use a String Manipulation node to combine the year, month, and day fields from the first file into a single column. You can then use a String to Date&Time node to convert to a KNIME date format. You’ll want to do a similar conversion for the holidays file. Then, use a Joiner node to combine the files based on the date. If you want to create a holiday flag, use a Rule Engine node.

Give that a shot, and if you have questions, feel free to post your example workflow here for additional help. :slight_smile:

3 Likes

Hi,

I was able to do everything until the joiner node. I will try to explain once more what I want to achieve here:

I have a full dataset (data) with all days of the year of 2014 and 2015, and a holiday dataset from 2014 till 2016. What I want is the full dataset (data) in one column and in a second column (lets call it holidays) a binary value (0/1) where 1 is that that specific day is a holiday and for all other days I want to have a 0 in that column. I will post my current workflow and I will try to upload the ZIP file once more. KNIME_project.knwf (17.9 KB) Two datasets.zip (48.1 KB)

Hi @StudentEt and welcome to the KNIME forum,

I think you have 2 options:

1- Merge the first 3 columns in table 1 and convert it to date column or split the date column in table 2 to have identical columns in both tables. Then join tables based on date using left join where table 1 is connected to the top port and uncheck “remove joining columns from bottom input” option. Then use the Rule Engine node with this expression where column 1 is the date column from table 2.

MISSING $column1$ => 0
TRUE => 1

2- Using Rule Engine (Dictionary) to create rules based on holidays.

Your zip file is empty.

:blush:

4 Likes

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.