Split data into separate Excel with one column, then to separate sheet with another column?

Hi, I want to split my data table into many Excel with multiple sheets in it based on two different column. For example, If I have something as follow:

image

I want to split the data first by Vendor into separate Excel, so the Excel name would be the Vendor name (1, 2, 3, …). Then for each Excel, each tab would be based on each Country group. So for Excel sheet “1”, there is a tab for Canada and a tab for USA.

I’m not sure how it can be done, and stuck at Group Loop Start. Please advice how I can make this possible. My data is around 120,000 rows with over 500 vendors so it might be a lot.

Can you upload a portion of the data for us to work with?

I’d recommend taking a look at the Solutions to Just Knime It Challenge 2 for some pointers. The challenge was to split a CSV file into several different CSV files and store them in different folders.

The thread is here. My solution is here

Your problem is similar. You’d need to use flow variables to control tab creation.

2 Likes

Hi @elsamuel. Thanks for linking me up to the Challenge.

What I’m trying to do is very similar to the challenge, but instead splitting them into Excel, then to Sheet in Excel. I’m not sure how this can be done. Could you elaborate on the flow variables to control tab creation?

If you are looking of the data you can just use the data for the challenge, and split Year into Excel, months into Excel sheet.

Try this @billknguyen:

Pay key attention to the Excel Writer settings and the way the loops are nested.

1 Like

This is exactly what I needed, thank you so much! Just for learning purpose since I am very new, how did which part in the Excel Writer indicated what the Excel name would be? I see that for Sheet it is through the Flow Variable part. And also what does Create File/Folder Variable do?

1 Like

The Create File/Folder Variables node creates a path variable using the Year variable generated by the first GroupBy node. In this workflow, this new path variable is called base_folder and contains the filepath, filename, and extension.

image

In the Excel Writer node, you can specify that you want to use this base_folder variable for the path by clicking on the v icon next to the Browse button and choosing the appropriate variable:

1 Like

@billknguyen tried recently to illustrate how this works with a screenshot

Thank you guys for the explanation!

Just another question, when splitting the Excel, I’m trying to keep the formatting of the input Excel (ex: highlighting color of certain cells, etc.). Seem like it’s not currently doing that right now. Do you know if there are any solution to this?

@billknguyen it might be best to open a new thread for this question. But there has been a thread about excel and keeping formats here. Maybe you check the suggestions there.

With the Excel Cell Updater – KNIME Community Hub there is a dedicated node for that.

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.