from 3 files copy to 1 flie and different sheet

Hello KNIME Uers,

I want from 3 Excel files copy to 1 Excel flie and different sheet, but I created the workfiow is fail.

and why my workflow file_name and sheet is same, they shoud be different.
image


Hi @Banksy,

that’s because your URL to File Path (Variable) uses the flow variable Sheet to build the file_path. The Table Row to Variable Loop Start delivers the flow variable Path as type Path not as type String. But the URL to File Path (Variable) can only handle type String not type Path.

Best regards
Andrew

1 Like

Hi @Andrew_Steel ,
Thank you for your explain.
I changed my workflow, but it not succeed. Multiple Sheet are copied into one Sheet.
I don’t know what went wrong.


Hi @Banksy, you appear to have now moved your excel writer outside of your loop, so it’s now only writing once. To write to different sheets it will need to be inside the loop again with the sheet name controlled by a variable.

2 Likes

Hi @takbb ,
Thank you for your suggest.
The workflow can write to different sheets, but everything in the sheet is the same, same as frist sheet.
Did I configurate node wrong?

Hi @Banksy , Its difficult to know if you have something wrong with your configuration without being able to see you configuration :wink: But… there probably is.

I can’t tell exactly how you are doing your processing but here is an example workflow using (almost) the same nodes that you are, and it works, so hopefully this will give some pointers.

If you are still having problems, please upload your actual workflow so we can take a look and tell you where it needs to be changed.

My workflow finds XLSX files in the data\source_sheets folder located within the workflow’s folder. It then writes to the data\destination_sheets folder. It assumes there is only one sheet per source xlsx. It uses the source xlsx filename (without path or extension) as the output sheet name. These are all written to the AllSheets.xlsx file.

I used a Java Edit Variables to derive the output sheet name, but you can use other nodes to do the same job if you require.

Hope that helps
KNIME_Copy Multi Excel to One.knwf (58.3 KB)

3 Likes

Hi @takbb,
Thanks a lot for your help.
I configurated flow your node, but still not success.
So I upload my workflow and data source and wrong result here, Could you please help me find the problem?
Thanks.
KNIME_data collection.knwf (28.2 KB)
Cabinet 现场产出记统计-202104(Data source).xlsx (762.4 KB)
DC 现场产出统计-202104(Data source).xlsx (2.7 MB)
SINAMICS 现场产出统计-202104(Data source).xlsx (2.8 MB)
Data Result-every sheet is same-it’s wrong.xlsx (32.8 KB)

1 Like

Hi @Banksy, thanks for uploading the workflow and data sources. I’m struggling with the character set, as my setup isn’t well equipped for handling non-English characters, and I got some errors while running, but I think that’s a local thing to me.

However I think I’ve found a couple of things with your config, which was probably easier for me to find as I had to change the file locations to make it run on my system,

  1. Read Excel Sheet Names node
    image
    You haven’t got the path being set by variable, so it is reading the same file each time:

  2. Excel Reader Node
    image

You have the path being set by a variable, but not the sheet to be read, so it will be reading the first sheet of the excel file each time.


You need to tick the “Select sheet with name” and then have the sheet_name set by flow variable
image

So when you’re done it should look like this:

  1. Path to String (Variable) and URL to File Path (Variable)
    image
    Having seen now how the workflow is configured, I couldn’t now see a purpose for these. The suggestion to use them seemed a good one at the time, but actually the variables created by these are not now used, so they can be removed I think. They’re harmless if you leave them there, but potentially confusing.

Lastly, I’m sure you realise that a couple of the sheets in each xlsx have the same name, so when they copy, you’ll just be getting the sheet from whichever was the last file to be processed. If that matters, you’d need to change the sheet names in each spreadsheet, or else have a process to deal with it such as adding an iteration number to the output sheet.

I put through some new test files of my own (that my system could handle) after the making the above changes , and it appears to work.

Hope that helps.

2 Likes

Hi @takbb,
Thank a lot for your patience and caeful answer.
And I am very sorry for my data sources bothers you.
I changed my workflow flow your advice, but it still have some problem.

Excel Reader node an error occurs near the end of the loop.

My Excel Reader node is different from you, your node “select sheet with name” is gray, but my node is same as usal, it’s not gray.

Data result have two sheet(first and second) is wrong.
Data Result-two sheet is wrong.xlsx (2.2 MB)

My workflow
KNIME_data collection update.knwf (30.9 KB)

Hi @Banksy ,
I tried the workflow that you posted and initially it worked on my system, so I tried it using a network share (and the use of the URL) to see if that was the cause of the problems, but that didn’t throw up specific issues. However, I did find I got some conflicting results when I was stepping through, so there are two more things that I hadn’t considered before and I have included these in a modification to your workflow.

Firstly, I realised that both the outer and inner Table Row to Variable Loop Start were creating a flow variable called Path. I wondered if that was causing some potential name clash or strange results, so I introduced a column rename so that the second one is called “FilePath” instead of “Path”. That way we can guarantee there won’t be any kind of conflict. (both paths should of course be the same value at this point anyway, but it’s best to make sure they are different variables).

The second issue was on the Sheet Selection option. At some point I found that the Sheet Selection option in the Excel Reader reverted to “Select first sheet with data” instead of “Select sheet with name”. I don’t know why it did that, but maybe that is also a potential problem. To prevent that happening, I introduced a new variable “SheetSelectionOption” using a String Manipulation (Variable) node. The setting for “Select sheet with name” is “NAME” so that ensures that this selection won’t cause trouble.

Attached is the new workflow. You will of course need to set your file locations again in both the List Files/Folders node and in Excel Writer

I have also attached the output from running it locally. My system is now working ok with the character set, so I’m not sure why it caused issues when I first ran it! This looks like it is probably ok to me, as the contained sheets are all different.

KNIME_data collection update2.knwf (33.3 KB)

Hope this now works for you, but let us know if you still have problems.

tset loop.xlsx (2.3 MB)

3 Likes

Hi @takbb ,
You are wonderful ! Did me a great favor, thank you so much !
Your wrokflow worked on my system, it’s highly successful.
But I want to ask you one more question,
How to get rid of unwanted sheets in the loop?KNIME can do it?

1 Like

hI @Banksy, yes the loop could be written to not copy unwanted sheets. You just need to decide how to tell it which are wanted and which are not.

Attached is the previous workflow with some additional parts to demonstrate one way this could be achieved.

At the top you will see a flow which should be run by itself to gather the list of all files and sheet names. Manually take the output from the Loop End of that top flow, and paste it into the Table Creator that is just below it. Modify the “Remove” column, and put a “Y” (capital letter) in that column for each sheet that you will want removed.

Then run the lower flow. (I’ve added a Delete Files/Folders node, to first delete the output xlsx file, so you’ll need to edit the filename in that) . What this lower flow now does is it does a table join using filepath and sheet name with the “selection table” that you edited in the Table Creator.

Only those sheets without “Remove=Y” will be allowed through.

This means that you have a potential for an empty table during some iterations if you are removing ALL sheets from a particular file. I have therefore added an Empty Table switch so that the inner loop only occurs if it has sheets to process.
KNIME_data collection update3.knwf (64.3 KB)

If you have a simpler way of determining which sheets to include (a simple rule perhaps), you could do away with all the new top flow, and potentially the Table Creator/Joiner, and instead in place of the Joiner put in some kind of a Rule Engine (or similar) and a Row Filter, or a Rule Based Row Filter. Provided you keep the Empty Table Switch and End If, you should be good to go!

3 Likes

Hi @takbb ,
You are amazing and a great man!!!
Your workflow so successful on my system and solved my problem.
I admire you very much. You are my master.
Thank you a lot for help me.

1 Like

Hi @Banksy, glad it worked for you and happy to help. Thank you for the lavish praise, but there are many others on this forum who know Knime to a much greater level than I do. I am still a relative newcomer to this amazing tool, but through helping others I have been able to accelerate my own learning.

2 Likes

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.