put an xml and csv file togehter

Paddymaster · May 7, 2019, 10:21am

Hello,

i’m new at Knime. I try to put 2 files together.

1 file is a CSV file with many data.
For example

Col1
A
B
C

2 file is a xml file with only column names.
For example
Alphabet

Now i want to use the column names of the second file and replace the name of the first file
For example

Alphabet
A
B
C

How can i do this ?

ipazin · May 7, 2019, 10:39am

Hi there!

Welcome to KNIME Community Forum!

There is a Insert Column Headers node which can do the trick

https://hub.knime.com/knime/nodes/Insert_Column_Header*jcLGoxXUmUY4lE6u

Try it out and if any problems feel free to ask.

Br,
Ivan

Paddymaster · May 17, 2019, 8:56am

Well, the link didn’t help me.
My Solution is in the Picture.

The Code for Java-Snippet is: return "Column " + $$ROWINDEX$$;

Well.

ipazin · May 17, 2019, 9:05am

Hi there,

the link was supposed to be helpful in a way to familiarize you with node that actually does the job. Thing that was missing was creating a dictionary table for which you found a nice approach.

Br,
Ivan

Paddymaster · May 23, 2019, 9:38am

It’s OKay . I found a solution. But I think my solution is a little bit to long. Maybe someone could do this a little bit easier

I must do this with many datafiles; I’ll try a loop and post it next time.

ipazin · May 23, 2019, 9:56am

Hi there,

feel free to post your workflow her and I’m sure someone will take a look and try to speed things up

Br,
Ivan

Paddymaster · June 7, 2019, 8:53am

Well. The Workflow above is a little bit long. Nevertheless I’ve a new Challenge. My Workflow works with 1 CSV File and matches XPATH of 1 XML File.

How Can I Do this with 20 CSV-Files and Different XPATHS on 1 XML File.

I’ll test to add this Workflow for Reading CSV-Files (https://hub.knime.com/knime/workflows/Examples/06_Control_Structures/04_Loops/02_Example_for_Reading_a_List_of_Files*UPm1nNWmew1uu2L0).

But how i can match the Filenames with the XML-Files.
The Matching is “Filename” and XPATH Childnote = Filename.

BR
Paddy

Paddymaster · June 11, 2019, 7:46am

Hi,

did someone know, how i can read multiple excel files in one final excel file?

ipazin · June 11, 2019, 9:56am

Hi Paddy,

regarding reading multiple Excel files in one Excel files here is example with .txt files but it is pretty much same as one you linked. Note that all files need to have same structure/columns otherwise you need extra data manipulation/transformation for this to work.

2019_06_11_Reading_List_of_Files_and_Concatenating.knwf (19.0 KB)

In workflow attached I have extracted file name so you can use it to join with data from XPATH or to filter that data and then insert appropriate column headers. I would try to this logic inside a loop but not sure what is your use case so maybe you can do it after concatenating all Excel files

Br,
Ivan

Paddymaster · June 11, 2019, 1:09pm

Well thx, you helped me with your template.

Current i want to loop the csv files. Its easy with Loop/CSV-Reader and URL as Input-Location.

But the really hard Problem is to get a dynamic XPATH. Like the picture above i’ve a xml reader. In this Reader i must do a dynamically loop, The file looks like

<Table>
<URL>Dateiname.csv</URL> // IMportant because dataname
<Name>Dateiname</Name>
<Description>Description</Description>
<DecimalSymbol>,</DecimalSymbol>
<DigitGroupongSymbol>.</DigitGroupingSymbol>
<VariableLength>
<VariablePrimaryKey>
<Name>Spalte1</Name>//important  because column name
<Description>wayne</Description>

…
Next uRL

Function is. Select URL with Name = Dataname
Select all VariablePrimaryKey/Name from this URL.

ipazin · June 11, 2019, 2:04pm

Hi,

glad I helped. Not an expert on XML files and XPATH node… Maybe @armingrudd can help on this one?

If you can provide some XML file example that would help in any case.

Br,
Ivan

armingrudd · June 11, 2019, 6:25pm

Hi,

Do you mean you have always 2 files, one is a CSV and the other one is an XML and the CSV file name and the column name are specified in the XML and they may change each time you run the workflow?

If the XML structure is always the same then what is the problem? You can use the same XPath every time.
If the XML structure changes, it may be possible to edit the XML file with a String Manipulation node first to produce the same XML file each time you run the workflow and then extract the names.

Please explain the situation a bit more and if possible provide a sample data so we can investigate your issue much better.

Paddymaster · June 12, 2019, 2:36pm

Well. You see in Workflow that the CSV-Files are in one List.
I’ve also an XML File. In this File are the Headers. I want to add the headers from the xml file to the CSV-Files.

There are just 1 XML-Files with all named Files and all Headers.

Thats Why the Xpath is changing. Example below

  <Table>
<URL>Dateiname.csv</URL> // IMportant because dataname
<Name>Dateiname</Name>
<Description>Description</Description>
<DecimalSymbol>,</DecimalSymbol>
<DigitGroupongSymbol>.</DigitGroupingSymbol>
<VariableLength>
<VariablePrimaryKey>
<Name>Spalte1</Name>//important  because column name
<Description>wayne</Description>
<VariablePrimaryKey>
<Name>Spalte2</Name>//important  because column name
<Description>wayne</Description>
<VariablePrimaryKey>
<Name>Spalte3</Name>//important  because column name
<Description>wayne</Description>
<VariableColumn>
<Name>Spalte4</Name>//important  because column name
<Description>wayne</Description>

  <Table>
<URL>Dateiname2.csv</URL> // IMportant because dataname
<Name>Dateiname2</Name>
<Description>Description</Description>
<DecimalSymbol>,</DecimalSymbol>
<DigitGroupongSymbol>.</DigitGroupingSymbol>
<VariableLength>
<VariablePrimaryKey>
<Name>2Spalte1</Name>//important  because column name
<Description>wayne</Description>
<VariablePrimaryKey>
<Name>2Spalte2</Name>//important  because column name
<Description>wayne</Description>
<VariablePrimaryKey>
<Name>2Spalte3</Name>//important  because column name
<Description>wayne</Description>
<VariableColumn>
<Name>2Spalte4</Name>//important  because column name
<Description>wayne</Description>

armingrudd · June 12, 2019, 4:03pm

The XML format is not correct:

You have to add closing tags.
You have the <Name> tag both for the file names and the headers in the same level.

I think these problems do not exist in your original XML file.
So I created a new XML file based on your example:
names.xml (903 Bytes)

And also 2 example CSV files:
file_headers.zip (582 Bytes)

And then built a workflow in which the CSV files are read and the column headers from the XML are inserted for each file.
And here is the workflow:

xml_headers.knwf (47.6 KB)

Here is the final Excel file where the sheet names are the file names:
book1.xlsx (4.0 KB)

ipazin · June 13, 2019, 9:11am

Hi,

nice one @armingrudd

One general note from my side. You can put all files associated with workflow in workflow directory (create folder named data for example) and use relative paths to import data from files in your workflow. This makes things easier for you (less uploading) and people who are interested in your workflow (less downloading and most importantly no need for configuring paths!)

Br,
Ivan

armingrudd · June 13, 2019, 10:03am

Yes, you are right.
I didn’t do that because:

The List Files node does not support relative paths yet. (Edited: Wrong! Check the P.S. part in this post)
Regarding the last point, I don’t want to reveal my main drive paths.

So I didn’t put files in workflow directory.
But of course I do that for my projects in general. Thanks.

ipazin · June 13, 2019, 10:18am

Hi,

It does. It is even in the node description

How are they revealed with relative paths?

We are a bit off topic…

Br,
Ivan

armingrudd · June 13, 2019, 10:34am

Untitled

Am I missing something there? (Edited: yes, I am! check the P.S.)

I meant: Since I cannot use relative path, I have to input the absolute path but I don’t want to reveal the absolute path.

P.S. I missed this:
The relative path for this node works only in this format:
knime://knime.workflow/…
Untitled

I didn’t try this format for this node. Thanks for the hint @ipazin.

Paddymaster · June 13, 2019, 1:01pm

Hi,
thank you very much for your example. I test it and its really good.
But my File-Structure is a little bit different. I updated the xml file. Its little bit difficulter.Preformatted text

updatenames.xml (1.3 KB)

And the number of variableprimarykey and variablecolumn is different

armingrudd · June 13, 2019, 1:47pm

Your XML still has several errors in closing tags. Here is the modified version:
updatenames.xml (1.4 KB)

And here is the modified version of the workflow regarding the XML structure:
xml_headers.knwf (59.4 KB)

The change here is that we have to create 2 XPath queries. One for the <VariablePrimaryKey> tags and the other one for the <VariableColumn> tags. And then we have to concatenate the columns we have. I did that using a loop construct with a Column List Loop Start node and a Loop End node.
If you have a more complicated XML structure, just add XPathes like how I did. I have created the XPath queries in the Variable Expressions node.