Using an Excel map file to replace concepts in Excel source file

Hello guys,

I have a source file that looks like this:
image

I also have a map file:
image

I would like to use the map file to change existing concepts (in the source file) into the codes from the map. Just like this:
image

I would assume I need to merge Excel Readers nodes somehow - but I don’t know how to do this properly.
image

Could you please help me? Thank you!

Here’s a simple workflow, just based on what you’ve provided, that should get you started:

image

image

image

I suspect that your real-life application might require a more complex solution.

5 Likes

Thank you, elsamuel.

So I need to put my concepts from the source file in headers, right?
Like in my example: “Total”, “Men”, “Women” are in row number 3 - so I have to remove these two rows first and then proceed with your solution?

Hi @Stanislaw , on the Excel Reader there are a couple of places where you can give it information about what parts of the worksheet to read, so you shouldn’t have to physically remove the earlier rows. hope that helps.

4 Likes

Thank you, @takbb
Is it possible to determine rows-to-exclude dynamically?
Maybe something like this: find a cell with value “Year” - take the row of this cell (if it is found), and exclude all rows before the row.

I can imagine that the source can add an extra row of empty cells or split table’s title into two rows - and if something like this happens - my whole workflow will crash.

Hi @Stanislaw

If the initial rows are just empty, then the simplest setting is on the Advanced tab
image

but in your case obviously that doesn’t help as you have additional wording above the header row.

If there is a “settings” way of achieving this, I have not found it! Maybe this should be a feature request… :wink:

The only way I can see to do this currently is something like the attached flow.

Here, an Excel Reader is set to “pre-read” the entire file, with no knowledge of column headings

I suspect you’ll already have a variable containing the file name, but if you don’t, I would have this set to CREATE a flow variable so that the name of the file opened here, can be automatically passed to the subsequent Excel Reader and so doesn’t have to be edited in two places.

This first reader also needs to be told NOT to skip empty or hidden rows, and to support changing file schemas, since the schema on one iteration may not conform to the way it reads on a subsequent occasion.
image

Having created a “counter” identifier for each row read, which is one-based rather than zero-based, a given search String, such as “Year” in your case, will be looked for in Column A of each row. The earliest occurrence of that will have its “counter” stored in a flow variable.

That flow variable needs to then be passed into the “real” Excel Reader for your workflow, in two Flow Variable settings:

Firstly, it needs to be told which row number contains the column names, and secondly it needs to be told to treat only that row onwards as data “read_from_row”.

On the “Settings” tab for this Excel Reader you will need to have ticked the “Table contains column names in row number” box, as it can then use the row number supplied by the flow variable.
Likewise you need to select the “Read only data in…” so that it can change the “Rows from” value as required

Also make sure the file name is specified using either your own Flow Variable, or the flow variable created by the “Pre reading” Excel Reader


In the example workflow, there are 3 sample excel sheets, which appear to the “pre reader” as follows:

SampleSheet1.xlsx
image

SampleSheet2.xlsx
image

SampleSheet3.xlsx
image

You can manually changed the file name in the Pre-Reader

These sheets are all read by the second Excel Reader as:
image
Find Header Row for Excel.knwf (37.6 KB)

4 Likes

Hey @elsamuel , one trick to avoid having to remove column Year and re-add it, is to add Year as part of the mapping - just map Year to itself :slight_smile:

1 Like