Extracting Multiple Cells with One Node

Is there an easier way to extract certain cells from an excel file at once instead of using multiple Cell Extractor Nodes as in the following image? There’re 150+ cells I need to extract and concatenate in one table. Should I create a cell extractor node for each cell?

@OzgunOzden maybe you can tell us more about your data and maybe even provide a sample file representing your taks without spelling any secrets?

1 Like

You can create a table with 150+ cells specification and use Chunk Loop to combine them.

@mlauber71 Basically I need to extract some certain cells from multiple excel workbooks which have the same format. But these cells cannot be filtered out as all of these cells are located in various rows and/or columns. Within the new excel workbook each line should belong to each excel workbook that the cells are extracted from. I will try to demonstrate the work I need to do in the image below.

If there’s no value in the extracted cell then it should return an empty value an put target excel table 0 or leave it empty. The important thing is that each cell in the source workbooks should be in the same column of the target table. For example A2 cell of each source file should be in the Col.1 (or A) column of the target table.

@OzgunOzden maybe you could provide the data so they represent your challenge.

Also you might specify the rule that guides this endeavour. What would happen if some values are not there. Would the first workbook be the pattern?

@mlauber71 I’m sorry I cannot provide more data than I already did in my previous reply.

@OzgunOzden with a screenshot I cannot build a sample workflow unfortunately

@mlauber71 Sorry for attaching a screenshot. I attached 4 workbooks.
I need following cells to be extracted from each file and put them into a new table.
A2, B2, A5, B5, D5, A9, B9, C9, D9, C13, D16

In this table each row should belong one excel workbook and first column should be the name of the related workbook (workbook1.xlsx, workbook2.xlsx, …)

Workbook1.xlsx (9.0 KB)
Workbook2.xlsx (9.0 KB)
Workbook3.xlsx (9.0 KB)
Workbook4.xlsx (9.0 KB)

1 Like

@mlauber71 Sorry I just corrected the cells I want to extract. They should be as followling.

A2, B2, A5, B5, D5, A9, B9, C9, D9, C13, D16

These are the cells that contain value in my example workbooks but in the original ones other cells have values too. So I cannot use “if not empty” query.

@OzgunOzden OK. And what would constitute a column in the collected data? Is it the Row Number or all values from a column?

Edit: Ok so it is that each of these cells will form a Column in that order:
A2, B2, A5, B5, D5, A9, B9, C9, D9, C13, D16

@OzgunOzden I came up with this solution where you would provide a list of the cells you want to extract and the column they should be assigned to (stored in a meta.csv - you can edit that according to your needs) - assuming the structure is the same in each file/sheet:

An few things to notice about the workflow:

Use of Regex to split the Excel style addresses into Column and Row:

When importing Excel make sure to use the Excel column names A, B, C:

image

And also do not skip empty cells. This might compromise your system:

image

And then the extracted value from the cell must be renamed to the ‘target’ column.

image

The Workbook name will function as a RowID:

The inner loop collects the column (for each RowID = Workbook). The outer loop then collects the rows.

image

image

Excel - Import selected Columns and Cells - KNIME Forum (78689).knwf (152.3 KB)

6 Likes

Hi @mlauber71 , it seems Excel has dropped an extra temp file in the data folder that causes a problem on the Read Excel Sheet Names.

image

I don’t know if you’ll run out of time to edit it, but otherwise, @OzgunOzden the simplest thing to do for the purposes of the demo flow would be to edit the filter options in the List Files/Folder config, so that this file is ignored

5 Likes

Hello @mlauber71
Thank you for your solution. I couldn’t check it yet as I was on a trip and didn’t have access to KNIME but I’ll check as soon as possible.
Thanks again for you effort.

Hello @takbb Thanks for your answer too.

1 Like

@mlauber71 Hello again, sorry once again for trying this so late.

I think this is gonna work for me. I made a couple of changes of course (my data also has strings, not only integers). But I’m stuck on one thing and couldn’t fix it by myself as I’m pretty new to KNIME.
I couldn’t filter specific sheets in the section shown below. I only need data from one sheet (only visible sheet in the workbook). There’re 10 sheets in total and 9 of them are hidden. I don’t need to extract cells from hidden sheets.

If you can help me to solve this one that would be great!

Edit: I found it. But now there’re two errors for the workbooks with multiple sheets.
When I delete other sheets (hidden ones) file name changes on the error messages to another one with multiple sheets.

image

image