How to extract information from one excel sheet based on every entry on another excel sheet

Hello all,

I’m exporting programme date from Asta powerproject. This programme can be filtered out based on codes that we have assigned to tasks. The codes can change depending on project but we’ve got standard code libraries that will be consistent on all programmes. The issue I have is I’m not sure how to extract programme data based on the code library used for that programme. My initial idea was to use a loop start and end node but that is as far as I got :man_facepalming: Wasn’t sure if that was the correct way of solving this

So the code Library has folders - in this case its called GBL Cost Code. In this Library there are three folders (architectural, mechanical and electrical) and my programme will have activities with those codes in the three sections assigned to it. I’d like to extract activities with ‘UPS’ assigned to it, perform some manipulation, then get extract ‘BMS and Controls’ and perform the same manipulation and then join each of those sections into 1 single sheet of S curves for each of those codes.

I’ve attached a dummy programme export. For ease of use, please look at ‘Code Library’, ‘Code Library Entry’ and 'Task. ’

One manipulation is S curve creation of which I’ve got a sequence of nodes for as attached if required. Trust that makes sense. Any help is greatly appreciated

Dummy Programme data.xlsx (964.3 KB)

S curve.knwf (152.0 KB)

Hello !
Complex data !! Where do you want to identify “UPS assigned to it” ? In task sheet I supposed but where ?

Best,
Joel

Hello @JPollet,

If we look at the ‘Task’ sheet, this shows all the tasks in the programme. Each of these tasks have a code assigned to it. I will take line 9 as an example (ID - 131229). The activity is ‘Preparation of UPS technical submittal.’ In the ‘Code 4’ Column (Column R on excel) there is the term UPS. This is the code that activity has been given. This can be related to the sheet ‘CodeLibraryEntry’ And ‘UPS’ comes under the ‘Electrical’ sub folder and GBL Cost Code Folder. Hoping that makes sense.

I’m hoping there is a way KNIME will run through all the GBL Cost Codes and identify which activities have a specific code.

I will try to answer but this is very complicated due to the data : maybe it would be easier to help you with some more simplier data !!
The workflow below parses codeLibrary and for each parses codeLibraryEntry and for each parses task on code 4. Then it applies yours calculation (transform on metanode for readability) and write an Excel file (with a determined name) on the data space of the workflow.
I add some Row Filter for testing.

S curve.knwf (872,1 Ko)

Hope I understand your problem and give some hint for the solution : I think it gives a way to solve what is suggested in the title of this post !
Best,
Joel

1 Like

I appreciate that, let me have a look later, as I don’t have access to KNIME at this moment. I’ve submitted simplified excel sheet, whilst keeping the headings the same and keeping the relevant information - if this helps.
KNIME Dummy.xlsx (20.9 KB)

1 Like

Thanks !
Take a look at my proposal and we can continue the discussion.

Best,
Joel

Thanks Joel!!

That is pretty much what I’m after. Is there a way to automate the filter based on all entries in a particular code library? So instead of just UPS, we filter out the data for each name entry and each time it exports a smaller file for which I can perform manipulation.

I came across a reference row filter but I don’t think that is the correct way of doing this?

Ultimately, I’d like to have S curve data for each code that has been used on a single sheet, but this is something for me to think about to see if it is the best way of presenting it.

Regards

Lahiru

Hi !
I’m glad to hear it :slight_smile:
For the filter : yes I insert Row Filter for testing. You can either suppress the node or make the node do… nothing. For example choose “wildcards” and put “*” in value (this means : take all).
But there are some problems : entry can be missing and the workflow fails for examples. I notice that columns names can be “ID…” : it would be better to use as many identifiers as possible. Alternativly, some tests can be done before calcultion etc etc.
It’s a question of data, meaning of data and quanlity.

Best regards,
Joel