How to collect several outer / inner HTML elements into the same table

Hi there KNIMErs,

I was recently thinking of how to use KNIME and the amazing Selenium Nodes to improve my social network connectivity.

Let me explain: I am part of quite a few interesting groups on LinkedIn, and every once in a while I go through the groups and connect (manually, by clicking) to people who are not yet direct connections of mine.

While doing that, I realized that I follow a specific set of rules: I click “connect” for everyone who is NOT a first-tier connection and who is a member of a specific group.

Rule sets equal KNIME (at least for me), so I started to build a workflow (see attached). However, I ran into a problem which I simply cannot solve. It seems to me that I’m not able to add to the previously collected data.

Have a look at this screenshot

I want to collect the name, connection level (1st, 2nd, 3rd) and the title in a table to further process them in KNIME.

I was able to establish a connection and log in and also collect the name, however when I start to look for the second “element” - the connection level - the resulting table only shows me the name of the first entry. But I want to find (and fetch) the connection status for the 1st element, then for the 2nd etc.

Here’s what I built so far with Selenium Nodes (you have to provide your own login details to LinkedIn if you want to use it).

Any help would be highly appreciated…


Hi there,

I just tried running the WF, but unfortunately I do not have access to the said group (OT rant: Every time I dare to log into LinkedIn every half a year, I’m totally horrified by all those screaming red notification bubbles shouting at me what I have missed and I usually run away again quickly).

Never mind, I just switched to a different example group and I got the idea, so straight to your question:

What you should do is to use a cascade of “Find Elements” nodes:

  1. On the first level you extract the wrapper element which contains one complete set of the information (in your case name, connection level, title). Visually (by looking at your screenshot) this is all the information between a pair of horizontal lines. In the case of the LinkedIn page it is represented by the <li> element. You can also get to this by clicking the “Select” button in the Find Elements node and hovering with your mouse over the browser window:

    By clicking, you will get a CSS expression pointing to one element. From there on you will need to generalize the expression to cover all of the rows. This involves a bit of trial and error and routine. I came up with ul.groups-members-list__results-list li, which works well. By entering this in your first Find Elements node, you will get a list of all the rows.

  2. Next, you will want to add several follow-up nodes which extract the desired piece of information (i.e. name, level, title) based on the result from the first Find Elements.

    For that, add further Find Elements nodes, and make sure to properly configure the “Find In” setting to not use the Web Driver, but the previously extracted Web Element instead:

    image

    This means you will be searching within the context of the previously extracted result. To get e.g. the name of the user, you can use this CSS selector: .artdeco-entity-lockup__title (fyi: the one automatically generated is too specific, and will not generalize well enough to work for the entire page)

  3. Follow the same steps as in 2. for the further information.

  4. (pro tip): In the latest Selenium Nodes release 4.7 we have turbocharged the “extraction” nodes to include the “Find Elements” functionality. This means that you can build the entire workflow with just one single “Find Elements”, and for step 2 (and 3 and 4) use just an “Extract Text” node where you enter the proper CSS selector. This means less node clutter and faster execution :slight_smile:
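To illustrate why the cascade keeps name, level and title aligned per person, here is a minimal sketch in plain Python (standard library only). It is an analogy, not the Selenium Nodes or KNIME API. The markup is a made-up, simplified stand-in for the real page: only the class names groups-members-list__results-list and artdeco-entity-lockup__title come from this thread, while `degree`, `subtitle`, `text_in` and `extract_rows` are hypothetical names chosen for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup. The classes "degree" and "subtitle"
# are assumptions for illustration, not LinkedIn's real class names.
SAMPLE = """
<ul class="groups-members-list__results-list">
  <li>
    <div class="artdeco-entity-lockup__title">Alice Example</div>
    <span class="degree">2nd</span>
    <div class="subtitle">Data Scientist</div>
  </li>
  <li>
    <div class="artdeco-entity-lockup__title">Bob Example</div>
    <div class="subtitle">Engineer</div>
  </li>
</ul>
"""


def text_in(row, class_name):
    """Search only inside `row`; return None if the piece is missing."""
    found = row.find(f".//*[@class='{class_name}']")
    if found is None or found.text is None:
        return None
    return found.text.strip()


def extract_rows(markup):
    root = ET.fromstring(markup.strip())
    # Step 1: one wrapper element per person (the <li> rows).
    wrappers = root.findall("./li")
    # Steps 2 and 3: look up each piece *within* its wrapper, so a
    # missing value becomes None instead of shifting later rows up.
    return [
        {
            "name": text_in(li, "artdeco-entity-lockup__title"),
            "level": text_in(li, "degree"),
            "title": text_in(li, "subtitle"),
        }
        for li in wrappers
    ]


for record in extract_rows(SAMPLE):
    print(record)
```

The second row has no connection level on purpose: because every lookup is scoped to its own wrapper, that row simply gets an empty value, whereas three independent page-wide searches would return lists of different lengths that no longer line up.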

Hope this helps!

– Philipp

PS: I have been working on the other WF you sent via Email and will reply to this by beginning next week.


Wow, thank you @qqilihq

I wasn’t aware of the functionality the “select” button in the find elements node / option. It’s always good to speak to the creator of the software, I think :wink:

Also having only one Find Elements node makes the workflow much leaner, great improvement!!!

Thanks again.

This workflow for LinkedIn basically grew out of the other workflow for roleplaying; it is the business version, so to say :slight_smile:

