Selenium Nodes: How to scroll within a section?

Hi KNIMErs (and webcrawlers hahaha),

I am working on a workflow to screen the job postings on LinkedIn automatically. I managed to log in and search for results, and I also managed to extract the required properties using XPath (thanks to @qqilihq).

But this specific page has a scroll bar within a section (see screenshot below). I tried to use the “Facebook approach” that is shared on the Selenium Nodes website (Execute JavaScript with scroll to bottom), but this does not seem to work when the scroll bar is within a “section”.

Here’s the link: https://www.linkedin.com/jobs/search/?keywords=Lego

And this is the part / scrollbar I want to move to the bottom:

Any suggestions? Thank you in advance!

Hi Phil,

as discussed yesterday, some input here. First straight to your question, then describing the rabbit hole into which I fell when playing with the data. Maybe it sparks some more ideas :slight_smile:

Scrolling Sections

As you highlight in the screenshot, the scrollbars on this page only apply to specific sections (technically, these are <div> elements which are vertically scrollable).

In the Facebook example, we could scroll the entire window, which works as follows (using the Execute JavaScript node):

window.scrollTo(0, document.body.scrollHeight);

In contrast to that, we first need to narrow down the scrollable element on LinkedIn (i.e. the section which shows the scroll bars). I do this with a Find Elements node where I get the element with the .jobs-search-results class. Then I pass it to Execute JavaScript, where I scroll this element (instead of the document):

/* This is the element passed from the previous Find Elements node;
 * I have selected it in the left column here. If I select a second,
 * third, … element, they would be available as arguments[1], 
 * arguments[2], … 
 */
const element = arguments[0];

/* Determine the height of the element's content (this includes the
 * part that extends beyond the currently visible area)
 */
const amountToScroll = element.scrollHeight;

/* Use the previously determined height to scroll */
element.scrollTop = amountToScroll;
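
If you also want to check whether the section has really reached its end (e.g. to decide whether another scroll is needed), something along these lines might help. This is only a minimal sketch: `scrollToBottom` and `isAtBottom` are names made up for illustration, and the one-pixel tolerance is an assumption to absorb rounding. Only the standard DOM properties `scrollTop`, `scrollHeight` and `clientHeight` are used.

```javascript
// Hypothetical helpers (not part of the Selenium Nodes); they only rely on
// the standard DOM properties scrollTop, scrollHeight and clientHeight.

// Scroll a scrollable element to the end of its content.
function scrollToBottom(element) {
  element.scrollTop = element.scrollHeight;
}

// True if the visible area has reached the end of the content.
// The 1px tolerance is an assumption to absorb sub-pixel rounding.
function isAtBottom(element) {
  return element.scrollTop + element.clientHeight >= element.scrollHeight - 1;
}
```

In the Execute JavaScript node, this could be called as `scrollToBottom(arguments[0]);` followed by `isAtBottom(arguments[0])`, with the element selected in the left column as before.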

Loading More Results

This works fine. At least the scrolling :slight_smile: Unfortunately, we will not get more results this way :frowning: Instead, we have to keep clicking a “Load more” button to load more results. So instead of what I described above, I built a loop which continuously clicks that button to load more data (the example below is rather dumb: it will just keep trying to click, even when there are no more results. But never mind, it works!)

image
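
To make the loop a bit less dumb, each iteration could report whether there was still a button to click. Below is a minimal sketch of that idea for the Execute JavaScript node. The helper name `clickLoadMoreOnce` and the selector `button.load-more` are assumptions for illustration only; the real selector has to be looked up on the page.

```javascript
// Hypothetical helper: click the "Load more" button once, if it is still there.
// `root` is anything with querySelector (e.g. `document`); `selector` is an
// assumed CSS selector for the button, to be replaced with the real one.
function clickLoadMoreOnce(root, selector) {
  const button = root.querySelector(selector);
  if (!button || button.disabled) {
    return false; // no clickable button left: the loop can stop
  }
  button.click();
  return true; // clicked: wait for the new results, then try again
}
```

It could be called once per loop iteration, e.g. as `clickLoadMoreOnce(document, "button.load-more")`. How the `true`/`false` result is then used to end the loop depends on how the workflow is built.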

Adding some more Extract Attribute and Extract Text nodes, and some post processing using a String Manipulation (Multi Column) node I end up with a nicely extracted and structured job listings table with Lego jobs:

(Bonus) So, Where Are All These Jobs?

Looking at the “Location” column, I thought that this would be a great use case for some spatial analysis. So let’s show the job offers on a map! The Palladian Location Extractor allows us to transform the location strings into latitude/longitude coordinates. It even has some magic (a.k.a. “disambiguation”) built in for properly detecting whether “Paris” means Paris in France or Paris in Texas. A while ago I even wrote a dissertation about this topic, but this is yet another rabbit hole which is fortunately closed now :stuck_out_tongue:

To use the Location Extractor, it’s necessary to set up a “Location Source” in the preferences (this is the database which is used for looking up the location data). You can use “Geonames”, which allows 30,000 requests per day for free. (More information is shown in the node documentation, and we even offer a paid alternative for people who don’t feel comfortable sending their data to a public web service.)

After running this, and doing some filtering to only get the city names (e.g. not the regions or countries), I can then visualize the companies offering Lego jobs on the map.

I have shared the workflow on my public NodePit space (it can definitely still be improved, consider it a PoC for now :slight_smile: ):

Have a good weekend,
Philipp


O-M-G!!! :flushed:

Talk about rabbit holes


That is fantastic!!! Thank you so much, @qqilihq. I can see so many interesting ideas one could realize by applying the spatial analysis you added to some of the jobs:

  • filter by location, to see where certain jobs appear (on the map image you shared, you can see a lot of the jobs are in the area of “Rheinland” in Germany)
  • do competitor research (my initial idea) to see where your competitors are hiring (or in my case, as a procurement professional, do this for suppliers)
  • do some text mining with the job details link to get a heat map / word cloud of the skills that the companies one is researching are looking for

etc. etc.

I will soon post an example workflow here using your fantastic Selenium and Palladian nodes (they are AWESOME!)…
