Hi,
I am trying to collect some data from LinkedIn for further analysis, but LinkedIn requires a login. Any clue how I can set up a workflow in KNIME with a login procedure?
Thanks!
Ralph
Hello @Ralph2605,
what data are you collecting from LinkedIn? Here is a workflow that gets data from LinkedIn using the GET Request node. In this case authentication is not needed, but there is a tab where you can enter your credentials:
Additionally, here is a bit more about crawling LinkedIn data. It does not seem to be a trivial task.
Br,
Ivan
Hi,
I'm trying to use GET REQUEST to collect data from LinkedIn.
I used a simple request: https://api.linkedin.com/v2/ugcPosts?q=authors&authors=List(urn%3Ali%3Aorganization%3A1415)&start=0&count=100 but it didn't work. In my request header I put Authorization (with the value of my Bearer token) and X-Restli-Protocol-Version.
Is something missing?
Regards,
Aymen
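For comparison, here is a minimal Python sketch that builds the same URL and headers outside of KNIME, so the values can be checked character by character against what the GET Request node is configured to send. It is only a sketch under assumptions: the function name `build_ugc_posts_request` is hypothetical, the `Bearer ` prefix on the Authorization value and the `2.0.0` protocol version are assumptions not stated in the post, and nothing is sent over the network.

```python
from urllib.parse import quote

API_BASE = "https://api.linkedin.com/v2/ugcPosts"


def build_ugc_posts_request(token, org_id, start=0, count=100):
    """Return (url, headers) for the ugcPosts query from the post above.

    Hypothetical helper: it only builds strings and sends nothing.
    """
    # Percent-encode the URN (":" becomes "%3A") but leave "List(" and ")" as they are,
    # which matches the URL quoted in the post.
    urn = quote(f"urn:li:organization:{org_id}", safe="")
    url = f"{API_BASE}?q=authors&authors=List({urn})&start={start}&count={count}"
    headers = {
        # Assumption: the header value is the word "Bearer", a space, then the token.
        "Authorization": f"Bearer {token}",
        # Assumption: 2.0.0 is the protocol version in use.
        "X-Restli-Protocol-Version": "2.0.0",
    }
    return url, headers


url, headers = build_ugc_posts_request("MY_TOKEN", 1415)
print(url)
print(headers)
```

If the node re-encodes the URL (for example turning `%3A` into `%253A`), the request that reaches the server would differ from the one printed here, so that is one thing worth comparing.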
Hello @Aymen,
I don't think I can help much, as I have never tried collecting data from LinkedIn. However, what response, error code, and status are you getting from the GET Request node?
Br,
Ivan
Thank you, I hope you can help me.
ERROR GET Request 3:2 Execute failed: Wrong status: 403 Forbidden
Hello @Aymen,
that means access denied. See here what 403 means and what to check:
Also see some comments/solutions here:
Br,
Ivan
I already saw these websites. I checked everything, but I still have the same problem.
I can do the same request with Postman and it works...
Is there something that I should configure in the node?
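When a request works in Postman but fails elsewhere with 403, the response body often says more than the status line. The following is a rough Python sketch of reading that body. The helper names `describe_http_error` and `try_request` are hypothetical, and no particular fields in the error body are assumed; it simply returns whatever the server sent.

```python
import json
import urllib.error
import urllib.request


def describe_http_error(err):
    """Return the status, reason and decoded body of an HTTPError."""
    raw = err.read().decode("utf8", errors="replace")
    try:
        body = json.loads(raw)  # many APIs send a JSON error document
    except ValueError:
        body = raw              # otherwise keep the plain text
    return {"status": err.code, "reason": err.reason, "body": body}


def try_request(url, headers):
    """Send a GET request and return either the response or the error details."""
    req = urllib.request.Request(url, headers=headers, method="GET")
    try:
        with urllib.request.urlopen(req) as resp:
            return {"status": resp.status, "body": resp.read().decode("utf8")}
    except urllib.error.HTTPError as err:
        return describe_http_error(err)
```

Calling `try_request(url, headers)` with the same URL and headers used in Postman would show whether the 403 also occurs outside the node, and with what message.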
Hello @Aymen,
unfortunately I can't help you more. Hopefully someone else knows more and will join the topic. Also, maybe you'll get help by following this suggestion from the docs above:
"If you continue to see the error, reach out to your partner technical support channel or https://developer.linkedin.com/support."
Br,
Ivan
Hi,
the strange thing is that other API requests work... (for example, get Author)
Howdy partner, quick question: what are you trying to analyze, and what's the link? Then let's go from there. If you need to use Selenium to walk through a login process, be sure to check out the Selenium nodes. IMO you can do a lot of that stuff with 2-5 lines of Python, and I think it's important to keep those things where they are easiest. Maybe what you're doing will require this direction to be explored further.
There are two different kinds of HTML requests.
Sometimes the requests library works or, in KNIME, the GET Request node works too.
Other times you have to wait for JavaScript to load, you're hitting the WRONG URL, or a bunch of other smart stuff I don't really understand.
The next layer of scraping could be Selenium, which has a really awesome HTML grab built into the library. So maybe that's what you're trying to do, and because that extra pause and wait to get the information is necessary, the GET request you're trying in KNIME is coughing up a nope.
You may want to explore the Selenium method of grabbing ALL the HTML. I would suggest not spending time on BeautifulSoup or the requests library if the GET request isn't getting you the HTML you want.
Example:
The first part is the requests library, requests.get(etc). I keep it simple and dump the result into a text file on my desktop, because I know I can parse over a directory of files with KNIME in various ways.
Below that is driver.page_source, which is the Selenium library grabbing the same data and making a text file too. Notice how I'm not stressing about putting this in a "super cool database", which I hope will make this more adoptable if you decide to go this route.
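import random
import time
import requests
# url3 (the page URL), x (the keyword) and driver (a Selenium webdriver) are set up earlier in the script
# plain HTTP fetch with the requests library, saved to a timestamped text file
r1 = requests.get(url3)
h1 = r1.text
t1 = time.strftime("%Y%m%d-%H%M%S")
with open(r"C:\Users\tyler\Desktop\scrape\keyword-" + t1 + "-" + x + ".txt", "w", encoding="utf8") as f1:
    f1.write(h1)
# same idea with Selenium: grab the HTML of the page the browser currently has loaded
y2 = driver.page_source
time.sleep(random.randint(9, 10))  # random pause of 9 or 10 seconds
t2 = time.strftime("%Y%m%d-%H%M%S")
with open(r"C:\Users\tyler\Desktop\scrape\keyword2-" + t2 + "-" + x + ".txt", "w", encoding="utf8") as f2:
    f2.write(y2)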
There are several really good Selenium KNIME users in this forum, and I'm sure you will find their posts about using Selenium in KNIME. I personally don't do any of that in KNIME because I think it's easy to do in Python, and I can now use KNIME as a tool to maybe... write my Python vs. doing it myself, which helps me orchestrate these pieces at a deeper level.
Good luck, hope this helps.
T
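As a follow-up to the idea of dumping pages into text files and parsing the folder later, here is a small Python sketch of that second step using only the standard library. The names `TitleParser` and `titles_in_folder` are hypothetical, and pulling out the `<title>` text is just a stand-in for whatever you actually want to extract from the saved HTML.

```python
from html.parser import HTMLParser
from pathlib import Path


class TitleParser(HTMLParser):
    """Collects the text inside the <title> element of one HTML document."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def titles_in_folder(folder):
    """Map each saved .txt file name in the folder to the page title found in it."""
    results = {}
    for path in sorted(Path(folder).glob("*.txt")):
        parser = TitleParser()
        parser.feed(path.read_text(encoding="utf8"))
        results[path.name] = parser.title.strip()
    return results
```

Pointing `titles_in_folder` at the scrape folder would give one row per saved page, which could then be written out as a table for KNIME to read.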