Download CSV file from the web (markets.businessinsider.com)

Hello everyone,

The company I work for has started using KNIME and is exploring the possibility of getting KNIME Server to automate many processes in the organization. I received the following request:

I need to create a workflow that downloads a CSV file containing the historical price of an economic index (MSCI World Index) from businessinsider.com.

I’ve retrieved that type of data from Yahoo Finance in the past via a web query, but this site does not let me do the same.

It seems there’s a way to do it via the “Webpage Retriever” and “XPath” nodes, but I am not knowledgeable in HTML. I checked all the examples I could find on this forum, but none of them are precise enough to compensate for my lack of knowledge in the matter.

Here is the information:

Website: markets.businessinsider.com

Link to the website page: MSCI World INDEX TODAY | 990100 LIVE TICKER | MSCI World QUOTE & CHART | Markets Insider (businessinsider.com)

I am confident that one example with that precise website would enable me to reproduce the process.

The data I am looking to retrieve, in any format, lies in this table:

Thanks in advance for reading.

Hi @Thierry_Collins , and welcome to the KNIME Community.

This would be a huge amount of work. I am able to identify the table in the HTML, but the work goes beyond this.

For example, you would also need to convert the HTML table into a data table. Also, I’m not sure the data is actually in the HTML. It is probably generated dynamically via JavaScript, since you can sort the table and change the date range, and the results update without reloading the page.

And you would face the same issue regardless of whether you are using KNIME or any other webpage retriever application.

You should check with them whether there is an API on their side that you can use to retrieve the data.

Hi,

If you use a page inspector like Chrome Developer Tools (Ctrl+Shift+I), you can see that the data is being sent to the page as a JSON feed.

The address is something like https://markets.businessinsider.com/Ajax/Chart_GetChartData?instrumentType=Index&tkData=189,169323,189,333&from=20210726&to=20220926 (open this link to see the JSON)
Using this information you can easily build a flow in KNIME to GET the endpoint and parse the JSON. I’ve attached a workflow to show you how.

Business Insider Markets – KNIME Hub
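
For readers who want to see the same GET-and-parse idea outside KNIME, here is a minimal Python sketch. It only uses the URL quoted above. The structure of the JSON response is not described in this thread, so the sketch assumes it is a list of flat objects and simply uses whatever keys appear; the function names and the User-Agent header are my own assumptions, not anything the site documents.

```python
import csv
import io
import json
from datetime import date
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint taken from the URL quoted in the post above.
BASE_URL = "https://markets.businessinsider.com/Ajax/Chart_GetChartData"


def build_chart_url(tk_data, start, end, instrument_type="Index"):
    """Build the endpoint URL; dates are formatted as YYYYMMDD as in the post."""
    params = {
        "instrumentType": instrument_type,
        "tkData": tk_data,
        "from": start.strftime("%Y%m%d"),
        "to": end.strftime("%Y%m%d"),
    }
    # safe="," keeps the commas in tkData unescaped, matching the quoted link.
    return BASE_URL + "?" + urlencode(params, safe=",")


def fetch_json(url):
    """GET the URL and decode the body as JSON (needs network access)."""
    # The User-Agent value is an assumption; the site may or may not require one.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def records_to_csv(records):
    """Turn a list of flat JSON objects into CSV text, using the keys found."""
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Rebuild the URL from the post (no network call is made here).
url = build_chart_url("189,169323,189,333", date(2021, 7, 26), date(2022, 9, 26))
print(url)
```

To actually download the data you would call `records_to_csv(fetch_json(url))` and write the result to a file. If the response turns out to be nested rather than a flat list, the conversion step would need adjusting.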

@Thierry_Collins , @bobpeers , @bruno29a

Before downloading any data you must check the terms and conditions of the website.

ALL data is copyrighted, and you will need a license from the data owner before you can use it.

In the case of the website that you are accessing, the data owner is very clear in their terms and conditions that you cannot download and use the data.

@Thierry_Collins , you may want to seek out a provider of the data you need from an organisation that will license it to you. There will probably be a fee.

Apologies for seeming like a killjoy, but better to be aware of intellectual property rights than in court and paying a hefty fine.

@Thierry_Collins You could also try to use the MSCI API listed here, https://developer.msci.com/apis/index-api

No idea if it’s free but you could register to find out.

@DiaAzul Good point. I’d argue that you may download for personal use since they actually provide a Download button on the page he links to.

Hi @DiaAzul , there is a Download button which allows you to download the data as CSV (I used it when I was looking for an alternative, and was also trying to get the link from the button).

So, it should be OK to download.

It worked! Thank you all for this valuable advice.
