I'm trying to use the File Reader node to load those series into KNIME, but I have no clue how to turn the link into something the File Reader can handle.
After 2 hours of reading forum topics and surfing the web, I hope one of you guys
has a hint for me.
First of all: Thank you for that workflow. Great visualisation of the COVID-19 pandemic.
One remark on this workflow: I have seen that for some reason the last day is dropped from the data in the node “COVID-19 overview” (done by the row filter documented as “remove last day”). So as of today, yesterday's data is not visualised. Any clue why this is done?
In addition to the visualisation, I would like to compare countries with regard to their growth rate. So I would like to align the data to the (typically different) starting date of each country. Would it be possible to add that feature?
First Question: …So as of today yesterday's data is not visualized. Any clue why this is done?
This API updates every hour by checking for new rows in yet another source: a GitHub repository maintained by Johns Hopkins University. It often happens that the last day in the dataset is missing new cases / deaths / recovered for some countries but not for others. When this is the case, I think the visualization of the last day is misleading, as it shows only partial data, so we preferred to take it out. If you also want to display the most recent day, feel free to remove the row filter.
Second Question: …So I would like to align the data to (typically different) starting date of these countries. Would it be possible to add that feature?
We are already doing this in the last line plot in the last view. Check out this Twitter post to see what it looks like:
We use the first day with at least 20 cases as the start date. If you want to change that, find the row filter in the last component, on top of the Line Plot node “line shifted”, and change “20” to “1”.
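For anyone who wants to see the alignment idea outside of KNIME, here is a minimal sketch in plain Python. It is not the code of the component, just an illustration of the same rule: each country's series starts at its first day with at least `threshold` cases. The function name `align_to_threshold` and the case counts are made up for the example.

```python
def align_to_threshold(series_by_country, threshold=20):
    """Shift each series so that index 0 is the first day with >= threshold cases.

    series_by_country maps a country name to a list of cumulative case counts,
    one value per calendar day. Countries that never reach the threshold are
    left out of the result.
    """
    aligned = {}
    for country, counts in series_by_country.items():
        # Index of the first day that reaches the threshold, or None.
        start = next((i for i, c in enumerate(counts) if c >= threshold), None)
        if start is not None:
            aligned[country] = counts[start:]
    return aligned


# Made-up cumulative case counts, purely illustrative.
example = {
    "A": [0, 3, 25, 60, 140],
    "B": [0, 0, 0, 22, 45],
    "C": [1, 2, 5, 9, 12],  # never reaches 20 cases
}

aligned = align_to_threshold(example, threshold=20)
for country, counts in aligned.items():
    print(country, counts)
```

After the shift, position 0 of every list means "day 0 since the threshold was reached", so the curves can be drawn on a common x-axis. Setting `threshold=1` corresponds to changing “20” to “1” in the row filter.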
@paolotamag Thank you for your fast support in adding this feature to the visualisation.
Now it is much easier to see the pandemic behaviour in the different countries.
But getting requests resolved creates new ideas: what about normalising the cases by country size (population)? Think of China (1.4 billion people) vs. Italy (60 million people). That should make a difference in the number of total cases, but unfortunately it does not.
Hi @knimediger, I did not add anything, it was already there.
Regarding normalization by country population, I do agree it would make things more proportionate.
Feel free to:
1. Download the workflow
2. Download a table from the internet with the population of each country (I do not think it’s provided by the sources I have been using, but you can easily find a CSV via Google)
3. Blend this new source with the data rows in the workflow using a Joiner node on Country right before the first Component
4. Divide each double column by the value of the new column “Population” using this Math Formula (Multi Column)
5. Make sure the new table header is unchanged
6. Visualize the new normalized data in the line plot by simply re-executing the components
7. Reshare the enhanced workflow (mentioning Normalization in the title) on your KNIME Hub space and give us the link here!
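To make the join and divide steps above concrete, here is a rough sketch of the same logic in plain Python rather than KNIME nodes. Everything in it is an assumption for illustration: the function name `normalize_by_population`, the columns “Date”, “Confirmed” and “Deaths”, the country names and all numbers are made up. The `per` factor (for example cases per 100,000 inhabitants) is an optional extra that is not part of the steps above.

```python
def normalize_by_population(rows, population, per=1.0):
    """Inner-join rows with a population lookup and scale every float column.

    rows       -- list of dicts, each with a "Country" key
    population -- dict mapping country to number of inhabitants
    per        -- optional scale factor, e.g. 100_000 for "per 100k"
    Column names are kept unchanged; non-float columns are copied as-is.
    """
    result = []
    for row in rows:
        pop = population.get(row["Country"])
        if pop is None:
            continue  # no matching population row, as in an inner join
        result.append({
            key: (value * per / pop if isinstance(value, float) else value)
            for key, value in row.items()
        })
    return result


# Made-up rows and populations, purely illustrative.
rows = [
    {"Country": "X", "Date": "2020-03-20", "Confirmed": 500.0, "Deaths": 10.0},
    {"Country": "Y", "Date": "2020-03-20", "Confirmed": 500.0, "Deaths": 5.0},
    {"Country": "Z", "Date": "2020-03-20", "Confirmed": 7.0, "Deaths": 0.0},
]
population = {"X": 1_000_000, "Y": 10_000_000}  # "Z" is missing on purpose

per_100k = normalize_by_population(rows, population, per=100_000)
for row in per_100k:
    print(row)
```

Both example countries report the same absolute number of cases, but after normalization the smaller one shows a rate ten times higher. The row for “Z” disappears because it has no match, which is exactly the problem that shows up when country names differ between the two tables.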
My first problem is that I was not able to understand the license this data is published under. So I’m not sure whether it’s allowed to use this data for this purpose.
Nevertheless I tried to follow your instructions. Thank you very much for guiding a novice.
But because of the data, I’m already struggling in the first step, joining the two tables.
The UN data uses country names which do not appear in the same way in your ISO table (e.g. the UN table uses just “Afghanistan” instead of “Afghanistan, Islamic Republic of”). I’m sure there is a quite easy way to manage this issue, but I’m already at my wit’s end.
Regarding the license problem of this dataset with country population, take a look here.
"Terms of Use: All data and metadata provided on UNdata’s website are available free of charge and may be copied freely, duplicated and further distributed provided that UNdata is cited as the reference."
Just add the link to UNdata in the workflow metadata using the description panel in the KNIME Analytics Platform. To learn how to do that, go here and scroll to “Workflow Metadata Editor”. This way you reference them on the Hub page and you are on the safe side.
Regarding the joining operation… I had the same issue to find the continent names for each country. Country names can differ quite a bit. Do not use country names then, use their codes made of 2 letters! “IT” stands for “Italy”! Join on such code columns and find another population by country table with such codes if yours does not have any!
The UN uses 3-digit numerical country codes, known as M49.
These can be converted into readable data using a table that can be downloaded here: https://unstats.un.org/unsd/methodology/m49/overview/
That table also contains the ISO alpha-3 codes, but NOT the alpha-2 codes that most people think of as country codes due to their use in domain extensions.
Edit: I see now that a similar table is already used in the workflow, this one stemming from datahub.io.
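As an illustration of joining on codes instead of names, here is a small sketch in plain Python that translates M49 codes to ISO alpha-3 codes before the join. The three code pairs (004/AFG, 156/CHN, 380/ITA) are the real M49 and alpha-3 codes; the function name `population_by_alpha3`, the row layout, and the rounded population figures (taken from the numbers mentioned earlier in this thread) are assumptions for the example.

```python
# Small excerpt of an M49 -> ISO alpha-3 lookup; the full table comes from
# the UN M49 overview page linked above.
M49_TO_ALPHA3 = {"004": "AFG", "156": "CHN", "380": "ITA"}


def population_by_alpha3(un_rows, m49_to_alpha3):
    """Re-key UN population rows from M49 codes to ISO alpha-3 codes.

    M49 codes are 3 digits with leading zeros. A CSV reader may parse them
    as integers (4 instead of "004"), so they are zero-padded again here.
    Rows with an unknown code are skipped.
    """
    result = {}
    for row in un_rows:
        m49 = str(row["m49"]).zfill(3)
        alpha3 = m49_to_alpha3.get(m49)
        if alpha3 is not None:
            result[alpha3] = row["population"]
    return result


# Hypothetical UN-style rows; populations are the rounded figures from the thread.
un_rows = [
    {"m49": 156, "name": "China", "population": 1_400_000_000},
    {"m49": 380, "name": "Italy", "population": 60_000_000},
]

pop = population_by_alpha3(un_rows, M49_TO_ALPHA3)
print(pop)
```

With the population keyed by alpha-3 code, the join no longer depends on whether a table says “Afghanistan” or “Afghanistan, Islamic Republic of”. In KNIME the same idea would be two Joiner nodes: first UN data to the code table on M49, then the result to the workflow data on the ISO code column.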