Another Wednesday, another Just KNIME It! challenge!
Did you know that August 2nd is International Beer Day? Let’s celebrate the history of one of the world’s oldest beverages with a challenge on REST APIs and JSON processing!
Here is the challenge. Let’s use this thread to post our solutions to it, which should be uploaded to your public KNIME Hub spaces with tag JKISeason3-12.
Need help with tags? To add tag JKISeason3-12 to your workflow, go to the description panel in KNIME Analytics Platform, click the pencil to edit it, and you will see the option for adding tags right there. Let us know if you have any problems!
Here’s my solution. It produces a list of all of the breweries. There’s also a Choropleth map of the USA. I have a mind block about JSON so I converted it to XML so I could parse the data with XPath. I added another output which allows selection of brewpub, micro or large brewery types.
I wrapped it into a component that allows to configure:
Items per page to be retrieved
optional delay between iterations (might be necessary to avoid running over rate limits although in my tests it also worked with 0 delay )
My approach was to use the meta data endpoint to get the total number of breweries.
Based on that and the user input of breweries per page I calculate the number of requests that are necessary to fetch all items and then generate a dynamic URL incl. page and items per page to get the data and turn it from JSON into a table.
Here’s my attempt for this challenge and let me break it down for you.
1. Set the base url and decide how many results you want per pages in each iteration
2. Create a custom url with page. Here I used https://api.openbrewerydb.org/v1/breweries?page=1&per_page=200 with the page number changing wit h each iteration and the “per_page” value set using flow variables ( Note: Fun fact! It turns out pages 0 and 1 served up the same results. So, I spiced things up a bit by adding +1 in the string manipulation to keep the results unique. )
3. Create a custom loop to interact with the JSON response and check if it contains any results or is just an empty array “[ ]”
4. Use the Variable Condition Loop End node to stop the loop (where the json is only contains “[ ]”
5. Delete the unused column “status” and “content type”
6. Clean the data by changing the missing value
7. Show the result in Table View or connect to Excel Writer/CSV Writer node to save the data
And voila! You’ve got the dataset. This dataset contains a information that’s really interesting to explore, whether through dashboard visualizations, statistical analysis, or other insights!