String to JSON help to fix the error - code 83

Hi; Can you help to fix the error in the picture? Workflow is attached.
KNIME_project2xx.knwf (619.1 KB)

Hi @umutcankurt , looking at your screenshot, I’m not sure what you are converting to JSON. The String to JSON node will simply convert a JSON string to JSON, that is you need to have a valid JSON string for it to work.

From what I can see from the screenshot, the string that you are trying to convert to JSON is just an HTML document. That’s not a valid JSON string. You string should basically be in the format of:
{"variable" : "value"}

Of course, you can have more data than just this, such as:
{"variable_1" : "value 1", "variable_2" : "value_2", etc}

Or with multi-levels. The bottom line is that the string you are trying to convert to JSON should be a valid JSON string.

2 Likes

Hi @umutcankurt

Your 3 first string cells in “JSON” format are fine. However the 4th is faulty. You can check it by doing a copy and paste of a cell content on the following web page

to verify and even better “prettify” your cell content in JSON format :wink:

For instance, the first one is fine and when copied and pasted gives:

but the 4th row is faulty and it tells you where:

This may help you to see why they are faulty and maybe implement an automatic correction.

Hope this helps.

Best

Ael

3 Likes

Hi; @aworker and @bruno29a
Thanks for the answers, but since I will try to get regular data every day, it is not possible for me to check and edit manually. I need to find a more precise solution.

I don’t quite understand why some are right and some are wrong. data from the same source as a result

1 Like

Hi @umutcankurt

Just to partially get out of the problem, you could handle every row separately using a -chunk loop-, feeding the -string to JSON- node one row by one, and checking with a -Try and Catch- if it worked or not. At least you will get done those that work and thus you can treat the others separately later. This is just an idea to get out of the local minima and progress :wink:

Best wishes,

Ael

3 Likes

Tell me about it! If you do not have control on the input data, then welcome to the world of ingesting data :smiley:

You cannot assume that the data that you get will always be good or well formatted. You have to assume that you will get the kind of issues you are facing. Obviously, you cannot know in advance what type of issues you can face, but you can decide what to do. For example, do you skip it and continue? Or do you want it to fail because you want to fix it and re-ingest?

You will keep adding rules as you encounter these issues.

2 Likes

Hi @umutcankurt , I must have been writing while @aworker was posting, but we’re basically saying the same thing :slight_smile:

2 Likes

@bruno29a and @aworker Your enlightening ideas have been important details for me to find solutions and for my behavior in the problems I will encounter later. :slight_smile:

The important thing for me is to get the data in the correct link and skip the incorrect ones. Because I want to get new data that I can access that is up to date every day.

Can you edit the workflow I added so that it skips the incorrect jsons? That would be great for me.

json url example
https://sam.gov/api/prod/sgs/v1/search/?random=1642699126203&index=opp&page=0&sort=-modifiedDate&size=50&mode=search&responseType=json&is_active=true&notice_type=o

more…
opp&page=1
opp&page=2
opp&page=3

current workflow ( I would appreciate if you could edit the error you mentioned so that it can bypass )
KNIME_project2xx.knwf (619.1 KB)

Hi @umutcankurt , so that URL returns a JSON string. You may not need the HTTP Retriever, nor the HTML Parser, nor the XPath. You can work directly with what the GET Request returns.

From what I can see, that JSON String is already a valid JSON string. Can you plug your String to JSON directly to your GET Request node?

EDIT: I did a test on my side, so the GET Request returns it as binary. Just need to convert binary to string first then, then do String to JSON.

Here’s a quick test using your URL:
image

As you can see, it ran without any issue, and here’s the JSON column:

Here’s the workflow: String to JSON help to fix the error.knwf (8.9 KB)

2 Likes

Hi @umutcankurt , I’ve finally downloaded and imported your workflow.

If you are going to hit the website multiple times, I suggest you add some delays in between each request. You can do this directly in the GET Request:
image

It takes values in milliseconds (ms), so in my example, I’m setting a 3 seconds delay between each request (3000 ms).

From your Table Creator, I see that you have 6 pages only (0 to 5). Are you only interested in the first 6 pages? I checked the website, and it has 200 pages in total (0 to 199). If you want all, and you want to create the URLs manually, there is a simpler way to do this:
First of all, I want to define some variables:
image

For the URL_template variable, I define this:
https://sam.gov/api/prod/sgs/v1/search/?random=1642699126203&index=opp&page=##PAGE##&sort=-modifiedDate&size=50&mode=search&responseType=json&is_active=true&notice_type=o

As you can see, I define a place holder called ##PAGE## for the page values.

I also define a variable called page_end. I assigned the value 6 to reflect that you have (0-5). But if you want to get all 200 pages, just change this value to 200, and it will create URLs with page from 0 to 199 (200 pages).

We run the Variable Creator and we check the values:

After that, create an empty table using the Empty Table Creator, but with the 0-5 Rows:
image

I then generate the URLs using the ROWID of the empty table and the URL_template via the String Manipulation:

And just for fun, let me show you how easy it is to generate 200 URLs:
Just change the value in the Variable Creator to 200:
image

And that’s it. Just run the String Manipulation, and BANG!:

After that, just connect them to the test that I did. Just need to configure the GET Request like this:

And here are the results for 6 pages:

You can then apply your JSON Path, etc.

Here’s what the workflow looks like:
image

Here’s the updated workflow: String to JSON help to fix the error.knwf (13.9 KB)

A word of advice: If you are building your workflow, run this with only ONE page. Once your workflow is properly built, only then you run it with multiple pages.

6 Likes

Hi; Thank you very much for all the detailed explanations and the workflow example…
I think you have given me a great answer and solution.

:100: :wink: :100: :knime: :medal_military: :trophy:

1 Like

No problem @umutcankurt , happy to help

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.