Columns to JSON node is removing quotes in tags

Hi everyone, I am building a JSON file that needs to include tags in double quotes, in this case "PERSONA". The problem is that when I transfer the information from the table to JSON, the node changes the "PERSONA" tag to \"PERSONA\".

I have tried two ways to add the quotes: (i) in the String Manipulation node with
> join("\"",string("PERSONA"),"\"")

and (ii) with the String Input node, but neither seems to work with the table to JSON step.

Importantly, I want to build a JSON that looks like this:

> train_set = [
>     ("Methylphenidate is effectively used in treating children with epilepsy and ADHD.",
>      {"entities": [(0, 15, "DRUG"), (62, 70, "DISEASE"), (75, 79, "DISEASE")]}),
>     ("Patients were followed up for 6 months.", {"entities": []}),
>     ("Antichlamydial antibiotics may be useful for during coronary-artery disease.", {"entities": [(0, 26, "DRUG"), (52, 75, "DIS")]})
> ]

JSON Question.knwf (164.9 KB)

Thank you

Hi @mauuuuu5 , the reason why they appear as \" is because the quotes are being escaped, and they’re escaped because the whole string is within quotes, so in order to preserve the quotes within the quotes, it has to come as \". But the content itself contains the proper quotes - hopefully I’m explaining this properly.

The node is properly converting from string to json.
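
To make the escaping concrete, here is a minimal Python sketch (Python is assumed only because the file is loaded in Python later in the thread; the variable names are illustrative). It shows that the backslashes exist only in the JSON text, not in the value a parser hands back:

```python
import json

# The data itself: the word PERSONA wrapped in double quotes (9 characters).
tag = '"PERSONA"'

# Serializing the string wraps it in quotes, so the inner quotes get escaped.
encoded = json.dumps(tag)
print(encoded)   # this is what a text editor shows in the file

# Parsing the JSON text gives back the original value, without backslashes.
decoded = json.loads(encoded)
print(decoded)
```

Opening the written file in a text editor shows the `encoded` form, while any JSON reader sees the `decoded` form.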

Thank you, but the tags cannot contain the \ characters that appear in the text editor (after writing the JSON file):

image

Secondly, if I try to remove them manually, it seems that I am corrupting the file:

image

I do not have too much experience with JSON files. Perhaps I am not building the file properly?

Thank you

Hi @mauuuuu5 , the tags do not contain \ per se, it’s just escaping the quotes so they can be encapsulated within quotes.

For example, if you want to assign PERSONA, you can do:
"entities" : "PERSONA"

which is basically quoting PERSONA like this: "PERSONA". Your actual data is PERSONA.

But to assign "PERSONA" (with quotes), you can’t just write ""PERSONA"". That’s where you have to escape the quotes inside the enclosing quotes: "\"PERSONA\"". The escaping exists only so the value can be enclosed in quotes, and the \ is not physically part of the data. Your actual data is "PERSONA" (including the quotes).

That being said, reading what you want to do, you need to create a list/set before converting, that’s where it will create the array in the json ([ ]). With the list, you may actually not need the escape, and you may end up with "entities": [(0, 15, "DRUG"), (62, 70, "DISEASE"), (75, 79, "DISEASE")] for example.

But just keep in mind that the escape is only because it’s encapsulated between quotes.

Hi @mauuuuu5 , so the example that you provided, "entities": [(0, 15, "DRUG"), (62, 70, "DISEASE"), (75, 79, "DISEASE")], is actually invalid JSON (you can check it with an online JSON validator).

The correct way is still with the quotes, since each element is a string (starting and ending with the parentheses “(” and “)”), which is:
"entities" : [ "(0, 15, \"DRUG\")", "(62, 70, \"DISEASE\")", "(75, 79, \"DISEASE\")" ]

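As a rough check of the point above, the following Python sketch feeds both variants to the standard `json` parser. The helper `is_valid_json` and the variable names are made up for this illustration, and the entity lists are shortened:

```python
import json

# Tuple-style text from the first post: parentheses are not JSON syntax.
tuple_style = '{"entities": [(0, 15, "DRUG"), (62, 70, "DISEASE")]}'

# String-style text: each entity is one JSON string with escaped inner quotes.
string_style = r'{"entities": ["(0, 15, \"DRUG\")", "(62, 70, \"DISEASE\")"]}'

def is_valid_json(text):
    """Return True if the standard parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

print(is_valid_json(tuple_style))
print(is_valid_json(string_style))

# After parsing, the escaped quotes are plain quotes again.
first_entity = json.loads(string_style)["entities"][0]
print(first_entity)
```

The parser rejects the tuple-style text and accepts the string-style text, and `first_entity` no longer contains any backslash.
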
Here’s a quick example how you can create a collection:
image

Input:
image

Generating the collection:
image

You can see the type of the column is a collection with the [...] on the column header.

And the generated JSON:
image


Thank you for your help. As you said, the JSON is valid.

However, after several attempts and some debugging in Python, it turns out that the input needs to have this format:

original file is here

I see that some {} are exchanged with “[ ]”. I wonder if I can generate a JSON from KNIME that follows the second example, because I get an error when I try to load the first file in Python.

Thanks in advance

It’s not that they are “exchanged”. The “[ ]” denotes an array, so wherever you see them you are dealing with arrays. In this case, the file has an array of arrays (arrays within an array).

You need to build collections if you want to work with arrays in JSON.
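
On the Python side, the same idea can be sketched as follows. This assumes the target is the `train_set` structure from the first post (shortened here), and the variable names `text`, `loaded`, and `restored` are illustrative. The point is that Python tuples are written as JSON arrays, so the file ends up as arrays within arrays, and the tuples can be rebuilt after loading:

```python
import json

# Shortened version of the structure from the first post.
train_set = [
    ("Patients were followed up for 6 months.", {"entities": []}),
    ("Methylphenidate is effectively used in treating children with epilepsy and ADHD.",
     {"entities": [(0, 15, "DRUG"), (62, 70, "DISEASE"), (75, 79, "DISEASE")]}),
]

# JSON has no tuple type, so every tuple is written as an array.
text = json.dumps(train_set)

# Reading the text back yields nested lists instead of tuples.
loaded = json.loads(text)

# Rebuild the tuple layout if the consuming code expects it.
restored = [
    (sentence, {"entities": [tuple(entity) for entity in annotation["entities"]]})
    for sentence, annotation in loaded
]
print(restored == train_set)
```

If the JSON generated from the collections has this array-of-arrays shape, a small conversion step like `restored` may be all that is needed after loading it.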
