Read JSON file from S3 bucket

Hi All,

I am trying to read a JSON file directly from an S3 bucket using the JSON Reader node, but when I give the URL and execute the node it throws this error: “Execute failed: Unexpected character (’<’ (code 60)): expected a valid value (number, String, array, object, ‘true’, ‘false’ or ‘null’)
at [Source: org.knime.base.node.util.BufferedFileReader@60760af9; line: 3, column: 2]”
even though there is no ‘<’ character in the URL. Any suggestions as to what could be wrong here and how we can fix this?

This sounds like the URL doesn’t deliver JSON but something different (maybe HTML, given the < character). What happens if you paste the URL into a browser without being logged into any AWS account?
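To make that check concrete, here is a minimal Python sketch that classifies a response body once you have fetched it by whatever means. The function name `describe_payload` and the sample bodies are made up for illustration; this is not how the JSON Reader node works internally.

```python
import json


def describe_payload(body: str) -> str:
    """Classify a response body as 'json', 'markup' (HTML/XML), or 'unknown'."""
    stripped = body.lstrip()
    try:
        json.loads(stripped)
        return "json"
    except json.JSONDecodeError:
        pass
    # A leading '<' is the character the JSON parser complained about;
    # it usually means an HTML page or an XML error document came back.
    if stripped.startswith("<"):
        return "markup"
    return "unknown"


# Hypothetical sample bodies, not real S3 responses.
print(describe_payload('{"name": "example"}'))
print(describe_payload('<?xml version="1.0"?><Error/>'))
```

If the body comes back as markup, the problem is with the URL or the permissions, not with the JSON file itself.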

Hi Thor,

I have tried that, and it gives me the error below:
“Invalid file system path: Illegal char <:> at index 2: s3://pathofjson”
when I paste the URL “s3://pathofjson”. I have also tried giving the full URL, https://s3.console.aws.amazon.com/s3/buckets/jsonpath, but in that case it reads some XML and not the JSON object. I have added AWS_Access_key_ID and SECRET_ACCESS_KEY_ID to my user environment variables. Am I missing some configuration? Or is there another way to read a JSON file from an S3 location?
Thanks in advance :)

The JSON Reader node only understands https URLs, not s3 URLs. You have to provide the https URL of the S3 object (e.g. https://my-bucket-name.s3.eu-central-1.amazonaws.com/my/file.txt). If the bucket requires authentication, use the Amazon S3 Connection node in combination with the Amazon S3 File Picker to create a pre-signed https URL, or provide such a pre-signed URL directly.
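As a rough illustration of the URL shape above, this Python sketch turns an s3:// URI into the matching https URL. The helper name `s3_uri_to_https` is hypothetical, and the region is an assumption you have to supply yourself because it is not part of the s3:// URI. The resulting URL only works for objects that are publicly readable; private buckets still need a pre-signed URL.

```python
from urllib.parse import quote


def s3_uri_to_https(s3_uri: str, region: str) -> str:
    """Build a virtual-hosted-style https URL from an s3://bucket/key URI."""
    prefix = "s3://"
    if not s3_uri.startswith(prefix):
        raise ValueError(f"not an s3:// URI: {s3_uri!r}")
    # Everything up to the first '/' is the bucket, the rest is the object key.
    bucket, _, key = s3_uri[len(prefix):].partition("/")
    if not bucket or not key:
        raise ValueError(f"URI must contain a bucket and a key: {s3_uri!r}")
    # Percent-encode the key but keep '/' so the path structure is preserved.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"


print(s3_uri_to_https("s3://my-bucket-name/my/file.txt", "eu-central-1"))
```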


Hi @thor,

I have used the Amazon S3 Connection node and then the Amazon S3 File Picker, assigning the pre-signed URL to a flow variable and feeding this variable as input to the JSON Reader. Now, after execution, I get the error “Execute failed: Expected end of input, but there were content: START_ARRAY”. Do we need to download the file somehow before reading it with the JSON Reader node? I am attaching a screenshot of the workflow. Please let me know if I am missing something.

[Image: screenshot of the workflow]

It sounds like you are not reading a valid JSON file. The error indicates that there is more data after the root object.
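One way to check this outside KNIME is to count how many top-level values the file contains. The Python sketch below does that with the standard library; `count_root_values` is a hypothetical helper name, and the sample strings are invented. A valid JSON document has exactly one root value, so a result greater than 1 would match the "more data after the root" diagnosis.

```python
import json


def count_root_values(text: str) -> int:
    """Count the top-level JSON values in text; raises on malformed JSON."""
    decoder = json.JSONDecoder()
    pos, count = 0, 0
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            return count
        # raw_decode returns the parsed value and the index where it ended.
        _, pos = decoder.raw_decode(text, pos)
        count += 1


print(count_root_values('{"a": 1}'))
print(count_root_values('{"a": 1} [1, 2]'))
```

If the file turns out to hold several values, wrapping them in a single array or splitting the file are the usual fixes.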

Hey @thor, I managed to read the file I wanted from S3, but now I want to re-upload the modified file to S3 (i.e. with the CSV Writer). I can’t seem to give the CSV Writer a valid pre-signed S3 URL that points to a folder rather than a file (which is what I would get from the S3 File Picker).

Can you help me out here?

Hi there @Shore23,

welcome to KNIME Community!

Currently it is not possible to write to S3 with the CSV Writer node (with the Parquet Writer, for example, it is). There is a ticket for it, though, and I will add a +1 to it.

Br,
Ivan

Thanks @ipazin!

Is there a list of nodes that are able to write to S3 apart from the Parquet writer?
Otherwise, what do you think could be a timeframe for the CSV writer ticket to be developed?

Best,
J.

Hi there,

you are welcome.

Well, as a workaround you can use the following approach:

[Image: WritingToS3 workflow]

The Create Temp Dir node has an option to delete the directory on reset, so that takes care of deleting the file. Not ideal, but it should work.
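For anyone who wants the same idea outside KNIME, here is a minimal Python sketch of the pattern as I understand it: write the CSV into a temporary directory, hand the file to an uploader, and let the directory clean itself up. This is an assumption about the shape of the workaround, not a description of the KNIME nodes. `write_csv_then_upload` and the `upload` callback are hypothetical names; in practice the callback would wrap whatever S3 client you use.

```python
import csv
import tempfile
from pathlib import Path
from typing import Callable, Iterable, Sequence


def write_csv_then_upload(
    rows: Iterable[Sequence[object]],
    file_name: str,
    upload: Callable[[Path], None],
) -> None:
    """Write rows to a CSV in a temp directory, then pass the file to upload."""
    # TemporaryDirectory deletes the folder and its contents on exit,
    # loosely comparable to the "Delete directory on reset" option.
    with tempfile.TemporaryDirectory() as tmp_dir:
        local_path = Path(tmp_dir) / file_name
        with local_path.open("w", newline="", encoding="utf-8") as handle:
            csv.writer(handle).writerows(rows)
        # Hypothetical uploader supplied by the caller (e.g. an S3 client call).
        upload(local_path)


# Stand-in uploader that only reports the file; replace with a real upload.
write_csv_then_upload(
    [["id", "value"], [1, "a"]],
    "result.csv",
    lambda path: print(f"would upload {path.name} ({path.stat().st_size} bytes)"),
)
```

The upload has to happen inside the `with` block, because the local file is gone as soon as the temporary directory is removed.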

Only Parquet for now, but this is an area under active development, so there should be new functionality with every release. No time frames, stay tuned :wink:

Br,
Ivan
