Error when I'm trying to read a geojson file

Hi, I’m trying to read a GeoJSON file from this Kaggle dataset:

link to the dataset

The file is called cbg.geojson. I have tried to read it with the JSON Reader node, and I have also tried the streaming execution option.

But the result is always the same: KNIME freezes and I have to force-close it.
The machine has 16 GB of RAM.

I just want to read it and convert it to a table/CSV format.

Does anyone have an idea of how to solve it?

Thanks in advance.
~g

Any help?..

Hi gujodm,

Yup, looks like parsing that 3 GB JSON file requires more than 16 GB of memory (on a side note, I tried with 28 GB and it still wasn’t enough). I suppose the JSON Reader node in its current state is not meant to parse files of that size. As a workaround, you could read the file line by line using a Line Reader node and then parse the JSON objects manually. I’ve attached a workflow that should do that legwork, but it won’t exactly be fast. It can probably be heavily optimized.
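If you would rather script it outside KNIME, here is a rough Python sketch of the same line-by-line idea. It is not what the attached workflow does step by step, and it assumes that every feature in the file sits on its own line (the same assumption the Line Reader approach relies on). The function name `features_to_csv` and the `property_names` argument are made up for illustration; you would pass whatever property names your file actually contains.

```python
import csv
import json


def features_to_csv(geojson_path, csv_path, property_names):
    """Stream a GeoJSON FeatureCollection into a CSV file, one line at a time.

    Assumption: each feature is a complete JSON object on its own line.
    Lines that do not parse as a feature (such as the collection header
    and footer) are skipped. Returns the number of features written.
    """
    written = 0
    with open(geojson_path, encoding="utf-8") as src, \
         open(csv_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        writer.writerow(list(property_names) + ["geometry"])
        for line in src:
            # Drop surrounding whitespace and the comma that separates features.
            text = line.strip().rstrip(",")
            if not text.startswith("{"):
                continue
            try:
                feature = json.loads(text)
            except json.JSONDecodeError:
                continue  # e.g. the opening '{"type": "FeatureCollection", ...' line
            if not isinstance(feature, dict) or feature.get("type") != "Feature":
                continue
            props = feature.get("properties") or {}
            row = [props.get(name, "") for name in property_names]
            # Keep the geometry as a JSON string in the last column.
            row.append(json.dumps(feature.get("geometry")))
            writer.writerow(row)
            written += 1
    return written
```

Only one line is held in memory at a time, so the 3 GB file never has to fit into RAM as a whole. If the features in your file span several lines, this will silently skip them, so check the returned count against what you expect.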

Best,

Marc

json_row_by_row.knwf (18.3 KB)


Hi @marc-bux,
thank you for this interesting workaround.
It takes several hours to complete the process, but it seems to work well.
I have just tested it with a limited amount of rows.

~gujo