Gzipped CSV reader without saving to HDD

@Iwo you can create .tar.gz files and also extract CSV files from them, as in this example:

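import os
import tarfile

import knime.scripting.io as knio

# Path of the CSV file to compress, taken from a flow variable
csv_file = knio.flow_variables['File path']

# Build the archive path inside the workflow's data folder;
# os.path.join works whether or not the data path ends with a separator
output_file = os.path.join(
    knio.flow_variables['context.workflow.data-path'],
    knio.flow_variables['File name'] + ".tar.gz",
)

# Pass the archive path on to downstream nodes
knio.flow_variables['v_tar_gz_file'] = output_file

# Write the CSV into a gzip-compressed tar archive, stored under its
# bare file name instead of its full path on disk
with tarfile.open(output_file, 'w:gz') as tar:
    tar.add(csv_file, arcname=os.path.basename(csv_file))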

You can pass the .tar.gz file as “input_file”, and the code will then extract all CSV files into the “output_directory” you have defined. Both settings can be adapted to your setup.
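A minimal sketch of that extraction step, using only the standard library. The function name `extract_csv_files` is hypothetical and not taken from the attached workflow; `input_file` and `output_directory` are the two settings mentioned above.

```python
import os
import shutil
import tarfile


def extract_csv_files(input_file, output_directory):
    """Extract every .csv member of a .tar.gz archive into output_directory.

    Hypothetical helper for illustration; returns the list of written paths.
    """
    os.makedirs(output_directory, exist_ok=True)
    extracted = []
    with tarfile.open(input_file, "r:gz") as tar:
        for member in tar.getmembers():
            # Skip directories, links and anything that is not a CSV file
            if not (member.isfile() and member.name.lower().endswith(".csv")):
                continue
            # Write under the bare file name, so folders inside the
            # archive are flattened and nothing lands outside the target
            target = os.path.join(output_directory, os.path.basename(member.name))
            with tar.extractfile(member) as source, open(target, "wb") as out:
                shutil.copyfileobj(source, out)
            extracted.append(target)
    return extracted
```

In a Python Script node, `input_file` could for example come from the `v_tar_gz_file` flow variable set in the first example. Note that CSV files with the same name in different archive folders would overwrite each other in this sketch.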

The last example reads the first CSV file found in the archive into a pandas DataFrame and passes it back to KNIME. Please note that the sample CSV file uses pipes (|) as column separators; you might want to adapt that.
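Reading the CSV straight from the archive, without writing it to disk first, could look roughly like this. It is a sketch, not the code from the attached workflow: the function name `read_first_csv` is made up for illustration, and the `sep` default assumes the pipe-separated sample file.

```python
import tarfile

import pandas as pd


def read_first_csv(input_file, sep="|"):
    """Return the first .csv member of a .tar.gz archive as a DataFrame.

    Hypothetical helper for illustration; sep defaults to the pipe
    separator of the sample file.
    """
    with tarfile.open(input_file, "r:gz") as tar:
        for member in tar:
            if member.isfile() and member.name.lower().endswith(".csv"):
                # extractfile() returns a file-like object, so pandas reads
                # the data from the archive and nothing is written to disk
                with tar.extractfile(member) as handle:
                    return pd.read_csv(handle, sep=sep)
    raise FileNotFoundError(f"No CSV file found in {input_file}")
```

Inside a Python Script node, the result could then be handed back with something like `knio.output_tables[0] = knio.Table.from_pandas(read_first_csv(knio.flow_variables['v_tar_gz_file']))`.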

kn_example_python_tar_gzip_csv.knwf (76.3 KB)
