I’m encountering a batch size error in Python that has been reported here before. It’s not clear to me whether it is supposed to have been fixed. I just updated KNIME to version 5.5.2, but the error persists.
It’s quite annoying because it seems to be an issue with KNIME’s integration with pandas, or just with exporting the data to KNIME. My other Python Scripts work fine using the same line of code (Code #1), but in this script it fails.
I attempted to implement the solution recommended here, but the error still occurs. Code snippet #2 below shows my attempt.
I put an X next to the line that is noted in the traceback in both options.
(I cannot format these as code blocks because my browser crashes whenever I click that option in this forum posting box.)
Error:
ValueError: all batches of the table must have the same size, but batch 1 has size 42 (expected: 26)
Code #1:
X knio.output_tables[0] = knio.Table.from_pandas(final_df)
Code #2:
table = pa.Table.from_pandas(final_df)
batch_size = 1000
num_rows = table.num_rows
pa_batches = []
for start in range(0, num_rows, batch_size):
    end = min(start + batch_size, num_rows)
    pa_batches.append(table.slice(start, end - start))
batched_table = knio.BatchOutputTable.create(row_ids="generate")
for batch in pa_batches:
X   batched_table.append(batch)
knio.output_tables[0] = batched_table
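To sanity-check the slicing in Code #2 outside of KNIME, here is a minimal plain-Python sketch of the same arithmetic. The helper names `slice_bounds` and `first_uneven_batch` are hypothetical and not part of KNIME or pyarrow. The row count of 68 is an assumption (26 + 42, taken from the sizes in the error message), and the rule that only the last batch may be smaller is my reading of the error, not something I have confirmed.

```python
def slice_bounds(num_rows, batch_size):
    """Return (offset, length) pairs using the same arithmetic as the loop in Code #2."""
    bounds = []
    for start in range(0, num_rows, batch_size):
        end = min(start + batch_size, num_rows)
        bounds.append((start, end - start))
    return bounds


def first_uneven_batch(sizes):
    """Return the index of the first batch whose size breaks the rule, or None.

    Assumption: every batch must match batch 0, except that the final
    batch is allowed to be smaller.
    """
    if not sizes:
        return None
    expected = sizes[0]
    last = len(sizes) - 1
    for i, size in enumerate(sizes):
        if size == expected:
            continue
        if i == last and size < expected:
            continue
        return i
    return None


# Hypothetical 68-row table: batch_size 1000 yields a single slice.
print(slice_bounds(68, 1000))        # [(0, 68)]
# Sizes from the error message: batch 1 (42) differs from batch 0 (26).
print(first_uneven_batch([26, 42]))  # 1
```

If this is right, the loop in Code #2 hands over only one slice for a table this small, so the sizes 26 and 42 in the error would have to come from somewhere other than my batch_size of 1000.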