Hmm, from my experience OpenPyxl will add significant overhead to every operation, since it has to manipulate the Excel structures heavily depending on what you want to do (good news though: the package is integrated into the KNIME Python extensions).
If you use OpenPyxl to import or export Excel files, you might well end up with a Pandas or Arrow dataframe nonetheless …
Another currently hyped option is Polars. Another one is Feather, which I am not really familiar with … but they all are additional packages.
As I said: if you work with KNIME and Excel and Python you will face some sort of data transfer in any case. Otherwise this might very well be a Python question.
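If the goal is to skip the dataframe step, the rows OpenPyxl hands back can be turned into plain dicts directly. This is only a minimal sketch: `rows_to_dicts` and `read_sheet` are names I made up, and I am assuming the first row of the active sheet holds the column headers.

```python
from typing import Any, Dict, Iterable, List


def rows_to_dicts(rows: Iterable[tuple]) -> List[Dict[str, Any]]:
    """Turn an iterable of row tuples (header row first) into a list of dicts."""
    iterator = iter(rows)
    try:
        header = next(iterator)
    except StopIteration:
        return []  # nothing to read, e.g. an empty sheet
    return [dict(zip(header, row)) for row in iterator]


def read_sheet(path: str) -> List[Dict[str, Any]]:
    """Hypothetical helper: read the active sheet of an .xlsx file into dicts."""
    # Imported here so rows_to_dicts() stays usable without openpyxl installed.
    from openpyxl import load_workbook

    workbook = load_workbook(path, read_only=True, data_only=True)
    try:
        sheet = workbook.active
        # values_only=True yields plain cell values instead of Cell objects.
        return rows_to_dicts(sheet.iter_rows(values_only=True))
    finally:
        workbook.close()


# Usage without Excel: any iterable of tuples works the same way.
sample_rows = [("Name", "Age"), ("Alice", 30), ("Bob", None)]
records = rows_to_dicts(sample_rows)
```

Empty cells come back as `None`, which is the same convention the plain-Python functions in this post rely on.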
But since we are at it, I put your question to ChatGPT:
data = [
    {"Name": "Alice", "Age": 30, "City": "New York"},
    {"Name": "Bob", "Age": None, "City": "Paris"},
    {"Name": "Charlie", "Age": 25, "City": None},
    {"Name": "David", "Age": None, "City": "London"}
]

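# Dropping a Column
def drop_column(data, column):
    """Remove `column` from every row (modifies the rows in place)."""
    for row in data:
        row.pop(column, None)  # no error if the key is missing
    return data

# Example: Drop the 'City' column
data = drop_column(data, 'City')
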
# Filling NA Values
def fill_na(data, column, fill_value):
    """Replace None (or a missing key) in `column` with `fill_value`, in place."""
    for row in data:
        if row.get(column) is None:
            row[column] = fill_value
    return data

# Example: Fill NA in 'Age' with 0
data = fill_na(data, 'Age', 0)

# Forward Fill
def forward_fill(data, column):
    """Replace each None in `column` with the last non-None value above it, in place."""
    last_valid = None
    for row in data:
        if row.get(column) is not None:
            last_valid = row[column]
        elif last_valid is not None:
            row[column] = last_valid
    return data

# Example: Forward fill the 'City' column
# (run this before drop_column(data, 'City'), otherwise there is nothing left to fill)
data = forward_fill(data, 'City')
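
The functions above change the rows in place, so the order of the calls matters. A variant that returns new rows and leaves the input alone could look like the sketch below. The `_copy` names are my own and not part of the ChatGPT answer, and here the forward fill runs before the column is dropped.

```python
from typing import Any, Dict, List

Rows = List[Dict[str, Any]]


def fill_na_copy(data: Rows, column: str, fill_value: Any) -> Rows:
    """Return new rows with None (or a missing key) in `column` replaced."""
    return [
        {**row, column: fill_value} if row.get(column) is None else dict(row)
        for row in data
    ]


def forward_fill_copy(data: Rows, column: str) -> Rows:
    """Return new rows where None in `column` takes the last value seen above."""
    result = []
    last_valid = None
    for row in data:
        new_row = dict(row)  # shallow copy, the input row stays untouched
        if new_row.get(column) is not None:
            last_valid = new_row[column]
        elif last_valid is not None:
            new_row[column] = last_valid
        result.append(new_row)
    return result


def drop_column_copy(data: Rows, column: str) -> Rows:
    """Return new rows without `column`."""
    return [{k: v for k, v in row.items() if k != column} for row in data]


raw = [
    {"Name": "Alice", "Age": 30, "City": "New York"},
    {"Name": "Bob", "Age": None, "City": "Paris"},
    {"Name": "Charlie", "Age": 25, "City": None},
    {"Name": "David", "Age": None, "City": "London"},
]

# Fill first, drop last, so every step still has the column it needs.
filled = forward_fill_copy(fill_na_copy(raw, "Age", 0), "City")
without_city = drop_column_copy(filled, "City")
```

Since `raw` is never modified, each step can be re-run or checked on its own, which is handy when you test this inside a KNIME Python node.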