DELETE ON HADOOP

I’m encountering an issue while attempting to delete folders/files in HDFS through KNIME. Here’s the error message I’m receiving:

ERROR Delete Files/Folders 5:2120 Execute failed: The file/directory ‘/app/mkcho/bpcrm/dt/in/table/TMK056.parquet/DATA_BATCH=2024-02-14’ couldn’t be deleted and the execution failed due to user settings.
It’s worth noting that this path is read by an external table in Hive. The error occurs only intermittently.

Could someone provide insights into why this error occurs and how to resolve it? I’d appreciate any suggestions or guidance on troubleshooting steps to address this issue effectively.

Thank you!

Hi @And,

You could try deleting the data via Hive instead of removing the files directly in HDFS. The KNIME console or the HDFS logs might contain more details about the failure.

Cheers,
Sascha

@And you might want to check whether deleting parts of a (Hive?) table directly at file level is the best approach. It might be better to drop the partition instead.
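
As a rough sketch of what dropping the partition could look like in HiveQL: the table name `my_db.tmk056` is made up, and the partition column `data_batch` is only guessed from the `DATA_BATCH=2024-02-14` folder in the error message, so both would need to be adapted.

```sql
-- Hypothetical table name; partition column guessed from the HDFS path.
-- List the partitions Hive knows about before removing anything.
SHOW PARTITIONS my_db.tmk056;

-- Remove one partition through Hive instead of deleting its folder in HDFS.
-- IF EXISTS avoids an error when the partition is already gone.
ALTER TABLE my_db.tmk056
  DROP IF EXISTS PARTITION (data_batch = '2024-02-14');
```

For an external table this typically removes only the metastore entry and leaves the files in place, unless the table is configured to purge its data.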

If this is an external Hive table (consisting of Parquet, ORC or CSV files) on HDFS, it might be necessary to set this table property to make sure that the data is actually deleted when you drop something from the table (if that is what you want):

TBLPROPERTIES ( 
    'external.table.purge'='true'
)
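
On an already existing table the property could be set roughly like this. It is only a sketch: `my_db.tmk056` is a placeholder name, and the property only has an effect on Hive versions that support it.

```sql
-- Hypothetical table name; replace with the real database and table.
-- Ask Hive to also remove the underlying files when partitions
-- or the table itself are dropped.
ALTER TABLE my_db.tmk056
  SET TBLPROPERTIES ('external.table.purge' = 'true');

-- Check that the property is now set on the table.
SHOW TBLPROPERTIES my_db.tmk056;
```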