Download from HDFS

This is very similar to what has been reported in “HDFS Connector, Download not working, List files does

The problem is the Download node. We have already tried both HDFS and WebHDFS connections, with the same result.

It is a Hortonworks HDP 2.6.3 platform. We need to run a KNIME workflow with:

  1. HDFS Connection node (ok)
  2. List Remote Files node (ok)
  3. Download node (error)

The Download node stays in the running state for a while (BTW, there is no way to stop it) and eventually errors out.

Thanks in advance

Can you post your workflow pls?

Make sure you connect via WebHDFS to the right port.

Hi @peleitor

could you provide some more details on your cluster? Is it a “real” cluster or an HDP sandbox VM? How do you connect to the cluster on a network level, e.g. are there firewalls that do NAT (*) in between?

Due to the way the HDFS and WebHDFS protocols work, your client needs to be able to make direct network connections to the NameNode service and to all DataNode services in your cluster. The ports for those connections are documented here:

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.3/bk_reference/content/hdfs-ports.html

  • Relevant for the “HDFS Connection” node are the IPC protocol ports.
  • Relevant for the “webHDFS Connection” node are the HTTP protocol ports.
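
A quick way to test this from the machine running KNIME is a plain TCP reachability check. The following is only a minimal sketch: the port numbers are the usual Hadoop 2.x defaults (8020 and 50010 for native HDFS, 50070 and 50075 for WebHDFS) and are an assumption here, so please compare them with the table linked above and with your cluster configuration. The function names and host names are made up for illustration.

```python
import socket

# Assumed Hadoop 2.x default ports; verify against your cluster configuration.
NAMENODE_IPC = 8020      # NameNode RPC, used by the native HDFS client
DATANODE_DATA = 50010    # DataNode data transfer, used by the native HDFS client
NAMENODE_HTTP = 50070    # NameNode HTTP, used by WebHDFS
DATANODE_HTTP = 50075    # DataNode HTTP, used by WebHDFS


def hdfs_endpoints(namenode, datanodes, webhdfs=False):
    """Build the list of (host, port) pairs a client must be able to reach."""
    nn_port = NAMENODE_HTTP if webhdfs else NAMENODE_IPC
    dn_port = DATANODE_HTTP if webhdfs else DATANODE_DATA
    return [(namenode, nn_port)] + [(dn, dn_port) for dn in datanodes]


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


def check_endpoints(endpoints, timeout=3.0):
    """Map each (host, port) pair to True (reachable) or False (not reachable)."""
    return {(host, port): can_connect(host, port, timeout) for host, port in endpoints}
```

For example, `check_endpoints(hdfs_endpoints("namenode.example.com", ["datanode1.example.com"], webhdfs=True))` would test the WebHDFS ports of a hypothetical two-host cluster. If the NameNode is reachable but a DataNode is not, that would match the symptom of listing files working while the download fails, since listing only talks to the NameNode.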

Best,
Björn

This is a real cluster. Both HDFS and WebHDFS are enabled, and I am using the standard ports.