I’m trying, without success, to initialize the “Create Spark Context (Livy)” node in KNIME 5.1 with Amazon Web Services.
I keep getting the following message: “Connection timed out: no further information (ConnectException)”.
These are the options I use:
Spark version: 3.4, same as the Amazon EMR cluster I created
Livy URL: “http://ec2-XX-XXX-XX-XXX.eu-west-3.compute.amazonaws.com:8998/”
Authentication: None (I’m correctly authenticated with the “Amazon Authenticator” and “Amazon S3 Connector” nodes, which work perfectly)
Spark executor: Default allocation
Advanced tab:
Set staging area for Spark jobs: “/”
I’ve read several tutorials but I can’t figure out where the problem is. Any clue?
Connection timed out: no further information (ConnectException)
This sounds like Livy is not reachable. You can verify that Livy is reachable by entering the Livy URL in your browser. If Livy is not reachable, you can add a custom rule to the Security Group of the cluster or use an SSH tunnel.
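Besides the browser, the same reachability check can be scripted. The following is only a minimal sketch: `is_port_open` is a hypothetical helper name, and it only tests whether a TCP connection to the Livy port can be opened, not whether Livy itself answers correctly.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS resolution failures.
        return False
```

Calling `is_port_open(host, 8998)` with the public DNS name of the EMR master node should return `False` as long as the Security Group blocks the port, which would match the “Connection timed out” message in KNIME.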
Spark version: 3.4
You have to use an EMR version that has a matching Spark version in KNIME. The latest supported Spark version in KNIME is 3.3 and this means emr-6.11.1 or lower.
I created a new Cluster in EMR running Spark 3.3, but the problem is still there.
Nope, when I enter the Livy URL in my browser I get an error.
I’ve read the AWS documentation and tried to add some rules in the EC2 firewall settings. For example:
TCP / TCP / 0 - 65535 / My IP / “My rule”
No results.
Could someone share an example of a correct rule to add? This is too messy for me, I guess.
I figured out how to configure the firewall rule. I think I was making a mistake when entering the port range. And, by the way, the 0 - 65535 range is the one that works.
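For anyone who wants to script this instead of using the EC2 console, here is a rough sketch of the same rule in Python. The helper names `build_ingress_rule` and `apply_rule` and the address 203.0.113.10 (a documentation placeholder standing in for “My IP”) are assumptions for illustration. `apply_rule` relies on boto3’s `authorize_security_group_ingress` and needs boto3 plus valid AWS credentials, so it is defined but not called here. The narrower 8998-only variant assumes that the Livy port from the URL is the only one KNIME has to reach, which was not verified in this thread.

```python
import ipaddress


def build_ingress_rule(cidr: str, from_port: int, to_port: int, description: str) -> dict:
    """Build one TCP entry in the shape EC2 expects for IpPermissions (IPv4 only)."""
    # Validate the CIDR early; raises ValueError on malformed or non-IPv4 input.
    network = ipaddress.IPv4Network(cidr, strict=False)
    if not (0 <= from_port <= to_port <= 65535):
        raise ValueError("port range must satisfy 0 <= from_port <= to_port <= 65535")
    return {
        "IpProtocol": "tcp",
        "FromPort": from_port,
        "ToPort": to_port,
        "IpRanges": [{"CidrIp": str(network), "Description": description}],
    }


def apply_rule(security_group_id: str, rule: dict) -> None:
    """Add the rule to a security group (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    ec2 = boto3.client("ec2")
    ec2.authorize_security_group_ingress(GroupId=security_group_id, IpPermissions=[rule])


# The rule from this thread: all TCP ports, restricted to a single address.
all_ports_rule = build_ingress_rule("203.0.113.10/32", 0, 65535, "My rule")

# A narrower variant, assuming only the Livy port (8998) has to be reachable.
livy_only_rule = build_ingress_rule("203.0.113.10/32", 8998, 8998, "Livy from my IP")
```

The rule has to be added to the Security Group attached to the EMR master node, since that is the host the Livy URL points to.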