Movie Recommendation Engine with Spark Collaborative Filtering

1. Create local Spark Context
2. Read ratings.csv and movies.csv from the MovieLens dataset into Spark (https://grouplens.org/datasets/movielens/)
3. Ask user for rating on 20 random movies to build user profile and include in training set
4. Train Spark Collaborative Filtering Learner (Alternating Least Squares algorithm, https://www.infofarm.be/articles/alternating-least-squares-algorithm-recommenderlab)
5. Apply model to all other movies unrated by user
6. Display recommendation results for user
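
For readers who want to see the idea behind the Alternating Least Squares step, here is a minimal pure-Python sketch. It is not the Spark or KNIME implementation: it assumes a single latent factor (rank 1), and the names `als_rank1` and `predict` as well as the toy ratings are hypothetical, chosen only for illustration. It alternates between solving for the user factors with the item factors held fixed, and vice versa.

```python
import random

def als_rank1(ratings, n_users, n_items, reg=0.01, iters=50, seed=0):
    """Rank-1 ALS sketch. `ratings` is a list of (user, item, rating) triples."""
    rng = random.Random(seed)
    u = [rng.uniform(0.5, 1.5) for _ in range(n_users)]  # user factors
    v = [rng.uniform(0.5, 1.5) for _ in range(n_items)]  # item factors
    for _ in range(iters):
        # Hold item factors fixed: each user factor is a 1-D ridge regression
        # over that user's observed ratings, solved in closed form.
        for i in range(n_users):
            num = sum(r * v[j] for (a, j, r) in ratings if a == i)
            den = reg + sum(v[j] ** 2 for (a, j, r) in ratings if a == i)
            u[i] = num / den
        # Hold user factors fixed: update each item factor the same way.
        for j in range(n_items):
            num = sum(r * u[i] for (i, b, r) in ratings if b == j)
            den = reg + sum(u[i] ** 2 for (i, b, r) in ratings if b == j)
            v[j] = num / den
    return u, v

def predict(u, v, user, item):
    """Predicted rating is the product of the user and item factors."""
    return u[user] * v[item]

# Toy data (made up): user 1 has not rated item 2.
toy_ratings = [(0, 0, 1.0), (0, 1, 2.0), (0, 2, 3.0), (1, 0, 2.0), (1, 1, 4.0)]
u, v = als_rank1(toy_ratings, n_users=2, n_items=3)
print(predict(u, v, user=1, item=2))
```

In the workflow, the Spark Collaborative Filtering Learner does this at scale with more latent factors; the sketch only shows why a user's ratings on a handful of movies are enough to score the ones they have not rated.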


This is a companion discussion topic for the original entry at https://kni.me/w/2Bdh4SjLZRGugknG

I was trying to use the workflow, but it showed an error when creating the local big data environment.

Could you please advise what the problem might be?

That’s hard to know without seeing the error, or anything else about your workflow. Can you post a bit more detail, please?

I want to know how to configure the ‘Create Local Big Data Environment’ node.

I was able to run the CSV Reader and Row Sampling nodes for the ratings and build the current user profile, but I could not run the node above.

How do I fix the error ‘No connection available. Execute the connector node first.’?

The error message we need more detail on is the one prior to that, about the Create Local Big Data Environment node itself. It says “Execute Failed” with some other info that gets cut off by your window. Can you copy and paste the full error message, or possibly upload your log?


WARN CSV to Spark 3:222 No connection available. Execute the connector node first.
ERROR Create Local Big Data Environment 3:221 Execute failed: org/knime/filehandling/core/connections/base/BaseFSConnection
WARN CSV to Spark 3:222 No connection available. Execute the connector node first.

This is the error.

Until I go through https://knime.learnupon.com/catalog/courses/1125041, it will be a little unclear to me.

Thank you

What happens if you drag and drop the node from the Hub directly, and execute that?



How can I overcome this?

Hi @santoshmd, the node you’re trying to execute was configured with an absolute path to a file.

From the screenshot you uploaded, you have already accessed the file via the alternative route, which is CSV Reader > Row Sampling > Table to Spark.

As @badger101 mentioned, you need to manually change the path to point to the correct directory, since C:\Users\roberto.cadilli\... doesn’t exist on your machine.
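
If it helps to check this outside KNIME, here is a small Python sketch of the same idea: use the configured path if it exists, otherwise look for the file in a local folder. The function name `resolve_data_file` and its parameters are hypothetical and not part of KNIME; in the workflow itself the fix is still to edit the path in the node's configuration dialog.

```python
from pathlib import Path

def resolve_data_file(configured_path, local_dir, filename):
    """Return a usable path to `filename`, preferring the configured one."""
    configured = Path(configured_path)
    if configured.is_file():
        return configured
    # Fall back to a folder on this machine (an assumption for illustration).
    candidate = Path(local_dir) / filename
    if candidate.is_file():
        return candidate
    raise FileNotFoundError(
        f"{filename} not found at {configured} or in {local_dir}"
    )
```

Calling it with the path stored in the workflow and the folder where you unpacked the dataset either returns a path that exists or raises a clear error, which is easier to read than the downstream ‘No connection available’ warning.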
