Big Data Extensions version compatibility

Hi there,

I wanted to run a test with Local Big Data Environment and Parquet to Spark.
[screenshot of the workflow]

Unfortunately, I get this error message:

ERROR Parquet to Spark     0:4159     Execute failed: Could not find required Spark job for Spark version: 3.0. Possible reason: The KNIME Extension for Apache Spark that provides the jobs for your Spark version is not installed. Job id: org.knime.bigdata.spark.node.io.genericdatasource.reader.GenericDataSource2SparkNodeModel

However, KNIME Extension for Apache Spark seems to be installed.
Can anybody advise how I can fix this?

These are the KNIME extensions I have installed, but I don’t see any obvious mismatch.

[screenshot of installed extensions]

Thank you very much!

Hi @Mol1hua,

does the KNIME console/log contain any other errors or warnings after startup?
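If it helps, you can also filter the startup errors and warnings out of the log file on the command line. This is just a sketch; the workspace path below is an assumption and depends on where you created your KNIME workspace:

```shell
# Assumed default location of the KNIME log; adjust to your workspace path.
LOG="$HOME/knime-workspace/.metadata/knime/knime.log"

# Show the first startup errors/warnings, if the log exists.
if [ -f "$LOG" ]; then
  grep -E 'ERROR|WARN' "$LOG" | head -n 20
fi
```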

Cheers
Sascha

1 Like

Good hint! These are the messages:

WARN  DBTypeRegistry                  Uninstantiable default attribute definition supplier for the database type of ID databricks in extension:org.knime.bigdata.databricks
ERROR SparkProviderRegistry            Problems during initialization of Spark provider with id 'org.knime.bigdata.spark.core.databricks.DatabricksNodeFactoryProvider'. Exception: Plug-in org.knime.bigdata.databricks was unable to load class org.knime.bigdata.spark.core.databricks.DatabricksNodeFactoryProvider.
ERROR SparkProviderRegistry            Extension org.knime.bigdata.databricks ignored.
ERROR RepositoryManager               Node 'com.vernalis.knime.mmp.nodes.loop.CutTypeLoopStartNodeFactory' from plugin 'com.vernalis.knime.chem.mmp' could not be created. The corresponding plugin bundle could not be activated!
ERROR RepositoryManager               Node 'com.vernalis.knime.mmp.nodes.uniquifyids.UniquifyIdsNodeFactory' from plugin 'com.vernalis.knime.chem.mmp' could not be created. The corresponding plugin bundle could not be activated!
ERROR RepositoryManager               Node 'com.vernalis.knime.mmp.nodes.fragutil.maxcuts.rdkit.RDKitMMPMaxCutsNodeFactory' from plugin 'com.vernalis.knime.chem.mmp' could not be created. The corresponding plugin bundle could not be activated!
ERROR RepositoryManager               Node 'com.vernalis.knime.mmp.nodes.fragutil.filter.rdkit.RDKitMMPFilterNodeFactory' from plugin 'com.vernalis.knime.chem.mmp' could not be created. The corresponding plugin bundle could not be activated!
ERROR RepositoryManager               Node 'com.vernalis.knime.mmp.nodes.fragutil.filter.rdkit.RDKitMMPSplitterNodeFactory' from plugin 'com.vernalis.knime.chem.mmp' could not be created. The corresponding plugin bundle could not be activated!
ERROR RepositoryManager               Node 'com.vernalis.knime.mmp.nodes.fragutil.fragment.rdkit.RDKitMMPFragmentNodeFactory' from plugin 'com.vernalis.knime.chem.mmp' could not be created. The corresponding plugin bundle could not be activated!
ERROR RepositoryManager               Node 'com.vernalis.knime.mmp.nodes.fragutil.fragment.rdkit.RDKitMulticutMMPFragmentNodeFactory' from plugin 'com.vernalis.knime.chem.mmp' could not be created. The corresponding plugin bundle could not be activated!
ERROR RepositoryManager               Node 'com.vernalis.knime.mmp.nodes.pairgen.frag2pair.Frag2Pair3NodeFactory' from plugin 'com.vernalis.knime.chem.mmp' could not be created. The corresponding plugin bundle could not be activated!
ERROR RepositoryManager               Node 'com.vernalis.knime.mmp.nodes.pairgen.frag2pair.ReferenceFrag2Pair3NodeFactory' from plugin 'com.vernalis.knime.chem.mmp' could not be created. The corresponding plugin bundle could not be activated!
ERROR RepositoryManager               Node 'com.vernalis.knime.mmp.nodes.fragutil.render.rdkit.RDKitMMPRenderMatchingBondsNodeFactory' from plugin 'com.vernalis.knime.chem.mmp' could not be created. The corresponding plugin bundle could not be activated!
ERROR RepositoryManager               Node 'com.vernalis.knime.mmp.nodes.fragutil.render.rdkit.RDKitMMPRenderCuttableBondsNodeFactory' from plugin 'com.vernalis.knime.chem.mmp' could not be created. The corresponding plugin bundle could not be activated!
ERROR RepositoryManager               Node 'com.vernalis.knime.mmp.nodes.transform.rdkit.RWMolApplyTransformNodeFactory' from plugin 'com.vernalis.knime.chem.mmp' could not be created. The corresponding plugin bundle could not be activated!
ERROR RepositoryManager               Node 'org.knime.bigdata.dbfs.node.connector.DBFSConnectionInformationNodeFactory' from plugin 'org.knime.bigdata.databricks' could not be created. The corresponding plugin bundle could not be activated!
ERROR RepositoryManager               Node 'org.knime.bigdata.dbfs.filehandling.node.DbfsConnectorNodeFactory' from plugin 'org.knime.bigdata.databricks' could not be created. The corresponding plugin bundle could not be activated!

Hi @Mol1hua,

can you try to reinstall the Databricks Integration? There is an uninstall button in the Installation details dialog (where you made the screenshot above).

Cheers
Sascha

@Mol1hua you could check whether the Spark cluster is up and running.

And for the upload of the Parquet files you will have to make sure that all the paths are working. Is there any other error message, or could you share a log with debug logging activated?

You might also try reinstalling the whole software and extensions; I assume you have already tried a restart. If you use several instances of the Local Big Data Environment, they are not separated: the first one is also used by the others, which might mess with any paths you have set.

2 Likes

Hi @sascha.wolke and @mlauber71,

Thank you for your answers and suggestions!

I have now tried to re-install the Databricks extension, no luck.
I still get the same error.

  • The Spark cluster is also up and running
  • I have also tried changing the context name of the Local Big Data Environment, with no change

Is there maybe a specific order in which I have to install these interlinked packages?
I have attached the console log with debug level logging:
knime_version_compatibility.log (356.2 KB)

Hi @Mol1hua,

can you try to run KNIME with the -clean option?

Open the knime.ini in a plain text editor, add -clean on the first line and restart KNIME. (Be sure to remove it afterwards, as it makes the startup time longer.)
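If you prefer doing this from a shell, here is a small sketch; the helper function is just illustrative, and it assumes you run it from the KNIME installation directory where knime.ini lives:

```shell
# Sketch: prepend -clean to a knime.ini file, keeping a backup copy.
# Usage (from the KNIME installation directory): prepend_clean knime.ini
prepend_clean() {
  cp "$1" "$1.bak"                          # keep a backup of the original
  printf -- '-clean\n' | cat - "$1.bak" > "$1"  # put -clean on the first line
}
```

To undo it later, restore the backup (`mv knime.ini.bak knime.ini`) or simply delete the -clean line again.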

Does this help? Are you using a fresh KNIME 4.4.2 build or is this an update installation?

Cheers
Sascha

3 Likes

Hi @sascha.wolke ,

Wow, running it with -clean worked!!
The Parquet to Spark node ran without any issue.
What can I do to get the same behavior without the -clean option?

The KNIME build is fresh, installed from the self-extracting archive.

3 Likes

Hi @Mol1hua,

great that it works, and it should be safe to remove the -clean option now.

From the Eclipse documentation about the -clean option:

This will clean the caches used to store bundle dependency resolution and eclipse extension registry data. Using this option will force eclipse to reinitialize these caches.
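If the caches ever get stale again, the manual equivalent (an assumption based on the standard Eclipse/KNIME installation layout, where the caches live under the installation's configuration/ directory) would be something like:

```shell
# Rough manual equivalent of -clean: remove the OSGi/extension-registry
# caches while KNIME is closed. Run from the KNIME installation directory;
# the paths assume a standard Eclipse-based layout. Eclipse rebuilds the
# caches on the next start.
rm -rf configuration/org.eclipse.osgi configuration/org.eclipse.core.runtime
```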

Cheers
Sascha

3 Likes

Hi @sascha.wolke,

As you said, it now works without the -clean option.
Thank you very much for your support!! :slight_smile:

3 Likes

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.