I wanted to run a test with Local Big Data Environment and Parquet to Spark.
Unfortunately, I get this error message:
ERROR Parquet to Spark 0:4159 Execute failed: Could not find required Spark job for Spark version: 3.0. Possible reason: The KNIME Extension for Apache Spark that provides the jobs for your Spark version is not installed. Job id: org.knime.bigdata.spark.node.io.genericdatasource.reader.GenericDataSource2SparkNodeModel
However, the KNIME Extension for Apache Spark seems to be installed.
Can anybody advise how I can fix this?
These are the installed KNIME extensions; I don’t see any obvious mismatch.
WARN DBTypeRegistry Uninstantiable default attribute definition supplier for the database type of ID databricks in extension:org.knime.bigdata.databricks
ERROR SparkProviderRegistry Problems during initialization of Spark provider with id 'org.knime.bigdata.spark.core.databricks.DatabricksNodeFactoryProvider'. Exception: Plug-in org.knime.bigdata.databricks was unable to load class org.knime.bigdata.spark.core.databricks.DatabricksNodeFactoryProvider.
ERROR SparkProviderRegistry Extension org.knime.bigdata.databricks ignored.
ERROR RepositoryManager Node 'com.vernalis.knime.mmp.nodes.loop.CutTypeLoopStartNodeFactory' from plugin 'com.vernalis.knime.chem.mmp' could not be created. The corresponding plugin bundle could not be activated!
ERROR RepositoryManager Node 'com.vernalis.knime.mmp.nodes.uniquifyids.UniquifyIdsNodeFactory' from plugin 'com.vernalis.knime.chem.mmp' could not be created. The corresponding plugin bundle could not be activated!
ERROR RepositoryManager Node 'com.vernalis.knime.mmp.nodes.fragutil.maxcuts.rdkit.RDKitMMPMaxCutsNodeFactory' from plugin 'com.vernalis.knime.chem.mmp' could not be created. The corresponding plugin bundle could not be activated!
ERROR RepositoryManager Node 'com.vernalis.knime.mmp.nodes.fragutil.filter.rdkit.RDKitMMPFilterNodeFactory' from plugin 'com.vernalis.knime.chem.mmp' could not be created. The corresponding plugin bundle could not be activated!
ERROR RepositoryManager Node 'com.vernalis.knime.mmp.nodes.fragutil.filter.rdkit.RDKitMMPSplitterNodeFactory' from plugin 'com.vernalis.knime.chem.mmp' could not be created. The corresponding plugin bundle could not be activated!
ERROR RepositoryManager Node 'com.vernalis.knime.mmp.nodes.fragutil.fragment.rdkit.RDKitMMPFragmentNodeFactory' from plugin 'com.vernalis.knime.chem.mmp' could not be created. The corresponding plugin bundle could not be activated!
ERROR RepositoryManager Node 'com.vernalis.knime.mmp.nodes.fragutil.fragment.rdkit.RDKitMulticutMMPFragmentNodeFactory' from plugin 'com.vernalis.knime.chem.mmp' could not be created. The corresponding plugin bundle could not be activated!
ERROR RepositoryManager Node 'com.vernalis.knime.mmp.nodes.pairgen.frag2pair.Frag2Pair3NodeFactory' from plugin 'com.vernalis.knime.chem.mmp' could not be created. The corresponding plugin bundle could not be activated!
ERROR RepositoryManager Node 'com.vernalis.knime.mmp.nodes.pairgen.frag2pair.ReferenceFrag2Pair3NodeFactory' from plugin 'com.vernalis.knime.chem.mmp' could not be created. The corresponding plugin bundle could not be activated!
ERROR RepositoryManager Node 'com.vernalis.knime.mmp.nodes.fragutil.render.rdkit.RDKitMMPRenderMatchingBondsNodeFactory' from plugin 'com.vernalis.knime.chem.mmp' could not be created. The corresponding plugin bundle could not be activated!
ERROR RepositoryManager Node 'com.vernalis.knime.mmp.nodes.fragutil.render.rdkit.RDKitMMPRenderCuttableBondsNodeFactory' from plugin 'com.vernalis.knime.chem.mmp' could not be created. The corresponding plugin bundle could not be activated!
ERROR RepositoryManager Node 'com.vernalis.knime.mmp.nodes.transform.rdkit.RWMolApplyTransformNodeFactory' from plugin 'com.vernalis.knime.chem.mmp' could not be created. The corresponding plugin bundle could not be activated!
ERROR RepositoryManager Node 'org.knime.bigdata.dbfs.node.connector.DBFSConnectionInformationNodeFactory' from plugin 'org.knime.bigdata.databricks' could not be created. The corresponding plugin bundle could not be activated!
ERROR RepositoryManager Node 'org.knime.bigdata.dbfs.filehandling.node.DbfsConnectorNodeFactory' from plugin 'org.knime.bigdata.databricks' could not be created. The corresponding plugin bundle could not be activated!
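The RepositoryManager errors above repeat once per node, so it can be easier to group them by plugin. A minimal Python sketch, assuming the console log lines keep exactly the format shown above; the function name is made up for illustration and is not part of KNIME:

```python
import re
from collections import Counter

# Matches lines such as:
# ERROR RepositoryManager Node '<factory class>' from plugin '<plugin id>' could not be created. ...
ACTIVATION_FAILURE = re.compile(
    r"Node '(?P<node>[^']+)' from plugin '(?P<plugin>[^']+)' could not be created"
)


def count_failed_nodes_by_plugin(log_lines):
    """Count how many nodes failed to load for each plugin id found in the log lines."""
    counts = Counter()
    for line in log_lines:
        match = ACTIVATION_FAILURE.search(line)
        if match:
            counts[match.group("plugin")] += 1
    return counts
```

Passing the lines of the console log to count_failed_nodes_by_plugin gives one entry per plugin whose bundle could not be activated, which makes it easier to see which extensions are actually affected.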
Can you try reinstalling the Databricks integration? There is an Uninstall button in the Installation Details dialog (where you took the screenshot above).
Also, when uploading Parquet files, make sure that all the paths are valid. Is there any other error message, or could you share a log with debug logging enabled?
You might also try reinstalling KNIME and all extensions from scratch. I assume you have already tried a restart. If you use several instances of the Local Big Data Environment, they are not isolated from each other: the first one is also used by the others, which might interfere with any paths you have set.
I have now tried reinstalling the Databricks extension, but no luck.
I still get the same error.
The Spark cluster is also up and running.
I have also tried changing the context name of the Local Big Data Environment, with no change.
Is there maybe a specific order in which I have to install these interlinked packages?
I have attached the console log with debug level logging: knime_version_compatibility.log (356.2 KB)
Open knime.ini in a plain text editor, add -clean as the first line, and restart KNIME. (Be sure to remove it afterward, as it makes startup slower.)
Does this help? Are you using a fresh KNIME 4.4.2 build or is this an update installation?
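For reference, the -clean edit can also be scripted. A minimal Python sketch, assuming knime.ini is a plain text file with one option per line; the function names are made up for illustration, and editing the file by hand works just as well:

```python
from pathlib import Path

CLEAN_FLAG = "-clean"


def add_clean_flag(ini_text: str) -> str:
    """Return the ini text with -clean as its first line (unchanged if already present)."""
    lines = ini_text.splitlines()
    if CLEAN_FLAG in (line.strip() for line in lines):
        return ini_text
    return "\n".join([CLEAN_FLAG, *lines]) + "\n"


def remove_clean_flag(ini_text: str) -> str:
    """Return the ini text with every -clean line removed."""
    lines = [line for line in ini_text.splitlines() if line.strip() != CLEAN_FLAG]
    return "\n".join(lines) + "\n"


def rewrite_ini(path: Path, transform) -> None:
    """Apply one of the functions above to the knime.ini at the given path, in place."""
    text = path.read_text(encoding="utf-8")
    path.write_text(transform(text), encoding="utf-8")
```

Call rewrite_ini with add_clean_flag on the knime.ini in your KNIME installation folder before starting KNIME, then call it again with remove_clean_flag after the first successful start. Keeping a backup copy of the file before changing it is a good idea.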
Great that it works. It should be safe to remove the -clean option now.
From the Eclipse documentation about the -clean option:
This will clean the caches used to store bundle dependency resolution and eclipse extension registry data. Using this option will force eclipse to reinitialize these caches.