Jupyter Notebook Node

For me, combining (default) KNIME nodes with Jupyter Notebooks is becoming more and more the standard. I am very happy with the recently released Python nodes and conda environment configuration nodes!

Now, would it be possible to create a node, like the Python node, that runs a full Jupyter Notebook in configure mode? I imagine opening the configure window and getting a folder view pointing to my default notebook save path (preferences), allowing me to open an existing notebook or create a new one. On opening, a Jupyter kernel is started and the hosted notebook is shown inside the KNIME node configure window. On execution, the selected conda environment is used directly.

Would others be interested in such a node?

Hi @Avit,

That is an interesting idea. Sorry for the delayed response. I wanted to see first what the community thinks about it.

I have some questions for you, since I need to know the exact requirements before creating a feature request. What would be the output of this node? A table? An image?

What do you mean by running a Jupyter Notebook in configure mode? Do you mean you see the notebook as you would see it in the browser … and have the option to execute the commands?

Best,
Temesgen

Hi @temesgen-dadi,

Thank you for investigating the idea.

In my view, it would allow combining data, variable and binary/image inputs and outputs. Perhaps, similar to variables for components, the designer could select which ports to send to the notebook and/or receive back. Variables could be set 1-to-1 in the notebook environment, a data port would arrive as a DataFrame (same as with the regular Python node), and binary/image ports maybe as a list of references.

The look and feel would be, in my view, the notebook page with all types of cells, embedded in a KNIME configuration window, indeed allowing per-cell execution and immediate outputs.
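To make the port mapping a bit more concrete, here is a minimal sketch of how such a node might behave on execution. It is only an illustration under stated assumptions, not how KNIME works today: `run_notebook`, `input_table` and `output_table` are hypothetical names, a plain list of dicts stands in for the DataFrame, and the code cells are run with `exec` instead of a real Jupyter kernel.

```python
import json


def run_notebook(notebook_json, flow_variables, input_table, output_names):
    """Run all code cells of an .ipynb document in one shared namespace.

    Hypothetical helper: flow variables are set 1-to-1 as Python names,
    the data port arrives as `input_table`, and the names listed in
    `output_names` are read back after the last cell.
    """
    notebook = json.loads(notebook_json)
    namespace = dict(flow_variables)          # variables set 1-to-1
    namespace["input_table"] = input_table    # stand-in for the DataFrame port

    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue                          # skip markdown/raw cells
        source = cell.get("source", "")
        if isinstance(source, list):          # .ipynb may store source as lines
            source = "".join(source)
        exec(source, namespace)               # a real node would use a kernel

    return {name: namespace[name] for name in output_names}


# A tiny notebook in the nbformat 4 JSON layout, written inline for the example.
EXAMPLE_NOTEBOOK = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Scale a column"]},
        {"cell_type": "code", "metadata": {}, "outputs": [], "execution_count": None,
         "source": ["scaled = [row['value'] * factor for row in input_table]\n"]},
        {"cell_type": "code", "metadata": {}, "outputs": [], "execution_count": None,
         "source": ["output_table = [{'value': v} for v in scaled]\n",
                    "row_count = len(output_table)\n"]},
    ],
})

result = run_notebook(
    EXAMPLE_NOTEBOOK,
    flow_variables={"factor": 2},
    input_table=[{"value": 1}, {"value": 3}],
    output_names=["output_table", "row_count"],
)
print(result)
```

The designer would pick which names go in and which come back, much like selecting variables for a component.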

Regards,
Arvid

The idea is just great.

And I think it is very necessary. I spent a lot of time looking at different solutions: IDEs, editors, platforms. All of them have some drawbacks that put you off using them (or that you have to put up with).

For example, I was looking for a solution with the following components:

  1. “All in 1”, with no need to use third-party solutions or import/export:
  • All tools are available in one environment.
  • At the same time, a tool should not merely exist in the environment (for example, being able to open a file with a certain extension); integrating it should extend existing capabilities and add new ones. A good example is language support in powerful IDEs, where in addition to simply “opening a file” you get syntax analysis, templates, links, etc.
  • Extensibility through plugins, extensions, modules, third-party libraries and virtual environments.
  • Minimization of conflicts and dependencies.
  • Creation of your own modules, templates and extensions at all levels (except, most likely, the engine): graphical interface, modules, libraries, tools, etc.
  2. Convenience of writing code:
  • Loading of documentation for each language and for all libraries (in R and Python, for example).
  • Syntax checking.
  • Substitutions (autocompletion).
  • Templates, and a constructor based on templates and modules.
  3. Convenience of “research”:
  • Dynamic code execution with preservation of intermediate results. Notebooks are the best example.
  • Using different languages in one file, notebook or project. (A notebook itself can be a sequential list of nodes.)
  • Dynamic visualization, widgets.
  4. Possibilities for scaling and parallelizing tasks.
  5. Opportunities to share your best practices, with the possibility of packaging them into a ready-made solution or project.
  6. Support and updates for all of the above items.

I would really like to see points 2 and 3 developed further. A high-quality, extensible, universal code editor is badly lacking (its engine, with the appropriate plugins, could be loaded wherever code and scripts are edited).
And given the popularity and convenience of dynamic programming in notebooks, it would be very nice to see a notebook built into the environment itself. Perhaps it will not look like the others, but the main thing is that it fulfills its task.

The task of the notebook, in my opinion, is fast and productive research: for example, being able to start training a machine learning model and then experiment with visualizing the results, without having to configure any “transitions” such as recorded logs or saved parameters (which is often unnecessary for experiments). So the best solution might be to make the notebook a separate window with a separate session that has its own variables and its own kernel (better still, a universal kernel that can connect several languages at the same time and let them communicate with each other), plus the capabilities common to all notebooks and, most importantly, the ability to export one or more cells to one or more nodes.

It is possible that a notebook could even be inserted as one large node that can be run like any regular node, but that can also be opened and edited in notebook mode. Once the solution is final, it could be exported to actual nodes, with a choice of node types from the existing ones (adding and setting the corresponding properties).
It might then be worth making all the basic script nodes “notebook cell nodes”.
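
As a rough illustration of the “export cells to nodes” idea, here is a small sketch. Everything in it is an assumption made for the example: `cells_to_nodes` is a hypothetical helper, a “node” is just a dict holding a name and a script, and the notebook is read as plain nbformat 4 JSON.

```python
import json


def cells_to_nodes(notebook_json, groups):
    """Turn groups of code cells into script "nodes".

    `groups` is a list of lists of code-cell indices; each group becomes
    one hypothetical script node whose script is the joined cell sources.
    """
    notebook = json.loads(notebook_json)

    # Collect the source text of every code cell, in notebook order.
    code_cells = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):   # .ipynb may store source as lines
            source = "".join(source)
        code_cells.append(source)

    nodes = []
    for position, group in enumerate(groups, start=1):
        script = "\n".join(code_cells[i].rstrip("\n") for i in group)
        nodes.append({"name": f"Script node {position}", "script": script})
    return nodes


NOTEBOOK = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "code", "metadata": {}, "outputs": [], "execution_count": None,
         "source": ["import math\n"]},
        {"cell_type": "markdown", "metadata": {}, "source": ["Some notes"]},
        {"cell_type": "code", "metadata": {}, "outputs": [], "execution_count": None,
         "source": ["radius = 2\n"]},
        {"cell_type": "code", "metadata": {}, "outputs": [], "execution_count": None,
         "source": ["area = math.pi * radius ** 2\n"]},
    ],
})

# Export the first two code cells as one node and the third as another.
nodes = cells_to_nodes(NOTEBOOK, [[0, 1], [2]])
for node in nodes:
    print(node["name"])
    print(node["script"])
```

In a real implementation each exported node would also need its inputs and outputs declared, since the second node above depends on names defined in the first.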

But these are all just ideas, and the implementation is a completely different matter …