01_Create_ESCO_network_with_SQLite

Create ESCO network with SQLite

This workflow creates an ESCO network with SQLite. It pulls the ESCO dataset from text files, loads it into a database, extracts and manipulates the graph data, and visualizes the results in OrientDB, the open source NoSQL database management system. The workflow shows the capabilities of the OrientDB nodes by demonstrating an ETL process from SQLite to OrientDB using the Connection, Command and Execute nodes. The Connection node is the general starting point for working with OrientDB, the Command node executes upload operations, and the Execute node performs batch uploads.


This is a companion discussion topic for the original entry at https://kni.me/w/VTF6vkt0DHqXejPu

Great example! However, some adjustments are needed for this workflow to avoid fatal errors:

  • First and most important, the file occupationSkillRelations.csv is missing on the KNIME server. Fortunately, you can find it in this github repo and replace it locally.
  • Second, some edge classes need to be declared (at least if you’re running OrientDB 3.x.x). You can do that by pasting the following into the console or into an OrientDB Command node:
    CREATE CLASS IS_A IF NOT EXISTS EXTENDS E
    CREATE CLASS REQUIRES_ESSENTIAL EXTENDS E
    CREATE CLASS REQUIRES_OPTIONAL EXTENDS E
    
    Additionally, I encourage you to create these vertex classes (and use them instead of V) to avoid other minor errors:
    CREATE CLASS ESCO IF NOT EXISTS EXTENDS V
    CREATE CLASS Skill IF NOT EXISTS EXTENDS V
    CREATE CLASS Ocupation IF NOT EXISTS EXTENDS V
    
  • Finally, the last ingestion, which links skills with occupations, is around 10 million rows. It can exhaust memory, because the OrientDB Execute node always returns a JSON object for each document created. This can be solved with a Chunk Loop arrangement (batch size between 100 and 1000 rows) that discards the JSON result before closing the loop.
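
Outside KNIME, the same chunking idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the workflow itself: `send_batch` is a hypothetical callable standing in for whatever performs the upload (it is not part of any OrientDB or KNIME API), and the default chunk size of 500 is simply a value inside the range mentioned above.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def chunked(rows: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield consecutive lists of at most `size` rows from `rows`."""
    if size <= 0:
        raise ValueError("size must be a positive integer")
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def upload_in_chunks(
    rows: Iterable[dict],
    send_batch: Callable[[List[dict]], List[dict]],  # hypothetical uploader
    size: int = 500,
) -> int:
    """Upload rows chunk by chunk and return how many results came back.

    Only a running count is kept; the per-row result of each chunk is
    dropped before the next chunk is read, so memory use stays bounded
    by the chunk size rather than by the total number of rows.
    """
    total = 0
    for batch in chunked(rows, size):
        result = send_batch(batch)  # assumed to return one JSON-like dict per row
        total += len(result)
        del result  # discard the result before the next iteration
    return total
```

Passing a generator as `rows` keeps the input side lazy as well, so neither the source rows nor the returned documents are ever held in memory all at once.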

I would gladly provide an updated version of this workflow (with these adjustments and other updated nodes), but I don’t know how to post it on the KNIME Hub.

Hi @pistolilla and welcome!

It would be great if you could post an updated version. You can do that by checking out the Hub help page here: https://hub.knime.com/site/about

Also I will tag @Redfield so they know.

Hello @pistolilla, and thank you for your interest in the OrientDB extension. I am sorry for the late response.

I carefully read your notes on the workflow. Regarding the first two points: it seems that you have not read the blogpost, where the schema creation process is described. However, the schema can also be created with KNIME nodes, not just OrientDB Studio, so it is better to add this additional step to the workflow.

Regarding your last point: I did not know about this trick with the Chunk Loop node, so I will run some tests; it might be very helpful.

I believe it is better to update this example workflow. @ScottF, is it possible to update the workflow as well as the link to it in the blogpost? If not, I think I can prepare another version of the workflow with the fixes that @pistolilla suggested.

Best regards,
Artem.

Hi @Redfield -

I would leave it to you to update the workflow in whatever way seems best to you, but we can certainly update the link (and screenshots, if necessary) in the blogpost to keep it current. Just let me know. If it’s easier you can reach me via email at scott.fincher@knime.com to coordinate.