How to manage Byte Vector columns with SQL storage?

Hi guys,

How can I store a column with a “Byte Vector” format in a SQL database? Is there an ad hoc SQL format for this type of data?

By default this type of column is written in the SQL database with a string format type (using the Database Writing node). In this case it seems that it is correctly stored in the database. Anyway when I later read it from there, it seems I cannot re-use it with, for example, the Similarity Search node, as it is in string format. And I cannot find any node to re-convert the string formatted column to Byte Vector. Is there a way to do it?

Thanks a lot

Gio

You can change the SQL type for columns in the second tab of the database writer. Try using BLOB or BYTEA or similar (depends on the database).

Hi Thor,

Thank you so much for your prompt response! I'm using postgreSQL database.

If I try BLOB type it tell me:

ERROR     Database Writer                    Execute failed: ERROR: type "blob" does not exist

I'm not an expert on this but think that this type of data do not exists in postgreSQL.

Instead if I try BYTEA (that for sure exists in postgreSQL) it creates the DB table but this is empty. And a series of warning messages (like the following) is triggered for any table row:

WARN      DatabaseWriterConnection           Error while adding row #10 (Row9), reason: ERROR: column "byte_vector" is of type bytea but expression is of type character varying
  Hint: You will need to rewrite or cast the expression.

Please, do you have additional hints?

There is a Vernalis node scheduled for release soon which will convery the any of the fingerprint types to string representations and back again - if nothing more direct comes up, then that might be your best bet.

Steve

Hello again,

I'm coming back to this issue.

I discovered that the content of a KNIME Byte Vector column is of type "DenseByteVectorCell".

Does anybody know which kind of SQL standard type can be used to store this data type?

Thanks in advance,

Gio

Hi Steve,

Thank you for your feedback. I'm curious to try the release of the nodes you mentioned.

In the meantime I reach a workaround to manage Byte Vector columns with SQL storage. In practice after using the Create Byte Vector (or Create Bit Vector) node we can store it in a SQL DB using the “text” SQL type. After we re-read it from the DB we can reconvert it to Byte Vector (or Byte Vector) simply using 3 nodes:

  • String Manipulation node to remove the “{“ and “}” symbols at the beginning and at the end of byte vector column
  • Cell Splitter node to split the byte vector column into the single byte containing columns, also trimming white space characters
  • Create Byte Vector node in order to re-create the byte vector

This is just a workaround but it perfectly work.

Cheers!