Unpacking arrays in Knime

Hi there,

I’m transforming data and saw that some geometry data in my oracle database is being displayed as an array when read into knime as:

oracle.sql.ARRAY@(various numbers - i removed for confidentiality)

Does anyone know how to unpack arrays in Knime? Essentially in Oracle, it is displaying what I would interpret as x and y coordinates but is displaying in such a way in Knime that I can’t work with.

I read some vague solutions that Java Snippet can be used but I am not familiar with Java Snippet node.

Cheers,

Hi,
Do you get back a KNIME collection column with an icon like this: […]? Then you can use Split Collection Column or Ungroup to unpack the array.
Kind regards
Alexander

3 Likes

Hi Alexander,

In KNIME, I just see a sequence of numbers so I can’t even see what’s in the brackets but in Oracle, I can see it contains x & y coordinates).

So for example, the field in the KNIME table once I read it out of oracle looks like oracle.sql.ARRAY@ahUtB2Nk78

So I’m unsure of how I can convert it.

Kind regards,

Hi @summer_le

This looks like an object reference, rather than the data itself, is being returned from the database.

What does the data look like in the database, and what is the sql query that is returning this?

1 Like

@summer_le

Suppose I have the following demo table on Oracle:

CREATE TABLE bb_dm_testarray (
  id NUMBER,
  my_string_vals SYS.ODCIVARCHAR2LIST
);

INSERT INTO bb_dm_testarray (id, my_string_vals)
VALUES (1, SYS.ODCIVARCHAR2LIST('Value 1', 'Value 2', 'Value 3'));


INSERT INTO bb_dm_testarray (id, my_string_vals)
VALUES (2, SYS.ODCIVARCHAR2LIST('Value 4', 'Value 5'));


INSERT INTO bb_dm_testarray (id, my_string_vals)
VALUES (3, SYS.ODCIVARCHAR2LIST('Value 6', 'Value 7','Value 8','Value 9'));

commit;

I’ve just used the built in sys.odcivarchar2list type for the purposes of a demo, to avoid creating a user-defined array type on my database.

selecting from this table, directly in Oracle Sql developer it would look like this:

select * from bb_dm_testarray

Querying that table in KNIME, using DB Query Reader I see this:

which I think is what you are seeing.

Each of those array items is an object reference and is of no use to us in this form so we need to deal with them to return their data to KNIME.

I found two ways to achieve this.

1. Using a loop with a query

The Table Row To Variable Loop Start node takes each ID in turn from the original query and places it in the ID flow variable.

The DB Query Reader node inside the loop uses the Oracle TABLE function to return the array as a table for the currently selected row (identified by ID, using the “ID” flow variable).

SELECT $${DID}$$ AS ID, COLUMN_VALUE AS STRING_VAL
FROM 
	TABLE(
		SELECT my_string_vals FROM bb_dm_testarray
			WHERE id = $${DID}$$
		)

The loop repeats for each of the returned IDs from the first query, and builds up the data table:

image

This unfortunately could take quite a lot of time on a very large data set as it requires a database call per row returned, which is where method 2 could be better…

2. Using LISTAGG function along with the Oracle TABLE function
Here we can return the data in a single select statement by using the Oracle LISTAGG function to turn the array into a delimited string:

image

SELECT id, LISTAGG(COLUMN_VALUE, '|') 
	WITHIN GROUP (ORDER BY COLUMN_VALUE) AS delimited_values
FROM (
  SELECT id, COLUMN_VALUE
  FROM bb_dm_testarray, TABLE(my_string_vals)
)
GROUP BY id

You can see from the preview results in the above that it returns a delimited String. You could then follow this with something like a cell splitter, creating a “List” and then ungroup:

image

image

This second option should be much faster on larger datasets because it is a single query rather than looping through multiple database calls.

Hopefully there are sufficient screen shots and information here for you to be able to replicate something similar with your query.

5 Likes

Kudos to your work and the detailed explanation. You seem to have quite a lot of experience with Oracle?
All I know about Oracle is that it is not MSSQL Server :sweat_smile:
br

1 Like

Thanks @Daniel_Weikert … well I first used Oracle when it was still just about the tail end of v6, back in around 1990-91, so hopefully something has rubbed off over the time :wink:

Mind you those were the exciting days of trying to make “Oracle Glue” work on Windows 3.0 and in the hope of getting data into something called Microsoft Excel v3.

ODBC and VBA were yet to be invented and Excel macros weren’t always (ever) pretty! :slight_smile: Meanwhile, I understand that Microsoft had been helping IBM build an operating system called OS/2 and had also done some DB work with a database company called Sybase and rebadged it as SQL Server, but I don’t think I would hear about that for at least another 5 or so years.

Actually looking back on a world before the interoperability that we have today… I really wouldn’t want to live that over again, not if I knew that it would be 30 years before I would start using KNIME. :open_mouth:
(apologies, I think I strayed a little off topic!)
Cheers!

2 Likes

Hi @takbb

You are correct in that in Oracle, I am seeing ‘MDSYS.SDO_ORDINATE_ARRAY(16785702.7150333, -3988822.47135409)’.

Thank you for this ideas, I am very excited to try and will come back and you know of my success (or lack of). I am a very slow worker because I go through every node to understand (I took 30 mins to work through a Column Expression node with nested if statements just to give you an idea :P).

In the meantime, have a really great weekend.

1 Like

Hi @summer_le , thank you. Hopefully it will help you get your data into KNIME.

As your array data is numeric, there may need to be some additional data conversions required when your data is returned to KNIME, if using the LISTAGG function, as obviously this will create a String containing your delimited data, but that should be relatively straightforward.

Please do let us know how you get on, and I hope you have a good weekend too. :slight_smile:

Hi,
I am not familiar with Oracle, but it does seem to have similar JSON functionality like Postgres and there is a JSON_ARRAYAGG function. Maybe that would be even better because we can easily parse the returned data using KNIME nodes?
Kind regards,
Alexander

1 Like

Good call @AlexanderFillbrunn , yes that could well be a good more generic solution.

Further to @AlexanderFillbrunn 's suggestion, the following query in a DB Query Reader would unpack the oracle array into a table and return it as a json collection. This arguably gives better options for handling the data as it is returned to KNIME:

SELECT id, json_arrayagg(column_value) AS json_string
FROM bb_dm_testarray, TABLE(my_string_vals) t
group by id

image

in fact, the earlier LISTAGG query I posted could be similarly simplified:

SELECT id, LISTAGG(COLUMN_VALUE, '|') AS array_element
FROM bb_dm_testarray, TABLE(my_string_vals) t
group by id

image

Going back to the JSON version… that comes back as a string, so it could be converted to a json object and then after that there are a variety of ways to pull the data out:

e.g. JSON to Table:

or

JSON Path:

both of which would create a collection object that can then be ungrouped as before.
image

Thinking of other possibilities, another option that would just provide the “denormalised” data, you could put the following in a DB Query and get each element back in a row of its own:

  SELECT id, COLUMN_VALUE
  FROM bb_dm_testarray, TABLE(my_string_vals)

This would return:
image

2 Likes

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.