Hi! I am new to the KNIME platform and I am testing it on a workflow that clusters text items with DBSCAN. I read data from a CSV file, filter some rows and drop some columns, and then use the Text Embedder node with the OpenAI Embeddings Connector to obtain embeddings for the text items. Because the embeddings come as a list, I use the Split Collection Column node to turn the embedding values into separate columns, so I can feed them into the Numeric Distances node, which I in turn need to feed the DBSCAN node. The OpenAI embeddings come with 1532 dimensions.

Everything seems to work fine until I run the Numeric Distances node. It does not raise any error, but it seems unable to determine the distances, and it makes DBSCAN raise an error of the type: Missing columns: “Split Value 1”, “Split Value 2”, “Split Value 3”, “Split Value 4”, … <243 more>. In the output panel of the Numeric Distances node I see a folder-like structure where the folder distance-characteristics contains array-size [xint] → 1 and 0 [xstring] → METRIC.
I am not sure what this means, but it seems the node cannot calculate the distances that need to be fed into the DBSCAN Distance Model port. So I am kind of stuck, since there is no error message when the Numeric Distances node is executed. Does anyone know what the problem could be and how it can be fixed? Thanks!
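
In case it helps to see what I expect the chain of Split Collection Column, Numeric Distances and DBSCAN to compute, here is a rough plain-Python sketch. It is not KNIME code and says nothing about how the nodes work internally. The function names, the toy 3-dimensional vectors, and the `eps` / `min_pts` values are all made up for illustration, and I am assuming Euclidean distance.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def split_collection(embedding):
    """Hypothetical stand-in for splitting a list cell into named columns."""
    return {f"Split Value {i + 1}": v for i, v in enumerate(embedding)}


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_matrix(vectors):
    """Full pairwise distance matrix (what I expect the distance step to give)."""
    n = len(vectors)
    return [[euclidean(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]


def dbscan(dist, eps, min_pts):
    """Basic DBSCAN on a precomputed distance matrix.

    A point is a core point if at least min_pts points (itself included)
    lie within eps of it.
    """
    n = len(dist)
    labels = [None] * n
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        neighbors = [j for j in range(n) if dist[i][j] <= eps]
        if len(neighbors) < min_pts:
            labels[i] = NOISE
            continue
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins the cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = [k for k in range(n) if dist[j][k] <= eps]
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point, keep expanding
        cluster += 1
    return labels


# Toy "embeddings": one list per text item (real ones are much longer).
embeddings = [
    [0.0, 0.0, 0.1],
    [0.0, 0.1, 0.0],
    [0.1, 0.0, 0.0],
    [5.0, 5.0, 5.0],
    [5.0, 5.1, 5.0],
    [9.0, 0.0, 0.0],
]

columns = split_collection(embeddings[0])  # one column per embedding dimension
labels = dbscan(distance_matrix(embeddings), eps=0.5, min_pts=2)
print(labels)  # [0, 0, 0, 1, 1, -1]
```

So what I am after is one cluster label per text item, computed from distances over all the split embedding columns.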