DB Insert node is taking 1.5 minutes to load 200 rows! Could you please let me know how I can make it faster?
Source and destination are MariaDB tables.
I tried the two options below:
Added the following to the knime.ini config file:
-Dknime.database.fetchsize=50000
-Dknime.url.timeout=9000
Explicitly set the batch size in the node configuration. Please see the snapshot.
Please note, I am passing the destination table name using flow variables (PFA). Should I also set the batch size through a flow variable? How do I do that? Please help.
Something is definitely not right if it is taking that long.
First, regarding the options in the knime.ini file - I don't think either of them has any impact on inserting data into databases. Go to Help --> Help Contents --> KNIME --> KNIME runtime options for more information.
Regarding flow variables - whether you define something explicitly or via flow variables has no impact on execution speed, so I suggest you write the schema and table explicitly while developing and switch to flow variables later, if needed of course. Writing 100000 next to Batch Size means you will create a flow variable named “100000” that receives the value the batch size has, which is probably not what you want. Here is a video where you can learn more about flow variables: https://www.youtube.com/watch?v=Y8hVkTvHR9I
Now to your problem. At first glance this seems like a network issue. Where is your DB and where is KNIME? Can you try inserting these 200 rows with some other tool from the same computer KNIME runs on?
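If you want a quick check outside of KNIME, here is a minimal JDBC sketch (connection details, credentials and table name are placeholders you would have to adapt) that times 200 single-row inserts - if every insert pays a full network round trip, high latency between your machine and the DB will show up immediately:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class InsertLatencyCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details - adjust to your environment.
        String url = "jdbc:mariadb://db-host:3306/mydb";
        try (Connection con = DriverManager.getConnection(url, "user", "password");
             PreparedStatement ps = con.prepareStatement(
                     "INSERT INTO dummy_table (id, val) VALUES (?, ?)")) {
            long start = System.nanoTime();
            for (int i = 0; i < 200; i++) {
                ps.setInt(1, i);
                ps.setString(2, "row " + i);
                ps.executeUpdate(); // one statement, one round trip to the server
            }
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println("200 single-row inserts took " + elapsedMs + " ms");
        }
    }
}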
Hi Ivan, I tried the Talend tool from the same machine and it completed the insert in just 2-3 seconds. So it looks like KNIME has some issue. I'd appreciate some pointers.
I'm not sure if there is a way to check whether KNIME is inserting row by row or in batches. I would say it is batches if you configured it so, and the problem lies somewhere else.
Did you maybe test it a bit more? For example, did you try any other table? Did you play with the batch size - lower it? Did you try the DB Writer node? You could create a dummy table with one column, insert 10 rows into it and see how that goes…
Additionally, you can always check the KNIME log for more info. The log is located in your KNIME workspace under the .metadata folder and then the knime folder.
I tried the DB Writer node and also tested the performance on other tables with some variation, but the behavior was the same. However, the performance is fine when I executed it on a machine that is close to the databases (India vs. US location).
So it looks like the issue was network-related.
Though I have found a workaround for this problem, I am curious why only KNIME has this problem when another tool could do it pretty fast. Should I check some KNIME configuration on my machine, e.g. memory?
Hi ddnyaneshwar,
KNIME uses batch execution. You can change the size of the batch via the “Batch size” option in the node dialog. A higher batch size increases the amount of memory that is consumed but lowers the number of network round trips. However, this doesn't seem to be the problem here, since you insert only 200 rows with a batch size of 200. Inserting 200 rows shouldn't take this long in general, so maybe it is a network-related issue.
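To illustrate the trade-off, here is a rough JDBC sketch of how batched inserts work (connection details, table and column names are made up): rows are buffered on the client with addBatch() and sent to the server together when executeBatch() is called, so a larger batch size means fewer round trips but more rows held in memory at once.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class BatchedInsertSketch {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:mariadb://db-host:3306/mydb"; // placeholder
        int batchSize = 200;                             // corresponds to the "Batch size" option
        try (Connection con = DriverManager.getConnection(url, "user", "password");
             PreparedStatement ps = con.prepareStatement(
                     "INSERT INTO my_table (id, val) VALUES (?, ?)")) {
            con.setAutoCommit(false);
            for (int i = 0; i < 200; i++) {
                ps.setInt(1, i);
                ps.setString(2, "row " + i);
                ps.addBatch();                           // buffered on the client, nothing sent yet
                if ((i + 1) % batchSize == 0) {
                    ps.executeBatch();                   // send the buffered rows as one batch
                }
            }
            ps.executeBatch();                           // flush any remaining rows
            con.commit();
        }
    }
}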
Another thing you can try for large data is the DB Loader node, which does bulk loading. For example, for MySQL the node uses the LOAD DATA command. As far as I know you can connect to MariaDB with the MySQL driver, so maybe you can try the DB Loader with the built-in MySQL 5 driver to load the data into your table. The only drawback of the LOAD DATA command is that it uses CSV files for data exchange, so it doesn't perform data validation, and working with string columns that contain new lines and other special characters is tricky.
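For illustration only, this is roughly what such a CSV-based bulk load looks like when issued through JDBC (the file path, table name and URL option are assumptions - local infile support usually has to be enabled on both the server and the driver):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class BulkLoadSketch {
    public static void main(String[] args) throws Exception {
        // allowLoadLocalInfile / local_infile may need to be enabled - check your driver and server settings.
        String url = "jdbc:mysql://db-host:3306/mydb?allowLoadLocalInfile=true";
        try (Connection con = DriverManager.getConnection(url, "user", "password");
             Statement stmt = con.createStatement()) {
            // A single statement streams the whole CSV file to the server - no per-row round trips,
            // but also no per-row validation, as mentioned above.
            stmt.execute(
                "LOAD DATA LOCAL INFILE '/tmp/rows.csv' " +
                "INTO TABLE my_table " +
                "FIELDS TERMINATED BY ',' ENCLOSED BY '\"' " +
                "LINES TERMINATED BY '\\n'");
        }
    }
}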
Bye
Tobias