Reading Mysql dump(.sql)

Hello~!

I downloaded the MYSQL dump file from https://github.com/datacharmer/test_db.
I want to read the downloaded dump file from KNIME.
I want to do this without accessing the Mysql database.
I ask how to read MTSQL dump file directly from KNIME AP.

Any suggestion would be highly appreciated.

Thanks!

Youngjin

This blog describes an approach using Unix tool awk and a script. It might be possible to set up such a solution with KNIME although it is not straightforward:

This high ranked stackoverflow entry describes an approach using R. There are also some limitations mentioned.

Or you could try and set up new MySQL database and import the dump.

2 Likes

mlauber71, thanks for the answer.
Sorry. The answer is late.

Is there an example of how you implemented it in the KNIME Analytics Platform?

OK, I built something since the mentioned examples are not straightforward (if you want to play around with them further in a subfolder of the workflow there is a /script/ folder containing some tests with R and Python scrips mentioned in the links but the results are insufficient if you have more complex INSERTS statements).

As an example I use the Mondial MySQL dump mentioned here:
https://blog.twineworks.com/converting-a-mysql-dump-to-csv-files-b5e92d7cc5dd
(which is also included in the workflow subfolder /data/)

A fair warning: it ain’t pretty and may not be for the faint-hearted, and depending on your dump some modifications might be necessary and if your data gets very big you might have a problem since KNIME Flow variables might not be able to handle it. For this example, it does work, however.

So what this workflow does is it identifies the 4 components of the SQL statements needed to (re-)build a database. The numbers will be represented in the no_marker/no_marker_orig columns.

  1. DROP TABLE IF EXISTS - initiates a clean start and makes sure the table has been dropped
  2. CREATE TABLE statement initiating the (empty) structure
  3. a row about primary keys and more informations specific to MySQL which H2 and other simple DBs might not understand. This No 3 will be replaced by a dummy line closing the CREATE statement (the construction with the dummy variable and ALTER table is not pretty at all - might just have removed the comma and added a bracket - but hey over-engineering is my thing)
  4. The INSERT statement filling the structure with the data. This is a tricky part since the data can contain quotes, pairs of information and other quirks. And the statement can be quite long. We would have to do with restrictions of memory and limitations of flow variables.

This is an overview of what you get. Fill in the MySQL dump and (re-)create the databases in H2.

This is what it looks like in a detail with the 3 steps:

Here are the three (remaining) steps in their glory:

DROP TABLE IF EXISTS `sea`;

CREATE TABLE `sea` ( `Name` varchar(25) , `Depth` int(11) ,  
  `xxx_dummy` int(1) );

ALTER TABLE `sea` DROP COLUMN `xxx_dummy`;

INSERT INTO `sea` VALUES ('Arabian Sea',5203),('Arctic Ocean',5220),('Atlantic Ocean',9219),('Baltic Sea',459),('Black Sea',2211),('Caribbean Sea',7240),('East China Sea',2370),('Gulf of Aden',5143),('Gulf of Mexico',4375),('Indian Ocean',6400),('Irish Sea',272),('Mediterranean Sea',5121),('North Sea',200),('Norwegian Sea',3860),('Pacific Ocean',11034),('Persian Gulf',102),('Red Sea',2635),('Sea of Azov',100),('Sea of Japan',4036),('South China Sea',5420),('Sunda Sea',7440),('Yellow Sea',200);

This might be used as an inspiration how such a problem might be handled since various dumps might have varying structures.

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.