Migrate Cyrillic Database

First, I apologize because my English is not very good. Without further ado, this is my problem: I have been migrating several databases from Drupal 6 to Drupal 8 without any problems. Those databases held information in Spanish, English, German and Portuguese, and everything went fine up to that point. The problem arose when I had to migrate a Russian database, which uses the Cyrillic alphabet.

The problem is not the MySQL database encoding, since it is correct: I tried inserting Russian text with an insert query (INSERT INTO …) in the database manager and it worked perfectly. I also checked in KNIME Analytics Platform (Database Reader >> Data from Database) that the Russian text is displayed correctly, and everything is fine there. Therefore, it seems to me that the problem occurs when KNIME Analytics Platform inserts the data.

I would like to know whether there is a node I could use to encode the Cyrillic characters, or whether I am missing something that KNIME Analytics Platform needs in order to insert the data correctly. I attach an image so you can see what the workflow looks like. Thank you in advance for any information you can give me. Best regards.

You could check whether your text file encoding settings are working. I would assume UTF-8 should be OK, but maybe you have to try another setting. The nodes themselves might also have encoding settings.

And could you provide an example where this is not working? Maybe with SQLite instead of MySQL so it could run independently.

Screenshots of settings here:

kn_example_cyrillic_sqlite.knwf (34.5 KB)

Thanks for answering. I have tried the example and, indeed, it works with SQLite. I then adapted it to save the information in the MySQL database, and there it does not save the data correctly. I have tried different encodings that support the Cyrillic alphabet (utf-8, utf-8 RUSSIAN CHARSET, cp1251, cp866) and none of them worked: the text keeps appearing as unknown characters (?). However, and I insist on this, when I insert the Russian data through a query (INSERT INTO …) directly in the database manager, the characters are inserted correctly. Thanks in advance for any response that can help me.
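The "?" symptom described above is typical of text being converted, somewhere along the way, to a character set that has no Cyrillic letters. The following is only a minimal sketch in plain Python, not a reproduction of what KNIME or the MySQL driver actually does; the sample text and the choice of Latin-1 as the lossy step are assumptions made for illustration.

```python
# Illustration only: reproduce the "?" symptom with plain Python strings.
text = "Привет, мир"  # sample Russian text, made up for this sketch

# UTF-8 can represent Cyrillic, so this round trip is lossless.
utf8_round_trip = text.encode("utf-8").decode("utf-8")

# Latin-1 has no Cyrillic letters; with errors="replace" each one becomes "?".
# ASCII characters such as the comma and the space survive unchanged.
latin1_round_trip = text.encode("latin-1", errors="replace").decode("latin-1")

print(utf8_round_trip)    # Привет, мир
print(latin1_round_trip)  # ??????, ???
```

If a similar lossy step happens on the connection between KNIME and MySQL, it would be consistent with the data looking correct when read and when inserted directly in the database manager, but arriving as "?" when written through the workflow.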

Have you tried a simple example, with some Russian text stored in a KNIME table that you then try to insert into the MySQL database? In your screenshot I can see HTML code. The question is whether the connector behaves differently when HTML is involved.

There was a thread some time ago where @thor suggested adding a line to the knime.ini. That thread did not mention whether this solved the problem, though. The DB in question there was also MySQL.

-Dfile.encoding=UTF-8

Thanks again for your response. I tried adding the line -Dfile.encoding=UTF-8 at the end of the knime.ini file and it did not work. I have attached a simple example so you can try it and tell me what is happening. The files are as follows:

  • db1: Database that has a table (table1) with a text field (field1) with texts in Russian.
  • db2: Identical structure to db1, but with an empty table1.
  • Workflow.zip: Sample workflow. Inserts the data from db1.table1 into db2.table1.

All these files are inside the Example.zip file that I attached. I hope you can help me. Best regards.
Example.zip (12.7 KB)

I have toyed around with the data but I cannot read the MySQL dumps in an easy way since I do not have MySQL.

The question is whether you have tried to store a small set of Russian text in a KNIME table (without HTML markup) and then write that to MySQL.

A more radical solution could be to paste the text into a SQL executor and then load it. Not very elegant, but it may be something that could work.
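As a rough sketch of that workaround, the statements to paste into a SQL executor could be generated like this. The table and column names (table1, field1) come from the example files in this thread; the helper name build_insert and the sample rows are hypothetical, and the escaping assumes MySQL's default SQL mode, where the backslash is an escape character. For anything beyond a quick test, parameterized queries would be the safer choice.

```python
def build_insert(table: str, column: str, value: str) -> str:
    """Return a MySQL INSERT statement with the value as a quoted literal.

    Hypothetical helper for illustration only.
    """
    # Escape backslashes first, then double any single quotes.
    escaped = value.replace("\\", "\\\\").replace("'", "''")
    return f"INSERT INTO {table} ({column}) VALUES ('{escaped}');"


# Sample rows made up for this sketch; the second one contains a quote.
rows = ["Привет, мир", "д'Артаньян"]
statements = [build_insert("table1", "field1", row) for row in rows]

print("\n".join(statements))
```

Because the Russian text is embedded directly in the statement, this takes the same route as the manual INSERT INTO … that already works in the database manager.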

The question is whether this hint could help:

where it says that HTML in particular must start with an additional line:

<meta charset=UTF-8>

Also:

And here it says that MySQL's "utf8" character set is not full UTF-8 and that you have to use utf8mb4 instead. So it could very well be that you have to change something in the underlying database.
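To put the utf8mb4 hint in context: MySQL's older "utf8" character set (utf8mb3) stores at most 3 bytes per character, while utf8mb4 stores up to 4. The sketch below only measures UTF-8 byte lengths in Python; the sample characters are chosen for illustration and nothing here talks to a database.

```python
# How many bytes does each sample character need in UTF-8?
samples = {"Latin": "z", "Cyrillic": "Ж", "emoji": "😀"}

byte_lengths = {name: len(ch.encode("utf-8")) for name, ch in samples.items()}

# utf8mb3 can hold characters of up to 3 bytes; utf8mb4 up to 4.
fits_in_utf8mb3 = {name: n <= 3 for name, n in byte_lengths.items()}

for name, n in byte_lengths.items():
    print(f"{name}: {n} byte(s), fits in utf8mb3: {fits_in_utf8mb3[name]}")
```

Cyrillic letters need only 2 bytes each, so they fit in either character set. Switching to utf8mb4 is still a reasonable thing to try, but on its own it may not explain the "?" characters.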
