ETL Help: Loading SAS Datasets to MS SQL Server

Hi Helal,
DB Writer is a combination of the DB Table Creator and DB Insert nodes, to make it easier to write data into a new table in the db. However, usually one creates a DB table once and then continues to insert records into the already created table. This is why we split up the two operations. The DB Table Creator also allows you to specify indices, e.g. primary/unique keys, which you should consider creating to speed up querying or updating the data later on in the db.
You can use flow variables to automatically use the same schema and table name in the DB Insert node as in the DB Table Creator node. In fact, the DB Table Creator node automatically exposes the table name and schema as flow variables (they are called tableName and schema). In the DB Insert node you can use the flow variables via the two buttons next to the schema and table name input fields, as shown below:
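If it helps to see the same split as code: here is a minimal sketch of the create-once/insert-many pattern outside of KNIME, assuming Python with pyodbc. The connection string, table, and columns are placeholders, not taken from this workflow.

```python
import pyodbc

# Placeholder connection; adjust driver/server/database for your setup.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=localhost;DATABASE=dbtest02;Trusted_Connection=yes;"
)
cur = conn.cursor()

# Done once, like the DB Table Creator node -- including a primary key
# to speed up querying and updating the data later on.
cur.execute("""
    IF OBJECT_ID('dbo.my_table', 'U') IS NULL
    CREATE TABLE dbo.my_table (
        id    INT           NOT NULL PRIMARY KEY,
        label VARCHAR(1024) NULL
    )
""")

# Done repeatedly, like the DB Insert node.
cur.execute("INSERT INTO dbo.my_table (id, label) VALUES (?, ?)", 1, "first row")
conn.commit()
```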


Bye
Tobias


Thank you again, Tobias. I followed your instructions and the DB Insert node has a green light, but no tables are written to SQL Server. Here is the new workflow:

Also, I can’t make a connection between the DB Insert node and the Variable Loop End node. Do I need the latter at all? Is the loop node the reason the tables are not in SQL?

Helal

Hi Helal,
you need to connect the DB Insert node with the Variable Loop End node to close the loop. Otherwise the loop body will only be executed once. You can connect the DB Insert node using the flow variable ports that every node has but which are hidden by default.
For more details on Flow Variables and Loops in KNIME have a look at our online course.
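For intuition, here is what the closed loop effectively does, expressed as a Python sketch: read every SAS file in a folder and hand each one to the insert step. The folder, the file pattern, and the insert_frame helper are hypothetical placeholders, not parts of the actual workflow.

```python
from pathlib import Path
import pandas as pd

def insert_frame(df: pd.DataFrame, table: str) -> None:
    """Hypothetical stand-in for the DB Insert node."""
    print(f"would insert {len(df)} rows into {table}")

# One pass per file -- this is the loop body that runs once per iteration.
for sas_file in sorted(Path("C:/data/sas").glob("*.sas7bdat")):
    df = pd.read_sas(sas_file)
    insert_frame(df, table=sas_file.stem)
```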
Bye
Tobias


Thank you again, Tobias. I will do that.

Helal

Hi Tobias,
I followed the instructions and was able to get it to work yesterday by importing 10 SAS data files to SQL. Today I receive the “String or binary data would be truncated” error, even though I set the string SQL type in the DB Table Creator to 1024, and one time I set it to max. I looked at the log to see which file(s) is causing it but didn’t find specific file or field information. Any thoughts?
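One way to narrow this down outside of KNIME is to scan each file’s string columns for values longer than the declared limit. A rough sketch, assuming the files are readable with pandas; the folder path and the 1024-character limit are placeholders:

```python
from pathlib import Path
import pandas as pd

LIMIT = 1024  # the length declared in the DB Table Creator node

for sas_file in Path("C:/data/sas").glob("*.sas7bdat"):
    df = pd.read_sas(sas_file, encoding="latin-1")
    # Check every string (object-typed) column for over-long values.
    for col in df.select_dtypes(include="object"):
        max_len = df[col].astype(str).str.len().max()
        if max_len > LIMIT:
            print(f"{sas_file.name}: column {col} has a value of length {max_len}")
```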

Thank You,

Helal

Hi Scott,
With Tobias’s help, I am getting very close to completing my ETL process of reading multiple SAS data sets and importing them into SQL. However, after troubleshooting a few issues, another one emerged. This one is at the DB Insert node, and the error says: cannot insert NULL into column on table. I made sure that the “Not Null” option in the DB Table Creator node is unchecked for all columns in the data sets. Attached is the new KNIME workflow. Any suggestions on how to resolve the issue?

Thank you,
Helal

Not offhand, but I am not a DB expert unfortunately. Do you have any additional information from the log, perhaps? Let me ask internally.

Thank you. Attached is the log.
Knime.log (41.0 KB)

Hello helalmob,
is it possible that the table dbtest02.dbo.btact already existed in the DB with the Not Null option enabled? By default the “Fail if table exists” option in the DB Table Creator node is disabled, which is why you might not have noticed it. In order to recreate the table, simply add a DB Table Remover node before the DB Table Creator node.
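To verify this suspicion, one could also query the column metadata of the existing table directly. A hedged sketch in Python with pyodbc; the connection details are placeholders:

```python
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=localhost;DATABASE=dbtest02;Trusted_Connection=yes;"
)
# IS_NULLABLE is 'NO' for columns that still carry a NOT NULL constraint.
rows = conn.execute("""
    SELECT COLUMN_NAME, IS_NULLABLE
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_SCHEMA = 'dbo' AND TABLE_NAME = 'btact'
""").fetchall()
for name, nullable in rows:
    print(name, nullable)
```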
Bye
Tobias


Two things. There is a warning about a timeout of 3 seconds which might not be enough.

2020-03-20 11:10:09,303 : WARN : main : : Node : DB Insert : 0:381 : java.sql.SQLException: I/O Error: Connection reset by peer: socket write error
2020-03-20 11:10:21,742 : WARN : main : : DBDataPortObjectSpec : DB Insert : 0:381 : Node configuration was aborted after 3 seconds. To change the timeout parameter 'Retrieve in configure timeout' go to the 'Advanced' tab of the database connector node.

And a strange warning about a file called “null” that could not be read by the SAS reader, which might be connected to the NULL problem.

2020-03-19 20:22:56,563 : WARN : main : : Node : SAS7BDAT Reader : 0:369 : C:\Program Files\KNIME\null (The system cannot find the file specified)

You might want to check that. Maybe use a path with the knime:// protocol.

Maybe you could provide us with a simple SAS file with innocuous data from your system so we could have a try.


It appears you have bypassed the Data Cleanup node before the DB Insert. You should link the Data Cleanup node to your DB Insert node; otherwise none of the cleaned-up data is being passed to SQL Server.

Burke


Hello Tobias,
For this trial, and to get the ETL process working, I usually delete all tables in the test database before running the workflow. So there are no tables in the test database. Later, once the workflow works, I will add the node you mentioned.

Thank you,

Helal

I am not sure why I get C:\Program Files\KNIME\null (The system cannot find the file specified).

Helal

Me neither. But this is the point where the node seems to fail. Hence my suggestion to use the KNIME protocol to see if it changes anything, and to provide us with a minimal example that reproduces the error.

Also, you could try an entirely new node. I have some hints that with certain nodes not all relevant settings get reset.

Hi Helal,
The “The system cannot find the file specified” message is only a warning and not the main reason why your workflow fails.
According to the KNIME log the DB Insert node fails because you try to write missing values into a database column that does not support NULL (missing) values.
Can you please double-check that you deleted the table from the database, either via another tool or the DB Table Remover node in KNIME, and then make sure that the Not Null option in the Columns tab of the DB Table Creator node is not selected?


Also make sure to use the same schema and table name in all the nodes.
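As an additional sanity check, one could report which columns of each file actually contain missing values before the insert. A sketch, assuming the data sits in a pandas DataFrame; report_missing is an illustrative helper, not a KNIME node:

```python
import pandas as pd

def report_missing(df: pd.DataFrame) -> None:
    """Print every column that contains at least one missing value."""
    counts = df.isna().sum()
    for col, n in counts[counts > 0].items():
        print(f"{col}: {n} missing values")

# Example usage: report_missing(pd.read_sas("btact.sas7bdat"))
```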
Bye
Tobias


Hi Tobias,
Thank you for the tips. I did everything you suggested, and now I am getting a new error that a variable in one of the data sets is invalid!
Here are screenshots of the process and associated questions:

  1. LoopOverFiles_Config: Here I am defining missing values with String = null
  2. DB Table Remover_Config: I am specifying the schema and table name. I can’t find a way to select all tables or specify multiple table names! It could be that the workflow stopped right after importing the first table into SQL
  3. DB Table Creator_Config1: I define the schema, but the Table Name can’t be left blank, so I populated it with the name of the first SAS dataset (btact). Can’t find a way to say “grab all table names from the file names”!
  4. DB Table Creator_Config2: I can’t see “Knimetry” under the Schema drop-down, nor do I see “Tablename” under the Table Name drop-down!
  5. DB Table Creator_Dynamic Settings: Here I am mapping the String type to SQL type varchar(max) to make up for those string fields with more than 1024 characters
  6. DB Insert_Config: Here are the settings for the DB Insert node
  7. DB Insert Output Type Mapping: Here is the output mapping
  8. DB Insert Flow Variables: I selected tablename from the Table drop-down list but don’t see Knimetry under the Schema drop-down
  9. Workflow before Running Last Step (Var Loop End): Here is the workflow that ran with no issues before running the last step
  10. Workflow after Running Last Step (Var Loop End): Here is the workflow that ran with errors after running the last step
  11. I included the Knime.log file for further info.

Any help will be greatly appreciated.

Helal

Knime.log (5.0 KB)

Hi Tobias,
Any suggestions to move forward?

Thank you,

Helal

Have you tried using all lowercase? Just to be sure this is not the issue.

Then, have you thought about my other remarks? Especially trying to delete and set up the relevant node from scratch, just to be sure.

Then you could try combinations of your table name and the failing variable name with different schemas from your database. Also try the same column name and schema with a table created entirely in KNIME, to get an idea of whether there might be a problem with the driver (do you have the latest one?).

Hello helalmob,

the error message states that you are trying to write a column named RecordDate into the db table Knimetry.btact, but the column does not exist in the database.

I noticed that you write the schema name sometimes as Knimetry (DB Table Creator) and sometimes as knimetry (DB Table Remover). The schema name is case sensitive, so these refer to two different schemas! Ideally you should also use flow variables for the schema name to be consistent.

Also, if you want to write each SAS file into a different db table, you need to attach the db table name to each file and use this information as a flow variable in all db nodes, e.g. DB Table Creator and DB Insert.
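As a sketch of that idea: derive the table name from each file name and pair it with one consistently spelled schema. The folder and the schema value are placeholders:

```python
from pathlib import Path

SCHEMA = "knimetry"  # pick one spelling and reuse it in every db node

for sas_file in Path("C:/data/sas").glob("*.sas7bdat"):
    table = sas_file.stem.lower()  # e.g. btact.sas7bdat -> "btact"
    # These two values are what the flow variables would carry
    # into DB Table Creator and DB Insert.
    print(f"{SCHEMA}.{table}")
```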

Bye
Tobias

