Knime: Session is down

Hi, I have been using Knime to load millions of records daily. Suddenly I see an issue with Knime when the data load goes beyond 100 million records: the SSH tool throws an error while committing the data into the destination. I have attached the screenshot below. Any leads would be appreciated.

Hi @bala_13

Can you let us know what is the destination you have? Does this problem happen frequently?

Best wishes
Ana

Hi @ana_ved
I am trying to load data into ThoughtSpot. I have been loading the data for a few months, but it suddenly started throwing this error. Now I see the error irrespective of the number of records, even for smaller loads of around 20-40 million. I am not sure what the problem is.
Yes, the problem happens frequently now.

Hi @bala_13, I don’t think that this issue is caused by Knime. As with any SSH client, I would not expect Knime to disconnect you from an SSH session unless you explicitly disconnect from it.

Many factors can cause a session to be lost. Among other things:

  1. Network issues such as an inconsistent connection - if there is any interruption, say you lose the connection for a few seconds, it may not be noticeable, but it’s enough to break a session.
  2. The session has expired - this is usually a security precaution where the host kills sessions, usually idle ones, after a certain time.
  3. The host is killing your session because you are “abusing” its resources (number of connections, bandwidth usage, etc.).
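
For the idle-session case (point 2), a common mitigation with the OpenSSH command-line client is to send keepalive probes so the session does not look idle. The sketch below only builds such a command line. It assumes the load is started through the `ssh` client; the host `loader@example-host` and the script name `load_data.sh` are hypothetical placeholders. Whether Knime’s SSH tool exposes an equivalent setting is not confirmed anywhere in this thread.

```python
def ssh_command_with_keepalive(host, remote_command, interval_s=60, max_missed=3):
    """Build an OpenSSH client command line that sends keepalive probes.

    ServerAliveInterval: seconds of silence before the client sends a probe.
    ServerAliveCountMax: unanswered probes tolerated before the client gives up.
    """
    return [
        "ssh",
        "-o", f"ServerAliveInterval={interval_s}",
        "-o", f"ServerAliveCountMax={max_missed}",
        host,
        remote_command,
    ]


# Hypothetical host and script, for illustration only.
cmd = ssh_command_with_keepalive("loader@example-host", "bash load_data.sh")
print(" ".join(cmd))
```

Keepalives do not help if the host deliberately ends the session (point 3), and they do not repair a real network outage (point 1).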

Can you elaborate on how you are loading these 100 million records? Are they loaded in batches?

Hey @bruno29a, I don’t think there is a network issue, because it’s not happening for all the workflows I use to load data. Also, for such a huge load I am using a separate host. I read the data from the database, segment it into 6 CSV files, and then load the data to the destination with the SSH tool, using a bash file. But your points do make sense.

Hi @bala_13, thanks for the additional information.

“I don’t think there is a network issue, because it’s not happening for all the workflows I use to load data”. The longer a workflow stays connected over an SSH pipe, the higher its risk of losing the connection compared with a workflow that is connected for a shorter time. So unless the workflows that load successfully run at the same time, and for the same or a longer duration, as the workflows that fail, you’re not necessarily comparing apples with apples.
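
To put rough numbers on that, here is a minimal sketch. It assumes, purely for illustration, that the connection has a constant and independent 0.1% chance of dropping in any given minute; that figure is made up and is not a measurement from this setup.

```python
def survival_probability(p_drop_per_minute, minutes):
    """Chance that a session survives `minutes` minutes without a drop,
    assuming an independent, constant drop probability per minute."""
    return (1.0 - p_drop_per_minute) ** minutes


# Hypothetical drop rate: 0.1% per minute.
short_run = survival_probability(0.001, 10)   # a 10-minute load
long_run = survival_probability(0.001, 240)   # a 4-hour load
print(f"10 min: {short_run:.1%}, 4 h: {long_run:.1%}")
```

Under that assumption the short load survives about 99% of the time and the 4-hour load only about 79% of the time, even though both run over the same network.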

“Also, for such a huge load I am using a separate host”. What do you mean by a separate host? By host, I meant the server you are connecting to, which in this case would be ThoughtSpot. I’d still be interested to hear what you meant though, as it could help figure out the issue.

For your 6 CSV files, how many records and columns are there per file? And how are you loading them: all 6 at the same time, or one after the other?
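
If the files turn out to be loaded in one long session, one option is to load them one after the other and retry only the file that failed. This is a minimal sketch, not a description of the current bash file: `load_one` is a hypothetical callable standing in for whatever actually pushes one file to the destination, and the file names are made up.

```python
import time


def load_files_sequentially(paths, load_one, max_attempts=3,
                            base_delay_s=1.0, sleep=time.sleep):
    """Load each file in turn, retrying a failed file with exponential backoff.

    `load_one(path)` is assumed to raise ConnectionError when the session drops.
    """
    loaded = []
    for path in paths:
        for attempt in range(1, max_attempts + 1):
            try:
                load_one(path)
                loaded.append(path)
                break
            except ConnectionError:
                if attempt == max_attempts:
                    raise  # give up on this file after the last attempt
                sleep(base_delay_s * 2 ** (attempt - 1))
    return loaded


# Demo with a fake loader that drops the session once on the third file.
attempts = {}


def flaky_loader(path):
    attempts[path] = attempts.get(path, 0) + 1
    if path == "part_3.csv" and attempts[path] == 1:
        raise ConnectionError("session is down")


files = [f"part_{i}.csv" for i in range(1, 7)]
done = load_files_sequentially(files, flaky_loader, sleep=lambda s: None)
print(done)
```

Whether retrying a file can create duplicate rows depends on how the destination commits data, which this sketch does not address.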

Hi @bala_13

Just wanted to check if you are still facing issues?
