Hi, I'm new here. I'm trying to use KNIME for my master's thesis (Twitter data analysis); in particular, I need to perform some basic k-means clustering over tweet features that a parser daemon measures.
Now, I don't need to parse all the data for that, but just a sample.
I considered using a Row Sampling node, but I wonder if there is a way to just use a LIMIT clause in the SQL query. I tried adding "LIMIT 0,1000" to the SQL query in both the Database Connector and the Database Reader, but when I click "Apply" the node gets a red dot.
Is there a way to avoid downloading everything (ideally with an SQL LIMIT clause)?
The red dot should also show you the reason in its tooltip. In your case I believe the SQL syntax is wrong: LIMIT only takes one argument (the number of rows) instead of two.
I tried using only one argument in the query and still get an error, although this time the message is quite illuminating.
My query is now
SELECT tweet.metadata FROM tweet LIMIT '1';
the error I get is:
WARN Database Reader com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''1'; LIMIT 0' at line 1
which makes me think that the program itself appends a LIMIT clause to the query. Now I have to find where to edit that value to fit my task.
This is an error in the node. It adds a "LIMIT 0" at the end of the query in order to determine the table structure. If you have a LIMIT already in the original query, the resulting SQL is invalid. This will be fixed in 2.9.
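To make the failure mode concrete, here is a minimal sketch of the two ways a tool could probe a query for its column structure. It is not KNIME's actual code: an in-memory SQLite database stands in for the MySQL server, and the helper names `probe_by_appending` and `probe_by_wrapping` are hypothetical. Appending "LIMIT 0" breaks as soon as the user's query already ends in a LIMIT, while wrapping the query in a subquery keeps the statement valid.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the MySQL server from the thread.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tweet (metadata TEXT)")
conn.executemany("INSERT INTO tweet VALUES (?)", [("a",), ("b",), ("c",)])

# A user query that already carries its own LIMIT clause.
user_query = "SELECT tweet.metadata FROM tweet LIMIT 1"


def probe_by_appending(query):
    """Append ' LIMIT 0' to the query (the behaviour described above)."""
    cur = conn.execute(query + " LIMIT 0")
    return [col[0] for col in cur.description]


def probe_by_wrapping(query):
    """Hypothetical alternative: wrap the query in a subquery, then LIMIT 0."""
    cur = conn.execute("SELECT * FROM (" + query + ") AS probe LIMIT 0")
    return [col[0] for col in cur.description]


try:
    probe_by_appending(user_query)
    appending_failed = False
except sqlite3.OperationalError:
    # Two LIMIT clauses in a row are a syntax error.
    appending_failed = True

# The wrapped form returns no rows but still exposes the column metadata.
columns = probe_by_wrapping(user_query)
print(appending_failed, columns)
```

The exact error text differs between SQLite and MySQL, but the cause is the same: a SELECT statement cannot carry two LIMIT clauses.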
That's quite a problem for my project. I'll build a dummy table to hold just the sample and use the data from that instead of the original one, but at least now I know what the problem is ;)
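A rough sketch of that workaround, again with in-memory SQLite standing in for MySQL. The table name `tweet_sample` and the sample size are placeholders, not anything from the real schema. Because the query given to the reader no longer contains a LIMIT, an appended "LIMIT 0" stays valid.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the MySQL server from the thread.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tweet (metadata TEXT)")
conn.executemany(
    "INSERT INTO tweet VALUES (?)", [(f"tweet {i}",) for i in range(5000)]
)

SAMPLE_SIZE = 1000  # placeholder sample size

# Materialise the sample once; 'tweet_sample' is a hypothetical table name.
conn.execute(
    f"CREATE TABLE tweet_sample AS SELECT metadata FROM tweet LIMIT {SAMPLE_SIZE}"
)

# The query handed to the reader node has no LIMIT of its own...
reader_query = "SELECT metadata FROM tweet_sample"

# ...so the structure probe that appends ' LIMIT 0' is valid SQL.
probe_rows = conn.execute(reader_query + " LIMIT 0").fetchall()
sample_rows = conn.execute(reader_query).fetchall()
print(len(probe_rows), len(sample_rows))
```

Note that LIMIT without ORDER BY returns whichever rows the database produces first, not a random sample, so the dummy table may need a different selection if the clustering requires a representative subset.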