duplicate checker

baj · June 21, 2010, 1:38pm

Hello,

I am running into problems with the duplicate checker that is being used by various nodes.
I get errors like this one:
ERROR Java Snippet Execute failed: Failed to check for duplicate row IDs
ERROR Loop End Execute failed: IOException while checking for duplicate row IDs

This suggests to me that my filesystem is overloaded or that some random numbers collide. I don’t understand why, for example, the Java Snippet is using the duplicate checker, as it only appends columns to existing rows and there is no way to change the number of rows or the row IDs… There are also other nodes that shouldn’t be using this mechanism. And I am not aware of any method to programmatically create a table without the duplicate checker being executed… I believe in my case a big portion of the execution time is being used by that functionality.
Is there a way to disable this? I would prefer it if users could verify the uniqueness of row IDs when they want or need to, but were otherwise free to skip the check…

Please let me know if you would consider making this an option…

Thanks,

Bernd

wiswedel · June 23, 2010, 9:27am

Hi Bernd,

The Java Snippet node does not generate new rows, so it shouldn’t throw any errors at you about duplicate keys. And reading the error message above, it doesn’t complain about duplicates; it complains that it failed to check for duplicates. Can you send the detailed error message of that “IOException” (some write error, full disk or so)? The duplicate checking swaps out to disk if there are many rows (and hence many row IDs).
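
To illustrate why a disk problem can surface as "failed to check" rather than "duplicate found", here is a rough sketch of a checker that spills to disk. This is not the actual KNIME code: the class `SpillingDuplicateChecker`, its methods and the chunk-file layout are made up for this post, and it assumes row IDs contain no line breaks.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;
import java.util.TreeSet;

/** Hypothetical illustration only; not KNIME's real duplicate checker. */
final class SpillingDuplicateChecker {
    private final int maxIdsInMemory;
    private final TreeSet<String> buffer = new TreeSet<>();
    private final List<Path> chunks = new ArrayList<>();

    SpillingDuplicateChecker(int maxIdsInMemory) {
        this.maxIdsInMemory = maxIdsInMemory;
    }

    /** Returns false if the ID duplicates one that is still held in memory. */
    boolean add(String id) throws IOException {
        if (!buffer.add(id)) {
            return false;
        }
        if (buffer.size() >= maxIdsInMemory) {
            spill(); // disk I/O happens here, so a full disk throws IOException
        }
        return true;
    }

    /** Writes the buffered IDs, already sorted by the TreeSet, to a temp file. */
    private void spill() throws IOException {
        Path chunk = Files.createTempFile("rowids", ".txt");
        chunk.toFile().deleteOnExit();
        Files.write(chunk, buffer, StandardCharsets.UTF_8);
        chunks.add(chunk);
        buffer.clear();
    }

    /** Merges the sorted chunks; returns a duplicate ID, or null if all are unique. */
    String findDuplicate() throws IOException {
        if (!buffer.isEmpty()) {
            spill();
        }
        List<BufferedReader> readers = new ArrayList<>();
        PriorityQueue<Head> heads = new PriorityQueue<>();
        try {
            for (Path chunk : chunks) {
                BufferedReader reader = Files.newBufferedReader(chunk, StandardCharsets.UTF_8);
                readers.add(reader);
                String first = reader.readLine();
                if (first != null) {
                    heads.add(new Head(first, reader));
                }
            }
            String previous = null;
            while (!heads.isEmpty()) {
                Head head = heads.poll();
                // In a sorted merge, equal IDs from different chunks come out back to back.
                if (head.id.equals(previous)) {
                    return previous;
                }
                previous = head.id;
                String next = head.reader.readLine();
                if (next != null) {
                    heads.add(new Head(next, head.reader));
                }
            }
            return null;
        } finally {
            for (BufferedReader reader : readers) {
                reader.close();
            }
        }
    }

    /** Current smallest unread ID of one chunk file. */
    private static final class Head implements Comparable<Head> {
        final String id;
        final BufferedReader reader;

        Head(String id, BufferedReader reader) {
            this.id = id;
            this.reader = reader;
        }

        @Override
        public int compareTo(Head other) {
            return id.compareTo(other.id);
        }
    }
}
```

In a design like this, both `add` and `findDuplicate` touch the temp directory once the in-memory limit is reached, so a full or unwritable disk shows up as an `IOException` during the check even when every row ID is unique.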

The duplicate checking is a framework method; there is no way to disable it. There are no plans to change that, nor to drop the constraint that row IDs need to be unique (we need it, e.g. for the highlighting).

Hope this helps,
Bernd