I have used Python scripting within my KNIME workflow. Within the configuration window, the code runs successfully, but it throws an error when I actually execute the node: Execute failed: Output DataFrame contains duplicate values in its index: ‘0’, ‘1’, ‘2’, and others. This is not supported. Please make sure that each entry in the index (i.e., each row key) is unique. Can you please help me with a solution?
Your DataFrame probably has duplicate index entries.
Are your row keys unique?
You might also want to store the RowID and check later that your transformations have kept it intact.
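To see whether this is what is happening, you can inspect the index before the node hands the table back to KNIME. This is a minimal sketch with a made-up DataFrame whose index contains a duplicate key, mirroring the error message above:

```python
import pandas as pd

# Hypothetical example reproducing the KNIME error: a DataFrame
# whose string index contains a duplicate row key ('0' appears twice).
df = pd.DataFrame({"value": [10, 20, 30]}, index=["0", "0", "1"])

# Check uniqueness and list the offending keys.
print(df.index.is_unique)                         # False
print(df.index[df.index.duplicated()].tolist())   # ['0']
```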
I just ran into something similar with the Python nodes: you can run into problems when you do joins or concatenations but do not reset the index within the Python node, since KNIME seemingly uses the RowID as the index.
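A quick sketch of that pitfall, assuming two input tables that both carry KNIME's default RowIDs: stacking them keeps both sets of row keys, so the combined index is no longer unique.

```python
import pandas as pd

# Two tables whose indices both use KNIME's default RowIDs.
a = pd.DataFrame({"x": [1, 2]}, index=["Row0", "Row1"])
b = pd.DataFrame({"x": [3, 4]}, index=["Row0", "Row1"])

# Stacking them preserves both indices, producing duplicates.
combined = pd.concat([a, b])
print(combined.index.tolist())   # ['Row0', 'Row1', 'Row0', 'Row1']
print(combined.index.is_unique)  # False
```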
As the others have already suggested, you should make sure that each row in the pandas DataFrame has a unique index.
Dropping/resetting the index is not the best option though, because KNIME uses the RowID (which is equal to the DataFrame index) to identify rows when you select / highlight individual cells in your data. If you reset the index, then this row identification will no longer work.
The best solution would be to leave the indices of all values in your input table as they are, but assign new unique indices (which are actually string-based) to all new rows that you add.
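That suggestion can be sketched like this. The table and the `NewRow…` naming scheme below are assumptions for illustration; the point is that the original RowIDs stay untouched while the appended rows get fresh string keys that cannot collide with the defaults:

```python
import pandas as pd

# Input table as KNIME would hand it over, with its original RowIDs.
input_table = pd.DataFrame({"x": [1, 2]}, index=["Row0", "Row1"])

# Rows created inside the script, with a default integer index.
new_rows = pd.DataFrame({"x": [3, 4]})

# Give only the new rows unique, string-based keys (hypothetical scheme),
# then append them without disturbing the existing RowIDs.
new_rows.index = [f"NewRow{i}" for i in range(len(new_rows))]
output_table = pd.concat([input_table, new_rows])
print(output_table.index.tolist())  # ['Row0', 'Row1', 'NewRow0', 'NewRow1']
```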
@mlauber71: the RowID node that you mentioned has the same problem if “enable hiliting” is disabled.
If you do not care about selection/hiliting, then dropping the index is fine, though.
Hope that helps,
Hello @mohini1329 and welcome to the KNIME forum.
Have you tried resetting the index values of your DF?
df = df.reset_index(drop=True)
I think it would be good to mention the handling of the RowIDs somewhere in the Python documentation. One should also be aware that when you copy a DataFrame, the index might be reset as well, so we should be careful with that.
What would be great is a Counter node that supports Long integers to handle very large data sets.
This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.