Can't get pandas.append to work in PythonScript node

Hi,

I'm using KNIME v2.11.3. I am unable to get the pandas append method to work in the KNIME Labs Python Script node. I get no errors, but when I print the contents of the output DataFrame, it is empty, which suggests that the append call had no effect.

Here is a "cleansed" snippet of my code to illustrate. The format of input_table is the same as the output_table that I have defined, minus the 'CDS_FIELD_ID' column. (I am a novice Python scripter, so please go easy on me.) :-)

import re
import pandas as pd

output_table = pd.DataFrame(columns=('ID','username','CREATED','MODIFIED','CONTENT','TYPE','NAME','CDS_FIELD_ID'))
cds_ids = pd.DataFrame(columns=('ID','CDS_FIELD_ID'))
for theID, username,theCreated,theModified,theContent,theType,theName in zip(input_table['ID'],input_table['USER521'],input_table['CREATED'],input_table['MODIFIED'],input_table['CONTENT'],input_table['TYPE'],input_table['NAME']):
	idSeries = re.findall(r'FACT\/([0-9]*)',theContent)
	print idSeries
	for theCDS in enumerate(idSeries):
		print theID,username,theCreated,theModified,theContent,theType,theName,theCDS[1]
		thisRow = pd.DataFrame([theID,username,theCreated,theModified,theType,theName,theCDS[1]])
		output_table.append(thisRow)
print output_table

Thanks,

Mark

Hi Mark,

Here is a simple example of how to append one DataFrame to another:

output_table = input_table_1.append(input_table_2, ignore_index=True)

You have to assign the returned object of the append method (it does not modify the original object).
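
To make the "returns a new object" point concrete, here is a minimal sketch with two made-up tables (the names first and second and their contents are only for illustration). It uses pd.concat rather than DataFrame.append, because append was removed in pandas 2.0. On older pandas versions, first.append(second, ignore_index=True) should give the same result.

```python
import pandas as pd

# Two small illustrative tables (made-up data, not from the original workflow)
first = pd.DataFrame({"ID": [1, 2], "NAME": ["a", "b"]})
second = pd.DataFrame({"ID": [3], "NAME": ["c"]})

# Like DataFrame.append, pd.concat builds and returns a NEW DataFrame.
# Calling it without keeping the result leaves `first` exactly as it was.
pd.concat([first, second], ignore_index=True)
rows_after_discarded_call = len(first)

# Assigning the returned object is what actually gives you the combined table.
combined = pd.concat([first, second], ignore_index=True)
rows_in_combined = len(combined)
```

In the loop from the question, the same idea would mean writing output_table = output_table.append(thisRow, ignore_index=True) instead of calling append on its own. thisRow would also need to be a single row with the same column names as output_table, otherwise the values do not land in the intended columns.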

You also need to make sure that your row index is unique; otherwise the DataFrame cannot be translated back to a KNIME table (which needs unique RowIDs). That is what the option ignore_index=True is for: it discards the existing row labels and assigns an auto-incremented index (0, 1, 2, ...) to the rows of the result.
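
A small sketch of what ignore_index changes, again with made-up data and with pd.concat standing in for append (an assumption made so that the snippet also runs on current pandas versions):

```python
import pandas as pd

# Both tables start with the default index 0, 1, ... (made-up data)
first = pd.DataFrame({"ID": [1, 2]})
second = pd.DataFrame({"ID": [3]})

# Without ignore_index the original row labels are kept, so label 0 appears twice.
kept = pd.concat([first, second])
kept_labels = list(kept.index)
kept_is_unique = kept.index.is_unique

# With ignore_index=True the rows are renumbered from 0, so every label is unique.
renumbered = pd.concat([first, second], ignore_index=True)
renumbered_labels = list(renumbered.index)
renumbered_is_unique = renumbered.index.is_unique
```

Checking output_table.index.is_unique before the script ends is a quick way to see whether the table has the unique row labels that KNIME needs.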

Cheers,

Patrick
