Multiprocessing within Python Script Node

Hello, I’m wondering if anyone has had any success using the “multiprocessing” library within the Python Script node in KNIME. I tried the code below, and although it raises no syntax errors, the program simply skips the final block because the condition __name__ == '__main__' is not true, and the node outputs an empty data table. This is the pattern that works outside KNIME and is recommended on Python coding forums. I tried removing that condition and ran into an error. Does the “main” module within KNIME’s Python have a different name? Has anyone had luck making multiprocessing work like this before?

# Copy input to output
from multiprocessing import Pool
import numpy as np
import pandas as pd
#cummBOP=[]
#cummSHP=[]
#cummOO=[]
#cummFCSTPROG=[]

ROWCOUNT=len(input_table.index)
newColumns= pd.DataFrame(index=range(ROWCOUNT),columns=range(4))
newColumns.columns=['cumm BOP','cumm Shipments','cumm OO','cumm FProg']
iTPNum=input_table.columns.get_loc("TP Num")
#iCountry=input_table.columns.get_loc("Reporting Country")
#iSFU=input_table.columns.get_loc("SFU")
#iFPC=input_table.columns.get_loc("APO Product")
#iPlant=input_table.columns.get_loc("APO Location")
iSubSMO=input_table.columns.get_loc("Sub-SMO")
iBU=input_table.columns.get_loc("BU")
iBRAND=input_table.columns.get_loc("BRAND")
iFAMILY=input_table.columns.get_loc("FAMILY")
iMonth=input_table.columns.get_loc("Calendar Year/Month")
iBOP=input_table.columns.get_loc("BOP Values")
iOO=input_table.columns.get_loc("Open Orders+Sum(Values)")
iSH=input_table.columns.get_loc("Shipments+Sum(Values)")
iFP=input_table.columns.get_loc("Fcst Prog.")

def outerLoop(i):
	
	refTPNum=input_table.iloc[i,iTPNum]
	#refCountry=input_table.iloc[i,iCountry]
	refSubSMO=input_table.iloc[i,iSubSMO]
	#refSFU=input_table.iloc[i,iSFU]
	#refFPC=input_table.iloc[i,iFPC]
	#refPlant=input_table.iloc[i,iPlant]
	refBU=input_table.iloc[i,iBU]
	refBRAND=input_table.iloc[i,iBRAND]
	refFAMILY=input_table.iloc[i,iFAMILY]
	refMonth=input_table.iloc[i,iMonth]
	tempBOP=0
	tempSHP=0
	tempOO=0
	tempFCSTPROG=0
	print (i)
	for x in range(0,ROWCOUNT):
		TPNum=input_table.iloc[x,iTPNum]
		#Country=input_table.iloc[x,iCountry]
		BU=input_table.iloc[x,iBU]
		BRAND=input_table.iloc[x,iBRAND]
		FAMILY=input_table.iloc[x,iFAMILY]
		SubSMO=input_table.iloc[x,iSubSMO]
		#SFU=input_table.iloc[x,iSFU]
		#FPC=input_table.iloc[x,iFPC]
		#Plant=input_table.iloc[x,iPlant]
		Month=input_table.iloc[x,iMonth]
		print (x)
		if Month==refMonth and BU==refBU and BRAND==refBRAND and FAMILY==refFAMILY and SubSMO==refSubSMO and TPNum<=refTPNum:
			tempBOP=tempBOP+input_table.iloc[x,iBOP]
			tempSHP=tempSHP+input_table.iloc[x,iSH]
			tempOO=tempOO+input_table.iloc[x,iOO]
			tempFCSTPROG=tempFCSTPROG+input_table.iloc[x,iFP]
	# DataFrame.set_value was removed in pandas 1.0; .at is the scalar setter
	newColumns.at[i,'cumm BOP']=tempBOP
	newColumns.at[i,'cumm Shipments']=tempSHP
	newColumns.at[i,'cumm OO']=tempOO
	newColumns.at[i,'cumm FProg']=tempFCSTPROG

ix=range(0,ROWCOUNT)

if __name__ == '__main__':
	pool = Pool()                         # Create a multiprocessing Pool
	pool.map(outerLoop, ix)  # Run outerLoop on every row index using the pool
	
output_table=newColumns.apply(pd.to_numeric)

Hi @bebeid,

Your code is not working because KNIME does not run your Python script as a standalone file, so __name__ is not '__main__' inside the node and the guarded block is skipped. You should be able to run it successfully by removing the if __name__ == '__main__': line and un-indenting the two lines below it.

best,
Gabriel
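
One caveat on top of removing the guard: each Pool worker runs in its own process with its own copy of newColumns, so values written inside outerLoop never reach the parent process and the output table would still come back empty. A minimal sketch of the usual workaround is below: the worker returns its four sums and the parent assembles them. The column names come from the question; the helper names (cumulative_for_row, compute_all) are hypothetical, and this has not been tested inside KNIME. Whether the node can start worker processes and pickle the worker function depends on the platform and on how the node executes the script, so treat that part as an assumption.

```python
from functools import partial
from multiprocessing import Pool

import pandas as pd

# Column names are taken from the question; helper names are hypothetical.
KEYS = ["Calendar Year/Month", "BU", "BRAND", "FAMILY", "Sub-SMO"]
VALUES = ["BOP Values", "Shipments+Sum(Values)",
          "Open Orders+Sum(Values)", "Fcst Prog."]
NEW_NAMES = ["cumm BOP", "cumm Shipments", "cumm OO", "cumm FProg"]


def cumulative_for_row(table, i):
    """Return the four cumulative sums for row i without writing shared state."""
    ref = table.iloc[i]
    # Same condition as the question: equal keys and TP Num <= reference TP Num.
    mask = table["TP Num"] <= ref["TP Num"]
    for key in KEYS:
        mask &= table[key] == ref[key]
    return table.loc[mask, VALUES].sum().tolist()


def compute_all(table, processes=None):
    """Map the worker over every row index and build the result in the parent."""
    worker = partial(cumulative_for_row, table)
    with Pool(processes) as pool:
        rows = pool.map(worker, range(len(table.index)))
    return pd.DataFrame(rows, columns=NEW_NAMES)


# Assumed usage inside the KNIME node, with no __main__ guard:
# output_table = compute_all(input_table)
```

The key design choice is that pool.map collects return values, so nothing depends on memory being shared between processes.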

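Separately from the multiprocessing question: the nested loop in the original script computes, for every row, the sum over all rows that share the same Month, BU, BRAND, FAMILY and Sub-SMO and have a TP Num less than or equal to its own. That looks like a grouped running total, which pandas can likely compute directly without a Pool. The sketch below is one way to do it, assuming the column names from the question and no missing values in the key columns (groupby drops rows with missing keys by default); cumulative_by_group is a hypothetical helper name.

```python
import pandas as pd

# Column names are taken from the question; the function name is hypothetical.
KEYS = ["Calendar Year/Month", "BU", "BRAND", "FAMILY", "Sub-SMO"]
VALUES = ["BOP Values", "Shipments+Sum(Values)",
          "Open Orders+Sum(Values)", "Fcst Prog."]
NEW_NAMES = ["cumm BOP", "cumm Shipments", "cumm OO", "cumm FProg"]


def cumulative_by_group(table):
    """Running totals per key combination, ordered by TP Num (ties included)."""
    group_cols = KEYS + ["TP Num"]
    # Collapse rows with identical keys and TP Num so ties are summed together,
    # matching the "TPNum <= refTPNum" condition in the original loop.
    per_tp = table.groupby(group_cols, sort=True)[VALUES].sum()
    # Accumulate over increasing TP Num within each key combination.
    running = per_tp.groupby(level=KEYS).cumsum()
    running.columns = NEW_NAMES
    # Attach the totals back to the original rows; a left merge keeps row order.
    merged = table[group_cols].merge(running.reset_index(),
                                     on=group_cols, how="left")
    return merged[NEW_NAMES]


# Assumed usage inside the KNIME node:
# output_table = cumulative_by_group(input_table)
```

This replaces the row-by-row comparison (quadratic in the number of rows) with a single grouped pass, which may make parallelism unnecessary for tables of moderate size.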