Workaround for MATLAB scripting nodes

As described in previous posts on this subforum, I have never been able to get the MATLAB scripting nodes to work - the KNIME node just hangs and the MATLAB console spits out a Java error.

Since the Python scripting nodes now work well, I came up with a workaround that uses Python as a bridge between KNIME and MATLAB. The following script, placed in a Python Script (1 -> 1) node, executes the MATLAB script supplied in the MATLABScript flow variable. It passes the input table to MATLAB as a struct called knimeIn, with one field for each column of the table, and returns the contents of a corresponding struct called knimeOut to KNIME. It should be fairly obvious how to adapt this to a 1 -> 2, 2 -> 2 or 'source' version, to supply the path to the script instead of the script itself, or to use a hard-coded location for the .m file if that's what you prefer.

I've tested this briefly on Windows 7 with MATLAB R2016b and Python 3.6 with scipy 0.20.3 and pandas 0.19.1; any feedback, or results from other platforms or versions, would be very welcome.
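
For anyone who wants the "path to the script" variant mentioned above, here is a rough sketch of one way it might look. It assumes flow_variables behaves like a plain dict, and MATLABScriptPath is a hypothetical flow variable name made up for this example (only MATLABScript is used by the script in this post). The helper name resolve_matlab_script is also just illustrative, and this has not been tested inside KNIME.

```python
import os
import tempfile


def resolve_matlab_script(flow_variables, temp_dir):
    """Return the path of the .m file to run.

    'MATLABScriptPath' is a hypothetical flow variable introduced for this
    sketch; 'MATLABScript' is the variable used by the script in this post.
    """
    script_path = flow_variables.get('MATLABScriptPath')
    if script_path:
        # Use the existing file as-is, but fail early if it is missing
        if not os.path.isfile(script_path):
            raise FileNotFoundError(script_path)
        return script_path
    # Otherwise write the script text to a temp .m file, as in the post
    script_path = os.path.join(temp_dir, 'knime.m')
    with open(script_path, 'w') as sf:
        sf.write(flow_variables['MATLABScript'])
    return script_path


# Stand-in for KNIME's flow_variables, for illustration only
example_vars = {'MATLABScript': 'knimeOut = knimeIn;'}
with tempfile.TemporaryDirectory() as temp_dir:
    resolved = resolve_matlab_script(example_vars, temp_dir)
    with open(resolved) as f:
        script_text = f.read()
```

In the node itself, the returned path would simply take the place of the temp .m file name that is passed to run(...) in the MATLAB command.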

"""
Execute the MATLAB code supplied in the MATLABScript flow variable
in MATLAB and return the result

MATLAB code takes input as knimeIn and returns output as knimeOut: each is a
MATLAB struct with one field for each column of the KNIME input table; each
field contains a vector or a cell array of the column contents

Requires scipy (tested version: 0.20.3) and pandas (tested version: 0.19.1)
"""

import scipy.io
import pandas as pd
import tempfile
import os
import subprocess

with tempfile.TemporaryDirectory() as tempDir:
    # generate temp file names
    matInFile = os.path.join(tempDir, 'knimeIn.mat')
    matOutFile = os.path.join(tempDir, 'knimeOut.mat')
    matScript = os.path.join(tempDir, 'knime.m')
    # generate system command
    matCmd = "load('{0}');run('{1}');save('{2}', 'knimeOut', '-v7');quit".format(
        matInFile, matScript, matOutFile)
    # save script from variable to temp .m file
    with open(matScript, 'w') as sf:
        sf.write(flow_variables['MATLABScript'])
    # save input to temp .mat file
    scipy.io.savemat(matInFile, {'knimeIn':input_table.to_dict("list")})
    # execute MATLAB code to load data, run script, and quit
    subprocess.run(['matlab', '-automation', '-wait', '-r', matCmd])
    # read MATLAB output from file
    matResult = scipy.io.loadmat(matOutFile)

# Extract the returned data back into a DataFrame
# The getArray function copes with either row (1xN) or column (Nx1) vectors,
# returning 1-D data in both cases, which is what pandas needs for a column
def getArray(x):
    return x[0] if x[0].size == x.size else x.ravel()
knimeOut = matResult['knimeOut']
colNames = knimeOut.dtype.names
rowIndex = range(knimeOut[0,0][0].size)
output_table = pd.DataFrame(
        {col:getArray(knimeOut[0,0][n]) for n, col in enumerate(colNames)},
        index=rowIndex)
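
To see what the struct layout looks like on the Python side without starting MATLAB, the save and load steps can be tried on their own. The following is only a sketch under some assumptions: example_table is made-up data standing in for KNIME's input_table, the file is read straight back with scipy instead of being processed by MATLAB, and np.ravel is used as a simple way to flatten the loaded vectors.

```python
import os
import tempfile

import numpy as np
import pandas as pd
import scipy.io

# Small stand-in for KNIME's input_table (made-up example data)
example_table = pd.DataFrame({'id': [1, 2, 3], 'score': [0.5, 1.5, 2.5]})

with tempfile.TemporaryDirectory() as temp_dir:
    mat_file = os.path.join(temp_dir, 'knimeIn.mat')
    # Same call as in the script: each column becomes one struct field
    scipy.io.savemat(mat_file, {'knimeIn': example_table.to_dict('list')})
    # Read the file straight back instead of passing it through MATLAB
    loaded = scipy.io.loadmat(mat_file)

struct = loaded['knimeIn']         # 1x1 structured array
field_names = struct.dtype.names   # one entry per table column
# Each field comes back as a 2-D array; np.ravel flattens it to 1-D
round_trip = pd.DataFrame(
    {name: np.ravel(struct[0, 0][name]) for name in field_names})
```

The struct comes back as a 1x1 structured array, which is why the extraction code indexes knimeOut[0,0] before picking out each field.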
