Hi everyone,
I have an old KNIME workflow for survival analysis on a popular public employee dataset (1470 rows, 33 columns), and I’m running into an issue inside a metanode that performs variable selection.
Inside a loop (Column List Loop Start), I iterate over columns and pass them to an R node (Table to R). My goal is to compute a log-rank test (survdiff from the survival package) for each variable against the survival outcome.
Here is the R script being used:
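library("survival")
# Name of the column selected by the current loop iteration
variable <- knime.flow.in[["currentColumnName"]]
column <- knime.in[[variable]]
# Log-rank test (rho = 0) of that column against the outcome:
# "Duración" is the duration (time) column, "Abandono" the attrition (event) column
diff <- survdiff(Surv(knime.in$"Duración", knime.in$"Abandono") ~ column, data = knime.in, rho = 0)
# p-value from the chi-square statistic, with (number of groups - 1) degrees of freedom
significacion <- pchisq(diff$chisq, length(diff$n) - 1, lower.tail = FALSE)
print(significacion)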
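In case it helps to reproduce the test outside KNIME, here is a minimal standalone sketch of the same calculation. It is only an illustration under assumptions: the `lung` dataset that ships with the survival package stands in for my data, and the helper name `logrank_p` is made up for this example (it is not part of KNIME or the survival package). The guard for single-group columns is there because survdiff needs at least two groups to compare.

```r
library(survival)

# Hypothetical helper: log-rank p-value of one column against a survival outcome.
# time_col / event_col / variable are column names passed as strings.
logrank_p <- function(data, time_col, event_col, variable) {
  column <- data[[variable]]
  # survdiff cannot compare groups if the column has fewer than two distinct values
  if (length(unique(na.omit(column))) < 2) {
    return(NA_real_)
  }
  fit <- survdiff(Surv(data[[time_col]], data[[event_col]]) ~ column, rho = 0)
  # Chi-square p-value with (number of groups - 1) degrees of freedom
  pchisq(fit$chisq, length(fit$n) - 1, lower.tail = FALSE)
}

# Stand-in data (assumption): lung, where status is coded 1 = censored, 2 = dead
p <- logrank_p(lung, "time", "status", "sex")
print(p)
```

Inside the loop, the same helper could be called with `knime.in`, the two outcome column names, and the current column name from the flow variable.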
However, the node does not execute properly.
Has anyone faced a similar issue or knows the correct way to handle it?
I would really appreciate any guidance or suggestions. I am sharing the workflow and dataset.
Thanks in advance!
Survival workflow.knar (258.1 KB)
DataSet.csv (364.4 KB)