how to import tables from .docx documents via R snippet

You can still try to install Rserve anyway; the .Renviron file is just an additional help. Otherwise you would have to install R in a folder and tell it explicitly where the library is.

The .Renviron file sits in the home directory by default.

Another possibility would be to compile an R version on a different machine and copy it over.
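To make the library question above concrete, here is a minimal sketch in base R. It shows where a user-level .Renviron would be looked for, which library folders the session searches, and whether a package is present in one specific folder. The helper name `is_installed_in` is made up for this illustration.

```r
# Where a user-level .Renviron would sit (the home directory by default).
renviron_path <- path.expand(file.path("~", ".Renviron"))
cat("User .Renviron:", renviron_path, "- exists:", file.exists(renviron_path), "\n")

# Library folders this R session searches, in order.
print(.libPaths())

# Hypothetical helper: is a package installed in one specific library folder?
is_installed_in <- function(pkg, lib_dir) {
  pkg %in% rownames(installed.packages(lib.loc = lib_dir))
}

# Example: the base library that ships with R always contains "stats".
print(is_installed_in("stats", .Library))
```

If a package only shows up in a folder that is not listed by `.libPaths()`, that folder can be passed explicitly, for example with `library(docxtractr, lib.loc = "<your library folder>")`.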

I just installed Rserve in R 3.6.1 and reinstalled docxtractr, but KNIME’s R Source (Table) node still doesn’t see docxtractr.

I still can’t use .Renviron, and I can’t use another machine. By Friday this one will also be erased and formatted, so by then that is the end of the topic… :frowning:

Surrender is not an option. Could you share your KNIME settings regarding R? And could you maybe share the log file? So I understand you now have the latest version of Rserve, 1.8.3, running. The rest should be simple …

A good part of analytics is not giving up and getting to the goal through small changes.

Thank you, Mlauber. I think the same way you do; my problem is only lack of knowledge and lack of time… so:

How can I do this?

For the settings you could just make a screenshot like

Then, in the KNIME folder, you should have the knime.log file. You could delete it first, then start KNIME again, try to run your workflow, and send us the new file.

Then you could use this workflow to check some R settings (I might extend that further):

Here are the settings:

R

R-scripting

and this is the Rserve version according to your workflow:

OK, I think I know what is going on. You have two R versions on your system: the ‘integrated’ one (32-bit, deep within KNIME) and your local one (the one you should use).

Just change the setting in the first screen to:

C:\Program Files\R\R-3.6.1

then it should see your R version.
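To confirm which R installation a session is actually using, a small diagnostic like the following may help. It is only a sketch in base R: the function name `r_session_table` is made up for illustration, and assigning its result to `knime.out` in an R Source (Table) node is an assumption based on the snippet style used in this thread.

```r
# Hypothetical helper: summarise the running R session as a two-column table.
r_session_table <- function() {
  data.frame(
    setting = c("R home", "R version", "architecture",
                "library paths", "Rserve installed", "docxtractr installed"),
    value = c(
      R.home(),                                   # folder of the running R
      R.version.string,                           # e.g. the 3.6.1 release string
      R.version$arch,                             # 32-bit vs 64-bit build
      paste(.libPaths(), collapse = "; "),        # where packages are searched
      as.character(nzchar(system.file(package = "Rserve"))),
      as.character(nzchar(system.file(package = "docxtractr")))
    ),
    stringsAsFactors = FALSE
  )
}

print(r_session_table())

# Inside KNIME (assumption): knime.out <- r_session_table()
```

If "R home" points into the KNIME installation rather than to the local R folder, the session is still running the integrated R, which would explain why locally installed packages are not found.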

The explanation is in the longer article:

HOOOOOOORRAAAYYYYYYY!!!

Thank you, Mlauber, that was the last problem. Now the workflow works!

Thank you so much for your patience, your competence and resilience, I would not have succeeded alone.

Glad it worked out in the end :slight_smile: Sometimes KNIME and R (not to mention Python) can be somewhat tricky but the reward is you gain an ocean of new possibilities.

Excuse me, Mlauber, I have one more question about the tuning of the workflow:

I’m going to use it to extract all 7 tables from hundreds of .docx files stored together in one directory.

I tried to adjust the nodes to do this, and I thought I had to change the settings of the String Widget.

I suppose the Default Value needs to be changed in order to extract tables from all the documents I put in a specific directory. So far I have not got anything, neither by entering the path of the directory nor by appending *.docx or ?.docx to it.

Do I have to adjust the flow variables? Or do I need to change something in the R code in the R Source (Table) node, perhaps by iterating the whole script?

I hope this will be the last difficulty with this workflow…!

Thank you again!

Hi there,

nice one @mlauber71 for not giving up :clap::clap:

Br,
Ivan

Great, really effective!:+1::+1::+1:

Just a quick hint, the variable node is just there to simulate the usage of a variable in R.

In practice you might adapt something like this workflow:

Instead of Excel files, list the .docx files, loop through them, and collect the resulting tables. Maybe I can create an example later.

I think this is the solution!

I pointed List Files to a directory with a number of my .docx files, and it works.

In the snippet I tried to substitute:

library(docxtractr)

path <- knime.flow.in[["Location"]]

v_docs <- word_docs(path=path)

knime.out <- as.data.frame(v_docs)

but this didn’t work. I am missing something (as usual!), but I think this could be the solution (if it’s possible to replace .xls with .docx).

I think you still need both R instances since you have changing structures. In this workflow all .docx files get listed and then imported via R.
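For readers who want the same idea outside of KNIME nodes, here is a rough sketch in plain R of listing the .docx files in a directory and pulling every table out of each one. The helper names (`list_docx`, `extract_tables`, `extract_all`) are made up for this illustration. The docxtractr calls used are `read_docx()`, `docx_tbl_count()` and `docx_extract_tbl()`, and the results are kept as a nested list rather than one big table because the table structures may differ between files.

```r
# Hypothetical helper: all .docx files in a folder, skipping Word lock files
# such as "~$report.docx" that appear while a document is open.
list_docx <- function(dir_path) {
  files <- list.files(dir_path, pattern = "\\.docx$",
                      full.names = TRUE, ignore.case = TRUE)
  files[!startsWith(basename(files), "~$")]
}

# Hypothetical helper: every table of one document as a list of data frames.
# Requires the docxtractr package to be installed.
extract_tables <- function(docx_path) {
  doc <- docxtractr::read_docx(docx_path)
  n_tables <- docxtractr::docx_tbl_count(doc)
  lapply(seq_len(n_tables), function(i) {
    as.data.frame(docxtractr::docx_extract_tbl(doc, tbl_number = i, header = TRUE))
  })
}

# Hypothetical helper: named list (one entry per file) of lists of tables.
extract_all <- function(dir_path) {
  files <- list_docx(dir_path)
  result <- lapply(files, extract_tables)
  names(result) <- basename(files)
  result
}

# Usage (the folder is a placeholder, not a path from this thread):
# tables <- extract_all("path/to/your/docx/folder")
# tables[["some_file.docx"]][[1]]   # first table of one document
```

Inside a KNIME loop fed by List Files, the per-file step would presumably be `extract_tables(knime.flow.in[["Location"]])`, with one of the returned data frames assigned to `knime.out`. That wiring is an assumption and not taken from the linked workflow.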
