Handling .RIS files (citation manager) in KNIME

Hi,

 

I am working with RIS format files, generated by citation manager software such as Zotero or EndNote (http://en.wikipedia.org/wiki/RIS_%28file_format%29). I attached a small sample of the RIS file I am dealing with.

 

I would appreciate any help translating some operations into KNIME. Below are two records from the same file, separated from each other only by an empty row. I need to transform this structure into a table (a mock-up of the desired table follows).

From:

TY  - JOUR
TI  - Get up…
AU  - Clegg, Stewart
AU  - Carter, Chris
T2  - Management Review

TY  - JOUR
TI  - Bringing Space…
AU  - Kornberger, Martin
AU  - Clegg, Stewart
T2  - Organization-Studies

 

To (mock-up of the table I need to obtain from KNIME):

TY    TI               AU                                  T2
JOUR  Get up…          Clegg, Stewart; Carter, Chris       Management Review
JOUR  Bringing Space…  Kornberger, Martin; Clegg, Stewart  Organization-Studies

 

OBSERVATIONS:

 

a) I tried the 'File Reader' node, and my first step was to separate the fields using the dash sign '-' as the column delimiter.

The problem with the dash is that some values contain a dash as well (e.g. Organization-Studies), and it makes a mess of the whole file.

 

b) Some fields, such as AU (author), repeat in the RIS file once for each author of an article. So, in the desired KNIME table, I need all authors of an article to go into the same AU column, separated from each other by a semicolon (;).

 

Many thanks in advance,

Cadu

Hi Cadu,

Interesting problem... I have tried to solve it, but with only partial success. I guess it would be easier with a custom importer node, or maybe with a node which gets the whole table (Jython or R). Anyway, if you prefer, you can try to fix (enhance) the attached workflow, although it requires the KNIME Utilities from HiTS (which are not yet updated to KNIME 2.8, so on each start they will show some error messages - you have been warned ;) ). Probably I have just found a bug with this input. (Well, at least something to improve. :) )
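To illustrate the whole-table idea, here is a minimal sketch in plain Python (this is not the attached workflow, and I have not tried it inside a KNIME scripting node). It assumes every line looks like "TY  - JOUR", that a TY line opens a record and that an optional ER line closes it; parse_ris and ris_to_table are made-up names. Only the separator right after the tag is matched, so a dash inside a value such as Organization-Studies is left alone, and repeated tags such as AU are joined with a semicolon.

```python
import re

# A RIS line: two-character tag, two spaces, a dash, then the value.
RIS_LINE = re.compile(r"^([A-Z][A-Z0-9])  - ?(.*)$")

# Sample copied from the two records in the question.
SAMPLE = """\
TY  - JOUR
TI  - Get up…
AU  - Clegg, Stewart
AU  - Carter, Chris
T2  - Management Review

TY  - JOUR
TI  - Bringing Space…
AU  - Kornberger, Martin
AU  - Clegg, Stewart
T2  - Organization-Studies
"""


def parse_ris(text):
    """Return one dict per record, mapping each tag to a list of values."""
    records = []
    current = None
    for raw in text.splitlines():
        match = RIS_LINE.match(raw.strip())
        if match is None:
            continue  # blank lines and anything that is not a tag line
        tag, value = match.group(1), match.group(2).strip()
        if tag == "ER":  # optional end-of-record marker
            current = None
            continue
        if tag == "TY" or current is None:  # assumption: TY opens a record
            current = {}
            records.append(current)
        current.setdefault(tag, []).append(value)
    return records


def ris_to_table(records, sep="; "):
    """Build a header plus rows; repeated tags are joined with sep."""
    columns = []
    for record in records:
        for tag in record:
            if tag not in columns:
                columns.append(tag)
    rows = [[sep.join(record.get(tag, [])) for tag in columns]
            for record in records]
    return columns, rows


columns, rows = ris_to_table(parse_ris(SAMPLE))
for row in [columns] + rows:
    print(" | ".join(row))
```

A tag that is missing from a record simply yields an empty cell, so every record ends up with the same columns.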

Cheers, gabor

This (R package to read ris files) might be of more help.

Hi Gabor,

Thank you very much for sharing the workflow. I am still working my way through learning KNIME, and the KNIME Utilities are something to add to my next steps as I go deeper into this amazing tool.

 

QUESTION=> You mentioned that "(R package to read ris files) might be more help". I read that there are some ways to run R inside KNIME. Would it be possible to use KNIME and R (inside KNIME) to read RIS files, or should I try to do this directly in R, without any interface to KNIME?

 

Cheers,

Carlos

 

Hi Carlos,

There are labs and community extensions for KNIME with R support, so you can use R within KNIME. (Under Windows there are binaries of R that can be installed; I am not sure whether that setup supports installing packages.) I guess you can install/check R packages from within the R nodes. If you plan to share the workflow, you should specify in its metadata which extensions and which R version are required. (Sometimes I wish I had shared this information with myself for my own workflows.)

The KNIME Utilities are specific to certain tasks I needed, so beyond those they might be less interesting for others. They should get some updates, although I am not sure I'll do those in the near future, so it might be worth looking for alternatives. ;)

Cheers, gabor

PS.: Have fun and success working on your PhD dissertation! :)

Hi Gabor,

 

Thanks again. Now I have Knime with R nodes!

 

HELP:

 

I was able to do the transformation from RIS to BibTeX you suggested before (http://www.inside-r.org/packages/cran/ris/docs/read.ris).

 

Actually, I am using Zotero as my citation manager, and it can export as RIS but also as BibTeX. So I did this through Zotero, and R wasn't necessary.

 

The structure of the BibTeX file (attached) solves the problem the RIS file has with authors, since there is now just one field, called 'author'.

 

So now I need to transform the BibTeX file into a table (the same thing I was trying to do with the RIS file). Is KNIME able to do such a transformation? Which nodes should I use?

 

OBSERVATION: The BibTeX field names can vary between records of the same file. For instance, in the attached example the 'copyright' field exists in one record but not in the other. This is because the field is empty in the source (Zotero) for one of the records, so it isn't exported to the BibTeX output. The rule should be: if a field doesn't exist in a record, its content should be treated as empty, but a 'copyright' column should always exist once the field has appeared in at least one record.

 

Thanks,

Cadu

I would use regular expressions or something similar to extract the information you need from BibTeX files, but probably someone else has already created a node to handle BibTeX files. (Maybe the textprocessing nodes?) If you are familiar with Jython or Groovy (and/or KNIME development): imho RIS is the nicer format to parse, but if your input is BibTeX, it might be better to parse that directly.
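A rough sketch of the regular-expression route, again in plain Python and under some assumptions: each field sits on its own line as name = {value}, and the closing brace of an entry is on a line by itself. Multi-line values and @string macros are not handled. The sample entries are rebuilt from the records earlier in the thread; the citation keys and the copyright value are placeholders, and parse_bibtex and bibtex_to_table are made-up names.

```python
import re

# Assumed layout: "@type{key," opens an entry, then one field per line.
ENTRY_START = re.compile(r"^@(\w+)\s*\{\s*([^,\s]*)\s*,?\s*$")
FIELD = re.compile(r'^(\w+)\s*=\s*[{"](.*?)[}"]\s*,?$')

# Hypothetical sample: keys and the copyright text are placeholders.
SAMPLE = """\
@article{key_one,
  title = {Get up…},
  author = {Clegg, Stewart and Carter, Chris},
  journal = {Management Review}
}

@article{key_two,
  title = {Bringing Space…},
  author = {Kornberger, Martin and Clegg, Stewart},
  journal = {Organization-Studies},
  copyright = {placeholder notice}
}
"""


def parse_bibtex(text):
    """Return one dict per entry, mapping field names to values."""
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        start = ENTRY_START.match(line)
        if start:
            current = {"type": start.group(1).lower(), "key": start.group(2)}
            entries.append(current)
            continue
        if line == "}":  # a lone closing brace ends the entry
            current = None
            continue
        field = FIELD.match(line)
        if field and current is not None:
            name, value = field.group(1).lower(), field.group(2)
            if name == "author":
                # BibTeX separates authors with "and"; use semicolons instead.
                value = "; ".join(re.split(r"\s+and\s+", value))
            current[name] = value
    return entries


def bibtex_to_table(entries):
    """Columns are the union of all field names; missing fields become ''."""
    columns = []
    for entry in entries:
        for name in entry:
            if name not in columns:
                columns.append(name)
    rows = [[entry.get(name, "") for name in columns] for entry in entries]
    return columns, rows


columns, rows = bibtex_to_table(parse_bibtex(SAMPLE))
for row in [columns] + rows:
    print(" | ".join(row))
```

Because the column list is collected over all entries first, a field such as 'copyright' that appears in only one entry still gets its own column, with an empty cell for the others.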