So I have a few hundred files with publication records for academic papers, with fields like author name, phone, address, etc. One of the fields is Keywords: (another with the same problem is Abstract:). In some files the entry is a single-line string; in others it spans multiple lines. I would like to end up with one row for each field and entity. Any ideas?
How do you retrieve the data? One easy way would be to use the Document Grabber node, which you can use to get data for academic papers from PubMed. With a subsequent Document Data Extractor node you can extract fields like the author, title, abstract etc. all in one row for each field and entity.
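If the records are already on disk as plain text rather than coming from PubMed, the multi-line entries could also be collapsed before loading them. Below is a minimal Python sketch of that idea, not a tested solution for your files. It assumes each field starts with a label such as "Keywords:" at the beginning of a line, and that any following line without a known label continues the previous field. The label list and the function name `parse_record` are hypothetical and would need adjusting to the real files.

```python
import re

# Hypothetical field labels; replace with the labels used in your files.
KNOWN_FIELDS = ("Author", "Phone", "Address", "Keywords", "Abstract")

# A field starts with one of the known labels followed by a colon.
FIELD_START = re.compile(r"^(%s):\s*(.*)$" % "|".join(KNOWN_FIELDS))


def parse_record(text):
    """Return a dict mapping each field label to a single-line string."""
    fields = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        match = FIELD_START.match(line)
        if match:
            # New field: remember its label and start its value.
            current = match.group(1)
            fields[current] = match.group(2)
        elif current is not None:
            # Continuation line: append it to the field being read.
            fields[current] = (fields[current] + " " + line).strip()
    return fields


sample = (
    "Author: Jane Doe\n"
    "Keywords: text mining,\n"
    "  parsing, records\n"
    "Abstract: First line\n"
    "second line.\n"
)
record = parse_record(sample)
```

Each resulting dict has one single-line value per field, so it can be written out as one row per record (for example with Python's `csv` module) and then read into a table.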