XML file

madegomez · April 6, 2018, 7:36pm

Hi!
I am trying to extract text through the ScienceDirect API … I am using the following location:
https://api.elsevier.com/content/article/doi/10.1016/j.jweia.2013.11.013?&httpAccept=text%2Fxml

I am using the XML Reader node and then the XPath node to select the section I want to extract. I am looking for the introduction, and I want to put the extracted terms into a document so that I can build a bag of words.

I have not been able to extract just the introduction with the XPath node, and I think this is because of the conditions attached to the location I use, where this appears in the response:
<openArchiveArticle>false</openArchiveArticle>

However, I’m not sure, as my understanding of XML is very basic. I would like to know if anyone knows how to perform this extraction and, if it cannot be done, whether that is for the reason I describe or for another one.

One last point: I would like to know if someone could share how they have extracted information from articles (not the metadata) using an Elsevier API.

Thank you,

Manuela

kilian.thiel · April 9, 2018, 8:13am

Hi Manuela,

I am not 100% sure which fields exactly you want to extract from this XML. However, I created an example workflow that extracts the abstract field and the authors from the XML.
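
For anyone reading along without KNIME, roughly the same extraction can be sketched in Python. This is only an illustration and not the attached workflow: the sample XML, the namespace URIs, and the element names (coredata, dc:description, dc:creator) are assumptions about the response layout and should be checked against what the API actually returns. The point it shows is that elements sitting in a default namespace need a prefix in the path expression, otherwise the query matches nothing.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for an API response. The real
# document is much larger and its element names may differ.
SAMPLE_XML = """<full-text-retrieval-response
    xmlns="http://www.elsevier.com/xml/svapi/article/dtd"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <coredata>
    <dc:title>Example title</dc:title>
    <dc:creator>Doe, Jane</dc:creator>
    <dc:creator>Roe, Richard</dc:creator>
    <dc:description>Example abstract text.</dc:description>
    <openArchiveArticle>false</openArchiveArticle>
  </coredata>
</full-text-retrieval-response>
"""

# Prefix-to-URI map used in the path expressions below. The "svapi"
# prefix is an arbitrary local name for the (assumed) default namespace.
NS = {
    "svapi": "http://www.elsevier.com/xml/svapi/article/dtd",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def extract_abstract_and_authors(xml_text):
    """Return (abstract, [authors]) from an XML string shaped like SAMPLE_XML."""
    root = ET.fromstring(xml_text)
    # Every step of the path carries a prefix, because every element
    # belongs to a namespace.
    abstract = root.findtext(
        "svapi:coredata/dc:description", default="", namespaces=NS
    )
    authors = [
        el.text.strip()
        for el in root.findall("svapi:coredata/dc:creator", NS)
        if el.text
    ]
    return abstract.strip(), authors


if __name__ == "__main__":
    abstract, authors = extract_abstract_and_authors(SAMPLE_XML)
    print(abstract)
    print(authors)
```

To pull a different part of the document, such as a body section, the path would have to be adapted to wherever that section lives in the real response.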

I hope this helps.

Xml Parsing.knwf (66.5 KB)

Cheers, Kilian

madegomez · April 19, 2018, 12:20pm

Thanks so much for your help!
I have reviewed your suggestion and it has been truly useful.

Manuela
