Journal_title missing in Document Data Extractor

When used with “Document Grabber” and PubMed, “Document Data Extractor” does not offer “Journal_title” as an extractable feature. However, when I open a single document from the document list in Document Viewer, I see that the Journal_title is part of the Document description.

Any insight? Can KNIME Gurus add “Journal_title” to the list of extracted features (i.e., data items)?

Hi,

Is there a difference between journal title and title?
If you have an example I would be happy to look into this.

Best,
Martyna

Hi Martyna,

I enjoyed your online webinar on “Topic Text Mining on Biomedical Literature” yesterday.

Yes, there is a difference between the journal_title and article_title. I was hoping the Source field would show it, but it only shows PubMed repeatedly as the value, and not the specific journal the article belongs to. The node does not extract the journal_title (e.g., International Journal of Medical Informatics).

Best,
Dursun

Hi Dursun!

Happy to hear you enjoyed the webinar!
Now I know what you mean by journal_title. Yes, I agree this information might be useful too.
For the development of this service, the NCBI API was used, which means we are restricted to what they provide. If they don’t provide this information, we have no other way to obtain it.

Did you check if their RESTful WebService provides this information? Executing a GET Request when we have a pmid might be faster than waiting for the review of a request and implementation on our side.
Nevertheless, I will create a request for this and we will see what’s possible.
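
For illustration, here is a minimal sketch of such a GET request outside KNIME, in Python. It assumes the NCBI E-utilities `efetch` endpoint with `db=pubmed` and `retmode=xml`, and that the journal name sits under `MedlineCitation/Article/Journal/Title` in the returned XML. The function names are made up for this example and are not part of any KNIME node.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Assumed endpoint: NCBI E-utilities efetch.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(pmid):
    """Build the GET URL for a single PubMed ID (hypothetical helper)."""
    params = {"db": "pubmed", "id": str(pmid), "retmode": "xml"}
    return EFETCH_URL + "?" + urllib.parse.urlencode(params)


def parse_journal_title(xml_text):
    """Return the journal title of the first article in the XML, or None."""
    root = ET.fromstring(xml_text)
    node = root.find(".//MedlineCitation/Article/Journal/Title")
    if node is None or not node.text:
        return None
    return node.text.strip()


def fetch_journal_title(pmid, timeout=10):
    """Execute the GET request and extract the journal title (needs network)."""
    with urllib.request.urlopen(build_efetch_url(pmid), timeout=timeout) as resp:
        return parse_journal_title(resp.read())
```

Inside a workflow, the same idea would roughly correspond to a GET Request node fed with the URL per pmid, followed by an XPath node on the response, but I have not tested that combination here.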

Best,
Martyna

Hi Dursun,

You could also try to extract the journal_title from the document and add it as meta information using the Meta Info Inserter node. To extract the meta information afterwards, you can use the Meta Info Extractor node.

Best,
Julian

EDIT:

I had another look. Apparently, we get the information about the journal title from PubMed and we also store it as a section within the document, but unfortunately we don’t provide a way to extract this information. Currently, it can only be accessed in the Document Viewer. We will create a ticket to provide access to this kind of information.

I created a ticket for that and will post it here if there are any updates!

That would be very helpful. In literature mining, you often want to see whether certain topics correlate with certain journals. Being able to see the Journal_title will help in tabulating and charting that relationship.

Best - Dursun
