Automated access to disease information is an important goal of information extraction and text mining. Here, we create a model that learns to recognize disease names in a set of documents from the biomedical literature. We automatically retrieve the documents from PubMed and use them to train the model on an initial set of disease names (the dictionary). We then score the resulting model and check whether it can extract new information by comparing the detected disease names to the initial set. Finally, we interactively inspect the diseases that co-occur in the same documents using a network approach and look into the genetic information associated with these diseases.
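
For readers who want to see the last two steps in concrete terms, here is a minimal Python sketch of comparing detected names against the dictionary and counting co-occurrences. It is only an illustration, not the implementation used in the original entry: the function names `novel_names` and `cooccurrence_edges`, the document IDs, and the toy disease lists are all hypothetical, and matching is assumed to be a simple case-insensitive string comparison.

```python
from collections import Counter
from itertools import combinations


def novel_names(detected, dictionary):
    """Return detected names that are not in the dictionary (case-insensitive)."""
    known = {name.lower() for name in dictionary}
    return sorted({name.lower() for name in detected} - known)


def cooccurrence_edges(doc_diseases):
    """Count how often each pair of diseases appears in the same document."""
    edges = Counter()
    for diseases in doc_diseases.values():
        # Deduplicate and sort so each pair has one canonical orientation.
        unique = sorted({d.lower() for d in diseases})
        edges.update(combinations(unique, 2))
    return edges


# Hypothetical toy data standing in for the tagged PubMed documents.
dictionary = ["asthma", "diabetes mellitus"]
doc_diseases = {
    "doc1": ["Asthma", "eczema"],
    "doc2": ["asthma", "eczema", "diabetes mellitus"],
}

detected = [name for names in doc_diseases.values() for name in names]
new_names = novel_names(detected, dictionary)
edges = cooccurrence_edges(doc_diseases)

print("Names not in the dictionary:", new_names)
for (a, b), weight in edges.most_common():
    print(f"{a} -- {b}: {weight} shared document(s)")
```

Each counted pair can be read as a weighted edge in the co-occurrence network, with the weight being the number of documents the two diseases share.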
This is a companion discussion topic for the original entry at https://kni.me/w/9gREnqIbKKgFjhUn