Semantic Search with BERT

In this workflow, abstracts from the COVID-19 Open Research Dataset (CORD-19) are read in to perform semantic search with a TensorFlow 2 BERT model. For this purpose, a BERT model that has already been trained on the CORD-19 dataset is loaded from TensorFlow Hub. BERT embeddings are created for the abstracts and for the question asked; the cosine similarity between the query embedding and each abstract embedding is calculated, and the semantically most relevant papers are then displayed in a view. The data can be downloaded from
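The ranking step described above can be sketched as follows. This is a minimal illustration of cosine-similarity search over precomputed embeddings: the random vectors stand in for the BERT abstract embeddings and the query embedding, which in the actual workflow come from the CORD-19 BERT model on TensorFlow Hub.

```python
import numpy as np

# Stand-in embeddings for illustration: in the workflow these would be
# produced by the CORD-19 BERT model (768-dimensional vectors).
rng = np.random.default_rng(0)
abstract_embeddings = rng.normal(size=(5, 768))  # one row per abstract
# A query embedding close to abstract 2, so the ranking is predictable here.
query_embedding = abstract_embeddings[2] + 0.01 * rng.normal(size=768)

def cosine_similarity(matrix, vector):
    """Cosine similarity between each row of `matrix` and `vector`."""
    matrix_norm = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    vector_norm = vector / np.linalg.norm(vector)
    return matrix_norm @ vector_norm

scores = cosine_similarity(abstract_embeddings, query_embedding)
ranking = np.argsort(scores)[::-1]  # indices of abstracts, most similar first
```

In the workflow, the top-ranked indices would then be used to look up and display the corresponding papers in the view.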

This is a companion discussion topic for the original entry at