I am developing a pipeline for text enrichment and named entity recognition in a specific domain. This produces an annotated document which can be viewed with the Document Viewer.
I need to export these results (e.g. the document plus the recognized entities) so that they can be used externally. How can I achieve this?
Partial answer: I can get these terms with the Bag Of Words Creator node. I still need to attach all the terms to each document, e.g. "document A has terms (NERs) X, Y, Z".
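You could create a bag of words (BoW) to get a list of all tagged terms. Before creating the BoW, filter the terms with the Tag Filter so that only tagged terms remain in the documents. Then convert the terms, and the tags too if you need them, to strings. Finally, group by document and aggregate the terms: on string columns you can, for example, use concatenate as the aggregation function in the GroupBy node. This gives you one row per document with all of its recognized entities in a single cell, which you can then write out for external use.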
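To make the group-and-concatenate step concrete, here is a rough sketch of the same logic in plain Python. It is not KNIME code and does not use any KNIME API; it only mimics what the workflow does to the table. The row layout (document, term, tag), the function name aggregate_terms, and the example tag names are all assumptions made for illustration.

```python
def aggregate_terms(bow_rows, separator=", "):
    """Group (document, term, tag) rows by document and concatenate the terms.

    bow_rows stands in for the bag-of-words table after the terms and tags
    have been converted to strings (hypothetical layout, not a KNIME type).
    """
    grouped = {}
    for document, term, tag in bow_rows:
        if not tag:
            # Skip untagged terms, mirroring the tag filtering step.
            continue
        grouped.setdefault(document, []).append(f"{term}[{tag}]")
    # One entry per document, terms joined like a "concatenate" aggregation.
    return {doc: separator.join(terms) for doc, terms in grouped.items()}


# Made-up example rows; the tag values are placeholders.
rows = [
    ("document A", "X", "PERSON"),
    ("document A", "Y", "LOCATION"),
    ("document B", "Z", "ORGANIZATION"),
    ("document B", "the", None),
]

result = aggregate_terms(rows)
for doc, terms in result.items():
    print(f"{doc}: {terms}")
```

In the workflow itself the same result comes from the GroupBy node, with the document column as the group column and concatenate as the aggregation on the term column.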
I hope this helps.