How to convert HTML characters into plain text

I have some data fields that have HTML encoding.

Example 1:
Isol. from < i >Dryopteris abbreviata< /i >
Example 2:
C< sub >23< /sub >H< sub >28< /sub >O< sub >8< /sub >

How do I convert these into the following text strings?

Example 1 (converted):
Isol. from Dryopteris abbreviata
where the ‘Dryopteris abbreviata’ is in italics

Example 2 (converted):
C23H28O8
where the numbers are in subscript

I have tried using a Java Snippet node with an Additional Bundles import of org.jsoup.parser.Parser, and with out_MOLF = Parser.unescapeEntities(c_MOLF, false) as my code.

I could configure the node and get it to run. However, the text has not been converted at all.

I hope somebody can help me.

You might want to upload a sample to get better support.


Hi @David_G

Unfortunately, it is not possible to make the “normal” grey output table display HTML-formatted text.

But you can use the JS-based Table View node, which will display the HTML formatting without further ado.

Note that the formatted output appears only in the interactive view.

That’s the only ‘solution’/workaround that I could think of, but maybe some really Java-savvy folks know more :)
Sorry I don’t have a better answer!



If plain text is wanted, there is also the Markup Tag Filter, which will remove such formatting tags from the string.
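
For anyone who wants to do the same thing inside a Java Snippet, here is a minimal sketch using only the Java standard library. It is not what the Markup Tag Filter does internally, just a naive regular-expression approach, and the class and method names (TagStripper, stripTags) are made up for illustration. As background: Parser.unescapeEntities only decodes character references such as &amp;lt; or &amp;amp;, and it leaves tags like <i> untouched, which would explain why the text looked unchanged.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of KNIME or jsoup.
final class TagStripper {
    // Matches an opening or closing tag such as <i>, </sub>, or "< i >"
    // (optional whitespace around the slash and tag name is tolerated).
    private static final Pattern TAG =
            Pattern.compile("<\\s*/?\\s*[a-zA-Z][^>]*>");

    // Removes every tag-like span and keeps the text between the tags.
    static String stripTags(String html) {
        if (html == null) {
            return null;
        }
        return TAG.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        // prints: Isol. from Dryopteris abbreviata
        System.out.println(stripTags("Isol. from <i>Dryopteris abbreviata</i>"));
        // prints: C23H28O8
        System.out.println(stripTags("C<sub>23</sub>H<sub>28</sub>O<sub>8</sub>"));
    }
}
```

In a Java Snippet the same idea would presumably be a one-liner along the lines of out_MOLF = c_MOLF.replaceAll("<\\s*/?\\s*[a-zA-Z][^>]*>", ""); using the field names from the original post. Be aware that a regex like this is not a real HTML parser: text containing a literal "<" followed later by ">" (for example "a < b and c > d") would be damaged, so a proper parser is the safer choice for messy data.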

Hi @Alice_Krebs

Thank you for your answer.

So, as I understand it, I cannot expose the HTML-formatted text to an end user in a webpage, Excel file, or CSV file.

It would be good if such nodes could be developed that would allow this.

Thanks for your help!


Hi @David_G

No, if you display the Interactive view of the Table View node as a webpage using our commercial product, the text will be formatted without further ado.

But you are right with regard to outputting CSV and Excel.
Thanks for the feedback, I will forward that to the devs :)
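
One possible workaround for the CSV/Excel case, at least for the formula example: Unicode has dedicated subscript digits (U+2080 to U+2089), so the numbers inside sub tags can be written as plain-text characters that survive export. The sketch below is only an illustration under that assumption; the class and method names (SubscriptText, toUnicodeSubscripts) are hypothetical, it only handles sub tags that contain nothing but digits, and there is no comparable plain-text equivalent for italics.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of KNIME.
final class SubscriptText {
    // Matches <sub>digits</sub>, tolerating spaces inside the tags.
    private static final Pattern SUB = Pattern.compile(
            "<\\s*sub\\s*>(\\d+)<\\s*/\\s*sub\\s*>", Pattern.CASE_INSENSITIVE);

    // Replaces each <sub>digits</sub> span with Unicode subscript digits.
    static String toUnicodeSubscripts(String html) {
        Matcher m = SUB.matcher(html);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy the text before this match unchanged.
            out.append(html, last, m.start());
            // Map '0'..'9' onto the subscript digits starting at U+2080.
            for (char c : m.group(1).toCharArray()) {
                out.append((char) ('\u2080' + (c - '0')));
            }
            last = m.end();
        }
        // Copy whatever follows the final match.
        out.append(html, last, html.length());
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(
                toUnicodeSubscripts("C<sub>23</sub>H<sub>28</sub>O<sub>8</sub>"));
    }
}
```

Whether the subscript digits display correctly depends on the font and on the encoding used when the file is written and opened (UTF-8 is the safe assumption), so it is worth checking one exported file before relying on this.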