I’m doing a text analysis project and would like to ask if you have any suggestions on how to differentiate bold from normal text. The details are below:
- Input files: I have multiple Word documents (transcripts of interviews). The questions are formatted in bold text, and the answers are in normal text.
- Output: I would like an Excel output with each question in one column and the corresponding answer in a second column.
I use the Word Parser and Sentence Extractor as the first two nodes, and want to write a rule to split bold vs. normal text. But I’m not sure if this is possible, because when I use the Word Parser node, I think the formatting is not taken into account.
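
For reference, outside of the workflow, the split I have in mind could be sketched roughly as below. This is only a sketch under some assumptions: it reads the raw XML inside the .docx file with the Python standard library, it only detects bold applied directly to a run (`<w:b/>`), not bold inherited from a style, and it writes a CSV file (which Excel can open) rather than a real .xlsx. The function names are my own placeholders.

```python
import csv
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def run_text(run):
    """Concatenate the text of all <w:t> elements in a run."""
    return "".join(t.text or "" for t in run.iter(f"{W}t"))


def run_is_bold(run):
    """True if the run has direct bold formatting that is not switched off."""
    b = run.find(f"{W}rPr/{W}b")
    if b is None:
        return False
    return b.get(f"{W}val", "true") not in ("0", "false", "off")


def paragraphs(document_xml):
    """Yield (text, is_bold) for each non-empty paragraph.

    Assumption: a paragraph counts as bold only if every run that
    contains visible text is bold.
    """
    root = ET.fromstring(document_xml)
    for p in root.iter(f"{W}p"):
        runs = [r for r in p.iter(f"{W}r") if run_text(r).strip()]
        if runs:
            text = "".join(run_text(r) for r in runs).strip()
            yield text, all(run_is_bold(r) for r in runs)


def pair_questions(paras):
    """Group paragraphs into [question, answer] rows.

    Each bold paragraph starts a new row; the normal paragraphs that
    follow are joined into that row's answer. Normal text before the
    first question is dropped.
    """
    rows = []
    for text, bold in paras:
        if bold:
            rows.append([text, ""])
        elif rows:
            rows[-1][1] = (rows[-1][1] + " " + text).strip()
    return rows


def docx_to_csv(docx_path, csv_path):
    """Read one .docx file and write a Question/Answer CSV."""
    with zipfile.ZipFile(docx_path) as z:
        document_xml = z.read("word/document.xml")
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Question", "Answer"])
        writer.writerows(pair_questions(paragraphs(document_xml)))
```

Calling `docx_to_csv("interview.docx", "interview.csv")` (both file names are hypothetical) would give one row per question. What I’m looking for is a way to do the equivalent with nodes.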