Filtering and/or extracting terms from .docx based on color

Hi,
I have Word documents (.docx) that contain colored text. Most of the text is black, but some text may be in red or blue.
I am trying to extract all terms from a document based on color and have not been able to figure this out. Ideally, a resulting table would have color columns (black, red and blue) and the related terms in those columns.

Has anyone tried this in KNIME?
Thank you.

Hi @Onjai,

is it possible to share some sample data, please?

Best
Mike

Hi Mike,
I cannot publicly post the Word documents.
All you need is one or two Word documents with a mixture of colored terms. Some terms in red, some black, some green. The colored terms would not be the same between documents, i.e. the term “happy” may be red colored in one document and colored blue in another document.
The goal is to extract all terms based on color into separate color columns (R, G, B).

Hi @Onjai,

so you cannot provide a sample Word document that meets your needs but, no offense intended, expect the community to create one based on your description in order to help you? Again, don’t get me wrong, but that makes it quite challenging to help, wouldn’t you agree?

I, as well as many others in the forum, gladly help, but it feels a bit unbalanced when supporters already donate their precious time and are then expected to create test data as well. Not very motivating …

Cheers
Mike

Hi @mwiegand,
Please forgive my ignorance and arrogance for the shortness of my post.
Attached is a Word document.
Honesty is the practice.docx (14.1 KB)

The resultant table I am looking for would be formatted as follows:

Thank you.

Hi @Onjai,

no worries and thanks a lot for your understanding. I know it’s difficult, when facing challenges, to keep some details in mind so thanks for providing the data. I am already having a look. Please give me some time to think about this.

Edit: Let me know what you think about this :wink:

Best
Mike
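
For readers who want to try the extraction described in this thread themselves, here is a minimal Python sketch. It is not the solution shared in this thread, only one possible approach under stated assumptions. A .docx file is a ZIP archive whose main text lives in word/document.xml, and a directly applied font color is stored on a run as a w:color element with a hex value. The function names and the COLOR_NAMES mapping are hypothetical and chosen for illustration. Colors inherited from styles or theme colors are not resolved here, and a run can contain several words or only part of a word.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import defaultdict

# WordprocessingML namespace used in word/document.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hypothetical mapping from hex codes to column names; extend as needed.
COLOR_NAMES = {
    "000000": "black",
    "FF0000": "red",
    "00FF00": "green",
    "0000FF": "blue",
}


def terms_by_color_from_xml(xml_bytes):
    """Group the text of each run by its directly applied font color."""
    root = ET.fromstring(xml_bytes)
    groups = defaultdict(list)
    for run in root.iter(f"{{{W}}}r"):
        # A run may hold several w:t elements; join them into one string.
        text = "".join(t.text or "" for t in run.findall("w:t", NS))
        if not text.strip():
            continue
        color = run.find("w:rPr/w:color", NS)
        val = color.get(f"{{{W}}}val") if color is not None else None
        # Assumption: a missing color element or "auto" means black text.
        hex_code = "000000" if val in (None, "auto") else val.upper()
        # Unknown hex codes are kept as-is and become their own column.
        groups[COLOR_NAMES.get(hex_code, hex_code)].append(text.strip())
    return dict(groups)


def terms_by_color(docx_path):
    """Read word/document.xml from a .docx file and group runs by color."""
    with zipfile.ZipFile(docx_path) as zf:
        return terms_by_color_from_xml(zf.read("word/document.xml"))
```

Calling terms_by_color("Honesty is the practice.docx") should return a dictionary with one list of text fragments per color, which can then be reshaped into one column per color. The same logic could presumably be run inside a KNIME Python Script node, though that has not been tested here.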
