I am working on a project in which I have about 13,000 Instagram captions (each with a unique ID). I also have an Excel file with 194 preset dictionaries (e.g. “trust” with 30 words, “family” with 25 words, etc.). Each column in the Excel file represents one dictionary.
For each caption, I need to know the percentage of its words that belong to each of the 194 dictionaries (e.g. 4% of the words in the first caption are related to “family”).
Can you post some sample data here: a few different captions and a selection of the dictionary categories? Assuming the data is not confidential, this would entice people to actually take a stab at your problem.
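Until real data is posted, here is a rough Python sketch of the calculation described in the question: for each caption, the share of its words that appear in each dictionary. Everything in it is an assumption for illustration only. The captions, IDs, and dictionary words are made-up toy values, and the helpers `tokenize` and `dictionary_percentages` are hypothetical names. It matches whole words exactly (case-insensitive), so it will not catch plurals or other word forms unless they are listed in the dictionary.

```python
import re

def tokenize(text):
    # Lowercase the caption and keep runs of letters/apostrophes as words.
    # Hashtags like "#family" therefore count as the word "family".
    return re.findall(r"[a-z']+", text.lower())

def dictionary_percentages(caption, dictionaries):
    # Return {dictionary_name: percent of caption words found in that dictionary}.
    words = tokenize(caption)
    total = len(words)
    result = {}
    for name, vocab in dictionaries.items():
        hits = sum(1 for w in words if w in vocab)
        # Guard against empty captions to avoid dividing by zero.
        result[name] = 100.0 * hits / total if total else 0.0
    return result

# Toy data, invented for this example (not the asker's real data).
dictionaries = {
    "family": {"mom", "dad", "sister"},
    "trust": {"honest", "loyal"},
}
captions = {
    "id_001": "Dinner with mom and dad tonight",
    "id_002": "Stay loyal, stay honest",
}

# One row of percentages per caption ID.
results = {cid: dictionary_percentages(text, dictionaries)
           for cid, text in captions.items()}

for cid, row in results.items():
    print(cid, row)
```

With the real files, the dictionaries could presumably be built from the spreadsheet columns (for instance by loading the sheet with `pandas.read_excel` and turning each column's non-empty cells into a lowercase set), but the exact layout would need to be confirmed from the sample data.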