This thread is for posting solutions to “Just KNIME It!” Challenge 38. This week we’ll dive into text processing in order to extract the context surrounding a target word.
Here is the challenge: Just KNIME It!
Feel free to link your solution from KNIME Hub as well!
And as always, if you have an idea for a challenge we’d love to hear it! Tell us all about it here.
Season 1 of “Just KNIME It!” is slowly coming to an end: we’ll wrap up on October 26!
Here is my component to find eggs:
You can change the target word (matching is case-insensitive) and the search window as well.
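For anyone who wants to see the idea outside of KNIME, here is a minimal Python sketch of a case-insensitive word window. It is not the component's actual logic: the function name `context_windows`, the simple letters-only tokenizer, and the sample sentence are all assumptions made for illustration.

```python
import re

def context_windows(text, target, window=2):
    """Return (before, match, after) for every occurrence of `target`."""
    # Assumption: words are runs of letters/apostrophes; punctuation is dropped.
    tokens = re.findall(r"[A-Za-z']+", text)
    target = target.lower()
    results = []
    for i, tok in enumerate(tokens):
        if tok.lower() == target:  # case-insensitive comparison
            before = tokens[max(0, i - window):i]   # up to `window` words before
            after = tokens[i + 1:i + 1 + window]    # up to `window` words after
            results.append((before, tok, after))
    return results

sample = "I like green eggs and ham. Eggs are tasty."
for before, word, after in context_windows(sample, "eggs", window=2):
    print(before, word, after)
```

Changing `window` widens or narrows the context, and the target can be typed in any case.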
Hope it is egg-sactly what was required.
Have a nice day,
Here’s my solution. The component has a list chooser which permits searching on any word.
Here is my solution to #justknimeit-38:
KNIME Hub > gonhaddock > Spaces > Just_KNIME_It > Just KNIME It _ Challenge 038
I went back to some R scripting, looking for the flexibility of the ‘ggwordcloud’ chart. The R branch within the component can be deleted if you aren’t an R fan; the analysis will still work and return a matrix of preceding and trailing words, available at the component’s output port.
As the test dataset is very short, I went with a simple bag-of-words analysis and abandoned my initial idea of performing trigram analysis and so on.
Some more cleanup could be done on articles, auxiliary verbs … but I think the challenge is covered. Is it?
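For readers who prefer code to nodes, a bag-of-words count of preceding and trailing words could look roughly like the Python sketch below. This is not the R script inside the component; `neighbor_counts`, the tiny `STOP_WORDS` set, and the sample sentence are hypothetical and only there to illustrate the cleanup of articles and auxiliary verbs.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list; a real analysis needs a fuller one.
STOP_WORDS = {"a", "an", "the", "and", "is", "are", "was", "do", "not", "i"}

def neighbor_counts(text, target, window=2):
    """Count non-stop-words appearing before and after each occurrence of `target`."""
    tokens = [t.lower() for t in re.findall(r"[A-Za-z']+", text)]
    target = target.lower()
    preceding, trailing = Counter(), Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        # Words inside the window on each side, minus stop words.
        preceding.update(w for w in tokens[max(0, i - window):i] if w not in STOP_WORDS)
        trailing.update(w for w in tokens[i + 1:i + 1 + window] if w not in STOP_WORDS)
    return preceding, trailing

pre, post = neighbor_counts("I do not like green eggs and ham. The eggs are green.", "eggs")
print(pre.most_common())
print(post.most_common())
```

The two counters play the role of the preceding/trailing word matrix and could feed a word cloud or any other chart.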
PS.- A bonus chart…
Here is my solution.
I have just realized that the Term Neighborhood Extractor node is an available option, but it duplicated some output.
Here is my solution.
It only finds the first instance of the lookup word.
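To illustrate the difference between matching the first instance and matching all of them, here is a small Python sketch using regular expressions. It is only an analogy for the workflow's behavior; the helper names `first_match` and `all_matches` and the sample text are made up for this example.

```python
import re

def _pattern(word):
    # Whole-word, case-insensitive pattern; re.escape guards special characters.
    return re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)

def first_match(text, word):
    """Character offset of the first occurrence, or None if absent."""
    m = _pattern(word).search(text)
    return m.start() if m else None

def all_matches(text, word):
    """Character offsets of every occurrence."""
    return [m.start() for m in _pattern(word).finditer(text)]

text = "Eggs, eggs everywhere: scrambled eggs."
print(first_match(text, "eggs"))  # only the first hit
print(all_matches(text, "eggs"))  # every hit
```

Swapping a single search for an iteration over all matches is usually what it takes to lift the first-instance limitation.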
I also shared a component named “Word Window” on the Hub.
The lookup word and window length can be configured from the component.
This is my shared component on the Hub, named ‘Word Windows’. Some improvements since my last post have been deployed (a configurable N-word neighborhood window, case-insensitive target word input, and a word cloud assigned to an output port as an image).
@kwatari I didn’t realize the ‘Term Neighborhood Extractor’ node was available either, as I did not explore any other solution in advance… I built the workflow from scratch based on regex detectors.
Thanks to @victor_palacios for proposing these interesting challenges. I’ve just learned how to share a component.
My submission for Challenge 38. I am still trying to figure out why a few lines are repeated in the output.
As always on Tuesdays, here’s our solution to the #justknimeit challenge 38!
Hope you enjoyed this little “egg hunt”!
Don’t forget to come back tomorrow for a new challenge!