Online Course: [L4-TP] Introduction to Text Processing

heather.fyson · September 30, 2020, 3:00pm

Course Focus

This course is about text mining, its theory, concepts, and applications. Specifically, the course focuses on the acquisition, processing and mining of textual data with KNIME Analytics Platform.

You will learn how to use the Text Processing Extension to read textual data into KNIME, enrich it semantically, preprocess it, transform it into numerical data, and extract information and knowledge from it through descriptive analytics (data visualization, clustering) and predictive analytics (regression, classification) methods. The course also covers popular text mining applications including social media analytics, topic detection and sentiment analysis.

Put what you’ve learnt into practice with the hands-on exercises.

Course content

Introduction to Text Processing and Importing
Text Tokenization and Enrichment
Preprocessing
Transformation and Classification Models
Visualization, Clustering, and Topic Modeling
Movie Forecasting Use Case, Recap and final Q&A

If you are interested in signing up:

Go to our Events page for a list of all our online courses and more details

outiloni · June 1, 2021, 3:50pm

Hi
Are the exercises the same as in your online course https://knime.learnupon.com/enrollments/82888541/details?
thanks
Outi

hayasaka · June 2, 2021, 7:04pm

Dear Outi,

Exercises are different between the self-paced course and the instructor-led course (the one currently ongoing). I hope this answers your question.

Thanks,
-Satoru

outiloni · June 3, 2021, 7:38am

Dear Satoru
Can you share a link to the sentiment word dictionary, please, and perhaps demonstrate how to bring it into KNIME? I only found .txt file versions.
many thanks
Outi

hayasaka · June 3, 2021, 7:27pm

Dear Outi,

You probably have to create a list of positive and negative terms from the .txt version. You can threshold the PosScore or NegScore to choose positive or negative terms, respectively. Some entries may be represented by a single term in SynsetTerms, while others may have multiple terms.
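As a rough illustration of the thresholding idea, here is a minimal Python sketch. It assumes the .txt file follows the SentiWordNet 3.0 layout (tab-separated columns POS, ID, PosScore, NegScore, SynsetTerms, Gloss, with comment lines starting with #, and each term written as term#sense). The sample rows, the function name split_by_sentiment, and the 0.5 threshold are all made up for the example.

```python
import csv
import io

# Tiny inline sample in the assumed SentiWordNet 3.0 layout (tab-separated).
# These rows and scores are invented for illustration, not real entries.
SAMPLE = (
    "# POS\tID\tPosScore\tNegScore\tSynsetTerms\tGloss\n"
    "a\t00000001\t0.75\t0\tgood#1 nice#2\tillustrative gloss\n"
    "a\t00000002\t0\t0.625\tbad#1\tillustrative gloss\n"
    "n\t00000003\t0.125\t0.125\tthing#1\tillustrative gloss\n"
)


def split_by_sentiment(lines, threshold=0.5):
    """Return (positive_terms, negative_terms) as sorted lists.

    The threshold is an arbitrary choice for this sketch; tune it to taste.
    """
    positive, negative = set(), set()
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Skip blank lines, comment lines, and rows with too few columns.
        if len(row) < 5 or row[0].startswith("#"):
            continue
        pos_score, neg_score = float(row[2]), float(row[3])
        # One entry can hold several terms, each written as term#sense.
        terms = [t.rsplit("#", 1)[0].replace("_", " ") for t in row[4].split()]
        if pos_score >= threshold:
            positive.update(terms)
        if neg_score >= threshold:
            negative.update(terms)
    return sorted(positive), sorted(negative)


positive_terms, negative_terms = split_by_sentiment(io.StringIO(SAMPLE))
print(positive_terms)
print(negative_terms)
```

To run it on the real file, an open file handle can be passed instead of the inline sample. Each resulting list could then be saved one term per line and read into KNIME as a dictionary table.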

Unfortunately, SentiWordNet has not been incorporated into KNIME yet…

-Satoru

outiloni · June 4, 2021, 10:06am

Ah, that’s why you used MPQA-OpinionCorpus?!
thank you Satoru!
Outi

outiloni · June 4, 2021, 10:16am

Dear Satoru
my apologies, another question:
I’m going through 11- Visualisation exercise (Solution).
If I just want to get sentiment tag clouds, can I jump straight from ‘Preprocessing 2 (with Bag of Words)’ to Visualisation (with TF, Document Data Extractor etc.)?

When using multiple dictionaries for tagging, what was the trick to stop the ‘newest’ dictionary from overwriting previous tags?
many thanks again
Outi

hayasaka · June 4, 2021, 8:03pm

Dear Outi,

To generate a tag cloud with sentiment information, you need to tag the data with positive and negative sentiment tags. Then you need to perform the preprocessing steps, then generate TF, then you are ready to generate a tag cloud. You can assign different colors based on sentiment tags in the tag cloud.
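The order of those steps can be sketched outside KNIME in plain Python. This is only a conceptual illustration, not how the KNIME nodes are implemented: the dictionaries, stop words, colors, and the names sentiment_tag and tag_cloud_rows are all hypothetical, and TF here means relative term frequency (count divided by the number of terms kept).

```python
import re
from collections import Counter

# Hypothetical mini dictionaries, stop words, and colors for illustration.
POSITIVE = {"great", "fun"}
NEGATIVE = {"boring", "weak"}
STOP_WORDS = {"the", "was", "but", "a", "and"}
COLORS = {"POSITIVE": "green", "NEGATIVE": "red", None: "grey"}


def sentiment_tag(term):
    """Return the sentiment tag for a term, or None if it is untagged."""
    if term in POSITIVE:
        return "POSITIVE"
    if term in NEGATIVE:
        return "NEGATIVE"
    return None


def tag_cloud_rows(text):
    """Return (term, tag, relative TF, color) rows, most frequent first."""
    tokens = re.findall(r"[a-z]+", text.lower())
    # 1. Tag every term with a sentiment tag (or None).
    tagged = [(tok, sentiment_tag(tok)) for tok in tokens]
    # 2. Preprocess: here just stop word filtering.
    kept = [pair for pair in tagged if pair[0] not in STOP_WORDS]
    # 3. Compute relative term frequency over the remaining terms.
    counts = Counter(kept)
    total = len(kept)
    # 4. Attach a color per sentiment tag, ready for a tag cloud.
    return [
        (term, tag, count / total, COLORS[tag])
        for (term, tag), count in counts.most_common()
    ]


rows = tag_cloud_rows(
    "The plot was great and the cast was great but the ending was boring"
)
for row in rows:
    print(row)
```

Tagging comes before stop word filtering here so that the tags travel with the terms through the later steps, mirroring the order described above.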

Unfortunately, there is no way to keep only the tags from the first tagging. If terms are tagged with multiple tags, you may need to use the Tags to String node, for example, and select the particular type of tag you are interested in.

-Satoru