Question regarding text processing and machine learning

Hello everyone

I’m still fairly new to KNIME and I’ve run into a wall in my current project for a university course.

The project’s goal is to see whether the database of all orders in a company can be used to build a model that predicts the most likely solution for a problem, based on the problem description, model number, replaced parts and invoice text.

Basically, you should be able to input the problem description and model number, and the system should return possible solutions in the form of spare parts and resolution text.

So far I’ve connected KNIME to the database, run the orders through the Tika Language Detector and filtered down to the rows for the most frequently used model.

What I’m stuck on is how to use the “error description” and “modelno.” to predict the “spare parts” and “resolution text”.
I’ve tried using “Strings To Document” and computing the term frequencies, but I can’t really wrap my head around where to start.

Does anyone have a hint on how I should approach this?

I think you can use Strings To Document and then create a bag-of-words representation, do some cleaning, and then apply a TF-IDF vectorizer to your text. That gives you a numerical representation that your algorithm can then fit.
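
To make those steps concrete, here is a minimal sketch in plain Python rather than KNIME nodes. Everything in it is an assumption for illustration: the toy orders, the stop-word list and the function names are hypothetical, and the nearest-neighbour lookup by cosine similarity is just one simple stand-in for "your algorithm", not a recommendation from this thread.

```python
import math
from collections import Counter

# Hypothetical toy orders: (model number, error description, spare part).
ORDERS = [
    ("M100", "pump does not drain water", "drain pump"),
    ("M100", "water stays in drum, pump noisy", "drain pump"),
    ("M100", "door does not lock", "door lock"),
    ("M100", "door lock broken, door opens", "door lock"),
]

# Tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"does", "not", "in", "the", "a"}


def tokenize(text):
    """Cleaning step: lowercase, strip punctuation, drop stop words."""
    words = (w.strip(".,;:!?") for w in text.lower().split())
    return [w for w in words if w and w not in STOP_WORDS]


def fit_idf(docs):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1 for t, c in doc_freq.items()}


def tfidf(doc, idf):
    """Bag of words (term counts) weighted by IDF; unseen terms are ignored."""
    return {t: c * idf[t] for t, c in Counter(doc).items() if t in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict(model_no, description, orders=ORDERS):
    """Return the spare part of the most similar past order for this model."""
    rows = [(d, part) for m, d, part in orders if m == model_no]
    if not rows:
        return None
    docs = [tokenize(d) for d, _ in rows]
    idf = fit_idf(docs)
    query = tfidf(tokenize(description), idf)
    scored = [(cosine(query, tfidf(doc, idf)), part)
              for doc, (_, part) in zip(docs, rows)]
    return max(scored)[1]


print(predict("M100", "pump is noisy"))
print(predict("M100", "door will not lock"))
```

Here the model number is used only to filter the rows, which mirrors the filtering step described in the question. Treating it as an extra input feature instead would be an equally reasonable design.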

Hello! Would that mean creating a “Strings To Document” for all columns (model, error description, invoice text and spare parts) combined, or one for each of them?
