Hello! I’m trying to create a table with a list of rules for correcting misspelled words and spelling out acronyms. What’s the best way to do that? I’ve tried the Rule Engine (Dictionary) node, but it gives me an error. I suspect it doesn’t like the way the table is set up, or there is some kind of syntax error. Please help.
Execute failed: Wrong rule in line: Row0 Line: 1: Expected a number, boolean, string, column, a table property or flow variable reference. clitch => “glitch” ^
I have a table with 196 rows and 2 columns. Column 1 is Rule, Column 2 is Result.
Not sure what your data looks like, but you might try the String Replacer (Dictionary) node. It allows wildcards which can be helpful for identifying misspellings.
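For anyone who wants to see the idea outside of KNIME, here is a minimal Python sketch of a dictionary-based replacement. It is not how the String Replacer (Dictionary) node is implemented; the `REPLACEMENTS` table and the `apply_dictionary` function are made-up names for illustration, and matching here is whole-word and case-sensitive, with no wildcard support.

```python
import re

# Hypothetical two-column dictionary: misspelling or acronym -> replacement.
REPLACEMENTS = {
    "clitch": "glitch",
    "recieve": "receive",
    "ETA": "estimated time of arrival",
}

# One pattern that matches any dictionary key as a whole word.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(key) for key in REPLACEMENTS) + r")\b"
)


def apply_dictionary(text: str) -> str:
    """Replace every whole-word dictionary key in text with its value."""
    return _PATTERN.sub(lambda m: REPLACEMENTS[m.group(1)], text)
```

Calling `apply_dictionary("A clitch delayed the ETA")` swaps both entries in a single pass, while text that contains no dictionary key comes back unchanged.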
That works for me. Thank you!! BTW… is there a node that identifies a spelling error but does not fix it… but gives me an option to decide if I would want to fix it or not?
Can you clarify? I’m not sure I understand you. Join what with a dictionary? And which dictionary: is it an existing one, or do I have to create it? I’m new to KNIME, so sorry for the dumb questions.
I know of no node which would allow this kind of interactivity. It’s probably possible to build a workflow which would permit selective changing of misspellings, but it would be pretty complicated.
You go to, e.g., Oxfam or any other dictionary producer and get the list of words in the dictionary for the language you want to work with. That list will be thousands or millions of entries long.
You then match that list against your own list using one of the following nodes: Joiner, Reference Row Splitter, Value Lookup, …
After that, you have all the “officially” recognized words separated from those that may be misspelled or contain a typo.
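The matching step described above can be sketched in a few lines of Python. This is only an illustration of the lookup logic, not what the Joiner or Reference Row Splitter nodes do internally; `split_by_dictionary` is a hypothetical helper, and the comparison is assumed to be case-insensitive.

```python
def split_by_dictionary(words, dictionary_words):
    """Split words into (recognized, suspicious) by lookup in a reference list."""
    # A set gives fast membership tests even for a very long dictionary.
    known = {entry.lower() for entry in dictionary_words}
    recognized, suspicious = [], []
    for word in words:
        if word.lower() in known:
            recognized.append(word)
        else:
            suspicious.append(word)
    return recognized, suspicious
```

For example, `split_by_dictionary(["glitch", "clitch"], ["glitch", "table"])` puts "glitch" in the recognized list and "clitch" in the suspicious one. Nothing is corrected at this stage; the words are only sorted into two groups.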
I have tried the spellchecker nodes… but they fix the misspelled words automatically. I’m looking for something that just identifies the misspelled word and gives me the opportunity to choose whether I want it fixed or not. For example, “comp” could be expanded in many ways depending on the industry… computer, complete, comprehensive. I want to see which replacement is being recommended and decide whether it works for me.
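One way to get “suggest but don’t fix” behavior is to generate candidate replacements and review them yourself. Below is a rough Python sketch using the standard library’s `difflib.get_close_matches`; the `suggest` function, its prefix rule for abbreviations, and the default cutoff are my own assumptions, and this is not something a KNIME spellchecker node is known to do.

```python
from difflib import get_close_matches


def suggest(word, vocabulary, n=3, cutoff=0.6):
    """Return candidate replacements for word without changing anything."""
    lowered = word.lower()
    vocab = [entry.lower() for entry in vocabulary]
    if lowered in vocab:
        return []  # already a recognized word, nothing to suggest
    # Abbreviation-style candidates: vocabulary words that start with the input.
    expansions = [entry for entry in vocab if entry.startswith(lowered)]
    # Typo-style candidates: vocabulary words that are similar overall.
    similar = get_close_matches(lowered, vocab, n=n, cutoff=cutoff)
    # Merge both lists, keeping order and dropping duplicates.
    return list(dict.fromkeys(expansions + similar))
```

With a vocabulary of "computer", "complete", "comprehensive" and "glitch", `suggest("comp", ...)` lists the three possible expansions and `suggest("clitch", ...)` proposes "glitch". The output could be written to an extra column so each row can be accepted or rejected by hand.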
Are you working with a list of single misspelled words or a list of strings with misspelled words in them? Also, are you looking for a specific list of misspelled words or every possible word in your language of choice?