I want to run a text processing workflow developed in KNIME 2.9 under KNIME 3.3. When I run the v2.9 workflow in the v3.3 environment, it runs fine, but KNIME tells me that the majority of the text pre-processing nodes are deprecated. When I replace the deprecated v2.9 nodes with the corresponding v3.3 versions, some of the replacements run extremely slowly (about 70-fold slower) and in some cases fail to perform their function (no filtering). Let me illustrate the problem with the following example (attached), which shows:
(1) Execution of the v2.9 Punctuation Erasure node (deprecated) takes about 3 seconds. Execution of the corresponding v3.3 node takes about 200 seconds.
(2) The output of the Bag of Words Creator node shows 220103 rows. The output of the v2.9 Punctuation Erasure node (deprecated) shows 209271 rows. This is expected and shows that filtering has occurred. By contrast, the output of the corresponding v3.3 node shows 220103 rows, identical to the input. Thus no filtering has occurred.
(3) The problem is not restricted to the Punctuation Erasure node. Execution of the v2.9 N Chars Filter node (deprecated) takes only 3 seconds. Execution of the corresponding v3.3 node takes about 215 seconds (about 70-fold slower).
(4) Filtering is similarly affected. With the same input as before, the output of the v2.9 N Chars Filter node shows 164532 rows, indicating that extensive filtering has occurred. By contrast, the output of the corresponding v3.3 node shows 220103 rows (same as the input), indicating that no filtering has occurred.
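For anyone who wants to check the arithmetic behind items (1) to (4), here is a minimal Python sketch. It only restates the row counts and approximate timings quoted above; the function and variable names are made up for illustration and have nothing to do with KNIME itself.

```python
# Figures copied from items (1)-(4) above. Timings are approximate,
# so the slowdown factors are rough estimates, not measurements.
INPUT_ROWS = 220103  # rows output by the Bag of Words Creator node


def rows_removed(rows_in, rows_out):
    """Number of rows a filter node dropped."""
    return rows_in - rows_out


def slowdown(old_seconds, new_seconds):
    """How many times slower the new node is than the old one."""
    return new_seconds / old_seconds


# (node name, v2.9 rows out, v3.3 rows out, v2.9 seconds, v3.3 seconds)
cases = [
    ("Punctuation Erasure", 209271, 220103, 3, 200),
    ("N Chars Filter", 164532, 220103, 3, 215),
]

for name, old_rows, new_rows, old_secs, new_secs in cases:
    print(name)
    print("  v2.9 rows removed:", rows_removed(INPUT_ROWS, old_rows))
    print("  v3.3 rows removed:", rows_removed(INPUT_ROWS, new_rows))
    print("  approx. slowdown: %.0fx" % slowdown(old_secs, new_secs))
```

Both v3.3 nodes remove zero rows, and both slowdown factors land close to the 70-fold figure mentioned above.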
What is going on here?
--Paul