Does KNIME have a text search function and taxonomy?

Hi

I know KNIME has a text processing plug-in. But does it have a text search function with logical operators? E.g. search a collection of documents and return the ones that contain "english" and "football" but do not contain "american".

If the above is possible, can I build a taxonomy? How do I do that?

Kind regards

Wei

Hi Wei,

I'm happy to announce that we are currently working on a Lucene (http://lucene.apache.org) extension which will provide advanced data table indexing and searching, e.g. boolean operators, fuzzy matching and wildcard matching. The extension will support not only basic KNIME data types such as boolean, double, string and date, but also more complex types such as documents from the Text Processing plugin. We plan to release the first version of the extension as part of KNIME Labs with the next release of KNIME.
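
For illustration, in Lucene's classic query syntax the search from the question could be written as `+english +football -american`. Whether the KNIME extension accepts exactly this syntax is an assumption here. The boolean logic itself can be sketched in a few lines of plain Python; the function names and the sample documents below are made up for the example and have nothing to do with the extension's implementation.

```python
import re

def tokenize(text):
    """Lower-case the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def boolean_search(docs, must=(), must_not=()):
    """Return the docs that contain every term in `must` and none in `must_not`."""
    hits = []
    for doc in docs:
        tokens = tokenize(doc)
        has_all = all(term in tokens for term in must)
        has_banned = any(term in tokens for term in must_not)
        if has_all and not has_banned:
            hits.append(doc)
    return hits

# Hypothetical sample documents, only for this sketch.
docs = [
    "English football clubs play in the Premier League.",
    "American football is popular outside the English-speaking world too.",
    "The English countryside in spring.",
]

result = boolean_search(docs, must=["english", "football"], must_not=["american"])
```

Here only the first document is returned: the second contains "american" and the third lacks "football".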

Bye,

Tobias

Sounds fantastic. I have used some commercial text mining tools, and one function that would be very helpful is a taxonomy built upon text search. The user can define a tree of search terms, and clicking on a tree leaf/search term opens a window showing all text/record lines that match that search. This is particularly useful for review.

Kind regards

Wei

For text searches in bio-/medical literature (MEDLINE, etc.), optional use of a controlled vocabulary such as MeSH terms would be helpful. This should be considered as a feature for future versions.

Taxonomies would help to generalise searches, e.g. searching for a whole family of proteins without having to supply a long list of individual proteins.
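
As a rough sketch of what such a generalised search could look like: a taxonomy can be stored as a mapping from a term to its narrower terms, and a search for a parent term is expanded into an OR query over all of its descendants. The tiny taxonomy below is illustrative only (it is not taken from MeSH), and `expand` and `to_or_query` are hypothetical helper names.

```python
# Illustrative taxonomy: each key maps to its narrower terms.
taxonomy = {
    "kinase": ["cyclin-dependent kinase", "MAP kinase"],
    "cyclin-dependent kinase": ["CDK1", "CDK2"],
    "MAP kinase": ["MAPK1", "MAPK3"],
}

def expand(term, taxonomy):
    """Return the term followed by all of its descendants, depth-first."""
    terms = [term]
    for child in taxonomy.get(term, []):
        terms.extend(expand(child, taxonomy))
    return terms

def to_or_query(term, taxonomy):
    """Build an OR query string; multi-word terms are quoted as phrases."""
    parts = [f'"{t}"' if " " in t else t for t in expand(term, taxonomy)]
    return " OR ".join(parts)

query = to_or_query("kinase", taxonomy)
```

The resulting string could then be handed to whatever search component is in use, assuming it understands OR and quoted phrases.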

Frank

Hello everybody,

we have just released version 2.6.0. It includes the new Indexing & Searching feature, which is available via the KNIME Labs update site. Besides basic types such as string and numerical data, the new plugin also supports full-text search of documents, allowing you to search within the text of all documents in a KNIME data table. In addition to the full text, the metadata of each document is indexed in separate fields, which allows you to further filter the search results by particular authors, journals and publication dates. We hope you enjoy this new functionality.

Bye,

Tobias

Hi-

I just tried this out. Numerical search worked, but text search fooled me because of stop words. Do you have a list somewhere of exactly which Lucene features you used? Is it just the Standard Analyzer with US-ASCII lower-casing and English stop words, and no stemming? I would say this is the right choice for log analysis etc. If you want to offer more options, I would just use Embedded Solr instead, since it lets you configure almost all of Lucene from text files.
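
To show the kind of surprise I mean, here is a minimal Python sketch of an analyzer that lower-cases, drops stop words and does no stemming. The stop-word set is a small illustrative subset, not Lucene's exact list, and I am only assuming the plugin behaves roughly like this.

```python
import re

# Small illustrative subset of English stop words (assumption, not Lucene's list).
STOP_WORDS = {"a", "an", "and", "is", "not", "of", "or", "the", "to"}

def analyze(text):
    """Lower-case, split into word tokens, and drop stop words (no stemming)."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def build_index(docs):
    """Map each surviving token to the set of document ids that contain it."""
    index = {}
    for doc_id, text in enumerate(docs):
        for token in analyze(text):
            index.setdefault(token, set()).add(doc_id)
    return index

docs = ["To be or not to be", "The log is full of errors"]
index = build_index(docs)

hits_not = index.get("not", set())        # stop word: never indexed, so no hits
hits_errors = index.get("errors", set())  # ordinary word: found in document 1
```

A search for "not" comes back empty even though the word is plainly in the first document, because it was removed at indexing time.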

Thanks,

Lance Norskog (lance.norskog at the big G)