Newbie Questions

Greetings All:

Am very impressed with what you have built and we are exploring using KNIME as a key part of our analysis toolset at the government branch where I work. Please bear with my questions.

1) The "help contents" yields an HTTP 500 Internal Server Error
2) We need ways of sub-setting our datasets for detailed analysis. Which node(s) would we use?
3) We are looking at using PCA (principal components analysis). Are there PCA nodes?
4) How would I increase the amount of RAM that KNIME uses so as not to hit a Java heap error?
5) We need a variety of data visualizations - are there other nodes/add-ins that we could access?
6) We would also like to automate the generation of standard reports with embedded data summaries/plots. Any suggestions as to how we could do this?
7) I see from the other forum that the interface to R is being improved. What is the timeline for the improvements?

Thanks for your time.

Iqbal

Hi Iqbal,

Iqbal wrote:
Am very impressed with what you have built and we are exploring using KNIME as a key part of our analysis toolset at the government branch where I work

:oops:

Iqbal wrote:
1) The "help contents" yields an HTTP 500 Internal Server Error

Interesting. Maybe Fabian knows about this. Is there anything inside /.metadata/.log that may shed light on this problem?

Iqbal wrote:
2) We need ways of sub-setting our datasets for detailed analysis. Which node(s) would we use?

Depending on what exactly you intend to do, you could use the Partitioning node (splits data into two disjoint parts), the Sampling node (filters out parts of the table), the Row Filter node (filters rows based on certain column values) or the Cross Validation meta node. Maybe there are others, but I'm slowly losing track of what's inside KNIME 8)

Iqbal wrote:
3) We are looking at using PCA (principal components analysis). Are there PCA nodes?

Well, we have a node that does PCA, but it is not in the official release (yet). But we may have a look at integrating it.

Iqbal wrote:
4) How would I increase the amount of RAM that KNIME uses so as not to hit a Java heap error?

Either pass, for example, -Xmx1024m on the command line, or edit knime.ini (.knime.ini under Linux) and replace the value after -Xmx with the desired one.
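
As a minimal sketch of the knime.ini route, assuming the file follows the usual Eclipse launcher layout (JVM options listed one per line after a -vmargs line), the relevant part might look like this after the change. The 1024m value is only the example figure mentioned above, and any other lines in your file stay as they are:

```
-vmargs
-Xmx1024m
```

If you pass the option on the command line instead, the Eclipse launcher likewise expects JVM options to come after -vmargs.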

Iqbal wrote:
5) We need a variety of data visualizations - are there other nodes/add-ins that we could access?

I only know the ones in the release and the ones in the optional JFreeChart plugin.

Hope that helps. For the remaining questions, some of my colleagues are the experts in the field.

Regards,

Thorsten

Hi Iqbal,

I am very interested in which data visualizations, beyond those already included, you would like to see. The current release contains:

    box plot
    histogram
    line plot
    parallel coordinates
    pie chart
    scatter plot
    scatter plot matrix
    dendrogram (use hierarchical clustering)
    enrichment plot (use "mining -> scoring -> enrichment plot")

All of these support the visual properties color, shape and size (Data Views -> Property) as well as the highlighting functionality.

Some more information about the interactive views in KNIME and how to use them.

In the next release a 3D scatterplot will also be available. But I'm always keen on implementing new interactive views, so please let us know what else you expect.

Regarding the HTTP 500: please make sure you have write permission on your installation directory, since the help files are created dynamically. (This is fixed in the next version.) If this doesn't help, please tell me which OS and browser you use.

Regards,

Fabian


Quote:
Regarding the HTTP 500: please make sure you have write permission on your installation directory, since the help files are created dynamically. (This is fixed in the next version.) If this doesn't help, please tell me which OS and browser you use.

Windows Vista Home Premium and Firefox 2.0.0.13
(current test environment)

More response on other questions as the rest of the team members add their thoughts.

Thanks, Iqbal

Hi Iqbal,

regarding the HTTP 500 it seems that other people have this problem, too:

http://forums.dzone.com/eclipse/48-eclipse-help-display-problem-ms-vista.html

and

http://dev.eclipse.org/newslists/news.eclipse.newcomer/msg01317.html

So far there are no answers to these questions, but it seems to be an Eclipse and Vista configuration problem. We will keep track of it.

Can you have a look at the log file as Thorsten suggested?

Although this might be annoying, all of our documentation is also available online.

Regards,

Fabian

Some additional comments:

We would like to be able to choose a subset of rows that meet multiple criteria. We have used SQL statements (driven by pick lists) and would like to see something similar, something like the "autofilter" function in MS Excel.

We would like to see (quality) control plots (e.g. cusum, X charts); cumulative plots; spatial mapping; and time series plus model fits (e.g. EWMA). The 3D plots would be great; we are playing with ggobi.

For reporting, how easy would it be to integrate BIRT?

Having GIS/mapping display as a graphic node would be great: for example, a node that identifies spatial, temporal, or spatial-temporal clusters and feeds them into a mapping node that displays the locations with potential clusters.

Also, text mining tools; especially a spellchecker node with a customized dictionary, that we could pass text through to clean, and then pass onto classification nodes.

We may have resources to build some of these nodes ourselves once we come up to speed with the node building process.

Thanks and regards

Iqbal

Iqbal wrote:
For reporting, how easy would it be to integrate BIRT?

We already have a BIRT plugin :)
http://www.knime.org/download_extensions.html#birt

Hi Iqbal,

well, this goes far beyond the scope of what is currently available.

Regarding the views, you may also use R (I guess it has some of the requested plots available). And as Thorsten already mentioned, the BIRT plug-in is provided on the extension page: http://www.knime.org/download_extensions.html

A text mining plug-in is currently under development. But for the task you describe a StringReplacer(Dictionary) may be sufficient.

GIS/geo/spatial data is not yet supported in KNIME at all.

Anyway, in order to speed up your development process: we also offer training, support, and commissioned or even in-house node development. For more information see http://www.knime.org/service.html

Best Regards,

Fabian

Iqbal wrote:
Some additional comments:

We would like to be able to choose a subset of rows that meet multiple criteria. We have used SQL statements (driven by pick lists) and would like to see something similar, something like the "autofilter" function in MS Excel.

...........

Iqbal

As a quick fix for now you can use the Column Combiner node to put together the columns you would like to filter on and then use the Row Filter node to filter for the specified multiple column condition. The only limitation here is you can only do one condition at a time unless regular expressions give you the flexibility you require.
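
To illustrate the idea behind this workaround outside of KNIME, here is a rough Python sketch: the selected columns are joined into one string per row (roughly what a combined column would hold), and a single regular expression then expresses the multi-column condition. This is not KNIME node configuration; the sample rows, column names, delimiter and function names are all made up for illustration.

```python
import re

# Hypothetical sample rows; the column names are invented for this sketch.
rows = [
    {"region": "North", "status": "open", "year": 2007},
    {"region": "North", "status": "closed", "year": 2008},
    {"region": "South", "status": "open", "year": 2008},
]


def combine(row, columns, delimiter="|"):
    """Join the chosen column values into one string, like a combined column."""
    return delimiter.join(str(row[c]) for c in columns)


def filter_rows(rows, columns, pattern, delimiter="|"):
    """Keep rows whose combined string fully matches the regular expression."""
    regex = re.compile(pattern)
    return [r for r in rows if regex.fullmatch(combine(r, columns, delimiter))]


# One pattern covers both columns: region is North AND status is open.
# The delimiter is escaped because "|" is special in regular expressions.
north_open = filter_rows(rows, ["region", "status"], r"North\|open")

# Alternation gives some extra flexibility: North OR South, status open.
any_open = filter_rows(rows, ["region", "status"], r"(North|South)\|open")
```

The delimiter should be a character that cannot occur in the data itself, otherwise the pattern may match across column boundaries.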

Thanks for all your comments and suggestions.

We will look more closely at R to fill in the needed functions (spatial, PCA, etc.) and are quite interested in the node development workshops (please send me relevant contacts/information).

Iqbal

(We are also exploring the functionality of SPSS Clementine.)

Hi Iqbal,

we haven't planned a workshop in the near future, but if you are interested in
training (in-house or here in Konstanz), please contact me via private message
to clarify the details.

Best,

Fabian

Thanks for your help. I'll connect with you for the training details.
Are there examples of commercial developments in KNIME that you could refer me to?

Iqbal

As far as I know there are currently three commercial extensions available:

Best,

Fabian


If you read the KNIME Partners page http://www.knime.org/partners.html you'll discover that the THINK software is also available for use in KNIME :) See http://www.treweren.com

Quote:
1) The "help contents" yields an HTTP 500 Internal Server Error

FYI - access to help works fine if I disconnect our internet connection.

How do I install the text mining/processing plug-in? I can't find a way to do it.

See http://labs.knime.org/installation