Hi everyone, I am trying to understand what the main use cases for KNIME are. What problems does it help solve? What can it do that other tools can’t? Are there alternatives?
The reason I am asking: I was asked to evaluate whether my company needs KNIME (someone from leadership heard of the tool at a conference). Hence I am here collecting facts.
So far, after a little research, my understanding is as follows. KNIME is:
Business-user / non-developer friendly ETL tool. Alternatives would be enterprise ETL tools like SSIS, Talend, IBM DataStage, or the data prep built into Tableau or Power BI (provided you don’t need to write data back to DBs).
Business-user friendly statistical modelling tool for work that would otherwise require decent R/Python/SQL experience. Alternatives would be SAS, MATLAB, R/Python IDEs.
Environment to run ML jobs (from data prep to model training). Alternatives would be Databricks, R/Python IDEs, Jupyter notebooks.
Viz and data analysis tool. Alternatives would be Power BI, Tableau, etc.
Have I missed a functionality/use case?
An alternative to KNIME itself would be Alteryx. Any other close contenders?
It is difficult to state in a few lines what KNIME is and can do. I think you have mentioned some important functions that KNIME can provide. And there is more.
At its base it is a data analytics platform with a graphical workflow interface. "Platform" means that it is not restricted to a certain set of functions but will happily connect to other programs, databases and languages/systems - namely R and Python.
For me it is mostly about the scalability and the philosophy behind KNIME. It is fairly approachable for basic users migrating from Excel tasks. You can take it from there and use the same workflow logic and interface to move on to more advanced tasks, up to steering the Big Data jobs of major companies.
I know that very advanced people sometimes prefer their Python code. I would always say: why choose when you can integrate great Python functions and libraries within a nice workflow that offers clean logical structuring and documentation?
Maintaining a Python environment can bring its own set of problems. My question is always: is this function so brilliantly unique that we have to use some very special Python library? (If yes, let’s integrate that as cleanly as possible into KNIME.) That is not to say you should not use Python (quite the opposite) - but why would you have to choose when you can have both?
Of course, if you are Google/Amazon and have a team of 10,000 highly trained software engineers on standby and an unlimited supply of servers, you might opt for building it all from scratch with the Apache stack.
Major alternatives would be the likes of RapidMiner (which is limited in the free version) and Alteryx (not free). Alteryx builds on R underneath (at least for its predictive tools) and has a more, let’s say, guided surface that gives first-time users stronger guidance.
Other programs that are somewhat similar (and that I have some experience with) are Palantir/Foundry, SAS Enterprise Guide/Miner (JMP has some nice functions) and IBM SPSS Modeler. They typically come with a different set of challenges regarding integration in your company and a very different price tag (even considering KNIME’s commercial server) - and in my experience they are also not as open to outside tools and connections as KNIME is.
As a company you would have to think about your needs, the number of people you want to equip/empower to use the software, training costs, re-use of code samples (think KNIME Hub, metanodes and components) - and the cost of all that. Here, in my opinion, KNIME offers an outstanding bundle that is not matched by any of the other platforms (which is not to say that they are bad tools).
To get additional ideas of what KNIME is up to, you might have a look at the white paper section, which gives you an insight into more advanced use cases: https://www.knime.com/white-papers
Then I would encourage you to think about the citizen data scientist approach within a company - building up broad know-how and a vibrant community - with reusable workflows and components:
Five Takeaways from the First KNIME Meetup@Siemens
Empower Your Own Experts! Continental Wins the Digital Leader Award
Welcome to the KNIME forum.
First of all, you need to analyse your company’s specific tasks and its main task.
Say, if you want professional-grade ETL, KNIME will be difficult to use, but for simple ETL it is OK.
Also, in your research you omitted text analytics. For example, I do not analyze texts, but I use fuzzy record matching built on what the text analytics nodes provide.
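To give an idea of what fuzzy record matching means here, below is a minimal sketch in plain Python. It is not how the KNIME nodes work internally; it swaps in the standard library's difflib as a stand-in similarity measure, and the function names, the 0.8 threshold and the sample company names are all made up for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity in [0, 1] between two strings, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def fuzzy_join(left, right, threshold=0.8):
    """Pair each left value with its most similar right value.

    Pairs scoring below `threshold` (an assumed cut-off) are dropped.
    """
    matches = []
    for l_value in left:
        best = max(right, key=lambda r_value: similarity(l_value, r_value))
        score = similarity(l_value, best)
        if score >= threshold:
            matches.append((l_value, best, round(score, 2)))
    return matches


# Hypothetical sample data: the same companies spelled differently in two systems.
crm_names = ["Initech GmbH", "Umbrella A.G.", "Acme Corp"]
erp_names = ["initech gmbh", "Umbrella AG", "Globex Inc"]

matches = fuzzy_join(crm_names, erp_names)
for crm, erp, score in matches:
    print(f"{crm!r} ~ {erp!r} (score {score})")
```

In this sketch "Acme Corp" finds no partner above the threshold and is simply left out, which is the kind of decision you would tune per use case.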
Some features are available or effective only with KNIME Server, and so on. I really recommend that you clarify the main and optional tasks you plan to use KNIME for.
You can use KNIME as a basic ETL tool, but it is no replacement for Talend and the like. Talend, for example, can read database changes directly from the system WAL (write-ahead log, a low-level thing in databases that is usually not user-accessible, and certainly not via KNIME). This means a clean ETL solution. With KNIME you can only work with version and date columns or triggers to set up an ETL solution. This works just fine for some use cases, not for others.
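To illustrate the date-column approach mentioned above, here is a minimal sketch of an incremental extract driven by a "last modified" watermark. It uses Python's built-in sqlite3 with an in-memory database purely for demonstration; the table name, the column names and the extract_changes helper are assumptions, not anything KNIME or Talend ships.

```python
import sqlite3


def extract_changes(conn, last_seen):
    """Return rows modified after `last_seen`, plus the new watermark.

    Assumes a hypothetical `customers` table whose `updated_at` column holds
    ISO-8601 timestamps, so plain string comparison orders them correctly.
    """
    rows = conn.execute(
        "SELECT id, name, updated_at FROM customers "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    ).fetchall()
    # If nothing changed, keep the old watermark.
    new_watermark = rows[-1][2] if rows else last_seen
    return rows, new_watermark


# Demo data in an in-memory database (all values are made up).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        (1, "Alice", "2024-01-01T10:00:00"),
        (2, "Bob", "2024-01-02T09:30:00"),
        (3, "Carol", "2024-01-03T14:15:00"),
    ],
)

# Only rows changed since the previous run are picked up.
rows, watermark = extract_changes(conn, "2024-01-01T12:00:00")
print(rows)
print(watermark)
```

One limitation of this pattern is that rows deleted from the source never show up in the query, since there is no row left to carry a timestamp. That is one reason log-based change capture is considered the cleaner solution.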