School of Hive - with KNIME's local Big Data environment (SQL for Big Data)
Demonstrates a collection of Hive functions using KNIME's local Big Data environment, including creating table structures from scratch and from an existing file, and working with partitions.
Partitions are an essential organizing principle of Big Data systems. They make it easier to store and handle big data tables.
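As a small sketch of what this looks like in Hive SQL (the table and column names here are hypothetical, not taken from the workflow), a partitioned table can be created and filled like this:

```sql
-- Create a table partitioned by year. The partition column is declared
-- separately from the regular columns and becomes part of the storage path.
CREATE TABLE IF NOT EXISTS sales (
  product_id INT,
  amount     DOUBLE
)
PARTITIONED BY (sales_year INT)
STORED AS PARQUET;

-- Insert rows into a specific partition. Hive stores them under a
-- directory such as .../sales/sales_year=2023/
INSERT INTO TABLE sales PARTITION (sales_year = 2023)
VALUES (1, 19.99), (2, 5.50);
```

Because each partition lives in its own directory, queries that filter on the partition column only have to read the matching directories instead of scanning the whole table.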
All examples are fully functional. You can swap out the local big data environment for your own.
This example focuses on Hive SQL scripts run in executor nodes. Similar effects could be achieved with KNIME's DB nodes.
This is a companion discussion topic for the original entry at https://kni.me/w/1q-JwD0cwmuEkWpy
I tried to accomplish two things in one workflow:
- demonstrate some basic concepts and (SQL) code for dealing with Big Data environments, in this case Hive tables, with an emphasis on partitions, and show how to inspect the data structures created (both play important roles once you start using big data tables for real)
- do it in a KNIME environment and workflow, since that is a very quick way to get your hands on a big data environment. It also demonstrates that you can easily use KNIME to create and execute HQL/SQL code on big data environments with generic code (or with the built-in DB nodes)
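For inspecting the structures you created, Hive offers a few built-in statements. A sketch, assuming a hypothetical partitioned table named `sales` with a partition column `sales_year`:

```sql
-- List all partitions of the table.
SHOW PARTITIONS sales;

-- Show detailed metadata: storage location, file format, partition columns.
DESCRIBE FORMATTED sales;

-- A filter on the partition column lets Hive prune directories
-- instead of scanning the whole table.
SELECT product_id, SUM(amount)
FROM sales
WHERE sales_year = 2023
GROUP BY product_id;
```

In KNIME, statements like these can be sent from a workflow via the generic DB SQL nodes, so the inspection becomes part of the same workflow that builds the tables.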
So with KNIME, once again, you do not have to choose. You can have it all, from your humble desktop machine to some really huge enterprise big data environments …