This course focuses on how to use KNIME Analytics Platform for data engineering and how to apply best practices when building data processing pipelines.
Learn the concepts behind connecting to multiple data sources, methods for data anonymization, and advanced database topics. Get an introduction to the Apache Hadoop ecosystem and find out how to handle big data with the Apache Spark integration. Finally, learn how to build and orchestrate modular workflows.
Put your knowledge into practice with hands-on exercises to build and orchestrate two applications: first, extract, validate, transform, blend, anonymize, and load customer data into a database; second, use Spark to access website usage data, impute missing values, and aggregate the results.
This is an instructor-led course consisting of four 75-minute online sessions run by one of our KNIME data scientists. Each session includes an exercise for you to complete at home, and we will go through the solution together at the start of the following session. The course concludes with a 15- to 30-minute wrap-up session.
- Session 1: Introduction & technical setup, ETL, connectors & data access
- Session 2: ETL, data anonymization, databases
- Session 3: ELT, big data, Hadoop, Spark
- Session 4: Cloud and big data connectivity, orchestration
- Session 5: Q&A