From the course: Azure Spark Databricks Essential Training
Optimize data pipelines
- [Lynn] Have you been working with data that's growing in volume and complexity, and wondering how you're going to compute against it? We'll be taking a look at managed Apache Spark clusters on Azure Databricks. We'll look at cluster setup, different types of notebooks, and a number of data workflows. These notebooks will cover common data processing scenarios such as Spark SQL and visualization, machine learning scenarios with Spark ML, and third-party libraries such as TensorFlow and scikit-learn. We'll also look at data pipelining and architectural patterns. I'm Lynn Langit. We have lots to cover, so let's get started.