Section 1 - Batch Processing with Databricks and Data Factory on Azure
One of the primary benefits of Azure Databricks is its ability to integrate with many other data environments to pull data through an ETL or ELT process. In this module, we examine each of the extract, load, and transform phases to learn how Azure Databricks can ease the move to a cloud solution.
Section 2 - Creating Pipelines and Activities
Processing big data in real time is now an operational necessity for many businesses. Azure Stream Analytics is Microsoft's serverless real-time analytics offering for complex event processing. In this section, we examine how customers unlock valuable insights and gain a competitive advantage by harnessing the power of big data.
Section 3 - Link Services and Datasets
A data factory can have one or more pipelines. A pipeline is a logical grouping of activities that together perform a task. The activities in a pipeline define actions to perform on your data. Before you create a dataset, you must create a linked service to link your data store to the data factory. This section deals with linked services and datasets within Azure Blob Storage.
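To make the relationship concrete, the sketch below models the JSON shape of an Azure Blob Storage linked service and a dataset that references it, written as Python dicts. The resource names and the placeholder connection string are illustrative assumptions, not values from a real factory.

```python
# Sketch: a linked service points the data factory at a data store;
# a dataset then describes the data inside that store and must
# reference the linked service by name.
linked_service = {
    "name": "AzureBlobStorageLinkedService",  # hypothetical name
    "properties": {
        "type": "AzureBlobStorage",
        "typeProperties": {
            # Placeholder; a real definition holds the storage account's
            # connection string or a reference to a Key Vault secret.
            "connectionString": "DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<key>"
        },
    },
}

dataset = {
    "name": "InputDataset",  # hypothetical name
    "properties": {
        "type": "DelimitedText",
        # Every dataset points back at a linked service, which is why
        # the linked service must exist before the dataset is created.
        "linkedServiceName": {
            "referenceName": linked_service["name"],
            "type": "LinkedServiceReference",
        },
        "typeProperties": {
            "location": {
                "type": "AzureBlobStorageLocation",
                "container": "input",
                "fileName": "data.csv",
            }
        },
    },
}
```

The key point is the `linkedServiceName` reference: the dataset never holds credentials itself, only a pointer to the linked service that does.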
Section 4 - Schedules and Triggers
Azure Data Factory is a fully managed, cloud-based data orchestration service that enables data movement and transformation. In this section, we explore schedule triggers in Azure Data Factory to automate pipeline execution.
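A schedule trigger fires a pipeline on a wall-clock recurrence (for example, every 15 minutes from a start time). As a minimal sketch of that behavior, the function below computes the fire times such a recurrence would produce; the function name and parameters are assumptions for illustration, not Data Factory API.

```python
from datetime import datetime, timedelta

def next_runs(start, interval_minutes, count):
    """Compute the first `count` fire times of a simple recurrence,
    mimicking what a schedule trigger does for a pipeline run."""
    return [start + timedelta(minutes=interval_minutes * i) for i in range(count)]

# A trigger starting at midnight with a 15-minute interval fires at
# 00:00, 00:15, 00:30, 00:45, ...
runs = next_runs(datetime(2024, 1, 1, 0, 0), 15, 4)
```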
Section 5 - Selecting Windowing Functions
In time-streaming scenarios, performing operations on the data contained in temporal windows is a common pattern. Stream Analytics has native support for windowing functions, enabling developers to author complex stream processing jobs with minimal effort. In this section, we study windowing functions related to in-stream analytics.
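To illustrate the simplest of these, a tumbling window partitions the timeline into fixed, non-overlapping intervals and aggregates the events that fall into each one. The sketch below does that count in plain Python over in-memory timestamps; in a real Stream Analytics job the equivalent would be a `GROUP BY` over a tumbling window in the query language, not client-side code.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def tumbling_window_counts(events, window_seconds, origin):
    """Count events per fixed, non-overlapping (tumbling) window of
    `window_seconds`, with windows aligned to `origin`."""
    counts = defaultdict(int)
    for ts in events:
        # Floor each timestamp to the start of the window it falls in.
        offset = int((ts - origin).total_seconds())
        window_start = origin + timedelta(seconds=offset - offset % window_seconds)
        counts[window_start] += 1
    return dict(counts)

# Three events; with 10-second windows, two land in [00:00, 00:10)
# and one lands in [00:10, 00:20).
events = [
    datetime(2024, 1, 1, 0, 0, 2),
    datetime(2024, 1, 1, 0, 0, 7),
    datetime(2024, 1, 1, 0, 0, 12),
]
counts = tumbling_window_counts(events, 10, origin=datetime(2024, 1, 1))
```

Hopping and sliding windows follow the same idea but allow windows to overlap, so a single event can contribute to more than one window.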
Section 6 - Configuring Input and Output for Streaming Data Solutions
This section teaches you how to analyze phone call data using Azure Stream Analytics. The phone call data, generated by a client application, contains some fraudulent calls, which will be filtered out by the Stream Analytics job.
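One common fraud signal is the same caller appearing on two different switches within a few seconds of each other. As a simplified, in-memory sketch of the filter (a Stream Analytics job would express this as a self-join in its query language), the function below pairs up such calls; the field names `imsi`, `switch`, and `time` and the sample records are illustrative assumptions.

```python
from datetime import datetime

def flag_fraud(calls, window_seconds=5):
    """Return pairs of calls where the same caller (imsi) appears on two
    different switches within `window_seconds` of each other."""
    flagged = []
    for i, a in enumerate(calls):
        for b in calls[i + 1:]:
            same_caller = a["imsi"] == b["imsi"]
            different_switch = a["switch"] != b["switch"]
            close_in_time = abs((a["time"] - b["time"]).total_seconds()) <= window_seconds
            if same_caller and different_switch and close_in_time:
                flagged.append((a, b))
    return flagged

# The first two calls share a caller but come from different switches
# three seconds apart, so they are flagged as suspect.
calls = [
    {"imsi": "466921234567890", "switch": "US", "time": datetime(2024, 1, 1, 0, 0, 0)},
    {"imsi": "466921234567890", "switch": "UK", "time": datetime(2024, 1, 1, 0, 0, 3)},
    {"imsi": "466923000000000", "switch": "US", "time": datetime(2024, 1, 1, 0, 0, 4)},
]
suspect = flag_fraud(calls)
```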
Section 7 - ELT versus ETL with PolyBase
Traditional SMP data warehouses use an Extract, Transform, and Load (ETL) process for loading data. Azure SQL Data Warehouse uses a massively parallel processing (MPP) architecture that takes advantage of the scalability and flexibility of compute and storage resources. An Extract, Load, and Transform (ELT) process can exploit MPP and eliminate the resources needed to transform the data prior to loading.
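The difference between the two flows can be sketched in a few lines. Below, `sqlite3` stands in for the warehouse engine purely for illustration (in Azure SQL Data Warehouse the load step would use PolyBase and the transform would run on the MPP compute); the table names and sample rows are assumptions.

```python
import sqlite3

# Raw source rows: names untitled, scores stored as strings.
raw_rows = [("alice", "42"), ("bob", "17")]

# --- ETL: transform in the pipeline first, then load the result ---
etl_db = sqlite3.connect(":memory:")
etl_db.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
transformed = [(name.title(), int(score)) for name, score in raw_rows]  # transform
etl_db.executemany("INSERT INTO scores VALUES (?, ?)", transformed)     # then load

# --- ELT: load the raw data as-is, then transform inside the engine ---
elt_db = sqlite3.connect(":memory:")
elt_db.execute("CREATE TABLE staging (name TEXT, score TEXT)")
elt_db.executemany("INSERT INTO staging VALUES (?, ?)", raw_rows)       # load first
elt_db.execute(
    # The transform runs where the scalable compute lives.
    "CREATE TABLE scores AS "
    "SELECT upper(substr(name, 1, 1)) || substr(name, 2) AS name, "
    "CAST(score AS INTEGER) AS score FROM staging"
)

# Both flows end with the same warehouse table.
etl_total = etl_db.execute("SELECT sum(score) FROM scores").fetchone()[0]
elt_total = elt_db.execute("SELECT sum(score) FROM scores").fetchone()[0]
```

In the ELT flow the pipeline does no row-level work of its own, which is what lets an MPP warehouse absorb the transform instead of a separate transformation tier.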