2022-04-21 12:36:52 +01:00

# Azure Data Integration Pipeline Training
## Advanced Design & Delivery (A Deep Dive)
![Slide Header](.\Agenda%20Header.png)
___
## Overview
In this course we'll quickly cover the fundamentals of data integration pipelines before going much deeper into our Azure resources.
In a typical Azure data platform solution, any enterprise-grade data analytics or data science workload needs an umbrella resource to trigger, monitor, and handle the control flow for transforming datasets, with the goal of producing actionable data insight. Those requirements are met by deploying Azure data integration pipelines, delivered using Azure Synapse Analytics or Azure Data Factory. In this session, I'll show you how to create rich, dynamic data pipelines and apply these orchestration resources in production, using scaled architecture design patterns, best practice, data mesh principles, and the latest metadata-driven frameworks. We will take a deep dive into the services, considering how to build custom activities and complex pipelines, and think about hierarchical design patterns for enterprise-grade deployments. Across a complete set of 12 modules, based on real-world experience, we will take you through how to implement data integration pipelines in production and deliver advanced orchestration patterns.
If that's not enough learning for you, a set of hands-on labs will also be made available that you can work through at your own pace. Whether you are new to Azure Data Integration Pipelines or have some experience, you will leave this course with new skills, ideas, and a much deeper understanding of the resources for your future data platform projects.
___
## Agenda
The following offers an insight into the complete agenda and module breakdown for this course.
* __Module 1:__ [Pipeline Fundamentals]()
    * The History of Azure Orchestration
    * Synapse Analytics vs Data Factory
    * Integration Components
    * Common Activities
    * Execution Dependencies
___
* __Module 2:__ [Integration Runtime Design Patterns]()
    * Compute Types
        * Azure
        * Self-Hosted
        * SSIS
    * Patterns & Configuration
___
* __Module 3:__ [Data Transformation]()
    * Data Flows
    * Power Query Injection
    * Spark Configuration
    * Use Cases
___
* __Module 4:__ [Dynamic Pipelines]()
    * Expressions & Interpolation
    * Simple Metadata Driven Execution
    * Dynamic Content Chains
    * Reference Names
___
* __Module 5:__ [Pipeline Extensibility]()
    * Azure Batch Service
        * Tasks
        * Compute Pools
        * Scaling
    * Pipeline Custom Activities
    * Azure Management API
    * Azure Functions
___
* __Module 6:__ [Execution Parallelism]()
    * Control Flow Scale Out
    * Concurrency Limitations
    * Internal vs External Activities
    * Orchestration Framework - [procfwk.com](http://procfwk.com/)
___
* __An Architect's Recap:__ [Modules 1 to 6]()
    * Design
    * Extract
    * Transform
    * Load
___
* __Module 7:__ [VNet Integration]()
    * Private Endpoints
    * Managed VNets
    * Firewall Bypass
___
* __Module 8:__ [Security]()
    * Service Principals
    * Managed Identities
    * Azure Key Vault Integration
    * Customer Managed Keys
    * Pipeline Access & Permissions
___
* __Module 9:__ [Monitoring & Alerting]()
    * Studio Monitoring
    * Log Analytics & Kusto Queries
    * Operational Dashboards
    * Advanced Alerting
___
* __Module 10:__ [Solution Testing]()
    * Development Time Validation
    * Test Coverage
    * NUnit Tests
___
* __Module 11:__ [CI/CD]()
    * Source Control vs Developer UI
    * Basic ARM Template Deployments
    * Advanced Deployment Patterns
___
* __Module 12:__ [Final Thoughts]()
    * Running Costs
    * Conclusions
    * Best Practices
___