Azure Data Integration Pipeline Training

Advanced Design & Delivery (A Deep Dive)

2 Days

Overview

In this course we'll quickly cover the fundamentals of data integration pipelines before going much deeper into our Azure resources.

Within a typical Azure data platform solution for any enterprise-grade data analytics or data science workload, an umbrella resource is needed to trigger, monitor, and handle the control flow for transforming datasets, with the goal being actionable data insight. Those requirements are met by deploying Azure Data Integration pipelines, delivered using Azure Synapse Analytics or Azure Data Factory. In this session, I'll show you how to create rich dynamic data pipelines and apply these orchestration resources in production, using scaled architecture design patterns, best practice, data mesh principles, and the latest metadata-driven frameworks. We will take a deep dive into the services, considering how to build custom activities and complex pipelines, and think about hierarchical design patterns for enterprise-grade deployments. All this and more comes in a complete set of 12 modules, based on real-world experience, that take you through how to implement data integration pipelines in production and deliver advanced orchestration patterns.

If that's not enough learning for you, a set of hands-on labs will also be made available that you can work through at your own pace. Whether you are new to Azure Data Integration Pipelines or have some experience, you will leave this course with new skills, ideas, and a much deeper understanding of the resources for your future data platform projects.

Agenda

The following offers an insight into the complete agenda and module breakdown for this course.

Day 1

Module 1: Pipeline Fundamentals
- The History of Azure Orchestration
- Synapse Analytics vs Data Factory
- Integration Components
- Common Activities
- Execution Dependencies

Module 2: Integration Runtime Design Patterns
- Compute Types
  - Azure
  - Self-Hosted
  - SSIS
- Patterns & Configuration

Module 3: Data Transformation
- Data Flows
- Power Query Injection
- Spark Configuration
- Use Cases

Module 4: Dynamic Pipelines
- Expressions & Interpolation
- Simple Metadata Driven Execution
- Dynamic Content Chains
- Reference Names

Module 5: Pipeline Extensibility
- Azure Batch Service
  - Tasks
  - Compute Pools
  - Scaling
- Pipeline Custom Activities
- Azure Management API
- Azure Functions

Module 6: Execution Parallelism
- Control Flow Scale Out
- Concurrency Limitations
- Internal vs External Activities
- Orchestration Framework - procfwk.com

Day 2

An Architect's Recap: Modules 1 to 6
- Design
- Extract
- Transform
- Load

Module 7: VNet Integration
- Private Endpoints
- Managed VNets
- Firewall Bypass

Module 8: Security
- Service Principals
- Managed Identities
- Azure Key Vault Integration
- Customer Managed Keys
- Pipeline Access & Permissions

Module 9: Monitoring & Alerting
- Studio Monitoring
- Log Analytics & Kusto Queries
- Operational Dashboards
- Advanced Alerting

Module 10: Solution Testing
- Development Time Validation
- Test Coverage
- NUnit Tests

Module 11: CI/CD
- Source Control vs Developer UI
- Basic ARM Template Deployments
- Advanced Deployment Patterns

Module 12: Final Thoughts
- Running Costs
- Conclusions
- Best Practices
