Azure Data Integration Pipeline Training
Advanced Design & Delivery (A Deep Dive)
2 Day
Overview
In this course we'll quickly cover the fundamentals of data integration pipelines before going much deeper into our Azure resources.
In a typical Azure data platform solution for any enterprise-grade data analytics or data science workload, an umbrella resource is needed to trigger, monitor, and handle the control flow for transforming datasets, with the goal of producing actionable data insight. Those requirements are met by deploying Azure Data Integration pipelines, delivered using Azure Synapse Analytics or Azure Data Factory. In this session, I'll show you how to create rich, dynamic data pipelines and apply these orchestration resources in production using scaled architecture design patterns, best practices, data mesh principles, and the latest metadata-driven frameworks. We will take a deep dive into the services, considering how to build custom activities and complex pipelines, and think about hierarchical design patterns for enterprise-grade deployments. All this and more comes in a complete set of 12 modules, based on real-world experience, that take you through how to implement data integration pipelines in production and deliver advanced orchestration patterns.
If that's not enough learning for you, a set of hands-on labs will also be made available that you can work through at your own pace. Whether you are new to Azure Data Integration Pipelines or have some experience, you will leave this course with new skills, ideas, and a much deeper understanding of the resources for your future data platform projects.
Agenda
The following offers an insight into the complete agenda and module breakdown for this course.
Day 1
- Module 1: Pipeline Fundamentals
- The History of Azure Orchestration
- Synapse Analytics vs Data Factory
- Integration Components
- Common Activities
- Execution Dependencies
 
- Module 2: Integration Runtime Design Patterns
- Compute Types
  - Azure
  - Hosted
  - SSIS
- Patterns & Configuration
 
- Module 3: Data Transformation
- Data Flows
- Power Query Injection
- Spark Configuration
- Use Cases
 
- Module 4: Dynamic Pipelines
- Expressions & Interpolation
- Simple Metadata Driven Execution
- Dynamic Content Chains
- Reference Names
 
- Module 5: Pipeline Extensibility
- Azure Batch Service
  - Tasks
  - Compute Pools
  - Scaling
- Pipeline Custom Activities
- Azure Management API
- Azure Functions
 
- Module 6: Execution Parallelism
- Control Flow Scale Out
- Concurrency Limitations
- Internal vs External Activities
- Orchestration Framework - procfwk.com
 
Day 2
- An Architect's Recap: Modules 1 to 6
- Design
- Extract
- Transform
- Load
 
- Module 7: VNet Integration
- Private Endpoints
- Managed VNets
- Firewall Bypass
 
- Module 8: Security
- Service Principals
- Managed Identities
- Azure Key Vault Integration
- Customer Managed Keys
- Pipeline Access & Permissions
 
- Module 9: Monitoring & Alerting
- Studio Monitoring
- Log Analytics & Kusto Queries
- Operational Dashboards
- Advanced Alerting
 
- Module 10: Solution Testing
- Development Time Validation
- Test Coverage
- NUnit Tests
 
- Module 11: CI/CD
- Source Control vs Developer UI
- Basic ARM Template Deployments
- Advanced Deployment Patterns
 
- Module 12: Final Thoughts
- Running Costs
- Conclusions
- Best Practices
 
