
Azure Data Integration Pipelines

Advanced Design & Delivery (A Deep Dive)

with Paul Andrew

Slide Header

Hey friends and welcome to this training course on Azure Data Integration Pipelines - Advanced Design & Delivery (A Deep Dive). Over the next two days we will all be becoming advanced pipeline workers! And we completely recommend this description when describing your job to non-technical family members. But be warned, if you go on to tell them that the factory of pipes is in the cloud for orchestration and integration, you are likely to be branded as crazy. However, here and now that is ok. You are amongst like-minded geeky friends who all want to become cloud pipeline workers, doing data plumbing with data pipes as well :-)

On a more serious note, throughout our training you will quickly notice that, as with most technologies, there are an awful lot of different ways you can implement Azure orchestration services, and understanding the best way to do something is often the biggest challenge. That said, if you only take away one thing from this training, I will ask that you have an appreciation of this fact: it depends! Then, when delivering solutions, take a step back from the requirements and think about the overall technical design and how Azure Integration Pipelines should fit into your platform as a core component.

All too often with new and shiny services we start playing around and then try to make the technology fit our solution, rather than thinking about the solution requirements and which technology meets our needs. This is true of all developers. I don't want to preach, so I am simply asking that we take a growth mindset: think about the outputs and the requirements as the goal.


Course Overview

In this course we'll quickly cover the fundamentals of data integration pipelines before going much deeper into our Azure resources.

Within a typical Azure data platform solution for any enterprise grade data analytics or data science workload, an umbrella resource is needed to trigger, monitor, and handle the control flow for transforming datasets, with the goal being actionable data insight. Those requirements are met by deploying Azure Data Integration pipelines, delivered using Azure Synapse Analytics or Azure Data Factory. In this session, I'll show you how to create rich dynamic data pipelines and apply these orchestration resources in production, using scaled architecture design patterns, best practice, data mesh principles, and the latest metadata driven frameworks. We will take a deep dive into the services, considering how to build custom activities and complex pipelines, and think about hierarchical design patterns for enterprise grade deployments. All this and more in a complete set of 12 modules (based on real world experience), where we will take you through how to implement data integration pipelines in production and deliver advanced orchestration patterns.

If that's not enough learning for you, a set of hands-on labs will also be made available that you can work through at your own pace. Whether you are new to Azure Data Integration Pipelines or have some experience, you will leave this course with new skills, ideas, and a much deeper understanding of the resources for your future data platform projects.


Course Prerequisites

If you've never used Azure Data Integration Pipelines before in either Azure Data Factory or Azure Synapse Analytics, that's ok! However, please watch my complete one-hour introduction session, available via my blog and on YouTube and recorded as part of a recent community MeetUp. Link below:

https://mrpaulandrew.com/2021/08/23/an-introduction-to-azure-data-integration-pipelines/


Course Agenda

The following offers an insight into the complete agenda and module breakdown for this course.

  • Module 1: Pipeline Fundamentals

    • The History of Azure Orchestration
    • Synapse Analytics vs Data Factory
    • Integration Components
    • Common Activities
    • Execution Dependencies
  • Module 2: Integration Runtime Design Patterns

    • Compute Types
      • Azure
      • Hosted
      • SSIS
    • Patterns & Configuration
  • Module 3: Data Transformation

    • Data Flows
    • Power Query Injection
    • Spark Configuration
    • Use Cases
  • Module 4: Dynamic Pipelines

    • Expressions & Interpolation
    • Simple Metadata Driven Execution
    • Dynamic Content Chains
    • Reference Names
  • Module 5: Pipeline Extensibility

    • Azure Batch Service
      • Tasks
      • Compute Pools
      • Scaling
    • Pipeline Custom Activities
    • Azure Management API
    • Azure Functions
  • Module 6: Execution Parallelism

    • Control Flow Scale Out
    • Concurrency Limitations
    • Internal vs External Activities
    • Orchestration Framework - procfwk.com
  • Module 7: VNet Integration

    • Private Endpoints
    • Managed VNets
    • Firewall Bypass
  • Module 8: Security

    • Managed Identities vs Service Principals
    • Azure Key Vault Backing
    • Pipeline Access & Permissions
  • Module 9: Monitoring & Alerting

    • Portal Monitoring
    • Log Analytics & Kusto Queries
    • Operational Dashboards
    • Advanced Alerting
  • Module 10: Solution Testing

    • Development Time Validation
    • Test Coverage
    • NUnit Tests
  • Module 11: CI/CD

    • Source Control vs Developer UI
    • Basic ARM Template Deployments
    • Advanced Deployment Patterns
  • Module 12: Final Thoughts

    • Running Costs
    • Conclusions
    • Best Practices

Speaker Biography

Paul Andrew is a Microsoft Data Platform MVP and Technical Architect within the Avanade Centre of Excellence team, with over 15 years' experience in the industry, working as an engineer and solution architect. Day-to-day Paul is accountable for delivering enterprise grade data insights to international organisations, where he wields the complete stack of Azure Data Platform resources. Paul leads delivery teams around the globe implementing the latest design patterns, creating architectural innovations, and defining best practice to ensure technical excellence for customers across a wide variety of industry verticals. Paul is passionate about technology, which is demonstrated in the community, where he speaks at events and shares the knowledge gained from real world experience through his blog. Paul maintains the view that his job is also his hobby, and he doesn't ever want to take his fingers off the developer's keyboard while maintaining a growth mindset.

Speaker Contact Details

Contact QR Code

mrpaulandrew.com/contact

