LEARNING TRACK

Data Engineering Track

Advanced
Online
Self-paced

Learn to build scalable data pipelines and manage cloud-based data systems using tools like SQL, Python, Spark, AWS, and dbt for modern data workflows.

About the Track

The Data Engineering Track is designed to equip participants with the practical skills and technical expertise required to build and maintain robust data solutions. This program covers a wide spectrum of topics, including data infrastructure, big data processing, cloud-based tools, and modern storage architectures. Students will gain hands-on experience with industry-standard technologies such as SQL, Python, Spark, AWS, and dbt, enabling them to manage complex data engineering workflows.

With a strong emphasis on real-world applications, the track prepares participants to design, implement, and optimize data pipelines, manage large-scale storage systems, and leverage cloud services to support enterprise-level data operations. Graduates will emerge with the confidence to address modern data challenges across various industries.

Voted top data skills program by students and partners

Courses in this track

Intermediate

By the end of this course, participants will be able to:

  • Define the role and core responsibilities of a data engineer and distinguish it from related roles.
  • Describe the end-to-end lifecycle of a data engineering project.
  • Identify key tools, systems, and architectural patterns used in data engineering workflows.
  • Understand the principles of scalable and maintainable data architecture.

Fundamental

By the end of this course, participants will be able to:

  • Grasp the core principles of relational databases and SQL syntax.
  • Effectively design, create, modify, and manage databases and tables.
  • Perform basic and advanced SQL queries for data retrieval and manipulation.
  • Utilize functions, aggregations, and conditional logic to conduct complex data analyses.
  • Apply window functions to uncover data trends and patterns (see the sketch after this list).
  • Implement SQL in real-world applications, such as marketing analytics & risk management.
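
To give a flavor of the window-function outcome above, here is a minimal sketch using Python's built-in sqlite3 module (SQLite 3.25+ supports window functions). The sales table and its columns are made up for illustration and are not course material:

```python
import sqlite3

# In-memory database with a small, made-up sales table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, month TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("EU", "2024-01", 100.0), ("EU", "2024-02", 150.0),
     ("US", "2024-01", 200.0), ("US", "2024-02", 120.0)],
)

# Window function: running total of revenue within each region.
query = """
SELECT region, month, revenue,
       SUM(revenue) OVER (
           PARTITION BY region ORDER BY month
       ) AS running_revenue
FROM sales
ORDER BY region, month
"""
for row in conn.execute(query):
    print(row)
```

The PARTITION BY clause restarts the running total for each region, the kind of trend analysis the course applies to marketing and risk data.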

Fundamental

By the end of this course, participants will be able to:

  • Comprehend Python syntax and foundational programming concepts.
  • Effectively use data structures and perform file operations.
  • Implement control flow statements to create logic-driven programs.
  • Design and use functions to promote code reusability and modular programming.
  • Apply object-oriented programming principles to design and implement classes.
  • Handle exceptions to build robust, error-resistant Python applications (a short sketch follows this list).
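
As a taste of the OOP and exception-handling outcomes, a minimal, self-contained sketch; the Account class and its custom exception are hypothetical examples, not course material:

```python
class NegativeAmountError(ValueError):
    """Raised when a transaction amount is negative."""

class Account:
    """Tiny example class: tracks a balance and rejects bad input."""

    def __init__(self, owner: str, balance: float = 0.0):
        self.owner = owner
        self.balance = balance

    def deposit(self, amount: float) -> float:
        if amount < 0:
            raise NegativeAmountError(f"cannot deposit {amount}")
        self.balance += amount
        return self.balance

try:
    acct = Account("ada")
    acct.deposit(50.0)
    acct.deposit(-10.0)  # triggers the custom exception
except NegativeAmountError as err:
    print(f"rejected: {err}")
```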

Advanced

By the end of this course, participants will be able to:

  • Explain the role of core infrastructure components, including data collection, storage, and processing, and understand cloud architecture principles.
  • Navigate and operate within Linux and use Docker for containerization to support scalable data engineering workflows (see the container sketch after this list).
  • Deploy and manage big data solutions on the Azure cloud platform, leveraging key services for processing and management.
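
The containerization outcome can be sketched with the third-party docker Python SDK, assuming Docker is installed and its daemon is running; the image and command are arbitrary illustrations:

```python
# Requires the third-party `docker` package (pip install docker)
# and a running Docker daemon; a sketch, not course material.
import docker

client = docker.from_env()

# Run a throwaway container and capture its output, the kind of
# isolated step a containerized pipeline stage might perform.
output = client.containers.run(
    "python:3.12-slim",
    ["python", "-c", "print('hello from a container')"],
    remove=True,  # clean up the container when it exits
)
print(output.decode())
```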

Intermediate

By the end of this course, participants will be able to:

  • Model scalable and efficient data warehouses using relational design principles for analytics.
  • Implement data transformations and cloud architecture best practices to optimize performance and maintainability.
  • Apply data governance principles, including quality, lineage, and access control, in modern warehousing environments.
  • Develop ELT/ETL pipelines with cloud-based tools such as Snowflake, Azure Data Factory, and dbt (a minimal ELT sketch follows this list).
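
The ELT pattern named above (load raw data first, then transform inside the warehouse) can be sketched with the standard library alone; here sqlite3 stands in for a cloud warehouse such as Snowflake, and the table names are hypothetical:

```python
import csv
import io
import sqlite3

# "Extract": raw CSV as it might arrive from a source system.
raw = io.StringIO("order_id,amount\n1,9.99\n2,24.50\n3,5.00\n")

# "Load": land the data untransformed in a raw table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_orders (order_id TEXT, amount TEXT)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?)",
    [(r["order_id"], r["amount"]) for r in csv.DictReader(raw)],
)

# "Transform": build a typed, aggregated model inside the warehouse,
# the step a dbt model would own in the course's toolchain.
conn.execute("""
CREATE TABLE orders_summary AS
SELECT COUNT(*) AS n_orders, SUM(CAST(amount AS REAL)) AS total
FROM raw_orders
""")
print(conn.execute("SELECT * FROM orders_summary").fetchone())
```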

Advanced

By the end of this course, participants will be able to:

  • Explain the principles of big data and distributed computing, including the role of Apache Spark in processing large-scale datasets.
  • Design and implement data lake and lakehouse architectures using tools such as Azure Data Lake Storage, Delta Lake, and open table formats.
  • Build scalable data processing workflows on Databricks, leveraging Spark for batch and real-time structured data (see the batch sketch after this list).
  • Integrate NoSQL databases and schema-on-read designs into modern data architectures to support unstructured and semi-structured data at scale.
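
For the batch-processing outcome, a minimal PySpark sketch; the input path, column names, and output location are hypothetical, and on Databricks the final write could target Delta Lake via .format("delta") instead of Parquet:

```python
# Requires pyspark (pip install pyspark); a sketch, not course material.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events-batch").getOrCreate()

# Read semi-structured JSON events (schema inferred on read).
events = spark.read.json("/data/raw/events/*.json")

# Aggregate events per user per day.
daily = (
    events
    .withColumn("day", F.to_date("timestamp"))
    .groupBy("user_id", "day")
    .agg(F.count("*").alias("n_events"))
)

# Write columnar output for downstream analytics.
daily.write.mode("overwrite").parquet("/data/curated/daily_events")
```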

Advanced

By the end of this course, participants will be able to:

  • Build and manage scalable pipelines with Azure Data Factory (see the sketch after this list).
  • Apply dynamic datasets, triggers, templates, and reusable design patterns.
  • Integrate ADF with Azure DevOps and Logic Apps for deployment and automation.
  • Orchestrate transformations in Synapse to deliver enterprise-ready solutions.
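
Triggering an ADF pipeline programmatically might look like the following sketch with the azure-mgmt-datafactory SDK; the subscription, resource group, factory, pipeline name, and parameter are all placeholders for resources that would already exist:

```python
# Requires azure-identity and azure-mgmt-datafactory; all resource
# names below are placeholders, and this is a sketch of one possible
# automation path, not the course's prescribed workflow.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

SUBSCRIPTION_ID = "<subscription-id>"   # placeholder
RESOURCE_GROUP = "rg-data"              # placeholder
FACTORY = "my-factory"                  # placeholder

adf = DataFactoryManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Kick off an existing pipeline, passing a runtime parameter.
run = adf.pipelines.create_run(
    RESOURCE_GROUP, FACTORY, "ingest_orders",
    parameters={"load_date": "2024-01-31"},
)

# Look up the run's status by its id.
status = adf.pipeline_runs.get(RESOURCE_GROUP, FACTORY, run.run_id).status
print(run.run_id, status)
```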

Advanced

By the end of this course, participants will be able to:

  • Describe the core concepts and use cases of NoSQL systems and demonstrate how to query and manage data using non-relational models.
  • Design and implement real-time streaming pipelines using Kafka and Spark Streaming for processing and analyzing live data (see the streaming sketch after this list).
  • Develop scalable lakehouse solutions using Delta Lake on Databricks, integrating the benefits of data lakes and data warehouses.
  • Apply tools like ElasticSearch and Databricks to support high-performance, enterprise-level data workflows.
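
A streaming pipeline of the kind described above might start from a sketch like this, using Spark Structured Streaming's Kafka source (requires the spark-sql-kafka connector on the classpath); the broker address and topic name are hypothetical:

```python
# Requires pyspark plus the spark-sql-kafka connector; a sketch only.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("clicks-stream").getOrCreate()

# Subscribe to a Kafka topic as an unbounded streaming DataFrame.
stream = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("subscribe", "clicks")
    .load()
)

# Kafka delivers bytes; cast key and value to strings, then count
# messages per key over the live stream.
counts = (
    stream.select(F.col("key").cast("string"), F.col("value").cast("string"))
    .groupBy("key")
    .count()
)

# Emit each micro-batch's updated counts to the console.
query = counts.writeStream.outputMode("complete").format("console").start()
query.awaitTermination()
```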

Specialized

By the end of this course, participants will be able to:

  • Design scalable solutions: Define project requirements and create robust data engineering architectures using ELT, ETL, or Lakehouse patterns.
  • Build and deploy pipelines: Develop end-to-end data pipelines with Azure Data Factory, Databricks, and Spark for efficient data ingestion, transformation, and storage.
  • Incorporate machine learning: Integrate basic ML components (e.g., NLP models) into pipelines to enable data-driven insights and automation.
  • Present project outcomes: Prepare and deliver professional presentations showcasing technical decisions, challenges, and business impact.

Not the right path for you?

We also offer personalized, customized paths to suit your learning and career goals. Talk to our advisors.

Student Success

What our graduates are saying

Laura Vieira

Graduated 2021 | Reviewed on 17 October 2021

“Amazing course and support”

The course was really great, though a little too fast if you are not in a technical field already. You will need to study hard, reviewing classes and doing the labs and assignments, but you always have the support of the TAs (they are awesome) and even the instructors and your classmates, who are always helping each other via Slack. There will be a final project where you will need to present a pipeline that you created (don’t worry, they will be helping you!). Then, they will help you find a job. Shaohua is always looking out for his students; he wants to make sure that you have the best experience with them.

OUR ALUMNI ARE WORKING AT

Common Questions

Find answers to your questions about the Learning Track

What is a Learning Track?

A Learning Track is a curated series of courses, projects, and assessments designed to help you master a specific skill set or career path.

How is a Learning Track different from a single course?

Learning Tracks include multiple courses plus capstone or portfolio projects, offering a more comprehensive and structured learning experience.

How long do I have access to the Learning Track?

Access is tied to your subscription. As long as your subscription is active, you can continue learning.

Are hands-on projects included?

Yes, each Learning Track includes hands-on labs and a final capstone project to build your portfolio.

Will I receive a certificate?

Absolutely. Once you complete the entire Learning Track, you’ll receive a certificate of completion.

Still have questions?

If you have other queries or specific concerns about our Learning Track Subscription, don’t hesitate to let us know. Your feedback is important to us, and we aim to provide the best support possible.

Your Learning Journey Awaits 🚀

Grow your skills, build projects you’ll be proud of, and unlock new opportunities — all at your pace.

Download Data Engineering Track Course Package