Operating LLMs in Production

Duration: 3 days

Delivery Method: Online Live

Skill Level: Advanced

Fee: $1,800 USD

Overview

The Operating LLMs in Production course provides a comprehensive understanding of deploying and managing LLMs in production environments. It is designed for NLP engineers, LLM engineers, and MLOps professionals seeking to master LLM operations. Participants will gain hands-on experience in setting up cloud-based AI infrastructure, utilizing GPU clusters, leveraging distributed training and inference techniques, and optimizing and scaling LLMs for real-world applications.

Completing this course will help you:

  • Deploy LLMs in production using containerized cloud infrastructure.
  • Utilize GPUs and other hardware accelerators to optimize LLM training and inference workloads.
  • Implement distributed training and inference techniques to scale LLM deployments across multiple nodes.
  • Monitor and optimize LLM performance in production environments, balancing costs, performance, and scalability.

Curriculum

  • Module 1 - AI Infrastructure
    • Understand the fundamentals of AI infrastructure for managing LLMs in production.
    • Explore cloud-based services, tools, and platforms essential for deploying and maintaining LLMs.
    • Learn best practices for setting up scalable infrastructure that supports LLM applications.
  • Module 2 - GPU Hardware Accelerators
    • Learn how to leverage GPU resources effectively for faster model training and inference (see the first sketch after the curriculum).
    • Explore hardware configurations and optimizations for production-scale LLMs.
  • Module 3 - Distributed Model Training
    • Understand the basics of DistributedDataParallel (DDP), DeepSpeed, and Fully Sharded Data Parallel (FSDP); a minimal DDP sketch follows the curriculum.
    • Learn how to use TRL for distributed training across multiple nodes.
    • Learn how to use Ray Train and Ray Tune to train LLMs efficiently.

  • Module 4 - Efficient LLM Inference
    • Learn how to deploy LLMs efficiently using vLLM (a minimal sketch follows the curriculum).
    • Learn to implement distributed inference for handling high-volume LLM requests.
    • Explore strategies to minimize latency and maximize throughput in production environments.
  • Module 5 - LLMOps
    • Understand the key concepts of LLMOps, including guardrails, security checks, monitoring, prompt management, data augmentation, and annotation (a toy guardrail check is sketched after the curriculum).
  • Module 6 - LLMs in Production Project
    • In this mini-project, you will implement an end-to-end LLM Operations pipeline to train, serve, and deploy LLMs in a cloud environment.
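
The sketches below illustrate, in miniature, a few of the techniques the modules cover; they are illustrative assumptions, not course materials. First, for Module 2: loading a Hugging Face causal LM in half precision and generating on a GPU. The model id gpt2 is a placeholder for any causal LM on the Hub.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "gpt2"  # placeholder: any causal LM from the Hugging Face Hub
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    # Load weights in float16 to roughly halve GPU memory use, then move to the GPU.
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)
    model.to("cuda")

    inputs = tokenizer("Operating LLMs in production", return_tensors="pt").to("cuda")
    outputs = model.generate(**inputs, max_new_tokens=32)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))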
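
For Module 3, a minimal sketch of PyTorch DistributedDataParallel (DDP), the baseline that DeepSpeed, FSDP, and Ray Train build on. It assumes a torchrun launch (for example, torchrun --nproc_per_node=4 train.py), which starts one process per GPU and sets the LOCAL_RANK environment variable; the tiny Linear model stands in for a real LLM.

    import os
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    # torchrun sets LOCAL_RANK for each worker process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # A single Linear layer stands in for a real model to keep the sketch short.
    model = torch.nn.Linear(1024, 1024).to(local_rank)
    model = DDP(model, device_ids=[local_rank])  # gradients are all-reduced across workers

    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    for _ in range(10):  # stand-in for a real dataloader loop
        batch = torch.randn(8, 1024, device=local_rank)
        loss = model(batch).pow(2).mean()
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

    dist.destroy_process_group()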
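
For Module 4, offline batch inference with vLLM's Python API; the same engine can also be exposed as an OpenAI-compatible HTTP server. The model id is again a placeholder.

    from vllm import LLM, SamplingParams

    # Placeholder model id; any Hugging Face causal LM supported by vLLM works.
    llm = LLM(model="facebook/opt-125m")

    sampling = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)
    prompts = [
        "What does it take to run an LLM in production?",
        "Explain continuous batching in one sentence.",
    ]

    # vLLM batches requests internally (continuous batching) for high throughput.
    for output in llm.generate(prompts, sampling):
        print(output.prompt, "->", output.outputs[0].text)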
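
Finally, for Module 5, a deliberately toy illustration of the guardrail idea: screening user input against blocklist patterns before the prompt reaches the model. The patterns and the screen_prompt helper are invented for illustration; production guardrails combine classifiers, policy engines, and PII detectors.

    import re

    # Hypothetical blocklist patterns, purely for illustration.
    BLOCKED_PATTERNS = [
        re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),  # prompt injection
        re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # US-SSN-like string (PII)
    ]

    def screen_prompt(prompt: str) -> str:
        """Raise if the prompt trips a guardrail; otherwise pass it through unchanged."""
        for pattern in BLOCKED_PATTERNS:
            if pattern.search(prompt):
                raise ValueError(f"Prompt rejected by guardrail: {pattern.pattern}")
        return prompt

    # Usage: screen before calling the model, and log the decision for monitoring.
    safe_prompt = screen_prompt("Summarize our Q3 incident report.")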

Schedule

Online: Dec 09 to Dec 11, 2024, 9:30am - 4:30pm EST

Online: Mar 2025, 9:30am - 4:30pm EST

Online: Jul 2025, 9:30am - 4:30pm EDT

Corporate Training Inquiry
Please include the date and time you are interested in. If none of the scheduled sessions suits you, please leave us a note and our program manager will get in touch.