Operating LLMs in Production Workshop

Duration: 3 Days
Delivery Method: Online Live
Skill Level: Advanced
Fee: $2,400 USD

Overview

This immersive 3-day workshop is designed to equip students with the skills to deploy and manage large language models (LLMs) in production environments. From understanding cloud-based infrastructure to leveraging distributed training and inference, students will gain hands-on experience in optimizing and scaling LLMs for real-world applications.

Completing this course will help you:

  • Deploy LLMs for distributed inference
  • Optimize batch processing and reduce inference latency
  • Use Ray and Hugging Face Accelerate for distributed workloads
  • Monitor and scale LLM workloads in production
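The batch-processing outcome above rests on a simple idea: serving many prompts per model call amortizes per-request overhead. A minimal sketch in plain Python, where `fake_llm` is an illustrative stand-in for a real batched model call (not part of any specific framework):

```python
from typing import Callable, List

def fake_llm(prompts: List[str]) -> List[str]:
    # Simulated batched forward pass: one call handles many prompts at
    # once, amortizing per-call overhead (the core idea behind batching).
    return [p.upper() for p in prompts]

def batched_infer(
    prompts: List[str],
    model: Callable[[List[str]], List[str]],
    max_batch_size: int = 8,
) -> List[str]:
    # Split the incoming request stream into fixed-size batches and run
    # each batch through the model in a single call.
    outputs: List[str] = []
    for i in range(0, len(prompts), max_batch_size):
        outputs.extend(model(prompts[i:i + max_batch_size]))
    return outputs

print(batched_infer(["hello", "world", "llm"], fake_llm, max_batch_size=2))
```

Production serving stacks extend this with dynamic batching, where requests arriving within a short window are grouped on the fly, trading a small queueing delay for much higher throughput.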

Learning Path

  • Module 1 - AI Infrastructure

    Introduce the foundational AI infrastructure components required to manage LLMs in production.

  • Module 2 - GPUs and Hardware Accelerators

    Explore how to leverage GPUs and other hardware accelerators for LLM training and inference in production environments.

  • Module 3 - Distributed Training

    Dive into distributed training techniques for scaling LLM training across multiple nodes.

  • Module 4 - Distributed Inference

    Learn how to efficiently run LLMs at scale using distributed inference techniques to handle high-volume requests.
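The distributed-inference pattern described in Module 4 can be sketched with standard-library Python: requests are load-balanced round-robin across several model replicas running in parallel. The "replicas" here are plain functions standing in for model servers; a production deployment would use a serving framework such as Ray Serve instead.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle
from typing import Callable, List

def make_replica(replica_id: int) -> Callable[[str], str]:
    # A real replica would run a model forward pass on its own GPU;
    # this stand-in just tags the prompt with the replica that served it.
    def serve(prompt: str) -> str:
        return f"replica-{replica_id}:{prompt}"
    return serve

def serve_requests(prompts: List[str], n_replicas: int = 3) -> List[str]:
    replicas = [make_replica(i) for i in range(n_replicas)]
    assignment = cycle(replicas)  # round-robin load balancing
    with ThreadPoolExecutor(max_workers=n_replicas) as pool:
        # Each request is dispatched to the next replica in rotation and
        # executed concurrently; results are collected in request order.
        futures = [pool.submit(next(assignment), p) for p in prompts]
        return [f.result() for f in futures]

print(serve_requests(["a", "b", "c", "d"], n_replicas=2))
```

Round-robin is the simplest policy; real systems often route by queue depth or current GPU memory pressure instead, but the replica-pool structure is the same.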

Schedule

  • Online: Nov 4 - 6, 2024, 9:00am - 5:00pm ET
  • Online: Mar 2025, 9:00am - 5:00pm ET
  • Online: July 2025, 9:00am - 5:00pm ET

Corporate Training Inquiry
Please include the date and time you are interested in. If none of the scheduled sessions suits you, please leave us a note and our program manager will get in touch.