We may earn an affiliate commission when you visit our partners.

Quantization in Depth

Younes Belkada and Marc Sun

In Quantization in Depth you will build model quantization methods to shrink model weights to ¼ their original size, and apply methods to maintain the compressed model’s performance. Your ability to quantize your models can make them more accessible, and also faster at inference time.

Implement and customize linear quantization from scratch so that you can study the tradeoff between space and performance, and then build a general-purpose quantizer in PyTorch that can quantize any open source model. You’ll implement techniques to compress model weights from 32 bits to 8 bits and even 2 bits.
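As a rough illustration of the 32-bit to 8-bit idea described above, here is a minimal sketch of asymmetric, per-tensor linear quantization. It is not the course's code: it uses plain Python lists rather than PyTorch tensors so that it runs anywhere, and the function names (`get_scale_and_zero_point`, `quantize`, `dequantize`) are hypothetical names chosen for this example.

```python
# Minimal sketch of asymmetric linear quantization to signed 8-bit integers.
# Assumption: plain Python lists stand in for tensors; names are illustrative.

def get_scale_and_zero_point(values, q_min=-128, q_max=127):
    """Map the real range [min, max] of `values` onto [q_min, q_max]."""
    r_min, r_max = min(values), max(values)
    if r_max == r_min:
        # Degenerate case (constant input): fall back to a unit scale.
        scale = 1.0
    else:
        scale = (r_max - r_min) / (q_max - q_min)
    zero_point = round(q_min - r_min / scale)
    # Keep the zero point inside the representable integer range.
    zero_point = max(q_min, min(q_max, zero_point))
    return scale, zero_point

def quantize(values, scale, zero_point, q_min=-128, q_max=127):
    """q = clamp(round(r / scale) + zero_point)."""
    return [max(q_min, min(q_max, round(v / scale) + zero_point)) for v in values]

def dequantize(q_values, scale, zero_point):
    """r is approximately scale * (q - zero_point)."""
    return [scale * (q - zero_point) for q in q_values]

def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

weights = [-1.5, -0.2, 0.0, 0.7, 2.3]
scale, zero_point = get_scale_and_zero_point(weights)
q_weights = quantize(weights, scale, zero_point)
restored = dequantize(q_weights, scale, zero_point)
print("quantized:", q_weights)
print("quantization error (MSE):", mean_squared_error(weights, restored))
```

Each weight is stored as one 8-bit integer instead of a 32-bit float, which is where the ¼ size figure comes from; the scale and zero point add a small fixed overhead per tensor.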

Join this course to:

1. Build and customize linear quantization functions, choosing between two “modes”: asymmetric and symmetric; and three granularities: per-tensor, per-channel, and per-group quantization.

2. Measure the quantization error of each of these options as you balance the performance and space tradeoffs for each option.

3. Build your own quantizer in PyTorch, to quantize any open source model’s dense layers from 32 bits to 8 bits.

4. Go beyond 8 bits, and pack four 2-bit weights into one 8-bit integer.
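The packing step in item 4 can be sketched with bit shifts. This is an illustrative example rather than the course's implementation: the helper names `pack_2bit` and `unpack_2bit` are hypothetical, and the bit order (first value in the lowest two bits) is an assumption, since the description does not specify one.

```python
# Minimal sketch: store four 2-bit values (0..3) in a single 8-bit integer.
# Assumption: the first value occupies the two least significant bits.

def pack_2bit(values):
    """Pack a list of 2-bit integers into a list of bytes (ints 0..255)."""
    if len(values) % 4 != 0:
        raise ValueError("number of values must be a multiple of 4")
    packed = []
    for i in range(0, len(values), 4):
        byte = 0
        for j, v in enumerate(values[i:i + 4]):
            if not 0 <= v <= 3:
                raise ValueError("each value must fit in 2 bits (0..3)")
            byte |= v << (2 * j)  # place value j at bit offset 2*j
        packed.append(byte)
    return packed

def unpack_2bit(packed):
    """Recover the original 2-bit integers from packed bytes."""
    values = []
    for byte in packed:
        for j in range(4):
            values.append((byte >> (2 * j)) & 0b11)  # mask out two bits
    return values

codes = [1, 0, 3, 2, 2, 2, 0, 1]
packed = pack_2bit(codes)
print(packed)               # two bytes hold eight 2-bit weights
print(unpack_2bit(packed))  # round-trips to the original codes
```

Packing is lossless at this stage: the information is lost earlier, when the 32-bit weights are mapped to only four possible 2-bit codes.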

Quantization in Depth lets you build and customize your own linear quantizer from scratch, going beyond standard open source libraries such as PyTorch and Quanto, which are covered in the short course Quantization Fundamentals, also by Hugging Face.

This course gives you the foundation to study more advanced quantization methods, some of which are recommended at the end of the course.

Good to know

Know what's good, what to watch for, and possible dealbreakers:
  • Focuses on techniques for model efficiency to minimize cost
  • Teaches advanced topics in machine learning, such as quantization and lossy compression
  • Taught by industry professionals with expertise in deep learning optimization
  • Practical skills and knowledge applicable to real-world ML projects

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Quantization in Depth with these activities:
Review Python Basics
Reinforce your understanding of Python basics, which will provide a solid foundation for the course.
  • Review Python data types, operators, and control flow.
  • Practice writing and running simple Python programs.
  • Review object-oriented programming concepts in Python.
Linear Quantization Exercises
Enhance your understanding of linear quantization by completing a series of exercises.
  • Implement linear quantization functions with different modes and granularities.
  • Measure the quantization error of each option.
  • Analyze the performance and space trade-offs for each option.
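One possible starting point for these exercises is sketched below: symmetric quantization applied per-tensor (one scale for the whole matrix) and per-channel (one scale per row), with the mean squared error of each. This is a hedged illustration, not the course's solution; the helper names are hypothetical, nested Python lists stand in for weight tensors, and treating each row as a channel is an assumption.

```python
# Minimal sketch: compare per-tensor and per-channel symmetric quantization.
# Assumption: a "channel" is one row of a nested-list weight matrix.

def symmetric_scale(values, q_max=127):
    """Symmetric mode: no zero point, range is [-q_max, q_max]."""
    r_max = max(abs(v) for v in values)
    return r_max / q_max if r_max > 0 else 1.0

def quantize_symmetric(values, scale, q_max=127):
    return [max(-q_max, min(q_max, round(v / scale))) for v in values]

def dequantize(q_values, scale):
    return [scale * q for q in q_values]

def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def per_tensor_error(matrix):
    """One scale shared by every element of the matrix."""
    flat = [v for row in matrix for v in row]
    scale = symmetric_scale(flat)
    restored = dequantize(quantize_symmetric(flat, scale), scale)
    return mean_squared_error(flat, restored)

def per_channel_error(matrix):
    """A separate scale for each row of the matrix."""
    flat, restored = [], []
    for row in matrix:
        scale = symmetric_scale(row)
        flat.extend(row)
        restored.extend(dequantize(quantize_symmetric(row, scale), scale))
    return mean_squared_error(flat, restored)

# Rows with very different magnitudes make the difference visible.
weights = [[0.01, -0.02, 0.015],
           [5.0, -3.2, 4.1]]
print("per-tensor MSE: ", per_tensor_error(weights))
print("per-channel MSE:", per_channel_error(weights))
```

In this example the per-channel error is lower because the small-magnitude row gets its own scale, at the cost of storing one scale per row instead of one per matrix.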
Custom PyTorch Quantizer
Deepen your understanding of quantization by building a custom PyTorch quantizer.
  • Design and implement a general-purpose quantizer in PyTorch.
  • Quantize a specific open source model using your custom quantizer.
  • Validate the accuracy and performance of the quantized model.
Quantization Blog Post
Solidify your understanding of quantization by writing a blog post explaining the concepts.
  • Research and summarize the key principles of quantization.
  • Discuss the benefits and limitations of quantization.
  • Share your insights and learnings with others.

Similar courses

Here are nine courses similar to Quantization in Depth.
Quantization Fundamentals with Hugging Face (most relevant)
PyTorch Basics for Machine Learning (most relevant)
Predictive Analytics with PyTorch
Fashion Image Classification using CNNs in Pytorch
Deep Learning with Python and PyTorch
The Complete Neural Networks Bootcamp: Theory,...
Fat Loss for Guys: Get Ripped and Workout at Home
Learn Everything about Full-Stack Generative AI, LLM...
Introduction to On-Device AI

© 2016 - 2024 OpenCourser