Mike X Cohen

Deep Understanding of Large Language Models (LLMs): Architecture, Training, and Mechanisms

Description

Large Language Models (LLMs) like ChatGPT, GPT- But most courses only teach you how to use LLMs. This 90+ hour intensive course teaches you how they actually work — and how to dissect them using machine-learning and mechanistic interpretability methods.

This is a deep, end-to-end exploration of transformer architectures, self-attention mechanisms, embedding layers, training pipelines, and inference strategies, with hands-on Python and PyTorch code at every step.

Whether your goal is to build your own transformer from scratch, fine-tune existing models, or understand the mathematics and engineering behind state-of-the-art generative AI, this course will give you the foundation and tools you need.

What You’ll Learn

  • The complete architecture of LLMs — tokenization, embeddings, encoders, decoders, attention heads, feedforward networks, and layer normalization

  • Mathematics of attention mechanisms — dot-product attention, multi-head attention, positional encoding, causal masking, probabilistic token selection

  • Training LLMs — optimization (Adam, AdamW), loss functions, gradient accumulation, batch processing, learning-rate schedulers, regularization (L1, L2, decorrelation), gradient clipping

  • Fine-tuning and prompt engineering for downstream NLP tasks, system-tuning

  • Evaluation: metrics such as perplexity, accuracy, and MAUVE; benchmark datasets such as HellaSwag and SuperGLUE; and ways to assess bias and fairness

  • Practical PyTorch implementations of transformers, attention layers, and language model training loops, custom classes, custom loss functions

  • Inference techniques — greedy decoding, beam search, top-k sampling, temperature scaling

  • Scaling laws and trade-offs between model size, training data, and performance

  • Limitations and biases in LLMs — interpretability, ethical considerations, and responsible AI

  • Decoder-only transformers

  • Embeddings, including token embeddings and positional embeddings

  • Sampling techniques — methods for generating new text, including top-p, top-k, multinomial, and greedy
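
Several of the topics listed above (dot-product attention, causal masking, temperature scaling, top-k sampling) can be illustrated in a few lines of NumPy. The following is a minimal sketch, not code from the course: the function names `causal_attention` and `sample_top_k`, the array shapes, and the toy inputs are assumptions made for illustration, and it uses a single attention head with no learned projection matrices.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum before exponentiating for numerical stability.
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def causal_attention(Q, K, V):
    """Scaled dot-product attention with a causal mask.

    Q, K, V are arrays of shape (seq_len, d). Returns the attended
    output and the attention weights.
    """
    seq_len, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)
    # True above the diagonal: position i must not attend to positions j > i.
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    weights = softmax(scores, axis=-1)
    return weights @ V, weights


def sample_top_k(logits, k=3, temperature=1.0, rng=None):
    """Sample one token index from the k highest-scoring logits."""
    rng = np.random.default_rng() if rng is None else rng
    scaled = logits / temperature          # temperature < 1 sharpens the distribution
    top_idx = np.argsort(scaled)[-k:]      # indices of the k largest logits
    probs = softmax(scaled[top_idx])       # renormalize over the kept tokens
    return int(rng.choice(top_idx, p=probs))


# Toy example: 4 token positions with 8-dimensional vectors (hypothetical data).
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
out, weights = causal_attention(x, x, x)

# Hypothetical logits over a 4-token vocabulary.
toy_logits = np.array([0.1, 2.0, -1.0, 0.5])
next_token = sample_top_k(toy_logits, k=3, temperature=0.7, rng=rng)
```

In this sketch the first row of `weights` puts all of its mass on the first position, because the causal mask leaves the first token nothing else to attend to. Setting `k=1` in `sample_top_k` reduces it to greedy decoding.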

Why This Course Is Different

  • 93+ hours of HD video lectures — blending theory, code, and practical application

  • Code challenges in every section — with full, downloadable solutions

  • Builds from first principles: starting from basic Python/NumPy implementations and progressing to full PyTorch LLMs

  • Suitable for researchers, engineers, and advanced learners who want to go beyond “black box” API usage

  • Clear explanations without dumbing down the content — intensive but approachable

Who Is This Course For?

  • Machine learning engineers and data scientists

  • AI researchers and NLP specialists

  • Software developers interested in deep learning and generative AI

  • Graduate students or self-learners with intermediate Python skills and basic ML knowledge

Technologies & Tools Covered

  • Python and PyTorch for deep learning

  • NumPy and Matplotlib for numerical computing and visualization

  • Google Colab for free GPU access

  • Hugging Face Transformers for working with pre-trained models

  • Tokenizers and text preprocessing tools

  • Implement Transformers in PyTorch, fine-tune LLMs, decode with attention mechanisms, and probe model internals

What if you have questions about the material?

This course has a Q&A (question and answer) section where you can post your questions about the course material (about the maths, statistics, coding, or machine learning aspects). I try to answer all questions within a day. You can also see all other questions and answers, which really improves how much you can learn. And you can contribute to the Q&A by posting to ongoing discussions.

By the end of this course, you won’t just know how to work with LLMs — you’ll understand why they work the way they do, and be able to design, train, evaluate, and deploy your own transformer-based language models.

Enroll now and start mastering Large Language Models from the ground up.

What's inside

Learning objectives

  • Large language model (LLM) architectures, including GPT (OpenAI) and BERT
  • Transformer blocks
  • Attention algorithm
  • PyTorch
  • LLM pretraining
  • Explainable AI
  • Mechanistic interpretability
  • Machine learning
  • Deep learning
  • Principal components analysis
  • High-dimensional clustering
  • Dimension reduction
  • Advanced cosine similarity applications

Syllabus

Introductions
[IMPORTANT] Prerequisites and how to succeed in this course
Using the Udemy platform
Getting the course code, and the detailed overview
