Mehmet Ozkaya

In this course, you'll learn how to design Generative AI architectures by integrating AI-powered S/LLMs into the EShop Support enterprise application using Prompt Engineering, RAG, Fine-Tuning, and Vector DBs.

We will design Generative AI architectures with the following components:

  1. Small and Large Language Models (S/LLMs)

  2. Prompt Engineering

  3. Retrieval Augmented Generation (RAG)

  4. Fine-Tuning

  5. Vector Databases


We start with the basics and progressively dive deeper into each topic. We'll also follow the LLM Augmentation Flow, a framework that improves LLM results by applying Prompt Engineering, RAG, and Fine-Tuning in sequence.

Large Language Models (LLMs) module:

  • How Large Language Models (LLMs) Work

  • Capabilities of LLMs: Text Generation, Summarization, Q&A, Classification, Sentiment Analysis, Embedding Semantic Search, Code Generation

  • Generate Text with ChatGPT: Understand Capabilities and Limitations of LLMs (Hands-on)

  • Function Calling and Structured Output in Large Language Models (LLMs)

  • LLM Models: OpenAI ChatGPT, Meta Llama, Anthropic Claude, Google Gemini, Mistral Mixtral, xAI Grok

  • SLM Models: OpenAI GPT-4o mini, Meta Llama 3.2 mini, Google Gemma, Microsoft Phi-3.5

  • Interacting with Different LLMs via Chat UI: ChatGPT, Llama, Mixtral, Phi-3

  • Calling the OpenAI Chat Completions Endpoint from Code

  • Installing and Running Llama and Gemma Models Locally Using Ollama

  • Modernizing and Designing EShop Support Enterprise Apps with AI-Powered LLM Capabilities
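
For the with-code interaction topic above, here is a minimal sketch of the message payload shape used by OpenAI-style Chat Completions endpoints. The model name is an illustrative assumption, and no request is actually sent:

```python
import json

def build_chat_request(system_prompt: str, user_message: str,
                       model: str = "gpt-4o-mini",
                       temperature: float = 0.2) -> dict:
    """Assemble an OpenAI-style Chat Completions payload.

    Only builds the request body; nothing is sent over the network.
    """
    return {
        "model": model,
        "temperature": temperature,
        "messages": [
            {"role": "system", "content": system_prompt},  # sets assistant behavior
            {"role": "user", "content": user_message},     # the actual query
        ],
    }

payload = build_chat_request(
    "You are a helpful EShop support assistant.",
    "Where is my order #1234?",
)
print(json.dumps(payload, indent=2))
```

In a real call, this dictionary would be POSTed to the provider's chat completions endpoint with an API key; the same message structure is what the course's Chat UI tools build for you behind the scenes.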

Prompt Engineering module:

  • Steps of Designing Effective Prompts: Iterate, Evaluate and Templatize

  • Advanced Prompting Techniques: Zero-shot, One-shot, Few-shot, Chain-of-Thought, Instruction and Role-based

  • Design Advanced Prompts for EShop Support – Classification, Sentiment Analysis, Summarization, Q&A Chat, and Response Text Generation

  • Design Advanced Prompts for Ticket Detail Page in EShop Support App w/ Q&A Chat and RAG
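
As a preview of the few-shot technique listed above, a hypothetical ticket-classification prompt might be assembled like this (the categories and example tickets are invented for illustration):

```python
# Hypothetical few-shot examples for EShop ticket classification;
# categories and wording are invented for illustration.
EXAMPLES = [
    ("My package arrived broken.", "Damaged Item"),
    ("I was charged twice this month.", "Billing"),
    ("How do I reset my password?", "Account"),
]

def build_few_shot_prompt(ticket_text: str) -> str:
    lines = ["Classify the support ticket into one category.", ""]
    for text, label in EXAMPLES:  # the few "shots" that guide the model
        lines.append(f"Ticket: {text}")
        lines.append(f"Category: {label}")
        lines.append("")
    lines.append(f"Ticket: {ticket_text}")
    lines.append("Category:")  # the model completes from here
    return "\n".join(lines)

prompt = build_few_shot_prompt("The refund never showed up on my card.")
print(prompt)
```

Templatizing prompts this way supports the iterate-evaluate-templatize loop: the examples can be swapped or extended without touching the calling code.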

Retrieval-Augmented Generation (RAG) module:

  • The RAG Architecture Part 1: Ingestion with Embeddings and Vector Search

  • The RAG Architecture Part 2: Retrieval with Reranking and Context Query Prompts

  • The RAG Architecture Part 3: Generation with Generator and Output

  • E2E Workflow of a Retrieval-Augmented Generation (RAG) - The RAG Workflow

  • Design EShop Customer Support using RAG

  • End-to-End RAG Example for EShop Customer Support using OpenAI Playground
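
The three RAG parts above (ingestion, retrieval, generation) can be sketched end to end with a toy in-memory index. The bag-of-words "embedding" below is only a stand-in for a real embedding model such as text-embedding-3-small:

```python
import math
from collections import Counter

# Toy "embedding": a bag-of-words term count. A real pipeline would use a
# trained embedding model; this stand-in only illustrates the mechanics.
def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Part 1 - Ingestion: embed each document and store it in an index.
DOCS = [
    "Refunds are issued within 5 business days.",
    "Shipping takes 3 to 7 days within the EU.",
    "Passwords can be reset from the account page.",
]
INDEX = [(doc, embed(doc)) for doc in DOCS]

# Part 2 - Retrieval: rank stored documents by similarity to the query.
def retrieve(query: str, k: int = 1) -> list:
    qv = embed(query)
    ranked = sorted(INDEX, key=lambda item: cosine(qv, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

# Part 3 - Generation: the retrieved context is placed into the prompt
# that would be sent to the LLM.
question = "How long do refunds take?"
context = retrieve(question)[0]
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

The "answer using only this context" instruction is what grounds the model's output in retrieved knowledge instead of its training data.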

Fine-Tuning module:

  • Fine-Tuning Workflow

  • Fine-Tuning Methods: Full, Parameter-Efficient Fine-Tuning (PEFT), LoRA, Transfer

  • Design EShop Customer Support Using Fine-Tuning

  • End-to-End Fine-Tuning a LLM for EShop Customer Support using OpenAI Playground
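
To make the fine-tuning workflow concrete, training pairs are typically serialized to JSONL before upload. The sketch below follows the chat-style record format used by OpenAI's fine-tuning API; the training pairs themselves are invented:

```python
import json

# Invented training pairs for a hypothetical EShop support fine-tune.
PAIRS = [
    ("Where is my order?", "You can track your order under Account > Orders."),
    ("I want a refund.", "I'm sorry to hear that. I've opened a refund request for you."),
]

SYSTEM = "You are the EShop customer support assistant."

def to_jsonl(pairs) -> str:
    """Serialize (user, assistant) pairs into chat-style JSONL:
    one JSON object per line, each holding a full example conversation."""
    lines = []
    for user, assistant in pairs:
        record = {"messages": [
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": user},
            {"role": "assistant", "content": assistant},  # the target output
        ]}
        lines.append(json.dumps(record))
    return "\n".join(lines)

print(to_jsonl(PAIRS))
```

Real fine-tuning datasets need many more examples than this, but the per-record shape stays the same.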

We will also discuss:

  • Choosing the Right Optimization – Prompt Engineering, RAG, and Fine-Tuning

Vector Database and Semantic Search with RAG module:

  • What are Vectors, Vector Embeddings, and Vector Databases?

  • Explore Vector Embedding Models: OpenAI - text-embedding-3-small, Ollama - all-minilm

  • Semantic Meaning and Similarity Search: Cosine Similarity, Euclidean Distance

  • How Vector Databases Work: Vector Creation, Indexing, Search

  • Vector Search Algorithms: kNN, ANN, and Disk-ANN

  • Explore Vector Databases: Pinecone, Chroma, Weaviate, Qdrant, Milvus, PgVector, Redis
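
The similarity measures and kNN search listed above can be demonstrated in a few lines. The three-dimensional vectors are toy values; real embeddings have hundreds or thousands of dimensions:

```python
import math

def cosine_similarity(a, b):
    """1.0 for vectors pointing the same way, 0.0 for orthogonal ones."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def euclidean_distance(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Exact (brute-force) kNN: score the query against every stored vector.
# ANN indexes (e.g. HNSW, Disk-ANN) approximate this to stay fast at scale.
def knn(query, vectors, k=2):
    ranked = sorted(vectors.items(),
                    key=lambda kv: cosine_similarity(query, kv[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]

vectors = {
    "refund policy":  [0.9, 0.1, 0.0],
    "shipping info":  [0.1, 0.9, 0.1],
    "password reset": [0.0, 0.2, 0.9],
}
print(knn([1.0, 0.0, 0.1], vectors, k=1))  # -> ['refund policy']
```

Vector databases like the ones listed above wrap exactly this operation behind an index so it stays fast over millions of vectors.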

Lastly, we will design the EShop Support architecture with LLMs and Vector Databases:

  • Using LLMs and VectorDBs as Cloud-Native Backing Services in Microservices Architecture

  • Design EShop Support with LLMs, Vector Databases and Semantic Search

  • Azure Cloud AI Services: Azure OpenAI, Azure AI Search

  • Design EShop Support with Azure Cloud AI Services: Azure OpenAI, Azure AI Search

This course is more than just learning Generative AI; it's a deep dive into designing advanced AI solutions by integrating LLM architectures into enterprise applications.

You'll get hands-on experience designing a complete EShop Customer Support application, including LLM capabilities like summarization, Q&A, classification, sentiment analysis, embedding-based semantic search, and code generation.


What's inside

Learning objectives

  • Generative AI model architectures (types of generative AI models)
  • Transformer architecture: Attention Is All You Need
  • Large Language Model (LLM) architectures
  • Text generation, summarization, Q&A, classification, sentiment analysis, embedding semantic search
  • Generate text with ChatGPT: understand capabilities and limitations of LLMs (hands-on)
  • Function calling and structured outputs in Large Language Models (LLMs)
  • LLM providers: OpenAI, Meta AI, Anthropic, Hugging Face, Microsoft, Google, and Mistral AI
  • LLM models: OpenAI ChatGPT, Meta Llama, Anthropic Claude, Google Gemini, Mistral Mixtral, xAI Grok
  • SLM models: OpenAI GPT-4o mini, Meta Llama 3.2 mini, Google Gemma, Microsoft Phi-3.5
  • How to choose LLM models: quality, speed, price, latency, and context window
  • Interacting with different LLMs via Chat UI: ChatGPT, Llama, Mixtral, Phi-3
  • Installing and running Llama and Gemma models using Ollama
  • Modernizing enterprise apps with AI-powered LLM capabilities
  • Designing the 'EShop Support App' with AI-powered LLM capabilities
  • Advanced prompting techniques: zero-shot, one-shot, few-shot, chain-of-thought (CoT)
  • Design advanced prompts for the ticket detail page in the EShop Support app w/ Q&A chat and RAG
  • The RAG architecture: ingestion with embeddings and vector search
  • E2E workflow of Retrieval-Augmented Generation (RAG) - the RAG workflow
  • End-to-end RAG example for EShop customer support using the OpenAI Playground
  • Fine-tuning methods: full, Parameter-Efficient Fine-Tuning (PEFT), LoRA, transfer
  • End-to-end fine-tuning of an LLM for EShop customer support using the OpenAI Playground
  • Choosing the right optimization: prompt engineering, RAG, and fine-tuning
  • Vector databases and semantic search with RAG
  • Explore vector embedding models: OpenAI text-embedding-3-small, Ollama all-minilm
  • Explore vector databases: Pinecone, Chroma, Weaviate, Qdrant, Milvus, PgVector, Redis
  • Using LLMs and vector DBs as cloud-native backing services in a microservices architecture
  • Design EShop Support with LLMs, vector databases, and semantic search
  • Design EShop Support with Azure Cloud AI services: Azure OpenAI, Azure AI Search

Syllabus

Introduction
Tools and Resources for the Course - Course Slides
Course Project: EShop Customer Support with AI-Powered Capabilities using LLMs
What is Generative AI?
Evolution of AI: AI, Machine Learning, Deep Learning and Generative AI
How Generative AI Works
Generative AI Model Architectures (Types of Generative AI Models)
Transformer Architecture: Attention is All you Need
What are Large Language Models (LLMs)?
How Large Language Models (LLMs) Work
What are Tokens and Tokenization?
How LLMs Use Tokens
Capabilities of LLMs: Text Generation, Summarization, Q&A, Classification
LLM Use Cases and Real-World Applications
Limitations of Large Language Models (LLMs)
Generate Text with ChatGPT: Understand Capabilities and Limitations of LLMs
LLM Settings: Temperature, Max Tokens, Stop sequences, Top P, Frequency Penalty
Function Calling in Large Language Models (LLMs)
Structured Output in Large Language Models (LLMs)
What are Small Language Models (SLMs)? Use Cases / How / Why / When
LLM Quiz
Exploring and Running Different LLMs w/ HuggingFace and Ollama
LLM Providers: OpenAI, Meta AI, Anthropic, Hugging Face, Microsoft, Google
LLM Models: OpenAI ChatGPT, Meta Llama, Anthropic Claude, Google Gemini, Mistral
SLM Models: OpenAI GPT-4o mini, Meta Llama 3.2 mini, Gemma, Phi-3
How to Choose LLM Models: Quality, Speed, Price, Latency and Context Window
Open Source vs Proprietary Models
Hugging Face - The GitHub of Machine Learning Models
LLM Interaction Types: No-Code (ChatUI) or With-Code (API Keys)
Interacting with Different LLMs via Chat UI: ChatGPT, Llama, Mixtral, Phi-3
Calling the OpenAI Chat Completions Endpoint from Code
Ollama – Run LLMs Locally
Installing and Running Llama and Gemma Models Using Ollama
Ollama integration using Semantic Kernel and C# with coding
Modernizing Enterprise Apps with AI-Powered LLM Capabilities
Designing the 'EShop Support App' with AI-Powered LLM Capabilities
LLM Augmentation Flow: Prompt Engineering -> RAG -> Fine-Tuning -> Trained Model
Prompt Engineering
What is a Prompt?
Elements and Roles of a Prompt
What is Prompt Engineering?
Steps of Designing Effective Prompts: Iterate, Evaluate and Templatize
Advanced Prompting Techniques
Zero-Shot Prompting
One-shot Prompting
Few-shot Prompting
Chain-of-Thought Prompting
Instruction-based and Role-based Prompting
Design Advanced Prompts for EShop Support – Classification, Sentiment Analysis
Design Advanced Prompts for Ticket Detail Page in EShop Support App w/ Q&A Chat
Test Prompts for EShop Support Customer Ticket w/ Playground
Prompt Engineering Quiz
Retrieval-Augmented Generation (RAG)
What is Retrieval-Augmented Generation (RAG)?
Why Use Retrieval-Augmented Generation (RAG)? Why is RAG Important?
How Does Retrieval-Augmented Generation (RAG) Work?
The RAG Architecture Part 1: Ingestion with Embeddings and Vector Search
The RAG Architecture Part 2: Retrieval with Reranking and Context Query Prompts
The RAG Architecture Part 3: Generation with Generator and Output
E2E Workflow of a Retrieval-Augmented Generation (RAG) - The RAG Workflow
Applications Use Cases of RAG
Challenges and Key Considerations of Using RAG (Retrieval-Augmented Generation)
Design EShop Customer Support using RAG
End-to-End RAG Example for EShop Customer Support using OpenAI Playground
RAG Quiz
Fine-Tuning LLMs
What is Fine-Tuning?
Why Use Fine-Tuning?
When to Use Fine-Tuning?
How Does Fine-Tuning Work?
Fine-Tuning Methods: Full, Parameter-Efficient Fine-Tuning (PEFT), LoRA
Applications & Use Cases of Fine-Tuning
Challenges and Key Considerations of Fine-Tuning
Design EShop Customer Support Using Fine-Tuning
End-to-End Fine-Tuning of an LLM for EShop Customer Support using OpenAI Playground
Fine-Tuning Quiz
Choosing the Right Optimization – Prompt Engineering, RAG, and Fine-Tuning
Comparison of Prompt Engineering, RAG, and Fine-Tuning
Training Own Model for LLM Optimization
Vector Databases and Semantic Search with RAG
What is a Vector Database?
What are Vectors and Vector Embeddings?
Explore Vector Embedding Models: OpenAI - text-embedding-3-small, Ollama minilm
Semantic Meaning and Similarity Search: Cosine Similarity, Euclidean Distance
How Vector Databases Work: Vector Creation, Indexing, Search
Vector Search Algorithms: kNN, ANN, and Disk-ANN
Use Cases and Applications of Vector Databases
Explore Vector Databases: Pinecone, Chroma, Weaviate, Qdrant, Milvus, PgVector
Vector Database Quiz
Design EShopSupport Architecture with LLMs and Vector Databases
Using LLMs and VectorDBs as Cloud-Native Backing Services in Microservices
Design EShop Support with LLMs, Vector Databases and Semantic Search
Azure Cloud AI Services: Azure OpenAI, Azure AI Search
Design EShop Support with Azure Cloud AI Services: Azure OpenAI, Azure AI Search
Thanks

Good to know

Know what's good, what to watch for, and possible dealbreakers
Covers the LLM augmentation flow, a framework that improves LLM results through prompt engineering, RAG, and fine-tuning, all essential for modern applications
Explores vector databases like Pinecone, Chroma, and Weaviate, which are crucial for building scalable and efficient AI-powered applications, especially those using semantic search
Includes hands-on experience with OpenAI Playground for RAG and fine-tuning, which allows learners to experiment and prototype AI solutions without extensive coding
Examines Azure Cloud AI Services like Azure OpenAI and Azure AI Search, which are valuable for deploying AI solutions in a cloud environment, especially for those using Microsoft technologies
Requires learners to interact with different LLMs using Chat UI and coding, which may require learners to create accounts and API keys with third-party services
Uses text-embedding-3-small, which may be outdated, as there may be newer embedding models available that offer better performance and efficiency for semantic search tasks


Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Generative AI Architectures with LLM, Prompt, RAG, Vector DB with these activities:
Review Transformer Architecture
Solidify your understanding of the foundational Transformer architecture. This will help you grasp the inner workings of LLMs, which are based on this architecture.
Browse courses on Transformer Architecture
  • Read research papers on Transformer architecture.
  • Watch video lectures explaining self-attention.
  • Implement a simplified Transformer in code.
Read 'Generative AI with Python and TensorFlow 2'
Learn how to implement generative AI models using Python and TensorFlow 2. This book will provide practical guidance on building and training various generative models.
  • Read the chapters on GANs, VAEs, and Transformers.
  • Experiment with the code examples provided in the book.
  • Adapt the code to your own projects.
Read 'Natural Language Processing with Transformers'
Gain a deeper understanding of Transformers and their applications in NLP. This book will provide practical insights into using Transformers for various generative AI tasks.
  • Read the chapters on Transformer architecture and attention mechanisms.
  • Experiment with the code examples provided in the book.
  • Apply the concepts learned to your own projects.
Experiment with Different Vector Databases
Gain hands-on experience with different vector databases. This will help you understand their strengths and weaknesses and choose the right database for your needs.
  • Set up accounts with Pinecone, Chroma, and Weaviate.
  • Load a sample dataset into each database.
  • Perform similarity searches and compare the results.
  • Evaluate the performance and scalability of each database.
Build a Simple RAG Pipeline
Practice building a Retrieval-Augmented Generation (RAG) pipeline. This hands-on project will solidify your understanding of RAG architecture and its components.
  • Choose a dataset of documents to use for retrieval.
  • Implement an embedding model to create vector representations of the documents.
  • Set up a vector database to store the embeddings.
  • Build a retrieval mechanism to find relevant documents based on a user query.
  • Integrate a generative model to generate a response based on the retrieved documents.
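
Before embedding the documents (step two above), they are usually split into chunks. A minimal character-window chunker might look like this; the size and overlap values are arbitrary assumptions:

```python
def chunk_text(text: str, size: int = 40, overlap: int = 10) -> list:
    """Split text into overlapping character windows.

    The overlap keeps a sentence that straddles a chunk boundary
    retrievable from either side. Production systems usually chunk
    by tokens or sentences rather than raw characters.
    """
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

chunks = chunk_text("Refunds are issued within 5 business days. "
                    "Contact support if the refund has not arrived.")
print(len(chunks), [len(c) for c in chunks])  # -> 3 [40, 40, 29]
```

Each chunk would then be passed to the embedding model and stored in the vector database alongside a reference to its source document.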
Write a Blog Post on Prompt Engineering
Reinforce your knowledge of prompt engineering techniques. Writing a blog post will force you to organize your thoughts and explain the concepts clearly.
  • Research different prompt engineering techniques.
  • Choose a specific technique to focus on.
  • Write a clear and concise explanation of the technique.
  • Provide examples of how to use the technique effectively.
  • Publish your blog post online.
Contribute to a RAG Open Source Project
Deepen your understanding of RAG by contributing to an open-source project. This will expose you to real-world challenges and best practices.
  • Find an open-source RAG project on GitHub.
  • Read the project's documentation and code.
  • Identify a bug or feature to work on.
  • Submit a pull request with your changes.

Career center

Learners who complete Generative AI Architectures with LLM, Prompt, RAG, Vector DB will develop knowledge and skills that may be useful to these careers:
Generative AI Engineer
A Generative AI Engineer specializes in developing and deploying systems that create new content using generative models. This course is exceptionally well-suited to this role, providing fundamental knowledge of generative AI architectures, including practical experience with large language models. The course covers prompt engineering, retrieval augmented generation, fine-tuning techniques, and vector databases, all essential for someone working with generative AI. The hands-on approach, including designing an EShop customer support application, enables a Generative AI Engineer to gain practical skills that can be directly applied to real-world applications. The practical skills gained in the prompt engineering and fine-tuning modules are essential.
Prompt Engineer
A Prompt Engineer is responsible for crafting effective prompts to elicit desired responses from large language models, and this course provides extensive training for this role. The course includes a module dedicated to prompt engineering which covers techniques like zero-shot, one-shot, few-shot, and chain-of-thought prompting. This enables the Prompt Engineer to design and iterate on prompts for various applications, and helps them understand how to evaluate their effectiveness. The course provides a foundation for a Prompt Engineer to create effective prompts for a variety of tasks, including text generation, summarization, and question answering.
AI Solutions Architect
An AI Solutions Architect designs and implements AI-driven systems, and this course directly aligns with that role by providing in-depth knowledge of generative AI architectures. The course covers integrating large language models into enterprise applications using prompt engineering, retrieval augmented generation, and vector databases, which are all crucial for building intelligent solutions. This course enables an AI Solutions Architect to understand how to design and optimize AI systems for real-world applications, particularly in customer support. The practical focus on building an EShop customer support application also translates directly to real world projects.
AI Application Developer
An AI Application Developer builds software applications that integrate AI capabilities, and this course directly supports that role. The course teaches how to incorporate large language models into real-world applications, using techniques like prompt engineering, retrieval augmented generation and fine-tuning. This course will enable an AI Application Developer to build intelligent applications that are capable of performing complex tasks using LLMs and related technologies. The focus on building an EShop support system provides hands-on experience with implementing those techniques in a practical business context.
Natural Language Processing Engineer
A Natural Language Processing Engineer focuses on developing systems that process and understand human language, and this course’s content directly aligns with that role. The course provides in-depth instruction on how large language models work, how to use them for various tasks, and how to fine-tune them for specific applications. A Natural Language Processing Engineer would find the modules on prompt engineering, RAG, and vector databases particularly useful. The course covers how to use these technologies to handle and process text data. This course helps a Natural Language Processing Engineer build systems that understand the nuances of human communication.
Machine Learning Engineer
A Machine Learning Engineer focuses on building and maintaining machine learning models and systems, and this course is highly relevant to that. This course teaches the practical aspects of working with large language models, including fine-tuning, prompt engineering, and utilizing vector databases, all of which are key for machine learning systems. The course’s detailed exploration of different LLM providers, model selection, and optimization strategies directly helps a Machine Learning Engineer build and deploy effective AI solutions. The emphasis on RAG and vector databases is particularly useful for building systems that can interact with and learn from large datasets.
Data Scientist
A Data Scientist uses data to derive insights and build predictive models; this course may be useful in that context. The course covers topics like vector databases and semantic search, which are valuable tools for data analysis. A Data Scientist can use these to enhance models, derive deeper meaning and explore relationships within data. The course's content on utilizing embeddings for semantic search and building retrieval systems can be directly applied to data analysis tasks. While much of this course focuses on the engineering side, the skills related to the analysis of LLM outputs can enhance the Data Scientist's toolkit.
Software Engineer
A Software Engineer develops software applications, and this course may be useful to that role, especially for those working on AI-powered features. The focus on integrating large language models into applications and using vector databases aligns with this area of software engineering. A Software Engineer can use this course to learn more about these cutting edge technologies. The software engineer benefits from understanding how to use these tools to build more intelligent and effective applications. The course provides practical experience and provides some of the tools used to integrate AI functionality into software projects.
AI Research Scientist
An AI Research Scientist explores theoretical and applied aspects of artificial intelligence, and this course may be useful to that role. The course provides a strong practical understanding of generative AI architectures, including large language models. An AI Research Scientist can benefit from the hands-on experience with LLMs and the various techniques for optimizing them. The research scientist can also learn by gaining exposure to prompt engineering, RAG, and fine-tuning, all of which inform future research. While this course is more focused on applied skills, it can enhance their understanding of current techniques and technologies.
Cloud Solutions Architect
A Cloud Solutions Architect designs and implements cloud-based systems, and this course may be useful to those who work with AI services on the cloud. The course covers how to integrate LLMs and vector databases into cloud architectures. A Cloud Solutions Architect can use this knowledge to design scalable and efficient cloud solutions that leverage AI capabilities. The course also includes material on Azure AI services, which is particularly relevant to cloud deployments leveraging Microsoft. The design of cloud native backing services is also covered, giving the Cloud Solutions Architect practical experience.
Technical Lead
A Technical Lead is responsible for guiding technical teams and projects, and this course may be useful for those who manage AI-focused projects. The course provides a broad understanding of how to integrate large language models with other systems via techniques like prompt engineering, RAG and fine-tuning. This helps the Technical Lead guide projects that are integrating those technologies. The course helps a Technical Lead understand the different optimization strategies for LLMs and how to use vector databases to improve performance and capabilities.
Data Analyst
A Data Analyst examines data to identify trends and insights that help inform business decisions. This course may be helpful as it includes topics like semantic search on vector databases, which can help organize data. The ability to use AI to augment data analysis can help a Data Analyst work faster. This course also includes an overview of how embeddings are used in semantic search, which is relevant for data analysis and retrieval. This course could be a useful way to improve a Data Analyst's capabilities.
Business Intelligence Developer
A Business Intelligence Developer designs and implements systems that analyze business data. This course may be useful to those exploring the role of AI in business intelligence. The course includes material on how to use LLMs to process and interact with text data. This can enhance data analysis capabilities and help to derive more meaningful insights. While this course does not directly teach business intelligence techniques, it can enhance a Business Intelligence Developer's capabilities by teaching how to integrate AI into existing technology.
Database Administrator
A Database Administrator manages databases. This course may be useful for a Database Administrator to gain a better understanding of vector databases. This course will help a Database Administrator understand how semantic search works and how vector databases can be used to augment existing databases. This may lead to new opportunities within a Database Administrator's role.
Support Specialist
A Support Specialist provides customer support and this course may be useful to that role. The course provides an overview of how large language models work and how they can be used to provide customer support. This can help a Support Specialist leverage AI to improve their capabilities and overall support processes. The design of an EShop support application will be directly applicable to a Support Specialist who wishes to incorporate AI into their day to day workload. The information in this course is helpful in understanding how AI can be used in customer support.

Reading list

We've selected two books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Generative AI Architectures with LLM, Prompt, RAG, Vector DB.
Provides a comprehensive guide to using Transformers for NLP tasks. It covers the Transformer architecture in detail, including attention mechanisms and encoder-decoder structures. It also explores various applications of Transformers, such as text generation, summarization, and question answering. This book is valuable as a reference for understanding the practical aspects of using Transformers in Generative AI.
Provides a practical guide to building generative AI models using Python and TensorFlow 2. It covers various generative models, including GANs, VAEs, and Transformers. It also explores different applications of generative AI, such as image generation, text generation, and music generation. This book is valuable for learning how to implement generative AI models in code.


Affiliate disclosure

We're supported by our community of learners. When you purchase or subscribe to courses and programs or purchase books, we may earn a commission from our partners.

Your purchases help us maintain our catalog and keep our servers humming without ads.

Thank you for supporting OpenCourser.

© 2016 - 2025 OpenCourser