Younes Belkada, Marc Sun, and Maria Khalusova

The availability of models and their weights for anyone to download enables a broader range of developers to innovate and create.

In this course, you’ll select open source models from Hugging Face Hub to perform NLP, audio, image and multimodal tasks using the Hugging Face transformers library. Easily package your code into a user-friendly app that you can run on the cloud using Gradio and Hugging Face Spaces.

You will:

1. Use the transformers library to turn a small language model into a chatbot capable of multi-turn conversations to answer follow-up questions.

2. Translate between languages, summarize documents, and measure the similarity between two pieces of text, which can be used for search and retrieval.

3. Convert audio to text with Automatic Speech Recognition (ASR), and convert text to audio using Text to Speech (TTS).

4. Perform zero-shot audio classification, labeling audio clips without fine-tuning the model.

5. Generate an audio narration describing an image by combining object detection and text-to-speech models.

6. Select objects or regions in an image by prompting a zero-shot image segmentation model with points that mark the object you want.

7. Implement visual question answering, image search, image captioning and other multimodal tasks.

8. Share your AI app using Gradio and Hugging Face Spaces to run your applications in a user-friendly interface on the cloud or as an API.

The course will provide you with the building blocks that you can combine into a pipeline to build your AI-enabled applications!
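
As an illustration of item 2 above, text similarity for search and retrieval is commonly computed as the cosine similarity between embedding vectors. The following is a minimal sketch under that assumption, not the course's own code: the three-dimensional vectors are made up for illustration, and in practice they would come from an embedding model of your choice.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vec, doc_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Toy "embeddings" (hypothetical values, for illustration only).
query = [1.0, 0.0, 1.0]
documents = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 2.0],  # same direction as the query
    [1.0, 1.0, 0.0],  # partially aligned with the query
]
ranking = rank_by_similarity(query, documents)
```

Cosine similarity ignores vector length, so the second document ranks first even though its magnitude differs from the query's.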

What's inside

Syllabus

Open Source Models with Hugging Face

Good to know

Know what's good, what to watch for, and possible dealbreakers
Develops core Natural Language Processing (NLP), audio, image, and multimodal skills
Provides hands-on experience with model deployment and packaging
Utilizes Gradio and Hugging Face Spaces for seamless app sharing and API implementation
Taught by renowned Hugging Face instructors with expertise in NLP, audio, and image processing
May not be suitable for complete beginners in NLP, audio, or image processing
Requires basic knowledge of Python programming

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Open Source Models with Hugging Face with these activities:
Review fundamentals of signal processing for audio applications
A strong foundation in signal processing is essential for audio-related tasks. This activity refreshes the concepts you will rely on when preparing audio for ASR, TTS, and audio classification models.
  • Read through the Hugging Face documentation on audio processing
  • Review basic concepts of signal processing, such as sampling, quantization, and filtering
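
The three concepts named in the steps above can be sketched in plain Python. This is an illustrative refresher only: the 16 kHz sample rate, the 440 Hz tone, and the moving-average filter are assumptions chosen for the example, so check the sample rate your chosen model actually expects.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, duration_s):
    """Sampling: evaluate a unit-amplitude sine wave at evenly spaced times."""
    n = round(sample_rate_hz * duration_s)
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate_hz) for t in range(n)]

def quantize(samples, bits):
    """Quantization: map samples in [-1, 1] to signed integers of a given bit depth."""
    max_level = 2 ** (bits - 1) - 1
    return [round(s * max_level) for s in samples]

def moving_average(samples, window):
    """Filtering: a simple low-pass filter that averages each sample with its predecessors."""
    out = []
    for i in range(len(samples)):
        start = max(0, i - window + 1)
        chunk = samples[start:i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

# 10 ms of a 440 Hz tone sampled at an assumed rate of 16 kHz.
signal = sample_sine(freq_hz=440, sample_rate_hz=16_000, duration_s=0.01)
pcm16 = quantize(signal, bits=16)
smoothed = moving_average(signal, window=4)
```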
Practice language translation using the transformers library
Hands-on practice with the transformers library helps solidify your understanding of language translation techniques.
  • Translate a short paragraph from English to French in Python using the transformers library
  • Explore different translation models and compare their accuracy
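
The first step above might look like the sketch below. It assumes the `transformers` library is installed and uses `Helsinki-NLP/opus-mt-en-fr` as an example model; that model choice is an assumption for illustration, not necessarily the one the course uses. The helper names `build_translator` and `translate_paragraph` are hypothetical.

```python
import re

def build_translator(model_name="Helsinki-NLP/opus-mt-en-fr"):
    """Load an English-to-French translation pipeline (downloads model weights)."""
    from transformers import pipeline  # imported lazily: heavy dependency
    return pipeline("translation_en_to_fr", model=model_name)

def translate_paragraph(paragraph, translator):
    """Translate a paragraph sentence by sentence and rejoin the results.

    `translator` is any callable that takes a string and returns
    [{"translation_text": ...}], the shape a translation pipeline returns.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]
    translated = [translator(s)[0]["translation_text"] for s in sentences]
    return " ".join(translated)

# Example usage (requires transformers and a model download):
#   translator = build_translator()
#   print(translate_paragraph("Hello there. How are you today?", translator))
```

Passing the translator in as an argument makes it easy to swap models for the second step and compare their outputs on the same paragraph.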
Build an image classification model with Hugging Face Hub
This activity provides practical experience in model deployment and serving, enhancing your understanding of the model deployment process.
  • Follow the Hugging Face tutorial on deploying a model to Hugging Face Hub
  • Deploy your own image classification model to Hugging Face Hub
Develop a multimodal AI application using Gradio
Creating a multimodal AI application allows you to apply your knowledge in a practical project, solidifying your understanding of multimodal AI.
  • Design the user interface for your application using Gradio
  • Integrate multiple AI models into your application
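
One possible shape for such an application is sketched below, assuming the `gradio` package is installed. The two stub functions are hypothetical placeholders standing in for real captioning and visual question answering models, and `make_handler` and `build_demo` are illustrative names rather than part of any library.

```python
def make_handler(captioner, answerer):
    """Combine two models into one function suitable for a Gradio interface."""
    def handler(image, question):
        caption = captioner(image)
        # Only query the second model when the user actually asked something.
        answer = answerer(image, question) if question.strip() else ""
        return caption, answer
    return handler

def build_demo(handler):
    """Wrap the handler in a Gradio UI: image and question in, two text boxes out."""
    import gradio as gr  # imported lazily so the handler can be tested without it
    return gr.Interface(
        fn=handler,
        inputs=[gr.Image(type="pil", label="Image"), gr.Textbox(label="Question")],
        outputs=[gr.Textbox(label="Caption"), gr.Textbox(label="Answer")],
        title="Image caption and question answering",
    )

# Placeholder models (hypothetical): swap in real captioning and VQA models.
def stub_captioner(image):
    return "a placeholder caption"

def stub_answerer(image, question):
    return f"placeholder answer to: {question}"

handler = make_handler(stub_captioner, stub_answerer)
# To launch locally: build_demo(handler).launch()
```

Keeping the model logic in a plain function, separate from the interface, lets you test the integration without starting a web server.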

Similar courses

Here are nine courses similar to Open Source Models with Hugging Face:
  • Building Generative AI-Powered Applications with Python
  • Building Multimodal Search and RAG
  • Developing Generative AI Applications with Python
  • Using Neural Networks for Image and Voice Data Analysis
  • OpenAI Transcription API
  • DevOps, DataOps, MLOps
  • It Speaks! Create Synthetic Speech Using Cloud Text-to...
  • Create video, audio and infographics for online learning
  • Deploy A Microsoft Azure Speech To Text Web App

© 2016 - 2024 OpenCourser