We may earn an affiliate commission when you visit our partners.
Google Cloud Training

This is a self-paced lab that takes place in the Google Cloud console. In addition to batch pipelines, Data Fusion also lets you create real-time pipelines that process events as they are generated. Currently, real-time pipelines execute using Apache Spark Streaming on Cloud Dataproc clusters. In this lab, you will learn how to build a streaming pipeline using Data Fusion.
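
To make the streaming model more concrete, here is a minimal sketch of micro-batching, the model Spark Streaming is built on: events are grouped into short fixed-length time intervals and each group is processed as a small batch. This is plain Python for illustration only, not Data Fusion or Spark code; the `Event` type, the `micro_batches` function, and the 5-second interval are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Event:
    """A hypothetical event with an arrival time in seconds and a payload."""
    timestamp: float
    payload: str


def micro_batches(events: Iterable[Event], interval: float) -> Dict[int, List[Event]]:
    """Group events into fixed-length intervals keyed by batch index.

    An event with timestamp t falls into batch int(t // interval).
    """
    batches: Dict[int, List[Event]] = defaultdict(list)
    for event in events:
        batches[int(event.timestamp // interval)].append(event)
    return dict(batches)


# Illustrative input: four events spread over about twelve seconds.
events = [
    Event(0.5, "a"),
    Event(4.9, "b"),
    Event(5.0, "c"),
    Event(11.2, "d"),
]

# Process each 5-second batch in order, one interval at a time.
for index, batch in sorted(micro_batches(events, interval=5.0).items()):
    print(index, [e.payload for e in batch])
```

A shorter interval lowers latency at the cost of more, smaller batches; the right value depends on the workload.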

What's inside

Syllabus

Building Realtime Pipelines in Cloud Data Fusion

Good to know

Know what's good, what to watch for, and possible dealbreakers
Teaches real-time pipelines using Apache Spark Streaming on Cloud Dataproc clusters, a widely used approach to real-time analytics
Builds a strong foundation for beginners to develop professional skills or deep expertise in real-time data analysis
Taught by Google Cloud Training, who are recognized for their work in cloud computing
May require prior experience with Apache Spark and Cloud Dataproc

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Building Realtime Pipelines in Cloud Data Fusion with these activities:
Review Data Fusion and BigQuery concepts
Review key concepts of Data Fusion and BigQuery to strengthen your foundation for this course.
  • Review the documentation on Data Fusion and BigQuery
  • Complete the quickstart tutorials for both services
  • Create a sample data pipeline using Data Fusion and BigQuery
Review BigQuery SQL
This activity will strengthen your understanding of BigQuery SQL, which is a critical skill for working with large datasets in Google Cloud.
  • Review the BigQuery SQL documentation
  • Complete the BigQuery SQL tutorial
  • Practice writing BigQuery SQL queries
Form a study group with classmates
Join or form a study group to collaborate with classmates, discuss concepts, and prepare for assessments.
  • Identify classmates interested in forming a study group
  • Establish regular meeting times and a communication channel
  • Prepare study materials and discussion topics
Attend a meetup on Data Fusion or streaming data
Attend a meetup to connect with other professionals in the field and learn about the latest trends and best practices.
  • Find a meetup in your area
  • Attend the meetup and engage with other attendees
Build a Data Pipeline Using Cloud Data Fusion
This activity will provide hands-on experience with Cloud Data Fusion, which is a fully managed data integration service.
  • Follow the Cloud Data Fusion quickstart guide
  • Create a simple data pipeline
  • Deploy the data pipeline
Build a streaming data pipeline
Build a streaming data pipeline using the techniques learned in this course to practice and reinforce your understanding.
  • Create a source and sink dataset in BigQuery
  • Create a streaming pipeline in Data Fusion using Apache Spark Streaming
  • Test the pipeline using a streaming data source
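
The three steps above follow a common source, transform, sink shape. As a rough sketch of that shape, independent of Data Fusion, Spark, or BigQuery, the following plain-Python example wires a generator source through a transform into an in-memory sink. All names (`source`, `parse`, `ListSink`, `run_pipeline`) and the comma-separated record format are hypothetical; a real pipeline would be assembled in Data Fusion rather than written like this.

```python
from typing import Dict, Iterable, Iterator, List


def source(lines: Iterable[str]) -> Iterator[str]:
    """Stand-in for a streaming source: yields raw records one at a time."""
    for line in lines:
        yield line


def parse(records: Iterable[str]) -> Iterator[Dict[str, object]]:
    """Transform step: turn 'user,amount' text into a structured row."""
    for record in records:
        user, amount = record.split(",")
        yield {"user": user.strip(), "amount": float(amount)}


class ListSink:
    """Stand-in for a sink table: collects rows in memory."""

    def __init__(self) -> None:
        self.rows: List[Dict[str, object]] = []

    def write(self, row: Dict[str, object]) -> None:
        self.rows.append(row)


def run_pipeline(lines: Iterable[str], sink: ListSink) -> None:
    """Pull records through the transform and write each one to the sink."""
    for row in parse(source(lines)):
        sink.write(row)


# Illustrative run with two hand-written records.
sink = ListSink()
run_pipeline(["alice, 10.5", "bob, 3"], sink)
print(sink.rows)
```

Because each stage is a generator, records flow through one at a time, which is a convenient way to test transform logic against a small sample before pointing it at a live stream.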
Create a blog post or article on streaming data pipelines
Create a blog post or article to share your knowledge of streaming data pipelines, solidifying your understanding and potentially helping others.
  • Choose a topic related to streaming data pipelines
  • Research and gather information on the topic
  • Create an outline and write the content
  • Edit and publish your blog post or article
Cloud Data Fusion Scenarios
This activity will provide hands-on experience with Cloud Data Fusion in realistic scenarios.
  • Analyze a data pipeline and identify potential issues
  • Troubleshoot a data pipeline
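
One common source of pipeline issues is malformed input records. As an illustrative sketch only (plain Python, not a Data Fusion feature), the hypothetical `split_valid_and_errors` function below separates records that parse cleanly from those that do not, keeping the failure reason so the bad records can be inspected instead of silently dropped.

```python
from typing import List, Tuple


def split_valid_and_errors(
    records: List[str],
) -> Tuple[List[float], List[Tuple[str, str]]]:
    """Parse each record as a number; collect failures with their reason."""
    valid: List[float] = []
    errors: List[Tuple[str, str]] = []
    for record in records:
        try:
            valid.append(float(record))
        except ValueError as exc:
            # Keep the original record alongside the error message.
            errors.append((record, str(exc)))
    return valid, errors


# Illustrative input with one record that cannot be parsed.
valid, errors = split_valid_and_errors(["1.5", "oops", "42"])
print(valid)
for record, reason in errors:
    print(f"rejected {record!r}: {reason}")
```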
Create a presentation on streaming data pipelines
Create a presentation to demonstrate your knowledge of streaming data pipelines and practice your presentation skills.
  • Outline the key concepts of streaming data pipelines
  • Create slides that explain the architecture and components of a streaming data pipeline
  • Practice delivering the presentation
Build an end-to-end data pipeline project
Undertake a comprehensive project to design and implement a complete data pipeline, integrating the skills and knowledge acquired in this course.
  • Define the scope and objectives of the project
  • Gather and prepare the necessary data
  • Design and implement the data pipeline
  • Test and evaluate the pipeline
  • Deploy and monitor the pipeline

Career center

Learners who complete Building Realtime Pipelines in Cloud Data Fusion will develop knowledge and skills that may be useful to these careers:
Data Engineer
A Data Engineer can use the skills they learn in this course to build and maintain real-time data pipelines. These pipelines can be used to process large amounts of data in real time, which can help businesses make better decisions and respond to changing conditions more quickly. This course can help Data Engineers gain the skills they need to be successful in their careers by providing them with hands-on experience in building real-time data pipelines.
Data Analyst
A Data Analyst can use the skills they learn in this course to create and implement real-time data pipelines. These pipelines can be used to analyze data in real time, which can help businesses make better decisions and respond to changing conditions more quickly. This course can help Data Analysts gain the skills they need to be successful in their careers by providing them with hands-on experience in building real-time data pipelines.
Data Scientist
A Data Scientist can use the skills they learn in this course to develop and implement real-time data analysis models. These models can be used to analyze data in real time and identify trends and patterns that can help businesses make better decisions. This course can help Data Scientists gain the skills they need to be successful in their careers by providing them with hands-on experience in building real-time data analysis models.
Software Engineer
A Software Engineer can use the skills they learn in this course to develop and implement real-time data processing systems. These systems can be used to process large amounts of data in real time, which can help businesses make better decisions and respond to changing conditions more quickly. This course can help Software Engineers gain the skills they need to be successful in their careers by providing them with hands-on experience in building real-time data processing systems.
Cloud Architect
A Cloud Architect can use the skills they learn in this course to design and implement real-time data pipelines. These pipelines can be used to process large amounts of data in real time, which can help businesses make better decisions and respond to changing conditions more quickly. This course can help Cloud Architects gain the skills they need to be successful in their careers by providing them with hands-on experience in designing and implementing real-time data pipelines.
DevOps Engineer
A DevOps Engineer can use the skills they learn in this course to deploy and manage real-time data pipelines. These pipelines can be used to process large amounts of data in real time, which can help businesses make better decisions and respond to changing conditions more quickly. This course can help DevOps Engineers gain the skills they need to be successful in their careers by providing them with hands-on experience in deploying and managing real-time data pipelines.
Data Quality Analyst
A Data Quality Analyst can use the skills they learn in this course to ensure that data is accurate and consistent in real time. This can help businesses make better decisions and respond to changing conditions more quickly. This course can help Data Quality Analysts gain the skills they need to be successful in their careers by providing them with hands-on experience in ensuring that data is accurate and consistent in real time.
Business Analyst
A Business Analyst can use the skills they learn in this course to analyze data in real time and identify trends and patterns that can help businesses make better decisions. This course can help Business Analysts gain the skills they need to be successful in their careers by providing them with hands-on experience in analyzing data in real time.
Data Integration Specialist
A Data Integration Specialist can use the skills they learn in this course to integrate data from multiple sources in real time. This can help businesses make better decisions and respond to changing conditions more quickly. This course can help Data Integration Specialists gain the skills they need to be successful in their careers by providing them with hands-on experience in integrating data from multiple sources in real time.
Data Management Analyst
A Data Management Analyst can use the skills they learn in this course to analyze and manage data in real time. This can help businesses make better decisions and respond to changing conditions more quickly. This course can help Data Management Analysts gain the skills they need to be successful in their careers by providing them with hands-on experience in analyzing and managing data in real time.
Database Administrator
A Database Administrator can use the skills they learn in this course to manage and maintain databases in real time. This can help businesses make better decisions and respond to changing conditions more quickly. This course can help Database Administrators gain the skills they need to be successful in their careers by providing them with hands-on experience in managing and maintaining databases in real time.
Data Warehouse Analyst
A Data Warehouse Analyst can use the skills they learn in this course to analyze data in a data warehouse and identify trends and patterns that can help businesses make better decisions. This course can help Data Warehouse Analysts gain the skills they need to be successful in their careers by providing them with hands-on experience in analyzing data in a data warehouse.
Cloud Data Engineer
A Cloud Data Engineer can use the skills they learn in this course to build and manage data pipelines in the cloud. These pipelines can be used to process large amounts of data in real time, which can help businesses make better decisions and respond to changing conditions more quickly. This course can help Cloud Data Engineers gain the skills they need to be successful in their careers by providing them with hands-on experience in building and managing data pipelines in the cloud.
Systems Analyst
A Systems Analyst can use the skills they learn in this course to analyze and design systems that process data in real time. This can help businesses make better decisions and respond to changing conditions more quickly. This course can help Systems Analysts gain the skills they need to be successful in their careers by providing them with hands-on experience in analyzing and designing systems that process data in real time.
Information Security Analyst
An Information Security Analyst can use the skills they learn in this course to protect data in real time. This can help businesses make better decisions and respond to changing conditions more quickly. This course can help Information Security Analysts gain the skills they need to be successful in their careers by providing them with hands-on experience in protecting data in real time.

Reading list

We've selected nine books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Building Realtime Pipelines in Cloud Data Fusion.
Provides a hands-on guide to using Apache Spark for real-time data processing. It covers topics such as Spark Streaming, Spark SQL, and Spark MLlib.
Provides a comprehensive overview of Spark. It covers the basics of Spark, as well as more advanced topics such as Spark SQL and Spark Streaming.
Provides a comprehensive guide to using AWS for real-time machine learning. It covers topics such as Amazon SageMaker, Amazon Kinesis, and Amazon EMR.
Provides a comprehensive overview of the data mesh architecture. It covers the benefits of the data mesh architecture, as well as how to implement it in your organization.
Provides a comprehensive overview of data-intensive text processing with MapReduce. It covers the basics of MapReduce, as well as more advanced topics such as natural language processing and machine learning.
Provides a comprehensive overview of deep learning with Python. It covers the basics of deep learning, as well as more advanced topics such as convolutional neural networks and recurrent neural networks.
Provides a comprehensive overview of Spark. It covers topics such as Spark architecture, data processing, and data analysis.
Provides a comprehensive overview of designing data-intensive applications. It covers topics such as data modeling, data storage, and data processing.

Similar courses

Here are nine courses similar to Building Realtime Pipelines in Cloud Data Fusion.
Building Resilient Streaming Analytics Systems on Google...
Most relevant
Building Advanced Codeless Pipelines on Cloud Data Fusion
Most relevant
Building Resilient Streaming Analytics Systems on Google...
Most relevant
Data Engineering using Kafka and Spark Structured...
Most relevant
Handling Streaming Data with Azure Databricks Using Spark...
Most relevant
Exploring the Apache Beam SDK for Modeling Streaming Data...
Most relevant
Building Batch Data Pipelines on Google Cloud
Most relevant
Redacting Confidential Data within your Pipelines in...
Most relevant
Conceptualizing the Processing Model for the GCP Dataflow...
Most relevant
© 2016 - 2024 OpenCourser