Building Batch Data Pipelines on GCP em Português Brasileiro

Google Cloud Training

Data pipelines typically fall under one of three paradigms: Extract and Load (EL), Extract, Load and Transform (ELT), or Extract, Transform and Load (ETL). This course describes which paradigm should be used, and when, for batch data. It also covers several Google Cloud technologies for data transformation, including BigQuery, running Spark on Dataproc, pipeline graphs in Cloud Data Fusion, and serverless data processing with Dataflow. Learners get hands-on experience building data pipeline components on Google Cloud using Qwiklabs.

What's inside

Syllabus

Introduction

In this module, we introduce the course and its agenda.

Introduction to building batch data pipelines

This module reviews different methods of loading data (EL, ELT, and ETL) and when to use each.
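
The difference between the three loading methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for a warehouse such as BigQuery, and the table names, sample rows, and the `clean` helper are hypothetical, not taken from the course.

```python
import sqlite3
from typing import Optional

# Hypothetical raw input: amounts arrive as untidy strings.
RAW_ROWS = [("alice", " 10 "), ("bob", "n/a"), ("carol", "7")]


def clean(amount: str) -> Optional[int]:
    """Parse an amount string; return None when it is not a plain number."""
    text = amount.strip()
    return int(text) if text.isdigit() else None


def run_el(conn: sqlite3.Connection, rows) -> None:
    """EL: load the data as-is, with no transformation step."""
    conn.execute("CREATE TABLE el_sales (name TEXT, amount TEXT)")
    conn.executemany("INSERT INTO el_sales VALUES (?, ?)", rows)


def run_etl(conn: sqlite3.Connection, rows) -> None:
    """ETL: transform in the pipeline first, then load only the clean rows."""
    conn.execute("CREATE TABLE etl_sales (name TEXT, amount INTEGER)")
    cleaned = [(name, clean(amount)) for name, amount in rows]
    valid = [row for row in cleaned if row[1] is not None]
    conn.executemany("INSERT INTO etl_sales VALUES (?, ?)", valid)


def run_elt(conn: sqlite3.Connection, rows) -> None:
    """ELT: load the raw data, then transform it with SQL inside the store."""
    conn.execute("CREATE TABLE raw_sales (name TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_sales VALUES (?, ?)", rows)
    conn.execute(
        """
        CREATE TABLE elt_sales AS
        SELECT name, CAST(TRIM(amount) AS INTEGER) AS amount
        FROM raw_sales
        WHERE TRIM(amount) <> ''
          AND TRIM(amount) NOT GLOB '*[^0-9]*'
        """
    )


def rows_of(conn: sqlite3.Connection, table: str):
    """Return all rows of a table, ordered by name, for comparison."""
    return conn.execute(
        f"SELECT name, amount FROM {table} ORDER BY name"
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    run_el(conn, RAW_ROWS)
    run_etl(conn, RAW_ROWS)
    run_elt(conn, RAW_ROWS)
    for table in ("el_sales", "etl_sales", "elt_sales"):
        print(table, rows_of(conn, table))
```

In this sketch, ETL and ELT end with the same cleaned table; what differs is where the transformation runs (in pipeline code before loading, or as SQL inside the store after loading). EL keeps the data exactly as it arrived.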

Traffic lights

Read about what's good, what should give you pause, and possible dealbreakers.

Teaches several methods of loading batch data

Teaches when to use EL, ELT, and ETL

Offers hands-on experience with Google Cloud technologies such as BigQuery, Dataproc, and Dataflow

Teaches how to manage data pipelines with Cloud Data Fusion and Cloud Composer

Offers hands-on activities in Qwiklabs, letting learners apply the concepts

Reviews summary

Batch data pipelines on GCP

According to learners, this course provides a solid, practical foundation for building batch data pipelines on GCP. Many praise the clarity of the explanations and the impeccable teaching, and they highlight the Qwiklabs labs as a strong point for reinforcing the content. The course's availability in Brazilian Portuguese is widely appreciated. However, some students with less cloud experience found the pace a little fast, suggesting that prior cloud knowledge may be beneficial. There are also mentions that certain topics, such as Dataproc and Cloud Data Fusion, could be covered in more depth or may need updating.

Allows a smoother, more direct understanding for native speakers.

"The clear Portuguese is a differentiator."

"The native language is a big bonus."

"Very satisfied! ... The native language is a big bonus."

Provide valuable hands-on experience with GCP tools.

"The Qwiklabs labs are very practical and help a lot in reinforcing the content."

"The teaching is impeccable, and the practical examples with Dataflow were extremely useful."

"The practical exercises are the strong point... The labs gave me confidence to apply what I learned."

Some topics could be more detailed or up to date.

"I feel that some topics could go deeper, especially Dataproc."

"I just think the Cloud Data Fusion section could have more detail."

"The course covers the essential topics, but some modules seem outdated or superficial, mainly on Dataproc."

May be fast for beginners without a prior cloud background.

"Relevant content, but I found it a little fast for someone with no prior cloud experience."

"I had to pause and research quite a bit. The labs are good, but they require some familiarity."

"Those who already have some experience may find it lacks depth... For beginners, it is a challenge."

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Building Batch Data Pipelines on GCP em Português Brasileiro with these activities:

Organize and Review Course Resources

Organize and review course materials, ensuring that you have a comprehensive understanding of the topics covered, maximizing retention and recall.

Gather lecture notes, slides, and assignments.
Review materials regularly to reinforce understanding.

Participate in Q&A Forums

Contribute to the course community by actively participating in discussion forums, answering questions, and sharing your insights, solidifying your understanding while assisting others.

Join relevant Q&A forums or discussion groups.
Monitor forums for questions related to course topics.
Provide thoughtful and accurate responses.

Practice SQL skills on Qwiklabs

Completing SQL drills will help you reinforce foundational SQL skills and improve your ability to transform data effectively.

Follow the instructions in the Qwiklabs SQL course.
Complete the SQL exercises in the course.
Review the solutions and explanations.

Follow a tutorial on using the Cloud Data Fusion GUI

By following a tutorial, you'll gain practical experience in using the Cloud Data Fusion interface, which is essential for managing pipelines.

Choose a tutorial on the Cloud Data Fusion website or YouTube.
Follow the instructions in the tutorial.
Create a simple pipeline using the GUI.

Hands-on Hadoop and Spark Exercises

Practice essential Hadoop and Spark techniques within the Dataproc environment, solidifying your understanding of data processing frameworks.

Configure and launch a Hadoop cluster on Dataproc.
Run a Spark job to process data stored in Cloud Storage.
Optimize your Spark job configurations for performance.

Explore Dataflow Documentation and Tutorials

Expand your knowledge of Dataflow by exploring official documentation and tutorials, gaining a deeper understanding of its capabilities for serverless data processing.

Review Dataflow documentation to understand its architecture and key features.
Follow hands-on tutorials to build and deploy Dataflow pipelines.

Attend a Cloud Data Fusion Workshop

Enhance your understanding of Cloud Data Fusion by attending a workshop, where you'll gain hands-on experience and expert insights into managing data pipelines.

Register for a Cloud Data Fusion workshop.
Attend the workshop and actively participate.
Follow up after the workshop to practice and reinforce your learnings.

Write a blog post summarizing the key concepts of data transformation

Writing a blog post will help you consolidate your understanding of data transformation concepts and share your knowledge with others.

Research and gather information on data transformation.
Organize your thoughts and create an outline.
Write the blog post.
Revise and edit your post.
Publish your blog post.

Design a Data Pipeline Architecture

Apply your knowledge to create a comprehensive data pipeline architecture, demonstrating your ability to design and plan data processing solutions.

Define the data sources, transformations, and destination for your pipeline.
Select appropriate technologies and services for each pipeline component.
Document your design and share it for review.

Build a Real-World Data Pipeline

Challenge yourself by building a data pipeline that addresses a real-world problem, demonstrating your ability to apply course concepts and solve practical data challenges.

Identify a business problem that can be solved with a data pipeline.
Gather and prepare the necessary data.
Design and implement a data pipeline to process and analyze the data.
Deploy and monitor your pipeline.

Career center

Learners who complete Building Batch Data Pipelines on GCP em Português Brasileiro will develop knowledge and skills that may be useful to these careers:

ETL Developer

ETL Developers design and build data pipelines that extract, transform, and load data from one system to another. They use their understanding of data pipelines and data integration tools to create solutions that can help organizations move data between different systems.

Data Pipeline Architect

Data Pipeline Architects design and build data pipelines for organizations. They use their understanding of data pipelines and data engineering principles to create solutions that can help organizations move data between different systems and applications.

Cloud Data Engineer

Cloud Data Engineers are responsible for designing, building, and managing data pipelines in the cloud. They use their understanding of cloud computing technologies and data engineering principles to create solutions that can help organizations move data to and from the cloud.

Big Data Engineer

Big Data Engineers are responsible for designing, building, and managing big data pipelines. They use their understanding of big data technologies and data engineering principles to create solutions that can help organizations store, process, and analyze large volumes of data.

Data Warehouse Engineer

Data Warehouse Engineers are responsible for designing, building, and managing data warehouses. They use their understanding of data warehousing technologies and data engineering principles to create solutions that can help organizations store and analyze large volumes of data.

Data Lake Engineer

Data Lake Engineers are responsible for designing, building, and managing data lakes. They use their understanding of data lake technologies and data engineering principles to create solutions that can help organizations store and process large volumes of data in its raw format.

Data Management Consultant

Data Management Consultants help organizations improve their data management practices. They use their understanding of data management principles and technologies to help organizations define and implement data management strategies.

Data Engineer

Data Engineers use their mastery of big data tools and infrastructure to design, build, and maintain data management systems. Enterprises hire Data Engineers to help make sense of their massive volumes of data and to find ways to leverage it. This course may be useful for building a foundation in working with data pipelines, which is part of the core work of a Data Engineer.

Database Administrator

Database Administrators use their understanding of database technologies to design, implement, and maintain database management systems. They are responsible for ensuring that their organization has fast, secure, and reliable access to its data. This course may be useful for building a foundation in pipeline data processing, which can be used in database management systems.

Data Scientist

Data Scientists use their expertise in statistics, machine learning, and data mining to extract insights from data. They use these insights to help organizations make better decisions. This course may be useful for building a foundation in data pipelines, which are a core part of the Data Scientist's toolkit.

Data Analyst

Data Analysts gather, analyze, interpret, and present data to help organizations make informed decisions. They use their understanding of data and statistical techniques to identify trends, patterns, and insights. This course may be useful for building a foundation in data pipelines.

Data Integration Engineer

Data Integration Engineers design and build data integration solutions. They use their understanding of data integration tools and technologies to create solutions that can help organizations integrate data from multiple sources into a single, unified view.

Machine Learning Engineer

Machine Learning Engineers are responsible for building, deploying, and maintaining machine learning models. They use their understanding of machine learning algorithms and cloud computing technologies to create models that can help organizations solve real-world problems. This course may be useful for building a foundation in creating and managing data pipelines for machine learning.

Software Engineer

Software Engineers design, develop, and maintain software applications. They use their understanding of programming languages and software engineering principles to create software that meets the needs of users. This course may be useful for building a foundation in data pipeline technologies, which are used by Software Engineers who work on big data projects.

Cloud Architect

Cloud Architects design and manage cloud computing infrastructure. They use their understanding of cloud computing technologies and best practices to create cloud solutions that meet the needs of organizations. This course may be useful for building a foundation in data pipelines on GCP, a major cloud platform.

Reading list

We've selected six books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Building Batch Data Pipelines on GCP em Português Brasileiro.

Continuous Delivery

Provides guidance on applying continuous delivery principles to data pipelines. Covers automation, testing, and deployment best practices.

Kafka: The Definitive Guide

Provides a deep dive into building and managing stream processing pipelines with Apache Kafka. Covers topics such as data ingestion, transformations, and real-time analytics.

IBM Technical Computing Clouds

A practical guide to building data pipelines in R. Covers data wrangling, transformation, and analysis using popular R packages such as dplyr, tidyr, and ggplot2.

Secret Colors

Delves into the principles and patterns of designing data-intensive applications. While it doesn't directly cover Google Cloud technologies, it provides a foundational understanding of data management concepts and techniques applicable to building data pipelines on any platform.

Hands-On Machine Learning with Scikit-Learn, Keras,...

Provides a hands-on introduction to large-scale machine learning using TensorFlow. It covers topics such as data preprocessing, model training, and evaluation.

Advanced Analytics with PySpark

Explores advanced analytics techniques using Apache Spark. While it doesn't focus specifically on data pipelines, it provides valuable insights into data analysis and processing techniques that can be applied within the context of data pipelines.
