Google Cloud Training

Data pipelines generally fall into one of three paradigms: extract-load (EL), extract-load-transform (ELT), or extract-transform-load (ETL). This course describes which paradigm to use in a given situation and when each applies to batch data. It also covers several Google Cloud technologies for data transformation, including BigQuery, running Spark on Dataproc, pipeline graphs in Cloud Data Fusion, and serverless data processing with Dataflow. Learners gain hands-on experience building data pipeline components on Google Cloud using Qwiklabs.
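The three paradigms named above differ mainly in where the transformation step runs. The following is a minimal sketch in plain Python, not course material: the function names, the sample record, and the use of in-memory lists as stand-ins for a destination table are all assumptions made for illustration.

```python
from typing import Iterable

Row = dict  # one record, e.g. {"city": " Lisbon ", "temp_c": "21.5"}


def extract(source: Iterable[Row]) -> list[Row]:
    """Read every record from the (hypothetical) source."""
    return list(source)


def transform(rows: Iterable[Row]) -> list[Row]:
    """Hypothetical cleanup: trim the city name and parse the temperature."""
    return [
        {"city": r["city"].strip(), "temp_c": float(r["temp_c"])}
        for r in rows
    ]


def load(rows: Iterable[Row], table: list[Row]) -> None:
    """Append records to the destination; a list stands in for a table."""
    table.extend(rows)


def run_el(source: Iterable[Row], table: list[Row]) -> None:
    # EL: the data is loaded as-is, with no transformation step.
    load(extract(source), table)


def run_etl(source: Iterable[Row], table: list[Row]) -> None:
    # ETL: transform inside the pipeline, before data reaches the destination.
    load(transform(extract(source)), table)


def run_elt(source: Iterable[Row], raw: list[Row], clean: list[Row]) -> None:
    # ELT: load the raw data first, then transform it at the destination.
    load(extract(source), raw)
    load(transform(raw), clean)
```

In this sketch, `run_etl` and `run_elt` produce the same cleaned records; the difference is that ELT also keeps the raw copy at the destination and does the cleanup there.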

What's inside

Syllabus

Introduction
This module introduces the course and the agenda.
Introduction to Building Batch Data Pipelines
This module reviews different methods of loading data (EL, ELT, and ETL) and when to use each.
Executing Spark on Dataproc
This module shows how to run Hadoop on Dataproc, how to use Cloud Storage, and how to optimize Dataproc jobs.
Serverless Data Processing with Dataflow
This module covers using Dataflow to build data processing pipelines.
Managing Data Pipelines with
This module shows how to manage data pipelines with Cloud Data Fusion and Cloud Composer.
Course Summary

Good to know

Know what's good, what to watch for, and possible dealbreakers
Teaches several methods of loading batch data
Teaches when to use EL, ELT, and ETL
Offers hands-on experience with Google Cloud technologies such as BigQuery, Dataproc, and Dataflow
Teaches how to manage data pipelines with Cloud Data Fusion and Cloud Composer
Offers hands-on activities in Qwiklabs that let learners apply the concepts

Reviews summary

Effective data pipeline design

This course teaches learners how to create batch data pipelines on GCP. It covers a range of GCP technologies for data transformation, including BigQuery, Spark on Dataproc, Cloud Data Fusion, and Dataflow. Reviews indicate that the course is well-structured and provides valuable hands-on experience through Qwiklabs.
Covers real-world use cases for data pipelines.
Instructor has extensive knowledge and expertise in the field.
"A fantastic presentation of the main tools for creating and managing pipelines."
Provides hands-on experience with GCP data pipeline components.
"Participants gain practical experience in creating data pipeline components on Google Cloud using Qwiklabs."
Learners may encounter technical issues with labs.
"Some labs could ask the student to answer a few questions based on the results of the activities performed. I also had difficulty completing one lab because GCP could not create a Cloud Composer instance, so I had to repeat the activity."

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Building Batch Data Pipelines on GCP em Português Brasileiro with these activities:
Organize and Review Course Resources
Organize and review course materials, ensuring that you have a comprehensive understanding of the topics covered, maximizing retention and recall.
  • Gather lecture notes, slides, and assignments.
  • Review materials regularly to reinforce understanding.
Participate in Q&A Forums
Contribute to the course community by actively participating in discussion forums, answering questions, and sharing your insights, solidifying your understanding while assisting others.
  • Join relevant Q&A forums or discussion groups.
  • Monitor forums for questions related to course topics.
  • Provide thoughtful and accurate responses.
Practice SQL skills on Qwiklabs
Completing SQL drills will help you reinforce foundational SQL skills and improve your ability to transform data effectively.
  • Follow the instructions in the Qwiklabs SQL course.
  • Complete the SQL exercises in the course.
  • Review the solutions and explanations.
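As a warm-up for drills like these, a SQL transformation can be tried locally before opening a lab. The sketch below uses Python's built-in `sqlite3` module as a stand-in; the `raw_orders` and `customer_totals` tables and their rows are made up for illustration, and SQLite's dialect differs from BigQuery's in places, so treat it as practice for the general pattern only.

```python
import sqlite3

# An in-memory database keeps the drill self-contained.
conn = sqlite3.connect(":memory:")

# Hypothetical raw table: amounts arrive as text and need casting.
conn.execute("CREATE TABLE raw_orders (customer TEXT, amount TEXT)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?)",
    [("ana", "10.50"), ("ana", "4.50"), ("bruno", "7.00")],
)

# Transform with SQL: cast the text amounts and aggregate per customer.
conn.execute(
    """
    CREATE TABLE customer_totals AS
    SELECT customer, SUM(CAST(amount AS REAL)) AS total
    FROM raw_orders
    GROUP BY customer
    """
)

totals = conn.execute(
    "SELECT customer, total FROM customer_totals ORDER BY customer"
).fetchall()
print(totals)  # [('ana', 15.0), ('bruno', 7.0)]
```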
Follow a tutorial on using the Cloud Data Fusion GUI
By following a tutorial, you'll gain practical experience in using the Cloud Data Fusion interface, which is essential for managing pipelines.
  • Choose a tutorial on the Cloud Data Fusion website or YouTube.
  • Follow the instructions in the tutorial.
  • Create a simple pipeline using the GUI.
Hands-on Hadoop and Spark Exercises
Practice essential Hadoop and Spark techniques within the Dataproc environment, solidifying your understanding of data processing frameworks.
  • Configure and launch a Hadoop cluster on Dataproc.
  • Run a Spark job to process data stored in Cloud Storage.
  • Optimize your Spark job configurations for performance.
Explore Dataflow Documentation and Tutorials
Expand your knowledge of Dataflow by exploring official documentation and tutorials, gaining a deeper understanding of its capabilities for serverless data processing.
  • Review Dataflow documentation to understand its architecture and key features.
  • Follow hands-on tutorials to build and deploy Dataflow pipelines.
Attend a Cloud Data Fusion Workshop
Enhance your understanding of Cloud Data Fusion by attending a workshop, where you'll gain hands-on experience and expert insights into managing data pipelines.
  • Register for a Cloud Data Fusion workshop.
  • Attend the workshop and actively participate.
  • Follow up after the workshop to practice and reinforce your learnings.
Write a blog post summarizing the key concepts of data transformation
Writing a blog post will help you consolidate your understanding of data transformation concepts and share your knowledge with others.
  • Research and gather information on data transformation.
  • Organize your thoughts and create an outline.
  • Write the blog post.
  • Revise and edit your post.
  • Publish your blog post.
Design a Data Pipeline Architecture
Apply your knowledge to create a comprehensive data pipeline architecture, demonstrating your ability to design and plan data processing solutions.
  • Define the data sources, transformations, and destination for your pipeline.
  • Select appropriate technologies and services for each pipeline component.
  • Document your design and share it for review.
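One lightweight way to work through the steps above is to record the design as structured data before building anything. The sketch below is only one possible approach: the `PipelineDesign` class, its fields, and the example component names and service choices are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PipelineDesign:
    """A hypothetical, minimal record of a batch pipeline design."""

    name: str
    sources: list[str]
    transformations: list[str]
    destination: str
    # Maps each component (source, transformation, or destination)
    # to the service chosen to handle it.
    technologies: dict[str, str] = field(default_factory=dict)

    def components(self) -> list[str]:
        """All components in pipeline order."""
        return [*self.sources, *self.transformations, self.destination]

    def unassigned(self) -> list[str]:
        """Components that do not have a technology selected yet."""
        return [c for c in self.components() if c not in self.technologies]

    def to_document(self) -> str:
        """Render the design as plain text that can be shared for review."""
        lines = [f"Pipeline: {self.name}"]
        for component in self.components():
            tech = self.technologies.get(component, "TBD")
            lines.append(f"- {component}: {tech}")
        return "\n".join(lines)


# Illustrative design; the component names and service choices are assumptions.
design = PipelineDesign(
    name="daily_sales",
    sources=["sales CSV files"],
    transformations=["deduplicate orders"],
    destination="reporting table",
    technologies={
        "sales CSV files": "Cloud Storage",
        "reporting table": "BigQuery",
    },
)
print(design.unassigned())  # ['deduplicate orders']
print(design.to_document())
```

Listing the unassigned components makes it easy to see which technology decisions are still open before the design goes out for review.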
Build a Real-World Data Pipeline
Challenge yourself by building a data pipeline that addresses a real-world problem, demonstrating your ability to apply course concepts and solve practical data challenges.
  • Identify a business problem that can be solved with a data pipeline.
  • Gather and prepare the necessary data.
  • Design and implement a data pipeline to process and analyze the data.
  • Deploy and monitor your pipeline.

Career center

Learners who complete Building Batch Data Pipelines on GCP em Português Brasileiro will develop knowledge and skills that may be useful to these careers:
ETL Developer
ETL Developers design and build data pipelines that extract, transform, and load data from one system to another. They use their understanding of data pipelines and data integration tools to create solutions that can help organizations move data between different systems.
Data Pipeline Architect
Data Pipeline Architects design and build data pipelines for organizations. They use their understanding of data pipelines and data engineering principles to create solutions that can help organizations move data between different systems and applications.
Cloud Data Engineer
Cloud Data Engineers are responsible for designing, building, and managing data pipelines in the cloud. They use their understanding of cloud computing technologies and data engineering principles to create solutions that can help organizations move data to and from the cloud.
Big Data Engineer
Big Data Engineers are responsible for designing, building, and managing big data pipelines. They use their understanding of big data technologies and data engineering principles to create solutions that can help organizations store, process, and analyze large volumes of data.
Data Warehouse Engineer
Data Warehouse Engineers are responsible for designing, building, and managing data warehouses. They use their understanding of data warehousing technologies and data engineering principles to create solutions that can help organizations store and analyze large volumes of data.
Data Lake Engineer
Data Lake Engineers are responsible for designing, building, and managing data lakes. They use their understanding of data lake technologies and data engineering principles to create solutions that can help organizations store and process large volumes of data in its raw format.
Data Management Consultant
Data Management Consultants help organizations improve their data management practices. They use their understanding of data management principles and technologies to help organizations define and implement data management strategies.
Data Engineer
Data Engineers use their mastery of big data tools and infrastructure to design, build, and maintain data management systems. Enterprises hire Data Engineers to help make sense of their massive volumes of data and to find ways to leverage it. This course, "Building Batch Data Pipelines on GCP em Português Brasileiro", may be useful for building a foundation in working with data pipelines, which is part of the core work of a Data Engineer.
Database Administrator
Database Administrators use their understanding of database technologies to design, implement, and maintain database management systems. They are responsible for ensuring that their organization has fast, secure, and reliable access to its data. This course, "Building Batch Data Pipelines on GCP em Português Brasileiro", may be useful for building a foundation in pipeline-based data processing, which can feed database management systems.
Data Scientist
Data Scientists use their expertise in statistics, machine learning, and data mining to extract insights from data. They use these insights to help organizations make better decisions. This course "Building Batch Data Pipelines on GCP em Português Brasileiro" may be useful as a way of building a foundation in data pipelines which is a core part of the Data Scientist's toolkit.
Data Analyst
Data Analysts gather, analyze, interpret, and present data to help organizations make informed decisions. Data Analysts use their understanding of data and statistical techniques to identify trends, patterns, and insights in data. This course "Building Batch Data Pipelines on GCP em Português Brasileiro" may be useful as a way of building a foundation in data pipelining.
Data Integration Engineer
Data Integration Engineers design and build data integration solutions. They use their understanding of data integration tools and technologies to create solutions that can help organizations integrate data from multiple sources into a single, unified view.
Machine Learning Engineer
Machine Learning Engineers are responsible for building, deploying, and maintaining machine learning models. They use their understanding of machine learning algorithms and cloud computing technologies to create models that can help organizations solve real-world problems. This course "Building Batch Data Pipelines on GCP em Português Brasileiro" may be useful as a way of building a foundation in creating and managing data pipelines for machine learning.
Software Engineer
Software Engineers design, develop, and maintain software applications. They use their understanding of programming languages and software engineering principles to create software that meets the needs of users. This course "Building Batch Data Pipelines on GCP em Português Brasileiro" may be useful as a way of building a foundation in data pipeline technologies which are used by Software Engineers who work on big data projects.
Cloud Architect
Cloud Architects design and manage cloud computing infrastructure. They use their understanding of cloud computing technologies and best practices to create cloud solutions that meet the needs of organizations. This course, "Building Batch Data Pipelines on GCP em Português Brasileiro", may be useful for building a foundation in data pipelines on GCP, a major cloud platform.

Reading list

We've selected six books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Building Batch Data Pipelines on GCP em Português Brasileiro.
Provides guidance on applying continuous delivery principles to data pipelines. Covers automation, testing, and deployment best practices.
Provides a deep dive into building and managing stream processing pipelines with Apache Kafka. Covers topics such as data ingestion, transformations, and real-time analytics.
A practical guide to building data pipelines in R. Covers data wrangling, transformation, and analysis using popular R packages such as dplyr, tidyr, and ggplot2.
Delves into the principles and patterns of designing data-intensive applications. While it doesn't directly cover Google Cloud technologies, it provides a foundational understanding of data management concepts and techniques applicable to building data pipelines on any platform.
Explores advanced analytics techniques using Apache Spark. While it doesn't focus specifically on data pipelines, it provides valuable insights into data analysis and processing techniques that can be applied within the context of data pipelines.

Similar courses

Here are nine courses similar to Building Batch Data Pipelines on GCP em Português Brasileiro.
Business Transformation with Google Cloud em Português
Building Resilient Streaming Systems on GCP em Português...
ML Pipelines on Google Cloud - Português
Serverless Machine Learning with Tensorflow on Google...
Modernizing Data Lakes and Data Warehouses with GCP em...
Understanding Your Google Cloud Costs em Português
Google Cloud Product Fundamentals em Português Brasileiro
Building Resilient Streaming Analytics Systems on GCP em...
RH, Dados e Inteligência Artificial

© 2016 - 2024 OpenCourser