We may earn an affiliate commission when you visit our partners.
Google Cloud Training

Data pipelines typically fall under one of three paradigms: EL (extract and load), ELT (extract, load, and transform), or ETL (extract, transform, and load). This course explains which paradigm to use for batch data processing depending on the context. It also presents several Google Cloud solutions for data transformation, including BigQuery, running Spark on Dataproc, pipeline graphs in Cloud Data Fusion, and serverless data processing with Dataflow. Learners put what they have learned into practice by building data pipeline components on Google Cloud using Qwiklabs.
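
As a rough illustration of the difference between the ETL and ELT paradigms named above, the following minimal sketch uses Python's built-in sqlite3 module as a stand-in for a data warehouse. It is not taken from the course: the sample rows, table names, and helper functions are all hypothetical, and a real pipeline on Google Cloud would use services such as BigQuery or Dataflow instead.

```python
import sqlite3

# Hypothetical raw records, standing in for an extracted source file.
RAW_ROWS = [("alice", " 42 "), ("bob", "n/a"), ("carol", "7")]


def is_valid(amount: str) -> bool:
    """Return True when the amount is a plain non-negative integer."""
    return amount.strip().isdigit()


def run_etl(conn: sqlite3.Connection) -> list:
    """ETL: transform in application code, then load only the clean rows."""
    conn.execute("CREATE TABLE etl_sales (name TEXT, amount INTEGER)")
    cleaned = [(name, int(amount)) for name, amount in RAW_ROWS if is_valid(amount)]
    conn.executemany("INSERT INTO etl_sales VALUES (?, ?)", cleaned)
    return conn.execute(
        "SELECT name, amount FROM etl_sales ORDER BY name"
    ).fetchall()


def run_elt(conn: sqlite3.Connection) -> list:
    """ELT: load the raw rows unchanged, then transform with SQL in the store."""
    conn.execute("CREATE TABLE raw_sales (name TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_sales VALUES (?, ?)", RAW_ROWS)
    # The transformation happens inside the database, after loading.
    conn.execute(
        "CREATE TABLE elt_sales AS "
        "SELECT name, CAST(TRIM(amount) AS INTEGER) AS amount "
        "FROM raw_sales "
        "WHERE TRIM(amount) <> '' AND TRIM(amount) NOT GLOB '*[^0-9]*'"
    )
    return conn.execute(
        "SELECT name, amount FROM elt_sales ORDER BY name"
    ).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print("ETL result:", run_etl(connection))
    print("ELT result:", run_elt(connection))
```

Both functions end with the same cleaned table. The difference is where the cleaning runs: before loading (ETL) or inside the target store after loading (ELT). An EL pipeline would stop after the raw insert.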

What's inside

Syllabus

Introduction
This module introduces the course and how it is organized.
Introduction to Building Batch Data Pipelines
This module reviews the different methods of loading data (EL, ELT, and ETL) and explains when to use each one.
Executing Spark on Dataproc
This module shows how to run Hadoop on Dataproc, make use of Cloud Storage, and optimize your Dataproc jobs.
Serverless Data Processing with Dataflow
This module explains how to use Dataflow to build your data processing pipelines.
Manage Data Pipelines with Cloud Data Fusion
This module shows how to manage data pipelines with Cloud Data Fusion and Cloud Composer.
Course Summary

Good to know

Know what's good, what to watch for, and possible dealbreakers
Covers data pipelines, which are essential tools for bulk data processing
Taught by Google Cloud Training, which is recognized for its expertise in data processing
Offers a thorough study of batch data processing paradigms
Covers Google Cloud data transformation solutions, including BigQuery, Spark on Dataproc, Cloud Data Fusion, and Dataflow
Includes hands-on exercises on Google Cloud using Qwiklabs

Reviews summary

Excellent data pipeline builder course

This course provides a great introduction to data pipeline building using GCP tools. Students are provided with many examples and exercises to help them build their skills. Overall, this course is a great resource for anyone looking to learn more about data pipelines.
Many examples are provided to help students learn.
"Excellent course with complete examples..."

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Building Batch Data Pipelines on GCP en Français with these activities:
Review Introduction to Data Warehousing
Review the fundamentals of data warehousing and its role in managing and analyzing large datasets, ensuring a strong foundation for the course topics.
  • Read assigned sections from a data warehousing textbook
  • Summarize key concepts and terminologies in data warehousing
  • Complete practice exercises on data modeling and data storage techniques
Practice Data Munging and Cleaning Techniques
Strengthen foundational skills in data munging and cleaning by practicing techniques, ensuring proficiency in preparing data for processing and analysis.
  • Find online resources or tutorials on data munging and cleaning
  • Practice data cleaning techniques on sample datasets
  • Apply data munging techniques to real-world datasets
Review Apache Spark Fundamentals
Solidify your understanding of Apache Spark core concepts to enhance your comprehension of Spark processing in this course.
  • Go through Apache Spark documentation.
  • Practice writing Spark applications.
Participate in Discussion Forums on Data Processing
Engage with peers in online discussions, exchanging ideas, sharing insights, and clarifying concepts to enhance understanding of data processing techniques.
  • Join online forums related to data processing
  • Participate in discussions by asking questions and sharing knowledge
  • Collaborate with peers on solving data processing challenges
Explore Cloud Data Fusion Concepts and Tutorials
Gain hands-on experience with Cloud Data Fusion by following guided tutorials, deepening understanding of its capabilities and use cases.
  • Follow Google Cloud tutorials on Cloud Data Fusion
  • Build a simple data pipeline using Cloud Data Fusion
  • Troubleshoot common issues while working with Cloud Data Fusion
Solve Data Processing Coding Challenges
Enhance problem-solving and coding skills by attempting data processing coding challenges, reinforcing the practical application of course concepts.
  • Identify relevant data processing coding challenge platforms
  • Solve coding challenges related to data ingestion, transformation, and analysis
  • Review solutions and learn from different approaches
Organize and Summarize Course Materials
Enhance understanding and retention by organizing and summarizing key concepts, notes, and assignments from the course, facilitating efficient review and recall.
  • Consolidate notes, assignments, and quizzes into a central location
  • Summarize key concepts and takeaways from each module
  • Create diagrams or mind maps to visualize relationships and connections
Attend a Workshop on Data Transformation with BigQuery
Deepen understanding of BigQuery's capabilities by attending a workshop, gaining practical experience in data transformation techniques and best practices.
  • Research and identify relevant BigQuery workshops
  • Register and attend a workshop focusing on data transformation
  • Apply the learned techniques in hands-on exercises
Develop a Data Pipeline for a Real-World Dataset
Apply course concepts by building a complete data pipeline for a real-world dataset, demonstrating proficiency in data processing and integration techniques.
  • Identify a suitable real-world dataset
  • Design and implement a data pipeline using the techniques learned in the course
  • Document the pipeline architecture and implementation

Career center

Learners who complete Building Batch Data Pipelines on GCP en Français will develop knowledge and skills that may be useful to these careers:
Data Architect
Data Architects plan and design data management systems, ensuring that data is accessible, reliable, and secure. This course helps build a foundation for a career as a Data Architect by providing an understanding of batch data pipeline construction on Google Cloud Platform (GCP). Learners will gain hands-on experience with Cloud Data Fusion and develop skills in managing data pipelines, making them well-prepared for the responsibilities of a Data Architect role.
Data Engineer
Data Engineers design, build, and maintain data pipelines, ensuring data is processed efficiently and accurately. This course provides a comprehensive overview of batch data pipeline construction on GCP. Learners will gain experience with tools such as BigQuery, Spark on Dataproc, and Dataflow, preparing them for the technical challenges of a Data Engineer position.
Data Scientist
Data Scientists leverage data to uncover insights and solve business problems. This course helps build a foundation for a career as a Data Scientist by providing an understanding of batch data pipeline construction on GCP. Learners will gain hands-on experience with BigQuery and Spark on Dataproc, developing skills in data wrangling and transformation, which are essential for Data Scientists.
Machine Learning Engineer
Machine Learning Engineers build and deploy machine learning models to solve real-world problems. This course provides a foundation for a career as a Machine Learning Engineer by introducing batch data pipeline construction on GCP. Learners will gain experience with tools such as BigQuery and Spark on Dataproc, developing skills in data preparation and feature engineering, which are critical for Machine Learning Engineers.
Business Intelligence Analyst
Business Intelligence Analysts provide insights to businesses by analyzing data and identifying trends. This course helps build a foundation for a career as a Business Intelligence Analyst by providing an understanding of batch data pipeline construction on GCP. Learners will gain experience with tools such as BigQuery and Dataflow, developing skills in data extraction, transformation, and visualization, which are essential for Business Intelligence Analysts.
Data Analyst
Data Analysts collect, clean, and interpret data to help businesses make informed decisions. This course may be useful for aspiring Data Analysts, as it provides an introduction to batch data pipeline construction on GCP. Learners will gain experience with tools such as BigQuery and Spark on Dataproc, developing skills in data wrangling and analysis.
Software Engineer
Software Engineers design, develop, and maintain software systems. This course may be useful for Software Engineers interested in specializing in data engineering or big data, as it provides an introduction to batch data pipeline construction on GCP. Learners will gain experience with tools such as BigQuery and Spark on Dataproc, developing skills in data processing and transformation.
Cloud Engineer
Cloud Engineers design, build, and manage cloud-based systems. This course may be useful for Cloud Engineers interested in specializing in data engineering or big data, as it provides an introduction to batch data pipeline construction on GCP. Learners will gain experience with tools such as BigQuery and Spark on Dataproc, developing skills in data processing and management.
Database Administrator
Database Administrators manage and maintain database systems. This course may be useful for Database Administrators interested in specializing in big data or cloud-based databases, as it provides an introduction to batch data pipeline construction on GCP. Learners will gain experience with tools such as BigQuery and Spark on Dataproc, developing skills in data management and optimization.
Data Warehouse Engineer
Data Warehouse Engineers design, build, and maintain data warehouses, which are central repositories for large amounts of data. This course may be useful for Data Warehouse Engineers interested in specializing in cloud-based data warehouses or big data, as it provides an introduction to batch data pipeline construction on GCP. Learners will gain experience with tools such as BigQuery and Spark on Dataproc, developing skills in data integration and management.
ETL Developer
ETL Developers design and develop extract, transform, and load (ETL) processes, which move data from source systems to target systems. This course may be useful for ETL Developers interested in specializing in cloud-based ETL or big data, as it provides an introduction to batch data pipeline construction on GCP. Learners will gain experience with tools such as BigQuery and Spark on Dataproc, developing skills in data extraction, transformation, and loading.
Data Integration Engineer
Data Integration Engineers design and develop data integration solutions, which connect disparate data sources and enable data sharing. This course may be useful for Data Integration Engineers interested in specializing in cloud-based data integration or big data, as it provides an introduction to batch data pipeline construction on GCP. Learners will gain experience with tools such as BigQuery and Spark on Dataproc, developing skills in data extraction, transformation, and integration.
Big Data Architect
Big Data Architects design and manage big data systems, which are used to process and analyze large amounts of data. This course may be useful for Big Data Architects interested in specializing in cloud-based big data or batch data processing, as it provides an introduction to batch data pipeline construction on GCP. Learners will gain experience with tools such as BigQuery and Spark on Dataproc, developing skills in data management and processing.
Cloud Data Engineer
Cloud Data Engineers design and manage data systems in the cloud. This course may be useful for Cloud Data Engineers interested in specializing in batch data processing or big data, as it provides an introduction to batch data pipeline construction on GCP. Learners will gain experience with tools such as BigQuery and Spark on Dataproc, developing skills in data management and processing.
Data Management Consultant
Data Management Consultants help organizations improve their data management practices and leverage data to make better decisions. This course may be useful for Data Management Consultants interested in specializing in cloud-based data management or big data, as it provides an introduction to batch data pipeline construction on GCP. Learners will gain experience with tools such as BigQuery and Spark on Dataproc, developing skills in data management and consulting.

Reading list

We've selected 12 books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Building Batch Data Pipelines on GCP en Français.
Provides a comprehensive overview of the principles and patterns for designing data-intensive applications. It covers everything from data modeling to data processing to data storage.
Comprehensive guide to using Python for data science. It covers all the basics of Python, as well as how to use it for data analysis, machine learning, and data visualization.
Comprehensive guide to using R for data science. It covers all the basics of R, as well as how to use it for data analysis, machine learning, and data visualization.
Comprehensive guide to statistical learning. It covers all the basics of statistical learning, in a simple and easy-to-understand way.
Comprehensive guide to deep learning with R. It covers all the basics of deep learning, as well as how to use R for deep learning tasks.
Provides a comprehensive guide to using Apache Airflow for building data pipelines on Google Cloud Platform. It covers everything from setting up your environment to developing and deploying your pipelines.
Gentle introduction to machine learning. It covers all the basics of machine learning, in a simple and easy-to-understand way.

Similar courses

Here are nine courses similar to Building Batch Data Pipelines on GCP en Français.
La Data Intelligence au service des organisations
Google Cloud Product Fundamentals en Français
Traitement d'images : segmentation et caractérisation
Modernizing Data Lakes and Data Warehouses with GCP en...
Business Transformation with Google Cloud en Français
Traitement d'images : analyse fréquentielle et multi...
Tensorflow : Analyse de Sentiments avec Word Embedding
Smart Analytics, Machine Learning, and AI on GCP en...
Troubles du spectre de l'autisme : interventions

© 2016 - 2024 OpenCourser