Building Batch Data Pipelines on GCP en Français

Data pipelines typically fall under one of three paradigms: EL (extract and load), ELT (extract, load, and transform), or ETL (extract, transform, and load). This course explains which paradigm to use for batch data processing depending on the context. It also introduces several Google Cloud data transformation solutions, including BigQuery, running Spark on Dataproc, pipeline graphs in Cloud Data Fusion, and serverless data processing with Dataflow. Participants put what they learn into practice by building data pipeline components on Google Cloud using Qwiklabs.

What's inside

Syllabus

Introduction

This module introduces the course and its agenda.

Introduction to Building Batch Data Pipelines

This module reviews the different data loading methods (EL, ELT, and ETL) and explains when to use each.
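As a rough illustration of how the three loading methods differ, here is a minimal Python sketch. It is not taken from the course: the sample rows, the `transform` rule, and the dictionary standing in for a warehouse are all hypothetical, and a real pipeline would use services such as BigQuery or Dataflow instead of in-memory lists.

```python
# Hypothetical sample data standing in for a source system.
RAW_ROWS = [
    {"city": " paris ", "sales": "10"},
    {"city": "Lyon", "sales": "5"},
]


def transform(row):
    """Clean one record: trim and title-case the city, cast sales to int."""
    return {"city": row["city"].strip().title(), "sales": int(row["sales"])}


def run_el(source):
    """EL: load the rows unchanged (assumes they are already usable)."""
    return list(source)


def run_etl(source):
    """ETL: transform inside the pipeline, then load only the cleaned rows."""
    return [transform(row) for row in source]


def run_elt(source):
    """ELT: load the raw rows first, then transform inside the 'warehouse'."""
    warehouse = {"raw": list(source)}  # hypothetical warehouse stand-in
    warehouse["clean"] = [transform(row) for row in warehouse["raw"]]
    return warehouse
```

The only difference between `run_etl` and `run_elt` is where the transformation happens: before the load, or after the raw data has already landed in the target system.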

Traffic lights

Read about what's good, what should give you pause, and possible dealbreakers

Explores data pipelines, which are essential tools for processing data at scale

Taught by Google Cloud Training, recognized for its expertise in data processing

Offers a thorough study of batch data processing paradigms

Covers Google Cloud data transformation solutions, including BigQuery, Spark on Dataproc, Cloud Data Fusion, and Dataflow

Includes hands-on exercises on Google Cloud using Qwiklabs

Reviews summary

Batch data pipelines on GCP

According to learners, this course is a solid introduction to building batch data pipelines on GCP, highlighting key services such as Dataflow and Dataproc. The hands-on Qwiklabs labs are frequently cited as a major strength that helps solidify understanding of the concepts. Students appreciate the clarity of the explanations and the instructor's teaching skills. However, some reviews point to content that is sometimes dated or to a lack of depth for more advanced users, particularly on Cloud Data Fusion. The course nevertheless remains highly relevant for beginners in cloud data engineering.

Clear explanations and a skilled instructor.

"The explanations are clear and the Qwiklabs labs are absolutely essential..."

"The instructor is a very good teacher. I took it for my certification and it helped me a lot."

"Very instructive and well explained. It helps you understand the GCP ecosystem for batch processing."

An ideal course for GCP beginners.

"This course is an excellent resource for anyone who wants to master building data pipelines on GCP."

"Great course for getting started with data engineering on Google Cloud."

"As a novice, I found the explanations clear and easy to follow."

The Qwiklabs labs are a crucial strength.

"The Qwiklabs are absolutely essential for practical understanding."

"The labs are relevant, even if the Qwiklabs environments can sometimes be a bit slow to start."

"I particularly appreciated the hands-on aspect, with labs that are well designed."

Pacing or presentation problems for some.

"The pace is sometimes a bit fast for absolute beginners. It is best to have some prior knowledge of programming or databases."

"Disappointed by this course. I found that the instructor read from their notes too often and that the explanations lacked spontaneity."

"The presentation is a bit too fast at times, especially on the more complex concepts. I often had to pause and go back."

Limited for advanced users and experts.

"It offers a good overview, but experts might want more depth on advanced use cases or optimizations."

"The course is fine as an introduction, but I find that some aspects, such as Cloud Data Fusion, are not covered in enough detail."

"If you already have experience with other cloud platforms, it may not add much. Ideal if you are completely new to GCP."

Some of the content is occasionally outdated.

"I find that some aspects, such as Cloud Data Fusion, are not covered in enough detail. In addition, the information on Dataproc sometimes seems a bit dated..."

"I found a few typos in the labs and some instructions that were no longer exactly up to date, but nothing blocking."

"Superficial content and a glaring lack of updates. I ran into many problems with labs that did not work as described."

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Building Batch Data Pipelines on GCP en Français with these activities:

Review Introduction to Data Warehousing

Review the fundamentals of data warehousing and its role in managing and analyzing large datasets, ensuring a strong foundation for the course topics.

Read assigned sections from a data warehousing textbook
Summarize key concepts and terminologies in data warehousing
Complete practice exercises on data modeling and data storage techniques

Practice Data Munging and Cleaning Techniques

Strengthen foundational skills in data munging and cleaning by practicing techniques, ensuring proficiency in preparing data for processing and analysis.

Find online resources or tutorials on data munging and cleaning
Practice data cleaning techniques on sample datasets
Apply data munging techniques to real-world datasets

Review Apache Spark Fundamentals

Solidify your understanding of Apache Spark core concepts to enhance your comprehension of Spark processing in this course.

Go through the Apache Spark documentation
Practice writing Spark applications

Participate in Discussion Forums on Data Processing

Engage with peers in online discussions, exchanging ideas, sharing insights, and clarifying concepts to enhance understanding of data processing techniques.

Join online forums related to data processing
Participate in discussions by asking questions and sharing knowledge
Collaborate with peers on solving data processing challenges

Explore Cloud Data Fusion Concepts and Tutorials

Gain hands-on experience with Cloud Data Fusion by following guided tutorials, deepening understanding of its capabilities and use cases.

Follow Google Cloud tutorials on Cloud Data Fusion
Build a simple data pipeline using Cloud Data Fusion
Troubleshoot common issues while working with Cloud Data Fusion

Solve Data Processing Coding Challenges

Enhance problem-solving and coding skills by attempting data processing coding challenges, reinforcing the practical application of course concepts.

Identify relevant data processing coding challenge platforms
Solve coding challenges related to data ingestion, transformation, and analysis
Review solutions and learn from different approaches

Organize and Summarize Course Materials

Enhance understanding and retention by organizing and summarizing key concepts, notes, and assignments from the course, facilitating efficient review and recall.

Consolidate notes, assignments, and quizzes into a central location
Summarize key concepts and takeaways from each module
Create diagrams or mind maps to visualize relationships and connections

Attend a Workshop on Data Transformation with BigQuery

Deepen understanding of BigQuery's capabilities by attending a workshop, gaining practical experience in data transformation techniques and best practices.

Research and identify relevant BigQuery workshops
Register and attend a workshop focusing on data transformation
Apply the learned techniques in hands-on exercises

Develop a Data Pipeline for a Real-World Dataset

Apply course concepts by building a complete data pipeline for a real-world dataset, demonstrating proficiency in data processing and integration techniques.

Identify a suitable real-world dataset
Design and implement a data pipeline using the techniques learned in the course
Document the pipeline architecture and implementation

Career center

Learners who complete Building Batch Data Pipelines on GCP en Français will develop knowledge and skills that may be useful to these careers:

Data Architect

Data Architects plan and design data management systems, ensuring that data is accessible, reliable, and secure. This course helps build a foundation for a career as a Data Architect by providing an understanding of batch data pipeline construction on Google Cloud Platform (GCP). Learners will gain hands-on experience with Cloud Data Fusion and develop skills in managing data pipelines, making them well-prepared for the responsibilities of a Data Architect role.

Data Engineer

Data Engineers design, build, and maintain data pipelines, ensuring data is processed efficiently and accurately. This course provides a comprehensive overview of batch data pipeline construction on GCP. Learners will gain experience with tools such as BigQuery, Spark on Dataproc, and Dataflow, preparing them for the technical challenges of a Data Engineer position.

Data Scientist

Data Scientists leverage data to uncover insights and solve business problems. This course helps build a foundation for a career as a Data Scientist by providing an understanding of batch data pipeline construction on GCP. Learners will gain hands-on experience with BigQuery and Spark on Dataproc, developing skills in data wrangling and transformation, which are essential for Data Scientists.

Machine Learning Engineer

Machine Learning Engineers build and deploy machine learning models to solve real-world problems. This course provides a foundation for a career as a Machine Learning Engineer by introducing batch data pipeline construction on GCP. Learners will gain experience with tools such as BigQuery and Spark on Dataproc, developing skills in data preparation and feature engineering, which are critical for Machine Learning Engineers.

Business Intelligence Analyst

Business Intelligence Analysts provide insights to businesses by analyzing data and identifying trends. This course helps build a foundation for a career as a Business Intelligence Analyst by providing an understanding of batch data pipeline construction on GCP. Learners will gain experience with tools such as BigQuery and Dataflow, developing skills in data extraction, transformation, and visualization, which are essential for Business Intelligence Analysts.

Data Analyst

Data Analysts collect, clean, and interpret data to help businesses make informed decisions. This course may be useful for aspiring Data Analysts, as it provides an introduction to batch data pipeline construction on GCP. Learners will gain experience with tools such as BigQuery and Spark on Dataproc, developing skills in data wrangling and analysis.

Software Engineer

Software Engineers design, develop, and maintain software systems. This course may be useful for Software Engineers interested in specializing in data engineering or big data, as it provides an introduction to batch data pipeline construction on GCP. Learners will gain experience with tools such as BigQuery and Spark on Dataproc, developing skills in data processing and transformation.

Cloud Engineer

Cloud Engineers design, build, and manage cloud-based systems. This course may be useful for Cloud Engineers interested in specializing in data engineering or big data, as it provides an introduction to batch data pipeline construction on GCP. Learners will gain experience with tools such as BigQuery and Spark on Dataproc, developing skills in data processing and management.

Database Administrator

Database Administrators manage and maintain database systems. This course may be useful for Database Administrators interested in specializing in big data or cloud-based databases, as it provides an introduction to batch data pipeline construction on GCP. Learners will gain experience with tools such as BigQuery and Spark on Dataproc, developing skills in data management and optimization.

Data Warehouse Engineer

Data Warehouse Engineers design, build, and maintain data warehouses, which are central repositories for large amounts of data. This course may be useful for Data Warehouse Engineers interested in specializing in cloud-based data warehouses or big data, as it provides an introduction to batch data pipeline construction on GCP. Learners will gain experience with tools such as BigQuery and Spark on Dataproc, developing skills in data integration and management.

ETL Developer

ETL Developers design and develop extract, transform, and load (ETL) processes, which move data from source systems to target systems. This course may be useful for ETL Developers interested in specializing in cloud-based ETL or big data, as it provides an introduction to batch data pipeline construction on GCP. Learners will gain experience with tools such as BigQuery and Spark on Dataproc, developing skills in data extraction, transformation, and loading.

Data Integration Engineer

Data Integration Engineers design and develop data integration solutions, which connect disparate data sources and enable data sharing. This course may be useful for Data Integration Engineers interested in specializing in cloud-based data integration or big data, as it provides an introduction to batch data pipeline construction on GCP. Learners will gain experience with tools such as BigQuery and Spark on Dataproc, developing skills in data extraction, transformation, and integration.

Big Data Architect

Big Data Architects design and manage big data systems, which are used to process and analyze large amounts of data. This course may be useful for Big Data Architects interested in specializing in cloud-based big data or batch data processing, as it provides an introduction to batch data pipeline construction on GCP. Learners will gain experience with tools such as BigQuery and Spark on Dataproc, developing skills in data management and processing.

Cloud Data Engineer

Cloud Data Engineers design and manage data systems in the cloud. This course may be useful for Cloud Data Engineers interested in specializing in batch data processing or big data, as it provides an introduction to batch data pipeline construction on GCP. Learners will gain experience with tools such as BigQuery and Spark on Dataproc, developing skills in data management and processing.

Data Management Consultant

Data Management Consultants help organizations improve their data management practices and leverage data to make better decisions. This course may be useful for Data Management Consultants interested in specializing in cloud-based data management or big data, as it provides an introduction to batch data pipeline construction on GCP. Learners will gain experience with tools such as BigQuery and Spark on Dataproc, developing skills in data management and consulting.

Reading list

We've selected 12 books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Building Batch Data Pipelines on GCP en Français.

Hadoop: The Definitive Guide

This book is a comprehensive guide to Hadoop, covering the framework from its architecture to its use cases.

(English) Hadoop: The Definitive Guide: Storage and Analysis...
