Zhi Wang

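This course focuses on the implementation, optimization, and application of advanced big data systems, covering the principles, implementation, and optimization strategies of systems such as distributed file systems, MapReduce/Spark, Storm/Spark Streaming, and Mahout.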

In recent years, AI technology has spread rapidly into many areas of industry. Big data systems, the foundation of today's data-driven AI, have therefore become critically important. This course introduces students to the basic concepts of big data systems, covering how data is stored, processed, and analyzed effectively. We start from the general principles of distributed system design; we then provide frameworks for how storage, computation, and network capabilities are scaled in big data systems; finally, to make these design principles easy to follow, our case studies use real industrial systems to demonstrate how the principles are applied in practice and how the performance and limitations of those systems are analyzed.

What's inside

Learning objectives

  • Basic concepts of big data systems
  • Principles of designing distributed systems
  • Frameworks for scaling storage, computation, and network capabilities
  • Case studies of recent industrial big data systems, including GFS, MapReduce, and Spark
  • Big data processing pipelines such as NoSQL, streaming, and graph data processing

Good to know

Know what's good, what to watch for, and possible dealbreakers
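Suited to learners studying advanced big data systems.
Suited to learners who want to understand the basic concepts of big data systems.
Suited to learners who want to improve their data storage, processing, and analysis skills.
The instructor has extensive experience with big data systems.
Course content covers core topics including the design, implementation, and application of big data systems.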

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Advanced Big Data Systems | 高级大数据系统 with these activities:
Learn Spark Streaming
Master the basic concepts and applications of Spark Streaming to strengthen your understanding of real-time big data processing.
  • Find and follow Spark Streaming tutorials
  • Complete the tutorials and practice the example code
  • Try a small streaming data processing project

Career center

Learners who complete Advanced Big Data Systems | 高级大数据系统 will develop knowledge and skills that may be useful to these careers:
Business Intelligence Analyst
Business Intelligence Analysts use data to help businesses make better decisions. This course can help aspiring Business Intelligence Analysts build a foundation in the concepts and principles of big data systems, including distributed file systems, MapReduce/Spark, Storm/Spark streaming, and Mahout. It also covers the optimization and application of these systems, making you a more well-rounded and effective Business Intelligence Analyst.
Quantitative Analyst
Quantitative Analysts use mathematical and statistical models to analyze data. This course can help aspiring Quantitative Analysts build a foundation in the concepts and principles of big data systems, including distributed file systems, MapReduce/Spark, Storm/Spark streaming, and Mahout. It also covers the optimization and application of these systems, making you a more well-rounded and effective Quantitative Analyst.
Data Mining Analyst
Data Mining Analysts use data mining techniques to extract knowledge and insights from data. This course can help aspiring Data Mining Analysts build a foundation in the concepts and principles of big data systems, including distributed file systems, MapReduce/Spark, Storm/Spark streaming, and Mahout. It also covers the optimization and application of these systems, making you a more well-rounded and effective Data Mining Analyst.
Software Engineer
Software Engineers design, develop and maintain software systems. This course can help aspiring Software Engineers build a foundation in the concepts and principles of big data systems, including distributed file systems, MapReduce/Spark, Storm/Spark streaming, and Mahout. It also covers the optimization and application of these systems, making you a more well-rounded and effective Software Engineer.
Data Analyst
Data Analysts collect, clean and analyze data to identify trends and patterns. This course can help aspiring Data Analysts build a foundation in the concepts and principles of big data systems, including distributed file systems, MapReduce/Spark, Storm/Spark streaming, and Mahout. It also covers the optimization and application of these systems, making you a more well-rounded and effective Data Analyst.
Cloud Engineer
Cloud Engineers design, build and maintain cloud computing systems and applications. This course can help aspiring Cloud Engineers build a foundation in the concepts and principles of big data systems, including distributed file systems, MapReduce/Spark, Storm/Spark streaming, and Mahout. It also covers the optimization and application of these systems, making you a more well-rounded and effective Cloud Engineer.
Data Engineer
Data Engineers design, build, and maintain data pipelines and infrastructure to support data-driven decision-making. This course can help aspiring Data Engineers build a foundation in the concepts and principles of big data systems, including distributed file systems, MapReduce/Spark, Storm/Spark streaming, and Mahout. It also covers the optimization and application of these systems, making you a more well-rounded and effective Data Engineer.
Data Scientist
Data Scientists use scientific methods, processes, algorithms and systems to extract knowledge and insights from data. This course can help aspiring Data Scientists understand the underlying principles and architectures of big data systems, which are essential for working with and analyzing large datasets. By taking this course, you will gain a competitive edge in the field of Data Science.
Machine Learning Engineer
Machine Learning Engineers design, develop and deploy machine learning models to solve real-world problems. This course can help aspiring Machine Learning Engineers understand the underlying principles and architectures of big data systems, which are essential for working with and analyzing large datasets used in machine learning. By taking this course, you will gain a competitive edge in the field of Machine Learning.
Cloud Architect
Cloud Architects design, build and manage cloud computing systems and applications. This course may be useful for aspiring Cloud Architects, as it covers the principles of designing distributed systems, as well as frameworks for scaling storage, computation and network capabilities, which are essential for designing and managing cloud-based systems and applications at scale.
DevOps Engineer
DevOps Engineers work to bridge the gap between development and operations teams, ensuring that software is built, tested, and deployed efficiently and reliably. This course may be useful for aspiring DevOps Engineers, as it covers the principles of designing distributed systems, as well as frameworks for scaling storage, computation and network capabilities, which are essential for designing and managing the infrastructure that supports software development and deployment.
Systems Administrator
Systems Administrators are responsible for the management and maintenance of computer systems and networks. This course may be useful for aspiring Systems Administrators, as it covers the principles of designing distributed systems, as well as frameworks for scaling storage, computation and network capabilities, which are essential for managing and maintaining complex systems at scale.
Software Architect
Software Architects design, build and maintain the overall architecture of software systems. This course may be useful for aspiring Software Architects, as it covers the principles of designing distributed systems, as well as frameworks for scaling storage, computation and network capabilities, which are essential for designing and managing complex software systems at scale.
Data Architect
Data Architects plan, design, and build an organization's data systems. This course may be useful for aspiring Data Architects, as it will teach you the principles of designing distributed systems, as well as frameworks for scaling storage, computation and network capabilities, which are critical concepts for designing efficient and reliable data systems at scale. Furthermore, this course provides case studies of recent industrial big data systems such as GFS, MapReduce and Spark, which will give you practical insights into the design and implementation of real-world data systems, making you a more competitive candidate for Data Architect roles.
Database Administrator
Database Administrators are responsible for the design, implementation, maintenance and security of database management systems. This course may be useful for aspiring Database Administrators, as it covers the principles of designing distributed systems, as well as frameworks for scaling storage, computation and network capabilities, which are essential for designing and managing efficient and reliable database systems at scale.

Reading list

We've selected 11 books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Advanced Big Data Systems | 高级大数据系统.
A comprehensive guide to Hadoop, covering its architecture, programming model, and ecosystem of tools. Provides a solid understanding of Hadoop's core concepts and how to use it effectively.
The official guide to Spark, providing a comprehensive overview of its architecture, programming model, and use cases. Offers detailed explanations of Spark's core concepts and how to use it for data processing and analytics.
Provides a comprehensive overview of the principles and patterns for designing and building data-intensive applications. Covers topics such as data modeling, data storage, and data processing.
Focuses on using Hadoop and MapReduce for natural language processing tasks. Provides practical examples and techniques for text processing, sentiment analysis, and machine learning with big data.
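Systematically introduces the concepts and design principles of distributed systems. Very helpful for understanding the fundamentals of distributed computing and storage in big data systems.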
An introduction to NoSQL databases, providing a clear explanation of the different types of NoSQL databases and their use cases. Offers guidance on choosing the right NoSQL database for a particular application.
Provides a comprehensive overview of distributed algorithms, covering fundamental concepts, models, and algorithms. Offers detailed explanations of distributed agreement, consensus, and fault tolerance.
Provides a foundational overview of data science, covering topics such as probability, statistics, linear algebra, and optimization. Offers insights into how these concepts are used in data analysis and machine learning.
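A classic textbook in the data mining field that systematically introduces the basic concepts and techniques of data mining. Very helpful for understanding the fundamentals of big data processing and analysis.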
A practical guide to using Python for data analysis and data manipulation. Provides detailed explanations of Python's data structures, libraries, and tools for data analysis.
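A classic textbook in the machine learning field, offering a comprehensive introduction to the basic principles, algorithms, and applications of machine learning.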

Similar courses

Here are nine courses similar to Advanced Big Data Systems | 高级大数据系统.
计算机操作系统
Most relevant
用Python玩转数据 Data Processing Using Python
Most relevant
Windows可视化程序设计
Most relevant
有限元分析与应用 | Finite Element Method (FEM) Analysis and...
Most relevant
操作系统原理(Operating Systems)
Most relevant
Data Structures and Algorithm Design Part II |...
Most relevant
Structural Equation Model and its Applications |...
Most relevant
Structural Equation Model and its Applications |...
Most relevant
操作系统与虚拟化安全
Most relevant