Cloud Computing Applications, Part 2: Big Data and Applications in the Cloud from Coursera

Welcome to the Cloud Computing Applications course, the second part of a two-course series designed to give you a comprehensive view of the world of Cloud Computing and Big Data!

In this second course we continue Cloud Computing Applications by exploring how the Cloud opens up data analytics of huge volumes of data that are static or streamed at high velocity and represent an enormous variety of information. Cloud applications and data analytics represent a disruptive change in the ways that society is informed by, and uses, information.

We start the first week by introducing some major systems for data analysis, including Spark, and the major frameworks and distributions of analytics applications, including Hortonworks, Cloudera, and MapR. By the middle of week one we introduce HDFS, the robust distributed file system that is used in many applications such as Hadoop. We finish week one by exploring the powerful MapReduce programming model and how distributed operating systems like YARN and Mesos support a flexible and scalable environment for Big Data analytics.

In week two, our course introduces large-scale data storage and the difficulties of reaching consensus in enormous stores that use large numbers of processors, memories, and disks. We discuss eventual consistency, ACID, and BASE, along with the consensus algorithms and coordination services used in data centers, including Paxos and ZooKeeper. Our course presents distributed key-value stores and in-memory databases like Redis that are used in data centers for performance. Next we present NoSQL databases. We visit HBase, the scalable, low-latency database that supports database operations in applications that use Hadoop. We then show how Spark SQL can run SQL queries over very large datasets. We finish week two with a presentation on distributed publish/subscribe systems using Kafka, a distributed log messaging system that is finding wide use in connecting Big Data and streaming applications together to form complex systems.

Week three moves to fast data and real-time streaming and introduces Storm, a technology used widely at companies such as Yahoo. We continue with Spark Streaming, Lambda and Kappa architectures, and a presentation of the streaming ecosystem.

Week four focuses on graph processing, machine learning, and deep learning. We introduce the ideas of graph processing and present Pregel, Giraph, and Spark GraphX. Then we move to machine learning with examples from Mahout and Spark; K-means, Naive Bayes, and frequent pattern mining (FPM) are given as examples. Spark ML and MLlib continue the theme of programmability and application construction. The last topic we cover in week four introduces deep learning technologies including Theano, TensorFlow, CNTK, MXNet, and Caffe on Spark.
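
For readers new to the MapReduce programming model named in the week one outline, the following is a minimal single-process sketch of its three stages (map, shuffle, reduce) applied to word counting. It is an illustration only and is not taken from the course materials: the function names and sample documents are assumptions, and a real Hadoop or Spark job would distribute these stages across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence on a list of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample input, chosen only for illustration.
docs = ["big data in the cloud", "the cloud stores big data"]
counts = word_count(docs)
print(counts)
```

Because each call to `map_phase` touches one document and each call to `reduce_phase` touches one key, both stages can run in parallel, which is the property the distributed frameworks exploit.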

What's inside

Syllabus

Course Orientation

You will become familiar with the course, your classmates, and our learning environment. The orientation will also help you obtain the technical skills required for the course.

Traffic lights

Read about what's good, what should give you pause, and possible dealbreakers

Starts with rudiments of data analytics and Big Data technologies

Develops an understanding of data analytics tools and methods

Provides hands-on experience with Apache Spark, Hadoop, and other popular Big Data tools

Reviews summary

Big data and cloud applications overview

According to learners, this course offers a comprehensive overview of big data technologies and applications in the cloud. Students appreciate the coverage of frameworks like Spark, various storage systems, and streaming architectures. While many find the content highly relevant for professionals in data science and engineering, some note that the pace is fast and it covers a wide breadth of topics rather than deep dives into each. A recurring point is the potential difficulty with the lab environments, which some reviewers found challenging to set up or get working correctly. Despite this, the course is largely seen as providing a solid foundation in cloud-based big data processing.

Assumes prior technical knowledge.

"Be aware that this course assumes you have a strong background in programming and possibly some familiarity with cloud concepts."

"I struggled a bit because I wasn't as comfortable with the Linux command line as needed for the labs."

"It helps a lot if you already know Java or Scala before diving into the programming assignments."

"This is definitely not an introductory course; it builds on existing technical skills."

Strong emphasis on Spark framework.

"The focus on Spark throughout the course was very useful, as it's a widely used technology in the industry today."

"I particularly appreciated the modules covering Spark for both batch processing and streaming data."

"Learning Spark SQL and MLlib through this course provided practical skills."

"The examples using Spark were clear and helped solidify the concepts taught in the lectures."

Content is useful for careers.

"As a data engineer, I found the topics directly applicable to my work and gained valuable insights into different systems."

"This course helped me understand the architecture behind many big data platforms I encounter professionally."

"The knowledge gained, especially regarding distributed systems and streaming, is highly relevant in today's tech job market."

"It's a great course if you're looking to advance your career in cloud or data-related fields."

Explores many key Big Data technologies.

"This course provides a fantastic overview of a wide array of big data technologies including Spark, Kafka, and various NoSQL databases."

"I really liked how it covered streaming systems like Storm and Spark Streaming, plus machine learning and deep learning frameworks."

"It gives you a solid grasp of the landscape, touching on storage, processing, and applications across the cloud."

"The breadth of topics covered, from HDFS to ML algorithms, is impressive for one course."

Hands-on assignments can be difficult.

"The lab setup was frustrating and took significant time to troubleshoot before I could even start the assignments."

"I found the practical exercises quite challenging, requiring a good understanding of Linux and the specific environments used."

"Getting the Spark and Hadoop environments configured for the assignments was definitely the hardest part for me."

"The hands-on part is valuable but be prepared for potential issues with setting up the required software."

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Cloud Computing Applications, Part 2: Big Data and Applications in the Cloud with these activities:

Read 'Programming for Big Data' by Michael Gannon

Provides hands-on experience and a deeper understanding of big data principles, programming frameworks, and applications.

Purchase or borrow the book 'Programming for Big Data'
Read the book carefully, taking notes and identifying key concepts
Practice the examples provided in the book to apply your understanding
Summarize and reflect on the main ideas covered in each chapter

Join a study group for Big Data fundamentals

Provides a supportive environment for discussing concepts, sharing knowledge, and clarifying doubts.

Find or create a study group with peers taking the same course
Set regular meeting times and discuss assigned topics or questions
Engage in active listening, ask questions, and share insights
Summarize key points and follow up on any unresolved questions

Solve Apache Spark practice problems

Enhances understanding of Apache Spark's syntax, functionality, and use cases.

Find a website or platform offering Apache Spark practice problems
Attempt to solve the problems independently, referring to documentation or tutorials when needed
Review solutions and identify areas for improvement

Complete tutorials on Apache Kafka for real-time data processing

Deepens understanding of Apache Kafka's architecture, functionality, and practical applications.

Find online tutorials or courses on Apache Kafka
Follow the tutorials step-by-step, setting up Kafka and practicing its features
Experiment with different data sources and use cases to gain practical experience
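
Before setting up a real broker, it can help to see the core idea behind a distributed log messaging system like Kafka: producers append to a log, and each consumer group reads from its own offset. The sketch below is a toy in-memory model of that idea, assuming a single partition. `TopicLog`, `publish`, and `poll` are hypothetical names chosen for illustration and are not the Kafka client API.

```python
class TopicLog:
    """Toy single-partition topic held in memory (hypothetical, not Kafka's API)."""

    def __init__(self):
        self._records = []   # append-only list of messages
        self._offsets = {}   # consumer group name -> next offset to read

    def publish(self, message):
        """Append a message and return the offset it was stored at."""
        self._records.append(message)
        return len(self._records) - 1

    def poll(self, group, max_records=10):
        """Return unread messages for a group and advance that group's offset."""
        start = self._offsets.get(group, 0)
        batch = self._records[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch


log = TopicLog()
for event in ["click", "view", "purchase"]:
    log.publish(event)

# Two independent consumer groups read the same log at their own pace.
analytics_first = log.poll("analytics", max_records=2)
analytics_rest = log.poll("analytics")
billing_all = log.poll("billing")
print(analytics_first, analytics_rest, billing_all)
```

Messages are never removed when read, so a new group such as `billing` still sees every event from the start of the log. A real deployment adds partitioning, replication, and durable offset storage on top of this.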

Develop a Big Data use case presentation

Encourages critical thinking, practical application, and communication skills in the context of Big Data.

Identify a specific industry or problem that could benefit from Big Data
Research and gather relevant data, including sources, types, and formats
Develop a data analysis plan, including the tools and techniques to be used
Analyze the data, draw insights, and formulate recommendations
Create a presentation to communicate the findings, including visuals, charts, and a clear narrative

Create a comprehensive course summary

Reinforces learning by synthesizing and organizing course materials.

Review all lecture notes, readings, and assignments
Identify key concepts, definitions, and examples
Create a structured outline or summary document
Reference specific pages or sections in the original materials for easy retrieval

Career center

Learners who complete Cloud Computing Applications, Part 2: Big Data and Applications in the Cloud will develop knowledge and skills that may be useful to these careers:

Data Engineer

Data Engineers are responsible for designing, building, testing, and maintaining the data pipelines that collect, transform, and store data for use by data analysis applications. This course provides a strong foundation for a career as a Data Engineer by teaching you the core concepts of Big Data analytics, including data storage, data processing, and data analysis. You will also learn how to use popular Big Data tools and technologies, such as Apache Spark, Hadoop, and HDFS.

Data Scientist

Data Scientists use data to solve business problems. They collect, clean, and analyze data to identify patterns and trends. They then use these insights to develop data-driven solutions. This course provides a strong foundation for a career as a Data Scientist by teaching you the core concepts of Big Data analytics, including data storage, data processing, and data analysis. You will also learn how to use popular Big Data tools and technologies, such as Apache Spark, Hadoop, and HDFS.

Data Analyst

Data Analysts use data to understand and improve business performance. They collect, clean, and analyze data to identify trends and patterns. They then use these insights to make recommendations to stakeholders. This course provides a strong foundation for a career as a Data Analyst by teaching you the core concepts of Big Data analytics, including data storage, data processing, and data analysis. You will also learn how to use popular Big Data tools and technologies, such as Apache Spark, Hadoop, and HDFS.

Machine Learning Engineer

Machine Learning Engineers build, deploy, and maintain machine learning models. They work with data scientists to identify the appropriate machine learning algorithms and models for a given problem. They then develop and deploy the models and monitor their performance. This course provides a strong foundation for a career as a Machine Learning Engineer by teaching you the core concepts of Big Data analytics, including data storage, data processing, and data analysis. You will also learn how to use popular Big Data tools and technologies, such as Apache Spark, Hadoop, and HDFS.

Cloud Architect

Cloud Architects design, build, and maintain cloud computing environments. They work with businesses to identify their cloud computing needs and develop solutions that meet those needs. This course may be useful for a career as a Cloud Architect by showing how cloud environments support large-scale data storage, distributed processing frameworks such as Hadoop and Spark, and streaming systems such as Kafka and Storm.

Big Data Architect

Big Data Architects design, build, and maintain Big Data systems. They work with businesses to identify their Big Data needs and develop solutions that meet those needs. This course provides a strong foundation for a career as a Big Data Architect by teaching you the core concepts of Big Data analytics, including data storage, data processing, and data analysis. You will also learn how to use popular Big Data tools and technologies, such as Apache Spark, Hadoop, and HDFS.

Software Engineer

Software Engineers design, build, and maintain software applications. They work with businesses to identify their software needs and develop solutions that meet those needs. This course may be useful for a career as a Software Engineer by teaching you the core concepts of software engineering, including software design, software development, and software testing. You will also learn how to use popular software engineering tools and technologies, such as Java, Python, and C++.

Database Administrator

Database Administrators design, build, and maintain databases. They work with businesses to identify their database needs and develop solutions that meet those needs. This course may be useful for a career as a Database Administrator by introducing distributed key-value stores, in-memory databases such as Redis, NoSQL databases such as HBase, and SQL queries on large datasets with Spark SQL.

Systems Engineer

Systems Engineers design, build, and maintain computer systems. They work with businesses to identify their computing needs and develop solutions that meet those needs. This course may be useful for a career as a Systems Engineer by teaching you the core concepts of computer engineering, including computer hardware, computer software, and computer networks. You will also learn how to use popular computer engineering tools and technologies, such as Linux, Windows, and Cisco.

Network Engineer

Network Engineers design, build, and maintain computer networks. They work with businesses to identify their networking needs and develop solutions that meet those needs. This course may be useful for a career as a Network Engineer by teaching you the core concepts of computer networking, including network design, network development, and network administration. You will also learn how to use popular networking tools and technologies, such as Cisco, Juniper, and F5.

Security Engineer

Security Engineers design, build, and maintain computer security systems. They work with businesses to identify their security needs and develop solutions that meet those needs. This course may be useful for a career as a Security Engineer by teaching you the core concepts of computer security, including computer security design, computer security development, and computer security administration. You will also learn how to use popular computer security tools and technologies, such as firewalls, intrusion detection systems, and antivirus software.

Business Analyst

Business Analysts work with businesses to identify their business needs and develop solutions that meet those needs. This course may be useful for a career as a Business Analyst by teaching you the core concepts of business analysis, including business analysis techniques, business analysis tools, and business analysis deliverables. You will also learn how to use popular business analysis tools and technologies, such as Microsoft Visio, IBM Rational Rose, and Oracle Business Process Analysis Suite.

Project Manager

Project Managers plan, execute, and close projects. They work with businesses to identify their project needs and develop solutions that meet those needs. This course may be useful for a career as a Project Manager by teaching you the core concepts of project management, including project management techniques, project management tools, and project management deliverables. You will also learn how to use popular project management tools and technologies, such as Microsoft Project, Asana, and Trello.

Technical Writer

Technical Writers write technical documentation. They work with businesses to identify their documentation needs and develop solutions that meet those needs. This course may be useful for a career as a Technical Writer by teaching you the core concepts of technical writing, including technical writing techniques, technical writing tools, and technical writing deliverables. You will also learn how to use popular technical writing tools and technologies, such as Microsoft Word, Adobe FrameMaker, and MadCap Flare.

IT Auditor

IT Auditors audit computer systems and networks. They work with businesses to identify their IT audit needs and develop solutions that meet those needs. This course may be useful for a career as an IT Auditor by teaching you the core concepts of IT auditing, including IT audit techniques, IT audit tools, and IT audit deliverables. You will also learn how to use popular IT audit tools and technologies, such as ACL and IDEA.
