We may earn an affiliate commission when you visit our partners.
Reza Farivar and Roy H. Campbell

Welcome to the Cloud Computing Applications course, the second part of a two-course series designed to give you a comprehensive view on the world of Cloud Computing and Big Data!

In this second course we continue Cloud Computing Applications by exploring how the Cloud opens up analytics on huge volumes of data that are static or streamed at high velocity and that represent an enormous variety of information. Cloud applications and data analytics represent a disruptive change in the ways that society is informed by, and uses, information.

We start the first week by introducing some major systems for data analysis, including Spark, and the major frameworks and distributions of analytics applications, including Hortonworks, Cloudera, and MapR. By the middle of week one we introduce HDFS, the robust distributed file system that is used in many applications like Hadoop. We finish week one by exploring the powerful MapReduce programming model and how distributed operating systems like YARN and Mesos support a flexible and scalable environment for Big Data analytics.

In week two, the course introduces large-scale data storage and the problems of reaching consensus in enormous stores that use large numbers of processors, memories, and disks. We discuss eventual consistency, ACID, and BASE, and the consensus mechanisms used in data centers, including Paxos and ZooKeeper. The course presents distributed key-value stores and in-memory databases like Redis that are used in data centers for performance. Next we present NoSQL databases. We visit HBase, the scalable, low-latency database that supports database operations in applications that use Hadoop. We then show how Spark SQL can run SQL queries on huge data. We finish week two with a presentation on distributed publish/subscribe systems using Kafka, a distributed log messaging system that is finding wide use in connecting Big Data and streaming applications together to form complex systems.

Week three moves to fast data and real-time streaming and introduces Storm, a technology used widely at companies such as Yahoo. We continue with Spark Streaming, Lambda and Kappa architectures, and a presentation of the streaming ecosystem.

Week four focuses on graph processing, machine learning, and deep learning. We introduce the ideas of graph processing and present Pregel, Giraph, and Spark GraphX. Then we move to machine learning with examples from Mahout and Spark; k-means, Naive Bayes, and frequent pattern mining (FPM) are given as examples. Spark ML and MLlib continue the theme of programmability and application construction. The last topic in week four introduces deep learning technologies including Theano, TensorFlow, CNTK, MXNet, and Caffe on Spark.

What's inside

Syllabus

Course Orientation
You will become familiar with the course, your classmates, and our learning environment. The orientation will also help you obtain the technical skills required for the course.

Traffic lights

Read about what's good, what should give you pause, and possible dealbreakers.
Starts with rudiments of data analytics and Big Data technologies
Develops an understanding of data analytics tools and methods
Provides hands-on experience with Apache Spark, Hadoop, and other popular Big Data tools

Reviews summary

Big data and cloud applications overview

According to learners, this course offers a comprehensive overview of big data technologies and applications in the cloud. Students appreciate the coverage of frameworks like Spark, various storage systems, and streaming architectures. While many find the content highly relevant for professionals in data science and engineering, some note that the pace is fast and it covers a wide breadth of topics rather than deep dives into each. A recurring point is the potential difficulty with the lab environments, which some reviewers found challenging to set up or get working correctly. Despite this, the course is largely seen as providing a solid foundation in cloud-based big data processing.
Assumes prior technical knowledge.
"Be aware that this course assumes you have a strong background in programming and possibly some familiarity with cloud concepts."
"I struggled a bit because I wasn't as comfortable with the Linux command line as needed for the labs."
"It helps a lot if you already know Java or Scala before diving into the programming assignments."
"This is definitely not an introductory course; it builds on existing technical skills."
Strong emphasis on Spark framework.
"The focus on Spark throughout the course was very useful, as it's a widely used technology in the industry today."
"I particularly appreciated the modules covering Spark for both batch processing and streaming data."
"Learning Spark SQL and MLlib through this course provided practical skills."
"The examples using Spark were clear and helped solidify the concepts taught in the lectures."
Content is useful for careers.
"As a data engineer, I found the topics directly applicable to my work and gained valuable insights into different systems."
"This course helped me understand the architecture behind many big data platforms I encounter professionally."
"The knowledge gained, especially regarding distributed systems and streaming, is highly relevant in today's tech job market."
"It's a great course if you're looking to advance your career in cloud or data-related fields."
Explores many key Big Data technologies.
"This course provides a fantastic overview of a wide array of big data technologies including Spark, Kafka, and various NoSQL databases."
"I really liked how it covered streaming systems like Storm and Spark Streaming, plus machine learning and deep learning frameworks."
"It gives you a solid grasp of the landscape, touching on storage, processing, and applications across the cloud."
"The breadth of topics covered, from HDFS to ML algorithms, is impressive for one course."
Hands-on assignments can be difficult.
"The lab setup was frustrating and took significant time to troubleshoot before I could even start the assignments."
"I found the practical exercises quite challenging, requiring a good understanding of Linux and the specific environments used."
"Getting the Spark and Hadoop environments configured for the assignments was definitely the hardest part for me."
"The hands-on part is valuable but be prepared for potential issues with setting up the required software."

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Cloud Computing Applications, Part 2: Big Data and Applications in the Cloud with these activities:
Read 'Programming for Big Data' by Michael Gannon
Provides hands-on experience and a deeper understanding of big data principles, programming frameworks, and applications.
  • Purchase or borrow the book 'Programming for Big Data'
  • Read the book carefully, taking notes and identifying key concepts
  • Practice the examples provided in the book to apply your understanding
  • Summarize and reflect on the main ideas covered in each chapter
Join a study group for Big Data fundamentals
Provides a supportive environment for discussing concepts, sharing knowledge, and clarifying doubts.
  • Find or create a study group with peers taking the same course
  • Set regular meeting times and discuss assigned topics or questions
  • Engage in active listening, ask questions, and share insights
  • Summarize key points and follow up on any unresolved questions
Solve Apache Spark practice problems
Enhances understanding of Apache Spark's syntax, functionality, and use cases.
  • Find a website or platform offering Apache Spark practice problems
  • Attempt to solve the problems independently, referring to documentation or tutorials when needed
  • Review solutions and identify areas for improvement
Complete tutorials on Apache Kafka for real-time data processing
Deepens understanding of Apache Kafka's architecture, functionality, and practical applications.
  • Find online tutorials or courses on Apache Kafka
  • Follow the tutorials step-by-step, setting up Kafka and practicing its features
  • Experiment with different data sources and use cases to gain practical experience
Develop a Big Data use case presentation
Encourages critical thinking, practical application, and communication skills in the context of Big Data.
  • Identify a specific industry or problem that could benefit from Big Data
  • Research and gather relevant data, including sources, types, and formats
  • Develop a data analysis plan, including the tools and techniques to be used
  • Analyze the data, draw insights, and formulate recommendations
  • Create a presentation to communicate the findings, including visuals, charts, and a clear narrative
Create a comprehensive course summary
Reinforces learning by synthesizing and organizing course materials.
  • Review all lecture notes, readings, and assignments
  • Identify key concepts, definitions, and examples
  • Create a structured outline or summary document
  • Reference specific pages or sections in the original materials for easy retrieval

Career center

Learners who complete Cloud Computing Applications, Part 2: Big Data and Applications in the Cloud will develop knowledge and skills that may be useful to these careers:
Data Engineer
Data Engineers are responsible for designing, building, testing, and maintaining the data pipelines that collect, transform, and store data for use by data analysis applications. This course provides a strong foundation for a career as a Data Engineer by teaching you the core concepts of Big Data analytics, including data storage, data processing, and data analysis. You will also learn how to use popular Big Data tools and technologies, such as Apache Spark, Hadoop, and HDFS.
Data Scientist
Data Scientists use data to solve business problems. They collect, clean, and analyze data to identify patterns and trends. They then use these insights to develop data-driven solutions. This course provides a strong foundation for a career as a Data Scientist by teaching you the core concepts of Big Data analytics, including data storage, data processing, and data analysis. You will also learn how to use popular Big Data tools and technologies, such as Apache Spark, Hadoop, and HDFS.
Data Analyst
Data Analysts use data to understand and improve business performance. They collect, clean, and analyze data to identify trends and patterns. They then use these insights to make recommendations to stakeholders. This course provides a strong foundation for a career as a Data Analyst by teaching you the core concepts of Big Data analytics, including data storage, data processing, and data analysis. You will also learn how to use popular Big Data tools and technologies, such as Apache Spark, Hadoop, and HDFS.
Machine Learning Engineer
Machine Learning Engineers build, deploy, and maintain machine learning models. They work with data scientists to identify the appropriate machine learning algorithms and models for a given problem. They then develop and deploy the models and monitor their performance. This course provides a strong foundation for a career as a Machine Learning Engineer by teaching you the core concepts of Big Data analytics, including data storage, data processing, and data analysis. You will also learn how to use popular Big Data tools and technologies, such as Apache Spark, Hadoop, and HDFS.
Cloud Architect
Cloud Architects design, build, and maintain cloud computing environments. They work with businesses to identify their cloud computing needs and develop solutions that meet those needs. This course provides a strong foundation for a career as a Cloud Architect by teaching you the core concepts of cloud computing, including cloud computing infrastructure, cloud computing services, and cloud computing security. You will also learn how to use popular cloud computing platforms, such as AWS, Azure, and GCP.
Big Data Architect
Big Data Architects design, build, and maintain Big Data systems. They work with businesses to identify their Big Data needs and develop solutions that meet those needs. This course provides a strong foundation for a career as a Big Data Architect by teaching you the core concepts of Big Data analytics, including data storage, data processing, and data analysis. You will also learn how to use popular Big Data tools and technologies, such as Apache Spark, Hadoop, and HDFS.
Software Engineer
Software Engineers design, build, and maintain software applications. They work with businesses to identify their software needs and develop solutions that meet those needs. This course may be useful for a career as a Software Engineer by teaching you the core concepts of software engineering, including software design, software development, and software testing. You will also learn how to use popular software engineering tools and technologies, such as Java, Python, and C++.
Database Administrator
Database Administrators design, build, and maintain databases. They work with businesses to identify their database needs and develop solutions that meet those needs. This course may be useful for a career as a Database Administrator by teaching you the core concepts of database management, including database design, database development, and database administration. You will also learn how to use popular database management systems, such as MySQL, PostgreSQL, and Oracle.
Systems Engineer
Systems Engineers design, build, and maintain computer systems. They work with businesses to identify their computing needs and develop solutions that meet those needs. This course may be useful for a career as a Systems Engineer by teaching you the core concepts of computer engineering, including computer hardware, computer software, and computer networks. You will also learn how to use popular computer engineering tools and technologies, such as Linux, Windows, and Cisco.
Network Engineer
Network Engineers design, build, and maintain computer networks. They work with businesses to identify their networking needs and develop solutions that meet those needs. This course may be useful for a career as a Network Engineer by teaching you the core concepts of computer networking, including network design, network development, and network administration. You will also learn how to use popular networking tools and technologies, such as Cisco, Juniper, and F5.
Security Engineer
Security Engineers design, build, and maintain computer security systems. They work with businesses to identify their security needs and develop solutions that meet those needs. This course may be useful for a career as a Security Engineer by teaching you the core concepts of computer security, including computer security design, computer security development, and computer security administration. You will also learn how to use popular computer security tools and technologies, such as firewalls, intrusion detection systems, and antivirus software.
Business Analyst
Business Analysts work with businesses to identify their business needs and develop solutions that meet those needs. This course may be useful for a career as a Business Analyst by teaching you the core concepts of business analysis, including business analysis techniques, business analysis tools, and business analysis deliverables. You will also learn how to use popular business analysis tools and technologies, such as Microsoft Visio, IBM Rational Rose, and Oracle Business Process Analysis Suite.
Project Manager
Project Managers plan, execute, and close projects. They work with businesses to identify their project needs and develop solutions that meet those needs. This course may be useful for a career as a Project Manager by teaching you the core concepts of project management, including project management techniques, project management tools, and project management deliverables. You will also learn how to use popular project management tools and technologies, such as Microsoft Project, Asana, and Trello.
Technical Writer
Technical Writers write technical documentation. They work with businesses to identify their documentation needs and develop solutions that meet those needs. This course may be useful for a career as a Technical Writer by teaching you the core concepts of technical writing, including technical writing techniques, technical writing tools, and technical writing deliverables. You will also learn how to use popular technical writing tools and technologies, such as Microsoft Word, Adobe FrameMaker, and MadCap Flare.
IT Auditor
IT Auditors audit computer systems and networks. They work with businesses to identify their IT audit needs and develop solutions that meet those needs. This course may be useful for a career as an IT Auditor by teaching you the core concepts of IT auditing, including IT audit techniques, IT audit tools, and IT audit deliverables. You will also learn how to use popular IT audit tools and technologies, such as ACL, IDEA, and Argh.

Reading list

We've selected 25 books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Cloud Computing Applications, Part 2: Big Data and Applications in the Cloud.
Definitive guide to Spark, the unified analytics engine for big data processing. It would be a valuable resource for this course as it provides a comprehensive overview of Spark's capabilities.
Comprehensive guide to big data analytics, covering everything from strategic planning to enterprise integration. It would be a valuable addition to this course as it provides a broader perspective on the field of big data.
Practical guide to learning Spark. It would be a valuable resource for this course as it provides a hands-on approach to learning Spark's capabilities.
Provides a comprehensive overview of Spark, including its architecture, components, and applications. It is a valuable resource for anyone who wants to learn more about Spark and how to use it to process big data.
Provides a comprehensive overview of cloud computing. It would be a valuable addition to this course as it provides a foundation in the principles of cloud computing.
Provides a comprehensive overview of data management for big data. It would be a valuable addition to this course as it provides a foundation in the principles of data management.
Provides a practical introduction to deep learning for programmers, using the Fastai library and PyTorch, with a focus on building and deploying real-world applications.
Provides a practical introduction to data analysis with Pandas. It would be a valuable addition to this course as it provides a foundation in the principles of data analysis.
Provides a practical introduction to deep learning with Python, covering fundamental concepts and popular deep learning libraries.
Introduces NoSQL databases and their advantages over traditional relational databases, providing a practical guide to their use and implementation.
Provides a comprehensive overview of NoSQL databases, including their design, architecture, and applications. It is a valuable resource for anyone who wants to learn more about NoSQL databases and how to use them to build scalable, high-performance applications.
Offers a non-technical introduction to data science, focusing on how to apply data-driven insights to business problems.
Provides a comprehensive overview of big data analytics with Java, including its methods, tools, and applications. It is a valuable resource for anyone who wants to learn more about big data analytics and how to use Java to build data-intensive applications.
Provides a comprehensive overview of Hadoop operations, including its architecture, components, and administration. It is a valuable resource for anyone who wants to learn more about Hadoop operations and how to manage Hadoop clusters.
Provides a comprehensive overview of graph databases, including their design, architecture, and applications. It is a valuable resource for anyone who wants to learn more about graph databases and how to use them to build scalable, high-performance applications.
Provides a comprehensive overview of cloud computing for data analysis, including its benefits, challenges, and applications. It is a valuable resource for anyone who wants to learn more about cloud computing and how to use it to build data-intensive applications.
A practical guide to using Redis, an open-source, in-memory data structure store, for caching and other applications.
A comprehensive guide to deep learning, providing a theoretical and practical foundation for this rapidly growing field.
A classic textbook on statistical learning, providing a comprehensive overview of the field and its applications.
An introduction to big data analytics, providing a high-level overview of the field and its applications.
A comprehensive textbook on machine learning, providing a probabilistic approach to the field.

© 2016 - 2025 OpenCourser