We may earn an affiliate commission when you visit our partners.
Ilkay Altintas and Amarnath Gupta

Interested in increasing your knowledge of the Big Data landscape? This course is for those new to data science and interested in understanding why the Big Data Era has come to be. It is for those who want to become conversant with the terminology and the core concepts behind big data problems, applications, and systems. It is for those who want to start thinking about how Big Data might be useful in their business or career. It provides an introduction to one of the most common frameworks, Hadoop, that has made big data analysis easier and more accessible -- increasing the potential for data to transform our world!

At the end of this course, you will be able to:

* Describe the Big Data landscape, including examples of real-world big data problems and the three key sources of Big Data: people, organizations, and sensors.

* Explain the V’s of Big Data (volume, velocity, variety, veracity, valence, and value) and why each impacts data collection, monitoring, storage, analysis and reporting.

* Get value out of Big Data by using a 5-step process to structure your analysis.

* Identify what is and is not a big data problem, and recast big data problems as data science questions.

* Provide an explanation of the architectural components and programming models used for scalable big data analysis.

* Summarize the features and value of core Hadoop stack components including the YARN resource and job management system, the HDFS file system and the MapReduce programming model.

* Install and run a program using Hadoop!

This course is for those new to data science. No prior programming experience is needed, although the ability to install applications and utilize a virtual machine is necessary to complete the hands-on assignments.

Hardware Requirements:

(A) Quad-core processor (VT-x or AMD-V support recommended), 64-bit; (B) 8 GB RAM; (C) 20 GB free disk space. How to find your hardware information: (Windows) Open System by clicking the Start button, right-clicking Computer, and then clicking Properties; (Mac) Open Overview by clicking the Apple menu and then clicking “About This Mac.” Most computers with 8 GB RAM purchased in the last 3 years will meet the minimum requirements. You will need a high-speed internet connection because you will be downloading files up to 4 GB in size.

Software Requirements:

This course relies on several open-source software tools, including Apache Hadoop. All required software can be downloaded and installed free of charge. Software requirements include: Windows 7+, Mac OS X 10.10+, Ubuntu 14.04+, or CentOS 6+; and VirtualBox 5+.

What's inside

Syllabus

Welcome
Welcome to the Big Data Specialization! We're excited for you to get to know us and we're looking forward to learning about you!
Big Data: Why and Where
Data -- it's been around (even digitally) for a while. What makes data "big" and where does this big data come from?
Characteristics of Big Data and Dimensions of Scalability
You may have heard of the "Big Vs". We'll give examples and descriptions of the commonly discussed 5. But, we want to propose a 6th V and we'll ask you to practice writing Big Data questions targeting this V -- value.
Data Science: Getting Value out of Big Data
We love science and we love computing, don't get us wrong. But the reality is we care about Big Data because it can bring value to our companies, our lives, and the world. In this module we'll introduce a 5 step process for approaching data science problems.
Foundations for Big Data Systems and Programming
Big Data requires new programming frameworks and systems. For this course, we don't require programming knowledge or experience -- but we do want to give you a grounding in some of the key concepts.
Systems: Getting Started with Hadoop
Let's look at some details of Hadoop and MapReduce. Then we'll go "hands on" and actually perform a simple MapReduce task in the Cloudera VM. Pay attention, as we'll guide you in "learning by doing" by diagramming a MapReduce task for a Peer Review.
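To make the MapReduce idea from the last module more concrete, here is a minimal sketch of a word count in plain Python. It is not Hadoop code and does not reproduce the course's Cloudera VM exercise; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are hypothetical, chosen only to illustrate the three stages usually described for MapReduce: map, shuffle, and reduce.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three stages in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, standing in for lines of a large file.
    sample = ["big data is big", "data is everywhere"]
    print(word_count(sample))
```

In a real Hadoop job the map and reduce steps would run in parallel across many machines and the framework would handle the shuffle; this single-process version only shows how the data flows between the stages.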

Good to know

Know what's good, what to watch for, and possible dealbreakers
Explores foundational concepts of Big Data, providing a strong starting point for beginners
Studies Big Data's dimensions of scalability, enhancing understanding of its complexities
Develops critical skills for approaching data science problems using a structured methodology
Provides valuable insights into the core components of Hadoop stack, empowering learners with essential knowledge for working with Big Data
Offers hands-on experience with Hadoop, enabling learners to apply their understanding practically

Reviews summary

Big data: a beginner's guide for decision makers

Learners say this introductory course in Big Data provides a solid foundation for those with no prior knowledge. It covers concepts such as why Big Data is important, the 5 V's of Big Data, HDFS, Hadoop, and the MapReduce algorithm. The course also features a hands-on lab where you can practice using Hadoop. However, some reviewers note that the content is a bit outdated and that the quizzes can be challenging.
Students report that the quizzes can be challenging, but they also appreciate that the quizzes help them to retain the information.
"The quizzes in the first two weeks are not practical at all."
"Sometime the video duration takes to long but overall is good."
This course earns largely positive reviews from students, who find it engaging and informative.
"I loved this course which covers the concepts very well with adequate dose of hands on and quizzes."
"Very good introductory training,nevertheless is a little outdated and needs to keep up with changes on the respective technology domain"
The course includes a practical lab where students can apply their knowledge of Hadoop.
"I Loved the course and Expecting a much more detailed course with hadoop specially and spark as well ,both the instructors were very well thank you"
"It was an amazing experience taking this course and I must say I'll love to take more.Coursera is indeed a nice platform for online classes."
Some students mention that the course material is outdated and could use updates.
"You can easily get away with skipping the first two weeks of the course."
"The whole specialisation is a joke."

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Introduction to Big Data with these activities:
Read "Big Data: A Revolution That Will Transform How We Live, Work, and Think" by Viktor Mayer-Schönberger and Kenneth Cukier
Gain valuable insights into the history and potential of big data to enhance your understanding of the course material.
  • Read the book thoroughly, taking notes on key concepts and examples
  • Identify the main themes and arguments presented by the authors
  • Summarize the book's key points in your own words
Review Foundations of Data Science
Solidify your foundational data science concepts before class starts to set yourself up for successful learning.
  • Review probability theory concepts (e.g., Bayes' theorem, conditional probability, random variables)
  • Review statistical concepts (e.g., mean, variance, standard deviation, hypothesis testing)
  • Practice applying statistics to real-world scenarios
Organize and Review Course Materials
Enhance your learning by organizing and reviewing course materials regularly to improve retention and understanding.
  • Create a system for organizing lecture notes, readings, and assignments
  • Review materials regularly, summarizing key concepts and identifying areas for further study
  • Seek clarification on any unclear concepts with the instructor or classmates
Complete Hadoop Tutorials
Develop practical skills in using Hadoop to enhance your understanding of big data systems.
  • Find tutorials on Hadoop concepts and installation
  • Follow the tutorials step-by-step, practicing the concepts
  • Complete exercises or projects to test your understanding
Participate in a Peer Discussion on Big Data Applications
Connect with classmates and share insights to broaden your understanding of big data applications in different industries.
  • Join or create a peer discussion group
  • Prepare by researching case studies or industry trends related to big data
  • Engage in discussions, sharing your perspectives and learning from others
Solve Big Data Problems Using MapReduce Simulations
Strengthen your understanding of MapReduce and develop problem-solving skills by practicing simulations.
  • Find online simulators or exercises for MapReduce
  • Practice solving data processing problems using MapReduce
  • Review your solutions and identify areas for improvement
Design a Scalable Data Architecture for a Real-World Problem
Apply your knowledge of big data systems to design scalable solutions for real-world data challenges.
  • Identify a real-world problem that involves big data
  • Design a scalable data architecture to address the problem using Hadoop or similar technologies
  • Present your design, including its components, data flow, and scalability considerations
Contribute to the Apache Hadoop Project
Deepen your understanding of Hadoop and contribute to its development by participating in the open-source community.
  • Review the Hadoop documentation and codebase
  • Identify a bug or feature enhancement to work on
  • Submit a pull request with your proposed changes

Career center

Learners who complete Introduction to Big Data will develop knowledge and skills that may be useful to these careers:
Data Scientist
A Data Scientist gathers, organizes, and analyzes big data, often using Hadoop, to create insights that help businesses make better decisions. This course provides an introduction to big data and the Hadoop ecosystem. It offers hands-on exercises that will help you get started with Hadoop and gain a foundation in big data analysis. This course may be useful for aspiring Data Scientists who want to learn more about big data and Hadoop.
Machine Learning Engineer
A Machine Learning Engineer designs and builds machine learning models. This course provides an introduction to big data and the Hadoop ecosystem. It offers hands-on exercises that will help you get started with Hadoop and gain a foundation in big data analysis. This course may be useful for aspiring Machine Learning Engineers who want to learn more about big data and Hadoop.
Business Analyst
A Business Analyst helps businesses understand and solve problems by analyzing data. This course provides an introduction to big data and the Hadoop ecosystem. It offers hands-on exercises that will help you get started with Hadoop and gain a foundation in big data analysis. This course may be useful for aspiring Business Analysts who want to learn more about big data and Hadoop.
Data Architect
A Data Architect designs and builds data systems. This course provides an introduction to big data and the Hadoop ecosystem. It offers hands-on exercises that will help you get started with Hadoop and gain a foundation in big data analysis. This course may be useful for aspiring Data Architects who want to learn more about big data and Hadoop.
Software Engineer
A Software Engineer designs, develops, and maintains software systems. This course provides an introduction to big data and the Hadoop ecosystem. It offers hands-on exercises that will help you get started with Hadoop and gain a foundation in big data analysis. This course may be useful for aspiring Software Engineers who want to learn more about big data and Hadoop.
Data Analyst
A Data Analyst collects, processes, and analyzes data to help businesses make informed decisions. This course provides an introduction to big data and the Hadoop ecosystem. It offers hands-on exercises that will help you get started with Hadoop and gain a foundation in big data analysis. This course may be useful for aspiring Data Analysts who want to learn more about big data and Hadoop.
Database Administrator
A Database Administrator manages and maintains databases. This course provides an introduction to big data and the Hadoop ecosystem. It offers hands-on exercises that will help you get started with Hadoop and gain a foundation in big data analysis. This course may be useful for aspiring Database Administrators who want to learn more about big data and Hadoop.
Data Engineer
A Data Engineer builds and maintains data pipelines. This course provides an introduction to big data and the Hadoop ecosystem. It offers hands-on exercises that will help you get started with Hadoop and gain a foundation in big data analysis. This course may be useful for aspiring Data Engineers who want to learn more about big data and Hadoop.
Data Manager
A Data Manager manages data assets. This course provides an introduction to big data and the Hadoop ecosystem. It offers hands-on exercises that will help you get started with Hadoop and gain a foundation in big data analysis. This course may be useful for aspiring Data Managers who want to learn more about big data and Hadoop.
Big Data Engineer
A Big Data Engineer designs, builds, and maintains big data systems. This course provides an introduction to big data and the Hadoop ecosystem. It offers hands-on exercises that will help you get started with Hadoop and gain a foundation in big data analysis. This course may be useful for aspiring Big Data Engineers who want to learn more about big data and Hadoop.
Data Administrator
A Data Administrator manages and maintains data. This course provides an introduction to big data and the Hadoop ecosystem. It offers hands-on exercises that will help you get started with Hadoop and gain a foundation in big data analysis. This course may be useful for aspiring Data Administrators who want to learn more about big data and Hadoop.
Data Visualization Engineer
A Data Visualization Engineer designs and builds data visualizations. This course provides an introduction to big data and the Hadoop ecosystem. It offers hands-on exercises that will help you get started with Hadoop and gain a foundation in big data analysis. This course may be useful for aspiring Data Visualization Engineers who want to learn more about big data and Hadoop.

Reading list

We've selected 12 books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Introduction to Big Data.
Provides a comprehensive overview of the big data landscape, including its history, key concepts, and potential applications. It is a valuable read for anyone who wants to understand the big data revolution and its implications for our lives and work.
Provides a comprehensive overview of big data analytics, from strategic planning to enterprise integration. It covers the key concepts, technologies, and use cases of big data analytics.
Provides a comprehensive overview of speech and language processing, covering the key concepts and techniques of speech recognition, natural language understanding, and natural language generation.
Provides a comprehensive overview of pattern recognition and machine learning, covering the key concepts and techniques of supervised learning, unsupervised learning, and reinforcement learning.
Provides a comprehensive overview of information theory, inference, and learning algorithms, covering the key concepts and techniques of information theory, Bayesian inference, and machine learning.
Provides a practical introduction to data science, with a focus on business applications. It covers the key concepts and techniques of data mining and data-analytic thinking.
Provides a practical introduction to data science, with a focus on using Python. It covers the key concepts and techniques of data science, using real-world examples.
Provides a comprehensive overview of computer vision, covering the key concepts and techniques of image processing, feature extraction, and object recognition.
Provides a practical introduction to natural language processing, with a focus on using the Natural Language Toolkit (NLTK). It covers the key concepts and techniques of natural language processing, using real-world examples.
Provides a practical introduction to machine learning, with a focus on using Scikit-Learn, Keras, and TensorFlow. It covers the key concepts and techniques of machine learning, using real-world examples.
Provides a practical introduction to big data analytics, with a focus on exploratory data analysis and data mining. It covers the key concepts and techniques of big data analytics, using real-world examples.
Is the definitive guide to Hadoop, the open-source framework for big data processing. It covers the architecture, programming models, and use cases of Hadoop.
