Frank Neumann, Wanru (Kelly) Gao, Aneta Neumann, and Vahid Roostapour

Organizations now have access to massive amounts of data, and it is influencing the way they operate. They are realizing that, to be successful, they must leverage their data to make effective business decisions.

In this course, part of the Big Data MicroMasters program, you will learn how big data is driving organisational change and the key challenges organizations face when trying to analyse massive data sets.

You will learn fundamental techniques, such as data mining and stream processing. You will also learn how to design and implement PageRank algorithms using MapReduce, a programming paradigm that allows for massive scalability across hundreds or thousands of servers in a Hadoop cluster. You will learn how big data has improved web search and how online advertising systems work.

By the end of this course, you will have a better understanding of the various applications of big data methods in industry and research.

What's inside

Learning objectives

  • Knowledge and application of MapReduce
  • Understanding the rate of occurrences of events in big data
  • How to design algorithms for stream processing and counting of frequent elements in big data
  • Understand and design PageRank algorithms
  • Understand underlying random walk algorithms
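
As a small illustration of the frequent-elements objective above, here is a minimal sketch of the Misra-Gries summary, one well-known way to find candidate frequent items in a single pass with bounded memory. The course may teach a different method; the function name `misra_gries` and the parameter `k` are illustrative choices, not taken from the course materials.

```python
from collections.abc import Hashable, Iterable


def misra_gries(stream: Iterable[Hashable], k: int) -> dict[Hashable, int]:
    """Return candidate frequent items using at most k - 1 counters.

    Any item that occurs more than n / k times in a stream of length n
    is guaranteed to appear in the result. The stored counts are lower
    bounds on the true counts, not exact frequencies.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    counters: dict[Hashable, int] = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement every counter and drop zeros.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


if __name__ == "__main__":
    events = ["a", "b", "a", "c", "a", "d", "a", "e", "a"]
    print(misra_gries(events, k=2))
```

Because the result can also contain items that are not actually frequent, a second pass over the data is usually needed to confirm the exact counts of the candidates.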

Syllabus

Section 1: The basics of working with big data
  • Understand the four V’s of Big Data (Volume, Velocity, Variety, and Veracity)
  • Build models for data
  • Understand the occurrence of rare events in random data
Section 2: Web and social networks
  • Understand characteristics of the web and social networks
  • Model social networks
  • Apply algorithms for community detection in networks
Section 3: Clustering big data
  • Cluster social networks
  • Apply hierarchical clustering
  • Apply k-means clustering
Section 4: Google web search
  • Understand the concept of PageRank
  • Implement the basic PageRank algorithm for strongly connected graphs
  • Implement PageRank with taxation for graphs that are not strongly connected
Section 5: Parallel and distributed computing using MapReduce
  • Understand the architecture for massive distributed and parallel computing
  • Apply MapReduce using Hadoop
  • Compute PageRank using MapReduce
Section 6: Computing similar documents in big data
  • Measure importance of words in a collection of documents
  • Measure similarity of sets and documents
  • Apply locality-sensitive hashing to compute similar documents
Section 7: Products frequently bought together in stores
  • Understand the importance of frequent item sets
  • Design association rules
  • Implement the A-priori algorithm
Section 8: Movie and music recommendations
  • Understand the differences between recommendation systems
  • Design content-based recommendation systems
  • Design collaborative filtering recommendation systems
Section 9: Google's AdWords™ System
  • Understand the AdWords System
  • Analyse online algorithms in terms of competitive ratio
  • Use online matching to solve the AdWords problem
Section 10: Mining rapidly arriving data streams
  • Understand types of queries for data streams
  • Analyse sampling methods for data streams
  • Count distinct elements in data streams
  • Filter data streams
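
To make the Section 6 goal of measuring the similarity of sets and documents more concrete, the sketch below compares two texts by the Jaccard similarity of their character shingles. This is only one common formulation and is not taken from the course materials; the helper names `shingles` and `jaccard` and the default shingle length of 3 are assumptions made for illustration.

```python
def shingles(text: str, k: int = 3) -> set[str]:
    """Return the set of k-character shingles of the normalized text."""
    # Lowercase and collapse runs of whitespace so formatting differences
    # do not affect the comparison.
    normalized = " ".join(text.lower().split())
    return {normalized[i:i + k] for i in range(len(normalized) - k + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Return |a ∩ b| / |a ∪ b|, treating two empty sets as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    doc1 = "Big data is driving organisational change"
    doc2 = "Big data drives organisational change"
    print(jaccard(shingles(doc1), shingles(doc2)))
```

Comparing every pair of documents this way does not scale to large collections, which is the problem that hashing-based techniques such as the one named in Section 6 are meant to address.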

Good to know

Know what's good, what to watch for, and possible dealbreakers
Develops MapReduce skills, which are required for Big Data Engineering roles
Explores big data and its applications in real-world scenarios
Taught by experienced instructors in the field of Big Data
Covers fundamental concepts of data analysis, including data mining and stream processing
Provides hands-on exercises to reinforce learning
Requires students to have prior programming experience

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Big Data Fundamentals with these activities:
Connect with industry professionals
Expand your network and gain insights from experienced professionals in the field of big data.
  • Identify potential mentors through online platforms like LinkedIn or industry events.
  • Reach out to potential mentors and introduce yourself.
  • Schedule meetings or calls to connect with your mentors and seek their guidance.
Review linear algebra
Refresh your knowledge of linear algebra, which is essential for understanding data mining and machine learning algorithms.
  • Review the concepts of vectors, matrices, and linear transformations.
  • Practice solving systems of linear equations.
  • Familiarize yourself with the basics of matrix operations.
Explore Apache Spark
Expand your knowledge of big data processing by exploring Apache Spark, a popular open-source framework.
  • Find online tutorials or courses on Apache Spark.
  • Follow the tutorials and practice using Spark to process and analyze large datasets.
  • Explore Spark's features and capabilities, such as its distributed computing engine and machine learning library.
Collaborate on data analysis projects
Work with peers to analyze real-world datasets and gain practical experience in solving big data problems.
  • Form a team with other classmates and choose a dataset to analyze.
  • Plan and design your data analysis project, including defining research questions and methodology.
  • Execute your data analysis plan and present your findings to the class.
Implement a MapReduce algorithm
Gain hands-on experience implementing MapReduce algorithms to process large datasets.
  • Choose a dataset and design a MapReduce algorithm to process it.
  • Implement the algorithm using a programming language like Python or Java.
  • Test and debug your implementation to ensure it runs correctly.
Contribute to open-source big data projects
Gain practical experience and contribute to the big data community by participating in open-source projects.
  • Identify open-source big data projects that align with your interests.
  • Review the project's documentation and familiarize yourself with its codebase.
  • Identify areas where you can contribute and submit pull requests with your proposed changes.
Design a web search engine
Apply your understanding of PageRank and web graph analysis to design and implement a web search engine.
  • Gather a corpus of web pages and build an index.
  • Design and implement a PageRank algorithm to rank the pages in the index.
  • Develop a user interface for the search engine and integrate it with the PageRank algorithm.
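
For the ranking step, a possible starting point is the iterative form of PageRank with taxation, sketched below for a small in-memory link graph. This is a simplified illustration under stated assumptions, not the course's reference implementation: the function name `pagerank`, the `links` dictionary format, the damping value `beta = 0.85`, and the choice to spread the rank of dead-end pages evenly over all pages are all assumptions made for this example.

```python
def pagerank(links: dict[str, list[str]], beta: float = 0.85,
             iterations: int = 50) -> dict[str, float]:
    """Approximate PageRank with taxation by repeated rank propagation.

    links maps each page to the pages it links to. With probability beta
    a random surfer follows a link; otherwise the surfer jumps to a page
    chosen uniformly at random (the "tax").
    """
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every page first receives its equal share of the taxed rank.
        new_rank = {node: (1.0 - beta) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = beta * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dead end (assumed policy): spread its rank over all pages.
                share = beta * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    graph = {"a": ["b"], "b": ["a"], "c": ["a"]}
    for page, score in sorted(pagerank(graph).items()):
        print(page, round(score, 4))
```

A fixed number of iterations keeps the sketch short; a fuller implementation would typically stop once the ranks change by less than a chosen tolerance.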

Career center

Learners who complete Big Data Fundamentals will develop knowledge and skills that may be useful to these careers:
Data Scientist
Data Scientists use their expertise in mathematics and computer science to analyze large amounts of data. They work in a variety of industries, including finance, healthcare, and retail. The course Big Data Fundamentals would be a useful foundation for a career as a Data Scientist. It provides an overview of the basic concepts of big data, including data mining and stream processing. The course also covers specific applications of big data methods in industry and research.
Data Architect
Data Architects design and build data systems that meet the needs of businesses. They work with a variety of stakeholders, including business users, IT professionals, and database administrators. The course Big Data Fundamentals would be a helpful introduction to the field of Data Architecture. It provides an overview of the different types of big data systems and the challenges involved in designing and building them.
Data Engineer
Data Engineers build and maintain the infrastructure that supports big data systems. They work with a variety of hardware and software technologies, including Hadoop, Spark, and NoSQL databases. The course Big Data Fundamentals would be a helpful introduction to the field of Data Engineering. It provides an overview of the different types of big data systems and the challenges involved in building and maintaining them.
Database Administrator
Database Administrators maintain and manage databases. They work with a variety of database technologies, including SQL and NoSQL databases. The course Big Data Fundamentals would be a helpful introduction to the field of Database Administration. It provides an overview of the different types of databases and the challenges involved in maintaining and managing them.
Software Engineer
Software Engineers design, develop, and maintain software applications. They work in a variety of industries, including finance, healthcare, and retail. The course Big Data Fundamentals would be a helpful introduction to the field of Software Engineering. It provides an overview of the different types of software applications and the challenges involved in designing, developing, and maintaining them.
Statistician
Statisticians use data to make inferences about the world. They work with a variety of data sources, including surveys, experiments, and observational studies. The course Big Data Fundamentals would be a helpful introduction to the field of Statistics. It provides an overview of the different types of statistical techniques and the challenges involved in using data to make inferences about the world.
Data Analyst
Data Analysts use data to identify trends and patterns. They work with a variety of data sources, including structured and unstructured data. The course Big Data Fundamentals would be a helpful introduction to the field of Data Analysis. It provides an overview of the different types of data analysis techniques and the challenges involved in using data to identify trends and patterns.
Operations Research Analyst
Operations Research Analysts use data to improve the efficiency of operations. They work with a variety of industries, including manufacturing, healthcare, and transportation. The course Big Data Fundamentals would be a helpful introduction to the field of Operations Research. It provides an overview of the different types of operations research techniques and the challenges involved in using data to improve the efficiency of operations.
Business Analyst
Business Analysts use data to help businesses make better decisions. They work with a variety of stakeholders, including business users, IT professionals, and data scientists. The course Big Data Fundamentals would be a helpful introduction to the field of Business Analysis. It provides an overview of the different types of data analysis techniques and the challenges involved in using data to make better decisions.
Market Research Analyst
Market Research Analysts use data to understand the needs of customers. They work with a variety of data sources, including surveys, focus groups, and sales data. The course Big Data Fundamentals would be a helpful introduction to the field of Market Research. It provides an overview of the different types of market research techniques and the challenges involved in using data to understand the needs of customers.
User Experience Researcher
User Experience Researchers use data to improve the user experience of products and services. They work with a variety of data sources, including user feedback, surveys, and analytics data. The course Big Data Fundamentals would be a helpful introduction to the field of User Experience Research. It provides an overview of the different types of user experience research techniques and the challenges involved in using data to improve the user experience of products and services.
Financial Analyst
Financial Analysts use data to make investment decisions. They work with a variety of financial data, including stock prices, bond yields, and economic indicators. The course Big Data Fundamentals would be a helpful introduction to the field of Financial Analysis. It provides an overview of the different types of financial data and the challenges involved in using data to make investment decisions.
Web Developer
Web Developers design and develop websites. They work with a variety of programming languages and technologies, including HTML, CSS, and JavaScript. The course Big Data Fundamentals would be a helpful introduction to the field of Web Development. It provides an overview of the different types of web development technologies and the challenges involved in designing and developing websites.
Data Management Analyst
Data Management Analysts manage data for organizations. They work with a variety of data sources, including structured and unstructured data. The course Big Data Fundamentals would be a helpful introduction to the field of Data Management. It provides an overview of the different types of data management techniques and the challenges involved in managing data for organizations.
Mobile Developer
Mobile Developers design and develop mobile applications. They work with a variety of programming languages and technologies, including Java, Swift, and Kotlin. The course Big Data Fundamentals would be a helpful introduction to the field of Mobile Development. It provides an overview of the different types of mobile development technologies and the challenges involved in designing and developing mobile applications.

Reading list

We've selected 16 books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Big Data Fundamentals.
A comprehensive textbook on data mining techniques for massive datasets. Provides a rigorous and in-depth treatment of various algorithms and techniques used in big data analysis.
A classic textbook on data mining techniques, providing a comprehensive overview of the field. Offers a solid foundation for understanding the concepts and algorithms used in big data analysis.
A thought-provoking exploration of the impact of big data on organizations and society. Provides a broad perspective on the challenges and opportunities of big data, and how to leverage it effectively.
Covers a wide range of data mining techniques, including clustering, classification, and association rule mining. It is useful as a reference for implementing algorithms for processing big data.
A comprehensive handbook on recommender systems, covering various techniques and applications. Provides a comprehensive overview of the field and includes case studies and examples from industry.
Focuses on the application of machine learning algorithms to big data, including topics such as natural language processing and predictive analytics. It is useful for understanding how to use machine learning to analyze and extract insights from big data.
Provides a comprehensive guide to Hadoop, a popular framework for processing big data. It is a valuable resource for understanding the architecture and implementation of Hadoop.
Focuses on the MapReduce programming model, which is commonly used for processing big data. It provides a practical guide to designing and implementing MapReduce applications.
Provides a business-oriented perspective on data science, covering topics such as data visualization, data mining, and predictive analytics. It is useful for understanding the role of big data in business decision-making.
Provides a hands-on approach to big data analytics, covering topics such as data cleaning, data analysis, and data visualization. It is useful for gaining practical experience with big data analysis tools and techniques.
Provides case studies of how companies have successfully used big data to achieve business success. It is useful for understanding the practical applications of big data in various industries.
Provides a comprehensive overview of big data analytics, covering topics such as data storage, data processing, and data analysis. It is useful for understanding the challenges and solutions involved in managing big data.
Provides a non-technical overview of big data, covering its impact on society, business, and government. It is useful for understanding the broader implications of big data.
Provides a practical guide to data analytics, covering topics such as data visualization, data analysis, and data storytelling. It is useful for understanding the basics of data analytics and how to communicate insights effectively.
Provides a beginner-friendly introduction to big data analytics, covering topics such as data storage, data processing, and data analysis. It is useful for those who are new to big data and want to learn the basics.
