We may earn an affiliate commission when you visit our partners.
Lewis Mitchell, Simon Tuke, and David Suter

Gain essential skills in today’s digital age to store, process and analyse data to inform business decisions.

In this course, part of the Big Data MicroMasters program, you will develop your knowledge of big data analytics and enhance your programming and mathematical skills. You will learn to use essential analytic tools such as Apache Spark and R.

Topics covered in this course include:

  • cloud-based big data analysis;
  • predictive analytics, including probabilistic and statistical models;
  • application of large-scale data analysis;
  • analysis of problem space and data needs.

By the end of this course, you will be able to approach large-scale data science problems with creativity and initiative.

What's inside

Learning objectives

  • How to develop algorithms for the statistical analysis of big data;
  • Knowledge of big data applications;
  • How to apply the fundamental principles of predictive analytics;
  • How to evaluate and apply appropriate principles, techniques and theories to large-scale data science problems.

Syllabus

Section 1: Simple linear regression
  • Fit a simple linear regression between two variables in R;
  • Interpret output from R;
  • Use models to predict a response variable;
  • Validate the assumptions of the model.
Section 2: Modelling data
  • Adapt the simple linear regression model in R to deal with multiple variables;
  • Incorporate continuous and categorical variables in your models;
  • Select the best-fitting model by inspecting the R output.
Section 3: Many models
  • Manipulate nested data frames in R;
  • Use R to apply simultaneous linear models to large data frames by stratifying the data;
  • Interpret the output of learner models.
Section 4: Classification
  • Adapt linear models to handle a categorical response variable;
  • Implement logistic regression (LR) in R;
  • Implement generalised linear models (GLMs) in R;
  • Implement linear discriminant analysis (LDA) in R.
Section 5: Prediction using models
  • Implement the principles of building a model to do prediction using classification;
  • Split data into training and test sets, perform cross-validation and compute model evaluation metrics;
  • Use model selection for explaining data with models;
  • Analyse overfitting and the bias-variance trade-off in prediction problems.
Section 6: Getting bigger
  • Set up and apply sparklyr;
  • Use logical verbs in R by applying native sparklyr versions of the verbs.
Section 7: Supervised machine learning with sparklyr
  • Apply sparklyr to machine learning regression and classification models;
  • Use machine learning models for prediction;
  • Illustrate how distributed computing techniques can be used for “bigger” problems.
Section 8: Deep learning
  • Use massive amounts of data to train multi-layer networks for classification;
  • Understand some of the guiding principles behind training deep networks, including the use of autoencoders, dropout, regularization, and early termination;
  • Use sparklyr and H2O to train deep networks.
Section 9: Deep learning applications and scaling up
  • Understand some of the ways in which massive amounts of unlabelled data, and partially labelled data, are used to train neural network models;
  • Leverage existing trained networks for targeting new applications;
  • Implement architectures for object classification and object detection and assess their effectiveness.
Section 10: Bringing it all together
  • Consolidate your understanding of the relationships between the methodologies presented in this course, and of their relative strengths, weaknesses and range of applicability.
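
As a rough illustration of two syllabus items (fitting a simple linear regression, then evaluating it on a held-out test set), here is a minimal sketch. It is written in plain Python rather than the R used in the course, and it is not course material: the synthetic data, the function names and the 75/25 split are all assumptions made for this example.

```python
import random


def fit_simple_linear_regression(xs, ys):
    """Return (intercept, slope) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, x):
    """Predict the response for a single x from an (intercept, slope) pair."""
    intercept, slope = model
    return intercept + slope * x


def mean_squared_error(model, xs, ys):
    """Average squared prediction error over the given points."""
    return sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train_test_split(xs, ys, test_fraction=0.25, seed=0):
    """Shuffle the indices and hold out a fraction of the points for testing."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, int(len(xs) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train_x = [xs[i] for i in train_idx]
    train_y = [ys[i] for i in train_idx]
    test_x = [xs[i] for i in test_idx]
    test_y = [ys[i] for i in test_idx]
    return train_x, train_y, test_x, test_y


# Synthetic data (an assumption for illustration): y = 2 + 3x plus Gaussian noise.
rng = random.Random(42)
xs = [float(i) for i in range(40)]
ys = [2.0 + 3.0 * x + rng.gauss(0.0, 1.0) for x in xs]

train_x, train_y, test_x, test_y = train_test_split(xs, ys)
model = fit_simple_linear_regression(train_x, train_y)
test_mse = mean_squared_error(model, test_x, test_y)

print(f"intercept={model[0]:.3f} slope={model[1]:.3f} test MSE={test_mse:.3f}")
```

The model is fitted only on the training points, so the test error estimates how well it predicts data it has not seen. That separation is what makes overfitting visible.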

Good to know

Know what's good, what to watch for, and possible dealbreakers
Designed for learners with some background in data science or programming
Designed for learners who want to enhance their programming and mathematical skills
Taught by Lewis Mitchell, Simon Tuke, and David Suter, who are recognized for their work in the field
Develops skills in cloud-based big data analysis, predictive analytics, and large-scale data analysis

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Big Data Analytics with these activities:
Review the book 'Applied Statistics and Probability for Engineers'
This book provides a comprehensive overview of the fundamental concepts of statistics and probability, with a focus on applications in engineering
  • Read the book carefully
  • Take notes on the key concepts
  • Work through the practice problems
Follow the tutorials on the 'StatQuest' YouTube channel
These tutorials provide clear and concise explanations of statistical concepts, with a focus on practical applications
  • Find a tutorial on the topic you are interested in
  • Watch the tutorial carefully
  • Take notes on the key concepts
Review the fundamentals of statistics
In order to understand the concepts covered in this course you should review the basic principles of statistics
  • Review the basics of probability distributions
  • Review the basics of statistical inference
  • Review the basics of regression analysis
Create a cheat sheet of statistical formulas
Creating a cheat sheet will help you organize your knowledge and make it easier to remember the formulas you need
  • List all of the statistical formulas you need to know
  • Organize the formulas into a logical order
  • Create a visually appealing cheat sheet
Participate in a peer study group
Studying with peers can help you to learn the material more effectively and identify areas where you need additional support
  • Find a group of students who are also taking the course
  • Meet with the group regularly to discuss the material
  • Work together on practice problems
Practice solving statistical problems
Solving statistical problems will help you develop your problem-solving skills and improve your understanding of the material
  • Find a collection of statistical problems
  • Attempt to solve the problems on your own
  • Check your solutions against the provided answer key
Mentor other students who are struggling with statistics
Helping others to learn statistics will reinforce your own understanding of the material
  • Find a student who is struggling with statistics
  • Offer to help them with their studies
  • Meet with the student regularly to provide guidance and support
Start a project that uses statistical analysis to solve a real-world problem
Working on a project will give you the opportunity to apply your statistical knowledge to a practical problem and develop your problem-solving skills
  • Identify a real-world problem that can be solved using statistical analysis
  • Collect data on the problem
  • Analyze the data using statistical methods
  • Develop a solution to the problem based on your analysis
Participate in a statistics competition
Participating in a competition can help you to test your knowledge and skills, and to learn from others
  • Find a statistics competition that you are interested in
  • Register for the competition
  • Prepare for the competition by studying and practicing
  • Compete in the competition

Career center

Learners who complete Big Data Analytics will develop knowledge and skills that may be useful to these careers:
Data Analyst
A Data Analyst designs and builds systems and tools that retrieve, scrub, interpret, and visualize large amounts of raw data from a variety of sources. Part of this job is to develop new data mining techniques that will improve the quality and efficiency of data analysis. Data Analysts require knowledge of analytics, modeling, and databases, which are all skills that you will gain from taking the Big Data Analytics course.
Data Scientist
Data Scientists refine existing machine learning and statistical models, using data provided by business analysts, to identify risks and make predictions. They often wear many hats and are expected to be skilled in machine learning, data wrangling, communication, and programming. These are skills that you can refine by taking the Big Data Analytics course.
Deep Learning Scientist
Deep Learning Scientists design and implement deep learning algorithms and models to solve complex problems in areas like image recognition and natural language processing. They use large data sets to train neural networks and other deep learning models. This course will introduce you to deep learning, enabling you to pursue a career as a Deep Learning Scientist.
Machine Learning Engineer
Machine Learning Engineers tune and implement algorithms that learn from big data, then develop and maintain the systems that power machine learning applications. They help companies stay competitive by improving the effectiveness of products and services through the use of data. This course covers supervised machine learning with sparklyr, including regression and classification models, which may be useful in this role.
Business Intelligence Analyst
Business Intelligence Analysts analyze data and trends to help businesses make informed decisions about strategy, operations, and marketing. They interpret complex data and communicate their insights to stakeholders in a clear and concise way. This course will give you the skills you need to become a successful Business Intelligence Analyst.
Data Architect
Data Architects design, build, and maintain data architectures that support the needs of an organization. They work with business stakeholders to understand their data requirements and then design and implement data solutions that meet those requirements. This course will help you develop a solid understanding of the principles of data architecture.
Database Administrator
Database Administrators maintain and optimize databases to ensure that they are running efficiently and securely. They also work with database users to provide support and training. This course will give you the skills you need to become a successful Database Administrator.
Data Engineer
Data Engineers build and maintain the systems that store and process data. They work with data scientists and other stakeholders to understand the data needs of the organization and then design and implement data solutions that meet those needs. This course will give you the skills you need to become a successful Data Engineer.
Software Engineer
Software Engineers design, develop, and maintain software applications. They work with stakeholders to understand their needs and then design and implement software solutions that meet those needs. This course will help you develop a solid foundation in software engineering, which will be helpful if you want to pursue a career as a Software Engineer.
Statistician
Statisticians collect, analyze, and interpret data. They use their knowledge of statistics to draw conclusions about the world around them. This course will help you develop a solid foundation in statistics, which will be helpful if you want to pursue a career as a Statistician.
Operations Research Analyst
Operations Research Analysts use mathematical and analytical techniques to solve problems in business and industry. They work with stakeholders to understand their needs and then develop and implement solutions that improve efficiency and productivity. This course will help you develop a solid foundation in operations research, which will be helpful if you want to pursue a career as an Operations Research Analyst.
Financial Analyst
Financial Analysts use data to make investment decisions. They analyze financial data to identify trends and make recommendations about which investments to buy or sell. This course will help you develop a solid foundation in financial analysis, which will be helpful if you want to pursue a career as a Financial Analyst.
Market Research Analyst
Market Research Analysts conduct research to understand consumer behavior and trends. They use their findings to help businesses make informed decisions about product development and marketing. This course will help you develop a solid foundation in market research, which will be helpful if you want to pursue a career as a Market Research Analyst.
Actuary
Actuaries use mathematics and statistics to assess risk and uncertainty. They work with insurance companies and other organizations to develop and implement strategies to manage risk. This course will help you develop a solid foundation in actuarial science, which will be helpful if you want to pursue a career as an Actuary.
Economist
Economists study how people make decisions in the face of scarcity. They use economic models to analyze data and make predictions about the economy. This course will help you develop a solid foundation in economics, which will be helpful if you want to pursue a career as an Economist.

Reading list

We've selected 11 books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Big Data Analytics.
Provides a comprehensive overview of big data analytics, from strategic planning to enterprise integration. It covers both the theoretical foundations and practical applications of big data analytics, and is a valuable resource for both students and practitioners.
A classic textbook on data mining and a valuable resource for students and practitioners who want to learn about its foundations. It covers a wide range of topics, including data preprocessing, clustering, classification, and association rule mining.
A classic textbook on reinforcement learning and a valuable resource for students and practitioners who want to learn about its foundations.
Provides a practical introduction to predictive analytics. It is a valuable resource for students and practitioners who want to learn how to use predictive analytics to solve business problems.
Provides a practical introduction to machine learning with Scikit-Learn, Keras, and TensorFlow. It is a valuable resource for students and practitioners who want to learn how to use these libraries to build and train machine learning models.
Provides a practical introduction to data science for business. It is a valuable resource for students and practitioners who want to learn how to use data science to solve business problems.
Provides a practical introduction to TensorFlow. It is a valuable resource for students and practitioners who want to learn how to use TensorFlow to build and train deep learning models.
A gentle introduction to data analytics and a good starting point for students and practitioners who are new to the field. It covers the basic concepts of data analytics, including data collection, cleaning, and analysis.
A gentle introduction to big data analytics and a good starting point for students and practitioners who are new to the field.

Similar courses

Here are nine courses similar to Big Data Analytics.
  • Big Data Analytics
  • Apache Spark 2.0 with Java -Learn Spark from a Big Data...
  • Developing Spark Applications Using Scala & Cloudera
  • Distributed Computing with Spark SQL
  • Scalable Machine Learning on Big Data using Apache Spark
  • Introduction to Big Data with Spark and Hadoop
  • Big Data, Hadoop, and Spark Basics
  • Apache Spark Fundamentals
  • Apache Spark 3 Fundamentals

© 2016 - 2024 OpenCourser