NoSQL, Big Data and Spark Fundamentals

Data engineers and Big Data professionals are in overwhelming demand. NoSQL and Big Data technology skills, such as Apache Spark, are a must-have for modern data-driven decision-making. This three-course Professional Certificate from IBM opens the door to data engineering and Big Data careers.

The first course, NoSQL Database Basics, introduces you to NoSQL fundamentals, including the four key non-relational database categories. By the end of the course, you will have hands-on skills working with MongoDB, Cassandra, and IBM Cloudant NoSQL databases.

A crucial aspect of data engineering is the acquisition and management of Big Data, along with the scalability and performance of Big Data analytics. When you enroll in Big Data, Hadoop, and Spark Basics, you'll discover the characteristics, features, benefits, limitations, and applications of some of the more popular Big Data processing tools. You'll explore the open-source ecosystem of Apache tools, including Apache Hadoop, Apache Hive, and Apache Spark, as well as Spark on Kubernetes, and discover how to leverage Spark to deliver reliable insights. You'll gain hands-on data analysis skills using PySpark and Spark SQL, create a streaming analytics application using Spark Streaming, and more.

Then enroll in Apache Spark for Data Engineering and Machine Learning to discover how data and machine learning engineers use Spark Structured Streaming, GraphFrames, regression, classification, and clustering. Learn how to apply the k-means clustering algorithm using Spark MLlib. Extract, transform, and load (ETL) is at the heart of data and machine learning engineering, and you'll gain skills using Spark to perform ETL tasks. This course culminates with a hands-on Spark project.

This Professional Certificate does not require any prior programming or data science skills; however, prior basic data literacy and SQL skills will prove valuable in completing this program.

What you'll learn

  • Differentiate between the four main categories of NoSQL repositories and work hands-on with MongoDB, Cassandra and IBM Cloudant.
  • Apply your knowledge of the characteristics, features, benefits, limitations, and applications of the more popular Big Data processing tools, including Hadoop, HDFS, Hive and HBase.
  • Describe parallel programming using Resilient Distributed Datasets (RDDs), DataFrames and SparkSQL. Understand how Catalyst and Tungsten benefit Spark programmers and see how ETL works using DataFrames.
  • Acquire real-world data engineering and machine learning skills using Spark Structured Streaming, DataFrames, GraphFrames, Spark ML, Regression, Classification, and clustering, including the k-means algorithm and ETL using Spark.
  • Gain hands-on experience using SparkSQL and Apache Spark on IBM Cloud.
  • Learn about scaling out using the IBM Spark Environment in Watson Studio, running Spark on Kubernetes, setting Spark configurations, and performing monitoring and performance tuning.

OpenCourser is an affiliate partner of edX and may earn a commission when you buy through our links.

From IBM via edX
Hours 42
Instructors Rav Ahuja, Ramesh Sannareddy, Steve Ryan, Karthik Muthuraman, Aije Egwaikhide, Romeo Kienzler
Language English
Subjects Programming

Careers

An overview of related careers and their average salaries in the US. Bars indicate income percentile (33rd - 99th).

Volunteer Big Data Engineer $48k

Data Scientist - Big Data $68k

Big Data and AWS Data Lake $73k

Big Data Developer (Streaming Data) $77k

Big data developer with AWS $78k

Research Scientist Big Data $94k

Big Data Developer Consultant $98k

Big Data Engineer 6 $107k

Big data and ETL specialist $121k

Big Data Specialist $149k

Principal Big Data Architect $180k

Senior Big Data Sales $181k

Courses in this Professional Certificate

Listed in the order in which they should be taken

Starts Course Information

On Demand

NoSQL Database Basics

This course will provide you with technical hands-on knowledge of NoSQL databases and Database-as-a-Service (DaaS) offerings. With the advent of Big Data and agile development...

edX | IBM

On Demand

Big Data, Hadoop, and Spark Basics

Organizations need skilled, forward-thinking Big Data practitioners who can apply their business and technical skills to unstructured data such as tweets, posts, pictures, audio...

edX | IBM

On Demand

Apache Spark for Data Engineering and Machine Learning

Apache® Spark™ is a fast, flexible, and developer-friendly open-source platform for large-scale SQL, batch processing, stream processing, and machine learning. Users can take...

edX | IBM
