About this Professional Certificate
Data engineers and Big Data professionals are in overwhelming demand. NoSQL and Big Data technology skills such as Apache Spark are a must-have for modern data-driven decision-making. This three-course Professional Certificate from IBM opens the door to data engineering and big data careers.
The first course introduces you to NoSQL fundamentals, including the four key non-relational database categories. By the end of the course, you will have hands-on skills working with MongoDB, Cassandra, and IBM Cloudant NoSQL databases.
A crucial aspect of data engineering is acquiring and managing Big Data while keeping Big Data analytics scalable and performant. When you enroll in Big Data, Hadoop, and Spark Basics, you'll discover the characteristics, features, benefits, limitations, and applications of some of the more popular Big Data processing tools. You'll explore the open-source ecosystem of Apache tools, including Apache Hadoop, Apache Hive, and Apache Spark, as well as Spark on Kubernetes. Discover how to leverage Spark to deliver reliable insights. You'll gain hands-on data analysis skills using PySpark and Spark SQL and create a streaming analytics application using Spark Streaming, and more.
Then enroll in Apache Spark for Data Engineering and Machine Learning to discover how data and machine learning engineers use Spark Structured Streaming, GraphFrames, regression, classification, and clustering. Learn how to apply the k-means clustering algorithm using Spark MLlib. Extract, transform, and load (ETL) is at the heart of data and machine learning engineering, and you'll gain skills using Spark to perform ETL tasks. This course culminates with a hands-on Spark project.
This Professional Certificate does not require any prior programming or data science skills; however, prior basic data literacy and SQL skills will prove valuable in completing this program.
What you'll learn
- Differentiate between the four main categories of NoSQL repositories and work hands-on with MongoDB, Cassandra and IBM Cloudant.
- Apply your knowledge of the characteristics, features, benefits, limitations, and applications of the more popular Big Data processing tools, including Hadoop, HDFS, Hive and HBase.
- Describe parallel programming using Resilient Distributed Datasets (RDDs), DataFrames and Spark SQL. Understand how Catalyst and Tungsten benefit Spark programmers and see how ETL works using DataFrames.
- Acquire real-world data engineering and machine learning skills using Spark Structured Streaming, DataFrames, GraphFrames, Spark ML, Regression, Classification, and clustering, including the k-means algorithm and ETL using Spark.
- Gain hands-on experience using Spark SQL and Apache Spark on IBM Cloud.
- Learn about scaling out using the IBM Spark Environment in Watson Studio, running Spark on Kubernetes, setting Spark configurations, and performing monitoring and performance tuning.
From | IBM via edX |
---|---|
Hours | 42 |
Instructors | Rav Ahuja, Ramesh Sannareddy, Steve Ryan, Karthik Muthuraman, Aije Egwaikhide, Romeo Kienzler |
Language | English |
Subjects | Programming |
Careers
An overview of related careers and their average salaries in the US. Bars indicate income percentile (33rd - 99th).
Volunteer Big Data Engineer $48k
Data Scientist - Big Data $68k
Big Data and AWS Data Lake $73k
Big Data Developer (Streaming Data) $77k
Big data developer with AWS $78k
Research Scientist Big Data $94k
Big Data Developer Consultant $98k
Big Data Engineer 6 $107k
Big data and ETL specialist $121k
Big Data Specialist $149k
Principal Big Data Architect $180k
Senior Big Data Sales $181k
Courses in this Professional Certificate
Listed in the order in which they should be taken
Starts | Course Information | Provider |
---|---|---|
On Demand | This course will provide you with technical hands-on knowledge of NoSQL databases and Database-as-a-Service (DaaS) offerings. With the advent of Big Data and agile development... | IBM via edX |
On Demand | Big Data, Hadoop, and Spark Basics: Organizations need skilled, forward-thinking Big Data practitioners who can apply their business and technical skills to unstructured data such as tweets, posts, pictures, audio... | IBM via edX |
On Demand | Apache Spark for Data Engineering and Machine Learning: Apache® Spark™ is a fast, flexible, and developer-friendly open-source platform for large-scale SQL, batch processing, stream processing, and machine learning. Users can take... | IBM via edX |