Big Data Essentials

HDFS, MapReduce and Spark RDD

This course is a part of Big Data for Data Engineers, a 5-course Specialization series from Coursera.

Have you ever heard about technologies such as HDFS, MapReduce and Spark? Have you always wanted to learn these tools but lacked concise starting material? Then don’t miss this course!

In this 6-week course you will:

  • learn some basic technologies of the modern Big Data landscape, namely HDFS, MapReduce and Spark;
  • be guided through both the systems' internals and their applications;
  • learn about distributed file systems, why they exist and what function they serve;
  • grasp the MapReduce framework, a workhorse for many modern Big Data applications;
  • apply the framework to process texts and solve sample business cases;
  • learn about Spark, the next-generation computational framework;
  • build a strong understanding of basic Spark concepts;
  • develop skills to apply these tools to creating solutions in finance, social networks, telecommunications and many other fields.

Your learning experience will be as close to real life as possible, with the chance to evaluate your practical assignments on a real cluster. No mocking: just a friendly, considerate atmosphere that makes your learning smooth and enjoyable. Get ready to work with real datasets alongside real masters!

Special thanks to:

  • Prof. Mikhail Roytberg, APT dept., MIPT, who was the initial reviewer of the project and the supervisor and mentor of half of the BigData team. He was the one who helped to get this show on the road.
  • Oleg Sukhoroslov (PhD, Senior Researcher at IITP RAS), who has been teaching MapReduce, Hadoop and friends since 2008. He now leads the infrastructure team.
  • Oleg Ivchenko (PhD student at APT dept., MIPT), Pavel Akhtyamov (MSc. student at APT dept., MIPT) and Vladimir Kuznetsov (Assistant at P.G. Demidov Yaroslavl State University), superbrains who developed and now maintain the infrastructure used for practical assignments in this course.
  • Asya Roitberg, Eugene Baulin and Marina Sudarikova. These people never sleep, babysitting this course day and night to make your learning experience productive, smooth and exciting.

OpenCourser is an affiliate partner of Coursera.

Coursera

&

Yandex

Rating 3.3 based on 91 ratings
Length 7 weeks
Effort 6 weeks of study, 6-8 hours/week
Starts Aug 12 (last week)
Cost $49
From Yandex via Coursera
Instructors Ivan Puzyrevskiy, Alexey A. Dral, Emeli Dral, Evgeniy Ryabenko, Evgeniy Riabenko
Download Videos On all desktop and mobile devices
Language English
Subjects Programming Data Science
Tags Computer Science Data Science Data Analysis Software Development

What people are saying

According to other learners, here's what you need to know

grading system in 6 reviews

This week was pretty good and insightful around Map Reduce

good course but grading system has some trouble

This course very nice and cool, sometimes I want just stop it =)

The course fills an important gap between software engineering and data engineering.

Also, the assignments have a 'bottleneck' at the grading system where you know the answer is correct yet the grader won't accept it because your route to the answer is different than standard.

The only thing that could be better is the grading system

I am very glad that I completed this course, everything is extremely affordable.

), not so good topics (for introductory course), paranoid grading system.

The subject is very interesting but the grading system is very problematic and difficult.

lot of time in 5 reviews

I still do not understand map-side and reduce-side joins, and I do not feel comfortable writing a MapReduce job without a lot of time. The lectures over Hadoop were ok, but strange.

Assignments are not difficult but it takes a lot of time and attempts to figure out what exactly the authors wanted.

Task is easy, but takes a lot of time for debugging on HDFS and understanding what's wrong with the submission.

This will save a lot of time!

for beginners in 4 reviews

This is definitely not a course for beginners.

Very great course for beginner in mapreduce...In detail and working map reduce knowledge

Too quick to follow

EXCELLENT

This course is for beginners which have a couple of years of BigData experience

Great Content if you are a beginner.

Too advanced for beginners.

more time in 4 reviews

The assignments are described minimalistically, passing the automatic checking of the assignments cost more time than actually getting the right answer for the assignment and often the external assignment environment is down or not functioning correctly.

So far, I have spent more time dealing with these troubleshooting issues than actually focusing on the content.

figure out in 3 reviews

I do feel like I spent much more time trying to figure out how to make my answers pass the autograder rather than learning how to structure my code to solve big data problems.

Submissions had lot of issues. I could not figure out and left the course in the middle (even the demo assignment was not working).

The instructors were great but somehow I thought they were not very involved.

Too much information (stated fast) out of which you may not be caring a lot.

very interesting in 3 reviews

The content is very interesting.

Other than that, the material of the course is very interesting.

Careers

An overview of related careers and their average salaries in the US. Bars indicate income percentile.

Dept Chair & Teacher $40k

Instructor, Dept. of English $54k

Senior Teacher/Dept. Chair $66k

Marketing Dept. $70k

Emergency Dept Liaison $70k

Instructor, Music Dept. $71k

Colorado Internet Media Sales for Apt and Rentals $72k

Extrusion Dept Leader $74k

Assistant SAFETY & TRAINING DEPT $75k

Dept. of Anesthesia $77k

Quality Dept. $87k

Senior APT Analyst $142k
