We may earn an affiliate commission when you visit our partners.
Natalia Pritykovskaya, Pavel Klemenkov, Pavel Mezentsev, and Alexey A. Dral
No doubt working with huge data volumes is hard, but to move a mountain, you have to deal with a lot of small stones. But why strain yourself? Using MapReduce and Spark you tackle the issue partially, thus leaving some space for high-level tools. Stop struggling to make your big data workflow productive and efficient; make use of the tools we are offering you.

This course will teach you how to:
- Warehouse your data efficiently using Hive, Spark SQL and Spark DataFrames.
- Work with large graphs, such as social graphs or networks.
- Optimize your Spark applications for maximum performance.

Specifically, you will deepen your knowledge of:
- Writing and executing Hive & Spark SQL queries;
- Reasoning about how the queries are translated into actual execution primitives (be it MapReduce jobs or Spark transformations);
- Organizing your data in Hive to optimize disk space usage and execution times;
- Constructing Spark DataFrames and using them to write ad-hoc analytical jobs easily;
- Processing large graphs with Spark GraphFrames;
- Debugging, profiling and optimizing Spark application performance.

Still in doubt? Check this out. Become a data ninja by taking this course!

Special thanks to:
- Prof. Mikhail Roytberg, APT dept., MIPT, who was the initial reviewer of the project and the supervisor and mentor of half of the BigData team. He was the one who helped to get this show on the road.
- Oleg Sukhoroslov (PhD, Senior Researcher at IITP RAS), who has been teaching MapReduce, Hadoop and friends since 2008. Now he is leading the infrastructure team.
- Oleg Ivchenko (PhD student at APT dept., MIPT), Pavel Akhtyamov (MSc. student at APT dept., MIPT) and Vladimir Kuznetsov (Assistant at P.G. Demidov Yaroslavl State University), superbrains who have developed and now maintain the infrastructure used for practical assignments in this course.
- Asya Roitberg, Eugene Baulin, Marina Sudarikova. These people never sleep; they babysit this course day and night to make your learning experience productive, smooth and exciting.

Good to know

Know what's good, what to watch for, and possible dealbreakers
Explores MapReduce and Spark, which are standard in industry
Taught by Pavel Mezentsev, Alexey A. Dral, Natalia Pritykovskaya, and Pavel Klemenkov, who are recognized for their work in big data
Develops warehousing and large-graph data processing skills, which are core skills for big data engineers
Offers hands-on labs and interactive materials
This course is stated to be no longer accessible

Reviews summary

Essential for big data

This challenging course is heavy on concepts and light on support. Rigorous and thorough, it is not for beginners. If you already have some knowledge and experience in big data, you will find this course to be a solid addition to your toolkit.
Best for those with some big data knowledge.
"Great course if you have a little bit of experience in the big data world."
In-depth and complex material.
"This one was not for playing."
"Worst course very difficult to pass"
Limited support materials and resources.
"The course is not well structured"
"Bugged grader and complete lack of support from the course admins"
Frequent technical problems with software.
"I could not complete this course as jupyter notebook is freezing while uploading the assignments"
"And finally: not working instrument - docker images, labs, assignment tools - and unresponsive support."
Grader often malfunctions.
"Grader is awful"
"There is many issues in LTI grader."

Career center

Learners who complete Big Data Analysis: Hive, Spark SQL, DataFrames and GraphFrames will develop knowledge and skills that may be useful to these careers:
Data Scientist
Data Scientists can improve their usage of Hive, Spark SQL, DataFrames, and GraphFrames, which will help them to wrangle complex big data sets and draw actionable insights. This is key to delivering more value from their projects and producing more accurate results.
Big Data Engineer
Apache Hive, Spark SQL, DataFrames, and GraphFrames are all essential tools for Big Data Engineers. This course can help them master these tools and develop the skills they need to design, build, and manage big data systems.
Data Analyst
Data Analysts can use the skills they learn in this course to improve their ability to analyze large datasets and identify trends and patterns for actionable insights. This can greatly improve their ability to add value to their organization.
Software Engineer
Software Engineers who specialize in Big Data would greatly benefit from this course. It can help them develop the skills and knowledge they need to design, build, and maintain big data systems.
Data Architect
Data Architects can use the skills they learn in this course to improve their ability to design and manage big data systems. This can help them ensure that their organizations can effectively use big data to achieve their business goals.
Machine Learning Engineer
Machine Learning Engineers can use the skills they learn in this course to improve their ability to build and deploy machine learning models on big data sets. This can help them to develop more accurate and effective models.
Cloud Architect
Cloud Architects can use the skills they learn in this course to improve their ability to design and manage big data systems in the cloud. This can help them to ensure that their organizations can effectively use the cloud to achieve their business goals.
Data Engineer
Data Engineers can use the skills they learn in this course to improve their ability to design, build, and maintain big data systems. This can help them to ensure that their organizations can effectively use big data to achieve their business goals.
Software Developer
Software Developers who specialize in Big Data would greatly benefit from this course. It can help them develop the skills and knowledge they need to design, build, and maintain big data systems.
Business Analyst
Business Analysts can use the skills they learn in this course to improve their ability to understand and analyze big data. This can help them to make better decisions and develop more effective business strategies.
Database Administrator
Database Administrators can use the skills they learn in this course to improve their ability to manage big data systems. This can help them to ensure that their organizations can effectively use big data to achieve their business goals.
Systems Administrator
Systems Administrators can use the skills they learn in this course to improve their ability to manage big data systems. This can help them to ensure that their organizations can effectively use big data to achieve their business goals.
Information Technology Manager
Information Technology Managers can use the skills they learn in this course to improve their ability to manage big data systems. This can help them to ensure that their organizations can effectively use big data to achieve their business goals.

© 2016 - 2024 OpenCourser