We may earn an affiliate commission when you visit our partners.
Vivek Sarkar

This course teaches learners (industry professionals and students) the fundamental concepts of Distributed Programming in the context of Java 8. Distributed programming enables developers to use multiple nodes in a data center to increase throughput and/or reduce latency of selected applications. By the end of this course, you will learn how to use popular distributed programming frameworks for Java programs, including Hadoop, Spark, Sockets, Remote Method Invocation (RMI), Multicast Sockets, Kafka, Message Passing Interface (MPI), as well as different approaches to combine distribution with multithreading.


Why take this course?

• Data centers are organized as collections of distributed servers, so it is important to learn how to use multiple servers for increased bandwidth and reduced latency.

• In addition to learning specific frameworks for distributed programming, this course will teach you how to integrate multicore and distributed parallelism in a unified approach.

• Each of the four modules in the course includes an assigned mini-project that will provide you with the necessary hands-on experience to use the concepts learned in the course on your own, after the course ends.

• During the course, you will have online access to the instructor and the mentors to get individualized answers to your questions posted on forums.

The desired learning outcomes of this course are as follows:

• Distributed map-reduce programming in Java using the Hadoop and Spark frameworks

• Client-server programming using Java's Socket and Remote Method Invocation (RMI) interfaces

• Message-passing programming in Java using the Message Passing Interface (MPI)

• Approaches to combine distribution with multithreading, including processes and threads, distributed actors, and reactive programming

Mastery of these concepts will enable you to immediately apply them in the context of distributed Java programs, and will also provide the foundation for mastering other distributed programming frameworks that you may encounter in the future (e.g., in Scala or C++).


What's inside

Syllabus

Welcome to the Course!
Welcome to Distributed Programming in Java! This course is one part of a three-part series and covers its body of knowledge through video lectures, demonstrations, and coding projects.
DISTRIBUTED MAP REDUCE
In this module, we will learn about the MapReduce paradigm, and how it can be used to write distributed programs that analyze data represented as key-value pairs. A MapReduce program is defined via user-specified map and reduce functions, and we will learn how to write such programs in the Apache Hadoop and Spark projects. The MapReduce paradigm can be used to express a wide range of parallel algorithms. One example that we will study is computation of the Term Frequency-Inverse Document Frequency (TF-IDF) statistic used in document mining; this algorithm uses a fixed (non-iterative) number of map and reduce operations. Another MapReduce example that we will study is parallelization of the PageRank algorithm. This algorithm is an example of iterative MapReduce computations, and is also the focus of the mini-project associated with this module.
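To make the map and reduce roles concrete, here is a minimal single-JVM sketch of a word count written in plain Java. It is not Hadoop or Spark code, and the names `WordCountSketch`, `map`, `reduce`, and `run` are hypothetical; the frameworks covered in the module distribute these same phases across nodes.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: a sequential stand-in for the map, shuffle, and reduce phases.
class WordCountSketch {
    // Map phase: emit one (word, 1) key-value pair for every word in a line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Reduce phase: combine all values that were emitted for one key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    static Map<String, Integer> run(List<String> lines) {
        // Shuffle: group the emitted values by key (a framework does this across nodes).
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> kv : map(line)) {
                grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        grouped.forEach((key, values) -> result.put(key, reduce(key, values)));
        return result;
    }

    public static void main(String[] args) {
        // prints {be=2, do=1, not=1, or=1, to=3}
        System.out.println(run(Arrays.asList("to be or not to be", "to do")));
    }
}
```

Because the user supplies only `map` and `reduce`, the grouping step in between is free to run on any number of machines, which is what makes the paradigm a good fit for distribution.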
CLIENT-SERVER PROGRAMMING
In this module, we will learn about client-server programming, and how distributed Java applications can communicate with each other using sockets. Since communication via sockets occurs at the level of bytes, we will learn how to serialize objects into bytes in the sender process and to deserialize bytes into objects in the receiver process. Sockets and serialization provide the necessary background for the File Server mini-project associated with this module. We will also learn about Remote Method Invocation (RMI), which extends the notion of method invocation in a sequential program to a distributed programming setting. Likewise, we will learn about multicast sockets, which generalize the standard socket interface to enable a sender to send the same message to a specified set of receivers; this capability can be very useful for a number of applications, including news feeds, video conferencing, and multi-player games. Finally, we will learn about distributed publish-subscribe applications, and how they can be implemented using the Apache Kafka framework.
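The serialize-send-deserialize path described above can be sketched with the standard `java.net` and `java.io` classes. This is a minimal illustration over the loopback interface, not the course's File Server project; `SocketSerializationSketch`, `FileRequest`, and `roundTrip` are hypothetical names, and both ends run in one JVM only so the example is self-contained.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class SocketSerializationSketch {
    // Hypothetical message type; implementing Serializable lets Java convert it to bytes.
    static class FileRequest implements Serializable {
        private static final long serialVersionUID = 1L;
        final String path;

        FileRequest(String path) {
            this.path = path;
        }
    }

    // Sends one FileRequest over a loopback socket and returns the path the receiver saw.
    static String roundTrip(String path) {
        // Port 0 asks the operating system for any free port.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            String[] received = new String[1];
            Thread receiver = new Thread(() -> {
                // Receiver side: deserialize the incoming bytes back into an object.
                try (Socket s = server.accept();
                     ObjectInputStream in = new ObjectInputStream(s.getInputStream())) {
                    received[0] = ((FileRequest) in.readObject()).path;
                } catch (IOException | ClassNotFoundException e) {
                    throw new IllegalStateException(e);
                }
            });
            receiver.start();
            // Sender side: serialize the object into bytes on the socket's output stream.
            try (Socket s = new Socket(server.getInetAddress(), server.getLocalPort());
                 ObjectOutputStream out = new ObjectOutputStream(s.getOutputStream())) {
                out.writeObject(new FileRequest(path));
            }
            receiver.join(); // join() makes the receiver's write to received[0] visible here
            return received[0];
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("/data/report.txt")); // prints /data/report.txt
    }
}
```

In a real deployment the sender and receiver would be separate processes, possibly on different machines, and both would need the `FileRequest` class on their classpath.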
Talking to Two Sigma: Using it in the Field
Join Professor Vivek Sarkar as he talks with Two Sigma Managing Director, Jim Ward, and Senior Vice President, Dr. Eric Allen at their downtown Houston, Texas office about the importance of distributed programming.
MESSAGE PASSING
In this module, we will learn how to write distributed applications in the Single Program Multiple Data (SPMD) model, specifically by using the Message Passing Interface (MPI) library. MPI processes can send and receive messages using primitives for point-to-point communication, which are different in structure and semantics from message-passing with sockets. We will also learn about the message ordering and deadlock properties of MPI programs. Non-blocking communications are an interesting extension of point-to-point communications, since they can be used to avoid delays due to blocking and to also avoid deadlock-related errors. Finally, we will study collective communication, which can involve multiple processes in a manner that is more powerful than multicast and publish-subscribe operations. The knowledge of MPI gained in this module will be put to practice in the mini-project associated with this module on implementing a distributed matrix multiplication program in MPI.
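The SPMD idea, where every process runs the same program and branches on its rank, can be illustrated without an MPI installation. The sketch below is a simulation under stated assumptions: threads stand in for MPI processes, a `BlockingQueue` stands in for the network, and `SpmdSketch` and `parallelSum` are hypothetical names. It uses no real MPI calls and is not how the course's mini-project is implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Simulation only: threads play the role of MPI ranks inside a single JVM.
class SpmdSketch {
    // Sums 1..n with `ranks` workers that all execute the same code (SPMD).
    static long parallelSum(int n, int ranks) {
        // Mailbox of rank 0; an assumption standing in for point-to-point receives.
        BlockingQueue<Long> mailboxOfRankZero = new LinkedBlockingQueue<>();
        long[] result = new long[1];
        List<Thread> workers = new ArrayList<>();
        for (int r = 0; r < ranks; r++) {
            final int rank = r;
            Thread worker = new Thread(() -> {
                try {
                    // Every rank sums its own cyclic slice: rank+1, rank+1+ranks, ...
                    long partial = 0;
                    for (int i = rank + 1; i <= n; i += ranks) {
                        partial += i;
                    }
                    if (rank != 0) {
                        mailboxOfRankZero.put(partial); // "send" the partial sum to rank 0
                    } else {
                        long total = partial;
                        for (int k = 1; k < ranks; k++) {
                            total += mailboxOfRankZero.take(); // blocking "receive"
                        }
                        result[0] = total;
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers.add(worker);
            worker.start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(parallelSum(100, 4)); // prints 5050
    }
}
```

Rank 0 gathering one value from every other rank is the pattern that a collective reduction expresses in a single call, which is one reason collectives are described above as more powerful than individual sends and receives.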
COMBINING DISTRIBUTION AND MULTITHREADING
In this module, we will study the roles of processes and threads as basic building blocks of parallel, concurrent, and distributed Java programs. With this background, we will then learn how to implement multithreaded servers for increased responsiveness in distributed applications written using sockets, and apply this knowledge in the mini-project on implementing a parallel file server using both multithreading and sockets. An analogous approach can also be used to combine MPI and multithreading, so as to improve the performance of distributed MPI applications. Distributed actors serve as yet another example of combining distribution and multithreading. A notable property of the actor model is that the same high-level constructs can be used to communicate among actors running in the same process and among actors in different processes; the difference between the two cases depends on the application configuration, rather than the application code. Finally, we will learn about the reactive programming model, and its suitability for implementing distributed service-oriented architectures using asynchronous events.
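A multithreaded socket server of the kind described above can be sketched with a thread pool from `java.util.concurrent`. This is a minimal echo server under stated assumptions, not the course's parallel file server: `ThreadedEchoServerSketch`, `start`, `port`, and `request` are hypothetical names, and the pool size of 4 is arbitrary.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThreadedEchoServerSketch implements AutoCloseable {
    private final ServerSocket server;
    private final ExecutorService pool = Executors.newFixedThreadPool(4); // arbitrary size

    private ThreadedEchoServerSketch(ServerSocket server) {
        this.server = server;
    }

    // Binds to a free loopback port and starts accepting connections in the background.
    static ThreadedEchoServerSketch start() {
        try {
            ThreadedEchoServerSketch s = new ThreadedEchoServerSketch(
                    new ServerSocket(0, 50, InetAddress.getLoopbackAddress()));
            Thread acceptor = new Thread(s::acceptLoop);
            acceptor.setDaemon(true);
            acceptor.start();
            return s;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    int port() {
        return server.getLocalPort();
    }

    private void acceptLoop() {
        while (!server.isClosed()) {
            try {
                Socket client = server.accept();
                // Hand each connection to the pool so a slow client does not block accept().
                pool.submit(() -> handle(client));
            } catch (IOException e) {
                return; // accept() fails once the server socket has been closed
            }
        }
    }

    private void handle(Socket client) {
        try (Socket c = client;
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(c.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(c.getOutputStream(), true)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.println("echo: " + line);
            }
        } catch (IOException e) {
            // Drop this client; other connections are unaffected.
        }
    }

    // Client helper: sends one line and returns the server's one-line reply.
    static String request(int port, String message) {
        try (Socket s = new Socket(InetAddress.getLoopbackAddress(), port);
             PrintWriter out = new PrintWriter(s.getOutputStream(), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8))) {
            out.println(message);
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void close() {
        try {
            server.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

A call such as `ThreadedEchoServerSketch.request(server.port(), "hi")` returns `echo: hi`. The accept loop stays responsive because the per-connection work runs on pool threads rather than on the thread that calls `accept()`.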
Continue Your Journey with the Specialization "Parallel, Concurrent, and Distributed Programming in Java"
The next two videos will showcase the importance of learning about Parallel Programming and Concurrent Programming in Java. Professor Vivek Sarkar will speak with industry professionals at Two Sigma about how the topics of our other two courses are utilized in the field.

Good to know

Know what's good, what to watch for, and possible dealbreakers
Teaches Java developers how to build distributed systems with popular frameworks such as Apache Kafka and Apache Hadoop
Appropriate for industry professionals and students with experience in Java programming
Suitable for individuals aiming to enhance their skills in distributed programming and multithreading
Provides practical experience through mini-projects assigned in each module
Led by Vivek Sarkar, an experienced professor with expertise in distributed programming
May require familiarity with basic distributed programming concepts


Reviews summary

Informative Java concurrency course

Learners say this Java course provides a great intro to distributed programming concepts like locks, threads, and distributed actors. They also highly rate the expert instruction and helpful lectures, although some wish the challenging mini-projects were more in depth.
Interesting mini-projects to apply concepts
"learning-friendly and oriented"
"A very good course to take on Coursera"
"Great experience and all the lectures are really interesting and the concepts are precise and perfect."
Enlightening lectures by knowledgeable instructor
"The teacher is clearly very knowledgeable about the subject and is great at articulating things in a way that is understandable."
"Great explanations. Great foundations."
"The hints and tutorial for mini projects in this section can be reduced a little bit."
Reported issues with the grading system
"The grading system is awful."
"They require you to optimize an algorithm, but do not provide enough resources on the grader, therefore you have to keep repeating the same submission for hours until finally you have the resources available on the server."
"The quizzes also have some nonsense questions."
Concepts presented at too basic of a level
"Useful course overall, but a bit too short and too basic in covering a number of concepts."
"Course does not go too deep in arguments, but gives a quite basic knowledge about distributed data structures and algorithms."
"The implementation assignments (mini projects) could be more challenging though."

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Distributed Programming in Java with these activities:
Join a Study Group for the Course
Joining a study group will provide you with opportunities to collaborate with peers, discuss course concepts, and reinforce your understanding through peer-to-peer learning.
  • Reach out to classmates through the course forum or online platforms.
  • Organize regular study sessions to discuss course material and work on assignments together.
Organize and Review Course Materials
Organizing and reviewing course materials will help you stay organized and ensure that you have a comprehensive understanding of the concepts covered in the course.
  • Create a folder or notebook to store course materials.
  • Download and organize lecture notes, assignments, and other resources.
  • Review the materials regularly to refresh your memory and reinforce your understanding.
Review Java Programming Essentials
Reviewing the fundamentals of Java will strengthen your foundation for the course, making it easier to grasp the concepts of distributed programming.
  • Read Chapters 1-3 of Head First Java to refresh your understanding of Java syntax and object-oriented programming concepts.
  • Complete the practice exercises at the end of each chapter to test your comprehension.
  • Create a simple Java program to demonstrate your understanding of the material.
Complete the Apache Hadoop Tutorial
The Apache Hadoop Tutorial will provide you with a hands-on introduction to distributed programming using Hadoop, which is one of the key frameworks covered in the course.
  • Visit the Apache Hadoop website and follow the tutorial instructions.
  • Install Hadoop on your local machine.
  • Create a simple Hadoop program to analyze a dataset.
Solve MapReduce Problems on LeetCode
Solving MapReduce problems on LeetCode will challenge your understanding of distributed programming concepts and algorithms and help you prepare for real-world applications.
  • Create a LeetCode account and search for MapReduce problems.
  • Attempt to solve the problems on your own.
  • Review the solutions provided by other users to learn alternative approaches.
Contribute to an Open-Source Project Related to Distributed Programming
Contributing to an open-source project will give you practical experience in working on real-world distributed programming systems and collaborating with other developers.
  • Identify an open-source project related to distributed programming.
  • Review the project's documentation and codebase.
  • Find a suitable issue or feature to work on.
  • Implement and test your changes.
  • Submit a pull request to the project's repository.
Develop a Java Program to Implement a Distributed File Server
Developing a Java program to implement a distributed file server will provide you with hands-on experience in designing and implementing distributed systems, a key skill in this course.
  • Design the architecture of the distributed file server.
  • Implement the server-side and client-side components.
  • Test the file server using various scenarios.
  • Write a report summarizing your design and implementation.

Career center

Learners who complete Distributed Programming in Java will develop knowledge and skills that may be useful to these careers:
Software Engineer
Software Engineers who analyze data represented as key-value pairs may be interested in the **Distributed Programming in Java** course. The course covers Hadoop and Spark, which are frameworks used to run MapReduce analyses on data in this format. Additionally, the course explores the use of multithreading to optimize the performance of distributed software.
Data Scientist
Data Scientists interested in applying a multithreaded approach to their data analysis can leverage the teachings of the **Distributed Programming in Java** course. The course covers how processes and threads can be combined with distribution to optimize the performance of distributed software.
Software Architect
The **Distributed Programming in Java** course can be a valuable asset for Software Architects looking to expand their expertise in designing and developing distributed systems. The course covers the fundamental concepts of distributed programming and various frameworks and techniques used in this field.
Systems Analyst
Systems Analysts seeking to gain proficiency in the design and implementation of distributed systems may benefit from the **Distributed Programming in Java** course. The course covers a range of topics relevant to distributed systems, including MapReduce programming, client-server programming, message passing, and combining distribution with multithreading.
Big Data Architect
The **Distributed Programming in Java** course may be of interest to Big Data Architects seeking to expand their skill set in the realm of distributed systems. The course delves into the use of Hadoop and Spark for processing large datasets, as well as techniques for combining distribution with multithreading to optimize performance.
Cloud Architect
Cloud Architects may find the **Distributed Programming in Java** course to be a valuable resource for enhancing their understanding of distributed systems. The course covers frameworks for distributed programming that are often deployed in the cloud, such as Hadoop and Spark.
Full Stack Engineer
Full Stack Engineers wishing to enhance their understanding of distributed programming may find the **Distributed Programming in Java** course beneficial. The course provides a comprehensive overview of distributed programming concepts and frameworks, including Hadoop, Spark, and MPI.
Database Administrator
Database Administrators looking to enhance their understanding of distributed data management may benefit from the **Distributed Programming in Java** course. The course covers techniques for processing and managing large datasets using distributed frameworks such as Hadoop and Spark.
Backend Engineer
The **Distributed Programming in Java** course can be of value to Backend Engineers seeking to develop their skills in designing and implementing distributed systems. The course covers a range of topics relevant to distributed programming, including MapReduce, client-server programming, message passing, and combining distribution with multithreading.
Cloud Engineer
Cloud Engineers aiming to enhance their understanding of distributed systems and cloud computing may benefit from the **Distributed Programming in Java** course. The course covers various frameworks and techniques used in distributed programming, including Hadoop and Spark.
DevOps Engineer
DevOps Engineers seeking to expand their knowledge of distributed systems and cloud computing may find the **Distributed Programming in Java** course useful. The course covers various frameworks and techniques used in distributed programming, including Hadoop and Spark.
Systems Engineer
Systems Engineers seeking to gain a deeper understanding of distributed systems and cloud computing may find the **Distributed Programming in Java** course beneficial. The course covers the fundamental concepts of distributed programming and frameworks commonly used in this field.
Data Engineer
The **Distributed Programming in Java** course can provide Data Engineers with a solid foundation in the principles and practices of distributed programming. The course covers a range of topics relevant to data engineering, including MapReduce programming, client-server programming, message passing, and combining distribution with multithreading.
Network Engineer
The **Distributed Programming in Java** course may be useful for Network Engineers interested in expanding their knowledge of distributed systems and cloud computing. The course covers various frameworks and techniques used in distributed programming, including Hadoop and Spark.
Security Engineer
Security Engineers looking to enhance their understanding of distributed systems and cloud computing may find the **Distributed Programming in Java** course helpful. The course covers the fundamental concepts of distributed programming and frameworks commonly used in this field.

Reading list

We've selected 12 books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Distributed Programming in Java.
Provides a comprehensive overview of Hadoop, from its architecture and components to its use cases and best practices. It is a valuable resource for anyone who wants to learn more about Hadoop and how to use it effectively.
Provides a comprehensive overview of Spark, from its architecture and components to its use cases and best practices. It is a valuable resource for anyone who wants to learn more about Spark and how to use it effectively.
Provides a comprehensive overview of Java concurrency, from its basic concepts to its advanced features. It is a valuable resource for anyone who wants to learn more about Java concurrency and how to use it effectively.
Provides a comprehensive overview of cloud computing, from its basic concepts to its advanced features. It is a valuable resource for anyone who wants to learn more about cloud computing and how to use it effectively.
Provides a comprehensive overview of high-performance computing, from its basic concepts to its advanced features. It is a valuable resource for anyone who wants to learn more about high-performance computing and how to use it effectively.
Provides a comprehensive overview of computer networks, from their basic concepts to their advanced features. It is a valuable resource for anyone who wants to learn more about computer networks and how to use them effectively.
Provides a comprehensive overview of operating systems, from their basic concepts to their advanced features. It is a valuable resource for anyone who wants to learn more about operating systems and how to use them effectively.
Provides a comprehensive overview of distributed systems, from their basic concepts to their advanced features. It is a valuable resource for anyone who wants to learn more about distributed systems and how to use them effectively.
Provides a comprehensive overview of cloud computing, from its basic concepts to its advanced features. It is a valuable resource for anyone who wants to learn more about cloud computing and how to use it effectively.
Provides a comprehensive overview of grid computing, from its basic concepts to its advanced features. It is a valuable resource for anyone who wants to learn more about grid computing and how to use it effectively.
Provides a comprehensive overview of parallel programming with MPI and OpenMP, from its basic concepts to its advanced features. It is a valuable resource for anyone who wants to learn more about parallel programming with MPI and OpenMP and how to use them effectively.
Provides a comprehensive overview of high-performance scientific computing, from its basic concepts to its advanced features. It is a valuable resource for anyone who wants to learn more about high-performance scientific computing and how to use it effectively.


© 2016 - 2024 OpenCourser