We may earn an affiliate commission when you visit our partners.
Course image
Udemy logo

Fundamentals of Database Engineering

Hussein Nasser

Database Engineering is a very interesting sector in software engineering. If you are interested in learning about database engineering you have come to the right place. I have curated this course carefully to discuss the Fundamental concepts of database engineering.

Read more

Database Engineering is a very interesting sector in software engineering. If you are interested in learning about database engineering you have come to the right place. I have curated this course carefully to discuss the Fundamental concepts of database engineering.

This course will not teach you SQL or programming languages, however, it will teach you skillsets and patterns that you can apply in database engineering. A few of the things that you will learn are Indexing, Partitioning, Sharding, Replication, b-trees in-depth indexing, Concurrency control, database engines and security, and much more.

I believe that learning the fundamentals of database engineering will equip you with the necessary means to tackle difficult and challenging problems yourself. I always compare engineering to math, you never memorize specific formulas and equations, you know the basic proves and derive and solve any equation one throws at you. Database engineering is similar, you can't possibly say MongoDB is better than MySQL or Postgres is better than Oracle. Instead, you learn your use case and by understanding how each database platform does its own trade-offs you will be able to make optimal decisions.

One other thing you will learn in this course is the lowest database interface that talks to the OS which is the database engine. Database engines or storage engines or sometimes even called embedded databases is a software library that a database management software uses to store data on disk and do CRUD (create update delete) Embedded means move everything in one software no network client-server. In this video course, I want to go through the few popular database engines, explain the differences between them, and finally, I want to spin up a database and change its engine and show the different features of each engine.

Enjoy the course.

Enroll now

What's inside

Learning objectives

  • Learn and understand acid properties
  • Database indexing
  • Database partitioning
  • Database replication
  • Database sharding
  • Database cursors
  • Concurrency control (optimistic, pessimistic)
  • B-trees in production database systems
  • Database system designs
  • Difference between database management system, database engine and embedded database
  • Database engines such as myisam, innodb, rocksdb, leveldb and more
  • Benefits of using one database engine over the other
  • Switching database engines with mysql
  • Database security
  • Homomorphic encryption
  • Show more
  • Show less

Syllabus

Section dedicates to Course Updates and Welcome
Welcome to the Course
Course Note 1
Course Note 2
Read more
Course Note 3
Note about Docker
Slides
Learn about Atomicity, Consistency, Isolation and Durability

ACID which stands for Atomicity, consistency, isolation, and durability are four critical properties of relational database. I think any engineer working with a relational database like postgres, mysql, sqlserver oracle, should understand these properties.


In this course, we will go through the four properties and explain why each is critical to build and use a relational database successfully.

What is a Transaction?
Atomicity
Isolation
Consistency
Durability

In this video we will demonstrate Atomicity, Isolation, Consistency and Durability on Postgres, fully practical example.



Phantom Reads
Serializable vs Repeatable Read
Eventual Consistency

Answer the following questions about ACID properties in databases

This section is critical to understand the internals of the database storage

This lecture details the inner working of database systems with regards to storage. It is a must watch to understand the difference between tables, pages, IO, rows, indexes and data files.

Row-Based vs Column-Based Databases

In this lecture I will discuss the difference between Primary Key and a Secondary Key and how it can affect your performance.


Databases Pages (Article)
Learn about the basic concepts of Indexing

Lots of you asked me how to create a table with millions of rows in postgres, here are the details


Getting Started with Indexing
Understanding The SQL Query Planner and Optimizer with Explain

In this video, I explain the benefits of Bitmap Index Scan and how it differs from Index scan and table sequential scan. 

Key vs Non-Key Column Database Indexing
Index Scan vs Index Only Scan
Combining Database Indexes for Better Performance
How Database Optimizers Decide to Use Indexes

If you create an index on a large production table in postgres, the operations blocks writes in order to make sure to pull all the field entries to the index. However most of the time you can't afford to block writes on an active production database table. Postgres new feature which allows create index concurrently allows writes and reads to go in the expense of cpu/memory, time and chance for the index to be invalid. A small price to pay for fast production writes! https://www.postgresql.org/docs/9.1/sql-createindex.html#SQL-CREATEINDEX-CONCURRENTLY

Bloom Filters
Working with Billion-Row Table
Article - The Cost of Long running Transactions
Article - Microsoft SQL Server Clustered Index Design
Indexing Quiz
Understanding the internals of B-Trees and how different database systems implements it. This helps you build effective and performant indexes

B-tree is a self-balancing tree data structure that maintains sorted data and allows searches, sequential access, insertions, and deletions in logarithmic time. However, most contents explain this data structure from a theoretical point of view, in this lecture I’d like to shed some light on the practical considerations of B-Tree and B+Trees in real production database systems such as Postgres and MySQL.


Link to the original paper https://infolab.usc.edu/csci585/Spring2010/den_ar/indexing.pdf

Full Table Scans
Original B-Tree
How the Original B-Tree Helps Performance

B-Tree limitation

B+Tree
B+Tree DBMS Considerations
B+Tree Storage Cost in MySQL vs Postgres
B-Tree Section's Summary
Entire Section discussing Database Partitioning By Example
Introduction to Database Partitioning
What is Partitioning?
Vertical vs Horizontal Partitioning
Partitioning Types
The Difference Between Partitioning and Sharding
Preparing: Postgres, Database, Table, Indexes
Execute Multiple Queries on the Table
Create and Attach Partitioned Tables
Populate the Partitions and Create Indexes
Class Project - Querying and Checking the Size of Partitions
The Advantages of Partitioning
The Disadvantages of Partitioning
Section Summary - Partitioning


Assume you have a table that is partitioned on the customer_id field serial 32bit, and you want to partition by range, how do you create all the necessary partitions? this is what I discuss in this video


Source Code

https://github.com/hnasr/javascript_playground/tree/master/automate_partitions


In this section we will explore database sharding, its pros and cons with practical examples
Introduction to Database Sharding
What is Database Sharding?
Consistent Hashing
Horizontal partitioning vs Sharding
Sharding with Postgres
Spin up Docker Postgres Shards
Writing to a Shard
Reading from a Shard
Advantages of Database Sharding
Disadvantages of Database Sharding
Database Sharding Section Summary
When Should you consider Sharding your Database?
Learn about locks, dead locks, connection pooling and distributed transactions

In this lecture we explain the difference between exclusive (write locks) and shared locks (read locks)

Dead Locks
Two-phase Locking

In this video, I demonstrate how is it possible to get double booking in database-backed web applications and how to prevent double booking and race conditions with row-level locks.


Source Code https://github.com/hnasr/javascript_playground/tree/master/booking-system

Double Booking Problem Part 2 ( Alternative Solution and explination)

In this video I’ll explain why you should avoid using SQL offset when implementing any kind of paging. I’ll explain what offset does, why is it slow and what is the alternative for better performance This video is inspired by Use the index luke, i’ll have a link to the blog and slides to learn more. Let say you have a web application with an API that supports paging, you user want to request 10 news articles in page 10, this is performed via a simple GET request as shown here The API server receives the GET request and builds the SQL in order to send it to the database hopefully a pool of connections exist here. Page 10 translates to offset 100 assuming each page has 10 records and now the database is ready to execute the query against the table. Offset by design means fetch and drop the first x number of rows, so in this case the database will fetch the first 110 rows and physically drop the first 100 leaving the limit of 10 which the user will get. As the offset increase, the database is doing more work which makes this operation extremely expensive. Furthermore, the problem with offset is you might accidentally read duplicate records. consider the user now want to read page 11 and meanwhile someone inserted a new row in the table, row 111 will be read twice Let us jump and test this against postgres



Use the Index Luke Blog https://use-the-index-luke.com/no-offset


Slides in this video https://payhip.com/b/B6o1


Connection pooling is a pattern of creating a pool of available connections (usually TCP) and allow multiple clients to share the same pool of connections. This pattern is usually used when connection establishment and tearing down is costly, and the server has a limited number of connections. In this video we will learn how to use connection pooling in NodeJs when working with a Postgres Database, we will learn how to spin up a pool of database connections and use stateless pool queries and transactional queries begin/end, and finally, we will


  • Node JS Source Code used in this lecture here https://github.com/hnasr/javascript_playground/tree/master/postgresnode-pool


Introduction to Database Replication with Postgres 13

Scripts and commands

docker run --name pgmaster -v /Users/HusseinNasser/postgres/v/master_data:/var/lib/postgresql/data -p 5432:5432 -e POSTGRES_PASSWORD=postgres -d postgres


docker run --name pgstandby -v /Users/HusseinNasser/postgres/v/standby_data:/var/lib/postgresql/data -p 5433:5432 -e POSTGRES_PASSWORD=postgres -d postgres

In standby node update postgresql.conf

primary_conninfo = 'application_name=standby host=husseinmac port=5432 user=postgres password=postgres’

add file standby.signal

touch standby.signal

In master update postgresql.conf

first 1 (standby1)

select * from pg_stat_replication

Master/Standby Replication
Multi-master Replication
Synchronous vs Asynchronous Replication

Scripts and commands

docker run --name pgmaster -v /Users/HusseinNasser/postgres/v/master_data:/var/lib/postgresql/data -p 5432:5432 -e POSTGRES_PASSWORD=postgres -d postgres


docker run --name pgstandby -v /Users/HusseinNasser/postgres/v/standby_data:/var/lib/postgresql/data -p 5433:5432 -e POSTGRES_PASSWORD=postgres -d postgres

In standby node update postgresql.conf

primary_conninfo = 'application_name=standby host=husseinmac port=5432 user=postgres password=postgres’

add file standby.signal

touch standby.signal

In master update postgresql.conf

first 1 (standby1)

select * from pg_stat_replication



If using the hostname doesn't work, use the IP address of the container itself. You can get the local IP address of the container by running docker inspect container name

Pros and Cons of Replication
Database System Design

We got through a practical system design exercises, this lecture is two parts. Part 1 is all about backend engineering and scaling and Part 2 focuses on database design.

Building a Short URL System Database Backend
Database Engines

Database engines or storage engines or sometimes even called embedded databases is software library that a database management software uses to store data on disk and do CRUD (create update delete)


Resources
https://youtu.be/V_C-T5S-w8g

https://mariadb.com/kb/en/library/changes-improvements-in-mariadb-102/

https://mariadb.com/kb/en/library/why-does-mariadb-102-use-innodb-instead-of-xtradb/

https://github.com/facebook/rocksdb/wiki/Features-Not-in-LevelDB

https://mariadb.com/kb/en/library/aria-storage-engine/

https://dev.mysql.com/doc/refman/8.0/en/innodb-index-types.html https://eng.uber.com/mysql-migration/

What is a Database Engine?
MyISAM
InnoDB

Good to know

Know what's good
, what to watch for
, and possible dealbreakers
Develops skills, knowledge, and tools that are core skills for database engineers
This is a multi-modal course including a mix of videos, readings, and discussions
Develops skills, knowledge, and tools that are highly relevant to industry
Develops a strong foundation for beginners
Taught by instructors with recognized expertise in database engineering
Covers unique perspectives and ideas that may add color to other topics and subjects

Save this course

Save Fundamentals of Database Engineering to your list so you can find it easily later:
Save

Activities

Coming soon We're preparing activities for Fundamentals of Database Engineering. These are activities you can do either before, during, or after a course.

Career center

Learners who complete Fundamentals of Database Engineering will develop knowledge and skills that may be useful to these careers:
Data Scientist
A Data Scientist is responsible for developing and applying statistical and machine learning models to data. This course can help you learn the fundamentals of database engineering, which is essential for success in this role. You will learn about database indexing, partitioning, replication, and sharding. You will also learn about database security and how to protect your data from unauthorized access.
Software Engineer
A Software Engineer is responsible for designing, developing, and maintaining software applications. This course can help you learn the fundamentals of database engineering, which is essential for success in this role. You will learn about database indexing, partitioning, replication, and sharding. You will also learn about database security and how to protect your data from unauthorized access.
Data Analyst
A Data Analyst is responsible for analyzing data to identify trends and patterns. This course can help you learn the fundamentals of database engineering, which is essential for success in this role. You will learn about database indexing, partitioning, replication, and sharding. You will also learn about database security and how to protect your data from unauthorized access.
Database Administrator
A Database Administrator is responsible for the day-to-day maintenance and configuration of databases. This course can help you learn the fundamentals of database engineering, which is essential for success in this role. You will learn about database indexing, partitioning, replication, and sharding. You will also learn about database security and how to protect your data from unauthorized access.
Product Manager
A Product Manager is responsible for managing the development and launch of new products. This course can help you learn the fundamentals of database engineering, which is essential for success in this role. You will learn about database indexing, partitioning, replication, and sharding. You will also learn about database security and how to protect your data from unauthorized access.
Consultant
A Consultant is responsible for providing advice and guidance to clients on a variety of topics. This course can help you learn the fundamentals of database engineering, which is essential for success in this role. You will learn about database indexing, partitioning, replication, and sharding. You will also learn about database security and how to protect your data from unauthorized access.
Technical Writer
A Technical Writer is responsible for writing technical documentation, such as user manuals, white papers, and training materials. This course can help you learn the fundamentals of database engineering, which is essential for success in this role. You will learn about database indexing, partitioning, replication, and sharding. You will also learn about database security and how to protect your data from unauthorized access.
Museum curator
A Museum Curator is responsible for managing and preserving museum collections. This course can help you learn the fundamentals of database engineering, which is essential for success in this role. You will learn about database indexing, partitioning, replication, and sharding. You will also learn about database security and how to protect your data from unauthorized access.
Teacher
A Teacher is responsible for teaching students about a variety of subjects. This course can help you learn the fundamentals of database engineering, which is essential for success in this role. You will learn about database indexing, partitioning, replication, and sharding. You will also learn about database security and how to protect your data from unauthorized access.
Researcher
A Researcher is responsible for conducting research on a variety of topics. This course can help you learn the fundamentals of database engineering, which is essential for success in this role. You will learn about database indexing, partitioning, replication, and sharding. You will also learn about database security and how to protect your data from unauthorized access.
Systems Engineer
A Systems Engineer is responsible for designing, developing, and maintaining computer systems. This course can help you learn the fundamentals of database engineering, which is essential for success in this role. You will learn about database indexing, partitioning, replication, and sharding. You will also learn about database security and how to protect your data from unauthorized access.
Historian
A Historian is responsible for studying and interpreting the past. This course can help you learn the fundamentals of database engineering, which is essential for success in this role. You will learn about database indexing, partitioning, replication, and sharding. You will also learn about database security and how to protect your data from unauthorized access.
Archivist
An Archivist is responsible for managing and preserving historical records. This course can help you learn the fundamentals of database engineering, which is essential for success in this role. You will learn about database indexing, partitioning, replication, and sharding. You will also learn about database security and how to protect your data from unauthorized access.
Librarian
A Librarian is responsible for managing a library and providing access to information. This course can help you learn the fundamentals of database engineering, which is essential for success in this role. You will learn about database indexing, partitioning, replication, and sharding. You will also learn about database security and how to protect your data from unauthorized access.
Database Engineer
A Database Engineer is responsible for designing, developing, and maintaining databases. This course can help you learn the fundamentals of database engineering, which is essential for success in this role. You will learn about database indexing, partitioning, replication, and sharding. You will also learn about database security and how to protect your data from unauthorized access.

Reading list

We've selected seven books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Fundamentals of Database Engineering.
Is another classic textbook on database systems. It covers a similar range of topics as Elmasri and Navathe's book, but it is more concise and less technical. It good choice for students who want to learn the basics of database systems without getting too bogged down in the details.
Comprehensive guide to MySQL, one of the most popular open-source database management systems. It covers a wide range of topics, including installation, configuration, query processing, and performance tuning. It valuable resource for anyone who wants to learn more about MySQL.
Comprehensive guide to PostgreSQL, another popular open-source database management system. It covers a wide range of topics, including installation, configuration, query processing, and performance tuning. It valuable resource for anyone who wants to learn more about PostgreSQL.
Comprehensive guide to SQL Server, a popular commercial database management system from Microsoft. It covers a wide range of topics, including installation, configuration, query processing, and performance tuning. It valuable resource for anyone who wants to learn more about SQL Server.
Provides a comprehensive overview of data mining. It covers a wide range of topics, including data preprocessing, data mining algorithms, and data mining applications. It valuable resource for anyone who wants to learn more about how to use data mining to gain insights from data.
Provides a comprehensive overview of machine learning. It covers a wide range of topics, including machine learning algorithms, machine learning applications, and machine learning tools. It valuable resource for anyone who wants to learn more about how to use machine learning to solve problems.
Provides a comprehensive overview of deep learning. It covers a wide range of topics, including deep learning algorithms, deep learning applications, and deep learning tools. It valuable resource for anyone who wants to learn more about how to use deep learning to solve problems.

Share

Help others find this course page by sharing it with your friends and followers:

Similar courses

Here are nine courses similar to Fundamentals of Database Engineering.
Automotive Engine Fundamentals | 汽车发动机原理
Most relevant
Aerospace Engineering: Aircraft Fundamentals and Advanced
Most relevant
Creating, Connecting, and Monitoring Databases with...
Most relevant
How to Use SQL with Large Datasets
Complete Houdini FX Bootcamp
C# Design Patterns: Rules Engine Pattern
Aerospace Engineering: Airplanes, Airlines and Airports
Optimizing a Data Warehouse on the Microsoft SQL Server...
Introduction to Qdrant (Vector Database) Using Python
Our mission

OpenCourser helps millions of learners each year. People visit us to learn workspace skills, ace their exams, and nurture their curiosity.

Our extensive catalog contains over 50,000 courses and twice as many books. Browse by search, by topic, or even by career interests. We'll match you to the right resources quickly.

Find this site helpful? Tell a friend about us.

Affiliate disclosure

We're supported by our community of learners. When you purchase or subscribe to courses and programs or purchase books, we may earn a commission from our partners.

Your purchases help us maintain our catalog and keep our servers humming without ads.

Thank you for supporting OpenCourser.

© 2016 - 2024 OpenCourser