SQL Server High Availability and Disaster Recovery (HA/DR) from Udemy

In this video, I answer submitted questions about performing maintenance - whether it's installing patches or dealing with emergencies - on SQL Server high availability solutions.

In this video, I answer submitted questions about monitoring SQL Server high availability solutions.

With the different high availability features that SQL Server provides, it can be tempting to implement Always On Availability Groups because that’s what most people recommend. However, just because you are implementing a SQL Server high availability solution doesn’t mean it will meet your high availability requirements. You don’t want to be in a situation where you spent a lot of money, effort, and resources building a SQL Server high availability solution that does not meet the business needs. The first step in implementing a high availability solution is to start with the basics.

In this session, you will learn the basics of high availability that every DBA needs to know. These basics form the foundation of every high availability solution you will deploy, regardless of the technology or feature you choose.

In this video, I answer submitted questions about backups for your SQL Server databases - be it enterprise-wide backups, native backups, snapshots, etc.

It's inevitable that you will be migrating your SQL Server databases throughout their lifecycle. But migrating a standalone database is not the same as migrating one that's in a high availability solution.

In this video, I answer submitted questions about migrating and/or upgrading SQL Server high availability solutions - be it from older version to newer, physical to virtual, on-premises to cloud, etc.

I couldn't answer all of the submitted questions in an hour. So I decided to break it down into two parts. This is Part 2 of the previous video on migrating and/or upgrading SQL Server high availability solutions.

If you need high availability for SQL Server databases, you need a cluster resource manager. And if you're not aware of how Windows Server Failover Clustering works, you're putting your job as a database administrator at risk. Because your responsibilities are now dependent on the things that are outside SQL Server.

In this video, I answer submitted questions about Windows Server Failover Clustering in the context of SQL Server high availability solutions.

Do you find it very confusing dealing with SQL Server Always On Availability Groups and not knowing what questions to ask? Even worse, you start having more questions the moment you get an answer to your previous one. Is it...

How many replicas do I need?
How do I configure the cluster quorum settings?
How do I configure the networking for the failover cluster?
What do I do when the failover cluster becomes unavailable?

Think about it, you can get instant “training” almost anywhere nowadays that can help you answer these questions:

You can watch videos on YouTube.
You can get Udemy or Pluralsight or Lynda.com subscriptions.
You can read the Microsoft documentation, blog posts, and articles.

All the information you need is at your fingertips at a moment’s notice, right? But unless you ask the most important question, it is going to cost you more confusion, more frustration, and more uncertainty.

This video talks about the most important question that you need to ask about your SQL Server Always On Availability Groups.

Every SQL Server high availability solution falls into one of two (2) categories - ones that you built and ones that others built. Simple, right?

The problem lies when somebody else - a different team, a consultant, a managed services provider - built the solution, didn't follow design and implementation best practices, and left the care and feeding to YOU. You struggle with addressing issues that could have been avoided if only the implementation was done properly. And you think the instructions that they left for you were good enough. All they did was create more confusion and introduce more issues.

Even worse is when it was you who designed the solution but were not aware of the proper design and deployment strategies needed for a stable and reliable environment.

This video talks about how to deal with a crappy environment - whether it was you who built it or somebody else. I also cover making sure that your SQL Server high availability environment is rock-solid, stable, and reliable so you don't have to struggle with unexpected outages and a lot of sleepless nights.

I was on the phone with a client who had to push back a database migration scheduled for the weekend. They were migrating from a SQL Server 2008 standalone instance to SQL Server 2019 Always On Availability Groups. I could feel the tension in his voice over the phone because of missed deadlines, pressure from the stakeholders (they are a large healthcare provider and could not afford extended downtime), and unresolved issues. Upon further investigation, it turned out that there was some miscommunication between the different teams involved in the migration.

Isn't it frustrating when a project you're involved in fails or misses a deadline and you can't do anything about it? You waste a huge amount of time on conference calls with other teams involved, wishing and hoping they would solve the problems much faster. And what about the finger-pointing? The joke's on the DBA (default blame accepter) because everyone assumes it's always a database problem. You're now getting blamed for things you have no control over. These things make matters worse.

Having worked with hundreds of customers and clients for more than a decade now, I see this as one of the main reasons why HA/DR projects fail or get delayed. And silos will continue to affect the way you do your work. If you don't know how to deal with inter-team dynamics, you will constantly be working with frustration and stress.

This video talks about the curse of silos in an organization, how they are causing division among teams and why it is one of the main causes of project failures. I also cover how to leverage your role as a DBA to successfully lead HA/DR projects despite the different silos in your organization. You'll be surprised at how important your role is to make this happen.

Are you dealing with too many incidents and outages with your SQL Server Always On Availability Groups?
Are you in constant firefighting mode trying to keep the lights on?
Do you get woken up in the middle of the night when the databases unexpectedly become unavailable?
Do you feel like your limited technical experience is hurting your ability to resolve, if not prevent, further incidents and outages from happening?
Are you concerned that a critical incident could become a resume-generating event if you don't get it resolved?

Tech problems are part of normal day-to-day IT operations. It's when problems turn into incidents that cause outages that they become dead serious. And when you have too many of them, dealing with them could be a daily nightmare. If you don't address the problem of having too many incidents, they will literally start taking over your life.

This video talks about the real problem behind dealing with too many incidents. I also cover the reasons why a highly available system like a SQL Server Always On Availability Group is experiencing outages and how you can prevent them from happening in the future.

I was reviewing a client's Always On Availability Group architecture last week and spoke to their internal team about their design choices. They brought me in to find out what's causing unexpected outages and how to prevent them from happening in the future.

The DBA team is top-notch and one of the smartest groups of people I have ever had the chance to work with. They have years of experience and have worked in the most challenging environments. So, imagine their surprise when the manager brought me in to review their environment. They've done all the work to build and maintain the infrastructure. Still, they struggled with identifying the cause of the outages. The senior engineer who designed the system did not even want to be in the same room with me until my last day.

If senior DBAs and engineers with years of experience deploying and managing SQL Server Always On environments still struggle with these challenges, imagine what it feels like when you don't even know what SQL Server Always On is. And the common theme that I see all the time is that they are making these 3 mistakes - regardless of their technical expertise or years of experience. What's worse, they're not even aware of these 3 things until issues like unexpected outages occur.

This video talks about the 3 biggest mistakes most DBAs make when deploying and managing SQL Server Always On.

As SQL Server DBAs working with Always On Availability Groups, sometimes our hands are tied when working with "black box" systems, such as:

the host machine running your Always On Availability Group VMs that is causing outages. Your Always On Availability Groups are experiencing a lot of unexpected failovers and outages. But since it's a VM, you don't know exactly what's going on at the hypervisor layer that is causing these outages.
the storage area network (SAN) that you do not have access to but is causing performance issues. Customers and end-users are complaining about database performance issues and you can perform diagnostics on SQL Server but not the underlying storage. You're trying to figure out what to say to a SAN admin when you think that the expensive SAN "may" be the cause of the performance problem and you just want to look into it.
the network that is causing replication latency and puts your databases at risk of going offline. The Always On Availability Group secondary replica cannot catch up with the replication, and the transaction log file is growing so fast that you have to constantly remove the secondary replica just so you can shrink the log file.
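
For the third scenario, a rough number for how far behind a secondary is can help frame the conversation with the network team. The sketch below is only an illustration. It assumes you have already collected the log send queue size (in KB) and the log send rate (in KB per second) for a database, for example from the log_send_queue_size and log_send_rate columns of sys.dm_hadr_database_replica_states. The function name and the sample numbers are hypothetical.

```python
from typing import Optional


def estimate_catch_up_seconds(log_send_queue_kb: float,
                              log_send_rate_kb_per_sec: float) -> Optional[float]:
    """Estimate how long a secondary needs to receive the log not yet sent to it.

    Returns None when the rate is zero or negative, because no estimate
    is possible if no log is flowing. Assumes (optimistically) that the
    primary generates no new log while the secondary catches up.
    """
    if log_send_queue_kb < 0:
        raise ValueError("queue size cannot be negative")
    if log_send_rate_kb_per_sec <= 0:
        return None
    return log_send_queue_kb / log_send_rate_kb_per_sec


# Hypothetical sample: 6 GB of log queued, being sent at 2 MB per second
queue_kb = 6 * 1024 * 1024
rate_kb_per_sec = 2 * 1024
seconds = estimate_catch_up_seconds(queue_kb, rate_kb_per_sec)
print(f"Estimated catch-up time: {seconds / 60:.1f} minutes")
```

An estimate like this does not tell you why the network is slow, but it turns "the secondary is behind" into a figure you can track over time and bring to the team that owns the black box.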

A black box can lead to finger-pointing and in-fighting within the team, causing issues to stay unresolved and costing the business potential revenue. I've seen this happen a lot, especially in very large organizations. If you're a SQL Server DBA who has to deal with a black box to manage your Always On Availability Group solutions, this is for you.

This video talks about how to deal with black boxes working with SQL Server high availability solutions.

Do you sometimes get discouraged?

You've been trying to learn SQL Server Always On Availability Groups for months - even years - and you feel like giving up because it's too complicated and confusing.
You attempted to set it up from the ground up - but failed miserably. You started to question your technical skills. And now you don't even want to get anywhere near an AG.
You don't know what you don't know and it's freaking you out because you're responsible for it - you're the lead engineer.

It's easy to get discouraged when you know you're not making progress - especially when it's your responsibility. It can be frustrating when a problem happens and you don't know what to do. You're supposed to be the subject matter expert.

The truth is, trying to build and/or manage a SQL Server Always On Availability Group can be discouraging when you're starting out. And the Microsoft documentation doesn't make it easy. There's so much information available that it gets more confusing. So, it's easier to just quit.

What if learning and mastering SQL Server Always On Availability Groups can be the breakthrough you need to get you the promotion you've been waiting for?

When I created my SQL Server Always On Availability Groups: The Senior DBA's Ultimate Field Guide training program, my intent was to simply share my personal experience and help my fellow tech professionals learn this complex technology. What I didn't realize was that it also opened up opportunities for a salary increase, a promotion, and even landing their dream job.

This video talks about how learning and mastering SQL Server Always On Availability Groups can open up opportunities for your career advancement.

This question came up during a conversation I had with a client. He is a consultant who wanted to become an expert in SQL Server Always On Availability Groups. He is constantly being assigned to projects that involve working with the technology. Yet he does not have the skills to confidently troubleshoot a problem nor to provide the right advice.

Becoming a subject matter expert is very important to your career whether you’re a consultant, sysadmin, or a DBA. It demonstrates that you are very good at what you do. And that can easily command respect from your peers and colleagues. Wouldn’t it feel great to know that your recommendations carry weight?

Subject matter expertise also opens up doors of opportunities. There’s a reason the “Senior” title exists. You need skills mastery whether you’re applying for your next job or wanting to be promoted.

The problem with working in tech is that the introduction of new technologies makes it very difficult to catch up. What you know today will be completely obsolete in a year or two. And all the available information is making it all the more confusing to learn the new tech.

The good news is that you can become a subject matter expert without having to deal with the overwhelming amount of information you need to learn about a specific technology.

In this video, I talk about the secret to becoming a technical subject matter expert.

A common question that I get asked a lot is whether Microsoft officially supports a specific configuration ... or not...

Can I add the secondary replica in an Always On Availability Group as a publisher in a transactional replication topology?
Is running a guest Windows cluster on VMware supported by both VMware and Microsoft?
Can I have named SQL Server instances as replicas in an Always On Availability Group?
Can I use a dedicated network adapter for a Distributed Availability Group data replication traffic?
Can I have more than 100 databases in an Always On Availability Group?

People on newsgroups, forums, and online communities are asking the same questions. And everyone who responds to these questions almost always provides a technical answer.

Maybe you're like one of them. Your boss wants to implement a specific configuration and you wonder if it is supported or not. And you do your due diligence and research online to validate your idea - reading official vendor documentation, blog posts, articles, etc.

The problem with asking the question "Is this supported?" is that it often leads you to a different path, one that may not be beneficial to your organization - or your sanity. Instead of helping you achieve your goals, it may end up causing more issues. And more issues can further lead to more incidents, more tickets, more unnecessary work, more frustration, more outages, more anxiety, and so on.

This video talks about why asking the question "Is This Supported?" is the wrong question to ask about your SQL Server Always On Availability Groups. I also share the alternative - the right questions (I did say questions, meaning not just one) to ask when deploying a specific configuration.

We are just coming out of an extended power outage that started this past weekend after a destructive storm ravaged the city. And while we already got power back on, there are other nearby communities that are still either on generators or completely without electricity.

Transmission lines were down, trees were uprooted, and some people were even trapped in their vehicles for hours after power lines fell onto them while they were driving. It's not every day we experience natural disasters like this storm. And when they do happen, not everyone is prepared to deal with them.

Similarly, emergencies do happen in technology. Whether it's a hardware failure, a database corruption, or an overheated data center (don't ask me why I have this specific event on the list), emergencies will happen. It's never a question of IF but a question of WHEN.

There are severe consequences when you and your team are not prepared to handle these emergencies. Extended outages can cost businesses revenue loss. Reputations are damaged when customers do not get the level of service that they expect.

But that's just the consequences to the business. How does that affect you?

I've seen people burn out after dealing with a stressful emergency for 72 hours straight. I've been in war rooms where shouting matches happen while engineers try to deal with the emergency. And I know people who just quit their jobs out of frustration right in the middle of a disaster recovery incident.

Knowing how to handle emergencies can literally make or break your career. And it's even more challenging when you have very limited resources.

In this video, I talk about H.O.W. to handle emergencies - especially when you have very limited resources.

Most IT Professionals - DBAs and sysadmins alike - spend a lot of time on the technical details of an HA/DR implementation. They learn as much as they can about a specific feature, the best practices approach to deployment, and the caveats specific to their use case. And every time an outage occurs, the focus keeps coming back to the tech.

Did the configuration change recently?
Was there a missing cumulative update?
Should we increase the threshold values?
Was it the server reboot that caused the issue?

While an HA/DR solution is technical in nature, focusing on the tech is the main reason why outages and other related problems keep happening. And unless you step up as the trusted technical advisor in your organization, you're going to get all the blame when an outage occurs.

In this video, I talk about the reasons why focusing on tech is the worst way to deploy an HA/DR solution. I'll also talk about a framework that I use every time I deploy an HA/DR solution that guarantees covering all bases.

Implementing a SQL Server HA/DR solution does not stop after production go-live. Once the system is ready for users, your job switches to operational support. This is where care and feeding is needed to make sure that the databases are highly available and will not be offline for an extended period of time.

This goes beyond installing patches, performing scheduled maintenance, and making configuration changes.

What if something unexpected happens? Something horribly goes wrong? How do you respond? Do you have history in your ticketing system where this has happened before? How do you deal with it?

It's not enough that you know Always On Availability Groups. Or failover clustering. Or log shipping. Or backups. Or dealing with data corruption.

When dealing with an unexpected emergency, I noticed that most tech professionals do not have these 3 things. Maybe they have one or two but not all 3. To properly support an HA/DR solution, you need all of these.

In this video, I talk about the 3 things you need to properly support an HA/DR solution.

Your effectiveness and success on the job depend so much on your ability to solve problems. In fact, you get paid to solve technical problems every day. Effective problem solving is a very important skill as a SQL Server DBA and IT professional, isn't it?

But . . .

How many times have you faced a technical problem and spent too much time analyzing it? Despite having dealt with something similar before? But you're not making any progress. You're stuck, trying to figure out what you've been doing wrong and why you cannot solve this problem. Even when you have the right experience and skill set to solve it.

What's worse is when you are running out of time and your boss is bugging you for updates. Yet you're terrified to provide any response. Because you don't want him to find out that you couldn't figure out a solution. Remember that outage you were trying to troubleshoot while your boss kept asking?

It happens to the best of us - even the smartest DBAs and senior, more experienced engineers. And when it does, it affects your confidence until you're wasting time trying to research the right solution. And the one solution you think is going to solve it ends up crashing the system or causing performance issues.

In this video, I talk about how your problem solving skills can cause more problems if not done right.

A big concern for most DBAs and sysadmins supporting SQL Server Always On Availability Groups is that they're stuck doing operational support. And they have never had the opportunity to design a solution from the ground up.

Most of the time, someone will design and implement the solution while they end up supporting it after production go-live.

And with no proper hand-off and no proper training, they are left trying to figure out what to do when problems come.

They're stuck dealing with issues that happen over and over again.

Are you wondering why you've been supporting SQL Server Always On Availability Groups for years...but have never had the opportunity to be the lead engineer on an implementation project?

And do you feel like this is causing a lot of frustration in how you do your work?

In this video, I cover how being an operational DBA is contributing to all of these and hurting your career.

When something "just works", we tend to have the attitude of "set it and forget it". This is especially true when everything is smooth sailing with the day-to-day operations.

But when a major disaster happens - a natural calamity, a hardware failure, an end-user mistake, etc. - stress levels go through the roof. We struggle to restore everything back to normal - from restoring databases to activating the disaster recovery plan, if one even exists. It's no wonder that most technical professionals burn out after dealing with a major outage.

In this video, I will talk about your responsibilities as an IT Professional for dealing with disasters so you can minimize the stress and chaos of dealing with outages and downtime.

Deploying SQL Server Always On Availability Group solutions is not an easy task. There are a lot of moving parts outside of SQL Server that Always On Availability Groups depend on. There's the network, Active Directory, DNS, Windows Server Failover Cluster, etc.

And just because Microsoft provides a wizard that guides you through the creation process with a few mouse clicks doesn't mean you are doing it right.

What's worse is the complicated and confusing documentation that doesn't explain what the wizard is doing. Even when you do your own research, it gets even more frustrating. It's one article contradicting another YouTube video.

Whether you are using the wizards or using deployment scripts, are you aware of these deadly deployment mistakes? How sure are you that your deployment is not an outage waiting to happen?

Have you wasted a lot of time and effort trying to learn everything that you can...but not sure about whether you have a rock-solid solution?

Are you 100% sure that your solution will meet your company's recovery objectives (RPO/RTO) and service level agreements (SLA)?
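
To make that last question concrete, it can help to translate an availability target into a downtime budget. The following is a minimal sketch, not a formal SLA calculation: the function name, the 365-day year, and the sample targets are assumptions chosen for illustration.

```python
def allowed_downtime_minutes(availability_pct: float, period_days: int = 365) -> float:
    """Convert an availability target (e.g. 99.9) into minutes of allowed downtime per period."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (100 - availability_pct) / 100


# Hypothetical targets, shown only to illustrate how quickly the budget shrinks
for target in (99.0, 99.9, 99.99):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% availability allows about {minutes:.1f} minutes of downtime per year")
```

A downtime budget like this covers availability only. RPO (how much data you can afford to lose) and RTO (how long recovery is allowed to take) still need to be checked separately against what the solution can actually deliver.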

One deadly mistake can cost more than an outage. It can literally become a resume-generating event.

In this video, I'll talk about these deadly SQL Server Always On Availability Group deployment mistakes you should avoid.

If you’re even a tiny bit concerned that your deployments are not properly configured because you may have made these deadly mistakes, then, don't miss this episode.

One of the advantages of working as an independent consultant is the opportunity to work with SQL Server DBAs with different backgrounds from different cultures. My curiosity would lead me to ask questions about their experience working with Always On Availability Groups.

A large number of them do not have any experience working with AGs. Despite the fact that they have been a DBA for years.

The biggest challenge is that their lack of experience is preventing them from pursuing better opportunities. They could not even work on projects involving AGs.

If you've been a SQL Server DBA for years now yet still do not have any experience with AGs, it's probably because of these three things.

In this video, I'll talk about the 3 things preventing you from having any experience with SQL Server Always On Availability Groups.

We technical professionals have this strong urge to try and do EVERYTHING ourselves. And it doesn't matter whether you're the only DBA in the company or just one in a large team. Maybe you watched the TV series MacGyver when you were young and told yourself, "I want to become like him." And with the availability of information on the internet, we have all the more reason to do so.

So, we try and do it ourselves. We take pride in telling others "I did it myself" or "I'm self-taught".

But...

We forgot how difficult it was to do it ourselves. Because the reality is that it was really HHAARRDDD.

Thousands of hours wasted figuring out what is and is not supported. Reading a ton of Microsoft documentation. Not to mention documentation for the third-party platforms that SQL Server runs on. SQL Server on VMware. Or maybe Nutanix. What about running SQL Server HA on AWS? It's hard to make sense of what you are reading when there's conflicting information.

What about the months you spent trying to figure out why the failover cluster isn't communicating with the nodes in another data center? Being glued to your screen while sifting through GBs worth of network activity logs. Yet, you still have no definitive answer.

This does not even include your regular day-to-day tasks. Talk about doing 14-hour days for weeks, even months, just to figure out what is causing the issues.

And when your boss starts asking how you are doing, you tell him that everything is going great. Maybe because it's embarrassing to admit that you are not making any progress.

Being a jack-of-all-trades is great when you're only doing simple things like taking backups and performing routine tasks.

But not when you're responsible for high availability solutions.

In this video, I'll talk about how being a jack-of-all-trades is putting your high availability solutions at risk. Not to mention your career if you end up causing extended outages and cannot figure out how to resolve them. I'll also share tips on how to tap into your team's skills to make sure your high availability solutions are well taken care of.

As technical professionals, we get asked a ton of technical questions as part of our day-to-day job:

How do I configure a readable SQL Server Always On Availability Group secondary replica?
What is the best quorum configuration for my Windows Server Failover Cluster?
How do I set up Distributed Availability Groups across multiple data centers?
What is the best way to upgrade an existing database in a SQL Server Always On Availability Group with minimal downtime?

People come to us because they know we have technical answers for their technical problems. We, on the other hand, try our very best to provide as much detailed information as we can in the hopes that it will solve their problem. It's a force of habit, like second nature: we respond to technical questions with technical answers.

Maybe because we want to prove that we're smart? Or that we know our stuff? Pride maybe, not wanting others to look down on us for not knowing the answer to a technical question that we're supposed to know?

But have you ever thought that providing a technical answer to a technical question is doing a disservice to the person asking? That you are actually causing more harm instead of helping?

As IT professionals, we get paid to solve technical problems. However, the reality is that by providing technical answers to technical questions, we are contributing more to the problem rather than solving it.

In this video, I'll talk about why a technical answer is never the right response to a technical question. I'll share the strategy that I use to respond to technical questions that provides clarity to the person asking - and eventually leading them to solving the real problem.

So, your company decided to deploy a SQL Server Always On Availability Group. Since your team does not yet have the skills to take on the project, your boss decided to hire an external consultant. He came in highly recommended, with plenty of social proof from online articles, blog posts, and videos. You are convinced that he can get the job done.

But...

Is there something he's not telling you?

Do you feel like he's skipping some vital information about SQL Server Always On Availability Groups that you should know about?

Is he using the most common consultant tricks on your company?

As both a client and consultant myself, I've had my fair share of working with other consultants. There are a few things I observed about the majority of the consultants that I've worked with.

In this video, I'll share what your SQL Server consultants and experts are probably not telling you about your high availability and disaster recovery solutions. Because I made these same mistakes when I was starting out as a consultant.

With SQL Server 2012 already out of mainstream support, customers are planning to upgrade to the latest version of the supported database platform. But upgrading and migrating come with their own set of problems if you're not prepared for them. And it can become even more problematic when you're dealing with a high availability solution.

In this video, I talk about how you can eliminate the headaches of upgrading and migrating your SQL Server databases. I also talk about the main considerations when performing an upgrade and migration so you can better prepare for them.

It's that time of the year when most organizations enforce a change freeze period. Nobody is allowed to make any significant changes to the system unless it is for an emergency.

It's tempting to slow down and take a breather during change freeze. Especially since there's not a lot of activity going on. And because of that, it's the perfect time to do these 3 things. They're not as stressful and demanding as your normal day-to-day tasks. But they make a huge impact on the environment and on your professional growth.

In this video, I'll talk about the 3 things you can do to maximize your change freeze period. The sooner you can do them, the better prepared you are to jump on the opportunity to implement them after the change freeze period ends.

The first few weeks of the new year are the perfect time to plan for the year ahead. Especially if you're coming out of a change freeze period during the holiday season. This applies to your high availability and disaster recovery solutions.

This is also a great opportunity to come up with creative ways to show your boss what you are capable of. Not only will you be able to improve your existing HA/DR solutions, you'll also prepare yourself for that next performance review.

This video gives you some ideas for making the most of your HA/DR solutions.

The news about the Southwest Airlines and FAA system outages has come into the spotlight as travelers were either stranded or delayed in their flights. Both of these incidents have one thing in common: outdated legacy systems.

The reality is that we still have a lot of legacy systems running businesses. I'm sure you have a few that you need to maintain. But there are a lot of risks associated with keeping legacy systems, especially in your high availability and disaster recovery strategy.

In this video, I'll cover the impacts of maintaining outdated legacy systems in your company's overall HA/DR strategy and what you can do to prevent costly incidents like these.

With business operations depending so much on data, disaster recovery has become a NECESSITY. Companies can no longer afford an extended outage or they risk going out of business.

As an IT professional taking care of SQL Server, it's your responsibility to protect it from disasters.

But protecting SQL Server from disasters goes beyond what version and edition to choose and features to implement. It requires setting aside your technical expertise and being very clear about why you are implementing it in the first place.

In this video, I'll talk about how to protect SQL Server from disasters so that you'll be prepared before it even happens.

The cost of software licenses and hardware are the two most common considerations when implementing an HA/DR solution. Most of the project budget will be spent on these two things plus the implementation.

What most companies don't realize is that there are hidden costs that go beyond the implementation. And unless they are included in the budget, the solution will end up costing more in the long run.

In this video, I'll talk about the hidden costs of implementing an HA/DR solution so that you can prepare for them even before deploying to production.

We have been taking backups to protect our data for more than 70 years after the creation of the first storage device. And despite all the talk around the cloud taking over the responsibilities of doing backups, we're still doing them.

Yet, despite every best effort, we still lose data. And with new technology risks like ransomware coming out of nowhere, even the backups we thought could protect us are getting compromised. We're not only losing data, we're also being held to ransom for the digital assets we're supposed to protect.

There's a reason why.

In this video, I'll talk about the 3 most common mistakes that sysadmins and DBAs make when working with backups. The sooner you can deal with these, the better prepared you are for when real disaster strikes.

Most veterans in the IT industry would say, if you haven't committed a serious mistake in your job, you haven't made it yet. That's not to say you should be intentionally making mistakes at work and causing damage to your production databases. That's not the point.

Learning from someone else's mistakes is a great opportunity to hack your growth. In the context of database backups, it's a way to prevent, if not avoid, unexpected disasters that could potentially cause loss of data or - even worse - loss of revenue.

In this video, I highlight the mistakes I made dealing with backups, both for databases and infrastructures, and what you can learn from them. We can both share a laugh at the top 3 that I can remember (I have almost a dozen) so you don't have to experience them yourself.

When your organization decides to standardize on using a 3rd-party backup tool for your entire network, you need to make sure that it has support for SQL Server. This is where you start looking at capabilities for compression, deduplication, storage efficiency, etc.

But beyond supporting native SQL Server backup APIs, you need to look at what the 3rd-party backup tool cannot do for you. Because this is the part that you have to do on your own.

In this video, I cover what most 3rd-party backup tools CANNOT do for you. This will steer you in the right direction for when you implement a 3rd-party backup tool that is right for SQL Server.

The primary purpose of a backup is to have a working copy of the data that can be recovered in the event of an unexpected failure. But just because you have a backup doesn't mean it will actually serve its purpose.

The worst situation to be in is when you desperately need the backups and realize that they don't work. It's like jumping off a plane and not realizing that the parachute doesn't work until it's too late. You want to make sure that your backups actually work. And that starts with having the right backup strategy.

In keeping with the theme on backups, this video covers the backup strategies that can cause unexpected disasters. You can start re-evaluating your backup strategies to make sure that they actually work when you need them.

Designing and implementing a high availability and disaster recovery solution is a complex and expensive undertaking. It requires teams with different expertise to make sure that everything works the way you expect it to.

The most frustrating situation to be in is when you've spent the effort and money implementing a solution that causes more issues than the problems that it is supposed to solve. Like when a high availability solution goes offline for hours when it's supposed to minimize downtime. Or when a disaster recovery solution cannot be activated when the need arises.

In this video, I cover the 3 things that your SQL Server HA/DR solutions need for them to actually work. Avoid the temptation to implement any type of HA/DR solution without these 3 things.

Dealing with high availability and disaster recovery (HA/DR) solutions is a team sport. You cannot be a one-man team - even if you literally are the only person in the IT team.

This means you will be forced to work with different teams outside of your own. In much larger organizations, segregation of duties can cause a lot of headaches even when doing a very simple task. The SQL Server DBAs I worked with had to deal with the team that takes care of the Windows servers (I'm guessing you do, too). And most Windows and sysadmin teams dictate how the Windows Server Failover Clustering is deployed even when they don't even know what SQL Server Always On Availability Groups are and how they work.

It can be frustrating when your SQL Server databases are inaccessible, you can't properly respond to incidents, and you're wasting your time getting all the different teams on the same page on what needs to be done.

And when you need to troubleshoot an issue, it would take a long time to figure out that it's not even a SQL Server problem but one of the external dependencies.

In this video, I'll talk about how to deal with external dependencies the right way to make sure your SQL Server databases are highly available.

I didn't know this was a thing until a few weeks ago when I saw a blog post about it.

Since 2011, March 31st has become a day to create awareness on the importance of taking backups to protect data and prevent data loss.

And while all the online conversations revolve around the tools and technology needed to do backups, this week's episode of the Practical SQL Server HA/DR Show will focus on our responsibilities as IT professionals to involve non-technical end users in our pursuit to protect both enterprise and end-user data.

Because protecting data is not just the IT team's responsibility. It's everyone's responsibility.

If you're at a stage in your career where you now have responsibilities to design and build HA/DR solutions, this is for you.

If your main responsibilities revolve around day-to-day operations, keep scrolling. This ain't for you.

In this video, I'll share lessons from past projects I've been involved with on how to design and build resilient HA/DR solutions. Because even if your only responsibility is the database platform, there are other things that you need to consider.

In keeping with the theme of showing some love to my developer and software engineer friends...

Having been a developer in my previous life, I understand the challenges and demands of the job. Most of the time, the goal is to ship code fast. My goal as a developer was to make something work. Nobody told me that my code would affect the availability of the entire stack.

When you mostly rely on the infrastructure folks to take care of high availability and disaster recovery, developers almost never think about this. Until you get woken up at 3AM because the app that you wrote crashed. And everybody else is waiting on you to release an emergency fix.

In this video, I'll talk about your responsibilities as a backend developer or data engineer that affect HA/DR.

You'll hear me say this a lot: high availability and disaster recovery (HA/DR) is a team sport. Every member of the IT team - from the developers, to the database administrators, down to the operational support staff - needs to be on the same page with regards to HA/DR.

But far too often, an HA/DR solution is implemented without engaging everyone on the team. With organizational silos, what the infrastructure team implemented for HA/DR might not necessarily be the right one for the databases. And this disconnect comes at a cost.

In this video, we'll explore the cost of not being on the same page with other IT teams. As a SQL Server DBA, your HA/DR solutions won't matter if other teams are not even aware of what you're doing.

When you're deploying HA/DR solutions, it's not enough to validate that your vendor officially supports the entire stack.

Beyond simply checking for support based on available documentation, these three things are key in order for your HA/DR solutions to be truly supported.

High availability and disaster recovery solutions require more than the technical specifications. In today's day and age of mission-critical and highly-sensitive data, you need more than the technical specifications to meet expectations.

In this video, I talk about the unwritten expectations that come with your HA/DR solutions. Meeting these expectations is key to making stakeholders and end users happy.

I rarely talk about this topic on the Practical SQL Server HA/DR Show, mainly because I've been living in North America for so long that convenience made me forget how important this topic is.

Being in California and experiencing earthquakes reminded me of how important this is in designing HA/DR solutions.

In this video, I talk about what business impact analysis means and how it needs to influence your HA/DR solutions.

One of the things I look at when evaluating a customer's high availability solution is the skillset of the operational staff.

That's because the skillset tells me a lot about how they will handle future outages.

What's scary is that the engineers responsible for supporting the mission-critical database infrastructure don't even know how the solution works.

In this video, I talk about the risks of having untrained operational staff managing your HA/DR solutions.

As Windows Server and SQL Server extended support expires, upgrade and migration projects continue to become a part of IT operations.

And if you've ever done either an upgrade or a migration project, you know it's never a straightforward process.

Even more complicated is when you throw high availability and disaster recovery solutions into the mix.

You want to make sure that upgrading and migrating HA/DR solutions don't become a nightmare.

This video covers the major considerations when upgrading and migrating HA/DR solutions.

I haven't really talked about this topic. Maybe because living in North America provided so much comfort and convenience that this has not been a priority for my customers and clients.

But just because it isn't a priority doesn't mean it's not important.

In fact, this should be included as part of every disaster recovery strategy.

This video shines a spotlight on Business Continuity, why it needs to be a part of every DR strategy, and what every IT professional needs to know about it.

I like taking advantage of the first few weeks of the new year to plan for the year ahead. Especially if you're coming out of a change freeze period during the holiday season.

There might have been a lot of pending work that you need to take care of before the year ended. And if you don't prioritize these for your HA/DR solutions, they will end up on the back burner - and eventually be forgotten.

This is also a great opportunity to come up with creative ways to showcase what you are capable of. Not only will you be able to improve your existing HA/DR solutions, you'll have talking points for that next performance review.

This video covers what you can do in the new year to re-evaluate your existing HA/DR solutions.

Need some ideas on what activities you can do to make sure that you're not caught off guard when a disaster strikes?

If you're running SQL Server on VMware, chances are you've already heard of the "new" licensing model.

Say goodbye to perpetual licenses and hello to subscription licenses.

This has a direct impact on how you run your infrastructure. And if you're oblivious to the new licensing model, your executives will make sure this becomes your priority.

In this video, I'll provide practical and actionable steps to help you navigate this change.

Most administrators responsible for SQL Server databases tend to focus on high availability and/or disaster recovery of the data platform.

But regardless of the complexity of the solution, the data platform is just one piece of the puzzle. If you don't take the other pieces seriously, outages become inevitable.

In this video, we'll look at what else you need to consider outside of your data platform's high availability and disaster recovery.

SMH in texting jargon stands for "shaking my head".

Oftentimes, these things are not considered in planning for HA/DR. And they make me want to shake my head.

In this lesson, we will learn who needs to be a part of our team, what other hardware we have not considered, and what other storage media are missing from our toolbox. At the end of this lesson, we'll be able to create our own list of things that we need to include in our high availability and disaster recovery plan even before we see SQL Server.

This demo video is a compilation of different database recovery techniques that SQL Server DBAs should be familiar and comfortable with. We will look at recovering a database to a specific point in time, isolating critical objects or using table partitioning as an HA/DR option (more commonly called online piecemeal restore) and performing page-level restores.

It's not enough to protect your user databases. System databases are at the core of a SQL Server instance. Without them, you won't be able to start your SQL Server service. In this lesson, we will look at what they are and what they are responsible for. We will also look at how to protect the system databases and how we can prepare in case the system databases are unavailable.

This is the demo that accompanies the lesson on protecting system databases. We will look at recovering a SQL Server instance when the drive that hosts the system databases becomes unavailable and how to move the system databases to a different drive as part of your disaster recovery process.

This lesson talks about database mirroring, how it works, the different scenarios that we need to deal with and the factors affecting failover time. The concepts learned in this lesson will form the basis for understanding the new SQL Server 2012 Availability Groups feature.

This is the first demo that accompanies the lesson on database mirroring. We will look at a common database mirroring configuration - SQL Server instances that are in the same Active Directory domain. We will also simulate some of the scenarios we talked about in the lesson - failure of the mirror database and simultaneous failure of both the mirror server and the witness server - and how those failures affect database availability and the SEND and REDO queues in a database mirroring configuration.

This is the second demo that accompanies the lesson on database mirroring. We will look at configuring database mirroring across servers that are not members of an Active Directory domain. The steps in doing so are tricky so make sure to test this out in your lab prior to implementing it in your production environment.

This lesson describes how SQL Server log shipping works and the underlying concepts behind it. At a high level, log shipping is simply an automated process of taking a log backup, copying it and restoring it on a standby database. It's really no different from how transaction log backup and restore works.
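To make the backup, copy, and restore cycle concrete, here is a minimal Python sketch that only builds the steps as text. It is not how the built-in log shipping jobs are implemented. The function name, folder paths, and file naming scheme are hypothetical; the only real syntax assumed is the T-SQL BACKUP LOG and RESTORE LOG ... WITH NORECOVERY statements.

```python
from pathlib import PureWindowsPath


def log_shipping_cycle(database, primary_dir, standby_dir, sequence):
    """Describe one log shipping cycle as (step, action) pairs.

    Nothing is executed here; the folder layout and the file naming
    scheme are illustrative assumptions, not SQL Server defaults.
    """
    file_name = f"{database}_{sequence:06d}.trn"
    source = PureWindowsPath(primary_dir) / file_name
    target = PureWindowsPath(standby_dir) / file_name
    return [
        # Step 1: take a transaction log backup on the primary.
        ("backup", f"BACKUP LOG [{database}] TO DISK = N'{source}';"),
        # Step 2: copy the backup file to the standby server.
        ("copy", f"copy {source} -> {target}"),
        # Step 3: restore it on the standby, leaving the database in a
        # restoring state so the next log backup can still be applied.
        ("restore", f"RESTORE LOG [{database}] FROM DISK = N'{target}' WITH NORECOVERY;"),
    ]


for step, action in log_shipping_cycle("Sales", r"E:\LogShip\Out", r"F:\LogShip\In", 1):
    print(f"{step:8}{action}")
```

Each cycle repeats the same three steps with the next sequence number, which is why the process is really no different from a manual transaction log backup and restore.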

This is the demo that accompanies the lesson on SQL Server log shipping. We will go through the traditional way of configuring log shipping for a SQL Server database using SQL Server Management Studio. We will also look at a potential solution that you can use even on editions of SQL Server that don't officially support log shipping, such as the Express editions.

This lesson discusses the fundamentals of Windows Server Failover Clustering in the context of a SQL Server failover clustered instance. We will look at the different concepts to understand how Windows Server Failover Clustering works to support highly available SQL Server databases. The concepts learned in this lesson also form the foundation for understanding the SQL Server Availability Groups feature.

NOTE: This topic alone demands its own dedicated course due to the scope. If you or your team are interested in learning more about this, schedule a call with me using my online calendar

https://learnsqlserverhadr.com/call

This is the first demo that accompanies the lesson on SQL Server failover clustering fundamentals. We will build a traditional 2-node SQL Server failover clustered instance on Windows Server 2012 from start to finish and install SQL Server 2012 Service Pack 1.

SQL Server 2012 natively supports multi-subnet/geographically dispersed clusters. In this second demo on failover clustering, we will build a 2-node SQL Server failover clustered instance that spans multiple geographical locations. We will also look at the different network configurations that need to be considered to achieve our recovery objectives and service level agreements.

This is an update to the SQL Server failover clustering installation. We will build a traditional 2-node SQL Server 2022 failover clustered instance on Windows Server 2022 - from start to finish.

Part 1 is all about installing, creating, and configuring the Windows Server 2022 failover cluster.

Download the Cluster Preparation Checklist from here

https://learnsqlserverhadr.com/clusterprepchecklist

This is an update to the SQL Server failover clustering installation. We will build a traditional 2-node SQL Server 2022 failover clustered instance on Windows Server 2022 - from start to finish.

Part 2 is all about installing a SQL Server 2022 failover clustered instance and adding nodes.

This is an update to the SQL Server failover clustering installation. We will build a traditional 2-node SQL Server 2022 failover clustered instance on Windows Server 2022 - from start to finish.

Part 3 is all about installing updates on a SQL Server 2022 failover clustered instance.

As an added bonus, it will also show how you can save time installing the updates by slipstreaming them during the installation.

This lesson discusses the Availability Groups feature introduced in SQL Server 2012. We will be drawing parallels between Availability Groups and the concepts behind database mirroring and failover clustering to better understand the feature. You will be surprised that you actually know some of the things covered in this lesson based on your previous knowledge of other existing technologies.

NOTE: This topic alone demands its own dedicated course due to the scope. If you or your team are interested in learning more about this, schedule a call with me using my online calendar

https://learnsqlserverhadr.com/call

This is the demo that accompanies the lesson on SQL Server 2012 Availability Groups. We will convert an existing database mirroring and log shipping configuration into an Availability Group configuration. In the process, we will configure readable secondaries and read-only routing to redirect read-only workloads to the readable secondaries of your choice.

Reviews summary

Practical SQL Server HA/DR strategies from experience

According to learners, this course provides a highly practical and insightful look into SQL Server High Availability and Disaster Recovery (HA/DR). Students appreciate the instructor's real-world experience and war stories, finding the content goes beyond technical documentation to cover non-technical challenges like team dynamics and business considerations. The course is seen as valuable for professionals, offering guidance on choosing and implementing the right HA/DR solutions. While demos are primarily on SQL Server 2012, recent updates adding content on SQL Server 2022 and Windows Server 2022 demonstrate the instructor's effort to keep the material relevant.

Covers core HA/DR technologies and concepts.

"The course covers the core HA/DR technologies like Availability Groups and Failover Clustering."

"I gained a solid understanding of how the different HA/DR features work."

"The demos on setting up clusters were helpful in visualizing the concepts."

Recent additions cover newer SQL Server versions.

"The inclusion of SQL Server 2022 and Windows Server 2022 installation demos is a great update."

"It's good to see the instructor is adding content to keep up with newer versions, addressing the initial focus on 2012."

"The course is evolving with the technology, which is important for long-term value."

Aimed squarely at IT professionals.

"This course is clearly designed for DBAs and IT pros who manage SQL Server in production."

"The focus on critical business needs like RPO/RTO and SLA is exactly what I needed for my work."

"It helps prepare you for real-world challenges faced by database administrators."

Addresses non-technical aspects of HA/DR.

"It was great to see coverage of things like team silos and communication, which are huge in HA/DR projects."

"The course highlights that HA/DR is a team sport and not just about the technology."

"I never considered the 'curse of silos' before, but this course showed me its impact on HA/DR success."

Instructor shares valuable personal experience.

"The instructor's depth of knowledge and willingness to share lessons learned is a major plus."

"Listening to the instructor share his extensive experience was incredibly insightful."

"I felt confident learning from someone with so much hands-on experience in the field."

Focuses on practical scenarios and experience.

"This course is packed with real-world examples and insights you won't find in official documentation."

"I really appreciated the 'war stories' and practical advice from someone who has been there."

"The instructor's personal experience makes the complex topics much more understandable and applicable to my job."

"I learned practical strategies for dealing with common HA/DR issues in a production environment."

Main technical demos use an older SQL version.

"Some core demos are based on SQL Server 2012, although the concepts are explained to be version-agnostic."

"While concepts apply, I initially wished the primary demos used a more recent version than 2012."
