Sean Bradley

We will learn the basics of Prometheus so that you can get started as soon as possible. Follow the exercises and try them out for yourself so that you can see it working.

In this course we will quickly build a bare bones Prometheus server from scratch, in the cloud and on your own Ubuntu 20.04 LTS server.

We will keep it simple and set it up on a default, unrestricted, un-customised Ubuntu 20.04 LTS. You will then be able to match what you see in the videos and copy/paste directly from my documentation and see the same result. Once you have the basic experience of seeing Prometheus work, you will be able to problem solve in a more directed manner, and apply your knowledge to other operating systems in the future.

At the end of the course, you will have a basic Prometheus setup in the cloud, behind a reverse proxy, with SSL, a domain name and Basic Authentication. It will have several custom recording rules, several alerting rules, several node exporters (local and external), an Alert Manager that can send emails via an external SMTP service, and a Grafana install configured with the Prometheus data source and several dashboards.

What's inside

Learning objectives

  • Install Prometheus and see it working
  • Build a bare bones Prometheus server from scratch, in the cloud
  • Learn how to set it up as a service so that it is always running in the background
  • Configure it to be behind an Nginx reverse proxy
  • Configure a domain name and add SSL to ensure transport layer encryption for the user interface
  • Add Basic Authentication to restrict user access
  • Install several node exporters, local and external, manage their firewall rules and compare the differences
  • Learn the basics of querying metrics, from simple metrics, instant vectors and range vectors to functions, aggregates and sub queries
  • Create custom metrics from complicated queries and save them as recording rules
  • Create alerting rules and demonstrate inactive, pending and firing states
  • Set up an SMTP server to send email alerts
  • Configure Alert Manager to send alerts from Prometheus
  • Install Grafana
  • Set up the Prometheus data source inside Grafana
  • Set up Prometheus dashboards for the main Prometheus service and node exporters

Syllabus

Introduction

We will set up a dedicated Prometheus server.

Before you start, you will need a Linux server, preferably an unrestricted Ubuntu 20.04 LTS Server with root access, since all the commands demonstrated in this course were executed on Ubuntu 20.04 LTS Server.

You can use other operating systems, such as CentOS, but all commands in the course are prepared for Ubuntu 20.04, so you will experience some differences in syntax or equivalent commands, which you may need to research yourself if I can't help you.

Once you have an Ubuntu 20.04 LTS server ready, you can start.

SSH onto your server (on Windows I use PuTTY as my SSH client) and install Prometheus.

# sudo apt install prometheus


This will have installed two services, Prometheus and the Prometheus Node Exporter. You can verify their status using the commands below. (Press Ctrl-C to exit the status log.)

# sudo service prometheus status

# sudo service prometheus-node-exporter status


The install also created a user called prometheus. You can see which processes it is running by using the command

# ps -u prometheus


If Prometheus has started successfully, you can visit it at http://[your ip address]:9090
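The same check can be made from the server's own shell, which helps when a firewall blocks port 9090 from outside. The sketch below is only illustrative: `check_prometheus` is a hypothetical helper name, and it assumes `curl` is installed and that Prometheus listens on its default port 9090. It calls the `/-/healthy` endpoint that Prometheus provides for health checks.

```shell
# Hypothetical helper: prints "reachable" when Prometheus answers its
# built-in /-/healthy endpoint, otherwise "unreachable".
# $1 is the base URL; it defaults to the local server on port 9090.
check_prometheus() {
    base_url="${1:-http://localhost:9090}"
    if curl -fsS --max-time 5 "$base_url/-/healthy" >/dev/null 2>&1; then
        echo "reachable"
    else
        echo "unreachable"
    fi
}

check_prometheus
```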

Adding a domain name is optional, but it is useful if your Prometheus server is accessible from the internet, you want it to look more professional to clients and you want to have fewer problems sending emails from it.

I have gone onto my domain name provider and added an A record that points to the IP address of my new Prometheus server.

Example,

prometheus.sbcode.net. IN A 134.209.224.39

Your domain and IP will be different, and note that it may take some time for the DNS record to propagate across the internet.





One option to help secure our Prometheus server is to put it behind a reverse proxy so that we can later add SSL and an Authentication layer over the default unrestricted Prometheus web interface.

We can use Nginx.

# sudo apt install nginx


CD to the Nginx sites-enabled folder

# cd /etc/nginx/sites-enabled


Create a new Nginx configuration for Prometheus

# sudo nano prometheus


And copy/paste the example below

-------------------------------

server {
    listen 80;
    listen [::]:80;
    server_name  YOUR-DOMAIN-NAME;

    location / {
        proxy_pass           http://localhost:9090/;
    }
}

----------------------------

Save, then test that the new configuration has no errors

# sudo nginx -t


Restart Nginx

# sudo service nginx restart

# sudo service nginx status


Test it by visiting again

http://YOUR-DOMAIN-NAME


We will now add transport encryption to the Prometheus web user interface.

Since I have already set up the domain name, I can get a free certificate using Certbot.

Certbot will install a Let's Encrypt SSL certificate for free.

Ensure your domain name has propagated before running Certbot.

Your domain and IP will be different from mine, and note that it may take some time for the DNS record to propagate across the internet.

On my server, I will run

# sudo snap install --classic certbot


Now we can run Certbot.

# sudo certbot --nginx


Follow the prompts and select the domain name I want to secure.

Next open the Nginx Prometheus config file we created earlier to see the changes.

# sudo nano /etc/nginx/sites-enabled/prometheus


Everything is great so far, but anybody in the world with internet access and the URL can visit my Prometheus server and see my data.

To solve this problem, we will add user authentication.

We will use Basic Authentication.

SSH onto your server and CD into your /etc/nginx folder.

# cd /etc/nginx


Then install apache2-utils (on Ubuntu) or httpd-tools (on CentOS).

On Ubuntu

# sudo apt install apache2-utils


On CentOS

# sudo yum install httpd-tools


Now we can create a password file. In the command below, I am creating a user called 'admin'.

# sudo htpasswd -c /etc/nginx/.htpasswd admin


I then enter a password for the user.

Next open the Nginx Prometheus config file we created.

# sudo nano /etc/nginx/sites-enabled/prometheus


And add the two authentication properties in the examples below to the existing Nginx configuration file we have already created.

-------------------

server {
    ...
    # additional authentication properties
    auth_basic  "Protected Area";
    auth_basic_user_file /etc/nginx/.htpasswd;
    location / {
        proxy_pass           http://localhost:9090/;
    }
    ...
}

-------------------------------

Save, then test that the new configuration has no errors

# sudo nginx -t


Restart Nginx

# sudo service nginx restart

# sudo service nginx status
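
One way to confirm that the authentication layer is active is to look at the HTTP status code Nginx returns when no credentials are sent. This is a rough sketch: `http_status` is a hypothetical helper, `prometheus.example.com` is a placeholder domain, and it assumes `curl` is installed. With Basic Authentication enabled, a request without credentials should come back as 401, and curl reports 000 when it cannot connect at all.

```shell
# Hypothetical helper: prints only the HTTP status code for a request.
# Any extra arguments (for example -u admin) are passed straight to curl.
http_status() {
    curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$@"
}

SITE="https://prometheus.example.com"   # placeholder, use your own domain

echo "without credentials: $(http_status "$SITE/")"
```

Calling `http_status -u admin "$SITE/"` makes curl prompt for the password rather than taking it on the command line; with valid credentials the status should no longer be 401.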

When you install Prometheus using

# sudo apt install prometheus


it sets up two metrics endpoints.

  • Prometheus : http://127.0.0.1:9090/metrics

  • Node Exporter : http://127.0.0.1:9100/metrics

In this video, I show where the settings are configured for these metrics endpoints, how to enable them, change them and show some of the properties that can be retrieved in the graph expressions field.
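
Both endpoints return plain text, one sample per line, with comment lines starting with `#`. As a small illustration, the sketch below pulls a single metric out of that text from the shell. `metric_value` is a hypothetical helper, and `node_load1` is used on the assumption that the node exporter's load average collector is enabled, which is its default.

```shell
# Hypothetical helper: reads metrics text on stdin and prints the value of
# the sample whose name matches $1 exactly (samples without labels only).
metric_value() {
    awk -v name="$1" '$1 == name { print $2 }'
}

# Ask the local node exporter for its metrics and pick out the 1 minute load average.
curl -s --max-time 5 http://127.0.0.1:9100/metrics 2>/dev/null | metric_value node_load1
```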

Now we will install an external Prometheus Node Exporter on a different server.

# sudo apt install prometheus-node-exporter


Now check the node exporter is running.

# sudo service prometheus-node-exporter status


You can stop, start or restart a node exporter using

# sudo service prometheus-node-exporter stop

# sudo service prometheus-node-exporter start

# sudo service prometheus-node-exporter restart


Node exporter will now be running on http://[your domain or ip]:9100/metrics
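
For the Prometheus server to collect from the new exporter, the exporter has to be listed as a scrape target. The following is a sketch rather than the course's exact file: it writes a minimal configuration to a scratch file in the current directory, `node2.example.com` is a placeholder for your second server, and the job names are assumptions. On Ubuntu the live file is normally `/etc/prometheus/prometheus.yml`; merge the extra target into it by hand rather than overwriting it.

```shell
# Write an example configuration to a scratch file (not the live config).
cat > ./prometheus.example.yml <<'EOF'
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ['localhost:9090']

  - job_name: node
    static_configs:
      # local exporter plus the external one (placeholder host name)
      - targets: ['localhost:9100', 'node2.example.com:9100']
EOF

# promtool ships with Prometheus; validate the file if it is available.
if command -v promtool >/dev/null 2>&1; then
    promtool check config ./prometheus.example.yml
fi
```

After changing the live file, restart Prometheus with `sudo service prometheus restart` and the new target should appear on the Targets page of the web interface.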

You can now block port 9100 externally, but leave it open internally for localhost.

And optionally, you can also allow a specific ip address or domain on the internet to access the port.
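
One possible way to do this is with iptables, which is an assumption here since the text above does not name a firewall tool. The sketch prints the rules by default and only applies them when `DRY_RUN=0` is set. `203.0.113.10` is a documentation placeholder for the address of your Prometheus server, and `run` and `restrict_node_exporter` are hypothetical helper names. Rule order matters: the ACCEPT rules must come before the DROP rule.

```shell
DRY_RUN="${DRY_RUN:-1}"            # 1 = only print the commands
PROMETHEUS_IP="203.0.113.10"       # placeholder address

# Print the command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

restrict_node_exporter() {
    # allow scrapes from the machine itself
    run sudo iptables -A INPUT -p tcp -s 127.0.0.1 --dport 9100 -j ACCEPT
    # allow scrapes from the Prometheus server
    run sudo iptables -A INPUT -p tcp -s "$PROMETHEUS_IP" --dport 9100 -j ACCEPT
    # drop everything else aimed at the exporter port
    run sudo iptables -A INPUT -p tcp --dport 9100 -j DROP
}

restrict_node_exporter
```

Rules added this way are not kept across a reboot unless you save them, for example with a package such as iptables-persistent.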

There may be a time when you want to delete data from the Prometheus TSDB.

Data will be automatically deleted after the storage retention time has passed. By default it is 15 days.

If you want to delete specific data earlier, you can, but you first need to enable the admin API in Prometheus.

# sudo nano /etc/default/prometheus


Add --web.enable-admin-api to the ARGS="" variable. eg,

ARGS="--web.enable-admin-api"


Restart Prometheus and check status

# sudo service prometheus restart

# sudo service prometheus status


You can now make calls to the admin api.

In my example I want to delete all time series for the instance="sbcode.net:9100"

So I run the delete_series api endpoint providing the value to match. eg,

# curl -X POST -g 'http://localhost:9090/api/v1/admin/tsdb/delete_series?match[]={instance="sbcode.net:9100"}'


When I re-execute the Prometheus query, the time series I wanted deleted no longer exist.
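
Note that `delete_series` only marks the matching data as deleted. The space on disk is reclaimed at a later compaction, or sooner if you call the `clean_tombstones` endpoint of the same admin API. A minimal sketch, assuming Prometheus is on `localhost:9090` with the admin API enabled as described earlier (`tsdb_admin` is a hypothetical helper name):

```shell
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Hypothetical helper: POST to one of the TSDB admin endpoints.
# -g stops curl from treating [] and {} in the URL as glob patterns.
tsdb_admin() {
    curl -fsS --max-time 30 -X POST -g "$PROM_URL/api/v1/admin/tsdb/$1"
}

tsdb_admin clean_tombstones \
    || echo "request failed: is Prometheus running with --web.enable-admin-api?"
```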

Now that we have at least two scrape targets, we can begin to run some more interesting queries that involve multiple scrape targets.

We will try basic time series queries, queries with regular expressions, compare the Instant and Range Vector data types, use Functions, Aggregates and Sub Queries.
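
To make these categories concrete, here are example expressions of each kind, sent to the Prometheus HTTP API with curl. Treat it as a sketch: the metric names assume a node exporter that uses the `node_*_total` naming, `promql` is a hypothetical helper, and the same expressions can simply be pasted into the expression field of the web interface instead.

```shell
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Hypothetical helper: run one instant query and print the JSON response.
promql() {
    curl -fsS -G --max-time 10 "$PROM_URL/api/v1/query" --data-urlencode "query=$1"
}

# One example expression per line, from simple to more involved.
while IFS= read -r expr; do
    echo "== $expr"
    promql "$expr" || echo "query failed"
    echo
done <<'EOF'
up
node_cpu_seconds_total{mode=~"idle|iowait"}
node_network_receive_bytes_total[5m]
rate(node_network_receive_bytes_total[5m])
sum by (instance) (rate(node_network_receive_bytes_total[5m]))
max_over_time(rate(node_network_receive_bytes_total[1m])[10m:1m])
EOF
```

In order, the lines show a simple metric (an instant vector), a regular expression label matcher, a range vector, a function applied to a range vector, an aggregate across instances, and a sub query.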

We create some recording rules for the more complicated, commonly used queries whose results we want to store as time series data.
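
A recording rule stores the result of an expression as a new time series under the name given in `record`. The sketch below writes an example rules file to the current directory. The rule name and expression are illustrative assumptions, not necessarily the ones used in the videos, and the file still has to be listed under `rule_files` in `prometheus.yml` before Prometheus will load it.

```shell
# Example rules file in the current directory (not the live one).
cat > ./recording_rules.example.yml <<'EOF'
groups:
  - name: custom_rules
    rules:
      # percentage of memory that is free, per instance
      - record: node_memory_MemFree_percent
        expr: 100 * node_memory_MemFree_bytes / node_memory_MemTotal_bytes
EOF

# Validate the syntax if promtool is installed.
if command -v promtool >/dev/null 2>&1; then
    promtool check rules ./recording_rules.example.yml
fi
```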

Alerting rules are created in Prometheus in much the same way as recording rules. We can use the same prometheus_rules.yml or, if you wish, create a different file, but remember to add a reference to it in the rule_files section of prometheus.yml.
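
As an illustration, the sketch below defines one alerting rule. The rule name, the one minute `for` duration and the labels are assumptions made for the example. While `up == 0` returns nothing the alert is inactive; once it returns a result the alert is pending; and when that has stayed true for the whole `for` period the alert is firing.

```shell
# Example alerting rules file in the current directory (not the live one).
# The quoted 'EOF' keeps the shell from expanding $labels inside the template.
cat > ./alert_rules.example.yml <<'EOF'
groups:
  - name: alert_rules
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
EOF

# Validate the syntax if promtool is installed.
if command -v promtool >/dev/null 2>&1; then
    promtool check rules ./alert_rules.example.yml
fi
```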

Install the Prometheus Alert Manager

# sudo apt install prometheus-alertmanager


It has started a new service called prometheus-alertmanager

# sudo service prometheus-alertmanager status


It is also managed by the user prometheus

# ps -u prometheus


Note that the service is running on port 9093

Visit http://[your domain name or ip]:9093/

We now configure the Prometheus and Alert Manager processes to communicate with each other, and to send alerts when the alerting rules fire.
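
Two pieces of configuration are involved: Prometheus needs to know where the Alert Manager is, and the Alert Manager needs a receiver to deliver to. The sketch below writes both as example files in the current directory. Every host name, address and credential is a placeholder, `email-admin` is a made-up receiver name, and on Ubuntu the live files are usually `/etc/prometheus/prometheus.yml` and `/etc/prometheus/alertmanager.yml`.

```shell
# 1. Fragment to merge into prometheus.yml: where to send alerts.
cat > ./prometheus.alerting.example.yml <<'EOF'
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093']
EOF

# 2. Example alertmanager.yml: send every alert by email through an external SMTP service.
cat > ./alertmanager.example.yml <<'EOF'
global:
  smtp_smarthost: 'smtp.example.com:587'
  smtp_from: 'alertmanager@example.com'
  smtp_auth_username: 'alertmanager@example.com'
  smtp_auth_password: 'CHANGE-ME'

route:
  receiver: email-admin

receivers:
  - name: email-admin
    email_configs:
      - to: 'you@example.com'
EOF

# amtool is the Alert Manager command line tool; validate the file if it is available.
if command -v amtool >/dev/null 2>&1; then
    amtool check-config ./alertmanager.example.yml
fi
```

After merging these into the live files, restart both services so that the changes are picked up.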

I set up a new minimum spec Ubuntu 20.04 LTS server for the purpose of demonstrating how to install Grafana.

Once you have connected to your new server, make sure your package lists are updated.

# sudo apt update


Then ensure that the dependencies for Grafana are installed.

# sudo apt-get install -y adduser libfontconfig1


Now download the package and install it with the Debian package manager.

# wget https://dl.grafana.com/oss/release/grafana_7.2.0_amd64.deb

# sudo dpkg -i grafana_7.2.0_amd64.deb


The install has now completed. You can now start the Grafana service

# sudo service grafana-server start


Check the status

# sudo service grafana-server status


Your Grafana server will be hosted at 

http://[your Grafana server ip]:3000


The default Grafana login is

Username : admin

Password : admin


You have the option to update your password upon first login. You will then be presented with the option to add a new data source and create dashboards and users.

I create a new Prometheus Datasource using the Grafana user interface.

I connect to my Prometheus url, which also uses SSL and has basic auth configured.
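
The same data source can also be described in a Grafana provisioning file instead of being entered through the user interface. This is an alternative to what is shown in the course, sketched under assumptions: the URL, user and password are placeholders, and the file would normally be copied into `/etc/grafana/provisioning/datasources/` and picked up when Grafana restarts.

```shell
# Example provisioning file in the current directory (placeholders throughout).
cat > ./prometheus-datasource.example.yml <<'EOF'
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: https://prometheus.example.com
    basicAuth: true
    basicAuthUser: admin
    secureJsonData:
      basicAuthPassword: CHANGE-ME
EOF
```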


Let's enable some of the default dashboards provided with the Prometheus data source, and download one from the community specifically for the node exporters.

Thanks for taking part in my course.

We have achieved a lot in a small amount of time, and you now know enough to decide whether you want to take Prometheus further.

We have built our dedicated Prometheus server, with multiple node exporters, custom recording rules and alerting rules, behind a reverse proxy with SSL and Basic Authentication. We also know how to create our own specialised queries and how to set up the Alert Manager to, at minimum, send emails via SMTP. We also installed Grafana, can query using PromQL and can install Prometheus dashboards.

If you decide to continue with Prometheus, then a good source of possibilities for your future direction is the Default Port Allocations page, as it shows the hundreds of exporters in development for Prometheus today.

https://github.com/prometheus/prometheus/wiki/Default-port-allocations

© 2016 - 2025 OpenCourser