Add prometheus and promtail definitions

Manav Rathi 2024-03-14 21:56:36 +05:30
parent 477e3fee80
commit dc29ab496f
8 changed files with 205 additions and 15 deletions

@ -1,25 +1,24 @@
# Services
"Services" are various Docker images that we run on our instances and manage
using systemd.
"Services" are Docker images we run on our instances and manage using systemd.
All our services (including museum itself) follow the same
pattern:
* They're meant to run on vanilla Ubuntu instances. The only expectation they
have is for Docker to be installed.
* They're run on vanilla Ubuntu instances. The only expectation they have is for
Docker to be installed.
* They log to fixed, known locations - `/root/var/logs/foo.log` - so that these
logs can get ingested by Promtail.
logs can get ingested by Promtail if needed.
* Each service should consist of a Docker image (or a Docker compose file), and a
systemd unit file.
* To start / stop / cron the service, we use the corresponding systemd command.
* To start / stop / schedule the service, we use systemd.
* Each time the service runs it should pull the latest Docker image, so there is no
separate installation/upgrade step needed. We can just restart the service, and it'll
use the latest code.
* Each time the service runs it should pull the latest Docker image, so there is
no separate installation/upgrade step needed. We can just restart the service,
and it'll use the latest code.
* Any credentials and/or configuration should be read by mounting the
appropriate file from `/root/service-name` into the running Docker container.
@ -31,15 +30,16 @@ sudo systemctl status my-service
sudo systemctl start my-service
sudo systemctl stop my-service
sudo systemctl restart my-service
sudo journalctl --unit my-service
```
## Adding a service
Create a systemd unit file (For examples, see the various `*.service` files in
this repository).
Create a systemd unit file (See the various `*.service` files in this repository
for examples).
If we want the service to start on boot, add an `[Install]` section to its
service file (_note_: there is one more step later):
service file (_note_: starting on boot requires one more step later):
```
[Install]
@ -98,6 +98,7 @@ sudo journalctl --follow --unit example
Services should log to files in `/var/logs` within the container. This should be
mounted to `/root/var/logs` on the instance (using the `-v` flag in the service
file which launches the Docker container or the Docker compose cluster).
Finally, ensure there is an entry for this log file in the
`promtail/promtail.yaml` on that instance. The logs will then get scraped by
Promtail and sent over to Grafana.
If these logs need to be sent to Grafana, then ensure that there is an entry for
this log file in the `promtail/promtail.yaml` on that instance. The logs will
then get scraped by Promtail and sent over to Grafana.
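Before expecting a service's logs in Grafana, it may help to confirm that its log file is actually listed in the Promtail config. The following is a minimal sketch, assuming entries follow the `__path__: /var/logs/<service>.log` pattern used in this repository's `promtail.yaml`; the helper name `has_promtail_entry` is hypothetical.

```sh
# Hypothetical helper: succeeds if the Promtail config at $1 contains a
# __path__ entry for the log file of the service named $2.
# Assumes log files are named /var/logs/<service>.log inside the container.
has_promtail_entry() {
    config_file="$1"
    service="$2"
    grep -q -F "__path__: /var/logs/${service}.log" "$config_file"
}

# Example usage (the config path is an assumption):
#   has_promtail_entry /root/promtail.yaml museum || echo "missing entry"
```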

@ -0,0 +1,32 @@
# Prometheus
Install `prometheus.service` on an instance if it is running something that
exports custom Prometheus metrics. In particular, museum does.
Also install `node-exporter.service` (after installing
[node-exporter](https://prometheus.io/docs/guides/node-exporter/) itself) if it
is a production instance whose metrics (CPU, disk, RAM etc) we want to monitor.
## Installing
Prometheus doesn't currently support environment variables in its config file,
so remember to replace the hardcoded `XX-HOSTNAME` placeholder in addition to
adding the `remote_write` configuration.
```sh
scp -P 7426 services/prometheus/* <instance>:
nano prometheus.yml
sudo mv prometheus.yml /root/prometheus.yml
sudo mv prometheus.service /etc/systemd/system/prometheus.service
sudo mv node-exporter.service /etc/systemd/system/node-exporter.service
```
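The hostname part of the manual `nano` edit could also be scripted before the config is moved into place. This is only a sketch: the helper name `set_prometheus_hostname` is hypothetical, and it assumes the placeholder appears literally as `XX-HOSTNAME` and that the hostname contains no characters special to `sed` (such as `/` or `&`). The `remote_write` credentials still need to be filled in by hand.

```sh
# Hypothetical helper: replace every XX-HOSTNAME placeholder in the config
# file $1 with the hostname $2 (defaults to the output of `hostname`).
set_prometheus_hostname() {
    config_file="$1"
    host="${2:-$(hostname)}"
    tmp_file="$(mktemp)"
    # Write the result back with `cat` rather than `mv`, so the config file
    # keeps its original permissions instead of mktemp's restrictive ones.
    sed "s/XX-HOSTNAME/${host}/g" "$config_file" > "$tmp_file" &&
        cat "$tmp_file" > "$config_file"
    status=$?
    rm -f "$tmp_file"
    return "$status"
}

# Example usage, run on the instance before moving the file to /root:
#   set_prometheus_hostname prometheus.yml
```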
Tell systemd to pick up new service definitions, enable the units (so that they
automatically start on boot going forward), and start them.
```sh
sudo systemctl daemon-reload
sudo systemctl enable node-exporter prometheus
sudo systemctl start node-exporter prometheus
```

@ -0,0 +1,12 @@
[Unit]
Documentation=https://prometheus.io/docs/guides/node-exporter/
Wants=network-online.target
After=network-online.target
[Install]
WantedBy=multi-user.target
[Service]
User=node_exporter
Group=node_exporter
ExecStart=/usr/local/bin/node_exporter

@ -0,0 +1,16 @@
[Unit]
Documentation=https://prometheus.io/docs/prometheus/
Requires=docker.service
After=docker.service
[Install]
WantedBy=multi-user.target
[Service]
ExecStartPre=docker pull prom/prometheus
ExecStartPre=-docker stop prometheus
ExecStartPre=-docker rm prometheus
ExecStart=docker run --name prometheus \
    --add-host=host.docker.internal:host-gateway \
    -v /root/prometheus.yml:/etc/prometheus/prometheus.yml:ro \
    prom/prometheus

@ -0,0 +1,39 @@
# https://prometheus.io/docs/prometheus/latest/configuration/
global:
  scrape_interval: 30s # Default is 1m
scrape_configs:
  - job_name: museum
    static_configs:
      - targets: ["host.docker.internal:2112"]
    relabel_configs:
      - source_labels: [__address__]
        regex: ".*"
        target_label: instance
        replacement: XX-HOSTNAME
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]
    relabel_configs:
      - source_labels: [__address__]
        regex: ".*"
        target_label: instance
        replacement: XX-HOSTNAME
  - job_name: "node"
    static_configs:
      - targets: ["host.docker.internal:9100"]
    relabel_configs:
      - source_labels: [__address__]
        regex: ".*"
        target_label: instance
        replacement: XX-HOSTNAME
# Grafana Cloud
remote_write:
  - url: https://g/api/prom/push
    basic_auth:
      username: foo
      password: bar

@ -0,0 +1,26 @@
# Promtail
Install `promtail.service` on an instance if it is running something whose logs
we want in Grafana.
## Installing
Replace `client.url` in the config file with the Loki URL that Promtail should
connect to, and move the files to their expected place.
```sh
scp -P 7426 services/promtail/* <instance>:
nano promtail.yaml
sudo mv promtail.yaml /root/promtail.yaml
sudo mv promtail.service /etc/systemd/system/promtail.service
```
Tell systemd to pick up new service definitions, enable the unit (so that it
automatically starts on boot), and start it this time around.
```sh
sudo systemctl daemon-reload
sudo systemctl enable promtail
sudo systemctl start promtail
```

@ -0,0 +1,19 @@
[Unit]
Documentation=https://grafana.com/docs/loki/latest/clients/promtail/
Requires=docker.service
After=docker.service
[Install]
WantedBy=multi-user.target
[Service]
ExecStartPre=docker pull grafana/promtail
ExecStartPre=-docker stop promtail
ExecStartPre=-docker rm promtail
ExecStart=docker run --name promtail \
    --hostname "%H" \
    -v /root/promtail.yaml:/config.yaml:ro \
    -v /var/log:/var/log \
    -v /root/var/logs:/var/logs:ro \
    -v /var/lib/docker/containers:/var/lib/docker/containers:ro \
    grafana/promtail -config.file=/config.yaml -config.expand-env=true

@ -0,0 +1,45 @@
# https://grafana.com/docs/loki/latest/clients/promtail/configuration/
# We don't want Promtail's HTTP / GRPC server.
server:
  disable: true
# Loki URL
# For Grafana Cloud, it can be found in the integrations section.
clients:
  - url: http://loki:3100/loki/api/v1/push
# Manually add entries for all our services. This is a bit cumbersome, but
# - Retains flexibility in file names.
# - Makes adding job labels easy.
# - Does not get in the way of logrotation.
#
# In addition, also scrape logs from all docker containers.
scrape_configs:
  - job_name: museum
    static_configs:
      - labels:
          job: museum
          host: ${HOSTNAME}
          __path__: /var/logs/museum.log
  - job_name: copycat-db
    static_configs:
      - labels:
          job: copycat-db
          host: ${HOSTNAME}
          __path__: /var/logs/copycat-db.log
  - job_name: phoenix
    static_configs:
      - labels:
          job: phoenix
          host: ${HOSTNAME}
          __path__: /var/logs/phoenix.log
  - job_name: docker
    static_configs:
      - labels:
          job: docker
          host: ${HOSTNAME}
          __path__: /var/lib/docker/containers/*/*-json.log