Update doc : metrics (#158)

* update metrics documentation

Co-authored-by: erenJag <erenJag>
erenJag 2020-07-29 15:57:33 +02:00 committed by GitHub
parent 5e561e30bd
commit 6f623f9a96
6 changed files with 85 additions and 64 deletions

@ -12,7 +12,7 @@ A collection can be installed by typing `cscli install collection crowdsecurity/
In the same spirit, the [crowdsecurity/sshd](https://hub.crowdsec.net/author/crowdsecurity/collections/sshd) collection will fit most sshd setups !
While {{crowdsec.name}} is running, a quick look at [`cscli metrics`](/observability/metrics/) should help you ensure that your log sources are correctly parsed.
While {{crowdsec.name}} is running, a quick look at [`cscli metrics`](/observability/command_line/) should help you ensure that your log sources are correctly parsed.
## List installed configurations

@ -78,6 +78,6 @@ INFO[0000] Acquisition Metrics:
!!! info
All these metrics are actually coming from {{crowdsec.name}}'s prometheus agent. See [prometheus](/observability/metrics/) directly for more insights.
All these metrics are actually coming from {{crowdsec.name}}'s prometheus agent. See [prometheus](/observability/prometheus/) directly for more insights.

@ -1,11 +1,8 @@
## metrics via {{cli.name}}
```bash
{{cli.name}} metrics
```
This command provides an overview of {{crowdsec.name}} statistics. By default, it assumes that {{crowdsec.name}} is installed on the same machine.
This command provides an overview of {{crowdsec.name}} statistics provided by the [prometheus client](/observability/prometheus/). By default, it assumes that {{crowdsec.name}} is installed on the same machine.
The metrics are split into 3 main sections :
@ -61,58 +58,4 @@ INFO[0000] Parser Metrics:
+--------------------------------+--------+--------+----------+
```
</details>
## metrics via {{crowdsec.name}} prometheus
{{crowdsec.name}} can expose a prometheus endpoint for collection (on `http://127.0.0.1:6060/metrics` by default).
Besides the usual resource consumption monitoring, this endpoint aims to offer a view of {{crowdsec.name}}'s "applicative" behavior :
- is it processing a lot of logs ? is it parsing them successfully ?
- are a lot of scenarios being triggered ?
- are a lot of IPs banned ?
- etc.
All the counters are "since {{crowdsec.name}} start".
### Scenarios
- `cs_bucket_created_total` : number of instantiations of each scenario
- `cs_bucket_overflowed_total` : number of overflows of each scenario
- `cs_bucket_underflowed_total` : number of underflows of each scenario (the bucket was created but expired for lack of events)
- `cs_bucket_poured_total` : number of events poured to each scenario, with the source as a complementary key :
```
# 2030 lines from `/var/log/nginx/access.log` were poured to `crowdsecurity/http-scan-uniques_404` scenario
cs_bucket_poured_total{name="crowdsecurity/http-scan-uniques_404",source="/var/log/nginx/access.log"} 2030
```
### Parsers
- `cs_node_hits_total` : how many times an event from a specific source was processed by a parser node :
```
# 235 lines from `auth.log` were processed by the `crowdsecurity/dateparse-enrich` parser
cs_node_hits_total{name="crowdsecurity/dateparse-enrich",source="/var/log/auth.log"} 235
```
- `cs_node_hits_ko_total` : how many times an event from a specific source was unsuccessfully parsed by a specific parser
```
# 2112 lines from `error.log` failed to be parsed by `crowdsecurity/http-logs`
cs_node_hits_ko_total{name="crowdsecurity/http-logs",source="/var/log/nginx/error.log"} 2112
```
- `cs_node_hits_ok_total` : how many times an event from a specific source was successfully parsed by a specific parser
- `cs_parser_hits_total` : how many times an event from a source has hit the parser
- `cs_parser_hits_ok_total` : how many times an event from a source was successfully parsed
- `cs_parser_hits_ko_total` : how many times an event from a source was unsuccessfully parsed
### Acquisition
- `cs_reader_hits_total` : how many events were read from a specific source
</details>

@ -4,8 +4,8 @@ Observability in security software is crucial, especially when this software mig
We attempt to provide good observability of {{crowdsec.name}}'s behavior :
- {{crowdsec.name}} itself exposes a [prometheus instrumentation](/observability/metrics/#metrics-via-crowdsec-prometheus)
- {{cli.Name}} allows you to view part of the prometheus metrics in the [cli (`{{cli.bin}} metrics`)](/observability/metrics/)
- {{crowdsec.name}} itself exposes a [prometheus instrumentation](/observability/prometheus/)
- {{cli.Name}} allows you to view part of the prometheus metrics in the [cli (`{{cli.bin}} metrics`)](/observability/command_line/)
- {{crowdsec.name}} logging is contextualized for easy processing
- for **humans**, {{cli.name}} allows you to trivially start a service [exposing dashboards](/observability/dashboard/) (using [metabase](https://www.metabase.com/))

@ -0,0 +1,73 @@
{{crowdsec.name}} can expose a {{prometheus.htmlname}} endpoint for collection (on `http://127.0.0.1:6060/metrics` by default).
Besides the usual resource consumption monitoring, this endpoint aims to offer a view of {{crowdsec.name}}'s "applicative" behavior :
- is it processing a lot of logs ? is it parsing them successfully ?
- are a lot of scenarios being triggered ?
- are a lot of IPs banned ?
- etc.
All the counters are "since {{crowdsec.name}} start".
### Scenarios
- `cs_buckets` : number of scenarios that currently exist
- `cs_bucket_created_total` : total number of instantiations of each scenario
- `cs_bucket_overflowed_total` : total number of overflows of each scenario
- `cs_bucket_underflowed_total` : total number of underflows of each scenario (the bucket was created but expired for lack of events)
- `cs_bucket_poured_total` : total number of events poured to each scenario, with the source as a complementary key
<details>
<summary>example</summary>
```
# 2030 lines from `/var/log/nginx/access.log` were poured to `crowdsecurity/http-scan-uniques_404` scenario
cs_bucket_poured_total{name="crowdsecurity/http-scan-uniques_404",source="/var/log/nginx/access.log"} 2030
```
</details>
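A sample line such as the `cs_bucket_poured_total` one above can be split into its metric name, labels and value. The following is a minimal Python sketch, not part of {{crowdsec.name}}: `parse_sample`, `SAMPLE_RE` and `LABEL_RE` are hypothetical names, and the regular expressions only cover the simple `metric{label="value",...} number` shape shown in these examples, not the full exposition format.

```python
import re

# Simplified patterns (assumption): enough for lines shaped like the examples
# in this page, not a complete parser for the Prometheus text format.
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Split one sample line into (metric name, labels dict, float value)."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a metric sample: {line!r}")
    # A metric without labels yields an empty dict.
    labels = dict(LABEL_RE.findall(match.group("labels") or ""))
    return match.group("name"), labels, float(match.group("value"))


if __name__ == "__main__":
    sample = (
        'cs_bucket_poured_total{name="crowdsecurity/http-scan-uniques_404",'
        'source="/var/log/nginx/access.log"} 2030'
    )
    print(parse_sample(sample))
```

For anything beyond quick inspection, a dedicated Prometheus client library is likely a safer choice than hand-written patterns.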
### Parsers
- `cs_node_hits_total` : how many times an event from a specific source was processed by a parser node :
<details>
<summary>example</summary>
```
# 235 lines from `auth.log` were processed by the `crowdsecurity/dateparse-enrich` parser
cs_node_hits_total{name="crowdsecurity/dateparse-enrich",source="/var/log/auth.log"} 235
```
</details>
- `cs_node_hits_ko_total` : how many times an event from a specific source was unsuccessfully parsed by a specific parser
<details>
<summary>example</summary>
```
# 2112 lines from `error.log` failed to be parsed by `crowdsecurity/http-logs`
cs_node_hits_ko_total{name="crowdsecurity/http-logs",source="/var/log/nginx/error.log"} 2112
```
</details>
- `cs_node_hits_ok_total` : how many times an event from a specific source was successfully parsed by a specific parser
- `cs_parser_hits_total` : how many times an event from a source has hit the parser
- `cs_parser_hits_ok_total` : how many times an event from a source was successfully parsed
- `cs_parser_hits_ko_total` : how many times an event from a source was unsuccessfully parsed
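One way to answer the "is it parsing them successfully ?" question is to compare `cs_parser_hits_ok_total` with `cs_parser_hits_total` for each source. The Python sketch below assumes both counters have already been collected into dictionaries keyed by source; the function name `parse_success_ratio` and the numbers in the usage part are hypothetical and only serve as an illustration.

```python
def parse_success_ratio(hits_total, hits_ok):
    """Return {source: ok / total} for every source with at least one hit.

    hits_total: values of cs_parser_hits_total, keyed by source
    hits_ok:    values of cs_parser_hits_ok_total, keyed by source
    """
    return {
        source: hits_ok.get(source, 0) / total
        for source, total in hits_total.items()
        if total > 0  # skip sources that were never read, avoiding a division by zero
    }


if __name__ == "__main__":
    # Made-up counter values, not real measurements.
    total = {"/var/log/auth.log": 200, "/var/log/nginx/error.log": 0}
    ok = {"/var/log/auth.log": 150}
    for source, ratio in parse_success_ratio(total, ok).items():
        print(f"{source}: {ratio:.0%} parsed successfully")
```

Since all the counters are "since {{crowdsec.name}} start", such a ratio describes the whole uptime rather than recent activity.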
### Acquisition
- `cs_reader_hits_total` : how many events were read from a specific source
### Info
- `cs_info` : Information about {{crowdsec.name}} (software version)

@ -22,7 +22,9 @@ nav:
- Observability:
- Overview: observability/overview.md
- Logs: observability/logs.md
- Metrics: observability/metrics.md
- Metrics:
- Prometheus: observability/prometheus.md
- Command line: observability/command_line.md
- Dashboard: observability/dashboard.md
- References:
- Parsers format: references/parsers.md
@ -248,6 +250,9 @@ extra:
Name: Duration
htmlname: "[duration](/references/scenarios/#duration)"
Htmlname: "[Duration](/references/scenarios/#duration)"
prometheus:
name: prometheus
htmlname: "[prometheus](https://github.com/prometheus/client_golang)"
api:
name: API
htmlname: "[API](TBD)"