complete monitoring
17/04/2020
A complete monitoring solution collects logs and metrics from multiple servers, sends them to a central server for aggregation and storage, and provides visualisation and alerting tools.
all in one : netdata
It automatically collects a huge range of metrics but no logs, and it doesn't keep more than a few hours of data.
This is why we often need a more complete stack.
collectors
Collectors need to be installed on every server; they forward data to the master server (database)
promtail reads logs and pushes them to loki
filebeat reads logs and pushes them to elasticsearch
metricbeat reads metrics and pushes them to elasticsearch
telegraf reads metrics and pushes them to influxdb
databases
Databases need to be installed on a master server, where they gather the data from every source
influxdb is a metric database
loki is a log database (light)
elasticsearch is a log database (heavy)
visualisation
Needs to connect to databases
grafana can visualize almost every datasource (influxdb, elasticsearch, loki, mysql, ...).
kibana can visualize an elasticsearch database
chronograf can visualize an influxdb database
alerting
needs testing : kapacitor / alertmanager / kibana ?
stacks
These tools are usually organised in stacks :
- TICK (metrics) : telegraf => influxdb => chronograf => kapacitor
- prometheus stack (metrics) : prometheus => grafana + alertmanager
- ELK (logs) : filebeat + metricbeat + logstash => elasticsearch => kibana
- Loki (logs) : promtail => loki => grafana
loki
healthcheck
curl -X GET -H 'Content-type:application/json' http://localhost:3100/ready
curl -X GET -H 'Content-type:application/json' http://localhost:3100/metrics
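Loki can answer /ready with a non-200 status while it is still starting, so a script that depends on it usually has to poll. A minimal sketch of that idea; wait_for_ready is a hypothetical helper name, and the retry count and delay are arbitrary defaults, not Loki settings.

```shell
# Poll a readiness URL until it answers with a 2xx status.
# wait_for_ready is a hypothetical helper, not part of Loki.
# Usage: wait_for_ready URL [MAX_TRIES] [DELAY_SECONDS]
wait_for_ready() {
  url=$1
  tries=${2:-30}
  delay=${3:-2}
  i=1
  while [ "$i" -le "$tries" ]; do
    # -f makes curl exit non-zero on HTTP errors such as 503
    if curl -fsS -o /dev/null "$url"; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "not ready after $tries attempts" >&2
  return 1
}
```

It could be called as wait_for_ready http://localhost:3100/ready before the first push or query.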
querying
curl -s "http://localhost:3100/api/prom/label"
curl -s "http://localhost:3100/api/prom/label/filename/values"
curl -G -s "http://localhost:3100/loki/api/v1/query" --data-urlencode 'query={job="varlogs"}' | jq
curl -G -s "http://localhost:3100/loki/api/v1/query" --data-urlencode 'query={stream=~".*"}' | jq
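The query endpoint evaluates at a single point in time. For a time window, Loki also exposes /loki/api/v1/query_range, which accepts start and end as nanosecond epochs. A small sketch, assuming the same local Loki; to_ns and loki_last_minutes are hypothetical helper names and the limit of 100 is arbitrary.

```shell
# Convert epoch seconds to a nanosecond epoch string.
# Appending nine zeros avoids relying on `date +%N`,
# which is not available on every system.
to_ns() {
  printf '%s000000000\n' "$1"
}

# Query the last N minutes of a LogQL selector (sketch).
# Usage: loki_last_minutes '{job="varlogs"}' 15 | jq
loki_last_minutes() {
  query=$1
  minutes=${2:-15}
  now=$(date +%s)
  start=$((now - minutes * 60))
  curl -G -s "http://localhost:3100/loki/api/v1/query_range" \
    --data-urlencode "query=$query" \
    --data-urlencode "start=$(to_ns "$start")" \
    --data-urlencode "end=$(to_ns "$now")" \
    --data-urlencode "limit=100"
}
```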
curl -G -H 'Sec-WebSocket-Version: 13' -H 'Sec-WebSocket-Extensions: permessage-deflate' -H 'Sec-WebSocket-Key: v4vMUSLqpDDrrvhrCqfE+Q==' -H 'Connection: keep-alive, Upgrade' -H 'Upgrade: websocket' 'http://localhost:3100/loki/api/v1/tail' --data-urlencode 'query={job="varlogs"}'
inserting
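The push endpoint takes each entry as a [timestamp, line] pair whose timestamp is a nanosecond epoch written as a string. A minimal sketch that builds such a payload from epoch seconds; json_escape and loki_payload are hypothetical helper names, and only one label and one log line are handled.

```shell
# Escape backslashes and double quotes so a value can sit inside a JSON string.
# (Control characters such as newlines are not handled in this sketch.)
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Build a Loki push payload for one log line with one label.
# Usage: loki_payload LABEL_NAME LABEL_VALUE EPOCH_SECONDS LINE
loki_payload() {
  printf '{"streams": [{"stream": {"%s": "%s"}, "values": [["%s000000000", "%s"]]}]}\n' \
    "$(json_escape "$1")" "$(json_escape "$2")" "$3" "$(json_escape "$4")"
}

# Send it (not run here):
#   curl -s -H "Content-Type: application/json" -XPOST \
#     "http://localhost:3100/loki/api/v1/push" \
#     --data-raw "$(loki_payload foo bar2 "$(date +%s)" 'fizzbuzz')"
```

Using the current time matters in practice: depending on its configuration, Loki may reject entries whose timestamp is too old.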
curl -v -H "Content-Type: application/json" -XPOST -s "http://localhost:3100/loki/api/v1/push" --data-raw '{"streams": [{ "stream": { "foo": "bar2" }, "values": [ [ "1570818238000000000", "fizzbuzz" ] ] }]}'
influxdb
healthcheck
http://localhost:8086/health (or http://localhost:8086/ping, which answers 204 No Content)
create database
curl -XPOST 'http://localhost:8086/query' --data-urlencode 'q=CREATE DATABASE "mydb"'
insert
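curl -i -X POST 'http://localhost:8086/write?db=mydb&precision=ms&u=root&p=superpassword' --data-binary 'grostest value=0.64 1582032137924'
curl -i -X POST 'http://localhost:8086/write?db=mydb&precision=ms&u=root&p=superpassword' --data-binary 'grostest value=0.64 1582032137934'
curl -i -X POST 'http://localhost:8086/write?db=mydb&precision=s&u=root&p=superpassword' --data-binary 'grostest value=0.64 1582031716'
ps: timestamps are read as nanoseconds unless precision is given, and they must be integers: a date such as 2020-02-18T13:15:16Z has to be converted to epoch time first (1582031716 in seconds).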
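The /write endpoint reads line-protocol timestamps as integer epochs, in nanoseconds unless a precision parameter says otherwise. A minimal sketch that formats one point with a millisecond timestamp; influx_line is a hypothetical helper name, and tags and string fields are left out.

```shell
# Format one point in InfluxDB line protocol with a millisecond timestamp.
# influx_line is a hypothetical helper; tags and string fields are not handled.
# Usage: influx_line MEASUREMENT FIELD VALUE EPOCH_SECONDS
influx_line() {
  printf '%s %s=%s %s000\n' "$1" "$2" "$3" "$4"
}

# Write it with an explicit precision (not run here):
#   curl -i -X POST 'http://localhost:8086/write?db=mydb&precision=ms&u=root&p=superpassword' \
#     --data-binary "$(influx_line grostest value 0.64 "$(date +%s)")"
```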
query
curl -G 'http://localhost:8086/query?db=mydb&u=root&p=superpassword' --data-urlencode "q=SHOW Measurements"
curl -G 'http://localhost:8086/query?db=mydb&u=root&p=superpassword' --data-urlencode "q=Select count(*) from grostest"
portainer
healthcheck
curl http://localhost:9000/api/status
generate a password
docker run --rm httpd:2.4-alpine htpasswd -nbB admin 'superpassword' | cut -d ":" -f 2
ps: when using the hashed password in a docker compose file, don't forget to double every $ (write $$), otherwise Compose interpolates $xxxx as an environment variable.
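The doubling can be done by the same pipeline that generates the hash. A minimal sketch; escape_compose is a hypothetical helper name, and the sample hash in the comment is made up for illustration.

```shell
# Double every "$" read from stdin so that Compose does not treat
# parts of the hash as variables.
# escape_compose is a hypothetical helper name.
escape_compose() {
  sed 's/\$/$$/g'
}

# Example (not run here):
#   docker run --rm httpd:2.4-alpine htpasswd -nbB admin 'superpassword' \
#     | cut -d ":" -f 2 | escape_compose
# A hash shaped like $2y$05$abc... comes out as $$2y$$05$$abc...
```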
prometheus
healthcheck
http://localhost:9090/-/healthy (use http://localhost:9090/-/ready to check that it can serve traffic)
grafana
healthcheck
http://localhost:3000/api/health
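The individual healthchecks can be gathered into one script. A minimal sketch, assuming every service runs locally on its default port; check_url and check_all are hypothetical helper names, and InfluxDB's /ping and Prometheus's /-/healthy are used here as an assumption about which health endpoints the installed versions expose.

```shell
# Return 0 if the URL answers with a 2xx status, non-zero otherwise.
check_url() {
  curl -fsS -o /dev/null --max-time 5 "$1"
}

# Print one "name: up" or "name: down" line per service.
# Returns non-zero if at least one service is down.
check_all() {
  failed=0
  while read -r name url; do
    if check_url "$url" 2>/dev/null; then
      echo "$name: up"
    else
      echo "$name: down"
      failed=1
    fi
  done <<EOF
loki http://localhost:3100/ready
influxdb http://localhost:8086/ping
portainer http://localhost:9000/api/status
prometheus http://localhost:9090/-/healthy
grafana http://localhost:3000/api/health
EOF
  return "$failed"
}
```

Running check_all from cron, or as a container healthcheck command, gives a quick view of the whole stack until a real alerting tool is in place.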