prometheus监控elk

prometheus 监控 es,同样采用 exporter 的方案。

  • 项目地址:
    • elasticsearch_exporter:https://github.com/justwatchcom/elasticsearch_exporter

# 1、安装部署

现有 es 三节点的集群,环境大概如下:

主机 组件
10.3.6.30–es-node1 es,nginx
10.3.6.125–es-node2 es
10.3.6.124–es-node3 es,kibana

接着分别在如上三台主机上进行如下配置:

1
2
3
wget https://github.com/justwatchcom/elasticsearch_exporter/releases/download/v1.1.0/elasticsearch_exporter-1.1.0.linux-amd64.tar.gz
tar xf elasticsearch_exporter-1.1.0.linux-amd64.tar.gz
mv elasticsearch_exporter-1.1.0.linux-amd64 /usr/local/elasticsearch_exporter

启动监控客户端:

1
nohup ./elasticsearch_exporter --web.listen-address ":9109"  --es.uri http://10.3.6.30:9200 &

使用 systemd 管理:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
cat /lib/systemd/system/es_exporter.service
[Unit]
Description=The es_exporter
After=network.target
[Service]
Type=simple
User=prometheus
ExecStart=/usr/local/elasticsearch_exporter/elasticsearch_exporter --web.listen-address ":9308" --es.uri http://127.0.0.1:9200
Restart=on-failure
[Install]
WantedBy=multi-user.target

启动:

1
2
systemctl daemon-reload
systemctl start es_exporter

查看 metrics:

1
curl 127.0.0.1:9109/metrics

# 2,配置 prometheus.yml 添加监控目标

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
$ vim /usr/local/prometheus/prometheus.yml
  - job_name: 'elasticsearch'
    scrape_interval: 60s
    scrape_timeout:  30s
    metrics_path: "/metrics"
    static_configs:
    - targets:
      - '10.3.0.41:9109'
      labels:
        service: elasticsearch

重启服务。

1
$ systemctl restart prometheus

或者通过命令热加载:

1
curl  -XPOST localhost:9090/-/reload

# 5,配置 Grafana 的模板

模板通过 json 文件进行导入,文件就在解压的包内。

参考地址:https://shenshengkun.github.io/posts/550bdf86.html

或者通过如下 ID 进行导入:2322以及其他。

img (opens new window)

# 6,开启认证的启动方式

如果 es 开启了认证,那么启动的时候需要将用户名密码加载进去:

1
lasticsearch_exporter --web.listen-address ":9308"  --es.uri http://username:password@192.168.10.3:9200 &

其中使用的是monitoring的用户密码。

当然,除去这种命令行的启动方式之外,还可以像上边一样,基于 systemd 进行管理,只需将认证的参数信息写入到如下内容当中:

1
2
$ cat /etc/default/elasticsearch_exporter
EXPORTER_ARGS="--es.uri=http://username:password@192.168.10.3:9200"

接着将启动配置文件封装成如下脚本:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
$ cat /etc/init.d/elasticsearch_exporter
#!/bin/sh
# chkconfig: 2345 60 20
# description: elasticsearch_exporter
NAME=elasticsearch_exporter
SCRIPT="/usr/bin/${NAME}"
PIDFILE="/var/run/${NAME}.pid"
LOGFILE="/var/log/${NAME}.log"
ENVFILE="/etc/default/${NAME}"
USER="root"
URL='http://192.10.10.1'
EXPORTER_NAME=$NAME
EXPORTER_PORT="9114"
#获取本机ip
IP=$(grep "IPADDR" /etc/sysconfig/network-scripts/ifcfg-eth0 | grep -Eo "([0-9]{1,3}\.){3}[0-9]{1,3}")
register_exporter() {
    json_data='{"service_id":"'${EXPORTER_NAME}${IP//./}'","job":"'${EXPORTER_NAME}'","ip":"'${IP}'","port":"'$EXPORTER_PORT'","tags":"","meta": {"hostname": "'$(hostname)'"}}'
    curl --connect-timeout 2 -s -X POST -H "Content-type: application/json" -d "${json_data}" $URL 2>&1 > /dev/null
}
start() {
  if [ -f "${PIDFILE}" ] && kill -0 $(cat "${PIDFILE}") &> /dev/null; then
    echo "${NAME} already running with PID $(cat ${PIDFILE})" >&2
    return 1
  fi
  echo "Starting ${NAME}" >&2
  . "${ENVFILE}"
  CMD="${SCRIPT} --web.listen-address=${IP}:${EXPORTER_PORT} --log.level=error ${EXPORTER_ARGS}"
  su - "${USER}" -c "${CMD} &> ${LOGFILE} & echo \$! > ${PIDFILE}"
  # echo "${NAME} started with PID $(cat ${PIDFILE})" >&2
  sleep 1
  if [ -f "${PIDFILE}" ] && kill -0 $(cat "${PIDFILE}") &> /dev/null; then
    echo "${NAME} started successfully." >&2
    register_exporter
  else
    echo "${NAME} was not started OK"
    return 1
  fi
}
stop() {
  if [ ! -f "$PIDFILE" ] || ! kill -0 $(cat "$PIDFILE") &> /dev/null; then
    echo "${NAME} not running" >&2
    return 1
  fi
  echo "Stopping ${NAME}..." >&2
  kill -15 $(cat "$PIDFILE")
  rm -f "$PIDFILE"
  echo "${NAME} stopped" >&2
}
status() {
  if [ ! -f "$PIDFILE" ] || ! kill -0 $(cat "$PIDFILE") &> /dev/null; then
    echo "${NAME} is not running" >&2
  else
    echo "${NAME} is running" >&2
  fi
}
uninstall() {
  echo -n "Are you really sure you want to uninstall ${NAME}? That cannot be undone. [yes|No] "
  local SURE
  read SURE
  if [ "$SURE" = "yes" ]; then
    stop
    rm -f "$PIDFILE"
    echo "Notice: log file is not be removed: '$LOGFILE'" >&2
    update-rc.d -f <NAME> remove
    rm -fv "$0"
  fi
}
case "$1" in
  start)
    start
    ;;
  stop)
    stop
    ;;
  uninstall)
    uninstall
    ;;
  restart)
    stop
    start
    ;;
  status)
  status
  ;;
  register)
  register_exporter
  ;;
  *)
    echo "Usage: $0 {start|stop|restart|status|register|uninstall}"
esac

此处服务启动之后将会自动注册到统一的注册中心去,而不必再手动添加配置。

Licensed under CC BY-NC-SA 4.0
最后更新于 Jan 06, 2025 05:52 UTC
comments powered by Disqus
Built with Hugo
主题 StackJimmy 设计
Caret Up