prometheus 监控 es,同样采用 exporter 的方案。
- 项目地址:
- elasticsearch_exporter:https://github.com/justwatchcom/elasticsearch_exporter
# 1、安装部署
现有 es 三节点的集群,环境大概如下:
主机 |
组件 |
10.3.6.30–es-node1 |
es,nginx |
10.3.6.125–es-node2 |
es |
10.3.6.124–es-node3 |
es,kibana |
接着分别在如上三台主机上进行如下配置:
1
2
3
|
wget https://github.com/justwatchcom/elasticsearch_exporter/releases/download/v1.1.0/elasticsearch_exporter-1.1.0.linux-amd64.tar.gz
tar xf elasticsearch_exporter-1.1.0.linux-amd64.tar.gz
mv elasticsearch_exporter-1.1.0.linux-amd64 /usr/local/elasticsearch_exporter
|
启动监控客户端:
1
|
nohup ./elasticsearch_exporter --web.listen-address ":9109" --es.uri http://10.3.6.30:9200 &
|
使用 systemd 管理:
1
2
3
4
5
6
7
8
9
10
11
|
cat /lib/systemd/system/es_exporter.service
[Unit]
Description=The es_exporter
After=network.target
[Service]
Type=simple
User=prometheus
ExecStart=/usr/local/elasticsearch_exporter/elasticsearch_exporter --web.listen-address ":9308" --es.uri http://127.0.0.1:9200
Restart=on-failure
[Install]
WantedBy=multi-user.target
|
启动:
1
2
|
systemctl daemon-reload
systemctl start es_exporter
|
查看 metrics:
1
|
curl 127.0.0.1:9109/metrics
|
# 2,配置 prometheus.yml 添加监控目标
1
2
3
4
5
6
7
8
9
10
|
$ vim /usr/local/prometheus/prometheus.yml
- job_name: 'elasticsearch'
scrape_interval: 60s
scrape_timeout: 30s
metrics_path: "/metrics"
static_configs:
- targets:
- '10.3.0.41:9109'
labels:
service: elasticsearch
|
重启服务。
1
|
$ systemctl restart prometheus
|
或者通过命令热加载:
1
|
curl -XPOST localhost:9090/-/reload
|
# 5,配置 Grafana 的模板
模板通过 json 文件进行导入,文件就在解压的包内。
参考地址:https://shenshengkun.github.io/posts/550bdf86.html
或者通过如下 ID 进行导入:2322
以及其他。
(opens new window)
# 6,开启认证的启动方式
如果 es 开启了认证,那么启动的时候需要将用户名密码加载进去:
1
|
lasticsearch_exporter --web.listen-address ":9308" --es.uri http://username:password@192.168.10.3:9200 &
|
其中使用的是monitoring
的用户密码。
当然,除去这种命令行的启动方式之外,还可以像上边一样,基于 systemd 进行管理,只需将认证的参数信息写入到如下内容当中:
1
2
|
$ cat /etc/default/elasticsearch_exporter
EXPORTER_ARGS="--es.uri=http://username:password@192.168.10.3:9200"
|
接着将启动配置文件封装成如下脚本:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
|
$ cat /etc/init.d/elasticsearch_exporter
#!/bin/sh
# chkconfig: 2345 60 20
# description: elasticsearch_exporter
NAME=elasticsearch_exporter
SCRIPT="/usr/bin/${NAME}"
PIDFILE="/var/run/${NAME}.pid"
LOGFILE="/var/log/${NAME}.log"
ENVFILE="/etc/default/${NAME}"
USER="root"
URL='http://192.10.10.1'
EXPORTER_NAME=$NAME
EXPORTER_PORT="9114"
#获取本机ip
IP=$(grep "IPADDR" /etc/sysconfig/network-scripts/ifcfg-eth0 | grep -Eo "([0-9]{1,3}\.){3}[0-9]{1,3}")
register_exporter() {
json_data='{"service_id":"'${EXPORTER_NAME}${IP//./}'","job":"'${EXPORTER_NAME}'","ip":"'${IP}'","port":"'$EXPORTER_PORT'","tags":"","meta": {"hostname": "'$(hostname)'"}}'
curl --connect-timeout 2 -s -X POST -H "Content-type: application/json" -d "${json_data}" $URL 2>&1 > /dev/null
}
start() {
if [ -f "${PIDFILE}" ] && kill -0 $(cat "${PIDFILE}") &> /dev/null; then
echo "${NAME} already running with PID $(cat ${PIDFILE})" >&2
return 1
fi
echo "Starting ${NAME}" >&2
. "${ENVFILE}"
CMD="${SCRIPT} --web.listen-address=${IP}:${EXPORTER_PORT} --log.level=error ${EXPORTER_ARGS}"
su - "${USER}" -c "${CMD} &> ${LOGFILE} & echo \$! > ${PIDFILE}"
# echo "${NAME} started with PID $(cat ${PIDFILE})" >&2
sleep 1
if [ -f "${PIDFILE}" ] && kill -0 $(cat "${PIDFILE}") &> /dev/null; then
echo "${NAME} started successfully." >&2
register_exporter
else
echo "${NAME} was not started OK"
return 1
fi
}
stop() {
if [ ! -f "$PIDFILE" ] || ! kill -0 $(cat "$PIDFILE") &> /dev/null; then
echo "${NAME} not running" >&2
return 1
fi
echo "Stopping ${NAME}..." >&2
kill -15 $(cat "$PIDFILE")
rm -f "$PIDFILE"
echo "${NAME} stopped" >&2
}
status() {
if [ ! -f "$PIDFILE" ] || ! kill -0 $(cat "$PIDFILE") &> /dev/null; then
echo "${NAME} is not running" >&2
else
echo "${NAME} is running" >&2
fi
}
uninstall() {
echo -n "Are you really sure you want to uninstall ${NAME}? That cannot be undone. [yes|No] "
local SURE
read SURE
if [ "$SURE" = "yes" ]; then
stop
rm -f "$PIDFILE"
echo "Notice: log file is not be removed: '$LOGFILE'" >&2
update-rc.d -f <NAME> remove
rm -fv "$0"
fi
}
case "$1" in
start)
start
;;
stop)
stop
;;
uninstall)
uninstall
;;
restart)
stop
start
;;
status)
status
;;
register)
register_exporter
;;
*)
echo "Usage: $0 {start|stop|restart|status|register|uninstall}"
esac
|
此处服务启动之后将会自动注册到统一的注册中心去,而不必再手动添加配置。