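Enough talk. One note before the script: turning this into a Zabbix auto-discovery setup is not a good fit, because production clusters routinely have several thousand queues, and that would generate far too many Zabbix items. All things considered, it just isn't suitable. The approach below instead writes the results to a file, and a Zabbix template has the agent scan that file for a keyword on a schedule. I won't go into how to set that part up; here is our Python script.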
# -*- coding: utf-8 -*-
import re
import subprocess

THRESHOLD = 100000  # alert when a queue holds at least this many messages
BACKUP_RE = re.compile(r'bakup|backup', re.IGNORECASE)  # backup queues are skipped

list_vhost_cmd = "/sbin/rabbitmqctl list_vhosts | grep -v 'Listing vhosts'"
# universal_newlines=True returns text rather than bytes, so split('\n') also works on Python 3
list_vhost_result = subprocess.check_output(
    list_vhost_cmd, shell=True, universal_newlines=True).strip().split('\n')

exception_count = 0
with open('./results.txt', 'w') as f:
    for vhost in list_vhost_result:
        list_queue_cmd = ("/sbin/rabbitmqctl list_queues -p '{0}' | "
                          "grep -Ev 'Listing queues|Timeout:|name\tmessages'").format(vhost)
        try:
            list_queue_result = subprocess.check_output(
                list_queue_cmd, shell=True, universal_newlines=True).strip().split('\n')
        except subprocess.CalledProcessError:
            # grep exits non-zero when nothing is left to print (e.g. the vhost has no queues)
            continue
        for q in list_queue_result:
            q_item = q.split('\t')
            if len(q_item) < 2 or not q_item[1].isdigit():
                continue  # not a "name<TAB>messages" row
            q_name, q_num = q_item[0], q_item[1]
            if BACKUP_RE.search(q_name):
                continue
            if int(q_num) >= THRESHOLD:
                # "Queue_Exception" is the keyword the Zabbix agent scans for
                f.write("Queue_Exception: queue: " + q_name + ' messages: ' + q_num + '\n')
                exception_count += 1
    if exception_count == 0:
        # nothing exceeded the threshold: leave only a newline, so the keyword is absent
        f.write("\n")
Tip: the threshold above is set to 100000; adjust it to suit your own business workload.
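
If you want to try different thresholds without touching a live broker, the filtering step can be pulled out into a pure function. The following is only a minimal sketch: the function name `find_backlogged_queues`, the `threshold` parameter, and the sample text are made up for illustration, and it assumes the `rabbitmqctl list_queues` output is tab-separated `name<TAB>messages` rows as in the script above.

```python
import re

# Hypothetical helper for illustration; not part of the original script.
BACKUP_RE = re.compile(r"bakup|backup", re.IGNORECASE)


def find_backlogged_queues(list_queues_output, threshold=100000):
    """Return (queue_name, message_count) pairs at or above the threshold."""
    backlogged = []
    for line in list_queues_output.splitlines():
        fields = line.split("\t")
        # Skip headers, blank lines, and anything that is not "name<TAB>count".
        if len(fields) < 2 or not fields[1].isdigit():
            continue
        name, count = fields[0], int(fields[1])
        if BACKUP_RE.search(name):
            continue  # backup queues are deliberately ignored, as in the script
        if count >= threshold:
            backlogged.append((name, count))
    return backlogged


# Made-up sample output, assuming the tab-separated format described above.
sample = "Listing queues ...\norders\t250000\norders_backup\t900000\nemails\t12\n"
print(find_backlogged_queues(sample))                # [('orders', 250000)]
print(find_backlogged_queues(sample, threshold=10))  # [('orders', 250000), ('emails', 12)]
```

Keeping the parsing separate from the `subprocess` call makes it easy to check how a new threshold behaves against captured output before rolling it out.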