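Before the script, a quick note: building this as a Zabbix auto-discovery setup is not a good fit, because production clusters routinely have several thousand queues and discovery would generate far too many Zabbix items. The approach below instead writes the results to a file, and a Zabbix template has the Zabbix agent scan that file for a keyword on a schedule. I won't go into how to set up the template; here is our Python script.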

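# -*- coding: utf-8 -*-
import re
import subprocess

# List all vhosts, dropping the "Listing vhosts" banner line.
list_vhost_cmd = "/sbin/rabbitmqctl list_vhosts |grep -v 'Listing vhosts'"
# universal_newlines=True makes check_output return text on Python 3 as well as Python 2.
list_vhost_result = subprocess.check_output(list_vhost_cmd, shell=True, universal_newlines=True).strip().split('\n')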
exception_count = 0
with open('./results.txt', 'w') as f:
    for vhost in list_vhost_result:
        # Query one vhost at a time and drop the banner, timeout, and header lines.
        list_queue_cmd = "/sbin/rabbitmqctl list_queues -p {0} |grep -Ev 'Listing queues|Timeout:|name\tmessages'".format(vhost)
        try:
            list_queue_result = subprocess.check_output(list_queue_cmd, shell=True, universal_newlines=True).strip().split('\n')
        except subprocess.CalledProcessError:
            # The pipeline exited non-zero (for example, grep matched no lines); skip this vhost.
            continue
        for q in list_queue_result:
            q_item = q.split('\t')
            # Skip blank or malformed lines that lack a message count.
            if len(q_item) < 2:
                continue
            q_name = q_item[0]
            q_num = q_item[1]
            # Ignore backup queues, including the "bakup" misspelling.
            if re.search('bac?kup', q_name, re.IGNORECASE):
                continue
            if int(q_num) >= 100000:
                f.write("Queue_Exception: " + "queue name: " + q_name + ' count: ' + q_num + '\n')
                exception_count = exception_count + 1
    if exception_count == 0:
        # Nothing exceeded the threshold: leave a single blank line in the file.
        f.write("\n")
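
The filtering logic is easier to check when it is separated from the `rabbitmqctl` call. The following is a minimal sketch, not part of the original script: `find_backlogged_queues` is a hypothetical helper name, and the sample text is made up to imitate the tab-separated `name<TAB>messages` lines the script expects.

```python
import re

# Hypothetical helper, not part of the original script.
# Matches both "backup" and the "bakup" misspelling, case-insensitively.
BACKUP_PATTERN = re.compile(r'bac?kup', re.IGNORECASE)


def find_backlogged_queues(output, threshold=100000):
    """Return (name, count) pairs for queues at or above the threshold."""
    backlogged = []
    for line in output.splitlines():
        fields = line.split('\t')
        # Skip blank lines and anything that is not "name<TAB>count".
        if len(fields) < 2 or not fields[1].strip().isdigit():
            continue
        name, count = fields[0], int(fields[1])
        if BACKUP_PATTERN.search(name):
            continue
        if count >= threshold:
            backlogged.append((name, count))
    return backlogged


# Made-up sample input for illustration only.
sample = "orders\t250000\norders_backup\t900000\nemails\t12\nname\tmessages\n"
print(find_backlogged_queues(sample))
```

With this split, the header and banner lines no longer need to be filtered by `grep`, since any line without a numeric second field is skipped.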

Tips: The threshold above is set to 100000; adjust it to suit your own company's workload.
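
One way to adjust the threshold without editing the script is to read it from an environment variable. This is only a sketch under assumptions: the variable name `QUEUE_THRESHOLD` and the helper `load_threshold` are invented for illustration and do not appear in the original script.

```python
import os

DEFAULT_THRESHOLD = 100000  # the value hard-coded in the script


def load_threshold(environ=None):
    """Read the threshold from QUEUE_THRESHOLD (hypothetical name), else use the default."""
    environ = os.environ if environ is None else environ
    raw = environ.get('QUEUE_THRESHOLD', '')
    try:
        value = int(raw)
    except ValueError:
        # Unset or non-numeric: fall back to the default.
        return DEFAULT_THRESHOLD
    # Ignore non-positive values rather than alerting on every queue.
    return value if value > 0 else DEFAULT_THRESHOLD


print(load_threshold({'QUEUE_THRESHOLD': '50000'}))
print(load_threshold({}))
```

The comparison `int(q_num) >= 100000` in the script would then use the value returned by `load_threshold()` instead of the literal.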