Managing Processes with Supervisor


1. Introduction

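Supervisor is a client/server system that lets users control a number of processes on UNIX-like operating systems. Why use it: writing a startup script for every process instance is inconvenient, and such scripts are painful to write and maintain. In addition, startup scripts cannot automatically restart a crashed process, and many programs do not restart themselves properly after a crash. supervisord starts each process as its own subprocess and can be configured to restart it automatically when it crashes. It can also be configured to start processes automatically whenever supervisord itself is started.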
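To make the "start as a subprocess, restart on crash" idea concrete, here is a toy sketch in Python. It is not how supervisord is implemented; the names run_with_restart, expected_codes, max_starts and CRASHING_CHILD are hypothetical and exist only for this illustration.

```python
import subprocess
import sys

# A child command that always "crashes" with exit code 3 (illustration only).
CRASHING_CHILD = [sys.executable, "-c", "import sys; sys.exit(3)"]


def run_with_restart(argv, expected_codes=(0,), max_starts=3):
    """Run argv as a child process and restart it while it exits unexpectedly.

    Returns the list of exit codes observed, one per start.
    """
    codes = []
    for _ in range(max_starts):
        # The child is a direct subprocess, so the parent sees its exit status.
        code = subprocess.run(argv).returncode
        codes.append(code)
        if code in expected_codes:
            break  # expected exit: do not restart
    return codes
```

Calling run_with_restart(CRASHING_CHILD) starts the child three times and then gives up, which loosely mirrors how a supervising parent retries a failing program a bounded number of times.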

2. Installing Supervisor

yum  install supervisor -y

For other installation methods, see the official manual: http://www.supervisord.org/installing.html

3. Configuration File Reference

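The Supervisor configuration file is conventionally named supervisord.conf. It is used by both supervisord and supervisorctl. If either application is started without the -c option (which specifies the configuration file), it looks for a file named supervisord.conf in the following locations, in the order listed, and uses the first one it finds.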

$CWD/supervisord.conf
$CWD/etc/supervisord.conf
/etc/supervisord.conf
/etc/supervisor/supervisord.conf

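The "first match wins" lookup can be sketched in a few lines of Python. This is an illustration that covers only the four locations listed above; the real lookup inside supervisord may check additional paths. The function name find_config and its exists parameter are hypothetical.

```python
import os


def find_config(cwd, exists=os.path.isfile):
    """Return the first existing supervisord.conf candidate, or None."""
    candidates = [
        os.path.join(cwd, "supervisord.conf"),
        os.path.join(cwd, "etc", "supervisord.conf"),
        "/etc/supervisord.conf",
        "/etc/supervisor/supervisord.conf",
    ]
    for path in candidates:
        if exists(path):
            return path  # stop at the first hit; later paths are ignored
    return None
```

For example, find_config(os.getcwd()) reports which file would be picked up when no -c option is given, under the assumptions above.
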
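[unix_http_server]
file = /tmp/supervisor.sock
# Permission mode of the socket file; default is 0700
chmod = 0777
# Owner and group of the socket file; default is the user and group that started supervisord
chown = sup:sup
# Username and password required to authenticate to this HTTP server
username = admin
password = 123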

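# Enable the HTTP server used for web-based management
[inet_http_server]
# Address and port the web interface listens on
port = 10.0.0.12:9001
# Listen on all network interfaces of this machine
# port = *:9001
# port = :9001
# No username is required by default
username = admin
# No password is required by default; the password is stored in the configuration file in plain text
password = 123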

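# supervisord daemon settings
[supervisord]
# Log file path; the upstream default is $CWD/supervisord.log (distribution packages often set /var/log/supervisor/supervisord.log)
logfile = /var/log/supervisord.log
# Size at which the log file is rotated; default 50MB
logfile_maxbytes = 1KB
# Number of rotated log files to keep; default 10
logfile_backups = 3
# Log level; default info. One of: critical, error, warn, info, debug, trace, blather
loglevel = debug
# PID file; the upstream default is $CWD/supervisord.pid (distribution packages often set a path under /run)
pidfile = /var/run/sup/sup.pid
# Whether to run in the foreground; default false (run as a daemon), true keeps supervisord in the foreground
nodaemon = false
# When supervisord daemonizes, it switches to this directory
directory = /app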

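# Program sections
# The file must contain one or more program sections so that supervisord knows which programs to start and control. The section header is a compound value: the word "program", a colon, then the program name.
[program:redis-6661]
# The command run when this program starts. It may be an absolute path
command = /app/redis/bin/redis-server /app/redis/conf/redis-6661.conf
# Name of the process; default is %(program_name)s
process_name = redis-6661
# Whether to start automatically when supervisord starts; default true
autostart = true
# Restart policy when the process exits: false, unexpected, or true; default unexpected
autorestart = true
# Exit codes treated as "expected". Used together with autorestart = unexpected: the program is restarted only if it exits with a code not listed here
exitcodes = 0,2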
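
Because the file uses INI syntax, a quick way to inspect which programs a configuration defines is Python's standard configparser. This is only a sketch: supervisord uses its own parser, so edge cases may differ, and list_programs and SAMPLE are hypothetical names. Interpolation is disabled so that values such as %(program_name)s are left untouched.

```python
import configparser

SAMPLE = """\
[supervisord]
logfile = /var/log/supervisord.log

[program:redis-6661]
# full-line comments are skipped by the parser
command = /app/redis/bin/redis-server /app/redis/conf/redis-6661.conf
autostart = true
autorestart = unexpected
exitcodes = 0,2
"""


def list_programs(text):
    """Map each program name to its options, taken from [program:NAME] sections."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    programs = {}
    for section in parser.sections():
        kind, _, name = section.partition(":")
        if kind == "program" and name:
            programs[name] = dict(parser[section])
    return programs
```

list_programs(SAMPLE) returns a dictionary keyed by program name, so a typo in a section header shows up as a missing or unexpected key.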


For more options, see the official manual: http://www.supervisord.org/configuration.html

4. Managing a Redis Cluster with supervisord

1. Prepare the environment

Six Redis instances:

[root@redis conf]# ll
total 336
-rw-r--r--. 1 root root 46723 Apr 25 20:23 redis
-rw-r--r--. 1 root root 46725 Apr 25 21:01 redis-6661.conf
-rw-r--r--. 1 root root 46725 Apr 25 21:01 redis-6662.conf
-rw-r--r--. 1 root root 46725 Apr 25 21:01 redis-6663.conf
-rw-r--r--. 1 root root 46725 Apr 25 21:01 redis-6664.conf
-rw-r--r--. 1 root root 46725 Apr 25 21:01 redis-6665.conf
-rw-r--r--. 1 root root 46725 Apr 25 21:02 redis-6666.conf

Create the supervisord configuration file

[inet_http_server]
port=:19001
username=admin
password=1

[supervisorctl]
;serverurl=unix:///tmp/supervisor.sock
serverurl=http://127.0.0.1:19001
; the credentials must match those in [inet_http_server]
username=admin
password=1
;prompt=mysupervisor
;history_file=~/.sc_hist

[supervisord]
logfile = /var/log/supervisord.log
logfile_maxbytes = 1KB
logfile_backups = 3
loglevel = debug
pidfile = /var/run/sup/sup.pid
nodaemon = false
directory = /app

[program:redis-6661]
directory=/app/redis/conf
command=/app/redis/bin/redis-server redis-6661.conf
autostart=true
autorestart=unexpected
exitcodes=0,2


[program:redis-6662]
directory=/app/redis/conf
command=/app/redis/bin/redis-server redis-6662.conf
autostart=true
autorestart=unexpected
exitcodes=0,2


[program:redis-6663]
directory=/app/redis/conf
command=/app/redis/bin/redis-server redis-6663.conf
autostart=true
autorestart=unexpected
exitcodes=0,2



[program:redis-6664]
directory=/app/redis/conf
command=/app/redis/bin/redis-server redis-6664.conf
autostart=true
autorestart=unexpected
exitcodes=0,2



[program:redis-6665]
directory=/app/redis/conf
command=/app/redis/bin/redis-server redis-6665.conf
autostart=true
autorestart=unexpected
exitcodes=0,2


[program:redis-6666]
directory=/app/redis/conf
command=/app/redis/bin/redis-server redis-6666.conf
autostart=true
autorestart=unexpected
exitcodes=0,2
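
The six program sections above differ only in the port number, so they can be generated rather than typed by hand. The following Python sketch does that; TEMPLATE, render_programs and config_text are hypothetical names, and the paths are the ones used in this article.

```python
TEMPLATE = """\
[program:redis-{port}]
directory=/app/redis/conf
command=/app/redis/bin/redis-server redis-{port}.conf
autostart=true
autorestart=unexpected
exitcodes=0,2
"""


def render_programs(ports):
    """Return one [program:redis-PORT] section per port, separated by blank lines."""
    return "\n".join(TEMPLATE.format(port=port) for port in ports)


# Ports 6661 through 6666, matching the six configuration files listed earlier.
config_text = render_programs(range(6661, 6667))
```

The resulting text can be appended to the supervisord configuration file after the [supervisord] section.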

Start the service

systemctl start supervisord

Error messages

BACKOFF  Exited too quickly (process log may have details)

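Cause: Supervisor is designed to supervise application processes and can only supervise programs that run in the foreground. If Redis is configured to daemonize itself (daemonize yes), the process that supervisord started forks and exits immediately, so supervisord reports that it exited too quickly even though Redis is in fact running in the background. Set daemonize no in each Redis configuration file so that redis-server stays in the foreground.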
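
A small Python check can flag Redis configuration files that would trigger this error. It is a sketch under two assumptions: the file uses the plain "daemonize yes|no" directive, and the last occurrence of the directive is the one that takes effect. The function name redis_daemonizes is hypothetical.

```python
def redis_daemonizes(conf_text):
    """Return True if the last 'daemonize' directive in conf_text is 'yes'."""
    daemonize = False  # Redis stays in the foreground unless told otherwise
    for line in conf_text.splitlines():
        parts = line.split()
        if not parts or parts[0].startswith("#"):
            continue  # skip blank lines and comments
        if parts[0].lower() == "daemonize" and len(parts) >= 2:
            daemonize = parts[1].lower() == "yes"
    return daemonize
```

For instance, redis_daemonizes(open("/app/redis/conf/redis-6661.conf").read()) should be False before the instance is handed over to supervisord.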

Error: .ini file does not include supervisord section  
For help, use /usr/bin/supervisord -h  

Cause: the configuration file does not contain a [supervisord] section.

Managing through the web interface

Check the running processes

[root@redis ~]# ps -ef|grep redis-server
root       7175   6671  0 21:35 ?        00:00:00 /app/redis/bin/redis-server 10.0.0.12:6662 [cluster]
root       7179   6671  0 21:35 ?        00:00:00 /app/redis/bin/redis-server 10.0.0.12:6663 [cluster]
root       7183   6671  0 21:35 ?        00:00:00 /app/redis/bin/redis-server 10.0.0.12:6661 [cluster]
root       7186   6671  0 21:35 ?        00:00:00 /app/redis/bin/redis-server 10.0.0.12:6666 [cluster]
root       7190   6671  0 21:35 ?        00:00:00 /app/redis/bin/redis-server 10.0.0.12:6664 [cluster]
root       7194   6671  0 21:35 ?        00:00:00 /app/redis/bin/redis-server 10.0.0.12:6665 [cluster]

Using supervisorctl

supervisor> help

default commands (type help <topic>):
=====================================
add    clear  fg        open  quit    remove  restart   start   stop  update 
avail  exit   maintail  pid   reload  reread  shutdown  status  tail  version

supervisor> stop redis-6662
redis-6662: stopped
supervisor> status
redis-6661                       RUNNING   pid 9073, uptime 0:01:53
redis-6662                       STOPPED   Apr 25 10:04 PM
redis-6663                       RUNNING   pid 9072, uptime 0:01:53
redis-6664                       RUNNING   pid 9075, uptime 0:01:53
redis-6665                       RUNNING   pid 9076, uptime 0:01:52
redis-6666                       RUNNING   pid 9074, uptime 0:01:53

Note: shutdown stops the supervisord service itself.

For more on command-line usage, see the official documentation: http://www.supervisord.org/running.html#supervisorctl-actions
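
The same status information is also available programmatically through Supervisor's XML-RPC interface, which requires the [rpcinterface:supervisor] section to be enabled. The Python sketch below uses only the standard library; the function names fetch_process_info and count_by_state are hypothetical, and the URL in the comment assumes the [inet_http_server] settings used in this article.

```python
from collections import Counter
from xmlrpc.client import ServerProxy


def fetch_process_info(url):
    """Ask supervisord for the state of every process over XML-RPC."""
    # Assumed URL for this article's setup: "http://admin:1@127.0.0.1:19001/RPC2"
    return ServerProxy(url).supervisor.getAllProcessInfo()


def count_by_state(infos):
    """Count processes per state name, such as RUNNING or STOPPED."""
    return Counter(info["statename"] for info in infos)
```

Passing the result of fetch_process_info to count_by_state gives a quick summary of how many instances are in each state.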

Error 1

http://localhost:9001 refused connection

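Solution: 1. Check the listen address; supervisorctl connects to localhost:9001 by default. 2. Check that the supervisord service is running normally.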
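
Both checks come down to whether anything is accepting TCP connections on the expected address. A minimal Python probe, using only the standard library, might look like this; the function name is_listening is hypothetical.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

If is_listening("127.0.0.1", 19001) is True but is_listening("127.0.0.1", 9001) is False, the server is up and the client is simply pointed at the wrong port.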

Error 2

[root@redis ~]# supervisorctl 
Sorry, supervisord responded but did not recognize the supervisor namespace commands that supervisorctl uses to control it.  Please check that the [rpcinterface:supervisor] section is enabled in the configuration file (see sample.conf).

Solution: add the following to the configuration file:

[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
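
Two of the errors in this article come from a missing section, so it can help to check for them before starting the service. The Python sketch below uses the standard configparser, which approximates but is not identical to Supervisor's own parser; REQUIRED_SECTIONS reflects only the sections this article found necessary, and missing_sections is a hypothetical name.

```python
import configparser

# Sections whose absence caused the errors described in this article.
REQUIRED_SECTIONS = ("supervisord", "rpcinterface:supervisor")


def missing_sections(text):
    """Return the required section names that are absent from the config text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return [name for name in REQUIRED_SECTIONS if not parser.has_section(name)]
```

An empty list means both sections are present; any names returned point at the section to add.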