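Managing Processes with Supervisor
1. Introduction
Supervisor is a client/server system that lets users control a large number of processes on UNIX-like operating systems. Purpose: writing a startup script for every process instance is usually inconvenient, and such scripts are painful to write and maintain. Startup scripts also cannot restart a crashed process automatically, and many programs do not restart themselves properly after a crash. supervisord starts each managed program as its own subprocess and can be configured to restart it automatically when it crashes. It can also be configured to start programs automatically whenever supervisord itself is started.
2. Installing Supervisor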
yum install supervisor -y
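For other installation methods, see the official manual: http://www.supervisord.org/installing.html
3. The configuration file in detail
The Supervisor configuration file is conventionally named supervisord.conf. It is read by both supervisord and supervisorctl. If either program is started without the -c option to name a configuration file, it looks for a file called supervisord.conf in the following locations, in the order listed, and uses the first one it finds.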
$CWD/supervisord.conf
$CWD/etc/supervisord.conf
/etc/supervisord.conf
/etc/supervisor/supervisord.conf
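The lookup described above can be sketched in a few lines of Python. This is only an illustration of the "first match wins" rule, not Supervisor's own code. The function names `candidate_paths` and `find_config` are made up for this example, and only the four locations listed here are covered (the official manual may list more).

```python
import os
import posixpath


def candidate_paths(cwd):
    """Return the search locations from the list above, in order."""
    return [
        posixpath.join(cwd, "supervisord.conf"),
        posixpath.join(cwd, "etc", "supervisord.conf"),
        "/etc/supervisord.conf",
        "/etc/supervisor/supervisord.conf",
    ]


def find_config(cwd, exists=os.path.exists):
    """Return the first candidate that exists, or None if none does.

    `exists` is injectable so the lookup can be tried without touching disk.
    """
    for path in candidate_paths(cwd):
        if exists(path):
            return path
    return None


if __name__ == "__main__":
    print(find_config(os.getcwd()))
```

Passing -c on the command line bypasses this search entirely, which is usually the safer choice when several configuration files exist on one host.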
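[unix_http_server]
file = /tmp/supervisor.sock
# Permission mode of the socket file; default 0700
chmod = 0777
# Owner and group of the socket file; defaults to the user and group that started supervisord
chown = sup:sup
# Username and password required to authenticate against this HTTP server
username = admin
password = 123

# Enable management through the web interface
[inet_http_server]
# Address and port the web interface listens on
port = 10.0.0.12:9001
# Listen on all network interfaces of this host
# port = *:9001
# port = :9001
# No username is required by default
username = admin
# No password is required by default; the password is stored in plain text in the configuration file
password = 123

# Settings for the supervisord service itself
[supervisord]
# Log file path; the built-in default is $CWD/supervisord.log unless the packaged configuration sets another path such as /var/log/supervisor/supervisord.log
logfile = /var/log/supervisord.log
# Size at which the log file is rotated; default 50MB
logfile_maxbytes = 1KB
# Number of rotated log files to keep; default 10
logfile_backups = 3
# Log level; default info. One of: critical, error, warn, info, debug, trace, blather
loglevel = debug
# pid file; the built-in default is $CWD/supervisord.pid unless the packaged configuration sets another path such as /run/supervisord.pid
pidfile = /var/run/sup/sup.log
# Whether to stay in the foreground; default false (daemonize), true runs supervisord in the foreground
nodaemon = false
# Directory supervisord switches to when it daemonizes
directory = /app

# Program sections
# The file must contain one or more program sections so that supervisord knows which programs to start and control. The section header is a compound value: the word "program", a colon, then the program name.
[program:redis-6661]
# Command run when this program starts; it may be given as an absolute path
command = /app/redis/bin/redis-server /app/redis/conf/redis-6661.conf
# Process name
process_name = redis-6661
# Whether to start the program automatically when supervisord starts; default true
autostart = true
# Restart policy: false never restarts, true always restarts, unexpected restarts only when the exit code is not listed in exitcodes
autorestart = true
# Expected exit codes; only consulted when autorestart = unexpected
exitcodes = 0,2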
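How autorestart and exitcodes interact is easy to get wrong, so here is a minimal Python model of the documented behaviour. It is a sketch for reasoning about the settings, not Supervisor's implementation. The function name `should_restart` is hypothetical, and the model only covers a process that exits after it has reached the RUNNING state.

```python
def should_restart(autorestart, exit_code, exitcodes=(0,)):
    """Model the restart decision for a process that exited from RUNNING.

    autorestart: "true", "false" or "unexpected" (as written in the config).
    exit_code:   the exit status the process ended with.
    exitcodes:   the expected codes from the `exitcodes` setting.
    """
    if autorestart == "true":
        # Always restart; exitcodes is not consulted.
        return True
    if autorestart == "false":
        # Never restart automatically.
        return False
    if autorestart == "unexpected":
        # Restart only when the exit code is not an expected one.
        return exit_code not in exitcodes
    raise ValueError(f"unknown autorestart value: {autorestart!r}")


if __name__ == "__main__":
    for code in (0, 1, 2):
        print(code, should_restart("unexpected", code, exitcodes=(0, 2)))
```

With exitcodes = 0,2 and autorestart = unexpected, a clean shutdown (exit code 0 or 2) is left stopped, while any other exit code triggers a restart.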
For more parameters, see the official manual: http://www.supervisord.org/configuration.html
4. Managing a Redis cluster with supervisord
1. Prepare the environment
Six Redis instances
[root@redis conf]# ll
total 336
-rw-r--r--. 1 root root 46723 Apr 25 20:23 redis
-rw-r--r--. 1 root root 46725 Apr 25 21:01 redis-6661.conf
-rw-r--r--. 1 root root 46725 Apr 25 21:01 redis-6662.conf
-rw-r--r--. 1 root root 46725 Apr 25 21:01 redis-6663.conf
-rw-r--r--. 1 root root 46725 Apr 25 21:01 redis-6664.conf
-rw-r--r--. 1 root root 46725 Apr 25 21:01 redis-6665.conf
-rw-r--r--. 1 root root 46725 Apr 25 21:02 redis-6666.conf
Create the supervisord configuration file
[inet_http_server]
port=:19001
username=admin
password=1

[supervisorctl]
;serverurl=unix:///tmp/supervisor.sock
serverurl=http://127.0.0.1:19001
username=admin
password=1
;prompt=mysupervisor
;history_file=~/.sc_hist

[supervisord]
logfile = /var/log/supervisord.log
logfile_maxbytes = 1KB
logfile_backups = 3
loglevel = debug
pidfile = /var/run/sup/sup.log
nodaemon = false
directory = /app

[program:redis-6661]
directory=/app/redis/conf
command=/app/redis/bin/redis-server redis-6661.conf
autostart=true
autorestart=unexpected
exitcodes=0,2

[program:redis-6662]
directory=/app/redis/conf
command=/app/redis/bin/redis-server redis-6662.conf
autostart=true
autorestart=unexpected
exitcodes=0,2

[program:redis-6663]
directory=/app/redis/conf
command=/app/redis/bin/redis-server redis-6663.conf
autostart=true
autorestart=unexpected
exitcodes=0,2

[program:redis-6664]
directory=/app/redis/conf
command=/app/redis/bin/redis-server redis-6664.conf
autostart=true
autorestart=unexpected
exitcodes=0,2

[program:redis-6665]
directory=/app/redis/conf
command=/app/redis/bin/redis-server redis-6665.conf
autostart=true
autorestart=unexpected
exitcodes=0,2

[program:redis-6666]
directory=/app/redis/conf
command=/app/redis/bin/redis-server redis-6666.conf
autostart=true
autorestart=unexpected
exitcodes=0,2
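The six program sections differ only in the port number, so they can be generated rather than typed by hand. The Python sketch below is one possible way to do that. The helper names `program_section` and `program_sections` are invented for this example, and the paths are simply the ones used in the configuration above.

```python
def program_section(port,
                    conf_dir="/app/redis/conf",
                    server="/app/redis/bin/redis-server"):
    """Return one [program:redis-<port>] section as text."""
    return "\n".join([
        f"[program:redis-{port}]",
        f"directory={conf_dir}",
        f"command={server} redis-{port}.conf",
        "autostart=true",
        "autorestart=unexpected",
        "exitcodes=0,2",
    ])


def program_sections(ports):
    """Return all program sections, separated by blank lines."""
    return "\n\n".join(program_section(p) for p in ports) + "\n"


if __name__ == "__main__":
    # Ports 6661 through 6666, matching the six Redis instances.
    print(program_sections(range(6661, 6667)), end="")
```

The output can be appended to the hand-written [inet_http_server], [supervisorctl] and [supervisord] sections to form the complete file.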
Start the service
systemctl start supervisord
Error messages
BACKOFF Exited too quickly (process log may have details)
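Cause: Supervisor is designed to supervise application processes and can only manage programs that run in the foreground. A Redis instance that daemonizes itself (daemonize yes in its configuration) cannot be supervised: the process that Supervisor launched forks and exits immediately, so Supervisor reports that it exited too quickly even though Redis has in fact started.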
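To spot this problem before starting Supervisor, the Redis configuration files can be checked for the daemonize directive. The Python sketch below is an assumption-laden helper, not part of Redis or Supervisor. The function name `daemonizes` is made up, it treats only lines starting with # as comments, and it assumes the last daemonize directive in the file is the one that takes effect.

```python
def daemonizes(redis_conf_text):
    """Return True if the config text ends up with 'daemonize yes'."""
    value = "no"  # Redis runs in the foreground unless told otherwise
    for line in redis_conf_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comment lines
        parts = stripped.split()
        if len(parts) >= 2 and parts[0].lower() == "daemonize":
            value = parts[1].lower()  # assumption: the last directive wins
    return value == "yes"


if __name__ == "__main__":
    sample = "port 6661\ndaemonize yes\n"
    print(daemonizes(sample))
```

Any file for which this returns True should be changed to daemonize no so that redis-server stays in the foreground under Supervisor.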
Error: .ini file does not include supervisord section
For help, use /usr/bin/supervisord -h
Cause: the configuration file has no [supervisord] section.
Managing through the web interface
Check the background processes
[root@redis ~]# ps -ef|grep redis-server
root 7175 6671 0 21:35 ? 00:00:00 /app/redis/bin/redis-server 10.0.0.12:6662 [cluster]
root 7179 6671 0 21:35 ? 00:00:00 /app/redis/bin/redis-server 10.0.0.12:6663 [cluster]
root 7183 6671 0 21:35 ? 00:00:00 /app/redis/bin/redis-server 10.0.0.12:6661 [cluster]
root 7186 6671 0 21:35 ? 00:00:00 /app/redis/bin/redis-server 10.0.0.12:6666 [cluster]
root 7190 6671 0 21:35 ? 00:00:00 /app/redis/bin/redis-server 10.0.0.12:6664 [cluster]
root 7194 6671 0 21:35 ? 00:00:00 /app/redis/bin/redis-server 10.0.0.12:6665 [cluster]
Using supervisorctl
supervisor> help

default commands (type help <topic>):
=====================================
add    clear  fg        open  quit    remove  restart   start   stop  update
avail  exit   maintail  pid   reload  reread  shutdown  status  tail  version

supervisor> stop redis-6662
redis-6662: stopped
supervisor> status
redis-6661    RUNNING   pid 9073, uptime 0:01:53
redis-6662    STOPPED   Apr 25 10:04 PM
redis-6663    RUNNING   pid 9072, uptime 0:01:53
redis-6664    RUNNING   pid 9075, uptime 0:01:53
redis-6665    RUNNING   pid 9076, uptime 0:01:52
redis-6666    RUNNING   pid 9074, uptime 0:01:53
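Note: shutdown stops the supervisord service itself, not just the managed programs.
For more command-line usage, see the official documentation: http://www.supervisord.org/running.html#supervisorctl-actions
Error 1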
http://localhost:9001 refused connection
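Solution: 1. Check the listen address; supervisorctl connects to localhost:9001 by default, so it must match the port set in [inet_http_server]. 2. Check that the supervisord service is running normally.
Error 2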
[root@redis ~]# supervisorctl
Sorry, supervisord responded but did not recognize the supervisor namespace commands that supervisorctl uses to control it. Please check that the [rpcinterface:supervisor] section is enabled in the configuration file (see sample.conf).
Solution: add the following section to the configuration file
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
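The [rpcinterface:supervisor] section enables the XML-RPC interface that supervisorctl relies on, and the same interface can be called from a script. The Python sketch below uses the standard library's xmlrpc.client and Supervisor's getAllProcessInfo method. The helper names `format_status` and `fetch_status` are invented for this example, and the URL, username and password are assumptions taken from the [inet_http_server] settings used earlier in this article.

```python
from xmlrpc.client import ServerProxy


def format_status(infos):
    """Turn process-info dicts into 'name STATE' lines."""
    return [f"{info['name']} {info['statename']}" for info in infos]


def fetch_status(url):
    """Query supervisord over XML-RPC and return formatted status lines.

    Needs a running supervisord with [inet_http_server] and
    [rpcinterface:supervisor] enabled.
    """
    proxy = ServerProxy(url)
    return format_status(proxy.supervisor.getAllProcessInfo())


# Example call (assumed address and credentials, adjust to your setup):
#   for line in fetch_status("http://admin:1@127.0.0.1:19001/RPC2"):
#       print(line)
```

Keeping the formatting separate from the network call makes the script easy to try out without a live supervisord.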