Recovery steps after the cluster servers reboot


1. master: no action required

2. client:

  // Every node needs its permanent mounts configured first: vi /etc/fstab

  mount -a  // mount everything listed in /etc/fstab

  systemctl restart munge

  systemctl restart ypbind

  systemctl restart slurmd
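
The client steps above can be wrapped in a small script so they run the same way on every node. This is only a sketch: the function names run and recover_client and the DRY_RUN switch are made up for illustration, and it assumes the script is run as root with /etc/fstab already containing the permanent mounts.

```shell
#!/bin/sh
# Sketch of the client-side recovery steps; run as root on each client node.
# DRY_RUN=1 prints the commands instead of executing them (hypothetical switch).

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

recover_client() {
    # Mount everything listed in /etc/fstab.
    run mount -a || return 1
    # Restart the services in the same order as the notes above.
    for svc in munge ypbind slurmd; do
        run systemctl restart "$svc" || return 1
    done
}
```

Running "DRY_RUN=1 recover_client" first shows what would be executed; a plain "recover_client" then performs the steps and stops at the first command that fails.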

3. master:

  Run sinfo to check the node states; any node shown as unknown or in another abnormal state needs an update

  scontrol update Node=client[01-05] STATE=RESUME
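
To avoid reading the sinfo output by eye, the check can be scripted. The sketch below is one possible approach, not part of the original procedure: the function name nodes_needing_resume is made up, and treating only "down" and "unknown" as states to resume is an assumption that should be adjusted to the site's policy. It expects "name state" lines such as those printed by sinfo -N -h -o '%N %T'.

```shell
#!/bin/sh
# Print the nodes whose state is "down" or "unknown" (assumed set of states).
# Input: one "nodename state" pair per line, e.g. from: sinfo -N -h -o '%N %T'
nodes_needing_resume() {
    awk '{
        state = $2
        sub(/[^a-z]+$/, "", state)   # drop trailing flag characters such as "*"
        if (state == "down" || state == "unknown") print $1
    }' | sort -u                     # a node is listed once per partition
}

# Example use on the master (left commented out so nothing runs by accident).
# NodeName is the keyword as spelled out in the scontrol man page.
# sinfo -N -h -o '%N %T' | nodes_needing_resume | while read -r node; do
#     scontrol update NodeName="$node" State=RESUME
# done
```

Printing the list first and reviewing it before uncommenting the loop keeps nodes that were taken offline on purpose from being resumed by mistake.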