ironic-inspector: the hardware introspection service for ironic


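Introduction

ironic-inspector is an auxiliary service for hardware introspection. It inspects bare metal nodes managed by ironic by booting an in-memory system (a ramdisk) on the node, discovers hardware details such as CPU count and model, memory size, disk count and model, and PCI devices, and finally records this information in the ironic database.

ironic-inspector extends ironic's ability to discover the hardware of bare metal nodes. Without it, the node information held by ironic comes from manual user input, which is both inefficient and error-prone. With ironic-inspector working together with IPA (Ironic Python Agent), in principle any hardware detail the agent can read on the node can be discovered automatically.

The introspection sequence is shown in the following diagram: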

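sequenceDiagram
    ironic->>ironic inspector: Send introspection request, /v1/introspection/{node}
    ironic inspector-->>ironic: HTTP 202, accepted
    ironic inspector->>ironic inspector: Check node state, configure PXE
    ironic inspector->>bare metal node: Reboot node, wait for callback
    bare metal node->>bare metal node: Boot from ramdisk, collect hardware information
    bare metal node->>ironic inspector: Return collected data
    ironic inspector->>ironic inspector: Process data
    ironic inspector->>ironic: Update node properties and create missing ironic ports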

The detailed introspection flow is as follows:

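  1. The bare metal node is enrolled and is in the manageable state.

  2. The introspection API of ironic-inspector is called through the API or the CLI.

  3. ironic-inspector receives the request and starts introspection:

    1. Check the node's current power state, provision state, and so on.
    2. Allow the node to use the PXE boot service.
    3. Send a reboot command to the node so that it boots from the ramdisk.

  4. The ramdisk collects the required information and sends it back to ironic-inspector.

  5. ironic-inspector receives the data from the ramdisk and processes it:

    1. Validate the received data.
    2. Look up the node's UUID by its BMC address.
    3. Fill in the node's properties from the data and create any missing ironic ports.

  6. The node is set back to the manageable state and is ready to be managed.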
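
Step 2 of the flow can be sketched in Python. This is a minimal illustration, not the client that ships with ironic-inspector: it only builds the POST request for the /v1/introspection/{node} endpoint shown in the diagram. The base URL (port 5050) and the X-Auth-Token header are assumptions about the deployment, and build_introspection_request is a hypothetical helper name.

```python
import urllib.request

# Assumption: address of the ironic-inspector API in this deployment.
INSPECTOR_URL = "http://127.0.0.1:5050"


def build_introspection_request(node, token=None, base_url=INSPECTOR_URL):
    """Build (but do not send) the POST request that starts introspection."""
    headers = {"Accept": "application/json"}
    if token:
        # Assumption: the deployment authenticates with a Keystone token.
        headers["X-Auth-Token"] = token
    return urllib.request.Request(
        f"{base_url}/v1/introspection/{node}",
        data=b"",
        headers=headers,
        method="POST",
    )


req = build_introspection_request("bm-10")
print(req.get_method(), req.full_url)
```

Sending the request with urllib.request.urlopen(req) would be expected to return HTTP 202 when the request is accepted, matching the second arrow of the sequence diagram.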

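Concepts

Bare metal node states

A bare metal node state is the state that ironic assigns to a node in order to determine which operations can be performed on it. The state machine is described in: Bare Metal State Machine (ironic 18.1.1.dev3 documentation, openstack.org).

ironic-inspector requires a node to be in the manageable state before introspection can start, and the node returns to manageable once introspection completes. The states a node passes through during introspection are shown below.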

stateDiagram-v2
    [*]-->enroll
    enroll-->verifying: manage (via API)
    verifying-->manageable: done
    verifying-->enroll: fail
    manageable-->inspecting: inspect (via API)
    inspecting-->inspect_wait: wait
    inspect_wait-->manageable: done
    inspecting-->manageable: done
    inspecting-->inspect_failed: fail
    inspect_wait-->inspect_failed: fail
    inspect_wait-->inspect_failed: abort (via API)
    inspect_failed-->manageable: manage (via API)
    inspect_failed-->inspecting: inspect (via API)
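
The diagram can also be read as a transition table. The following Python sketch encodes exactly the transitions listed in the diagram and replays a sequence of events; it is an illustration for reasoning about allowed operations, not ironic's own state machine code, and the names TRANSITIONS and walk are made up for this example.

```python
# (current state, event) -> next state, taken from the state diagram.
TRANSITIONS = {
    ("enroll", "manage"): "verifying",
    ("verifying", "done"): "manageable",
    ("verifying", "fail"): "enroll",
    ("manageable", "inspect"): "inspecting",
    ("inspecting", "wait"): "inspect_wait",
    ("inspecting", "done"): "manageable",
    ("inspecting", "fail"): "inspect_failed",
    ("inspect_wait", "done"): "manageable",
    ("inspect_wait", "fail"): "inspect_failed",
    ("inspect_wait", "abort"): "inspect_failed",
    ("inspect_failed", "manage"): "manageable",
    ("inspect_failed", "inspect"): "inspecting",
}


def walk(state, events):
    """Apply events in order and return the final state."""
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} is not allowed in state {state!r}")
        state = TRANSITIONS[key]
    return state


# A successful introspection starts and ends in manageable.
print(walk("manageable", ["inspect", "wait", "done"]))
```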

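Introspection rules

ironic-inspector can run simple rules during introspection. Rules are written in JSON and managed through a dedicated API; they run after all hook functions have finished.

A rule consists of two parts: conditions and actions. If the introspection data satisfies the conditions, the actions are run on the node.

Example rule: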

{
    "description": "...",
    "actions": [...],
    "conditions": [...],
    "scope": "SCOPE"
}

Conditions

Example conditions:

{"field": "data://inventory.cpu.architecture", "op": "eq", "value": "x86_64"}
{"field": "node://properties.cpus", "op": "eq", "value": "16"}

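The two lines above are two separate conditions. A condition consists of the following fields:

  • field: the field to compare, written as a JSON path. A data:// or node:// prefix selects whether the value comes from the introspection data or from the node's attributes; without a prefix, the introspection data is used
  • op: the comparison operator. The allowed operators are:
    • eq, le, ge, ne, lt, gt: basic comparison operators
    • in-net: checks whether an IP address is in the given network
    • matches: full regular expression match
    • contains: partial regular expression match
    • is-empty: checks whether the field is empty
  • value: the value to compare against
  • invert: whether to invert the result of the comparison, a boolean
  • multiple: how to handle the case where field resolves to a list. The options are:
    • any: passes if any item matches (the default)
    • all: passes only if every item matches
    • first: passes if the first item matches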
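
To make the semantics of these fields concrete, here is a small Python sketch that evaluates one condition. It is not the rule engine of ironic-inspector: it assumes a simple dotted path instead of a full JSON path, treats "", None, [] and {} as empty, and the names OPERATORS, lookup and check are made up for this example.

```python
import ipaddress
import operator
import re

OPERATORS = {
    "eq": operator.eq, "ne": operator.ne,
    "lt": operator.lt, "le": operator.le,
    "gt": operator.gt, "ge": operator.ge,
    "in-net": lambda v, net: ipaddress.ip_address(v) in ipaddress.ip_network(net),
    "matches": lambda v, pat: re.fullmatch(pat, str(v)) is not None,
    "contains": lambda v, pat: re.search(pat, str(v)) is not None,
    "is-empty": lambda v, _unused: v in ("", None, [], {}),
}


def lookup(field, data, node):
    """Resolve 'data://a.b' or 'node://a.b'; no prefix means introspection data."""
    scheme, sep, path = field.partition("://")
    if not sep:
        scheme, path = "data", field
    current = data if scheme == "data" else node
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None  # a missing field is treated as empty
        current = current[key]
    return current


def check(cond, data, node):
    """Return True if the condition holds for the given data and node."""
    found = lookup(cond["field"], data, node)
    # A list is compared item by item, except when testing for emptiness.
    if isinstance(found, list) and cond["op"] != "is-empty":
        values = found
    else:
        values = [found]
    op = OPERATORS[cond["op"]]
    results = [op(v, cond.get("value")) for v in values]
    multiple = cond.get("multiple", "any")
    if multiple == "all":
        ok = all(results)
    elif multiple == "first":
        ok = bool(results) and results[0]
    else:
        ok = any(results)
    return ok != cond.get("invert", False)


data = {"inventory": {"cpu": {"architecture": "x86_64"}}}
node = {"properties": {"cpus": "16"}}
cond = {"field": "data://inventory.cpu.architecture", "op": "eq", "value": "x86_64"}
print(check(cond, data, node))
```

Keeping the operators in a dictionary mirrors the idea that the set of operators is extensible: a new operator is one more entry.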

Actions

Example action:

{"action": "set-attribute", "path": "/driver_info/ipmi_address", "value": "{data[inventory][bmc_address]}"}

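An action consists of the following fields:

  • action: the action to run. The options are:
    • set-attribute: sets an attribute on the node; requires the path and value fields
    • set-capability: sets an entry in the node's properties.capabilities; requires the name and value fields
    • extend-attribute: the same as set-attribute, but treats the attribute as a list and appends the value to it if the attribute already exists. If the optional unique field is set to True, nothing is added when the value is already in the list
    • add-trait: adds a trait to the node; requires the name field
    • remove-trait: removes a trait from the node; requires the name field
    • fail: marks introspection as failed; requires the message field, which carries the failure message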
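
The value in the example action contains a placeholder that looks like Python format-string syntax. Assuming it is expanded with str.format against the introspection data, the handling of set-attribute and fail can be sketched as follows. The helper names render and to_patch and the JSON-patch-like output are illustrative assumptions, not the actual implementation.

```python
def render(value, data):
    """Expand '{data[...]}' placeholders; non-string values pass through."""
    return value.format(data=data) if isinstance(value, str) else value


def to_patch(action, data):
    """Translate one action into a JSON-patch-like dict (illustrative only)."""
    kind = action["action"]
    if kind == "set-attribute":
        return {
            "op": "add",
            "path": action["path"],
            "value": render(action["value"], data),
        }
    if kind == "fail":
        # 'fail' aborts processing with the given message.
        raise RuntimeError(render(action["message"], data))
    raise NotImplementedError(f"action {kind!r} is not covered by this sketch")


data = {"inventory": {"bmc_address": "10.33.45.10"}}
action = {
    "action": "set-attribute",
    "path": "/driver_info/ipmi_address",
    "value": "{data[inventory][bmc_address]}",
}
print(to_patch(action, data))
```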

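Scope

By default, introspection rules apply to every node that is inspected. To restrict a rule to specific nodes, use the scope field. It takes effect only when it is set on both the rule and the node.

Set the inspection_scope property on the node: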

baremetal node set --property inspection_scope="SCOPE" <node>

The scope field is rarely needed, and its use cases overlap somewhat with those of conditions.
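
The matching logic described here can be written down in a few lines of Python. This is a sketch of the behaviour as described in this section (a rule without scope applies everywhere; a scoped rule needs a matching inspection_scope on the node), and rule_applies is a hypothetical helper name.

```python
def rule_applies(rule, node_properties):
    """Return True if the rule should run for a node with these properties."""
    scope = rule.get("scope")
    if scope is None:
        return True  # unscoped rules apply to every node
    return node_properties.get("inspection_scope") == scope


print(rule_applies({"scope": "SCOPE"}, {"inspection_scope": "SCOPE"}))
```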

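Plugins

Plugins are a key part of ironic-inspector: it uses them to process the introspection data and to update the node with the results.

Each plugin provides two hook functions, before_processing and before_update:

  • before_processing: runs before the introspection data is processed; mainly used to correct the introspection data
  • before_update: runs before the bare metal node is updated; mainly used to describe how the node's properties should be updated (the update is applied once, after all hook functions have run)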
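
The two-phase hook model can be illustrated with a short Python sketch. The classes and the process function below are hypothetical and do not use the real ironic-inspector base class or signatures; the inventory.cpu.count key is likewise an assumption about the data layout. The point is only the ordering: every before_processing runs first, then every before_update, and the collected updates are applied together.

```python
class NormalizeArchHook:
    """Hypothetical plugin: fixes up the data, then reports cpu_arch."""

    def before_processing(self, data):
        cpu = data.setdefault("inventory", {}).setdefault("cpu", {})
        if cpu.get("architecture") == "amd64":
            cpu["architecture"] = "x86_64"

    def before_update(self, data, updates):
        updates["cpu_arch"] = data["inventory"]["cpu"]["architecture"]


class CpuCountHook:
    """Hypothetical plugin: only contributes a node property."""

    def before_processing(self, data):
        pass  # nothing to correct

    def before_update(self, data, updates):
        updates["cpus"] = str(data["inventory"]["cpu"]["count"])


def process(data, hooks):
    """Run all before_processing hooks, then all before_update hooks."""
    for hook in hooks:
        hook.before_processing(data)
    updates = {}
    for hook in hooks:
        hook.before_update(data, updates)
    return updates  # would be written to the node in a single update


sample = {"inventory": {"cpu": {"architecture": "amd64", "count": 24}}}
print(process(sample, [NormalizeArchHook(), CpuCountHook()]))
```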

The plugins that ship with ironic-inspector are:

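  1. RamdiskErrorHook: reports errors coming from the ramdisk

    • before_processing: checks whether the ramdisk reported an error and raises an exception if it did

  2. RootDiskSelectionHook: selects the root disk using ironic's root_device field. This hook must run before SchedulerHook, otherwise the root_disk field is not updated

    • before_update: determines the boot disk (using the node's root_device property if it is already set) and updates the node's local_gb property

  3. SchedulerHook: checks and updates the basic properties used for node scheduling, such as CPU count and architecture, memory size, and disk size

    • before_update: updates the node's cpus, cpu_arch, and memory_mb properties

  4. ValidateInterfacesHook: validates network interface information, creates new ironic ports, deletes ironic ports that are absent from the introspection data, and sets the pxe_enabled flag on these ports as appropriate

    • before_processing

      1. Get the IPMI address: inventory.bmc_address/bmc_v6address --> ipmi_address/ipmi_v6address
      2. Get all usable interfaces (not loopback and physically present): inventory.interfaces --> all_interfaces, interfaces; logged at info level
      3. Get all usable MAC addresses: macs
    • before_update: creates and deletes ironic ports so that they match the actual hardware

  5. CapabilitiesHook: detects the capabilities of the bare metal machine, including boot mode, CPU flags, and so on

    • before_update: sets the boot mode to bios or uefi and updates capabilities

  6. PciDevicesHook: identifies the models and counts of the PCI devices on the bare metal machine

    • before_update:

  7. local_link_connection: processes the mandatory fields of LLDP packets; used to write the port_id and switch_id fields of local_link_connection_info to the ironic port

    • before_update: updates the ironic port with LLDP link information

  8. LLDPBasicProcessingHook: processes the LLDP data in the introspection data; purpose to be determined

    • before_update: updates the ironic port with LLDP link information (switch_port_mau_type)

  9. RaidDeviceDetection: handles the root device after a RAID has been created

    • before_processing: if the data contains no local_gb, sets it to 1
    • before_update: adds the extra/disks and extra/block_devices properties to the node

  10. AccelDevicesHook: distinguishes between different accelerator devices

    • before_update: adds the accelerator device properties to the node; if there are none, adds the property accelerator_has_gpu:false

  11. ExampleProcessingHook: logs the input and output of the introspection data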

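Discovery

ironic-inspector can enroll new bare metal nodes in ironic. When it receives introspection data from a node that it cannot identify, it calls the enroll_node_not_found_hook function to enroll the node in ironic.

To enable discovery, set node_not_found_hook to enroll in the ironic-inspector configuration file, and set the enroll_node_driver and enroll_node_fields options.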

[processing]
node_not_found_hook = enroll
[discovery]
enroll_node_driver = ipmi
# Fields to set on the bare metal node when it is enrolled
enroll_node_fields = management_interface:ipmitool,resource_class:baremetal
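
As a quick sanity check of such a configuration, the snippet below parses the same options with Python's standard configparser and splits enroll_node_fields into key/value pairs. ironic-inspector itself does not read its configuration this way; this is only a sketch for inspecting the file, and discovery_settings is a hypothetical helper name.

```python
import configparser

SAMPLE = """
[processing]
node_not_found_hook = enroll
[discovery]
enroll_node_driver = ipmi
enroll_node_fields = management_interface:ipmitool,resource_class:baremetal
"""


def discovery_settings(text):
    """Return the discovery settings, or None if discovery is not enabled."""
    cfg = configparser.ConfigParser()
    cfg.read_string(text)
    if cfg.get("processing", "node_not_found_hook", fallback="") != "enroll":
        return None
    fields = cfg.get("discovery", "enroll_node_fields", fallback="")
    return {
        "driver": cfg.get("discovery", "enroll_node_driver", fallback=None),
        # "k1:v1,k2:v2" -> {"k1": "v1", "k2": "v2"}
        "fields": dict(item.split(":", 1) for item in fields.split(",") if item),
    }


print(discovery_settings(SAMPLE))
```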

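After enroll_node_not_found_hook has been called, ironic-inspector processes the newly enrolled node in the same way as any other node. You may therefore also need introspection rules that add fields such as ipmi_username and deploy_kernel to the new node.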

[{
    "description": "Set IPMI driver_info if no credentials",
    "actions": [
        {"action": "set-attribute", "path": "driver", "value": "ipmi"},
        {"action": "set-attribute", "path": "driver_info/ipmi_username",
         "value": "username"},
        {"action": "set-attribute", "path": "driver_info/ipmi_password",
         "value": "password"}
    ],
    "conditions": [
        {"op": "is-empty", "field": "node://driver_info.ipmi_password"},
        {"op": "is-empty", "field": "node://driver_info.ipmi_username"}
    ]
},{
    "description": "Set deploy info if not already set on node",
    "actions": [
        {"action": "set-attribute", "path": "driver_info/deploy_kernel",
         "value": ""},
        {"action": "set-attribute", "path": "driver_info/deploy_ramdisk",
         "value": ""}
    ],
    "conditions": [
        {"op": "is-empty", "field": "node://driver_info.deploy_ramdisk"},
        {"op": "is-empty", "field": "node://driver_info.deploy_kernel"}
    ]
}]

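The enroll_node_not_found_hook function also adds an auto_discovered flag to the introspection data. The flag distinguishes automatically discovered nodes from manually enrolled ones, so introspection rules can use it to perform specific operations. The following rule sets the node's driver to ipmi if the flag is present.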

{
    "description": "Enroll auto-discovered nodes with ipmi hardware type",
    "actions": [
        {"action": "set-attribute", "path": "driver", "value": "ipmi"}
    ],
    "conditions": [
        {"op": "eq", "field": "data://auto_discovered", "value": true}
    ]
}

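Code analysis

Sending the introspection request:

  1. main.py, api_introspection(node_id): receives the introspection request, starts introspection, and sends the data to the message queue (MQ)
  2. conductor/manager.py, do_introspection(self, context, node_id, token=None, manage_boot=True): receives the data from the MQ
  3. introspect.py, introspect(node_id, manage_boot=True, token=None): fetches the node's BMC information through the ironic API
  4. node_cache.py, start_introspection(uuid, **kwargs): sets the introspection state to start; if manage_boot is True, calls _do_introspect
  5. introspect.py, _do_introspect(node_info, ironic): configures the PXE service, sets the node to boot from PXE, and reboots the node

After the ramdisk calls back:

  1. main.py, api_continue(): receives the request from IPA and sends the data to the MQ
  2. conductor/manager.py, ConductorManager.do_continue(self, context, data): receives the data from the MQ and calls process.process(data) to start processing
  3. process.py, process(introspection_data): processes the data from the ramdisk
    1. process.py, _run_pre_hooks(introspection_data, failures): calls the before_processing method of every plugin
    2. Logs the message: Matching node is %s
    3. process.py, _process_node(node_info, node, introspection_data)
      1. process.py, _run_post_hooks(node_info, introspection_data): calls the before_update method of every plugin
      2. process.py, store_introspection_data(node_uuid, data, processed=True): writes the introspection data to the storage backend
      3. Powers off the bare metal machine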

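Usage examples

There are two ways to interact with ironic-inspector: directly, or indirectly through ironic. Both are available through the CLI and through the HTTP API.

The following examples use the bare metal node bm-10 to show the CLI.

Introspection

Before starting introspection, make sure the bare metal node is in the manageable state.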

# openstack baremetal node list
+--------------------------------------+-------+---------------+-------------+--------------------+-------------+
| UUID                                 | Name  | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+-------+---------------+-------------+--------------------+-------------+
| a796c7e4-387c-47e0-bab7-e1f56621d4d0 | bm-10 | None          | power off   | manageable         | False       |
+--------------------------------------+-------+---------------+-------------+--------------------+-------------+

Start introspection through ironic. During introspection the node's state changes as follows: manageable --> inspecting --> inspect wait --> manageable.

# openstack baremetal node inspect bm-10

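Alternatively, start introspection directly through ironic-inspector. In this case the node's state does not change during introspection.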

# openstack baremetal introspection start bm-10

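After introspection completes, the node's extra field and properties have both been updated with a lot of new content, and the ironic ports have been updated to match the actual number of network interfaces on the node.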

# openstack baremetal node show bm-10
+------------------------+-----------------------------------------------------------------------------------------+
| Field                  | Value                                                                                   |
+------------------------+-----------------------------------------------------------------------------------------+
| chassis_uuid           | None                                                                                    |
| console_enabled        | False                                                                                   |
| created_at             | 2021-07-26T08:09:41+00:00                                                               |
| driver                 | ipmi                                                                                    |
| driver_info            | {u'ipmi_port': 623, u'ipmi_username': u'admin', u'deploy_kernel': u'3203927a-04c3-488c- |
|                        | bf60-bbcc42be8c86', u'ipmi_address': u'10.33.45.10', u'deploy_ramdisk':                 |
|                        | u'0372afc2-65bf-4462-9b81-5f1ab4d63fa2', u'ipmi_password': u'******'}                   |
| driver_internal_info   | {}                                                                                      |
| extra                  | {u'disks': u'[{"rotational": true, "vendor": "ATA", "name": "/dev/sda",                 |
|                        | "wwn_vendor_extension": null, "wwn_with_extension": "0x5000cca25dcff84e", "model":      |
|                        | "HGST HUS726040AL", "wwn": "0x5000cca25dcff84e", "serial": "K4H442HB", "size":          |
|                        | 4000787030016}, {"rotational": true, "vendor": "ATA", "name": "/dev/sdb",               |
|                        | "wwn_vendor_extension": null, "wwn_with_extension": "0x5000cca25dcff039", "model":      |
|                        | "HGST HUS726040AL", "wwn": "0x5000cca25dcff039", "serial": "K4H41XSB", "size":          |
|                        | 4000787030016}, {"rotational": true, "vendor": "ATA", "name": "/dev/sdc",               |
|                        | "wwn_vendor_extension": null, "wwn_with_extension": "0x5000cca25dcff84d", "model":      |
|                        | "HGST HUS726040AL", "wwn": "0x5000cca25dcff84d", "serial": "K4H442GB", "size":          |
|                        | 4000787030016}]', u'system_vendor': u'{"serial_number": "HIK096396264-B",               |
|                        | "product_name": "DS-VH2203X4-EBE/2", "manufacturer": "OEM"}', u'block_devices':         |
|                        | {u'serials': [u'K4H442HB', u'K4H41XSB', u'K4H442GB']}, u'last_inspect_status':          |
|                        | u'success', u'mac_address': u'0c:c4:7a:e2:27:a2', u'cpu': u'{"count": 24, "socket": 2,  |
|                        | "frequency": "3200.0000", "flags": ["fpu", "vme", "de", "pse", "tsc", "msr", "pae",     |
|                        | "mce", "cx8", "apic", "sep", "mtrr", "pge", "mca", "cmov", "pat", "pse36", "clflush",   |
|                        | "dts", "acpi", "mmx", "fxsr", "sse", "sse2", "ss", "ht", "tm", "pbe", "syscall", "nx",  |
|                        | "pdpe1gb", "rdtscp", "lm", "constant_tsc", "arch_perfmon", "pebs", "bts", "rep_good",   |
|                        | "nopl", "xtopology", "nonstop_tsc", "aperfmperf", "eagerfpu", "pni", "pclmulqdq",       |
|                        | "dtes64", "monitor", "ds_cpl", "vmx", "smx", "est", "tm2", "ssse3", "sdbg", "fma",      |
|                        | "cx16", "xtpr", "pdcm", "pcid", "dca", "sse4_1", "sse4_2", "x2apic", "movbe", "popcnt", |
|                        | "aes", "xsave", "avx", "f16c", "rdrand", "lahf_lm", "abm", "epb", "invpcid_single",     |
|                        | "intel_ppin", "ssbd", "ibrs", "ibpb", "tpr_shadow", "vnmi", "flexpriority", "ept",      |
|                        | "vpid", "fsgsbase", "tsc_adjust", "bmi1", "avx2", "smep", "bmi2", "erms", "invpcid",    |
|                        | "cqm", "xsaveopt", "cqm_llc", "cqm_occup_llc", "dtherm", "ida", "arat", "pln", "pts",   |
|                        | "md_clear"], "architecture": "x86_64", "model_name": "Intel(R) Xeon(R) CPU E5-2620 v3 @ |
|                        | 2.40GHz"}'}                                                                             |
| inspection_finished_at | None                                                                                    |
| inspection_started_at  | 2021-07-26T08:34:45+00:00                                                               |
| instance_info          | {}                                                                                      |
| instance_uuid          | None                                                                                    |
| last_error             | None                                                                                    |
| maintenance            | False                                                                                   |
| maintenance_reason     | None                                                                                    |
| name                   | bm-10                                                                                   |
| ports                  | [{u'href': u'http://ironic.openstack.svc.cluster.local:10080//v1/nodes/a796c7e4-387c-   |
|                        | 47e0-bab7-e1f56621d4d0/ports', u'rel': u'self'}, {u'href':                              |
|                        | u'http://ironic.openstack.svc.cluster.local:10080//nodes/a796c7e4-387c-                 |
|                        | 47e0-bab7-e1f56621d4d0/ports', u'rel': u'bookmark'}]                                    |
| power_state            | power off                                                                               |
| properties             | {u'cpu_arch': u'x86_64', u'vendor': u'intel', u'cpus': u'24', u'capabilities': u'cpu_hu |
|                        | gepages:true,cpu_txt:true,accelerator_has_gpu:false,cpu_vt:true,cpu_aes:true,cpu_hugepa |
|                        | ges_1g:true', u'memory_mb': u'65536', u'local_gb': u'3725'}                             |
| provision_state        | manageable                                                                              |
| provision_updated_at   | 2021-07-26T08:40:13+00:00                                                               |
| reservation            | None                                                                                    |
| target_power_state     | None                                                                                    |
| target_provision_state | None                                                                                    |
| updated_at             | 2021-07-26T08:40:13+00:00                                                               |
| uuid                   | a796c7e4-387c-47e0-bab7-e1f56621d4d0                                                    |
+------------------------+-----------------------------------------------------------------------------------------+

# openstack baremetal port list --node bm-10
+--------------------------------------+-------------------+
| UUID                                 | Address           |
+--------------------------------------+-------------------+
| 552673a2-1c96-4f5f-8b65-25d45d3a4325 | 0c:c4:7a:e2:27:a2 |
| 30ecf7bb-6ce2-412d-8e45-91973edb22ea | a0:36:9f:d8:18:77 |
| bbbb0e47-2aa4-4f4c-bd34-ad7c177c49d3 | a0:36:9f:d8:18:76 |
| 2358f8ed-d16a-4576-8d6d-c89c9eadc7a6 | 0c:c4:7a:e2:27:a3 |
+--------------------------------------+-------------------+

List the introspection status of all nodes

# openstack baremetal introspection list
+--------------------------------------+---------------------+---------------------+-------+
| UUID                                 | Started at          | Finished at         | Error |
+--------------------------------------+---------------------+---------------------+-------+
| a796c7e4-387c-47e0-bab7-e1f56621d4d0 | 2021-07-26T08:34:46 | 2021-07-26T08:39:22 | None  |
+--------------------------------------+---------------------+---------------------+-------+

Get the introspection status of a specific node

# openstack baremetal introspection status bm-10
+-------------+--------------------------------------+
| Field       | Value                                |
+-------------+--------------------------------------+
| error       | None                                 |
| finished    | True                                 |
| finished_at | 2021-07-26T08:39:22                  |
| started_at  | 2021-07-26T08:34:46                  |
| state       | finished                             |
| uuid        | a796c7e4-387c-47e0-bab7-e1f56621d4d0 |
+-------------+--------------------------------------+
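
Since introspection takes several minutes (about five in the run above), scripts usually poll the status until finished is True. The following Python sketch shows one way to do that. It takes a get_status callable so that the transport (CLI, HTTP, or a client library) stays out of the example; the function name, timeout, and interval are assumptions, while the finished and error fields match the status output above.

```python
import time


def wait_for_introspection(get_status, timeout=1800, interval=10, sleep=time.sleep):
    """Poll get_status() until introspection finishes, fails, or times out."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status["finished"]:
            if status["error"]:
                raise RuntimeError(f"introspection failed: {status['error']}")
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("introspection did not finish in time")
        sleep(interval)


# Demonstration with canned responses instead of a real API call.
responses = iter([
    {"finished": False, "error": None, "state": "waiting"},
    {"finished": True, "error": None, "state": "finished"},
])
result = wait_for_introspection(lambda: next(responses), sleep=lambda _s: None)
print(result["state"])
```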

Get the introspection data

This command retrieves the introspection data that IPA sent back.

# openstack baremetal introspection data save bm-10 --file /tmp/inspector-data.json
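
The saved file is plain JSON, so it can be examined with a few lines of Python. The sketch below pulls out a handful of values using the inventory keys mentioned earlier in this article (inventory.cpu.architecture, inventory.bmc_address, inventory.interfaces). The mac_address key inside each interface entry is an assumption about the data layout, and summarize and load_and_summarize are hypothetical helper names.

```python
import json


def summarize(data):
    """Extract a few commonly needed values from introspection data."""
    inventory = data.get("inventory", {})
    return {
        "cpu_arch": inventory.get("cpu", {}).get("architecture"),
        "bmc_address": inventory.get("bmc_address"),
        # Assumption: each interface entry carries a 'mac_address' key.
        "macs": sorted(
            i["mac_address"]
            for i in inventory.get("interfaces", [])
            if i.get("mac_address")
        ),
    }


def load_and_summarize(path):
    """Read a file written by 'introspection data save' and summarize it."""
    with open(path) as f:
        return summarize(json.load(f))


example = {
    "inventory": {
        "cpu": {"architecture": "x86_64"},
        "bmc_address": "10.33.45.10",
        "interfaces": [
            {"name": "eth1", "mac_address": "0c:c4:7a:e2:27:a3"},
            {"name": "eth0", "mac_address": "0c:c4:7a:e2:27:a2"},
        ],
    }
}
print(summarize(example))
```

For the file saved above, the call would be load_and_summarize("/tmp/inspector-data.json").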

Abort introspection

This command aborts a running introspection and immediately returns the bare metal node to the manageable state.

# openstack baremetal introspection abort bm-10

Other

In addition, ironic-inspector provides some less commonly used commands:

  • Reprocess the introspection data: openstack baremetal introspection reprocess NODE_ID
  • Create introspection rules: openstack baremetal introspection rule import
  • List all introspection rules: openstack baremetal introspection rule list
  • Delete all introspection rules: openstack baremetal introspection rule purge
  • Delete a specific introspection rule: openstack baremetal introspection rule delete
  • List all network interfaces of a bare metal node (similar to ironic ports): openstack baremetal introspection interface list NODE_IDENT
  • Show the details of one network interface of a bare metal node: openstack baremetal introspection interface show NODE_IDENT INTERFACE

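Extensibility

ironic-inspector is highly extensible. Introspection rules let users customize introspection behaviour for their own environment. All introspection processing functions are provided as plugins, so if the built-in functions are not sufficient, adding or changing functionality takes only a small amount of code.

Introspection rules come with a set of operators and actions. These built-in options live in plugins/rules.py and can be extended there if they do not meet your needs.

All plugins and their processing functions live in the plugins directory and can be extended there if they do not meet your needs.

In addition, the ironic port creation behaviour, the enabled processing functions, the node discovery settings, and so on are all defined in the configuration file and can be changed as needed.

References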

Hardware introspection for OpenStack Bare Metal — ironic-inspector 10.7.0.dev19 documentation