Out-of-band management with Redfish and Ansible

Fully automate systems management tasks from one central location to significantly reduce complexity and improve IT administrators' productivity.

In this article, I'll explain how Redfish and Ansible can be used together to fully automate, at large scale, systems management tasks from one central location, significantly reducing complexity and helping improve the productivity of IT administrators.

Redfish is an open, industry-standard specification published by the Distributed Management Task Force (DMTF) for modern and secure management of platform hardware. On Dell EMC PowerEdge servers, the Redfish management APIs are available via the integrated Dell Remote Access Controller (iDRAC), an out-of-band management controller used to remotely manage all hardware components on a server. IT administrators can use the Redfish APIs on an iDRAC to perform all lifecycle management tasks. And because each API is accessed by sending an HTTPS request to a uniform resource identifier (URI), administrators can choose from different tools, such as a command-line interface (CLI) or web browser, and connect from any device, such as a laptop or a mobile device.

Ansible is an open source automation engine used to run tasks such as installing software and configuring applications. It is a one-to-many, agentless mechanism in which repetitive deployment tasks can be invoked and monitored from one control machine. Because all instructions are specified in YAML or JSON files, Ansible is often easier to learn than shell scripting and easier for IT staff with diverse technical backgrounds to adopt. And because it is agentless, requiring no additional software on the managed hosts, Ansible is generally simpler to install and configure than agent-based configuration-management tools.

Out-of-band management via iDRAC

Legacy protocols such as IPMI, SNMP, and WS-MAN have been supported in iDRAC for several server generations. In addition to these protocols, the tool RACADM (part of Dell EMC OpenManage) is easy to use and is a good option for iDRAC management. However, as hardware (including networking and storage gear) has rapidly grown more complex, so has out-of-band management. Security, always a major concern, has also become even more important. The Redfish specification was created in response to these requirements, to simplify and consolidate management of diverse hardware while keeping it secure.

Using Redfish

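An iDRAC can be queried via Redfish APIs by sending an HTTPS request to a resource URI along with user credentials. A simple way to do this is with the curl command-line tool. This example queries the iDRAC for the server's overall health (the -k option skips TLS certificate verification, which is acceptable only for testing):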

$ curl -s -k -u root:password \
    https://<idrac-ip>/redfish/v1/Systems/System.Embedded.1 | jq .Status.Health
"OK"

All data returned by Redfish is in JSON format, so the response must be parsed to extract the information we want. Here, I used the jq parser to select the Status.Health property.
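For readers who would rather parse the response in a script than with jq, the same extraction can be sketched in Python. The payload below is a hand-written, heavily trimmed stand-in for an iDRAC response (an assumption for illustration, not captured output), and extract_health is a hypothetical helper name.

```python
import json

# Trimmed, hand-written stand-in for a Redfish "System" resource;
# a real iDRAC response contains many more properties.
SAMPLE_RESPONSE = """
{
  "Id": "System.Embedded.1",
  "Status": {"Health": "OK", "State": "Enabled"}
}
"""


def extract_health(raw_json):
    """Return the Status.Health value, or None if it is missing."""
    data = json.loads(raw_json)
    return data.get("Status", {}).get("Health")


if __name__ == "__main__":
    print(extract_health(SAMPLE_RESPONSE))
```

Returning None for a missing property, instead of raising an error, makes it easier to spot servers whose responses lack the field when many are queried at once.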

This next example queries an iDRAC for the server's CPU information:

$ curl -s -k -u root:password \
    https://<idrac-ip>/redfish/v1/Systems/System.Embedded.1 | jq .ProcessorSummary.Model
"Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz"

These simple examples illustrate how easy it can be to collect information from servers using the Redfish APIs available in the PowerEdge iDRAC.
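The curl commands above can also be expressed with Python's standard library, which is convenient once the queries need to be scripted. This is a minimal sketch, not a reference client: build_redfish_request and fetch_json are hypothetical helper names, idrac.example.com is a placeholder host, and the root/password credentials simply mirror the curl examples.

```python
import base64
import json
import ssl
import urllib.request

SYSTEM_PATH = "/redfish/v1/Systems/System.Embedded.1"


def build_redfish_request(host, user, password, path=SYSTEM_PATH):
    """Build a GET request with HTTP Basic authentication for one iDRAC."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(f"https://{host}{path}")
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Accept", "application/json")
    return request


def fetch_json(request, verify_tls=True):
    """Send the request and decode the JSON body (needs a reachable iDRAC)."""
    context = ssl.create_default_context()
    if not verify_tls:
        # Equivalent of curl -k: skips certificate checks; lab use only.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context, timeout=10) as resp:
        return json.load(resp)


if __name__ == "__main__":
    req = build_redfish_request("idrac.example.com", "root", "password")
    print(req.full_url)
```

Only the request construction runs without a network. Calling fetch_json(req, verify_tls=False) against a real iDRAC would return the same JSON document that curl prints.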

Using Ansible

Ansible instructions are specified in YAML files called playbooks. Though a full playbook tutorial is beyond the scope of this article, this general example shows how intuitive playbooks are:

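- name: daily system admin tasks
  hosts: my_100_servers
  tasks:
    # Remove the "hacker" account along with its home directory
    - user: name=hacker state=absent remove=yes
    # Update all installed packages to their latest versions
    - yum: name=* state=latest
    # Delete the message-of-the-day file
    - file: path=/etc/motd state=absent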

When the playbook is invoked, these tasks run on the 100 servers in the inventory group my_100_servers, defined in /etc/ansible/hosts. The tasks are: remove the hacker user, update all packages using yum, and remove the file /etc/motd. Because Ansible playbooks define a desired state rather than a sequence of commands, a playbook can be run multiple times against a server and leave it in the same state each time. If a task's desired state already holds (e.g., "user hacker does not exist"), Ansible makes no change for that task and moves on to the next one.
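The desired-state idea can be illustrated outside Ansible with a few lines of Python. This is not how Ansible's user module is implemented; ensure_user_absent and the in-memory set of user names are hypothetical stand-ins, used only to show why a second run changes nothing.

```python
def ensure_user_absent(users, name):
    """Make sure `name` is not in `users`; report whether anything changed."""
    if name not in users:
        # Desired state already holds: do nothing.
        return False
    users.remove(name)
    return True


if __name__ == "__main__":
    users = {"root", "alice", "hacker"}
    print(ensure_user_absent(users, "hacker"))  # first run changes the set
    print(ensure_user_absent(users, "hacker"))  # second run is a no-op
    print(sorted(users))
```

The function checks the current state before acting and reports whether it changed anything, which is the same contract that makes re-running a playbook safe.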

Most Ansible use cases communicate with hosts via SSH (other methods, such as PowerShell remoting, are also supported). The out-of-band management implementation, however, communicates with the host's iDRAC, rather than with the host itself, via HTTPS.

iDRAC + Redfish + Ansible = Secure, scalable, and automated server management

The image below illustrates how everything comes together to provide an automated, scalable out-of-band management solution.

Sending a Redfish URI to servers, getting data back, and sorting it accordingly.

1. An Ansible playbook defines what information we want from iDRAC. In this example, we ask for current power consumption. A specific Redfish URI is then built and sent to all iDRACs.

2. An iDRAC receives the request, calls the corresponding Redfish API to collect data or execute tasks, and then sends back a response in JSON format.

3. The Ansible control machine receives the data in JSON format, parses it for specific information, and can invoke scripts to format it for import into a spreadsheet or storage in a database.
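The three steps above can be sketched in plain Python instead of a playbook to make the data flow concrete. Several things here are assumptions to verify against your own iDRAC firmware: the Power resource path, and the PowerControl and PowerConsumedWatts property names, which follow the DMTF Power schema as I understand it. The host addresses and JSON bodies are invented, and step 2 is simulated rather than sent over the network.

```python
import csv
import io
import json

# Assumed location of the Power resource on an iDRAC; verify on your firmware.
POWER_PATH = "/redfish/v1/Chassis/System.Embedded.1/Power"


def build_uris(hosts, path=POWER_PATH):
    """Step 1: build one Redfish URI per iDRAC."""
    return {host: f"https://{host}{path}" for host in hosts}


def parse_watts(raw_json):
    """Step 3a: pull the consumed power (watts) out of a JSON response."""
    data = json.loads(raw_json)
    controls = data.get("PowerControl", [])
    return controls[0].get("PowerConsumedWatts") if controls else None


def to_csv(readings):
    """Step 3b: format host/watts pairs as CSV for a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["idrac", "watts"])
    for host in sorted(readings):
        writer.writerow([host, readings[host]])
    return buffer.getvalue()


if __name__ == "__main__":
    # Step 2 is simulated: invented responses stand in for real iDRAC replies.
    fake_responses = {
        "192.0.2.10": '{"PowerControl": [{"PowerConsumedWatts": 182}]}',
        "192.0.2.11": '{"PowerControl": [{"PowerConsumedWatts": 205}]}',
    }
    print(build_uris(fake_responses))
    readings = {h: parse_watts(body) for h, body in fake_responses.items()}
    print(to_csv(readings))
```

Keeping URI construction, parsing, and formatting in separate functions mirrors the three numbered steps, so each stage can be replaced independently, for example by writing to a database instead of CSV.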

Development

The Ansible playbooks and module that implement this solution are hosted on the project's GitHub page. Please note that not all capabilities have been implemented yet, but development is ongoing. Pull requests and feature requests are welcome.

If you'd like to learn more, please join Jose Delarosa at Open Source Summit Europe 2017 on October 23, 2017. In his session, Automated Out-of-Band Management with Ansible and Redfish, he'll share more details on using open source tools and open industry standards to achieve scalable, automated out-of-band systems management.