During the summer of 2014 I worked on the OpenStack Keystone component while interning at Red Hat. Fast forward to the end of October 2015 and I once again find myself working on OpenStack for Red Hat — this time on the RDO Continuous Integration (CI) team. Since re-joining Red Hat I’ve developed a whole new level of respect not only for the wide breadth of knowledge required to work on this team but for deploying OpenStack in general.
The list of deployment options for OpenStack[2a,b,c,d] is long and has a colorful history. Furthermore, there are probably only a handful of people who have developed and/or used more than a few of them. I have the good fortune of working with, or at least in proximity to, several of these folks. However, even with that advantage, I find wrapping my head around all of the moving parts involved in deploying OpenStack confusing. This was a prime source of frustration during my first several weeks back at Red Hat. Routinely I found myself having to accept the magic of a given deployment tool in order to move forward with my tasks.
Presently, the RDO CI team uses Khaleesi, and by proxy Ansible, as one of the deployment tools for automating builds with Jenkins. During this walkthrough I will follow the deployment steps as outlined in Khaleesi’s cookbook to deploy OpenStack using RDO-Manager on a single, baremetal CentOS 7.2 server.
- One machine is set up as the controller from which you will generate the necessary Ansible configuration and execute the appropriate playbooks within Khaleesi. In my case this is a ThinkPad X1 running Fedora 22 — my work laptop.
- Note: Make sure you follow all of the steps in the Khaleesi setup guide on the controller or you will run into problems when trying to use ksgen or Ansible.
- One machine running a clean install of CentOS 7.2 with, at minimum, one quad-core CPU, 12 GB of memory, and 120 GB of free disk space, as outlined by the RDO-Manager docs.
My setup, which meets the two requirements above, looks like this:
Okay, let us move to the actual setup. We will be picking up at the Configuration portion of the Khaleesi cookbook. Remember that all of these commands are to be run from the machine you installed Khaleesi on (the ThinkPad X1 in my case).
cp ansible.cfg.example ansible.cfg
touch ssh.config.ansible
echo "" >> ansible.cfg
echo "[ssh_connection]" >> ansible.cfg
echo "ssh_args = -F ssh.config.ansible" >> ansible.cfg
We begin by copying the example Ansible config from version control, which carries defaults needed across Khaleesi use cases, and then tell Ansible to use the ssh config file that Khaleesi will generate, ssh.config.ansible.
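One thing to watch for: running the echo lines a second time appends a duplicate ssh_args entry to ansible.cfg. Below is a minimal sketch of a re-runnable version, assuming you call it from the root Khaleesi directory. The function name configure_ansible_ssh is my own and is not part of Khaleesi.

```shell
# Hypothetical helper (not part of Khaleesi): the same steps as above,
# written so that running them twice does not duplicate anything.
configure_ansible_ssh() {
    # Start from the example config only if ansible.cfg does not exist yet.
    [ -f ansible.cfg ] || cp ansible.cfg.example ansible.cfg

    # Make sure the ssh config file exists; Khaleesi fills it in later.
    touch ssh.config.ansible

    # Append the [ssh_connection] section only if ssh_args is not there yet.
    if ! grep -q '^ssh_args = -F ssh.config.ansible' ansible.cfg; then
        printf '\n[ssh_connection]\nssh_args = -F ssh.config.ansible\n' >> ansible.cfg
    fi
}
```

Calling configure_ansible_ssh once or several times should leave ansible.cfg with a single ssh_args line.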
ssh-copy-id root@<ip address of baremetal virt host> # x.x.x.49 in my example
ssh-copy-id lets us easily transfer our public ssh key to the CentOS box. This tool has quickly become one of my go-tos, as I am constantly provisioning systems and it removes many of the possible human errors involved in key transfers.
export TEST_MACHINE=<ip address of baremetal virt host> # x.x.x.49 in my example
The playbook we will end up calling, khaleesi/playbooks/full-job-no-test.yml, expects the TEST_MACHINE environment variable to be set and uses it while generating the hosts file used by Ansible.
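Because a missing TEST_MACHINE only shows up once the playbook is already running, a small pre-flight check can save a round trip. This is a sketch only; require_test_machine is a name I made up for illustration.

```shell
# Hypothetical pre-flight check (the function name is illustrative).
# Fails with a message if TEST_MACHINE is empty or unset.
require_test_machine() {
    if [ -z "${TEST_MACHINE:-}" ]; then
        echo "TEST_MACHINE is not set; export it before running the playbook" >&2
        return 1
    fi
    echo "Deploying to ${TEST_MACHINE}"
}
```

You could chain it in front of any later command, for example: require_test_machine && ansible-playbook followed by your usual arguments.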
ksgen --config-dir settings generate \
    --provisioner=manual \
    --product=rdo \
    --product-version=liberty \
    --product-version-build=last_known_good \
    --product-version-repo=delorean \
    --distro=centos-7.0 \
    --installer=rdo_manager \
    --installer-env=virthost \
    --installer-images=build \
    --installer-network-isolation=none \
    --installer-network-variant=ml2-vxlan \
    --installer-post_action=none \
    --installer-topology=minimal \
    --installer-tempest=smoke \
    --workarounds=enabled \
    --extra-vars @../khaleesi-settings/hardware_environments/virt/network_configs/none/hw_settings.yml \
    ksgen_settings.yml
Note: If you see warnings similar to the line directly below, don’t worry. ksgen has a set of defaults and is simply telling you which one it will use for any parameter you did not pass.
settings.py:105| _load_defaults() | WARNING: '--installer-network' hasn't been provided, using 'neutron' as default
Remember that tool we installed from within the Khaleesi repository? That was ksgen, a tool that generates a file, ksgen_settings.yml in our case, which contains most of the variables used by Ansible during the execution of Khaleesi’s playbooks. The parameters above line up with files underneath khaleesi/settings; ksgen pulls in the variables from each file and magically handles any conflicts that arise between them. For example, `--provisioner=manual` will include all variables located within khaleesi/settings/provisioner/manual.yml as well as khaleesi/settings/provisioner/common/common.yml, as indicated by the include statement at the top of the aforementioned manual.yml.
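To make the flag-to-file relationship concrete, here is a tiny sketch that turns a top-level ksgen option into the settings path described above. It only covers the `--provisioner=manual` pattern confirmed in this post. Nested options such as --installer-env may well resolve to deeper paths that this does not handle, and settings_file_for is a hypothetical name, not a ksgen command.

```shell
# Hypothetical helper: map "--key=value" to settings/<key>/<value>.yml.
# Only the --provisioner=manual case is confirmed by the text above;
# treat any other mapping as an assumption to verify in the repository.
settings_file_for() {
    opt=${1#--}          # drop the leading "--"
    key=${opt%%=*}       # text before the first "="
    value=${opt#*=}      # text after the first "="
    echo "settings/${key}/${value}.yml"
}
```

Running settings_file_for --provisioner=manual from the Khaleesi root gives you a path you can open to read the include statement for yourself.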
This is a pretty basic setup. A few of the parameters are of particular note, namely:
- --provisioner=manual: We have provisioned the CentOS box ourselves as opposed to using, say, Beaker or Foreman (both of which are provisioners supported by Khaleesi).
- --installer=rdo_manager: RDO-Manager is our tool of choice for the actual installation of OpenStack on our CentOS box.
- --installer-env=virthost: Our undercloud/overcloud deployment will be installed on virtual machines running on the CentOS box, TEST_MACHINE. Accordingly, Khaleesi will use the respective virthost playbooks, as opposed to the baremetal playbooks it would use were we to install our nodes on actual boxes.
Once upon a time all of the settings files that reside underneath khaleesi/settings lived in another repository aptly named khaleesi-settings. It still exists: we use it internally for storing sensitive data needed for our CI infrastructure that we wouldn’t want public, and it retains some things like the virtual networking settings needed for the ml2-vxlan argument passed to ksgen. Why exactly does khaleesi-settings still exist upstream? To be frank, I’m not quite sure, but I’ll update this post when I have a rational answer.
The result of calling ksgen is a concise YAML file, ksgen_settings.yml. You can name it whatever you want; just be sure to pass the same name to your ansible-playbook calls. Because this file records most of the variables the playbooks will see, it will quickly become your best friend whenever you have to troubleshoot failures with Ansible.
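When a task fails on some variable, the quickest first step is usually to look for that name in the generated file. A minimal sketch follows; find_setting is a hypothetical helper, and the search terms you pass depend on what your own ksgen_settings.yml contains.

```shell
# Hypothetical helper: print every line of the generated settings file
# that mentions a given name, with line numbers, ignoring case.
# The second argument is optional and defaults to ksgen_settings.yml.
find_setting() {
    settings_file=${2:-ksgen_settings.yml}
    grep -n -i -e "$1" "$settings_file"
}
```

For example, find_setting virthost lists each line that mentions virthost; the function returns non-zero when nothing matches.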
Now we are ready to call Khaleesi’s playbook khaleesi/playbooks/full-job-no-test.yml, which will provision TEST_MACHINE (a minimal step in our case, as we’ve already done so manually) and then use RDO-Manager to deploy an undercloud and overcloud in virtual machines hosted on our CentOS box.
ansible-playbook -vv --extra-vars @ksgen_settings.yml -i local_hosts playbooks/full-job-no-test.yml
If Ansible doesn’t throw an error within the first 10 seconds (an early error usually means something is wrong in either your Ansible config file or ksgen_settings.yml), feel free to go and stretch your legs, as these playbooks can take a while to finish. The console output at the end of the playbook’s execution should look similar to:
PLAY [Global post install] ****************************************************
[[ previous play time: 0:00:02.259124 = 2.26s / 3836.33s ]]
skipping: no hosts matched
PLAY RECAP ********************************************************************
host0 : ok=126 changed=81 unreachable=0 failed=0
localhost : ok=21 changed=7 unreachable=0 failed=0
overcloud-cephstorage-0 : ok=1 changed=1 unreachable=0 failed=0
overcloud-controller-0 : ok=2 changed=2 unreachable=0 failed=0
overcloud-novacompute-0 : ok=1 changed=1 unreachable=0 failed=0
undercloud : ok=123 changed=74 unreachable=0 failed=0
[[ previous task time: 0:00:00.029086 = 0.03s / 3836.35s ]]
[[ previous play time: 0:00:00.018960 = 0.02s / 3836.35s ]]
[[ previous playbook time: 1:03:56.350566 = 3836.35s / 3836.35s ]]
[[ previous total time: 1:03:56.350779 = 3836.35s / 0.00s ]]
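Since the run takes about an hour, it can be handy to save the output and check the recap afterwards rather than watch the console. The sketch below assumes you captured the output in a log file yourself (for instance by piping ansible-playbook through tee). The function name recap_clean and the log file name are my own choices.

```shell
# Hypothetical helper: succeed only if the saved log contains recap lines
# and none of them report a non-zero "unreachable" or "failed" count.
recap_clean() {
    log=$1

    # No recap lines at all means the run never reached PLAY RECAP.
    grep -q 'unreachable=[0-9]' "$log" || return 1

    # Any count starting with 1-9 is non-zero, so the run had problems.
    if grep -E -q '(unreachable|failed)=[1-9]' "$log"; then
        return 1
    fi
    return 0
}
```

With the output saved to a file such as deploy.log, recap_clean deploy.log && echo "deployment looks good" gives a quick yes or no. Note that it scans the whole file, so it is a coarse check and not a replacement for reading the failing task.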
You should now have a fully functional undercloud and overcloud running on TEST_MACHINE that is similar to the grossly simplified graphic below.
Conveniently, you can log directly into the undercloud from the root Khaleesi directory by using the ssh config generated by Khaleesi.
ssh -F ssh.config.ansible undercloud
Warning: Permanently added 'x.x.x.49' (ECDSA) to the list of known hosts.
Warning: Permanently added 'undercloud' (ECDSA) to the list of known hosts.
Last login: Wed Jan 13 18:33:22 2016 from gateway
[stack@instack ~]$ ls
Once you wrap up doing whatever it is you want to do with your new deployment, wiping out the overcloud and performing cleanup is as simple as calling another of Khaleesi’s playbooks.
ansible-playbook -vv --extra-vars @ksgen_settings.yml -i hosts playbooks/cleanup.yml
Ansible and Khaleesi make it very easy to deploy OpenStack in a reproducible manner, provided you have everything configured correctly beforehand. The vast majority of the time I spend fixing problems while working with Khaleesi comes down to configuration mistakes.
From the 40 or so lines we’ve entered into our shells, an enormous number of subsequent actions have taken place through Khaleesi’s playbooks. I could spend days diving into each one of them. I’m sure I will eventually, but it’s nice to know that, thanks to Khaleesi and Ansible, I can do so as time permits.
Things I’d like to write more about in the future:
- A more in depth breakdown of what is happening in each of the playbooks used during this deployment — or a deployment of a similar nature.
- Khaleesi’s purpose, history, and potential future.
- The product pipeline from OpenStack (upstream) -> RDO -> Red Hat OpenStack, aka RHOS, (downstream) and others.
- Anything you as an audience would like to hear more about related to my work.
[2a] – Devstack: http://docs.openstack.org/developer/devstack/
[2b] – Staypuft: https://github.com/theforeman/staypuft
[2c] – Packstack: https://wiki.openstack.org/wiki/Packstack
[2d] – TripleO: https://wiki.openstack.org/wiki/TripleO
Jenkins: https://jenkins-ci.org/