libvirt boxen for OpenShift v3

I promise I have not been struggling with vagrant the whole time since my last post. Actually I updated the vagrant-openshift docs and made some other fixes so the whole thing is a little more sane and obvious how to use, and then went on to other stuff. Today I’m just trying to put together OpenShift v3 libvirt boxen to put up for the public next to the virtualbox ones. Should be easy; actually, it probably is. My problems today all seem to be local.

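It would be nice if, just once, vagrant had a little transparency. It doesn’t have an obvious verbose mode (setting VAGRANT_LOG=debug or passing --debug gets you a firehose of internals rather than a readable explanation), and it never tells you where anything is or should be.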

$ vagrant box list
aws-dummy-box (aws, 0)
fedora_base (libvirt, 0)
fedora_inst (libvirt, 0)
openstack-dummy-box (openstack, 0)

Ah, yeah… so… where are those defined? What images do they point to, and where were they downloaded from?

The errors are the worst. When something goes wrong, could you please tell me what you think you got from me, what you tried to do with that, and what went wrong? No.

$ vagrant up --provider=libvirt
Bringing machine 'openshiftdev' up with 'libvirt' provider...
Name `origin_openshiftdev` of domain about to create is already taken.
Please try to run `vagrant up` command again.

Just try to figure out what is specifying “origin_openshiftdev” as a domain and what to do about it. Or how to release it so I can, in fact, run vagrant up again.

$ vagrant status
Current machine states:

openshiftdev not created (libvirt)

The Libvirt domain is not created. Run `vagrant up` to create it.
$ vagrant destroy
==> openshiftdev: Domain is not created. Please run `vagrant up` first.

Part of the problem is that I have at least three semi-autonomous bits of vagrant to deal with. There’s vagrant itself, which keeps track of box definitions. There’s the Vagrantfile I’m feeding it from OpenShift Origin, which might interact with the vagrant-openshift plugin (though I don’t think so on vagrant up) but in any case defines what hosts I’m supposed to be creating. Finally, there’s the provider plugin (libvirt in this case) that has to interface with the virtualization to actually manage the hosts. If something goes wrong, I can’t even tell which part is complaining, much less why.

Enough complaining, what is going on?

The primary input to vagrant is a “box”. This is really just a tarball that contains a minimal Vagrantfile, a metadata file, and the real payload: the disk image of the virtual host. The vagrant “box” is provider-specific – the metadata specifies a provider.
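Since a box is just a tarball, plain tar is enough to see what is inside one. This is a minimal sketch, assuming a box file has already been downloaded locally; the function name is made up for illustration.

```shell
# Print the member names of a box file without unpacking it.
# tar auto-detects gzip compression when reading, so this works
# whether or not the box was compressed.
box_contents() {
  tar -tf "$1"
}
```

Running something like box_contents some_libvirt.box (a hypothetical filename) on a libvirt box should list the Vagrantfile, metadata.json and the box.img disk image.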

When you run vagrant up, the local Vagrantfile should specify which box to start with – a URL to retrieve it and the name for vagrant to import it as. The first run will download and unpack it under ~/.vagrant.d/boxes/<name>/<version>/<provider>/ (note, you can have multiple providers for the same box name/version). Subsequent runs just use that box definition. Simple enough as it goes.
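That directory layout also answers my earlier question about where the entries in vagrant box list come from. A small sketch that walks the box store and prints one name/version/provider line per box; the function name is mine, and it assumes the store is in ~/.vagrant.d (or wherever VAGRANT_HOME points) unless a directory is passed in.

```shell
# List every installed box as name/version/provider by walking the box store.
# With no argument, look under $VAGRANT_HOME/boxes, falling back to ~/.vagrant.d/boxes.
list_boxes() {
  boxes_dir="${1:-${VAGRANT_HOME:-$HOME/.vagrant.d}/boxes}"
  # Box directories sit exactly three levels down: <name>/<version>/<provider>
  find "$boxes_dir" -mindepth 3 -maxdepth 3 -type d 2>/dev/null |
    sed "s|^$boxes_dir/||" | sort
}
```

Each directory it prints should hold the unpacked contents of the box: the Vagrantfile, the metadata file and the disk image.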

vagrant up also creates a local .vagrant/ directory to keep track of “machines” (which are intended to represent actual running virtual hosts instantiated from boxes). Machines are stored under .vagrant/machines/<name>/<provider>, where the name comes from the Vagrantfile VM definition. In OpenShift’s Vagrantfile we have config.vm.define “openshiftdev”, so for the libvirt provider I could expect to see a directory .vagrant/machines/openshiftdev/libvirt once I’ve brought up a machine. (Under vbox you can define a master and several minions, which would all have different names. I hope we can do that soon with the other providers too.)

I was planning to build a libvirt box from scratch, but then I realized there is a Vagrant plugin “vagrant-mutate” that will take an existing box and change it to another provider. Since we already have boxes defined for vbox I thought I’d just try this out to make a libvirt version of it.

$ vagrant mutate \
  https://mirror.openshift.com/pub/vagrant/boxes/openshift3/centos7_virtualbox_inst.box \
  libvirt
Downloading box centos7_virtualbox_inst from https://mirror.openshift.com/pub/vagrant/boxes/openshift3/centos7_virtualbox_inst.box
Extracting box file to a temporary directory.
Converting centos7_virtualbox_inst from virtualbox to libvirt.
 (100.00/100%)
Cleaning up temporary files.
The box centos7_virtualbox_inst (libvirt) is now ready to use.

So far, so good. Or not, because what does “ready to use” mean? Where is it? Turns out, it means said box is stored under my ~/.vagrant.d/boxes directory for use with the next vagrant up. It kept the same name with the provider embedded in it, but if I just change the name…

$ mv ~/.vagrant.d/boxes/centos7_{virtualbox_,}inst
$ vagrant box list
aws-dummy-box (aws, 0)
centos7_inst (libvirt, 0)
fedora_base (libvirt, 0)
fedora_inst (libvirt, 0)
openstack-dummy-box (openstack, 0)

… everything works out fine. So to use that with my openshift/origin Vagrantfile, I just put that name into my .vagrant-openshift.json file like so:

"libvirt": {
  "box_name": "centos7_inst"
},

Note that I don’t need to specify a box_url because the box is already local. Folks will need the box_url to access it once I publish it. So let’s vagrant up already…

$ vagrant up --provider=libvirt
Bringing machine 'openshiftdev' up with 'libvirt' provider...
/home/luke/.vagrant.d/gems/gems/fog-1.27.0/lib/fog/libvirt/requests/compute/list_volumes.rb:32:in `info': 
Call to virStorageVolGetInfo failed: Storage volume not found: 
no storage vol with matching path '/mnt/VMs/origin_openshiftdev.img'
(Libvirt::RetrieveError)

Ah. This is definitely due to some messing around on my part, because I deleted that image as I thought vagrant was saying earlier it was in the way (remember “Name `origin_openshiftdev` of domain about to create is already taken” ?). This error at least seems safe to pin on the libvirt provider, but I’m not sure what to do about it. Shouldn’t libvirt just clone the image from the vagrant box to create a new VM? How did my request to instantiate the “centos7_inst” box as “openshiftdev” get translated into looking for that particular file to exist?

I’m guessing (since grep got me nowhere) that the libvirt provider takes the directory I’m in and the machine name from the Vagrantfile and uses that as the VM name. Or at least, a volume name from which VMs can be cloned for Vagrant usage.
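If that guess is right, the name is easy to predict. The sketch below only encodes the guess (project directory basename, an underscore, then the machine name); it is an assumption based on the name in the error message, not something taken from the vagrant-libvirt source, and the function name is made up.

```shell
# Build the domain/volume name the libvirt provider appears to use:
# <basename of project directory>_<machine name>.
# ASSUMPTION: this mirrors the observed "origin_openshiftdev", nothing more.
expected_domain() {
  machine="$1"
  project_dir="${2:-$PWD}"   # defaults to the directory vagrant is run from
  printf '%s_%s\n' "$(basename "$project_dir")" "$machine"
}
```

From a checkout directory named origin, expected_domain openshiftdev gives the name to look for, for instance in the output of virsh list --all --name.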

virsh to the rescue

I’m not really very knowledgeable about libvirt, mainly because I’ve been able to run VMs just fine using the graphical virt-manager interface and didn’t really need a lot more. I deleted that image above using virt-manager, figuring it would take care of referential integrity. Now that I’m venturing into the world of scripted VM management, I have been fiddling a little with virsh, so let’s apply that:

# virsh vol-list default
 Name                     Path 
------------------------------------------------------------------------------
[...]
 origin_openshiftdev.img  /mnt/VMs/origin_openshiftdev.img

Hmm, yes, libvirt does actually seem to expect that volume to be there. And then it’s failing trying to use it because the actual file isn’t there. So let’s nuke the volume record, wherever that may be.

# virsh vol-delete origin_openshiftdev.img default
Vol origin_openshiftdev.img deleted

And vagrant up --provider=libvirt suddenly works again.
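To spot this kind of orphaned record without eyeballing the list, the vol-list output can be filtered for volumes whose backing file is gone. A rough sketch, assuming the two-column Name/Path layout shown above and paths without spaces; the function name is my own.

```shell
# Read `virsh vol-list <pool>` output on stdin and print the names of
# volumes whose backing file no longer exists on disk.
# Skips the two header lines; assumes paths contain no whitespace.
stale_volumes() {
  awk 'NR > 2 && NF >= 2 { print $1, $2 }' |
  while read -r name path; do
    [ -e "$path" ] || printf '%s\n' "$name"
  done
}
```

Used as virsh vol-list default | stale_volumes, each name it prints is a candidate for virsh vol-delete --pool default <name>. Running virsh pool-refresh default may also be enough to make libvirt drop records for files that have disappeared, though I have not relied on that here.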

Updating libvirt boxes

One extra note about using libvirt as a provider: as soon as you use vagrant to start a libvirt box you have downloaded, the vagrant-libvirt plugin makes a copy of the image from the box definition and uses that. The copy is made in libvirt’s default storage pool (unless you tell it otherwise… BTW, there are quite a few interesting options in the vagrant-libvirt README) and is named <box_name>_vagrant_box_image.img. So my box above translates to /mnt/VMs/centos7_inst_vagrant_box_image.img (I use a separate mount point for my VM storage because it’s just too easy to fill your root fs otherwise). Then when you actually create a VM, it uses a copy-on-write snapshot of that image, which seems to be named after the project and VM definition (my problem volume above, origin_openshiftdev.img). That way it’s a pretty fast, efficient startup from a consistent starting point.

Of course this could be a bit confusing if you actually want to update your vagrant box. You might download a new box definition from vagrant’s perspective, but vagrant-libvirt sees it already has a volume with the right name and keeps using that. (In fact, once it has copied the volume, you may as well truncate the box.img under ~/.vagrant.d/boxes to save space.) You have to nuke the libvirt volume to get it to use the updated box definition. virt-manager seems to do just as well as virsh vol-delete at this (not sure what happened before in my case). So e.g.

# virsh vol-delete centos7_inst_vagrant_box_image.img default

Then the next vagrant up with that box will use the updated box definition.
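Since the volume name follows directly from the box name, the cleanup command can be generated rather than typed. This sketch only prints the command so it can be checked before running; the function name is hypothetical, and the naming pattern is the <box_name>_vagrant_box_image.img one I observed above, which other vagrant-libvirt versions might not share.

```shell
# Print (do not run) the virsh command that removes the cached base image
# for a box, so the next `vagrant up` re-copies it from the box store.
box_volume_cleanup_cmd() {
  box="$1"
  pool="${2:-default}"   # libvirt storage pool; "default" unless configured otherwise
  printf 'virsh vol-delete --pool %s %s_vagrant_box_image.img\n' "$pool" "$box"
}
```

So box_volume_cleanup_cmd centos7_inst prints the vol-delete line for my box, in the --pool form. Presumably any VMs still running off snapshots of the old image should be destroyed first, since those snapshots depend on it.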
