Exploring 3 – docker

More unreliable ruminations –

When Docker started to make a splash, I took a quick look at it, you know, the basic tutorial. All very nice, but not too much depth. And even though the rest of the OpenShift team has pivoted to this platform fairly quickly, I’ve been waiting until I would actually have some real time to devote to it before digging in deeper.

Although I know that at the pace this stuff is moving, RHEL 7 is already far behind, I brought up a RHEL 7 host and started running through https://access.redhat.com/articles/881893 which has a little more meat to it as far as introducing Docker capabilities. Under RHEL 7, Docker is in the “extras” channel (and relies on one pkg in the “optional” channel). It’s useful to know that the “extras” channel is actually supported (unlike “optional”), but not on the same terms as the rest of RHEL – things in this channel are allowed to move quickly and break compatibility. That’s a good place for Docker, since I know our team is still collaborating heavily with Docker to get in features needed for OpenShift. I expect there will be a sizeable update for RHEL 7.1, although chances are we’ll be using Atomic anyway.

Atomic ships tmux but not screen. I guess it’s time for me to finally make the leap. As tempting as it is to just remap the tmux prefix key to C-a (as in screen), I should probably get used to using the defaults.

The first thing that would probably help me to understand Docker is an analogy between Docker registries/repositories and git. Docker is clearly informed by git and VCS, using some of the same verbs (pull, push, commit, tag) but assigning different semantics.

This article clarified the similarities and differences in terms (although it’s not clear when it was written, looks like about a year ago… seriously, an undated blog post on new technology? How does this keep happening?). Docker Hub is approximately like GitHub… repositories are approximately like GitHub repos. The location of the image layer information doesn’t seem to be the same for me, but I don’t know if that’s because Docker changed in the meantime or because it is packaged differently for RHEL/Atomic.

docker pull

So, you “docker pull” an image. It’s a little confusing where you’re pulling it from and to. “To” turns out to be the clearer half: a local cache. Nothing ever tells you where that cache is, but on RHEL 7 it looks like it’s under /var/lib/docker/. There’s image metadata at /var/lib/docker/graph/ and perhaps some actual content at /var/lib/docker/devicemapper/, but I’m having trouble seeing exactly how the image data is stored. I’m sure it’s confusing for a reason. Open question for now.

Here’s a handy alias:

# alias json="python -mjson.tool <"

Now you can pretty-print json without having to think much about it:

json /var/lib/docker/graph/216c11b99bd09033054595d08c28cf27dabcc1b18c2cd0991fce6b1ff1c0086f/json | less
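The alias only works when it is followed by a file name, because of the trailing "<". A function variant, sketched here on the assumption that either python or python3 is on the PATH, also accepts piped input:

```shell
# Pretty-print JSON from a file argument or from stdin.
# Assumption: "python" (as in the alias above) or "python3" is installed.
json() {
    py=$(command -v python || command -v python3) || return 1
    if [ $# -gt 0 ]; then
        "$py" -m json.tool < "$1"   # file given: same effect as the alias
    else
        "$py" -m json.tool          # no file: read JSON from the pipe
    fi
}
```

With this in place both "json somefile | less" and "some-command | json" should work.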

Docker storage is configurable in /etc/sysconfig/docker-storage and under Atomic, perhaps predictably, it is customized to live under /dev/atomicos/. Though there’s still plenty under /var/lib/docker.

So this is a bit like a system-wide git repository. You can contact as many “remotes” (registries) as you like, and pull down “branches” (images) composed of successive “commits” (layers), potentially with “tags” (tags! although tags do double duty, both as fixed points in time and as moving pointers like branches). Once they’re present locally you can fire them up, modify them (with another commit) and push the results back to a registry.

It’s less than crystal clear to me how “docker pull” chooses a remote, i.e. how registries are determined. OK, if you “docker pull registry.access.redhat.com/rhel” it should be apparent where that’s coming from. But despite Docker Hub reportedly being disabled, if I “docker pull ubuntu” or “docker pull nginx” those load up just fine. From where? Evidently Docker Hub isn’t disabled. Here’s how it seems to work:

docker pull <word, e.g. "ubuntu"> = get images from the public "word" repository on Docker Hub
docker pull <word>/<repo> = get images from the repo owned by the <word> account on Docker Hub
docker pull <hostname or IP>/<repo> = get images from that repo on another registry

In all cases, you can add a :tag to pull only a specific tag (and any images it is based on) rather than all of the tags in the repository.
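The three cases above can be sketched as a small shell function. This is a rough approximation, not Docker's actual code: it assumes the first path component is treated as a registry host only when it contains a dot or a colon, or is exactly "localhost". The name image_source is made up for illustration.

```shell
# Guess where "docker pull <ref>" would look, based only on the shape of the name.
# Assumption: a first component containing "." or ":" (or equal to "localhost")
# is a registry host; anything else is a Docker Hub account.
image_source() {
    ref=$1
    case "$ref" in
        */*) first=${ref%%/*} ;;                               # text before the first slash
        *)   echo "Docker Hub, official repository"; return ;; # bare word such as "ubuntu"
    esac
    case "$first" in
        *.*|*:*|localhost) echo "registry at $first" ;;
        *)                 echo "Docker Hub, account $first" ;;
    esac
}
```

For example, "image_source registry.access.redhat.com/rhel" reports a registry, while "image_source sosiouxme/mongodb" reports a Docker Hub account.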

As with git repos, you have a local concept of the remote repo which can be out of sync. So you have to push and pull to sync them up as needed.

docker commit / build / tag

If you run an image as a container, you can then commit the result as an image. If you commit it with the same name as an existing repository, it’s implicitly tagged as :latest.

Similarly you can use “docker build” with a Dockerfile that specifies base image and commands to run against it, then commit the result as an image in a repository.

Finally, you can just re-tag any image in the local cache with any repository and tag you want (within the bounds of syntax, which are pretty loose). So “docker tag” doesn’t just apply tags (and moving tags = branches) but also repositories.
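Since a name can carry both a registry host (possibly with a port) and a tag, it helps to see how the tag part can be picked out. A minimal sketch, assuming the tag is whatever follows the last colon as long as no slash comes after it, and that a missing tag means :latest as described above. image_tag is a hypothetical helper, not a docker subcommand.

```shell
# Print the tag portion of an image name, defaulting to "latest".
image_tag() {
    last=${1##*:}                       # text after the last colon, if any
    case "$1" in
        *:*)
            case "$last" in
                */*) echo latest ;;     # the colon belonged to host:port, not a tag
                *)   echo "$last" ;;
            esac ;;
        *)  echo latest ;;              # no colon at all
    esac
}
```

So "image_tag rhel:7.0" prints 7.0, while "image_tag localhost:5000/app" prints latest because the only colon is part of the host.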

docker push

Having created an image in a repo, docker push is the reverse of docker pull… and the repo indicates where it will go.

You can’t docker push to one of the root repos (like just plain “mongodb”). You can of course pull that, re-tag it with your own Docker Hub id (e.g. “docker tag mongodb sosiouxme/mongodb”) and then push it (assuming you’ve logged in and want it on your Docker Hub account).

Finally if you have tagged your image with a repo name that includes hostname/IP, then docker push will try to push it to a registry at that hostname/IP (assuming it exists and you have access). RHEL 7 ships docker-registry, but Atomic does not at this point – and why should it when you can just run the registry itself in a container?
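The re-tag and push sequence can be written down as a helper that only prints the commands, so nothing is pushed by accident. This is a sketch: push_to_registry_cmds is an invented name, and the registry host in the usage note is a placeholder rather than a real server.

```shell
# Print (not run) the commands that copy a local image to another registry.
# $1 = local image name, $2 = registry host, optionally with :port
push_to_registry_cmds() {
    image=$1
    registry=$2
    target="$registry/$image"           # the repo name now encodes the destination
    echo "docker tag $image $target"
    echo "docker push $target"
}
```

Running "push_to_registry_cmds mongodb registry.example.com:5000" prints the two commands; piping the output to sh would execute them, assuming the registry exists and you have access.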

Exploring 2 – journal

I have been reading through Lennart Poettering’s ever-expanding series, systemd for Administrators, up to the seventeenth installment, without much to say here. Good stuff.

Number 17 is about the journal, which is basically a replacement for syslog. This answers my earlier question of how systemd displayed the log lines from httpd… the journal is hooked up by systemd to capture syslog and kernel log entries as well as stdout/stderr for any processes it manages. What I saw in the httpd status output there would be the stdout from starting httpd… the journal isn’t following the actual log files created by httpd (you’d need to configure httpd to log messages to syslog or journal).

The journal is really cool, though. It natively solves a lot of annoying things about system logs, mainly by attaching a ton of metadata to log entries, including automatic and unfakeable items like cgroup, pid, and executable. It then indexes entries by that metadata and presents a nice filtering interface with the journalctl client (incidentally allowing users to access their own log entries). If we had this in OpenShift 2, we wouldn’t have needed a plugin for rsyslog7 to add these kinds of attributes to gear syslogs, and gears would not have needed to store their own logs at all since they could just access their own journal entries from the host with journalctl (although… I would need to check how access is controlled; if it’s by UID and not SELinux context then we would need to do something special because UIDs can be reused under OpenShift). I bet we’ll use this for v3.
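A few of the metadata filters can be wrapped up as shell functions. The -u and -b options and the _PID, _EXE and _UID fields are standard journalctl matches; the wrapper names and the DRY_RUN variable are inventions for this sketch, which prints the command instead of running it when DRY_RUN is set.

```shell
# Run journalctl, or just print the command line when DRY_RUN is non-empty.
jctl() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "journalctl $*"
    else
        journalctl "$@"
    fi
}

unit_log() { jctl -u "$1" -b; }     # entries for one unit since the current boot
pid_log()  { jctl "_PID=$1"; }      # entries logged by one process ID
exe_log()  { jctl "_EXE=$1"; }      # entries logged by one executable path
uid_log()  { jctl "_UID=$1"; }      # entries logged by processes of one user ID
```

For instance, "DRY_RUN=1 unit_log httpd.service" shows the command that would be run for the httpd unit.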

One thing to note under RHEL 7… the default install doesn’t enable the persistent journal – all you have is whatever is stored in /run/log/journal since the last boot. However, it’s easy to enable the persistent journal by just creating /var/log/journal. At this point you can nuke rsyslog and just let the journal capture everything. Also, bash tab completion doesn’t seem to be set up for journalctl attributes as indicated in the blog (there’s probably a simple way to enable that too).
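A quick way to check which mode a host is in is to look for the directory. This is a minimal sketch that assumes journald is left at its Storage=auto setting, where the presence of /var/log/journal is what decides; journal_storage is a made-up helper, and its optional argument is an alternate root directory that exists only to make the check testable.

```shell
# Report whether the journal would be kept across reboots.
# $1 = optional alternate root directory (defaults to the real filesystem root)
journal_storage() {
    root=${1:-}
    if [ -d "$root/var/log/journal" ]; then
        echo persistent     # journald writes under /var/log/journal
    else
        echo volatile       # journald only writes under /run/log/journal
    fi
}

# To switch over, as root (journald needs a restart to notice the new directory):
#   mkdir -p /var/log/journal && systemctl restart systemd-journald
```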

Exploring 1 – systemd

Starting to poke around systemd a bit more. The following is some running commentary – I may have no idea what I’m talking about.

I have to admit to myself I didn’t truly understand sysvinit all that well, I just knew how to get done what I needed to. Other than being used to typing “service xyz start” and “chkconfig xyz on” and looking in /etc/init.d for control scripts and /etc/rc*.d for symlinks, I couldn’t have answered a lot of specific questions about how it works. So really, aside from having to update my muscle memory for those things, I don’t have a huge attachment to sysvinit.

Google directed me to what looks like a good series of blog posts on systemd starting at http://0pointer.net/blog/projects/systemd-for-admins-1.html – starting in 2011 so there’s probably been a lot of drift since, but still looks like a good starting point with motivation and technical underpinnings. There’s just always more to learn about the OS.

systemctl status has a great deal more info than service status ever did, because systemd seems to standardize a bunch of stuff that was just left up to the control script before. Consider for httpd:

# service httpd status
httpd (pid 7114) is running...

So, we learn that there is a running process, because the httpd daemon put down a pidfile and the process with that pid is running and looks like an httpd process. There are several other variations, including not running at all, or having a pidfile but no corresponding process. The httpd service script could have put just about anything in its status output; tracking down log files and process trees would just have been a little extra work. systemd seems to keep track of a good deal more:

# systemctl status httpd
httpd.service - The Apache HTTP Server
 Loaded: loaded (/usr/lib/systemd/system/httpd.service; disabled)
 Active: active (running) since Tue 2014-12-02 08:11:22 EST; 12min ago
 Main PID: 23773 (httpd)
 Status: "Total requests: 0; Current requests/sec: 0; Current traffic: 0 B/sec"
 CGroup: /system.slice/httpd.service
 ├─23773 /usr/sbin/httpd -DFOREGROUND
 ├─23774 /usr/sbin/httpd -DFOREGROUND
 ├─23775 /usr/sbin/httpd -DFOREGROUND
 ├─23776 /usr/sbin/httpd -DFOREGROUND
 ├─23777 /usr/sbin/httpd -DFOREGROUND
 └─23778 /usr/sbin/httpd -DFOREGROUND
Dec 02 08:11:22 lmeyer-1201-rhel7 systemd[1]: Starting The Apache HTTP Server...
Dec 02 08:11:22 lmeyer-1201-rhel7 systemd[1]: Started The Apache HTTP Server.

We get a pointer to the systemd unit file for httpd, whether it’s enabled (for running automatically at boot), whether it’s running (“active”) and for how long, the process tree (which we can obtain with confidence because the daemon is put in a cgroup at start), some log entries, and some service-specific “Status” about traffic. That’s pretty handy. Not to say that the service script couldn’t have done all this, but systemd seems to standardize it.

# cat /usr/lib/systemd/system/httpd.service
[Unit]
Description=The Apache HTTP Server
After=network.target remote-fs.target nss-lookup.target
[Service]
Type=notify
EnvironmentFile=/etc/sysconfig/httpd
ExecStart=/usr/sbin/httpd $OPTIONS -DFOREGROUND
ExecReload=/usr/sbin/httpd $OPTIONS -k graceful
ExecStop=/bin/kill -WINCH ${MAINPID}
[...]
PrivateTmp=true
[Install]
WantedBy=multi-user.target

I looked in the unit file and I’m curious how it does all this, since there’s no entry for logs or status. Something to look out for.
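For poking at individual settings it can be handy to pull one key out of a unit file. A minimal sketch using awk: unit_get is a made-up helper, and it ignores drop-in overrides and line continuations, so on a live system "systemctl show" is the more reliable source.

```shell
# Print the value(s) of one key within one section of a unit file.
# $1 = unit file path, $2 = section name without brackets, $3 = key
unit_get() {
    awk -v section="[$2]" -v key="$3" '
        /^\[/ { in_section = ($0 == section); next }    # track the current [Section]
        in_section && index($0, key "=") == 1 {         # line starts with "key="
            print substr($0, length(key) + 2)           # everything after the "="
        }
    ' "$1"
}
```

For example, "unit_get /usr/lib/systemd/system/httpd.service Service Type" should print notify for the unit file shown above.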

Part 2 gets into the usage of cgroups, which seems like a neat use case. I notice that the cgroup names seem to be longer than when the article was written, so it’s helpful to expand the column in the ps command suggested for viewing processes with cgroups:

# alias psc='ps xawf -eo pid,user:16,cgroup:64,args'
# psc
[...]
23773 root   1:name=systemd:/system.slice/httpd.service /usr/sbin/httpd -DFOREGROUND
23774 apache 1:name=systemd:/system.slice/httpd.service \_ /usr/sbin/httpd -DFOREGROUND
23775 apache 1:name=systemd:/system.slice/httpd.service \_ /usr/sbin/httpd -DFOREGROUND
23776 apache 1:name=systemd:/system.slice/httpd.service \_ /usr/sbin/httpd -DFOREGROUND
23777 apache 1:name=systemd:/system.slice/httpd.service \_ /usr/sbin/httpd -DFOREGROUND
23778 apache 1:name=systemd:/system.slice/httpd.service \_ /usr/sbin/httpd -DFOREGROUND
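Going by the sample output above, each cgroup entry has the shape hierarchy:controller:path, and the last path component is the unit name. A small sketch for pulling that out; cgroup_unit is a hypothetical helper, and the assumption about the entry format comes only from the output shown here.

```shell
# Extract the unit name from one cgroup entry such as
# "1:name=systemd:/system.slice/httpd.service".
cgroup_unit() {
    path=${1##*:}           # drop everything up to the last colon, leaving the path
    echo "${path##*/}"      # keep only the final path component
}

# On a live system the entry could come from /proc, for example:
#   cgroup_unit "$(grep name=systemd /proc/23773/cgroup)"
```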

I read through http://0pointer.de/blog/projects/systemd.html which is a long and detailed introduction to the motivations behind systemd. Probably should have done that first.