OpenShift with dynamic host IPs?

When we began packaging OpenShift Enterprise, we decided not to support dynamic changes to host IP addresses. This might seem a little odd, since our installation instructions assume that DHCP is in use; we just require that addresses be pinned statically to host names. It’s not that it’s impossible to work with dynamic re-leasing; it’s just an unnecessary complication and a potential source of tricky problems.

However, I’ve crawled all over OpenShift configuration for the last few months, and I can say with a fair amount of confidence that it’s certainly possible to handle dynamic changes to host IP, as long as those changes are tracked by DNS with static hostnames.

But there are, of course, a number of caveats.

First off, it should be obvious that DNS must be integrated with DHCP such that hostnames never change and always resolve correctly to the same actual host. Then, if configuration everywhere uses hostnames, it should in theory be able to survive IP changes.
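To get a feel for whether configuration really does use hostnames everywhere, one option is to scan config files for literal IPv4 addresses. The following is only a rough sketch: the function name and the choice to skip comment lines and loopback addresses are my own assumptions, not anything OpenShift ships.

```python
import ipaddress
import re

# Candidate dotted quads; each one is validated with the ipaddress module below.
_CANDIDATE = re.compile(r"(?<![\w.])(\d{1,3}(?:\.\d{1,3}){3})(?![\w.])")


def find_literal_ips(text):
    """Return (line_number, address) pairs for IPv4 literals in config text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if line.strip().startswith("#"):
            continue  # ignore comment lines
        for match in _CANDIDATE.finditer(line):
            try:
                addr = ipaddress.IPv4Address(match.group(1))
            except ValueError:
                continue  # looks like a dotted quad but is not a valid address
            if addr.is_loopback:
                continue  # 127.x.x.x does not change when the lease does
            hits.append((lineno, str(addr)))
    return hits
```

Feeding it the contents of each file under /etc/openshift (or wherever your configuration lives) gives a list of places that would need attention if the host re-IPs.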

The most obvious exception is the IP(s) of the nameserver(s) themselves. /etc/resolv.conf clearly has to use IP addresses: it is the source of name resolution, so it can’t rely on names to bootstrap itself. However, in the unlikely event that nameservers need to re-IP, DHCP could make the transition with a bit of work. You could not use our basic dhclient configuration that statically prepends the installation nameserver IP; instead, the DHCP server would need to supply all nameserver definitions, and there would be some complications around the transition since not all hosts would renew leases at the same time. Really, this would probably be the province of a configuration management system. I trust that those who need to do such a thing have thought about it much more than I have.

Then there’s the concern of the dynamic DNS server that OpenShift publishes app hostnames to. Well, no reason that can’t be a hostname as well, as long as the nameserver supplied by DHCP/dhclient knows how to resolve it. Have I mentioned that you should probably implement your host DNS separately from the dynamic app DNS? No reason they need to use the same server, and probably lots of reasons not to.

OK, maybe you’ve looked through /etc/openshift/node.conf and noticed the PUBLIC_IP setting in there. What about that? Well, I’ve tracked that through the code base and as far as I can tell, the only thing it is ever used for is to create a log entry when gears are created. In other words, it has no functional significance. It may have had some in the past: as I understand it, apps used to be created with A records rather than as CNAMEs to the node hosts. But for now, it’s a red herring.
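Even if PUBLIC_IP is only cosmetic, it is easy enough to check whether it has drifted from what the hostname resolves to. Here is a minimal sketch, assuming node.conf uses simple KEY="value" lines and that the node's name lives in a PUBLIC_HOSTNAME setting (verify that against your own node.conf). The helper names are made up, and the resolver is passed in so the check can be exercised without touching DNS.

```python
import socket


def parse_conf(text):
    """Parse KEY=value or KEY="value" lines, skipping blanks and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


def public_ip_is_stale(conf_text, resolve=socket.gethostbyname):
    """True if PUBLIC_IP no longer matches what PUBLIC_HOSTNAME resolves to."""
    settings = parse_conf(conf_text)
    hostname = settings.get("PUBLIC_HOSTNAME")
    recorded = settings.get("PUBLIC_IP")
    if not hostname or not recorded:
        return False  # nothing to compare
    return resolve(hostname) != recorded
```

By default this uses socket.gethostbyname, so it reports whatever the host's own resolver says at the moment it runs.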

Another thing to worry about is iptables filters. In our instructions we never demonstrate filters for specific addresses, but conscientious sysadmins would likely restrict connections to the hosts that are expected to need them. And they would be unlikely to define those filters using hostnames. So either don’t do that… or have a plan for handling IP changes.
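Part of such a plan is simply knowing which rules pin an address. As a sketch, the function below takes the text produced by iptables-save and lists the rules that name a source or destination. The function name is hypothetical, and it only looks at the -s/-d style flags, so treat it as a starting point rather than a complete audit.

```python
import shlex

# Flags that tie a rule to a particular source or destination address.
ADDRESS_FLAGS = {"-s", "--source", "-d", "--destination"}


def rules_pinned_to_addresses(iptables_save_output):
    """Return (chain, address, rule) for each rule naming a source or destination."""
    pinned = []
    for line in iptables_save_output.splitlines():
        if not line.startswith("-A "):
            continue  # skip table headers, chain policies, and COMMIT
        tokens = shlex.split(line)
        # Walk adjacent token pairs so each flag is seen with its value.
        for flag, value in zip(tokens, tokens[1:]):
            if flag in ADDRESS_FLAGS:
                pinned.append((tokens[1], value, line))
    return pinned
```

Each rule it reports is one that would have to be rewritten and reloaded whenever the address it names changes.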

One more caveat: what do we mean by dynamic IP changes? How dynamic?

If we’re talking about the kind of IP change where you shut down the host (perhaps to migrate its storage) and when it is booted again, it has a new IP, then that use case should be handled pretty well (again, as long as all configured host references use the hostname). This is the sort of thing you would run into in Amazon EC2 where hosts keep their IP as long as they stay up, but when shut down generally get a new IP. All the services on the host are started with the proper IP in use.

It’s a little trickier to support IP changes while the host is operating. Any services that bind specifically to the external IP address would need restarting. I’ve had a look while writing this, though, and this is a lot less common than I expected. As far as I can see, only one node host service does that: haproxy (which is used by openshift-port-proxy to proxy specific ports from the external interface back to gear ports). The httpd proxy listens on all interfaces so it’s exempt, and individual gears listen on internal interfaces only. On the broker and supporting hosts, ActiveMQ and MongoDB listen either locally or on all interfaces. The nameserver, despite being configured to listen on “all”, appears to bind to specific IPs, so it looks like it would need a restart. You could probably configure dhclient to do this when a lease changes (with appropriate SELinux policy changes to permit it). But you can see how easy this would be to get wrong.
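If you want to repeat that survey on your own hosts, here is a rough sketch of one way to do it. It assumes you have captured the output of netstat -tlnp (run as root so that program names appear) and picks out listeners bound to a specific, non-loopback address. The function name is made up for illustration.

```python
import ipaddress


def listeners_needing_restart(netstat_output):
    """Return (program, address, port) for TCP listeners bound to a specific IP."""
    needs_restart = []
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) < 6 or fields[0] not in ("tcp", "tcp6") or fields[5] != "LISTEN":
            continue  # headers and non-listening sockets
        # Local address looks like 10.0.0.5:53 or :::80; split on the last colon.
        host, _, port = fields[3].rpartition(":")
        try:
            addr = ipaddress.ip_address(host)
        except ValueError:
            continue  # unparseable local address; ignore it
        if addr.is_unspecified or addr.is_loopback:
            continue  # wildcard and loopback listeners survive an IP change
        program = fields[6].partition("/")[2] if len(fields) > 6 else ""
        needs_restart.append((program or "?", str(addr), int(port)))
    return needs_restart
```

Whatever this reports is a candidate for a restart when the lease changes; anything listening on 0.0.0.0, ::, or a loopback address is left out.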

Hopefully this brief exploration of the issues involved demonstrates why we’re going to stick with the current (non-)support policy for the time being. But I also expect some of you out there will try out OpenShift in a dynamic IP environment, and if so I hope you’ll let me know what you run into.