Sunday, November 8, 2015

Managing Runtime Configurations

Configuration Headaches

Managing runtime application configurations in large scale, heterogeneous environments is a total pain. Over the last five years I have attacked this problem in various ways, each with its own grace and flaws. The goal of this post is to sum up the evolution of my experience and hopefully impart some insight to any folks in similar plight. 
It starts with the challenges:
  1. Different environments get different configurations
  2. Different services in the same environment get different configurations
  3. Configuration files get massive and treacherous
  4. Configuration files get complex quickly with cross references
  5. Configurations have no safety against typos in keys (or values)
  6. Configurations have essentially no type safety guarantees
  7. Configurations have no README indicating their intent or usage
  8. Configurations can be accessed from anywhere in a code base (or outside) in inconsistent manners

Sidebar: Defining Runtime Application Configurations

For the context of this article, I use the term “runtime application configurations” to represent the configurations used by an application at runtime. I do not mean configuration as code type things like Puppet, Chef, or Ansible. For example, I would consider the log configurations in alogger.xml file for a JVM application that I own to be runtime app configs. I would not consider the configuration of Apache on a web server a runtime app config in this context, I would consider that a system configuration. More specifically, I would consider it configuration for an application whose source code I'm not writing.

Evolution of a solution

Compound keys

The most popular convention I've observed for managing configurations is by flat file such as ini, json, yaml, xml; very rarely is it in the same language as the code consuming it. My assumption is that this accomplishes a few things:
  1. They can be edited by non-programmers
  2. They can be consumed by multiple programming languages, thus freeing operations teams from managing multiple manifestations of the same configuration values
  3. In the case of compiled code, configurations do not need to be compiled (or recompiled when they change)
This restriction is very powerful in the guarantees of simplicity, but also limiting. One major limiting factor is that without a separate management system, it’s impossible to have different values of configurations per runtime environment. This was a major problem for us since we shared configuration files between all of our environments, but need to set different values for the same configuration key based on the environment.
Our initial solution was to encode the environmental context onto the keys such that we could effectively define unique values for a given key based on the environment that would use it. So instead of a single key-value pair, we would have multiple key-value pairs differentiated by environment. See an example of log levels for different environments.
level<dev> = DEBUG
level<staging> = INFO
level<prod> = WARN
When the INI gets parsed natively, each of the level's will be parsed as unique keys. The next step is custom code in the application to split the environments (e.g. <dev>) off of the keys and then figure out which environment and value applies in its running context.
Sound complicated? It is. Making this work involved writing some complicated INI parsing code as well as figuring out a way to tell an application the environment in which it’s running. The result was a brittle system that was error prone and slowed down onboarding. Furthermore, if any configurations needed to be shared between projects, those projects also needed to solve the parsing and environment setting problems.

Moving away from flat files

I asked myself, “Why are configurations always in flat files anyway?” I couldn’t come up with a convincing answer, so my next attempt was to write a straightforward configuration framework in PHP that expected different values per each key based on environmental context. All configuration files were PHP files that returned a large array. I experimented with model objects that could build configurations, but ultimately discarded the idea because I felt the time to solve the edge cases would outweigh the incremental value.
With this system, the mind bending key-to-environment relationship was slightly more clear. Another gain was removing cross references to other keys within the INI values. Since the values were set with PHP, it was possible to reuse values or base other values off each other, e.g. url = "{$scheme}{$domain}{$path}".
Although the system was less brittle and easier to understand, the configuration files themselves were still rather large and it wasn’t clear how defaults worked between environments. Worse, the files could only be used by PHP applications.

A separate configuration management system

My biggest problem was the size and complications of putting configurations for every environment in the configuration files. So I decided to look into using a separate system to manage the differences configurations, which could then deliver only the necessary content to a given consumer. I landed on Puppet since we already use it extensively elsewhere. This allowed me to write files containing only the configurations that mattered in the respective environments. Based on this, going back to flat files was possible. I could also tie together configuration values in Puppet before writing the files, so there was no need for cross references within the configuration file itself.
This was great, my first 5 problems were solved. But still, type safety was not enforced, configuration files had no guaranteed documentation, and understanding how they were used within an application was a grep nightmare.

Configuration model objects

I decided to revisit configuration model objects. However this time not as builders, but instead as accessors. By funneling all access of configurations through a single point, I figured that I could enforce a few things.
I buried the configuration parsing code inside an abstract class (aptly named Configuration) with protected methods for getting at the parsed values. Configuration model classes could then extend the base and expose relevant methods for their configurations. For example, a logging configuration class would have a getLevel() method, which behind the scenes would parse the configurations file and return back the value. Any user that wanted to load configurations could only do so by using a Configurationclass, or writing a new one. Next, I tied the name of the configuration file to the name of the class, such that if a class were named LogConfigurations, the configuration file behind the scenes needed to be named log.conf. Finding usages of a given configuration file became trivial.
I added type expectations methods to the parsing code so that invalid values would raise exceptions. TheConfiguration class requires that all extending classes implement a README() method, so that users understand the classes intention and expected INI contents. This is coupled with a generic unit test that parses the README() output and ensures it can actually be used by the class. Furthermore, each of the accessor methods has its own documentation.
Configuration model classes also opened the door for more sophisticated configurations. Since access is inside a class, and not just array referencing, the class can do smart things like tie multiple values together or call out to other classes. One really great manifestation of this was the ability to create the ProjectConfiguration class which uses the project version control to load a configuration file’s contents.

Get involved

That’s where I am today. It’s been a fun journey and I’m happy (atm) with where things are. I’m excited to see where this goes next and to learn other ways folks have found to solve this problem.
One key component that I’d like to see next is the configuration delivery mechanism. Currently, Puppet solves the problem reasonably for short lived PHP processes. However, the two areas that I’d like to improve are:
  1. Automatic reloading by long lived processes. Maybe this is just from a file watch.
  2. A better interface for managing the configurations. This could get extensive. One tool here would be validation. This could be much more than just type checking, but semantics and plugins for special configuration groups (e.g. validate a database in a slave section is up not a master). Typos in keys would be completely avoided. Another would be the ability to see what values a given environment would get.
Please post back with comments or pull requests!

No comments:

Post a Comment