Confuse: Painless Configuration

Confuse is a straightforward, full-featured configuration system for Python.

Basic Usage

Set up your Configuration object, which provides unified access to all of your application’s config settings:

config = confuse.Configuration('MyGreatApp', __name__)

The first parameter is required; it’s the name of your application, which will be used to search the system for a config file named config.yaml. See Search Paths for the specific locations searched.

The second parameter is optional: it’s the name of a module that will guide the search for a defaults file. Use this if you want to include a config_default.yaml file inside your package. (The included example package does exactly this.)

Now, you can access your configuration data as if it were a simple structure consisting of nested dicts and lists—except that you need to call the method .get() on the leaf of this tree to get the result as a value:

value = config['foo'][2]['bar'].get()

Under the hood, accessing items in your configuration tree builds up a view into your app’s configuration. Then, get() flattens this view into a value, performing a search through each configuration data source to find an answer. (More on views later.)

If you know that a configuration value should have a specific type, just pass that type to get():

int_value = config['number_of_goats'].get(int)

This way, Confuse will either give you an integer or raise a ConfigTypeError if the user has messed up the configuration. You’re safe to assume after this call that int_value has the right type. If the key doesn’t exist in any configuration file, Confuse will raise a NotFoundError. Together, catching these exceptions (both subclasses of confuse.ConfigError) lets you painlessly validate the user’s configuration as you go.

View Theory

The Confuse API is based on the concept of views. You can think of a view as a place to look in a config file: for example, one view might say “get the value for key number_of_goats”. Another might say “get the value at index 8 inside the sequence for key animal_counts”. To get the value for a given view, you resolve it by calling the get() method.

This concept separates the specification of a location from the mechanism for retrieving data from a location. (In this sense, it’s a little like XPath: you specify a path to data you want and then you retrieve it.)

Using views, you can write config['animal_counts'][8] and know that no exceptions will be raised until you call get(), even if the animal_counts key does not exist. More importantly, it lets you write a single expression to search many different data sources without preemptively merging all sources together into a single data structure.

Views also solve an important problem with overriding collections. Imagine, for example, that you have a dictionary called deliciousness in your config file that maps food names to tastiness ratings. If the default configuration gives carrots a rating of 8 and the user’s config rates them a 10, then clearly config['deliciousness']['carrots'].get() should return 10. But what if the two data sources have different sets of vegetables? If the user provides a value for broccoli and zucchini but not carrots, should carrots have a default deliciousness value of 8 or should Confuse just throw an exception? With Confuse’s views, the application gets to decide.

The above expression, config['deliciousness']['carrots'].get(), returns 8 (falling back on the default). However, you can also write config['deliciousness'].get(). This expression will cause the entire user-specified mapping to override the default one, providing a dict object like {'broccoli': 7, 'zucchini': 9}. As a rule, then, resolve a view at the same granularity you want config files to override each other.

Warning

It may appear that calling config.get() would retrieve the entire configuration at once. However, this will return only the highest-priority configuration source, masking any lower-priority values for keys that are not present in the top source. This pitfall is especially likely when using Command-Line Options or Environment Variables, which may place an empty configuration at the top of the stack. A subsequent call to config.get() might then return no configuration at all.

Validation

We saw above that you can easily assert that a configuration value has a certain type by passing that type to get(). But sometimes you need to do more than just type checking. For this reason, Confuse provides a few methods on views that perform fancier validation or even conversion:

  • as_filename(): Normalize a filename, substituting tildes and absolute-ifying relative paths. For filenames defined in a config file, by default the filename is relative to the application’s config directory (Configuration.config_dir(), as described below). However, if the config file was loaded with the base_for_paths parameter set to True (see Manually Specifying Config Files), then a relative path refers to the directory containing the config file. A relative path from any other source (e.g., command-line options) is relative to the working directory. For full control over relative path resolution, use the Filename template directly (see Filename).

  • as_choice(choices): Check that a value is one of the provided choices. The argument should be a sequence of possible values. If the sequence is a dict, then this method returns the associated value instead of the key.

  • as_number(): Raise an exception unless the value is of a numeric type.

  • as_pairs(): Get a collection as a list of pairs. The collection should be a list of elements that are either pairs (i.e., two-element lists) already or single-entry dicts. This can be helpful because, in YAML, lists of single-element mappings have a simple syntax (- key: value) and, unlike real mappings, preserve order.

  • as_str_seq(): Given either a string or a list of strings, return a list of strings. A single string is split on whitespace.

  • as_str_expanded(): Expand any environment variables contained in a string using os.path.expandvars().

For example, config['path'].as_filename() ensures that you get a reasonable filename string from the configuration. And calling config['direction'].as_choice(['up', 'down']) will raise a ConfigValueError unless the direction value is either “up” or “down”.

Command-Line Options

Arguments to command-line programs can be seen as just another source for configuration options. Just as options in a user-specific configuration file should override those from a system-wide config, command-line options should take priority over all configuration files.

You can use the argparse and optparse modules from the standard library with Confuse to accomplish this. Just call the set_args method on any view and pass in the object returned by the command-line parsing library. Values from the command-line option namespace object will be added to the overlay for the view in question. For example, with argparse:

args = parser.parse_args()
config.set_args(args)

Correspondingly, with optparse:

options, args = parser.parse_args()
config.set_args(options)

This call will turn all of the command-line options into a top-level source in your configuration. The key associated with each option in the parser will become a key available in your configuration. For example, consider this argparse script:

config = confuse.Configuration('myapp')
parser = argparse.ArgumentParser()
parser.add_argument('--foo', help='a parameter')
args = parser.parse_args()
config.set_args(args)
print(config['foo'].get())

This will allow the user to override the configured value for key foo by passing --foo <something> on the command line.

Overriding nested values can be accomplished by passing dots=True and have dot-delimited properties on the incoming object.

parser.add_argument('--bar', help='nested parameter', dest='foo.bar')
args = parser.parse_args()  # args looks like: {'foo.bar': 'value'}
config.set_args(args, dots=True)
print(config['foo']['bar'].get())

set_args works with generic dictionaries too.

args = {
  'foo': {
    'bar': 1
  }
}
config.set_args(args, dots=True)
print(config['foo']['bar'].get())

Note that, while you can use the full power of your favorite command-line parsing library, you’ll probably want to avoid specifying defaults in your argparse or optparse setup. This way, Confuse can use other configuration sources—possibly your config_default.yaml—to fill in values for unspecified command-line switches. Otherwise, the argparse/optparse default value will hide options configured elsewhere.

Environment Variables

Confuse supports using environment variables as another source to provide an additional layer of configuration. The environment variables to include are identified by a prefix, which defaults to the uppercased name of your application followed by an underscore. Matching environment variable names are first stripped of this prefix and then lowercased to determine the corresponding configuration option. To load the environment variables for your application using the default prefix, just call set_env on your Configuration object. Config values from the environment will then be added as an overlay at the highest precedence. For example:

export MYAPP_FOO=something
import confuse
config = confuse.Configuration('myapp', __name__)
config.set_env()
print(config['foo'].get())

Nested config values can be overridden by using a separator string in the environment variable name. By default, double underscores are used as the separator for nesting, to avoid clashes with config options that contain single underscores. Note that most shells restrict environment variable names to alphanumeric and underscore characters, so dots are not a valid separator.

export MYAPP_FOO__BAR=something
import confuse
config = confuse.Configuration('myapp', __name__)
config.set_env()
print(config['foo']['bar'].get())

Both the prefix and the separator can be customized when using set_env. Note that prefix matching is done to the environment variables prior to lowercasing, while the separator is matched after lowercasing.

export APPFOO_NESTED_BAR=something
import confuse
config = confuse.Configuration('myapp', __name__)
config.set_env(prefix='APP', sep='_nested_')
print(config['foo']['bar'].get())

For configurations that include lists, use integers starting from 0 as nested keys to invoke “list conversion.” If any of the sibling nested keys are not integers or the integers are not sequential starting from 0, then conversion will not be performed. Nested lists and combinations of nested dicts and lists are supported.

export MYAPP_FOO__0=first
export MYAPP_FOO__1=second
export MYAPP_FOO__2__BAR__0=nested
import confuse
config = confuse.Configuration('myapp', __name__)
config.set_env()
print(config['foo'].get())  # ['first', 'second', {'bar': ['nested']}]

For consistency with YAML config files, the values of environment variables are type converted using the same YAML parser used for file-based configs. This means that numeric strings will be converted to integers or floats, “true” and “false” will be converted to booleans, and the empty string or “null” will be converted to None. Setting an environment variable to the empty string or “null” allows unsetting a config value from a lower-precedence source.

To change the lowercasing and list handling behaviors when loading environment variables or to enable full YAML parsing of environment variables, you can initialize an EnvSource configuration source directly.

If you use config overlays from both command-line args and environment variables, the order of calls to set_args and set_env will determine the precedence, with the last call having the highest precedence.

Search Paths

Confuse looks in a number of locations for your application’s configurations. The locations are determined by the platform. For each platform, Confuse has a list of directories in which it looks for a directory named after the application. For example, the first search location on Unix-y systems is $XDG_CONFIG_HOME/AppName for an application called AppName.

Here are the default search paths for each platform:

  • macOS: ~/.config/app and ~/Library/Application Support/app

  • Other Unix: ~/.config/app and /etc/app

  • Windows: %APPDATA%\app where the APPDATA environment variable falls back to %HOME%\AppData\Roaming if undefined

Both macOS and other Unix operating sytems also try to use the XDG_CONFIG_HOME and XDG_CONFIG_DIRS environment variables if set then search those directories as well.

Users can also add an override configuration directory with an environment variable. The variable name is the application name in capitals with “DIR” appended: for an application named AppName, the environment variable is APPNAMEDIR.

Manually Specifying Config Files

You may want to leverage Confuse’s features without Search Paths. This can be done by manually specifying the YAML files you want to include, which also allows changing how relative paths in the file will be resolved:

import confuse
# Instantiates config. Confuse searches for a config_default.yaml
config = confuse.Configuration('MyGreatApp', __name__)
# Add config items from specified file. Relative path values within the
# file are resolved relative to the application's configuration directory.
config.set_file('subdirectory/default_config.yaml')
# Add config items from a second file. If some items were already defined,
# they will be overwritten (new file precedes the previous ones). With
# `base_for_paths` set to True, relative path values in this file will be
# resolved relative to the config file's directory (i.e., 'subdirectory').
config.set_file('subdirectory/local_config.yaml', base_for_paths=True)

val = config['foo']['bar'].get(int)

Your Application Directory

Confuse provides a simple helper, Configuration.config_dir(), that gives you a directory used to store your application’s configuration. If a configuration file exists in any of the searched locations, then the highest-priority directory containing a config file is used. Otherwise, a directory is created for you and returned. So you can always expect this method to give you a directory that actually exists.

As an example, you may want to migrate a user’s settings to Confuse from an older configuration system such as ConfigParser. Just do something like this:

config_filename = os.path.join(config.config_dir(),
                               confuse.CONFIG_FILENAME)
with open(config_filename, 'w') as f:
    yaml.dump(migrated_config, f)

Dynamic Updates

Occasionally, a program will need to modify its configuration while it’s running. For example, an interactive prompt from the user might cause the program to change a setting for the current execution only. Or the program might need to add a derived configuration value that the user doesn’t specify.

To facilitate this, Confuse lets you assign to view objects using ordinary Python assignment. Assignment will add an overlay source that precedes all other configuration sources in priority. Here’s an example of programmatically setting a configuration value based on a DEBUG constant:

if DEBUG:
    config['verbosity'] = 100
...
my_logger.setLevel(config['verbosity'].get(int))

This example allows the constant to override the default verbosity level, which would otherwise come from a configuration file.

Assignment works by creating a new “source” for configuration data at the top of the stack. This new source takes priority over all other, previously-loaded sources. You can cause this explicitly by calling the set() method on any view. A related method, add(), works similarly but instead adds a new lowest-priority source to the bottom of the stack. This can be used to provide defaults for options that may be overridden by previously-loaded configuration files.

YAML Tweaks

Confuse uses the PyYAML module to parse YAML configuration files. However, it deviates very slightly from the official YAML specification to provide a few niceties suited to human-written configuration files. Those tweaks are:

  • All strings are returned as Python Unicode objects.

  • YAML maps are parsed as Python OrderedDict objects. This means that you can recover the order that the user wrote down a dictionary.

  • Bare strings can begin with the % character. In stock PyYAML, this will throw a parse error.

To produce a YAML string reflecting a configuration, just call config.dump(). This does not cleanly round-trip YAML, but it does play some tricks to preserve comments and spacing in the original file.

Custom YAML Loaders

You can also specify your own PyYAML Loader object to parse YAML files. Supply the loader parameter to a Configuration constructor, like this:

config = confuse.Configuration("name", loader=yaml.Loaded)

To imbue a loader with Confuse’s special parser overrides, use its add_constructors method:

class MyLoader(yaml.Loader):
    ...
confuse.Loader.add_constructors(MyLoader)
config = confuse.Configuration("name", loader=MyLoader)

Configuring Large Programs

One problem that must be solved by a configuration system is the issue of global configuration for complex applications. In a large program with many components and many config options, it can be unwieldy to explicitly pass configuration values from component to component. You quickly end up with monstrous function signatures with dozens of keyword arguments, decreasing code legibility and testability.

In such systems, one option is to pass a single Configuration object through to each component. To avoid even this, however, it’s sometimes appropriate to use a little bit of shared global state. As evil as shared global state usually is, configuration is (in my opinion) one valid use: since configuration is mostly read-only, it’s relatively unlikely to cause the sorts of problems that global values sometimes can. And having a global repository for configuration option can vastly reduce the amount of boilerplate threading-through needed to explicitly pass configuration from call to call.

To use global configuration, consider creating a configuration object in a well-known module (say, the root of a package). But since this object will be initialized at module load time, Confuse provides a LazyConfig object that loads your configuration files on demand instead of when the object is constructed. (Doing complicated stuff like parsing YAML at module load time is generally considered a Bad Idea.)

Global state can cause problems for unit testing. To alleviate this, consider adding code to your test fixtures (e.g., setUp in the unittest module) that clears out the global configuration before each test is run. Something like this:

config.clear()
config.read(user=False)

These lines will empty out the current configuration and then re-load the defaults (but not the user’s configuration files). Your tests can then modify the global configuration values without affecting other tests since these modifications will be cleared out before the next test runs.

Redaction

You can also mark certain configuration values as “sensitive” and avoid including them in output. Just set the redact flag:

config['key'].redact = True

Then flatten or dump the configuration like so:

config.dump(redact=True)

The resulting YAML will contain “key: REDACTED” instead of the original data.