Making the Best of Config | Thorsten Frommen

October 10, 2019

Over the last half year, I have been looking into several project repositories on GitHub, and I came across quite a few different ways of configuring (and using) various tools and services. In this post, I would like to share some simple yet effective suggestions that, in my opinion, make for much easier and more streamlined usage, while still allowing for (almost?) the same flexibility.

A Place for Configuration

When working on a software project, we use or interact with many different tools or services. These might help us write better code by performing static analysis or by running dynamic tests (e.g., ESLint, PHP_CodeSniffer, Jest, or PHPUnit), or they help us manage our dependencies (e.g., Composer, or Yarn), or they provide useful functionality around building, deployment or continuous integration (e.g., webpack, Babel, GitHub Actions, or Travis CI), and more.

How to Configure

Usually, these tools can be configured, and usually this happens either via a dedicated config file or on the command line, or both. Some tools can also be configured via some other tool’s config file. For example, Jest and ESLint respect config settings specified in the package.json file, while Altis parses the composer.json file for configuration.

Personally, I like it better when there are multiple individual files, each for configuring a single tool only, but I don’t have a strong enough opinion on this. Having individual files allows for easier discoverability (in case you know what file name to look for, see next section) and you can easily copy select config files from one repository to another. Having various configuration data in one and the same file means you can reuse it all in one go. But, to be fair, you will usually need to change so many things in either the composer.json or package.json file that I don’t think this really is a good enough reason or actual advantage anyway.

Config Files

Now, coming back to config files, we sometimes find different names (and even extensions or types) for one and the same tool.

Config File Types

While many tools support JSON-formatted files, these don’t have to have the .json extension (e.g., .eslintrc). Now, JSON sucks when it comes to either commenting or conditional/dynamic configuration, so a lot of tools also support other formats such as a JavaScript file. Even though you might not have any dynamically/conditionally defined configuration, it is best practice to opt for the more flexible config file type, which is why most modern JavaScript projects come with a JavaScript-based .eslintrc.js config file (instead of the static .eslintrc JSON-style variant).

One important thing when it comes to different places for config data is the precedence or priority order. Some tools will look for configuration data/files in a specific order, and then read all of what they might find. Other tools might only read whatever they find first, ignoring everything else. So if you have config data for some tool both in the package.json and a dedicated file, it’s crucial to know what will be parsed, and in what order. It should be obvious, but you should not provide configuration for one and the same tool in multiple places.

Distributable Config Files

Yet another variant of configuration files is the distributable, ending in the .dist pseudo-extension. The idea of these files is to distribute a default configuration that anyone can use, and that CI will use. However, if someone wants to adapt certain things, for example, if they want more verbose and/or colored output, they can do so by creating a local copy of the file, without the .dist ending.

For PHPUnit, as an example, one would find a phpunit.xml.dist in version control, while people might have a local phpunit.xml, which is ignored. Ideally, the latter is based on (i.e., it includes and extends) the distributable config. That way, you will automatically benefit from any change to the shared file, while you are still able to selectively ignore/exclude what doesn’t seem good for you.

Personally, I think, where possible, we should be using the distributable variant, for example, include phpunit.xml.dist in our project files, and add phpunit.xml to our .gitignore file.
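PHP_CodeSniffer is one tool where the include-and-extend idea can be sketched. The following is a minimal, hypothetical local phpcs.xml (git-ignored) that references the shared phpcs.xml.dist via a relative path and adds a personal preference on top. The ruleset name and the chosen preference are assumptions for illustration only.

```xml
<?xml version="1.0"?>
<!-- phpcs.xml: local, git-ignored override (hypothetical example). -->
<ruleset name="Local overrides">
    <!-- Pull in the rules from the shared, version-controlled config. -->
    <rule ref="./phpcs.xml.dist"/>

    <!-- Personal preference: colored output. -->
    <arg name="colors"/>
</ruleset>
```

Whether every setting of the referenced file (such as target paths) carries over through the reference is worth verifying against the PHP_CodeSniffer version in use.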

Custom Config File Names

One of the more exotic things around config files—and something that, if I remember correctly, I did not see in a lot of Human Made repositories—is using custom config names. For example, config.TOOL.json or TOOL.config.js, with “TOOL” being the name of some tool. While this is possible (for some/most of the tools we use), this clearly means that you have to explicitly tell the tools where their respective config file is located. They wouldn’t find anything otherwise.

To be quite frank: I don’t see a point in doing that. If everyone did that everywhere, and if that was possible for all tools, and if there was a clearly defined standard on how to name config files, then yes. Maybe.

But then again, having a list of names (and places) for config files for each tool or service we are using already is a standard. A standard doesn’t need to mean that everything is named the same. It rather means that you know how something is named, because the standard includes either the process to name things or an actual and extensive list of names.

Config Folders

Another thing that I saw in different forms and shapes is config data in subfolders. This could be the actual main config file in a dedicated folder, for example, .config/webpack.config.js. Or it could be that the main config file still lives in the root, while certain pieces of the actual configuration data (which that main config file imports) are kept in a dedicated folder, for example, a webpack.config.js file (conditionally) using various other files from a webpack subfolder.

Now, I understand that keeping multiple (!) configuration files in a single config folder makes for a leaner root. But is that really that important? I mean, if you need to find a certain file, you usually know where to look. There is not much that lives in the project root and that you might need to search for. Code usually lives in a subfolder (e.g., inc, php, or src), tests live in a subfolder (e.g., __tests__, .tests or tests) or are co-located with the production code files, and documentation that goes beyond the readme lives in a subfolder too (e.g., docs).

And again, for some tools, we accept that it’s just not possible otherwise. For example, GitHub clearly specifies where certain meta or config files such as issue templates or workflow definitions need to be placed: in the root, or in an optional .github subfolder. Putting a .gitignore or an .editorconfig file in some specific folder has an effect on that very folder (and potential subfolders). There’s no arguing around that fact. So, while it is possible to keep certain config files in a subfolder, and then point each individual tool to its respective config file, why would you want to do this? This is neither consistent, nor, to me, “better” in any way.

Configuration Data and You

In your opinion, what are the pros and cons of using dedicated config files versus specifying configuration data inline in some other file (e.g., package.json)?

What are the config file names and types you see the most? And which ones would you like to use, consistently, across all new projects? Why?

What are your thoughts on distributable config files (e.g., phpcs.xml.dist)?

What about a dedicated folder for config files (e.g., config or .config)? Or maybe even tool-specific folders such as webpack? Do you see any of that in projects? If so, does it make sense to you?

Complete Configuration

Most tools we use allow for complete configuration, meaning everything can be included in static configuration data in some file, so there is nothing you have to specify on a per-use basis. However, I found that oftentimes this is not done, or not to its fullest extent, at least.

Targeting Files

If you use a specific tool for project work, you most probably have a clear use case for that. In general, this means that you want to run that tool on one or more specific files or folders. For example, some test runner executing PHP or JS tests, maybe even specific tests. Or some linting tool checking your source and/or production code. Or… You get the point.

Now, you certainly can do this manually, every single time you run the tool via the command line. Or you could store that exact longer shell command somewhere to execute, for example, in the scripts section of either the composer.json or the package.json file. Or you could hard-code that in your CI configuration file, say .travis.yml.

How to do this exactly depends on the individual tool. Some take one or more target paths as actual command line arguments (e.g., eslint target.js). Other tools might have dedicated options for that (e.g., phpunit --filter "Data/*"). And other tools might even support both.

Or you could just include this information right in the config file/definition itself, where possible. This would mean that, no matter how and where you run the tool, it automatically knows what the target is.

Assuming your project is a regular WordPress plugin, and you have a main plugin file, plugin.php, and the rest of the code in the inc subfolder, then letting PHP_CodeSniffer know is as easy as adding these two lines to your phpcs.xml.dist config file:

    <file>inc</file>
    <file>plugin.php</file>

For PHP_CodeSniffer specifically, you also might want to include that you are only interested in linting PHP files (if that’s what you want, because it can also lint JavaScript and CSS files):

    <arg name="extensions" value="php"/>
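The same idea applies to test runners. As a rough sketch, a phpunit.xml.dist could declare where the tests live, so that running phpunit without any arguments already targets the right folder. The suite name, the tests/unit path, and the bootstrap file are assumptions, not taken from any particular project.

```xml
<?xml version="1.0"?>
<!-- phpunit.xml.dist: suite name, path, and bootstrap file are assumptions. -->
<phpunit bootstrap="vendor/autoload.php">
    <testsuites>
        <!-- Every test file found in this folder belongs to the suite. -->
        <testsuite name="unit">
            <directory>tests/unit</directory>
        </testsuite>
    </testsuites>
</phpunit>
```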

Ignoring Files

Sometimes you also want or have to explicitly tell a tool which files or folders not to consider. This is usually done via a dedicated ignore file, for example, .gitignore or .eslintignore. Other tools (also) expect or support this in their main config file, for example, PHP_CodeSniffer:

    <exclude-pattern>content/uploads/*</exclude-pattern>

You might be able to do this on a per-call basis, on the command line. But, again, I suggest making this part of the actual configuration.

Presentation

Configuration, be it hard-coded as a shell command with multiple flags and options or in the form of a config file, almost always includes the What, meaning some sort of rules for or definitions of what exactly the individual tool should do when executed. This includes rulesets, potential presets or plugins, and maybe also required extra configuration for those presets or plugins.

What is missing most of the time, however, is the How. If a tool does what it should do, how should it inform the user about what is going on at the moment? Should there be output or not at all? If yes, how verbose should it be? Should the output use colors? Should any report files get created as part of the process?

Most of the above questions can be answered as part of the (default) configuration already. I am not saying that this always should be done. However, if there is a preferred way of how to do (certain) things, then why not make that part of the config? Most tools also allow config values to be overridden, so you still can disable or customize various defaults, if need be.
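For PHP_CodeSniffer, such presentation defaults can live in the config file as arg elements. The lines below are a small sketch of what that might look like inside the ruleset element of a phpcs.xml.dist. Which flags and which report type a team prefers is an assumption here.

```xml
<!-- Inside the <ruleset> element of phpcs.xml.dist (illustrative defaults). -->

<!-- Short flags: show progress (p), show sniff codes (s), be verbose (v). -->
<arg value="psv"/>

<!-- Use colors in the output. -->
<arg name="colors"/>

<!-- Print the full report; another report type could be chosen instead. -->
<arg name="report" value="full"/>
```

Anyone who prefers quieter output can still pass different options on the command line for a single run.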

Putting It All Together

Now, if we keep our config files where tools can pick them up automatically, and if we ensure that our config files are as complete as possible or as makes sense, all we really need to do then is run the tools.

PHP_CodeSniffer, as an example, can be executed by running something like a specific phpcs Composer script. Let’s imagine this is what it looks like:

    "scripts": {
        "phpcs": "vendor/bin/phpcs --standard=./path/to/config.xml -p -s -v --extensions=php inc"
    }

Changing the config file to the default name and place phpcs.xml.dist gets rid of the --standard option:

    "scripts": {
        "phpcs": "vendor/bin/phpcs -p -s -v --extensions=php inc"
    }

If the config already includes what to target, we don’t need to specify that:

    "scripts": {
        "phpcs": "vendor/bin/phpcs -p -s -v"
    }

If the reporting or presentation configuration is also specified, let’s remove that too:

    "scripts": {
        "phpcs": "vendor/bin/phpcs"
    }

At that point we might realize this is just an alias for an existing binary (coincidentally with the same name). Since this is Composer-managed, this could be changed to this:

    "scripts": {
        "phpcs": "phpcs"
    }

This means we can composer run the phpcs script, which then executes the phpcs binary.

Since Composer also provides a way to execute any binary, we don’t even need the scripts entry, and can just run composer exec phpcs (in this case, phpcs is the name of the binary living in vendor/bin). But this is just preference.

And know what? It does not matter how and where you run the tool. It will always do the exact same thing, and not use differing config. Want to do it the Composer scripts way? Run composer run phpcs. Simply execute a Composer-managed binary? Sure, call composer exec phpcs. Like to do it the more manual way? Running ./vendor/bin/phpcs works just fine. Even if you have PHP_CodeSniffer installed globally (or have it in your PATH somehow), running phpcs will automatically pick up all your config.

✨

Tags: Best Practice, Developer Tools, Quality