When writing unit or integration tests for PHP, occasionally you need to test a specific scenario multiple times, with different input data. To make this easy and less repetitive, PHPUnit provides what is called Data Providers.

A data provider is a method of a test case class that provides several sets of input data for one or more test methods. It does so by returning a two-dimensional array, that is, an array in which each element is itself an array:

/**
 * @return array[]
 */
public function provideInvalidTimestamp(): array {

	return [
		[ 'invalid' ],
		[ false ],
		[ [ 48151623420000 ] ],
	];
}

When running your tests, PHPUnit will pass the elements of these nested arrays as arguments to the test methods referencing a data provider. Referencing a data provider is done by using the @dataProvider annotation:

/**
 * @dataProvider provideInvalidTimestamp
 */
public function testStuff( $timestamp ) {

	// ...
}
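
Put together, a complete test case might look like the following minimal sketch. The function normalize_timestamp() and the class name TimestampTest are made up for illustration and are not part of PHPUnit. The provider is declared static because recent PHPUnit versions expect that; older versions also accept the non-static form shown above.

```php
<?php

use PHPUnit\Framework\TestCase;

/**
 * Hypothetical function under test: returns the given value if it is
 * a positive integer, and 0 for anything else.
 */
function normalize_timestamp( $timestamp ): int {

	return is_int( $timestamp ) && $timestamp > 0 ? $timestamp : 0;
}

class TimestampTest extends TestCase {

	/**
	 * @return array[]
	 */
	public static function provideInvalidTimestamp(): array {

		return [
			[ 'invalid' ],
			[ false ],
			[ [ 48151623420000 ] ],
		];
	}

	/**
	 * Runs three times, once per data set returned by the provider.
	 *
	 * @dataProvider provideInvalidTimestamp
	 */
	public function testInvalidTimestampBecomesZero( $timestamp ) {

		self::assertSame( 0, normalize_timestamp( $timestamp ) );
	}
}
```

Each inner array has exactly one element here, because the test method takes exactly one parameter.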

I’m sure this is not news to a lot of people, so in this post, I’d like to share some quick tips on how to make the best use of data providers.

Scenario Names

To allow for better readability, and thus easier maintenance, I recommend that you give the different scenarios a name. You could, of course, add a comment above each data set, but there is actual support for this built into PHPUnit, and it works by using the array keys:

public function provideInvalidTimestamp(): array {

	return [
		'invalid' => [ 'invalid' ],
		'missing' => [ false ],
		'array'   => [ [ 48151623420000 ] ],
	];
}

PHPUnit will use these keys as names when reporting failing tests: a failure then reads with data set "missing" rather than with data set #1.
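
If you prefer, the same named data sets can be produced with a generator, since a data provider may return an iterator instead of an array. This is only a small sketch: the class name and the placeholder assertion are hypothetical, and the provider is static on the assumption that you run a recent PHPUnit version.

```php
<?php

use PHPUnit\Framework\TestCase;

class NamedTimestampTest extends TestCase {

	/**
	 * Yields one data set per scenario; the key is the scenario name.
	 */
	public static function provideInvalidTimestamp(): Generator {

		yield 'invalid' => [ 'invalid' ];
		yield 'missing' => [ false ];
		yield 'array' => [ [ 48151623420000 ] ];
	}

	/**
	 * @dataProvider provideInvalidTimestamp
	 */
	public function testTimestampIsNotAnInteger( $timestamp ) {

		// Placeholder assertion: none of the provided values is an integer.
		self::assertFalse( is_int( $timestamp ) );
	}
}
```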

Argument Names

To further help the people reading your tests, you may want to bring names to the second dimension as well. Each element in a second-level array represents a value for a test method parameter, so let’s use descriptive parameter names as keys in the nested arrays. This does, of course, mean breaking each data set into multiple lines:

public function provideSearchData(): array {

	return [
		'post' => [
			'post type' => 'post',
			'query' => [ 's' => '1' ],
			'expected' => [ 'post_parent' => 1 ],
		],
		'attachment' => [
			'post type' => 'attachment',
			'query' => [ 's' => '1' ],
			'expected' => [ 'p' => 1 ],
		],
		'search' => [
			'post type' => false,
			'query' => [ 's' => 'string' ],
			'expected' => [ 's' => 'string' ],
		],
		'no search' => [
			'post type' => '',
			'query' => [],
			'expected' => [],
		],
	];
}
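
A test method consuming such a provider could look like the sketch below. build_query_args() is a hypothetical function invented for this example, and only two of the data sets are repeated to keep it short. Note that the keys here are spelled exactly like the parameter names (post_type instead of post type): older PHPUnit versions pass the values by position and ignore the keys, but a version that maps string keys to named arguments would need them to match. Treat that as an assumption to verify against the PHPUnit version you use.

```php
<?php

use PHPUnit\Framework\TestCase;

/**
 * Hypothetical function under test: turns a search request into query args.
 * A numeric search term is treated as an ID lookup for known post types.
 */
function build_query_args( $post_type, array $query ): array {

	if ( ! isset( $query['s'] ) || '' === $query['s'] ) {
		return [];
	}

	if ( is_numeric( $query['s'] ) && 'post' === $post_type ) {
		return [ 'post_parent' => (int) $query['s'] ];
	}

	if ( is_numeric( $query['s'] ) && 'attachment' === $post_type ) {
		return [ 'p' => (int) $query['s'] ];
	}

	return [ 's' => $query['s'] ];
}

class SearchQueryTest extends TestCase {

	public static function provideSearchData(): array {

		return [
			'post' => [
				'post_type' => 'post',
				'query' => [ 's' => '1' ],
				'expected' => [ 'post_parent' => 1 ],
			],
			'search' => [
				'post_type' => false,
				'query' => [ 's' => 'string' ],
				'expected' => [ 's' => 'string' ],
			],
		];
	}

	/**
	 * The parameter order matches the order of the values in each data set.
	 *
	 * @dataProvider provideSearchData
	 */
	public function testBuildQueryArgs( $post_type, array $query, array $expected ) {

		self::assertSame( $expected, build_query_args( $post_type, $query ) );
	}
}
```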

Depending on the number of data sets and parameters, repeating the keys in every single data set may be unnecessary. Copying and pasting is easy, and no one cares about the extra characters, but if you want to save yourself some typing, you could provide names for the first data set only. I still recommend writing one value per line:

public function provideSearchData(): array {

	return [
		'post' => [
			'post type' => 'post',
			'query' => [ 's' => '1' ],
			'expected' => [ 'post_parent' => 1 ],
		],
		'attachment' => [
			'attachment',
			[ 's' => '1' ],
			[ 'p' => 1 ],
		],
		// ...
	];
}

Multiple Data Providers

As of PHPUnit 5.7, a test method can reference more than one data provider. You would simply use multiple @dataProvider annotations:

/**
 * @dataProvider provideEmptyTimestamp
 * @dataProvider provideInvalidTimestamp
 */
public function testStuff( $timestamp ) {

	// ...
}

This comes in super handy if you want to split different kinds of input data, for example, various empty data in contrast to various (non-empty) invalid data, as illustrated in the above code snippet.
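
As a rough sketch, the two providers referenced above might be defined as follows. The contents of provideEmptyTimestamp() are assumed for illustration (only its name appears in the snippet), and sanitize_timestamp() is a hypothetical function under test. The providers are static on the assumption of a recent PHPUnit version.

```php
<?php

use PHPUnit\Framework\TestCase;

/**
 * Hypothetical function under test: anything that is not a positive
 * integer is turned into 0.
 */
function sanitize_timestamp( $timestamp ): int {

	return is_int( $timestamp ) && $timestamp > 0 ? $timestamp : 0;
}

class MultipleProvidersTest extends TestCase {

	// Assumed contents: various "empty" values.
	public static function provideEmptyTimestamp(): array {

		return [
			'empty string' => [ '' ],
			'zero' => [ 0 ],
			'null' => [ null ],
		];
	}

	// Non-empty, but still invalid values.
	public static function provideInvalidTimestamp(): array {

		return [
			'invalid' => [ 'invalid' ],
			'missing' => [ false ],
			'array' => [ [ 48151623420000 ] ],
		];
	}

	/**
	 * Runs once for every data set of both providers, six times in total.
	 *
	 * @dataProvider provideEmptyTimestamp
	 * @dataProvider provideInvalidTimestamp
	 */
	public function testUnusableTimestampBecomesZero( $timestamp ) {

		self::assertSame( 0, sanitize_timestamp( $timestamp ) );
	}
}
```

The scenario names are kept distinct across both providers, so a failure report still points at exactly one data set.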

You could also use this for valid (non-falsy) data, for example, if you integrate with some tool, API, or data format that exists in two (or more) versions. Separate data providers then express: “The behavior is the same, but these data sets are version 1, while those are version 2.”

Anything Missing?

Is this post missing anything around PHPUnit’s data providers? Please share in the comments! Thanks!
