Fixing Block Strings with PHP CS Fixer

Block Strings in PHP Are Everywhere

Block strings — nowdoc and heredoc syntax — have survived every era of PHP. Whether it's multiline SQL queries, JSON payloads in tests, XML fixtures, or generated templates, they remain part of everyday PHP development.

And they’re not going away anytime soon.

$records = $connection->fetchAllAssociative(
	<<<SQL
		SELECT id, name
		FROM users
		WHERE active AND username = ?
		SQL,
	[$username]
);

Or in tests:

$this->assertJsonStringEqualsJsonString(
	<<<'JSON'
		{
			"name": "John",
			"roles": ["admin", "editor"]
		}
		JSON,
	$client->get('/api/users/123')->getBody()
);

Despite their widespread use, formatting and code-style tools tend to ignore block strings entirely.

There are generally two reasons for this:

  1. The content is often considered “opaque” and potentially sensitive to changes — especially in tests. IDEs like PhpStorm deliberately avoid auto-formatting them unless edited explicitly as standalone content.
  2. PHP tooling, such as PHP CS Fixer, typically has no understanding of embedded languages. Integrating an SQL formatter into such a tool would add substantial complexity without creating equal value for all users.

The result is familiar to many teams:

Projects that heavily rely on block strings eventually accumulate formatting debt that conventional tooling simply ignores.

Enter uuf6429/php-cs-fixer-blockstring

That’s why I created uuf6429/php-cs-fixer-blockstring - a custom extension for PHP CS Fixer focused specifically on formatting block strings.

The core idea is surprisingly simple:

  1. Detect block strings using PHP CS Fixer tokens
  2. Extract the embedded content
  3. Determine which formatter should handle it
  4. Reinsert the formatted result back into the PHP source

Conceptually:

flowchart LR
	A[PHP Source]
	B[Extract Block String]
	C["External formatter<br/>(JSON / SQL / XML / ...)"]
	D[Reinsert into PHP source]
	A --> B --> C --> D

Rather than reinventing formatters for every language, the extension delegates formatting to external tools, which keeps it flexible and language-agnostic.
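To make the detect, extract, format and reinsert steps concrete, here is a minimal sketch built on PHP's native token_get_all() rather than on PHP CS Fixer's own token classes. The function name formatBlockStrings and its parameters are hypothetical and are not the extension's API. The sketch also ignores closing-marker indentation and block strings nested inside interpolation, both of which a real fixer has to handle.

```php
<?php declare(strict_types=1);

/**
 * Applies $format to the body of every heredoc/nowdoc labelled $label.
 * Hypothetical helper for illustration only; not part of the extension.
 */
function formatBlockStrings(string $source, string $label, callable $format): string
{
    $output = '';
    $body = null; // null while we are outside a matching block string

    foreach (token_get_all($source) as $token) {
        // Tokens are either [id, text, line] arrays or single-character strings.
        [$id, $text] = is_array($token) ? $token : [null, $token];

        if ($id === T_START_HEREDOC && trim($text, "<\"' \t\r\n") === $label) {
            // Step 1: an opener such as <<<SQL or <<<'SQL' was detected.
            $output .= $text;
            $body = '';
        } elseif ($id === T_END_HEREDOC && $body !== null) {
            // Steps 3 and 4: format the collected body, then reinsert it.
            $output .= $format($body) . $text;
            $body = null;
        } elseif ($body !== null) {
            // Step 2: collect the raw body, including interpolation tokens.
            $body .= $text;
        } else {
            // Regular PHP code is passed through untouched.
            $output .= $text;
        }
    }

    return $output;
}
```

Any callable that maps a string to a string can stand in for the formatter here, for example 'strtoupper' while experimenting. In the extension itself that role is played by the configured formatter objects.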

Setting it all up is relatively straightforward - here's an example formatting SQL using sqlfluff via Docker:

<?php declare(strict_types=1);

use uuf6429\PhpCsFixerBlockstring\Fixer\BlockStringFixer;
use uuf6429\PhpCsFixerBlockstring\Formatter\DockerPipeFormatter;

return (new PhpCsFixer\Config())
	->setRiskyAllowed(true)
	->registerCustomFixers([new BlockStringFixer()])
	->setRules([
		BlockStringFixer::NAME => [
			'formatters' => [
				'SQL' => new DockerPipeFormatter(
					image: 'sqlfluff/sqlfluff',
					command: ['format', '--dialect', 'ansi', '-'],
				),
			],
		],
	]);

To be fair, all that flexibility makes the setup harder to get right without memorizing a few details. So I created a separate recipes repository with ready-made, copy-pasteable configurations for common formatters: uuf6429/php-cs-fixer-blockstring-recipes

The Unexpected Complexity

The implementation turned out to be significantly more complicated than expected.

Interpolation Breaks Everything

One of the first major problems was string interpolation.

Consider this example:

$query = <<<"SQL"
	SELECT *
	FROM users
	WHERE email = {$email}
	SQL;

From PHP’s perspective, this is perfectly valid.

From the perspective of an SQL formatter, however, {$email} is nonsense.

The formatter sees invalid SQL and either fails outright or mangles the output.

The solution is to introduce a codec layer: a reversible transformation process that temporarily replaces interpolation expressions with language-safe placeholders before formatting.

Here's what that looks like:

  1. Original:
     Select * from users where   email={$email}

  2. Placeholders encoded:
     Select * from users where   email='__PHP_PLACEHOLDER_1__'

  3. Formatted:
     SELECT * FROM users WHERE email = '__PHP_PLACEHOLDER_1__'

  4. Placeholders decoded:
     SELECT * FROM users WHERE email = {$email}

The exact placeholder strategy depends on the embedded language, but the extension provides sensible defaults for common use cases.
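The round trip above can be sketched as a small reversible codec. This is only an illustration: the class name PlaceholderCodec, its methods and the regular expression are assumptions, and it only recognises simple {$...} expressions without nested braces. The extension's actual codecs may work differently.

```php
<?php declare(strict_types=1);

/**
 * Minimal placeholder codec for {$var}-style interpolation.
 * Hypothetical illustration; not the extension's real codec.
 */
final class PlaceholderCodec
{
    /** @var array<string, string> placeholder => original PHP expression */
    private array $map = [];

    /** Replaces each {$...} expression with a quoted, SQL-safe placeholder. */
    public function encode(string $content): string
    {
        return preg_replace_callback(
            '/\{\$[^{}]+\}/',
            function (array $match): string {
                $placeholder = sprintf("'__PHP_PLACEHOLDER_%d__'", count($this->map) + 1);
                $this->map[$placeholder] = $match[0];

                return $placeholder;
            },
            $content
        ) ?? $content; // fall back to the input if the regex engine fails
    }

    /** Restores the original expressions after formatting. */
    public function decode(string $content): string
    {
        return strtr($content, $this->map);
    }
}
```

The placeholder is a quoted string literal so that an SQL formatter treats it as an ordinary value. Other embedded languages would need a different placeholder shape, which is why the strategy has to be configurable per language.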

The Rabbit Hole of Line Endings

The next issue came as a complete surprise: line endings.

You’d think that by 2026 this problem would mostly be solved.

Unfortunately, no.

Even on UNIX-based systems, formatters may inconsistently preserve or normalize trailing newlines. Add Windows, Docker and WSL into the mix, and the inconsistency turns into a soup.

Depending on the exact situation, a formatter might add an extra trailing newline, switch line endings entirely, preserve inconsistent input, and so on.

The most reliable solution ended up being to let the end user decide the exact behaviour by configuring an end-of-line normalizer.
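As a rough idea of what such a normalizer does, here is a minimal sketch. The function name normalizeLineEndings and the policy it applies (a single line-ending style plus exactly one trailing newline) are assumptions chosen for illustration. The normalizers shipped with the extension may offer different options.

```php
<?php declare(strict_types=1);

/**
 * Converts every line ending to $eol and enforces exactly one trailing newline.
 * Hypothetical helper; the policy shown is one possible choice among several.
 */
function normalizeLineEndings(string $content, string $eol = "\n"): string
{
    // Collapse CRLF and lone CR to LF first so mixed input is handled uniformly.
    $content = str_replace(["\r\n", "\r"], "\n", $content);

    // Drop whatever trailing newlines the formatter produced, then add one back.
    $content = rtrim($content, "\n") . "\n";

    // Finally switch to the line ending the project actually wants.
    return str_replace("\n", $eol, $content);
}
```

Running formatter output through a step like this makes the result independent of whether the formatter ran natively, in Docker or under WSL.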

In Conclusion

Block strings have existed in PHP for decades, yet most formatting pipelines still treat them as untouchable blobs of text - when in fact, block strings are code too.

This project attempts to close that gap by allowing embedded code to participate in the same automated formatting workflow as the surrounding PHP source.

There’s still plenty of room to improve the ecosystem around embedded-language tooling in PHP, but even small improvements can eliminate a surprising amount of formatting friction in large codebases.
