Improve CallbackStreamFilter implementation
nyamsprod committed Jan 6, 2025
1 parent f242a29 commit 8fb628e
Showing 16 changed files with 646 additions and 154 deletions.
6 changes: 5 additions & 1 deletion CHANGELOG.md
@@ -8,10 +8,14 @@ All Notable changes to `Csv` will be documented in this file

- Adding the `TabularDataReader::map` method.
- Adding `CallbackStreamFilter` class
- `AbstractCsv::appendStreamFilterOnRead`
- `AbstractCsv::appendStreamFilterOnWrite`
- `AbstractCsv::prependStreamFilterOnRead`
- `AbstractCsv::prependStreamFilterOnWrite`

### Deprecated

- None
- `AbstractCsv::addStreamFilter`; use `AbstractCsv::appendStreamFilterOnRead` or `AbstractCsv::appendStreamFilterOnWrite` instead.

### Fixed

50 changes: 0 additions & 50 deletions docs/9.0/connections/callback-stream-filter.md

This file was deleted.

223 changes: 214 additions & 9 deletions docs/9.0/connections/filters.md
@@ -79,16 +79,33 @@ Here's a table to quickly determine if PHP stream filters works depending on how

```php
public AbstractCsv::addStreamFilter(string $filtername, mixed $params = null): self
public AbstractCsv::appendStreamFilterOnRead(string $filtername, mixed $params = null): self
public AbstractCsv::prependStreamFilterOnRead(string $filtername, mixed $params = null): self
public AbstractCsv::appendStreamFilterOnWrite(string $filtername, mixed $params = null): self
public AbstractCsv::prependStreamFilterOnWrite(string $filtername, mixed $params = null): self
public AbstractCsv::hasStreamFilter(string $filtername): bool
```

The `AbstractCsv::addStreamFilter` method adds a stream filter to the connection.

- The `$filtername` parameter is a string that represents the filter as registered using php `stream_filter_register` function or one of [PHP internal stream filter](http://php.net/manual/en/filters.php).
<div class="message-notice">
<ul>
<li><code>addStreamFilter</code> is deprecated since version <code>9.21.0</code></li>
<li><code>appendStreamFilterOnRead</code> is available since <code>9.21.0</code></li>
<li><code>prependStreamFilterOnRead</code> is available since <code>9.21.0</code></li>
<li><code>appendStreamFilterOnWrite</code> is available since <code>9.21.0</code></li>
<li><code>prependStreamFilterOnWrite</code> is available since <code>9.21.0</code></li>
</ul>
</div>

- The `$filtername` parameter is a string that represents the filter as registered using php `stream_filter_register` function or one of [PHP internal stream filter](http://php.net/manual/en/filters.php).
- The `$params` parameter is an optional value passed to the filter when it is attached.

<p class="message-warning">Each time your call <code>addStreamFilter</code> with the same argument the corresponding filter is registered again.</p>
The `appendStreamFilterOn*` methods add the stream filter at the bottom of the stream filter queue whereas
`prependStreamFilterOn*` methods add the stream filter on top of the queue. Both methods share the same
arguments and the same return type.
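
The effect of the position in the queue can be observed with PHP's native `stream_filter_append` and `stream_filter_prepend` functions. The snippet below is only a sketch using plain PHP streams rather than the `AbstractCsv` methods; the `readThrough` helper is a hypothetical name introduced for illustration. On read, data flows through the queue from top to bottom, so the filter sitting at the bottom has the last word.

```php
// Hypothetical helper: writes "Hello" to an in-memory stream,
// lets the caller attach filters, then reads everything back.
function readThrough(callable $attach): string
{
    $stream = fopen('php://memory', 'w+');
    fwrite($stream, 'Hello');
    rewind($stream);
    $attach($stream);
    $content = stream_get_contents($stream);
    fclose($stream);

    return $content;
}

// Queue is [toupper, tolower]: tolower runs last.
$appended = readThrough(function ($stream): void {
    stream_filter_append($stream, 'string.toupper', STREAM_FILTER_READ);
    stream_filter_append($stream, 'string.tolower', STREAM_FILTER_READ);
});

// Queue is [tolower, toupper]: toupper runs last.
$prepended = readThrough(function ($stream): void {
    stream_filter_append($stream, 'string.toupper', STREAM_FILTER_READ);
    stream_filter_prepend($stream, 'string.tolower', STREAM_FILTER_READ);
});

var_dump($appended);  // string(5) "hello"
var_dump($prepended); // string(5) "HELLO"
```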

<p class="message-warning">Each time you call one of these methods with the same argument, the corresponding filter is attached again.</p>

The `AbstractCsv::hasStreamFilter` method tells whether a specific stream filter is already attached to the connection.

@@ -101,8 +118,8 @@ stream_filter_register('convert.utf8decode', Transcode::class);

$reader = Reader::createFromPath('/path/to/my/chinese.csv', 'r');
if ($reader->supportsStreamFilterOnRead()) {
    $reader->addStreamFilter('convert.utf8decode');
    $reader->addStreamFilter('string.toupper');
    $reader->appendStreamFilterOnRead('convert.utf8decode');
    $reader->appendStreamFilterOnRead('string.toupper');
}

$reader->hasStreamFilter('string.toupper'); //returns true
@@ -116,11 +133,11 @@ foreach ($reader as $row) {

## Stream filters removal

Stream filters attached **with** `addStreamFilter` are:
Stream filters attached **with** `addStreamFilter`, `appendStreamFilterOn*`, `prependStreamFilterOn*` are:

- removed on the CSV object destruction.

Conversely, stream filters added **without** `addStreamFilter` are:
Conversely, stream filters added **without** these methods are:

- not detected by the library.
- not removed on object destruction.
@@ -133,8 +150,8 @@ stream_filter_register('convert.utf8decode', Transcode::class);
$fp = fopen('/path/to/my/chines.csv', 'r');
stream_filter_append($fp, 'string.rot13'); //stream filter attached outside of League\Csv
$reader = Reader::createFromStream($fp);
$reader->addStreamFilter('convert.utf8decode');
$reader->addStreamFilter('string.toupper');
$reader->prependStreamFilterOnRead('convert.utf8decode');
$reader->prependStreamFilterOnRead('string.toupper');
$reader->hasStreamFilter('string.rot13'); //returns false
$reader = null;
// 'string.rot13' is still attached to `$fp`
@@ -148,4 +165,192 @@ The library comes bundled with the following stream filters:
- [RFC4180Field](/9.0/interoperability/rfc4180-field/) stream filter to read or write RFC4180 compliant CSV field;
- [CharsetConverter](/9.0/converter/charset/) stream filter to convert your CSV document content using the `mbstring` extension;
- [SkipBOMSequence](/9.0/connections/bom/) stream filter to skip your CSV document BOM sequence if present;
- [CallbackStreamFilter](/9.0/connections/callback-strean-filter/) apply a callback via a stream filter.

## Custom Stream Filter

<p class="message-info">Available since version <code>9.21.0</code></p>

Sometimes you may encounter a scenario where you need to create a specific stream filter
to resolve your issue. Instead of having to put up with the hassle of creating a
fully fledged stream filter, we are introducing the `CallbackStreamFilter`. This filter
is a PHP stream filter which enables applying a callable onto your stream before or after it
has been actively consumed by the CSV package.
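
To give an idea of the boilerplate such a filter hides, here is a minimal sketch of a callback-driven stream filter built directly on PHP's `php_user_filter` class. It is not the library's implementation: the `CallbackSketchFilter` class and the `sketch.callback` filter name are hypothetical, and the sketch assumes the callable is handed over through the `$params` argument of `stream_filter_append`.

```php
// Hypothetical class and filter names, used for illustration only;
// this is not the library's CallbackStreamFilter implementation.
final class CallbackSketchFilter extends php_user_filter
{
    public function filter($in, $out, &$consumed, bool $closing): int
    {
        // Assumption: the callable was given as the $params argument
        // of stream_filter_append()/stream_filter_prepend().
        $callback = $this->params;

        // Apply the callback to every chunk (bucket) going through the filter.
        while (null !== ($bucket = stream_bucket_make_writeable($in))) {
            $consumed += $bucket->datalen;
            $bucket->data = $callback($bucket->data);
            stream_bucket_append($out, $bucket);
        }

        return PSFS_PASS_ON;
    }
}

stream_filter_register('sketch.callback', CallbackSketchFilter::class);

$stream = fopen('php://memory', 'w+');
fwrite($stream, "title1,title2\rcontent11,content12\r");
rewind($stream);

$filter = stream_filter_append(
    $stream,
    'sketch.callback',
    STREAM_FILTER_READ,
    fn (string $chunk): string => str_replace("\r", "\n", $chunk)
);

$filtered = stream_get_contents($stream);
stream_filter_remove($filter);
fclose($stream);

var_dump($filtered === "title1,title2\ncontent11,content12\n"); // bool(true)
```

Note that the callback receives the stream chunk by chunk, so it should not assume that a chunk holds a complete record.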

### Registering the callbacks

To work, the feature requires a callback and a unique filter name associated with it.

```php
use League\Csv\CallbackStreamFilter;

CallbackStreamFilter::register('myapp.to.upper', strtoupper(...));
```

<p class="message-warning"><code>CallbackStreamFilter::register</code> registers your callback
globally, so you only need to register it once, preferably in your container definition if you
are using a framework.</p>

The callback signature is the following:

```php
callable(string $bucket [, mixed $params]): string
```

- the `$bucket` parameter represents the chunk of the stream you will be operating on.
- the `$params` represents an additional, **optional**, parameter you may pass onto the callback when it is being attached.

Once registered, you can use the filter via its `$filtername`. You can register your callback multiple times,
but each registration must use a unique name; otherwise an exception is thrown.

You can always check for the existence of your registered filter by calling the `CallbackStreamFilter::isRegistered` method.
The method will only return `true` for filters registered via the class; otherwise `false` is returned.

```php
CallbackStreamFilter::isRegistered('myapp.to.upper');
//returns true - exists; was registered in the previous example
CallbackStreamFilter::isRegistered('myapp.to.lower');
//returns false - does not exist; is not registered by CallbackStreamFilter
CallbackStreamFilter::isRegistered('string.tolower');
//returns false - exists, but is registered by PHP itself, not by CallbackStreamFilter
```

Last but not least, you can always list all the registered filter names by calling the `CallbackStreamFilter::registeredFilterNames` method:

```php
CallbackStreamFilter::registeredFilterNames(); // returns a list
```

<p class="message-info">To avoid conflict with already registered stream filters a best
practice is to namespace your own filters by using a unique prefix. Instead of
naming it <code>string.to.lower</code> you should name it <code><strong>myapp.</strong>string.to.lower</code>
where <code>myapp</code> is specific for your own codebase.</p>
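
As a complement, PHP's native `stream_get_filters` function lists every filter name known to the running process, which makes it possible to check a candidate name before registering it. The sketch below is independent of the library; the `isFilterNameTaken` helper is a hypothetical name, and it assumes that wildcard entries such as `convert.*` reserve the whole prefix.

```php
// Hypothetical helper: tells whether a filter name is already known to PHP,
// either exactly or through a wildcard entry such as "convert.*".
function isFilterNameTaken(string $filtername): bool
{
    foreach (stream_get_filters() as $registered) {
        if ($registered === $filtername) {
            return true;
        }

        // "convert.*" covers every name starting with "convert."
        if (str_ends_with($registered, '.*')
            && str_starts_with($filtername, substr($registered, 0, -1))
        ) {
            return true;
        }
    }

    return false;
}

var_dump(isFilterNameTaken('string.tolower'));        // bool(true), built into PHP
var_dump(isFilterNameTaken('myapp.string.to.lower')); // bool(false) unless your code registered it
```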

### Applying the callback

Once registered you can use one of the following methods to attach your filter to your instance.

- `CallbackStreamFilter::appendOnReadTo`
- `CallbackStreamFilter::appendOnWriteTo`
- `CallbackStreamFilter::prependOnReadTo`
- `CallbackStreamFilter::prependOnWriteTo`

Those static public methods will all add the filter to the stream filter queue attached to the structure
(League/CSV objects or PHP stream resource). They all share the same signature and only differ in:

- where in the queue the filter is added (at the top or at the bottom of the stream filter queue);
- which mode (read or write) will be used;
- their return value may be a `Reader` or a `Writer` instance or a reference to the attached stream filter.

To illustrate their usage let's check the two examples below, one with the `Reader` class and another one
with PHP stream resources.

### Usage with CSV objects

Let's imagine we have a CSV document using the carriage return character (`\r`) as the end of line character.
This type of document is parsable by the package but only if you enable the deprecated `auto_detect_line_endings` ini setting.

If you no longer want to rely on that feature, which has been deprecated since PHP 8.1 and will be
removed once PHP 9.0 is released, you can use the `CallbackStreamFilter` instead
to replace the offending character with a supported one.

```php
use League\Csv\CallbackStreamFilter;
use League\Csv\Reader;

$csv = "title1,title2,title3\r"
    . "content11,content12,content13\r"
    . "content21,content22,content23\r";

$document = Reader::createFromString($csv);
$document->setHeaderOffset(0);

CallbackStreamFilter::register(
    'myapp.replace.eol',
    fn (string $bucket): string => str_replace("\r", "\n", $bucket)
);
CallbackStreamFilter::appendOnReadTo($document, 'myapp.replace.eol');

return $document->first();
// returns [
// 'title1' => 'content11',
// 'title2' => 'content12',
// 'title3' => 'content13',
// ]
```

The `appendOnReadTo` method will check for the availability of the filter via its
name `myapp.replace.eol`. If it is not present a `LogicException` will be
thrown, otherwise it will attach the filter to the CSV document object at the
bottom of the stream filter queue using the reading mode.

<p class="message-warning">On read, the CSV document content is <strong>never changed or replaced</strong>.
However, on write, the changes <strong>are persisted</strong> into the created document.</p>
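
The same read/write asymmetry can be reproduced with plain PHP streams and the built-in `string.toupper` filter. The snippet below is only a sketch that does not use the library; the temporary file and the variable names are chosen for illustration.

```php
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "foo,bar\n");

// On read: the filter changes what the consumer sees, not the file itself.
$reader = fopen($path, 'r');
stream_filter_append($reader, 'string.toupper', STREAM_FILTER_READ);
$seenOnRead = stream_get_contents($reader);
fclose($reader);
$storedAfterRead = file_get_contents($path);

// On write: the filtered content is what ends up in the file.
$writer = fopen($path, 'w');
stream_filter_append($writer, 'string.toupper', STREAM_FILTER_WRITE);
fwrite($writer, "foo,bar\n");
fclose($writer);
$storedAfterWrite = file_get_contents($path);

unlink($path);

var_dump($seenOnRead);       // string(8) "FOO,BAR\n"
var_dump($storedAfterRead);  // string(8) "foo,bar\n"
var_dump($storedAfterWrite); // string(8) "FOO,BAR\n"
```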

### Usage with streams

<p class="message-notice">In the following example we will use the optional <code>$params</code> parameter
to add a specific behaviour to our callback</p>

```php
use League\Csv\CallbackStreamFilter;

$csv = <<<CSV
title1,title2,title3
content11,content12,content13
content21,content22,content23
CSV;

$stream = tmpfile();
fwrite($stream, $csv);

// We first check to see if the callback is not already registered
// without the check a LogicException would be thrown on
// usage or on callback registration
if (!CallbackStreamFilter::isRegistered('myapp.replace.string')) {
    CallbackStreamFilter::register(
        'myapp.replace.string',
        function (string $bucket, array $params): string {
            return str_replace(
                $params['search'],
                $params['replace'],
                $bucket
            );
        }
    );
}

$filterReference = CallbackStreamFilter::appendOnReadTo(
    $stream,
    'myapp.replace.string',
    [
        'search' => ['content', '1', '2', '3'],
        'replace' => ['contenu ', 'A', 'B', 'C'],
    ],
);

rewind($stream);
$data = [];
while (($record = fgetcsv($stream, 1000, ',')) !== false) {
    $data[] = $record;
}
var_dump($data[1]);
//returns ['contenu AA', 'contenu AB', 'contenu AC']

stream_filter_remove($filterReference); //we remove the stream filter

rewind($stream);
$altData = [];
while (($record = fgetcsv($stream, 1000, ',')) !== false) {
    $altData[] = $record;
}
var_dump($altData[1]);
//returns ['content11', 'content12', 'content13']

fclose($stream);
```

When one of the attaching methods is used with a resource, it returns a reference to the attached
stream filter that you can use later on to remove it. When it is used with
the `Reader` or the `Writer` class, it returns the CSV class instance because
both classes manage the filter lifecycle themselves and automatically remove the filters when
the object is destroyed.
2 changes: 1 addition & 1 deletion docs/9.0/interoperability/encoding.md
@@ -21,7 +21,7 @@ $reader = Reader::createFromPath('/path/to/my/file.csv', 'r');
//let's set the output BOM
$reader->setOutputBOM(Bom::Utf8);
//let's convert the incoming data from ISO-8859-15 to UTF-8
$reader->addStreamFilter('convert.iconv.ISO-8859-15/UTF-8');
$reader->appendStreamFilterOnRead('convert.iconv.ISO-8859-15/UTF-8');
//BOM detected and adjusted for the output
echo $reader->getContent();
```
1 change: 0 additions & 1 deletion docs/_data/menu.yml
@@ -11,7 +11,6 @@ version:
BOM Sequences: '/9.0/connections/bom/'
Document output: '/9.0/connections/output/'
Stream Filters: '/9.0/connections/filters/'
Callback Stream Filter : '/9.0/connections/callback-stream-filter/'
Inserting Records:
Writer Connection: '/9.0/writer/'
Bundled Helpers: '/9.0/writer/helpers/'
1 change: 1 addition & 0 deletions phpstan.neon
@@ -9,6 +9,7 @@ parameters:
- identifier: missingType.iterableValue
- '#implements deprecated interface League\\Csv\\ByteSequence#'
- '#Attribute class Deprecated does not exist.#'
- '#Parameter \#4 \$params of function stream_filter_(pre|ap)pend expects array, mixed given#'
level: max
paths:
- src