Improve CallbackStreamFilter implementation
nyamsprod committed Jan 4, 2025
1 parent f242a29 commit 52e3bad
Showing 14 changed files with 436 additions and 120 deletions.
4 changes: 3 additions & 1 deletion CHANGELOG.md
Expand Up @@ -8,10 +8,12 @@ All Notable changes to `Csv` will be documented in this file

- Adding the `TabularDataReader::map` method.
- Adding `CallbackStreamFilter` class
- `AbstractCsv::appendStreamFilter`
- `AbstractCsv::prependStreamFilter`

### Deprecated

- None
- `AbstractCsv::addStreamFilter` use `AbstractCsv::appendStreamFilter` instead.

### Fixed

Expand Down
125 changes: 108 additions & 17 deletions docs/9.0/connections/callback-stream-filter.md
Expand Up @@ -5,46 +5,137 @@ title: Dynamic Stream Filter

# Callback Stream Filter

<p class="message-info">Available since version <code>9.22.0</code></p>
<p class="message-info">Available since version <code>9.21.0</code></p>

Sometimes you may encounter a scenario where you need to create a specific stream filter
to resolve a specific issue. Instead of having to put up with the hassle of creating a
fully fledged stream filter, we are introducing a `CallbackStreamFilter`. This filter
is a PHP stream filter which enables applying a callable onto the stream prior to it
being actively consumed by the CSV process.
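
To make the idea concrete, here is a minimal sketch of how a callable can be wrapped in a PHP stream filter using only the native `php_user_filter` API. It is not the package's implementation: the `CallableFilterSketch` class and the `sketch.callable` filter name are hypothetical, and passing the callable through the filter parameters is an assumption made for illustration.

```php
// Hypothetical filter class: applies the callable received as filter
// parameter to every chunk (bucket) of data going through the stream.
final class CallableFilterSketch extends php_user_filter
{
    public function filter($in, $out, &$consumed, bool $closing): int
    {
        $callback = $this->params;
        while (null !== ($bucket = stream_bucket_make_writeable($in))) {
            $consumed += $bucket->datalen;
            $bucket->data = $callback($bucket->data);
            stream_bucket_append($out, $bucket);
        }

        return PSFS_PASS_ON;
    }
}

stream_filter_register('sketch.callable', CallableFilterSketch::class);

$stream = fopen('php://memory', 'w+');
fwrite($stream, "a,b\rc,d\r");
rewind($stream);

// The callable is attached on read only: the stored bytes are left untouched.
stream_filter_append(
    $stream,
    'sketch.callable',
    STREAM_FILTER_READ,
    fn (string $data): string => str_replace("\r", "\n", $data)
);

$content = stream_get_contents($stream);
fclose($stream);
```

With this sketch every `\r` is swapped for `\n` while the data is being read, which is the kind of on-the-fly transformation the `CallbackStreamFilter` is meant to make easier.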

## Usage with CSV objects
## Registering the callbacks

To work, the feature requires a callback registered under a unique filter name.

```php
use League\Csv\CallbackStreamFilter;

CallbackStreamFilter::register('string.to.upper', strtoupper(...));
```

Out of the box, the filter can not work, it requires a unique name and a callback to be usable.
Once registered you can re-use the filter with CSV documents or with a resource.

let's imagine we have a CSV document with the return carrier character as the end of line character.
This type of document is parsable by the package but only if you enable the deprecated `auto_detect_line_endings`.
<p class="message-warning"><code>CallbackStreamFilter::register</code> registers your callback
globally, so you only need to register it once, preferably in your container definition if you
are using a framework.</p>

You can always check whether a filter is registered by calling the `CallbackStreamFilter::isRegistered` method:

```php
CallbackStreamFilter::isRegistered('string.to.upper'); //returns true
CallbackStreamFilter::isRegistered('string.to.lower'); //returns false
```

Last but not least, you can always list all the registered filter names by calling the
`CallbackStreamFilter::registeredFilterNames` method:

```php
CallbackStreamFilter::registeredFilterNames(); // returns a list
```
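
The same check-before-register pattern exists for PHP's native filter registry, which can be a useful point of comparison. The sketch below relies only on built-in functions; the `sketch.noop` filter name and the `NoopFilterSketch` class are hypothetical and exist purely for illustration.

```php
// Hypothetical filter class, only needed to have something to register.
final class NoopFilterSketch extends php_user_filter
{
}

$name = 'sketch.noop';

// stream_get_filters() lists built-in and user-registered filter names.
if (!in_array($name, stream_get_filters(), true)) {
    stream_filter_register($name, NoopFilterSketch::class);
}

$isRegistered = in_array($name, stream_get_filters(), true);

// Registering an already used name fails: the function returns false.
$secondAttempt = @stream_filter_register($name, NoopFilterSketch::class);
```

Guarding the registration this way keeps the code safe to run more than once in the same process.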

## Usage with CSV objects

Let's imagine we have a CSV document using the carriage return character (`\r`) as the end-of-line character.
This type of document is parsable by the package, but only if you enable the deprecated `auto_detect_line_endings` ini setting.

If you no longer want to rely on that feature since it emits a deprecation warning you can use the new
`CallbackStreamFilter` instead by swaping the offending character with a modern alternative.
If you no longer want to rely on that setting, since it has been deprecated since PHP 8.1 and will be
removed once PHP 9.0 is released, you can use the `CallbackStreamFilter` instead by
swapping the offending character with a supported alternative.

```php
use League\Csv\CallbackStreamFilter;
use League\Csv\Reader;

$csv = "title1,title2,title3\rcontent11,content12,content13\rcontent21,content22,content23\r";
$csv = "title1,title2,title3\r"
. "content11,content12,content13\r"
. "content21,content22,content23\r";

$document = Reader::createFromString($csv);
CallbackStreamFilter::addTo(
$document,
'swap.carrier.return',
$document->setHeaderOffset(0);

CallbackStreamFilter::register(
'swap.carrier.return',
fn (string $bucket): string => str_replace("\r", "\n", $bucket)
);
$document->setHeaderOffset(0);
CallbackStreamFilter::appendTo($document, 'swap.carrier.return');

return $document->first();
// returns ['title1' => 'content11', 'title2' => 'content12', 'title3' => 'content13']
// returns [
// 'title1' => 'content11',
// 'title2' => 'content12',
// 'title3' => 'content13',
// ]
```

The `addTo` method register the filter with the unique `swap.carrier.return` name and then attach
it to the CSV document object on read.
The `appendTo` method checks that a filter is registered under the
name `swap.carrier.return`. If it is not, a `LogicException` is
thrown; otherwise the filter is attached to the CSV document object at the
bottom of the stream filter queue. Since we are using the `Reader` class, the
filter is attached in read mode. If we were to use the `Writer` class,
the filter would be attached in write mode only.

<p class="message-warning">On read, the CSV document content is <strong>never changed or replaced</strong>.
Conversely, the changes <strong>are persisted during writing</strong>.</p>
However, on write, the changes <strong>are persisted</strong> into the created document.</p>

## Usage with streams

Of course the `CallbackStreamFilter` can be used in other scenarios or with PHP stream resources.

With PHP streams you can also use:

- `CallbackStreamFilter::appendTo`
- `CallbackStreamFilter::appendOnReadTo`
- `CallbackStreamFilter::appendOnWriteTo`
- `CallbackStreamFilter::prependTo`
- `CallbackStreamFilter::prependOnReadTo`
- `CallbackStreamFilter::prependOnWriteTo`

to add the stream filter at the bottom or at the top of the stream filter queue,
in read or write mode.

<p class="message-notice">Those methods can also be used with the CSV classes, <strong>but</strong>
the read or write mode will be superseded by the CSV class mode.</p>

```php
use League\Csv\CallbackStreamFilter;

$csv = "title1,title2,title3\r"
. "content11,content12,content13\r"
. "content21,content22,content23\r";
$stream = tmpfile();
fwrite($stream, $csv);

// We first check to see if the callback is not already registered
// without the check a LogicException would be thrown on
// usage or on callback registration
if (!CallbackStreamFilter::isRegistered('swap.carrier.return')) {
CallbackStreamFilter::register(
'swap.carrier.return',
fn (string $bucket): string => str_replace("\r", "\n", $bucket)
);
}
CallbackStreamFilter::appendOnReadTo($stream, 'swap.carrier.return');
$data = [];

rewind($stream);
while (($record = fgetcsv($stream, 1000, ',')) !== false) {
$data[] = $record;
}
fclose($stream);

return $data[0];
//returns ['title1', 'title2', 'title3']
```

Of course the `CallbackStreamFilter` can be use in other different scenario or with PHP stream resources.
<p class="message-warning">If you use <code>appendTo</code> or <code>prependTo</code> on a stream
which can be read from and written to, the filter will be attached in both modes, which
<strong>MAY</strong> lead to unexpected behaviour depending on your callback logic.</p>
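
The effect of attaching a filter in both modes can be reproduced with PHP's native functions alone. The following sketch assumes that the package methods behave like `stream_filter_append` regarding modes, which the diff does not show; it uses the built-in `string.rot13` filter in place of a callback.

```php
// No explicit mode on a read/write stream: the filter is attached to
// both chains, so the data is transformed on write AND again on read.
$both = fopen('php://temp', 'w+');
stream_filter_append($both, 'string.rot13');
fwrite($both, 'hello');
rewind($both);
$roundTrip = stream_get_contents($both); // rot13 applied twice
fclose($both);

// Explicit read mode: the stored bytes are untouched, only the
// data returned to the reader is transformed.
$readOnly = fopen('php://temp', 'w+');
stream_filter_append($readOnly, 'string.rot13', STREAM_FILTER_READ);
fwrite($readOnly, 'hello');
rewind($readOnly);
$filtered = stream_get_contents($readOnly); // rot13 applied once
fclose($readOnly);
```

Because `rot13` is its own inverse, the first stream gives back the original text, which hides the fact that the stored content was altered. A callback that is not reversible would corrupt the data instead, hence the preference for the explicit read or write variants.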
29 changes: 21 additions & 8 deletions docs/9.0/connections/filters.md
Expand Up @@ -79,16 +79,29 @@ Here's a table to quickly determine if PHP stream filters works depending on how

```php
public AbstractCsv::addStreamFilter(string $filtername, mixed $params = null): self
public AbstractCsv::appendStreamFilter(string $filtername, mixed $params = null): self
public AbstractCsv::prependStreamFilter(string $filtername, mixed $params = null): self
public AbstractCsv::hasStreamFilter(string $filtername): bool
```

The `AbstractCsv::addStreamFilter` method adds a stream filter to the connection.

- The `$filtername` parameter is a string that represents the filter as registered using php `stream_filter_register` function or one of [PHP internal stream filter](http://php.net/manual/en/filters.php).
<div class="message-notice">
<ul>
<li><code>addStreamFilter</code> is deprecated since version <code>9.21.0</code></li>
<li><code>appendStreamFilter</code> is available since <code>9.21.0</code> and replaces <code>addStreamFilter</code></li>
<li><code>prependStreamFilter</code> is available since <code>9.21.0</code></li>
</ul>
</div>

- The `$filtername` parameter is a string that represents the filter as registered using php `stream_filter_register` function or one of [PHP internal stream filter](http://php.net/manual/en/filters.php).
- The `$params` parameter holds the optional parameters the filter is attached with.

<p class="message-warning">Each time your call <code>addStreamFilter</code> with the same argument the corresponding filter is registered again.</p>
The `appendStreamFilter` adds the stream filter at the bottom of the stream filter queue whereas
`prependStreamFilter` adds the stream filter on top of the queue. Both methods share the same
arguments and the same return type.

<p class="message-warning">Each time you call <code>appendStreamFilter</code> with the same argument the corresponding filter is registered again.</p>
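
The difference between the bottom and the top of the queue matters as soon as two filters do not commute. The sketch below illustrates this with PHP's native `stream_filter_append` and `stream_filter_prepend` functions, on the assumption that the two CSV methods order filters the same way; `sampleStream` is a hypothetical helper.

```php
// Hypothetical helper: an in-memory stream holding "Hello",
// positioned at the start and ready to be read.
function sampleStream()
{
    $stream = fopen('php://memory', 'w+');
    fwrite($stream, 'Hello');
    rewind($stream);

    return $stream;
}

// Both filters appended: the queue is [toupper, tolower],
// so tolower is applied last.
$appended = sampleStream();
stream_filter_append($appended, 'string.toupper', STREAM_FILTER_READ);
stream_filter_append($appended, 'string.tolower', STREAM_FILTER_READ);
$appendedResult = stream_get_contents($appended);
fclose($appended);

// Second filter prepended: the queue is [tolower, toupper],
// so toupper is applied last.
$prepended = sampleStream();
stream_filter_append($prepended, 'string.toupper', STREAM_FILTER_READ);
stream_filter_prepend($prepended, 'string.tolower', STREAM_FILTER_READ);
$prependedResult = stream_get_contents($prepended);
fclose($prepended);
```

The same two filters therefore produce lowercase text in the first case and uppercase text in the second, purely because of their position in the queue.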

The `AbstractCsv::hasStreamFilter` method tells whether a specific stream filter is already attached to the connection.

Expand All @@ -101,8 +114,8 @@ stream_filter_register('convert.utf8decode', Transcode::class);

$reader = Reader::createFromPath('/path/to/my/chinese.csv', 'r');
if ($reader->supportsStreamFilterOnRead()) {
$reader->addStreamFilter('convert.utf8decode');
$reader->addStreamFilter('string.toupper');
$reader->appendStreamFilter('convert.utf8decode');
$reader->appendStreamFilter('string.toupper');
}

$reader->hasStreamFilter('string.toupper'); //returns true
Expand All @@ -116,7 +129,7 @@ foreach ($reader as $row) {

## Stream filters removal

Stream filters attached **with** `addStreamFilter` are:
Stream filters attached **with** `addStreamFilter`, `appendStreamFilter` or `prependStreamFilter` are:

- removed on the CSV object destruction.

Expand All @@ -133,8 +146,8 @@ stream_filter_register('convert.utf8decode', Transcode::class);
$fp = fopen('/path/to/my/chinese.csv', 'r');
stream_filter_append($fp, 'string.rot13'); //stream filter attached outside of League\Csv
$reader = Reader::createFromStream($fp);
$reader->addStreamFilter('convert.utf8decode');
$reader->addStreamFilter('string.toupper');
$reader->prependStreamFilter('convert.utf8decode');
$reader->prependStreamFilter('string.toupper');
$reader->hasStreamFilter('string.rot13'); //returns false
$reader = null;
// 'string.rot13' is still attached to `$fp`
Expand All @@ -148,4 +161,4 @@ The library comes bundled with the following stream filters:
- [RFC4180Field](/9.0/interoperability/rfc4180-field/) stream filter to read or write RFC4180 compliant CSV field;
- [CharsetConverter](/9.0/converter/charset/) stream filter to convert your CSV document content using the `mbstring` extension;
- [SkipBOMSequence](/9.0/connections/bom/) stream filter to skip your CSV document BOM sequence if present;
- [CallbackStramFilter](/9.0/connections/callback-strean-filter/) apply a callback via a stream filter.
- [CallbackStreamFilter](/9.0/connections/callback-stream-filter/) apply a callback via a stream filter.
2 changes: 1 addition & 1 deletion docs/9.0/interoperability/encoding.md
Expand Up @@ -21,7 +21,7 @@ $reader = Reader::createFromPath('/path/to/my/file.csv', 'r');
//let's set the output BOM
$reader->setOutputBOM(Bom::Utf8);
//let's convert the incoming data from ISO-8859-15 to UTF-8
$reader->addStreamFilter('convert.iconv.ISO-8859-15/UTF-8');
$reader->appendStreamFilter('convert.iconv.ISO-8859-15/UTF-8');
//BOM detected and adjusted for the output
echo $reader->getContent();
```
Expand Down
37 changes: 36 additions & 1 deletion src/AbstractCsv.php
Expand Up @@ -378,7 +378,7 @@ public function setOutputBOM(Bom|string|null $str): static
* @throws InvalidArgument If the stream filter API can not be appended
* @throws UnavailableFeature If the stream filter API can not be used
*/
public function addStreamFilter(string $filtername, null|array $params = null): static
public function appendStreamFilter(string $filtername, ?array $params = null): static
{
$this->document instanceof Stream || throw UnavailableFeature::dueToUnsupportedStreamFilterApi(get_class($this->document));

Expand All @@ -390,6 +390,24 @@ public function addStreamFilter(string $filtername, null|array $params = null):
return $this;
}

/**
* Prepend a stream filter.
*
* @throws InvalidArgument If the stream filter API can not be appended
* @throws UnavailableFeature If the stream filter API can not be used
*/
public function prependStreamFilter(string $filtername, ?array $params = null): static
{
$this->document instanceof Stream || throw UnavailableFeature::dueToUnsupportedStreamFilterApi(get_class($this->document));

$this->document->prependFilter($filtername, static::STREAM_FILTER_MODE, $params);
$this->stream_filters[$filtername] = true;
$this->resetProperties();
$this->input_bom = null;

return $this;
}

/**
* DEPRECATION WARNING! This method will be removed in the next major point release.
*
Expand Down Expand Up @@ -516,4 +534,21 @@ public function output(?string $filename = null): int
throw new InvalidArgument($exception->getMessage());
}
}

/**
* DEPRECATION WARNING! This method will be removed in the next major point release.
* @codeCoverageIgnore
* @deprecated since version 9.22.0
* @see AbstractCsv::appendStreamFilter()
*
* Append a stream filter.
*
* @throws InvalidArgument If the stream filter API can not be appended
* @throws UnavailableFeature If the stream filter API can not be used
*/
#[Deprecated(message:'use League\Csv\AbstractCsv::appendStreamFilter() instead', since:'league/csv:9.18.0')]
public function addStreamFilter(string $filtername, ?array $params = null): static
{
return $this->appendStreamFilter($filtername, $params);
}
}
22 changes: 11 additions & 11 deletions src/AbstractCsvTest.php
Expand Up @@ -324,36 +324,36 @@ public function testEnclosure(): void
$this->csv->setEnclosure('foo');
}

public function testAddStreamFilter(): void
public function testappendStreamFilter(): void
{
$csv = Reader::createFromPath(__DIR__.'/../test_files/foo.csv');
$csv->addStreamFilter('string.rot13');
$csv->addStreamFilter('string.tolower');
$csv->addStreamFilter('string.toupper');
$csv->appendStreamFilter('string.rot13');
$csv->appendStreamFilter('string.tolower');
$csv->appendStreamFilter('string.toupper');
foreach ($csv as $row) {
self::assertSame($row, ['WBUA', 'QBR', '[email protected]']);
}
}

public function testFailedAddStreamFilter(): void
public function testFailedappendStreamFilter(): void
{
$csv = Writer::createFromFileObject(new SplTempFileObject());
self::assertFalse($csv->supportsStreamFilterOnWrite());

$this->expectException(UnavailableFeature::class);

$csv->addStreamFilter('string.toupper');
$csv->appendStreamFilter('string.toupper');
}

public function testFailedAddStreamFilterWithWrongFilter(): void
public function testFailedappendStreamFilterWithWrongFilter(): void
{
$this->expectException(InvalidArgument::class);

/** @var resource $tmpfile */
$tmpfile = tmpfile();

Writer::createFromStream($tmpfile)
->addStreamFilter('foobar.toupper');
->appendStreamFilter('foobar.toupper');
}

public function testStreamFilterDetection(): void
Expand All @@ -363,7 +363,7 @@ public function testStreamFilterDetection(): void

self::assertFalse($csv->hasStreamFilter($filtername));

$csv->addStreamFilter($filtername);
$csv->appendStreamFilter($filtername);

self::assertTrue($csv->hasStreamFilter($filtername));
}
Expand All @@ -372,7 +372,7 @@ public function testClearAttachedStreamFilters(): void
{
$path = __DIR__.'/../test_files/foo.csv';
$csv = Reader::createFromPath($path);
$csv->addStreamFilter('string.toupper');
$csv->appendStreamFilter('string.toupper');

self::assertStringContainsString('JOHN', $csv->toString());

Expand All @@ -384,7 +384,7 @@ public function testClearAttachedStreamFilters(): void
public function testSetStreamFilterOnWriter(): void
{
$csv = Writer::createFromPath(__DIR__.'/../test_files/newline.csv', 'w+');
$csv->addStreamFilter('string.toupper');
$csv->appendStreamFilter('string.toupper');
$csv->insertOne([1, 'two', 3, "new\r\nline"]);

self::assertStringContainsString("1,TWO,3,\"NEW\r\nLINE\"", $csv->toString());
Expand Down