`<regex>`: Correct characters not matched by special character dot #5192
This corrects the set of characters the special character dot `.` does not match in a regular expression as specified in the ECMAScript and POSIX standards, and aligns our treatment of `.` with libstdc++ and libc++.

`.` in a `wregex` in ECMAScript mode. See the definition of `.` semantics in Section 22.2.2.7 of ECMAScript 14, which removes the line terminators from the set of matched characters, and the list of line terminators in Section 12.3. (Note that this links to a newer standard, but the set of unmatched characters has not been changed since ECMAScript 3. Furthermore, the C++ standard does not modify the interpretation of `.`.)

`.` matches all characters except NUL now. This is in accordance with Section 9.3.4 and Section 9.4.4 of the POSIX standard. (I contemplated whether a new line (LF) should not be matched in addition to or instead of NUL in grep or egrep mode, as that is what grep implementations tend to do. The POSIX standard only states that regular expressions cannot match LFs due to the way grep works, but does not explicitly modify the definition of `.` or regular expressions in general, so it is ambiguous on this question. Since libstdc++ and libc++ only exclude NUL from the set of characters matched by `.` in grep and egrep mode, I decided to align the set of unmatched characters with them.)

Note: Whether NUL should be matched in POSIX regular expressions is the subject of LWG-3603.
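
For illustration, here is a minimal sketch of the behavior described above, written against the public `<regex>` interface only. The helper names `dot_matches`, `wdot_matches`, and `check_dot_behavior` are hypothetical and are not part of this PR or its test suite. The assertions encode the results I would expect under the behavior described here (line terminators excluded in ECMAScript mode, only NUL excluded in the POSIX grammars). They would fail on an implementation that has not been aligned in this way.

```cpp
#include <cassert>
#include <initializer_list>
#include <regex>
#include <string>

namespace rc = std::regex_constants;

// Hypothetical helper: does a lone `.` match the one-character string
// consisting of `c` under the given grammar?
bool dot_matches(char c, rc::syntax_option_type grammar) {
    const std::regex re(".", grammar);
    const std::string subject(1, c); // count constructor, so NUL is kept
    return std::regex_match(subject, re);
}

// Hypothetical wide-character counterpart, used for U+2028 and U+2029.
bool wdot_matches(wchar_t c, rc::syntax_option_type grammar) {
    const std::wregex re(L".", grammar);
    const std::wstring subject(1, c);
    return std::regex_match(subject, re);
}

// Assumed expectations, derived from the description above.
void check_dot_behavior() {
    // ECMAScript: the line terminators LF, CR, U+2028, U+2029 are excluded.
    assert(dot_matches('a', rc::ECMAScript));
    assert(!dot_matches('\n', rc::ECMAScript));
    assert(!dot_matches('\r', rc::ECMAScript));
    assert(!wdot_matches(L'\u2028', rc::ECMAScript));
    assert(!wdot_matches(L'\u2029', rc::ECMAScript));

    // POSIX grammars: everything except NUL is matched, including LF.
    for (auto grammar : {rc::basic, rc::extended, rc::grep, rc::egrep}) {
        assert(dot_matches('a', grammar));
        assert(dot_matches('\n', grammar));
        assert(!dot_matches('\0', grammar));
    }
}
```

Calling `check_dot_behavior()` from a test driver exercises both cases. The subjects are built with the `(count, char)` string constructor so that the NUL case is not truncated the way a C string literal would be.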