PHP Regular Expressions
PHP uses PCRE (Perl-compatible regular expressions). You can try out your expressions in a live environment at regex101 (ensure that PCRE2 is checked).
PHP RE Syntax
We can use the following syntax to define regular expressions in PHP:
'/a*bc/'
The expression is enclosed in two `/` delimiters.
Quantifiers
Quantifiers specify how many times the preceding element must occur:
Quantifier | Description |
---|---|
`*` | Match 0 or more times. |
`+` | Match 1 or more times. |
`?` | Match 1 or 0 times. |
`{n}` | Match exactly $n$ times. |
`{n,}` | Match at least $n$ times. |
`{n,m}` | Match between $n$ and $m$ times. |
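As a quick check of the table above, the following sketch runs each quantifier against a short sample string with `preg_match()`. The sample patterns and strings are made up for illustration, and the `^` and `$` anchors are an addition so that the whole string has to match.

```php
<?php
// ^ and $ anchor each pattern to the whole string (an assumption of this demo).
$results = [
    preg_match('/^ab*c$/', 'ac'),         // 1: * accepts zero b's
    preg_match('/^ab+c$/', 'ac'),         // 0: + needs at least one b
    preg_match('/^ab?c$/', 'abc'),        // 1: ? accepts a single b
    preg_match('/^ab{2}c$/', 'abbc'),     // 1: exactly two b's
    preg_match('/^ab{2,}c$/', 'abc'),     // 0: fewer than two b's
    preg_match('/^ab{1,3}c$/', 'abbbbc'), // 0: four b's exceed the maximum of three
];

echo implode(' ', $results), "\n"; // 1 0 1 1 0 0
```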
These quantifiers are greedy:
- Match the longest leftmost sequence of characters possible.
We can match lazily by following a quantifier with a `?` character:
'=/\*.*?\*/='
Any non-alphanumeric, non-whitespace, non-backslash character can be a regex delimiter. `=` is used here so that the `/` characters in the pattern do not need escaping.
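Applied to the following line, the lazy expression matches only `/* one */` (and, on a second search, `/* three */`):
x = /* one */ y*z; /* three */
By contrast, the greedy expression `=/\*.*\*/=` matches everything from the first `/*` to the last `*/`, that is, `/* one */ y*z; /* three */`.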
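The difference between the two modes can be checked with a minimal sketch like the one below. It reuses the sample line from the text; the variable names are only illustrative.

```php
<?php
// Sample input taken from the text.
$src = 'x = /* one */ y*z; /* three */';

// Lazy: .*? stops at the first closing */
preg_match('=/\*.*?\*/=', $src, $lazy);

// Greedy: .* runs on to the last closing */
preg_match('=/\*.*\*/=', $src, $greedy);

echo $lazy[0], "\n";   // /* one */
echo $greedy[0], "\n"; // /* one */ y*z; /* three */
```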
Capture Groups & Back-references
We can capture groups and refer to them later with back-references, like so:
"=<(?<c1>\w+)>.*?</\g{c1}>="
This captures a group matching `\w+`, names it `c1`, and refers back to it later with `\g{c1}`.
This would match the whole string:
"<li><b>item</b></li>"
Available syntax includes:
Expression | Description |
---|---|
`(regex)` | Creates a capturing sub-pattern, automatically numbered starting from 1. |
`(?<name>regex)` | Creates a named capturing sub-pattern. |
`(?:regex)` | Creates a non-capturing sub-pattern. |
`\N`, `\gN`, `\g{N}` | Back-reference to the capturing sub-pattern numbered $N$, where $N$ is a natural number. |
`\g{name}` | Back-reference to a named capture group. |
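The sketch below exercises each form from the table with `preg_match()`. The tag pattern has the same shape as the one discussed earlier, with a lazy `.*?` between the opening and closing tags; the other patterns and sample strings are hypothetical.

```php
<?php
// Named group plus named back-reference: the closing tag must repeat the opening tag.
preg_match('=<(?<c1>\w+)>.*?</\g{c1}>=', '<li><b>item</b></li>', $named);
echo $named[0], "\n";    // <li><b>item</b></li>
echo $named['c1'], "\n"; // li

// Numbered group plus numbered back-reference: the same word twice in a row.
preg_match('/(\w+) \1/', 'it is is here', $numbered);
echo $numbered[0], "\n"; // is is

// Non-capturing group: the alternation is grouped but receives no number,
// so (\d+) is group 1.
preg_match('/(?:cat|dog)-(\d+)/', 'dog-42', $plain);
echo $plain[1], "\n";    // 42
```

Named groups are also stored under their numeric index, so `$named[1]` holds the same value as `$named['c1']`.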
`(?:regex)` is used to enclose alternations without making another capture group:
"/(?:regex1|regex2)/"
Modifiers
We usually enclose a regular expression in a pair of `/` delimiters. Modifiers placed after the closing delimiter change how the match is performed, like so:
"/hello/i"
This performs a case-insensitive match.
Modifier | Description |
---|---|
`i` | Perform a case-insensitive match. |
`s` | Treat the string as a single line: `.` also matches newline characters. |
`m` | Treat the string as a set of multiple lines (multi-line mode): `^` and `$` match at the start and end of each line. |
You can combine these modifiers one after the other.
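To see what each modifier changes, here is a small sketch that runs the same two-line string through patterns with and without modifiers. The sample string and variable names are made up for illustration.

```php
<?php
$text = "Hello\nworld";

$i  = preg_match('/hello/i', $text);       // 1: letter case is ignored
$s0 = preg_match('/Hello.world/', $text);  // 0: . does not match "\n" by default
$s1 = preg_match('/Hello.world/s', $text); // 1: with s, . also matches "\n"
$m0 = preg_match('/^world$/', $text);      // 0: ^ only matches at the start of the string
$m1 = preg_match('/^world$/m', $text);     // 1: with m, ^ and $ apply to each line
$im = preg_match('/^WORLD$/im', $text);    // 1: i and m combined

echo "$i $s0 $s1 $m0 $m1 $im\n"; // 1 0 1 0 1 1
```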
PHP Regex Functions
preg_match()
preg_match(rx, str [,&$matches [,flags [,offset]]])
Attempts to match the regular expression `rx` against the string `str`, starting at `offset`:
- Returns `1` if there is a match, `0` if not, and `false` if there is an error.
- `$matches` is an array containing all the matching groups, where group/key 0 is the full matching string.
- `flags` modify the behaviour of the function.
It can be useful to run this in the condition of an `if` statement so that your output will only run if the pattern matches.
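A minimal sketch of that `if` idiom, assuming a hypothetical input string and a date pattern chosen only for illustration:

```php
<?php
// Hypothetical input: pull the parts out of an ISO-style date.
$str = 'Released on 2024-03-15.';

if (preg_match('/(\d{4})-(\d{2})-(\d{2})/', $str, $matches)) {
    // $matches[0] is the full match; 1 to 3 are the capture groups.
    echo "Year: {$matches[1]}, month: {$matches[2]}, day: {$matches[3]}\n";
    // Year: 2024, month: 03, day: 15
} else {
    echo "No date found\n";
}
```

Because both `0` and `false` are falsy, the `else` branch covers a failed match as well as an error.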
preg_match_all()
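preg_match_all(rx, str [,&$matches [,flags [,offs]]])
Retrieves all matches of the regular expression `rx` against the string `str`, starting at `offs`:
- Returns the number of full matches (which may be 0), or `false` in the case of an error.
- `$matches` is a multi-dimensional array containing all matches, indexed from 0. With the `PREG_SET_ORDER` flag, each entry has the same format as the `$matches` array of `preg_match()`; by default (`PREG_PATTERN_ORDER`), `$matches[0]` holds all full matches and `$matches[N]` holds every capture of group $N$.
- `flags` modify the behaviour of the function.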
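The shape of `$matches` is easiest to see on a small input. The sketch below uses a hypothetical key=value string and compares the default ordering with `PREG_SET_ORDER`.

```php
<?php
$str = 'a=1, b=22, c=333';

// Default ordering (PREG_PATTERN_ORDER): one sub-array per group.
$count = preg_match_all('/(\w)=(\d+)/', $str, $byGroup);
echo $count, "\n";                      // 3
echo implode(',', $byGroup[0]), "\n";   // a=1,b=22,c=333
echo implode(',', $byGroup[2]), "\n";   // 1,22,333

// PREG_SET_ORDER: one sub-array per match, laid out like preg_match()'s $matches.
preg_match_all('/(\w)=(\d+)/', $str, $bySet, PREG_SET_ORDER);
echo implode(',', $bySet[1]), "\n";     // b=22,b,22
```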
preg_replace()
preg_replace(rx, rpl, str [,lmt [, &$num]])
Returns the result of replacing matches of `rx` in `str` by `rpl`:
- `lmt` specifies the maximum number of replacements.
- On completion, `$num` contains the number of replacements performed.
- You can use groups captured from `str` in `rpl`, for example as `$1` or `\1`.
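As an illustration, the following sketch swaps "Last, First" pairs in a made-up string, once without a limit and once with a limit of 1. Passing `-1` as the limit means "no limit".

```php
<?php
// Hypothetical input: swap "Last, First" into "First Last".
$str = 'Doe, John; Roe, Jane';

// $2 and $1 in the replacement refer to the captured groups.
$all = preg_replace('/(\w+), (\w+)/', '$2 $1', $str, -1, $num);
echo $all, "\n"; // John Doe; Jane Roe
echo $num, "\n"; // 2

// With a limit of 1, only the first match is replaced.
$first = preg_replace('/(\w+), (\w+)/', '$2 $1', $str, 1);
echo $first, "\n"; // John Doe; Roe, Jane
```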
preg_replace_callback()
You can also use `preg_replace_callback()`, which replaces each match with the return value of a callback function:
$old = "105 degrees Fahrenheit is quite warm";
$new = preg_replace_callback(
'/(\d+) degrees Fahrenheit/',
function ($match) {
return round(($match[1] - 32) * 5 / 9) . " degrees Celsius";
},
$old
);
echo $new;
41 degrees Celsius is quite warm
preg_split()
preg_split(rx, str [,lmt [,flags]])
Splits `str` by the regular expression `rx` and returns the result as an array:
- `lmt` specifies the maximum number of split components.
- `flags` modify the behaviour of the function.
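A short sketch of the three arguments in use. The input strings are hypothetical, and the character class `[\s,;]` (whitespace, comma, or semicolon) is an addition not covered in the text above.

```php
<?php
$str = 'apple, banana;cherry  date';

// Split on any run of whitespace, commas, or semicolons.
$parts = preg_split('/[\s,;]+/', $str);
echo implode('|', $parts), "\n";   // apple|banana|cherry|date

// With a limit of 2, the remainder of the string stays in the last component.
$limited = preg_split('/[\s,;]+/', $str, 2);
echo implode('|', $limited), "\n"; // apple|banana;cherry  date

// The PREG_SPLIT_NO_EMPTY flag drops empty components; -1 means no limit.
$noEmpty = preg_split('/,/', 'a,,b', -1, PREG_SPLIT_NO_EMPTY);
echo implode('|', $noEmpty), "\n"; // a|b
```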