Note that this example is simplified: it wouldn’t work if the alt attribute didn’t come first, nor would it work if there were extra whitespace. We’ll revisit this example later with these problems addressed.
Just as before, the first group will match either a single or double quote, followed by zero or more characters (note the question mark that makes the match lazy), followed by \1, which will be whatever the first group matched: either a single quote or a double quote.
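To make the role of \1 concrete, here is a minimal sketch. The sample string and the variable names are made up for illustration; only the shape of the regex comes from the discussion above.

```javascript
// Illustrative input: one value in double quotes, one in single quotes.
const sample = `<img alt="A 'quoted' word" src='pic.png'>`;

// Group 1 captures the opening quote; \1 requires the same character to close.
const quoted = /(['"]).*?\1/g;

const found = sample.match(quoted);
console.log(found[0]); // "A 'quoted' word"
console.log(found[1]); // 'pic.png'
```

Notice that the single quotes inside the first value don't end the match, because \1 is a double quote at that point.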
Let’s take a moment to reinforce our understanding of lazy versus greedy matching. Go ahead and remove the question mark after the *, making the match greedy. Run the expression again; what do you see? Do you understand why? This is a very important concept to understand if you want to master regular expressions, so if this is not clear to you, I encourage you to revisit the section on lazy versus greedy matching.
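If you want to check your answer, the following sketch runs both versions side by side. The input string is a made-up example, not the one used earlier in the chapter.

```javascript
// Illustrative input with two quoted values on one line.
const twoAttrs = '<img alt="first" title="second">';

// Lazy: .*? stops at the first closing quote it can.
const lazyMatches = twoAttrs.match(/(['"]).*?\1/g);
console.log(lazyMatches); // [ '"first"', '"second"' ]

// Greedy: .* runs to the end, then backs up to the last matching quote.
const greedyMatches = twoAttrs.match(/(['"]).*\1/g);
console.log(greedyMatches); // [ '"first" title="second"' ]
```

The greedy version swallows everything between the first quote and the last one, which is rarely what we want when parsing attributes.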
Replacing Groups
One of the benefits grouping brings is the ability to make more sophisticated replacements. Continuing with our HTML example, let’s say that we want to strip out everything but the href from an <a> tag:
    let html = '<a class="nope" href="/yep">Yep</a>';
    html = html.replace(/<a .*?(href=".*?").*?>/, '<a $1>');
Just as with backreferences, all groups are assigned a number starting with 1. In the regex itself, we refer to the first group with \1; in the replacement string, we use $1.
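A small sketch can show both notations working together. The task here (collapsing an accidentally doubled word) and all the names are illustrative assumptions, not part of the HTML example.

```javascript
// Illustrative input with two doubled words.
const stutter = 'this is is a test test';

// \1 (inside the regex) matches a repeat of whatever group 1 captured.
const doubled = /\b(\w+) \1\b/g;

// $1 (inside the replacement string) inserts what group 1 captured.
const cleaned = stutter.replace(doubled, '$1');
console.log(cleaned); // this is a test
```

The same group is referenced twice: once with a backslash to constrain the match, and once with a dollar sign to build the result.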
Note the use of lazy quantifiers in this regex to prevent it from spanning multiple <a> tags. This regex will also fail if the href attribute uses single quotes instead of double quotes.
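Both caveats are easy to see by experiment. In this sketch the regex is the one from the example; the greedy variant and the sample strings are assumptions added purely for comparison.

```javascript
// The regex from the example; the sample strings are illustrative.
const hrefOnly = /<a .*?(href=".*?").*?>/;

// Double-quoted href: everything but the href is stripped.
const doubleQuoted = '<a class="nope" href="/yep">Yep</a>';
const strippedDouble = doubleQuoted.replace(hrefOnly, '<a $1>');
console.log(strippedDouble); // <a href="/yep">Yep</a>

// Single-quoted href: the regex never matches, so nothing is replaced.
const singleQuoted = "<a class='nope' href='/yep'>Yep</a>";
const strippedSingle = singleQuoted.replace(hrefOnly, '<a $1>');
console.log(strippedSingle === singleQuoted); // true

// Greedy variant (for comparison only): the match spans both tags.
const greedyHref = /<a .*(href=".*").*>/;
const twoLinks = '<a href="/a">A</a> <a href="/b">B</a>';
const spanned = twoLinks.replace(greedyHref, '<a $1>');
console.log(spanned); // <a href="/b">

// Lazy version on the same input: only the first tag is matched.
const contained = twoLinks.replace(hrefOnly, '<a $1>');
console.log(contained === twoLinks); // true
```

With greedy quantifiers, the first link and its text disappear entirely, because the match runs from the first tag to the end of the last one.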
Now we’ll extend the example. We want to preserve the class attribute and the href attribute, but nothing else:
    let html = '<a class="yep" href="/yep" id="nope">Yep</a>';
    html = html.replace(/<a .*?(class=".*?").*?(href=".*?").*?>/, '<a $2 $1>');
Note that the replacement string reverses the order of class and href so that href always occurs first. The problem with this regex is that class must always come before href in the input, and (as mentioned before) it will fail if we use single quotes instead of double. We’ll see an even more sophisticated solution in the next section.
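The ordering limitation can be checked directly. This sketch reuses the regex from the example; the second input, with the attributes swapped, is an assumption added for illustration.

```javascript
// The regex from the example: class must appear before href.
const classThenHref = /<a .*?(class=".*?").*?(href=".*?").*?>/;

// Attributes in the expected order: both are kept, and href moves to the front.
const inOrder = '<a class="yep" href="/yep" id="nope">Yep</a>';
const reordered = inOrder.replace(classThenHref, '<a $2 $1>');
console.log(reordered); // <a href="/yep" class="yep">Yep</a>

// Attributes swapped: there is no href after class, so nothing matches.
const swapped = '<a href="/yep" class="yep" id="nope">Yep</a>';
const untouched = swapped.replace(classThenHref, '<a $2 $1>');
console.log(untouched === swapped); // true
```

In the second case the id attribute survives, because replace returns the string unchanged when the regex doesn't match.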
In addition to $1, $2, and so on, there are also $` (everything before the match), $& (the match itself), and $' (everything after the match). If you want to use a literal dollar sign, use $$:
    const input = "One two three";
    input.replace(/two/, '($`)');     // "One (One ) three"
    input.replace(/\w+/g, '($&)');    // "(One) (two) (three)"
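The remaining two patterns work the same way. Here is a short sketch using the same sentence; the variable names are illustrative.

```javascript
const phrase = "One two three";

// $' inserts everything after the match (here, " three").
const withAfter = phrase.replace(/two/, "($')");
console.log(withAfter); // One ( three) three

// $$ inserts a literal dollar sign, so "$$2" yields "$2" rather than group 2.
const withDollar = phrase.replace(/two/, '$$2');
console.log(withDollar); // One $2 three
```

Without the doubled dollar sign, a replacement such as '$2' could be read as a reference to a second group, so $$ is the safe way to write a dollar sign in a replacement string.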