LEARNING JAVASCRIPT - Page 278

discussion, we’ll assume email addresses start with a letter and end with a letter).

Think of the situations you have to consider:

const inputs = [
    "[email protected]",                 // nothing but the email
    "[email protected] is my email",     // email at the beginning
    "my email is [email protected]",     // email at the end
    "use [email protected], my email",   // email in the middle, with comma afterward
    "my email:[email protected].",       // email surrounded with punctuation
];

It’s a lot to consider, but all of these email addresses have one thing in common: they exist at word boundaries. The other advantage of word boundary markers is that, because they don’t consume input, we don’t need to worry about “putting them back” in the replacement string:

const emailMatcher =
    /\b[a-z][a-z0-9._-]*@[a-z][a-z0-9_-]+\.[a-z]+(?:\.[a-z]+)?\b/ig;
inputs.map(s => s.replace(emailMatcher, '<a href="mailto:$&">$&</a>'));

// returns [
// "<a href="mailto:[email protected]">[email protected]</a>",
// "<a href="mailto:[email protected]">[email protected]</a> is my email",
// "my email is <a href="mailto:[email protected]">[email protected]</a>",
// "use <a href="mailto:[email protected]">[email protected]</a>, my email",
// "my email:<a href="mailto:[email protected]">[email protected]</a>.",
// ]

In addition to using word boundary markers, this regex uses many of the features we’ve covered in this chapter. It may seem daunting at first glance, but if you take the time to work through it, you’re well on your way to regex mastery. Note especially that the replacement macro, $&, does not include the characters surrounding the email address, because they were not consumed.

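To make the "not consumed" point concrete, here is a minimal sketch that contrasts the word boundary version with a variant that consumes the neighboring characters. The input string, the address in it, and the names withBoundaries and withGroups are made up for illustration; the second regex is only one possible way to write the consuming version, not something from this chapter.

```javascript
// Hypothetical input; the address is invented for this example.
const input = "use jane.doe@example.com, my email";

// Approach 1: word boundaries are zero-width, so $& is exactly the address.
const withBoundaries =
    /\b[a-z][a-z0-9._-]*@[a-z][a-z0-9_-]+\.[a-z]+(?:\.[a-z]+)?\b/ig;
const linked1 = input.replace(withBoundaries, '<a href="mailto:$&">$&</a>');

// Approach 2 (assumed alternative): match the characters on either side
// in groups 1 and 3. They are consumed, so the replacement has to
// restore them by hand with $1 and $3.
const withGroups =
    /(^|[^a-z0-9._-])([a-z][a-z0-9._-]*@[a-z][a-z0-9_-]+\.[a-z]+(?:\.[a-z]+)?)([^a-z0-9]|$)/ig;
const linked2 = input.replace(withGroups, '$1<a href="mailto:$2">$2</a>$3');

console.log(linked1);
console.log(linked1 === linked2);
```

Both approaches produce the same string for this input, but the second one needs three groups and a more careful replacement string to get there.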

Word boundaries are also handy when you’re trying to search for text that begins with, ends with, or contains another word. For example, /\bcount/ will find count and countdown, but not discount, recount, or accountable. /\bcount\B/ will only find countdown, /\Bcount\b/ will find discount and recount, and /\Bcount\B/ will only find accountable.
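The four patterns above can be checked against the five example words with a short sketch. The names words, patterns, and results are chosen here for illustration.

```javascript
const words = ["count", "countdown", "discount", "recount", "accountable"];

// No g flag, so each call to test() starts from the beginning of the string.
const patterns = [/\bcount/, /\bcount\B/, /\Bcount\b/, /\Bcount\B/];

// For each pattern, keep only the words it matches.
const results = patterns.map(re => words.filter(w => re.test(w)));

results.forEach((matches, i) => {
    console.log(String(patterns[i]), "matches", matches.join(", "));
});
```

Each word is tested on its own, so the start and end of the string count as word boundaries, just as a space or punctuation mark would in running text.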

Lookaheads

If greedy versus lazy matching is what separates the dilettantes from the pros, lookaheads are what separate the pros from the gurus. Lookaheads, like anchor and word boundary metacharacters, don’t consume input. Unlike anchors and word boundaries, however, they are general purpose: you can match any subexpression without consuming it. As with word boundary metacharacters, the fact that lookaheads don’t consume input can save you from having to “put things back” in a replacement. While that can

254 | Chapter 17: Regular Expressions
