Ganesh Joshi

A practical guide to regular expressions for developers

February 9, 2026 · 5 min read
Tutorials

Regular expressions are patterns that match text. They're powerful for validation, search, and text manipulation. You don't need to memorize every symbol: understanding the core concepts and using a tester gets you most of the way.

Basic syntax

A regex pattern matches text character by character. Literal characters match themselves:

const pattern = /hello/;
pattern.test('hello world'); // true
pattern.test('Hello world'); // false (case sensitive)

Add flags for different behavior:

Flag  Meaning
i     Case insensitive
g     Global (find all matches)
m     Multiline (^ and $ match line boundaries)
s     Dotall (. matches newlines)

/hello/i.test('Hello World'); // true
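The remaining flags can be sketched the same way. The sample text and variable names below are made up for illustration; the point is only to show how each flag changes the result.

```javascript
// Sample text invented for this sketch.
const text = 'one\nTwo\nthree';

// g: match() returns every match instead of only the first.
const allWords = text.match(/\w+/g); // ['one', 'Two', 'three']

// m: ^ and $ match at line boundaries, not just at the ends of the string.
const noMultiline = /^Two$/.test(text); // false
const multiline = /^Two$/m.test(text);  // true

// s: the dot also matches newline characters.
const noDotAll = /one.Two/.test(text);  // false
const dotAll = /one.Two/s.test(text);   // true

// Flags can be combined in any order.
const combined = text.match(/^t\w+$/gim); // ['Two', 'three']
```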

Character classes

Match one character from a set:

Pattern       Matches
[abc]         a, b, or c
[a-z]         Any lowercase letter
[A-Z]         Any uppercase letter
[0-9]         Any digit
[a-zA-Z0-9]   Any alphanumeric
[^abc]        Anything except a, b, c

Shorthand classes:

Pattern  Equivalent             Meaning
\d       [0-9]                  Digit
\D       [^0-9]                 Not a digit
\w       [a-zA-Z0-9_]           Word character
\W       [^a-zA-Z0-9_]          Not a word character
\s       [ \t\n\r\f\v]          Whitespace (in JavaScript, also Unicode spaces such as \u00a0)
\S       [^ \t\n\r\f\v]         Not whitespace
.        [^\n\r\u2028\u2029]    Any character except line terminators
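As a quick illustration of how sets, negation, and shorthand classes combine, here is a small sketch. The names hexDigit, stripped, and slugSafe are hypothetical and exist only for this example.

```javascript
// A set matches exactly one character; ranges can be listed side by side.
const hexDigit = /^[0-9a-fA-F]$/;
const isHexC = hexDigit.test('c'); // true
const isHexG = hexDigit.test('g'); // false

// A leading ^ negates the set: remove everything that is not a-z.
const stripped = 'a1-b2_c3'.replace(/[^a-z]/g, ''); // 'abc'

// Shorthand classes work inside sets; a trailing hyphen is a literal hyphen.
const slugSafe = /^[\w-]+$/;
const goodSlug = slugSafe.test('my-post_2'); // true
const badSlug = slugSafe.test('my post');    // false - space is not allowed
```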

Quantifiers

Specify how many times a pattern repeats:

Quantifier  Meaning
*           Zero or more
+           One or more
?           Zero or one
{3}         Exactly 3
{2,5}       Between 2 and 5 (inclusive)
{3,}        3 or more

/\d+/.test('123');     // true - one or more digits
/\d{3}/.test('12');    // false - needs three digits in a row
/\d{2,4}/.test('123'); // true - between 2 and 4 digits
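One detail worth seeing in code: a quantifier only has to be satisfied somewhere in the input unless the pattern is anchored. The sketch below uses made-up inputs and variable names to show the difference.

```javascript
// Unanchored: three digits anywhere in the string are enough.
const loose = /\d{3}/.test('12345');    // true - '123' satisfies {3}

// Anchored: the whole string must be exactly three digits.
const tooLong = /^\d{3}$/.test('12345'); // false
const exact = /^\d{3}$/.test('123');     // true

// ? makes the preceding item optional.
const american = /^colou?r$/.test('color');  // true
const british = /^colou?r$/.test('colour');  // true
```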

Greedy vs lazy

By default, quantifiers are greedy (match as much as possible):

const html = '<div>Hello</div><div>World</div>';

// Greedy: matches everything between first < and last >
html.match(/<.*>/)[0]; // '<div>Hello</div><div>World</div>'

// Lazy: matches as little as possible
html.match(/<.*?>/)[0]; // '<div>'

Add ? after a quantifier to make it lazy.
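A common alternative to a lazy quantifier is a negated character class, which says directly what the match may not contain. This is a sketch on the same sample string, not a way to parse real HTML; the variable names are illustrative.

```javascript
const html = '<div>Hello</div><div>World</div>';

// Lazy quantifier: stop at the first '>' that lets the match succeed.
const firstTagLazy = html.match(/<.*?>/)[0];    // '<div>'

// Negated class: '[^>]*' can never run past a '>', so no laziness is needed.
const firstTagClass = html.match(/<[^>]*>/)[0]; // '<div>'

// With the g flag, every tag is returned.
const allTags = html.match(/<[^>]*>/g); // ['<div>', '</div>', '<div>', '</div>']
```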

Anchors

Match positions, not characters:

Anchor  Position
^       Start of string (or line with m flag)
$       End of string (or line with m flag)
\b      Word boundary
\B      Not a word boundary

/^hello/.test('hello world'); // true - starts with hello
/world$/.test('hello world'); // true - ends with world
/^\d+$/.test('12345');        // true - entire string is digits

Word boundaries:

/\bcat\b/.test('my cat is cute'); // true
/\bcat\b/.test('concatenate');    // false - cat is not a whole word
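A word boundary is defined in terms of \w, which leads to a couple of results that can surprise. The inputs below are invented for this sketch.

```javascript
// \b sits between a word character (\w) and a non-word character or a string edge.
const hyphenated = /\bcat\b/.test('cat-food');  // true - '-' is not a word character
const underscored = /\bcat\b/.test('cat_food'); // false - '_' counts as a word character

// \B is the opposite: it matches only where there is no boundary.
const inside = /\Bcat\B/.test('concatenate'); // true - 'cat' has letters on both sides
const standalone = /\Bcat\B/.test('my cat');  // false - 'cat' is a whole word here
```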

Groups

Parentheses create groups:

// Capturing group - extract matches
const match = 'John Doe'.match(/(\w+) (\w+)/);
// match[0] = 'John Doe' (full match)
// match[1] = 'John' (first group)
// match[2] = 'Doe' (second group)

// Non-capturing group - grouping without capturing
/(?:https?:\/\/)?example\.com/.test('https://example.com'); // true
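To make the difference concrete, the sketch below runs a capturing and a non-capturing version of a similar pattern and compares the result arrays. The variable names are illustrative only.

```javascript
// Both groups capture, so the protocol lands in index 1.
const capturing = 'https://example.com'.match(/(https?:\/\/)?(example\.com)/);
// capturing[1] is 'https://', capturing[2] is 'example.com'

// (?:...) groups without capturing, so the host moves up to index 1.
const nonCapturing = 'https://example.com'.match(/(?:https?:\/\/)?(example\.com)/);
// nonCapturing[1] is 'example.com'

// An optional capturing group that did not take part in the match is undefined.
const optional = 'example.com'.match(/(https?:\/\/)?(example\.com)/);
// optional[1] is undefined, optional[2] is 'example.com'
```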

Named groups

Give groups meaningful names:

const pattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const match = '2026-02-23'.match(pattern);

match.groups.year;  // '2026'
match.groups.month; // '02'
match.groups.day;   // '23'
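Named groups also work with destructuring and in replacement strings, where they are referenced as $<name>. A minimal sketch, with isoDate and usDate as made-up names:

```javascript
const isoDate = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;

// Destructure the named groups straight out of the match.
const { year, month, day } = '2026-02-23'.match(isoDate).groups;

// Reference named groups in a replacement string with $<name>.
const usDate = '2026-02-23'.replace(isoDate, '$<month>/$<day>/$<year>'); // '02/23/2026'

// match() returns null when nothing matches, so guard before reading .groups.
const noGroups = 'not a date'.match(isoDate)?.groups; // undefined
```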

Backreferences

Reference earlier groups:

// Match repeated words
/(\b\w+\b)\s+\1/.test('the the'); // true

// Named backreference
/(?<word>\w+)\s+\k<word>/.test('hello hello'); // true
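Backreferences are handy in replacements too, for example to collapse accidentally doubled words. The sketch below assumes a trailing \b is wanted so that 'the theme' is not treated as a repeat; the names doubled and cleaned are illustrative.

```javascript
// Without a trailing boundary, the second 'the' may be the start of a longer word.
const falsePositive = /(\b\w+\b)\s+\1/.test('the theme'); // true

// \b after \1 requires the repeated text to end at a word boundary.
const doubled = /\b(\w+)\s+\1\b/gi;
const noRepeat = /\b(\w+)\s+\1\b/.test('the theme'); // false

// $1 in the replacement keeps a single copy of the repeated word.
const cleaned = 'This is is a test test.'.replace(doubled, '$1'); // 'This is a test.'
```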

Alternation

Match one of several patterns:

/cat|dog|bird/.test('I have a dog'); // true
/(cat|dog) food/.test('dog food');   // true
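Alternation has the lowest precedence of any regex operator, so anchors bind to a single branch unless the alternatives are grouped. A short sketch with made-up inputs:

```javascript
// Reads as (^cat) OR (dog$), not "exactly cat or exactly dog".
const loose = /^cat|dog$/;
const looseCatalog = loose.test('catalog'); // true - starts with 'cat'
const looseHotdog = loose.test('hotdog');   // true - ends with 'dog'

// Group the alternatives so both anchors apply to every branch.
const strict = /^(?:cat|dog)$/;
const strictCatalog = strict.test('catalog'); // false
const strictDog = strict.test('dog');         // true
```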

Lookahead and lookbehind

Match based on what comes before or after without including it:

Pattern   Type                 Meaning
(?=...)   Positive lookahead   Followed by
(?!...)   Negative lookahead   Not followed by
(?<=...)  Positive lookbehind  Preceded by
(?<!...)  Negative lookbehind  Not preceded by

// Password: at least 8 characters, with a lowercase letter and a digit
/^(?=.*[a-z])(?=.*\d).{8,}$/.test('abc12345'); // true

// Price without currency
'$100'.match(/(?<=\$)\d+/)[0]; // '100'

// Word not preceded by 'un'
/(?<!un)happy/.test('happy');   // true
/(?<!un)happy/.test('unhappy'); // false
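Because lookarounds match a position rather than text, they are useful for inserting or splitting without consuming anything. Two well-known idioms, sketched with illustrative names:

```javascript
// Insert a comma at every position that is followed by groups of exactly
// three digits up to the end of the number. \B keeps a comma off the front.
const withCommas = '1234567'.replace(/\B(?=(\d{3})+(?!\d))/g, ','); // '1,234,567'
const small = '999'.replace(/\B(?=(\d{3})+(?!\d))/g, ',');          // '999'

// Split before each uppercase letter; the lookahead consumes no characters,
// so the letters stay in the result.
const words = 'camelCaseString'.split(/(?=[A-Z])/); // ['camel', 'Case', 'String']
```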

Common patterns

Email (basic)

/^[^\s@]+@[^\s@]+\.[^\s@]+$/

Phone number (US)

/^\(?(\d{3})\)?[-.\s]?(\d{3})[-.\s]?(\d{4})$/

URL

/^https?:\/\/[\w.-]+(?:\/[\w./?&=-]*)?$/

Date (YYYY-MM-DD)

/^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/

Hex color

/^#([0-9A-Fa-f]{3}|[0-9A-Fa-f]{6})$/
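These patterns check the shape of the input, not whether the value is meaningful. The date pattern, for instance, accepts 2026-02-30. One possible way to close that gap is to pair the regex with a calendar check; isRealDate and DATE_SHAPE below are hypothetical names for this sketch, not part of any library.

```javascript
// Same date pattern as above, with a capturing group added around the year.
const DATE_SHAPE = /^(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/;

function isRealDate(input) {
  const m = DATE_SHAPE.exec(input);
  if (!m) return false; // wrong shape

  const [year, month, day] = m.slice(1).map(Number);
  // Date rolls invalid days over (Feb 30 becomes Mar 2), so compare the parts.
  // Note: Date.UTC maps years 0-99 to 1900-1999, so those years are rejected here.
  const d = new Date(Date.UTC(year, month - 1, day));
  return (
    d.getUTCFullYear() === year &&
    d.getUTCMonth() === month - 1 &&
    d.getUTCDate() === day
  );
}

isRealDate('2026-02-23'); // true
isRealDate('2026-02-30'); // false - right shape, but no such day
isRealDate('2026-13-01'); // false - rejected by the regex
```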

JavaScript methods

Method      Returns         Use case
test()      boolean         Check if pattern matches
match()     array or null   Extract matches
matchAll()  iterator        All matches with groups
replace()   string          Replace matches
search()    index or -1     Find first match position
split()     array           Split on pattern

// Replace with captured groups
'John Doe'.replace(/(\w+) (\w+)/, '$2, $1'); // 'Doe, John'

// Replace with function
'hello'.replace(/./g, (c) => c.toUpperCase()); // 'HELLO'

// Split on multiple delimiters
'a,b;c:d'.split(/[,;:]/); // ['a', 'b', 'c', 'd']
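matchAll() and search() are not shown above, so here is a brief sketch of both. The query string and variable names are invented for illustration; note that matchAll() throws a TypeError if the regex lacks the g flag.

```javascript
// matchAll(): iterate over every match, with access to its groups.
const query = 'page=2&sort=name&dir=asc';
const params = {};
for (const m of query.matchAll(/(?<key>\w+)=(?<value>\w+)/g)) {
  params[m.groups.key] = m.groups.value;
}
// params is { page: '2', sort: 'name', dir: 'asc' }

// search(): index of the first match, or -1 when there is none.
const firstDigit = 'abc123'.search(/\d/); // 3
const noDigit = 'abc'.search(/\d/);       // -1
```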

Escaping

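Special characters need escaping to match literally:

. * + ? ^ $ { } [ ] \ | ( )

In a regex literal, / must be escaped as well, because it ends the literal.

// Match literal dot
/\./.test('example.com'); // true

// In the RegExp constructor, the pattern is a string, so double the backslash
new RegExp('\\.').test('example.com'); // true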
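When the text to match comes from a variable, such as user input, it needs to be escaped before it goes into the RegExp constructor. A commonly used helper looks like the sketch below; escapeRegExp is a name chosen for this example, not a built-in.

```javascript
// Prefix every regex metacharacter with a backslash.
// '$&' in the replacement stands for the matched character.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const userInput = 'a.b*c';
const literal = new RegExp(escapeRegExp(userInput));

literal.test('a.b*c'); // true - matches the exact text
literal.test('aXb*c'); // false - the dot is no longer a wildcard

// Count literal occurrences of '1+1' without '+' acting as a quantifier.
const count = '1+1=2, 1+1=2'.match(new RegExp(escapeRegExp('1+1'), 'g')).length; // 2
```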

Performance tips

Tip                              Example
Avoid catastrophic backtracking  Rewrite nested quantifiers such as (a+)+; JavaScript has no atomic groups or possessive quantifiers
Be specific                      \d{4} instead of \d+ when you know the length
Anchor when possible             ^pattern can fail faster than an unanchored pattern
Compile once, use many           Store the regex in a variable

// Sample data for the two snippets below
const items = ['pattern one', 'something else'];

// Bad: regex literal re-evaluated on each iteration
items.filter(item => /pattern/.test(item));

// Good: regex created once and reused
const pattern = /pattern/;
items.filter(item => pattern.test(item));
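Two caveats are worth sketching. First, a stored regex with the g flag remembers its position in lastIndex between test() calls, so reusing it across different strings can skip matches. Second, a nested quantifier can often be rewritten so that each character can be matched in only one way. The names and inputs below are made up, and the rewrite assumes words separated by a single whitespace character.

```javascript
// Caveat 1: a global regex is stateful when used with test().
const stateful = /a/g;
const first = stateful.test('a');  // true  - lastIndex is now 1
const second = stateful.test('a'); // false - search started at index 1, lastIndex resets to 0

// The same effect silently drops items in a filter.
const globalRe = /a/g;
const kept = ['a', 'a', 'a'].filter((s) => globalRe.test(s)); // only 2 of 3 items

// Without the g flag there is no state to worry about.
const plainRe = /a/;
const keptAll = ['a', 'a', 'a'].filter((s) => plainRe.test(s)); // all 3 items

// Caveat 2: nested quantifiers let the engine split the same input many ways,
// which can get very slow on long inputs that almost match.
const risky = /^(\w+\s?)+$/;
// Equivalent pattern where each word character belongs to exactly one \w+.
const safer = /^\w+(?:\s\w+)*\s?$/;
```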

Debugging

Use regex testers to:

  • See matches highlighted
  • Test against multiple inputs
  • Understand what each part matches

The Regex Tester on this site lets you test patterns with real-time feedback.

Summary

Regex patterns match text using character classes, quantifiers, and anchors. Use groups to capture parts of a match. Lookahead and lookbehind match based on context without including it. Test patterns with a regex tester before using them in code. For complex validation, consider a validation library that handles edge cases.

Frequently Asked Questions

What is a regular expression?

A regular expression (regex) is a pattern that describes a set of strings. It's used for searching, matching, and manipulating text. Most programming languages support regex with similar syntax.

How do I validate an email address with regex?

A basic email pattern is /^[^\s@]+@[^\s@]+\.[^\s@]+$/. Regex works for basic format checking, but email rules are complex, so use a validation library in production.

What does \d+ mean?

\d matches any digit (0-9). The + quantifier means one or more. So \d+ matches one or more consecutive digits like '123' or '7'.

How do I escape special characters?

Use a backslash before special characters: \. for a literal dot, \* for a literal asterisk, \[ for a literal bracket. In JavaScript strings, double the backslash: '\\d+'.

What is the difference between .* and .*?

.* is greedy: it matches as much as possible. .*? is lazy: it matches as little as possible. Use lazy matching when you want the shortest match.
