Regular Expressions
What are Regular Expressions?
Regular expressions (regex or regexp) are patterns used to match character combinations in strings. They're powerful tools for text validation, searching, and manipulation.
In JavaScript, regular expressions are objects created using either the regex literal syntax or the RegExp constructor.
Creating regular expressions
// Literal syntax
const pattern1 = /hello/;
// Constructor syntax
const pattern2 = new RegExp('hello');
// Both are equivalent
console.log(pattern1.test('hello world')); // true
console.log(pattern2.test('hello world')); // true
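The constructor form is mostly useful when the pattern is only known at runtime. The sketch below assumes a hypothetical `word` variable that holds trusted text with no regex metacharacters. Because the pattern is written as a string, each backslash has to be doubled.

```javascript
// Hypothetical runtime value; assumed to contain no regex metacharacters
const word = 'hello';

// Build /\bhello\b/i from a string: '\\b' in the string becomes \b in the pattern
const dynamicPattern = new RegExp('\\b' + word + '\\b', 'i');

console.log(dynamicPattern.source); // \bhello\b
console.log(dynamicPattern.flags); // 'i'
console.log(dynamicPattern.test('Say HELLO to everyone')); // true
console.log(dynamicPattern.test('Othello')); // false
```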
Basic Pattern Syntax
Literal Characters
Match exact characters:
Literal character matching
const pattern = /cat/;
console.log(pattern.test('The cat is here')); // true
console.log(pattern.test('The dog is here')); // false
Character Classes
Match any character from a set:
Character classes
// [abc] - matches a, b, or c
console.log(/[abc]/.test('b')); // true
// [a-z] - matches any lowercase letter
console.log(/[a-z]/.test('m')); // true
// [0-9] - matches any digit
console.log(/[0-9]/.test('5')); // true
// [^abc] - matches anything except a, b, or c (negation)
console.log(/[^abc]/.test('d')); // true
console.log(/[^abc]/.test('a')); // false
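Ranges and single characters can be combined in one class, and most metacharacters lose their special meaning inside the brackets. A short sketch with illustrative variable names:

```javascript
// Several ranges can share one class: hex digits in either case
const hexDigit = /[0-9a-fA-F]/;
console.log(hexDigit.test('f')); // true
console.log(hexDigit.test('g')); // false

// Most metacharacters are literal inside brackets: this matches '.', '+', or '*'
const operatorChar = /[.+*]/;
console.log(operatorChar.test('a+b')); // true
console.log(operatorChar.test('abc')); // false

// A hyphen placed first or last is a literal hyphen, not a range
const dashOrDigit = /[0-9-]/;
console.log(dashOrDigit.test('-')); // true
```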
Predefined Character Classes
Predefined character classes
// \d - digit (0-9)
console.log(/\d/.test('abc5def')); // true
// \w - word character (letters, digits, underscore)
console.log(/\w/.test('hello_123')); // true
// \s - whitespace character
console.log(/\s/.test('hello world')); // true
// \D - not a digit
console.log(/\D/.test('abc')); // true
// \W - not a word character
console.log(/\W/.test('hello!')); // true
// \S - not a whitespace character
console.log(/\S/.test('   ')); // false
Quantifiers
Control how many times a pattern matches:
Quantifiers
// * - 0 or more times
/ab*c/.test('ac'); // true
/ab*c/.test('abc'); // true
/ab*c/.test('abbc'); // true
// + - 1 or more times
/ab+c/.test('ac'); // false
/ab+c/.test('abc'); // true
// ? - 0 or 1 time (optional)
/colou?r/.test('color'); // true
/colou?r/.test('colour'); // true
// {n} - exactly n times
/a{3}/.test('aaa'); // true
/a{3}/.test('aa'); // false
// {n,} - n or more times
/a{2,}/.test('aa'); // true
/a{2,}/.test('aaa'); // true
// {n,m} - between n and m times
/a{2,4}/.test('aaa'); // true
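test() only reports whether a match exists somewhere in the string. To see how much text a quantifier actually consumed, it can help to inspect the matched text with match(), as in this small sketch:

```javascript
// match() returns the matched text at index 0, or null when nothing matches
console.log('abbbc'.match(/ab*/)[0]); // 'abbb' - * takes as many b's as it can
console.log('ac'.match(/ab*/)[0]); // 'a' - zero b's is still a match
console.log('aaaaa'.match(/a{2,4}/)[0]); // 'aaaa' - stops at the upper bound
console.log('a'.match(/a{2,}/)); // null - fewer than 2
```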
Anchors
Match positions in the string:
Anchors
// ^ - start of string
/^hello/.test('hello world'); // true
/^hello/.test('world hello'); // false
// $ - end of string
/world$/.test('hello world'); // true
/world$/.test('world hello'); // false
// \b - word boundary
/\bhello\b/.test('hello world'); // true
/\bhello\b/.test('helloworld'); // false
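Without anchors, a pattern can match anywhere inside a longer string, which matters for validation. A minimal sketch with illustrative pattern names, using both ^ and $ to require that the whole string fits:

```javascript
const containsFiveDigits = /\d{5}/; // matches any run of 5 digits inside the string
const exactlyFiveDigits = /^\d{5}$/; // the whole string must be 5 digits

console.log(containsFiveDigits.test('1234567')); // true
console.log(exactlyFiveDigits.test('1234567')); // false
console.log(exactlyFiveDigits.test('12345')); // true
```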
Flags
Flags modify how the regex behaves:
Common flags
// g - global (find all matches, not just first)
const text = 'cat cat cat';
console.log(text.match(/cat/)); // ['cat'] - just first
console.log(text.match(/cat/g)); // ['cat', 'cat', 'cat'] - all
// i - case insensitive
console.log(/hello/i.test('HELLO')); // true
// m - multiline (^ and $ match line boundaries)
const multiline = 'line1\nline2\nline3';
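console.log(/^line2/.test(multiline)); // false - ^ only matches the start of the string
console.log(/^line2/m.test(multiline)); // true - ^ also matches after each newline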
console.log(multiline.match(/^line/gm)); // ['line', 'line', 'line']
// s - dotall (. matches newlines too)
console.log(/a.b/.test('a\nb')); // false
console.log(/a.b/s.test('a\nb')); // true
// Combine flags
/pattern/gim; // global, case-insensitive, multiline
Methods
Test Method
Check if a pattern matches:
test() method
const pattern = /[0-9]/;
console.log(pattern.test('abc')); // false
console.log(pattern.test('abc5')); // true
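// Stateful with global flag: each call resumes from lastIndex
const globalPattern = /\d/g;
console.log(globalPattern.test('a1b2')); // true - matches '1', lastIndex is now 2
console.log(globalPattern.test('a1b2')); // true - matches '2', lastIndex is now 4
console.log(globalPattern.test('a1b2')); // false - no digits left, lastIndex resets to 0
globalPattern.lastIndex = 0; // Reset manually before reusing the pattern on another string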
Match Method
Find matches in a string:
match() method
const text = 'The numbers are 42 and 23';
// Without global flag - returns first match with details
console.log(text.match(/\d+/));
// ['42', index: 16, input: 'The numbers are 42 and 23', groups: undefined]
// With global flag - returns all matches
console.log(text.match(/\d+/g)); // ['42', '23']
// No match returns null
console.log(text.match(/xyz/)); // null
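With the global flag, match() drops the index and group details. When both all matches and their details are needed, String.prototype.matchAll() (ES2020) is one option. A small sketch, using an illustrative `sentence` variable:

```javascript
const sentence = 'The numbers are 42 and 23';

// matchAll() requires the g flag and returns an iterator of match objects
const found = [...sentence.matchAll(/\d+/g)].map((m) => ({
  value: m[0],
  index: m.index,
}));

console.log(found);
// [ { value: '42', index: 16 }, { value: '23', index: 23 } ]
```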
Replace Method
Replace matched patterns:
replace() method
const text = 'The cat and the cat';
// Replace first match
console.log(text.replace(/cat/, 'dog')); // 'The dog and the cat'
// Replace all matches with global flag
console.log(text.replace(/cat/g, 'dog')); // 'The dog and the dog'
// Replace with function
console.log(text.replace(/cat/g, (match) => {
return match.toUpperCase();
})); // 'The CAT and the CAT'
// Replace with capture groups
console.log('2024-01-15'.replace(/(\d{4})-(\d{2})-(\d{2})/, '$3/$2/$1'));
// 15/01/2024
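A replacer function also receives the capture groups as extra arguments after the full match. As an illustration, this sketch converts a snake_case name to camelCase:

```javascript
// The replacer receives the full match first, then each capture group
const camel = 'user_first_name'.replace(/_([a-z])/g, (match, letter) => {
  return letter.toUpperCase();
});

console.log(camel); // 'userFirstName'
```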
Search Method
Find index of first match:
search() method
const text = 'The quick brown fox';
console.log(text.search(/quick/)); // 4
console.log(text.search(/lazy/)); // -1 (not found)
Split Method
Split string by pattern:
split() method
console.log('a,b;c:d'.split(/[,;:]/)); // ['a', 'b', 'c', 'd']
console.log('Hello123World456'.split(/\d+/)); // ['Hello', 'World', '']
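Two details of split() are easy to miss: a separator at the end of the string leaves an empty string in the result, and a capture group in the pattern keeps the separators. A short sketch of both:

```javascript
// Drop empty strings produced by a separator at the start or end
const words = 'Hello123World456'.split(/\d+/).filter((part) => part !== '');
console.log(words); // ['Hello', 'World']

// A capture group in the separator keeps the separators in the result
const tokens = '1+2-3'.split(/([+-])/);
console.log(tokens); // ['1', '+', '2', '-', '3']
```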
Capture Groups
Groups allow you to extract parts of matches:
Capture groups
const email = 'john.doe@example.com';
const pattern = /(\w+)\.(\w+)@(\w+)/;
const match = email.match(pattern);
console.log(match[0]); // 'john.doe@example'
console.log(match[1]); // 'john' - first group
console.log(match[2]); // 'doe' - second group
console.log(match[3]); // 'example' - third group
// Named capture groups (ES2018+)
const datePattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const dateMatch = '2024-01-15'.match(datePattern);
console.log(dateMatch.groups.year); // '2024'
console.log(dateMatch.groups.month); // '01'
console.log(dateMatch.groups.day); // '15'
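Named groups can also be referenced in a replacement string with $<name>, and the groups object can be destructured. The sketch below uses an illustrative `isoDate` pattern and guards against a failed match, since match() returns null in that case:

```javascript
const isoDate = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;

// $<name> in the replacement string refers to a named group
console.log('2024-01-15'.replace(isoDate, '$<day>/$<month>/$<year>')); // '15/01/2024'

// groups is an object keyed by group name, so it can be destructured
const { year, month, day } = '2024-01-15'.match(isoDate).groups;
console.log(year, month, day); // 2024 01 15

// match() returns null on failure, so guard before reading groups
const noMatch = 'not a date'.match(isoDate);
console.log(noMatch?.groups ?? null); // null
```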
Practical Examples
Email Validation
Email validation
// Basic email validation
const emailPattern = /^[\w.-]+@[\w.-]+\.\w+$/;
console.log(emailPattern.test('john@example.com')); // true
console.log(emailPattern.test('john.doe@example.co.uk')); // true
console.log(emailPattern.test('invalid.email@')); // false
console.log(emailPattern.test('invalid@.com')); // false
// Note: Full RFC 5322 email validation is complex
// For production, use a library or server-side validation
Password Validation
Password validation
// At least 8 characters, uppercase, lowercase, digit, special char
const passwordPattern = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/;
console.log(passwordPattern.test('Pass123!')); // true
console.log(passwordPattern.test('password')); // false (no uppercase, digit, or special char)
console.log(passwordPattern.test('Pass123')); // false (too short, no special char)
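Each (?=...) lookahead in the pattern above checks one requirement without consuming any characters. One way to see this, and to report which requirement failed, is to split the pattern into one regex per rule. This is a sketch only; the `passwordRules` list, its messages, and the `missingPasswordRules` helper are hypothetical names:

```javascript
// Each rule mirrors one part of the combined pattern; messages are illustrative
const passwordRules = [
  { regex: /^.{8,}$/, message: 'at least 8 characters' },
  { regex: /[a-z]/, message: 'a lowercase letter' },
  { regex: /[A-Z]/, message: 'an uppercase letter' },
  { regex: /\d/, message: 'a digit' },
  { regex: /[@$!%*?&]/, message: 'a special character (@$!%*?&)' },
  { regex: /^[A-Za-z\d@$!%*?&]*$/, message: 'only letters, digits, and @$!%*?&' },
];

// Hypothetical helper: returns the messages of every rule the password fails
function missingPasswordRules(password) {
  return passwordRules
    .filter((rule) => !rule.regex.test(password))
    .map((rule) => rule.message);
}

console.log(missingPasswordRules('Pass123!')); // []
console.log(missingPasswordRules('Pass123'));
// ['at least 8 characters', 'a special character (@$!%*?&)']
```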
Phone Number Validation
Phone number validation
// US phone format: (123) 456-7890 or 123-456-7890
const phonePattern = /^(\(\d{3}\)|\d{3})[-\s]?\d{3}[-\s]?\d{4}$/;
console.log(phonePattern.test('(123) 456-7890')); // true
console.log(phonePattern.test('123-456-7890')); // true
console.log(phonePattern.test('123 456 7890')); // true
console.log(phonePattern.test('1234567890')); // true (separators are optional)
URL Validation
URL validation
const urlPattern = /^https?:\/\/(www\.)?[-a-zA-Z0-9@:%._+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_+.~#?&/=]*)$/;
console.log(urlPattern.test('https://www.example.com')); // true
console.log(urlPattern.test('http://example.com/page')); // true
console.log(urlPattern.test('example.com')); // false (no protocol)
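As an alternative to a hand-written pattern, the built-in URL constructor (available in browsers and Node.js) can do the parsing. The `isHttpUrl` helper below is a hypothetical sketch. It is more lenient than the regex above, accepting anything the URL parser accepts with an http or https scheme.

```javascript
// Hypothetical helper: accepts only strings that parse as http(s) URLs
function isHttpUrl(value) {
  try {
    const url = new URL(value); // throws TypeError if value cannot be parsed
    return url.protocol === 'http:' || url.protocol === 'https:';
  } catch {
    return false;
  }
}

console.log(isHttpUrl('https://www.example.com')); // true
console.log(isHttpUrl('http://example.com/page')); // true
console.log(isHttpUrl('example.com')); // false (no protocol)
console.log(isHttpUrl('ftp://example.com')); // false (wrong protocol)
```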
Extract Data
Extract data from text
const text = 'Contact: john@example.com or jane@example.com';
const emails = text.match(/[\w.-]+@[\w.-]+\.\w+/g);
console.log(emails); // ['john@example.com', 'jane@example.com']
// Extract all numbers
const numbers = '123 apples, 456 oranges'.match(/\d+/g);
console.log(numbers); // ['123', '456']
Find and Replace
Find and replace with patterns
const csv = 'John,25,Engineer|Jane,30,Designer|Bob,28,Manager';
// Convert to table format
const table = csv.replace(/\|/g, '\n')
.replace(/,/g, ' | ');
console.log(table);
// John | 25 | Engineer
// Jane | 30 | Designer
// Bob | 28 | Manager
Remove Duplicates
Remove duplicate words
const text = 'hello hello world world world test';
const unique = text.replace(/\b(\w+)(\s+\1)+\b/g, '$1');
console.log(unique); // 'hello world test'
Common Mistakes
1. Forgetting Global Flag
Global flag importance
const text = 'cat cat cat';
// Without g - only first match
console.log(text.match(/cat/)); // ['cat']
// With g - all matches
console.log(text.match(/cat/g)); // ['cat', 'cat', 'cat']
2. Not Escaping Special Characters
Escaping special characters
// This looks like it matches a literal period, but it does not
/hello.world/.test('helloXworld'); // true (. matches any character!)
// Escape with backslash for literal period
/hello\.world/.test('hello.world'); // true
/hello\.world/.test('helloXworld'); // false
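The same problem appears when a pattern is built from user input with the RegExp constructor. A common approach is to escape every metacharacter first. The `escapeRegExp` helper below is a hypothetical sketch of that idea, not a built-in function:

```javascript
// Hypothetical helper: prefixes every regex metacharacter with a backslash
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); // $& inserts the matched character
}

const userInput = 'a.b';
const unescaped = new RegExp(userInput);
const escaped = new RegExp(escapeRegExp(userInput));

console.log(escapeRegExp(userInput)); // a\.b
console.log(unescaped.test('aXb')); // true (. matches any character)
console.log(escaped.test('aXb')); // false
console.log(escaped.test('a.b')); // true
```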
3. Greedy vs Non-Greedy
Greedy vs non-greedy matching
const html = '<p>First</p><p>Second</p>';
// Greedy (default) - matches longest possible
html.match(/<p>.*<\/p>/); // ['<p>First</p><p>Second</p>']
// Non-greedy - matches shortest possible
html.match(/<p>.*?<\/p>/); // ['<p>First</p>']
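A negated character class is another way to stop a match early, and it often reads more clearly than a lazy quantifier. The sketch below compares the two on an illustrative `markup` string. It only works for simple, non-nested markup like this; regular expressions are not a general HTML parser.

```javascript
const markup = '<p>First</p><p>Second</p>';

// With g, the lazy pattern yields each paragraph separately
console.log(markup.match(/<p>.*?<\/p>/g)); // ['<p>First</p>', '<p>Second</p>']

// [^<]* cannot run past the next '<', so it gives the same result here
console.log(markup.match(/<p>[^<]*<\/p>/g)); // ['<p>First</p>', '<p>Second</p>']

// A capture group plus matchAll() extracts just the inner text
const inner = [...markup.matchAll(/<p>(.*?)<\/p>/g)].map((m) => m[1]);
console.log(inner); // ['First', 'Second']
```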
Performance Tips
- Use literal syntax for simple patterns: Faster than constructor
- Cache compiled regexes: Don't recreate the same pattern repeatedly
- Use specific patterns: /a+/ is faster than /a*/
- Test early: Use test() instead of match() if you only need true/false
- Avoid backtracking: Complex patterns with many quantifiers can be slow
Performance best practice
// Bad: Creates a new RegExp object on every call
function validateEmail(email) {
return /^[\w.-]+@[\w.-]+\.\w+$/.test(email);
}
// Good: Compile once
const emailPattern = /^[\w.-]+@[\w.-]+\.\w+$/;
function validateEmail(email) {
return emailPattern.test(email);
}
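The "avoid backtracking" tip mostly concerns nested quantifiers. In the sketch below, both patterns accept the same strings, but the nested one gives the engine many ways to divide a run of a's, and on a near-miss input it may try them all before failing. The input is kept short on purpose, since the cost can grow very quickly with length.

```javascript
// Nested quantifiers: (a+)+ can split a run of a's in many different ways
const nested = /^(a+)+$/;
// Flat pattern that accepts the same strings with only one way to match
const flat = /^a+$/;

const almost = 'a'.repeat(12) + '!'; // near miss, kept short on purpose

console.log(nested.test('aaaa'), flat.test('aaaa')); // true true
console.log(nested.test(almost), flat.test(almost)); // false false
```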
Key Takeaways
- Regular expressions are powerful tools for pattern matching
- Use character classes and quantifiers to create flexible patterns
- Flags modify regex behavior (global, case-insensitive, multiline)
- Capture groups extract specific parts of matches
- Test your patterns thoroughly, especially for validation
- For complex validation, consider libraries or server-side validation
- Remember special characters need escaping with backslash