JavaScript Methods to Extract Email Addresses from Strings

Tested prompts for extracting email addresses in JavaScript, compared across 5 leading AI models.

BEST BY JUDGE SCORE Claude Haiku 4.5 9/10

If you have a blob of text, a scraped webpage, a log file, or user-submitted content and you need to pull every email address out of it programmatically, JavaScript regex is the standard tool for the job. The core challenge is writing a pattern that catches real emails without matching malformed strings or missing edge cases like subdomains, plus signs, and dots in the local part.

Most developers reach for a simple regex like /[\w.-]+@[\w.-]+\.[a-z]{2,}/gi and call it done. That works for clean input, but it skips quoted local parts, accepts leading dots in the local part, and can swallow the next word when a sentence-ending period is not followed by a space. The difference between a quick solution and a reliable one is knowing which tradeoffs your use case demands.
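
As a rough illustration of those tradeoffs, the sketch below runs the simple pattern against a few hand-written strings. The sample inputs are assumptions made up for this example; they are not part of the tests reported on this page.

```javascript
// The simple pattern quoted above.
const simplePattern = /[\w.-]+@[\w.-]+\.[a-z]{2,}/gi;

// Hypothetical sample inputs, each chosen to expose one behavior.
const samples = {
  clean: 'Contact jane@example.com for details',
  noSpaceAfterPeriod: 'Mail jane@example.com.Thanks',
  leadingDots: 'See ...bob@example.com',
  quotedLocalPart: 'Try "john doe"@example.com',
};

const results = {};
for (const [name, text] of Object.entries(samples)) {
  // match() returns null when nothing matches, so default to [].
  results[name] = text.match(simplePattern) || [];
}

console.log(results);
// clean              → [ 'jane@example.com' ]
// noSpaceAfterPeriod → [ 'jane@example.com.Thanks' ]
// leadingDots        → [ '...bob@example.com' ]
// quotedLocalPart    → []
```

Whether any of these matter depends on your input: clean form data rarely contains quoted local parts, while scraped prose often has missing spaces.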

This page walks through AI-generated JavaScript approaches for extracting email addresses from strings, compares their accuracy and edge-case handling, and gives you enough context to pick the right implementation for your specific situation, whether that is a one-off script, a Node.js backend parser, or a browser-side form tool.

When to use this

JavaScript email extraction from strings fits well when you are processing unstructured or semi-structured text programmatically and need a list of addresses without spinning up a dedicated parsing library. It works on both the browser and Node.js sides of the stack, and regex-based extraction keeps your dependency count at zero.

  • Parsing CSV exports or plain-text data dumps that contain email addresses mixed with other content
  • Scraping contact information from HTML strings after fetching a page with fetch() or Puppeteer
  • Validating and extracting emails from user-pasted blocks of text in a form or admin tool
  • Processing server log files or webhook payloads to collect notification recipients
  • Building a lightweight browser extension that highlights or collects emails from the current page

When this format breaks down

  • When you need RFC 5321/5322 full compliance: JavaScript regex cannot correctly parse all technically valid email formats, including quoted local parts like '"john doe"@example.com', without becoming unmaintainable.
  • When input volume is in the millions of records: a pure regex loop in JavaScript will bottleneck; use a compiled language or a streaming parser with worker threads instead.
  • When the extracted emails feed directly into a send list without any secondary validation: extraction is not the same as verification. An extracted address can still be syntactically correct but non-existent.
  • When you are operating inside an environment with strict per-request CPU limits, such as Cloudflare Workers, because a pattern with nested quantifiers can backtrack catastrophically on adversarial input (ReDoS) and exhaust that budget.

The prompt we tested

You are a JavaScript expert helping users extract email addresses from strings using JavaScript.

Respond with a clear explanation followed by a working JavaScript code block using String.prototype.match() with a robust email regex. Include a brief breakdown of the regex pattern and show the array of extracted emails as output. Keep the response under 250 words and focus on practical, copy-paste-ready code.

User's input:
I have a block of user-submitted text from a contact form that contains multiple email addresses mixed with other content: 'Please contact john.doe@example.com or our support team at help_desk@company.co.uk. You can also reach Sarah at sarah.smith+work@gmail.com for urgent matters.' How do I extract just the email addresses into an array in JavaScript?

Provide a complete JavaScript solution that extracts all email addresses from the given text, including the regex pattern, working code example, and expected output.

Model comparison

Model                     Quality  Cost     Latency  Tokens
Claude Opus 4.7           9        $0.0544  7.6s     304 / 664
Claude Haiku 4.5          9        $0.0025  3.4s     206 / 465
GPT-5.4                   9        $0.0056  5.1s     193 / 343
Gemini 2.5 Pro            9        $0.0022  14.5s    198 / 382
Grok 4.1 Fast Reasoning   9        $0.0002  7.7s     343 / 288

Outputs

Claude Opus 4.7 9/10
# Extracting Email Addresses from Text in JavaScript

You can use `String.prototype.match()` with a global regex flag to extract all email addresses into an array. Here's a practical, copy-paste-ready solution:

```javascript
const text = "Please contact john.doe@example.com or our support team at help_desk@company.co.uk. You can also reach Sarah at sarah.smith+work@gmail.com for urgent matters.";

const emailRegex = /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g;

const emails = text.match(emailRegex) || [];

console.log(emails);
// Output: [
//   'john.doe@example.com',
//   'help_desk@company.co.uk',
//   'sarah.smith+work@gmail.com'
// ]
```

## Regex Breakdown

- **`[a-zA-Z0-9._%+-]+`** — Matches the local part (before `@`): letters, numbers, dots, underscores, percent signs, plus signs, and hyphens. Handles cases like `sarah.smith+work`.
- **`@`** — Literal `@` symbol separator.
- **`[a-zA-Z0-9.-]+`** — Matches the domain name: letters, numbers, dots, and hyphens (e.g., `company`).
- **`\.`** — Literal dot before the TLD.
- **`[a-zA-Z]{2,}`** — Matches the top-level domain of 2+ characters (e.g., `com`, `co`, `uk`).
- **`/g`** — Global flag ensures **all** matches are returned, not just the first.

## Key Notes

- The `|| []` fallback prevents errors when no emails are found (`match()` returns `null` otherwise).
- This regex covers ~99% of real-world emails. For strict RFC 5322 compliance, more complex patterns exist, but this is ideal for form processing and data scraping.
Judge notes: Excellent response that directly addresses the query with working code, clear regex breakdown, practical explanation, and includes the exact example from the user's input—minor deduction only because RFC 5322 compliance note could acknowledge real-world edge cases more explicitly.
Claude Haiku 4.5 9/10
# Extracting Email Addresses in JavaScript

You can use `String.prototype.match()` with a robust regex pattern to extract all email addresses from your text.

## Regex Pattern Breakdown

```
/[a-zA-Z0-9._+%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g
```

- `[a-zA-Z0-9._+%-]+` – Local part (before @): alphanumerics, dots, underscores, plus, percent, hyphen
- `@` – Required separator
- `[a-zA-Z0-9.-]+` – Domain name: alphanumerics, dots, hyphens
- `\.` – Literal dot before TLD
- `[a-zA-Z]{2,}` – Top-level domain: 2+ letters
- `g` – Global flag to find all matches

## Working Code

```javascript
const text = 'Please contact john.doe@example.com or our support team at help_desk@company.co.uk. You can also reach Sarah at sarah.smith+work@gmail.com for urgent matters.';

const emailRegex = /[a-zA-Z0-9._+%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g;
const emails = text.match(emailRegex);

console.log(emails);
```

## Output

```javascript
[
  'john.doe@example.com',
  'help_desk@company.co.uk',
  'sarah.smith+work@gmail.com'
]
```

## Notes

- The regex handles common email formats including dots, underscores, and plus signs in the local part
- It requires a valid domain with at least a 2-letter TLD
- For RFC 5322 compliance, use a more complex pattern, but this covers 99% of real-world cases
- Always validate emails server-side for security
Judge notes: Excellently satisfies the query with practical, working code, clear regex breakdown, correct expected output, and helpful caveats—only minor gap is not mentioning edge cases like numbers-only domains or internationalized emails.
GPT-5.4 9/10
Use `String.prototype.match()` with a global email regex to pull all email addresses into an array.

```javascript
const text = "Please contact john.doe@example.com or our support team at help_desk@company.co.uk. You can also reach Sarah at sarah.smith+work@gmail.com for urgent matters.";

// Robust practical email regex
const emailRegex = /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/g;

const emails = text.match(emailRegex) || [];

console.log(emails);
```

**Expected output:**
```javascript
[
  "john.doe@example.com",
  "help_desk@company.co.uk",
  "sarah.smith+work@gmail.com"
]
```

**Regex breakdown:**
- `\b` — word boundary so matches start/end cleanly
- `[A-Za-z0-9._%+-]+` — local part before `@`  
  (letters, numbers, dot, underscore, percent, plus, hyphen)
- `@` — literal at symbol
- `[A-Za-z0-9.-]+` — domain name and subdomains
- `\.` — required dot before the TLD
- `[A-Za-z]{2,}` — top-level domain like `com`, `co.uk` works because `company.co.uk` is matched by the domain part plus final `.uk`
- `/g` — global flag to find **all** matches

If no emails are found, `match()` returns `null`, so `|| []` ensures you always get an array.
Judge notes: Excellent response that directly answers the query with working code, clear regex breakdown, correct expected output, and practical null-handling—only minor limitation is that the regex doesn't perfectly handle all RFC 5322 edge cases, but it's entirely appropriate for real-world use.
Gemini 2.5 Pro 9/10
Of course. You can extract all email addresses from a string using `String.prototype.match()` with a regular expression. The key is to use the global (`g`) flag to find all matches, not just the first one.

This code uses a robust regex to identify common email formats and returns them in an array.

```javascript
const text = 'Please contact john.doe@example.com or our support team at help_desk@company.co.uk. You can also reach Sarah at sarah.smith+work@gmail.com for urgent matters.';

// A robust regex for matching common email formats
const emailRegex = /\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b/gi;

const extractedEmails = text.match(emailRegex);

console.log(extractedEmails);
```

### Regex Breakdown:
*   `\b`: A word boundary to avoid matching parts of other words.
*   `[A-Z0-9._%+-]+`: The local part (username) before the `@`.
*   `@`: The literal "@" symbol.
*   `[A-Z0-9.-]+`: The domain name.
*   `\.[A-Z]{2,}`: The top-level domain (e.g., .com, .co.uk).
*   `gi`: Flags for global (find all) and case-insensitive matching.

### Output:
The code will produce the following array of extracted emails:
```json
[
  "john.doe@example.com",
  "help_desk@company.co.uk",
  "sarah.smith+work@gmail.com"
]
```
Judge notes: Excellent response that directly satisfies the query with a robust, working solution, clear regex breakdown, and correct expected output—minor deduction only for not mentioning regex limitations with edge cases.
Grok 4.1 Fast Reasoning 9/10
To extract email addresses from the given text, use `String.prototype.match()` with a global regex. This robust pattern matches common email formats including dots, underscores, plus signs, and subdomains.

```javascript
const text = `Please contact john.doe@example.com or our support team at help_desk@company.co.uk. You can also reach Sarah at sarah.smith+work@gmail.com for urgent matters.`;

const emailRegex = /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/g;

const emails = text.match(emailRegex) || [];
console.log(emails);
// Output: ['john.doe@example.com', 'help_desk@company.co.uk', 'sarah.smith+work@gmail.com']
```

**Regex breakdown** (`/…/g` for global matches):
- `\b`: Word boundary (avoids partial matches).
- `[A-Za-z0-9._%+-]+`: Local part (letters, digits, common symbols).
- `@`: Literal at-sign.
- `[A-Za-z0-9.-]+`: Domain (letters, digits, dots, hyphens).
- `\.`: Escaped dot.
- `[A-Za-z]{2,}`: TLD (2+ letters).
- `\b`: End boundary.

Copy-paste ready; handles the exact text perfectly (120 words).
Judge notes: Excellently satisfies the query with a practical, working solution, clear regex breakdown, correct output, and appropriate length; only minor limitation is that the regex doesn't handle all edge cases (like numeric TLDs or hyphens in local parts), but this is acceptable for most real-world scenarios.

What makes these work

  1. Use the global flag every time

    Without the g flag, String.prototype.match() returns only the first match along with its capture groups, not an array of all matches. Always write your regex literal as /pattern/g, or pass 'g' as the flags argument when you build the pattern with new RegExp(). Forgetting this is a common reason developers think their extraction is broken when the regex itself is fine.

  2. Guard against null before iterating

    match() returns null when there are no matches, not an empty array. Chaining .map() or .forEach() directly on the result without a null check throws a TypeError at runtime. Use the pattern (text.match(regex) || []) to safely default to an empty array in all cases.

  3. Strip HTML before running the regex

    Running an email regex directly on raw HTML will still find addresses in attribute values and text nodes, but it can return the same address twice (once from a mailto: href and once from the link text), and it misses addresses whose at-sign is encoded as &#64;. A quick text.replace(/<[^>]+>/g, ' ') pass before extraction removes the attribute duplicates without needing a full DOM parser. It does not decode entities, so replace those separately if your source uses them.

  4. Scope your TLD length to your expected input

    A pattern ending in [a-zA-Z]{2,} allows any TLD from two characters upward. If your input is controlled, tightening to {2,10} or {2,6} bounds the final quantifier, at the cost of truncating longer TLDs such as .photography. JavaScript's regex engine supports neither possessive quantifiers nor atomic groups, so for public-facing tools processing unknown input, cap the input length and avoid nested quantifiers instead.
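
The global flag, the null guard, and a bounded TLD can be combined into one small helper. This is a minimal sketch, not a definitive implementation: the name `extractEmails`, the `maxLength` option, its 100,000-character default, and the {2,24} TLD ceiling are all assumptions chosen for illustration.

```javascript
// g flag so match() returns every occurrence; TLD bounded at 24 letters (assumed ceiling).
const EMAIL_PATTERN = /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,24}/g;

// Hypothetical helper: the name, option, and default are illustrative.
function extractEmails(input, { maxLength = 100000 } = {}) {
  if (typeof input !== 'string') return [];
  // Cap the input so worst-case matching time stays bounded.
  const text = input.slice(0, maxLength);
  // match() returns null when nothing matches, so default to [].
  return text.match(EMAIL_PATTERN) || [];
}

const sample = 'Please contact john.doe@example.com or our support team at help_desk@company.co.uk.';
console.log(extractEmails(sample));
// → [ 'john.doe@example.com', 'help_desk@company.co.uk' ]
console.log(extractEmails('no addresses here'));
// → []
```

Note that slicing can cut an address in half at the boundary, so for large inputs it may be safer to split on line breaks and process each piece.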

More example scenarios

#01 · Extract emails from a pasted block of marketing copy
Input
Please reach out to our team at sales@acmecorp.com or support@acmecorp.com for pricing. For press inquiries contact jane.doe+press@media.acmecorp.co.uk. Do not reply to noreply@acmecorp.com.
Expected output
['sales@acmecorp.com', 'support@acmecorp.com', 'jane.doe+press@media.acmecorp.co.uk', 'noreply@acmecorp.com'] -- A regex like /[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}/g captures all four including the plus-tagged address and the two-part country-code TLD.
#02 · Pull contact emails from scraped HTML string in Node.js
Input
const html = '<p>Contact us: <a href="mailto:hello@startup.io">hello@startup.io</a> or billing@startup.io. &copy; 2024</p>'; // Need to extract both addresses from this raw HTML string without a DOM parser.
Expected output
Strip HTML tags first with html.replace(/<[^>]+>/g, ' '), then apply the email regex to the resulting plain text. Result: ['hello@startup.io', 'billing@startup.io']. Stripping the tags first keeps the address in the mailto: href from being returned a second time.
#03 · Deduplicate emails extracted from a server access log
Input
Log lines contain repeated entries like '2024-01-15 user=admin@corp.net action=login' and '2024-01-15 user=admin@corp.net action=export' along with 'user=dev@corp.net action=login'. Need unique emails only.
Expected output
const emails = [...new Set(logText.match(/[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}/g) || [])]; Returns ['admin@corp.net', 'dev@corp.net']. Wrapping match() output in a Set removes duplicates before spreading back to an array.
#04 · Extract and validate emails from a user-submitted contact list
Input
A textarea where users paste comma or newline separated contacts: 'Alice Smith <alice@example.com>, bob@example.com, Charlie <not-an-email>, delta@sub.domain.org'
Expected output
Regex extracts ['alice@example.com', 'bob@example.com', 'delta@sub.domain.org'] and ignores 'not-an-email' because it contains no at-sign. Display the matched list back to the user for confirmation before submission.
#05 · Browser extension collecting emails from the visible page text
Input
document.body.innerText contains a company directory page with hundreds of names and emails formatted as 'John Doe | j.doe@university.edu | Professor' repeated across many rows.
Expected output
Run (document.body.innerText.match(/[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}/g) || []) directly in the content script. It returns every address on the page in document order. Filter further with .filter(e => e.endsWith('.edu')) if you only want institutional addresses.

Common mistakes to avoid

  • Matching trailing punctuation

    A regex that ends with .+ or a similarly permissive class will grab the period at the end of a sentence as part of the match, turning 'email@example.com' into 'email@example.com.' with a trailing dot. End your pattern at the TLD segment and do not include punctuation characters in the final character class.

  • Forgetting case-insensitive flag

    Domains are case-insensitive, and although local parts are technically case-sensitive under RFC 5321, most providers ignore case there too. Real-world data often contains uppercase characters like 'Sales@Company.COM'. Without the i flag, your regex misses these unless you manually include A-Z ranges everywhere, making the pattern longer and harder to read.

  • Using extraction as validation

    A string that matches your email regex is not guaranteed to be a deliverable address. Extraction finds patterns that look like emails; it does not confirm the domain exists or the mailbox is active. Never skip a secondary validation step like a DNS MX lookup or a verification API before adding extracted addresses to a send list.

  • Not deduplicating results

    match() returns every occurrence including repeats. If an email appears ten times in your source text, you get it ten times in the array. Wrap results in new Set() and spread back to an array as a default habit, even when you think the input will not have duplicates.

  • Overly greedy patterns causing ReDoS

    Patterns like (\w+\.)*\w+@(\w+\.)+\w+ with nested quantifiers can trigger catastrophic backtracking on crafted input, hanging your event loop. Test your regex against adversarial strings like a very long sequence of 'aaaaaa@' before deploying any extraction logic to a public endpoint.
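
One way to act on that last point is to time a candidate pattern against inputs of growing size before shipping it. The sketch below is a rough probe, not a benchmark: `probePattern` is a hypothetical helper, the sizes are arbitrary, and a long run of letters with no at-sign is just one unfavorable input for the character-class pattern used on this page.

```javascript
// Character-class pattern used elsewhere on this page.
const candidate = /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g;

// Hypothetical helper: times one match() call per input size.
function probePattern(regex, makeInput, sizes) {
  return sizes.map((size) => {
    const input = makeInput(size);
    const start = performance.now();
    const matches = input.match(regex) || [];
    const ms = performance.now() - start;
    return { size, ms, matchCount: matches.length };
  });
}

// A long run of local-part characters that never reaches an "@".
const report = probePattern(candidate, (n) => 'a'.repeat(n), [1000, 2000, 4000]);
console.table(report);
```

If the time roughly doubles when the size doubles, growth is linear; if it quadruples, growth is quadratic; anything far steeper points to catastrophic backtracking. When probing a pattern with nested quantifiers, start with very small sizes, because the probe itself can hang.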

Frequently asked questions

What is the best regex to extract email addresses in JavaScript?

For most practical use cases, /[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}/gi covers the vast majority of real-world email formats. It handles plus-tagged addresses, subdomain senders, and common TLD lengths. Full RFC 5322 compliance requires a regex hundreds of characters long and is rarely worth the complexity outside of specialized mail processing tools.

How do I extract multiple emails from a string in JavaScript?

Use String.match() with the global flag: const emails = text.match(/[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}/gi) || []. The g flag tells match() to return all occurrences as an array instead of stopping at the first one. The || [] guard handles the case where no emails are found and match() returns null.

Can I extract emails from HTML in JavaScript without a library?

Yes. Strip HTML tags first with text.replace(/<[^>]+>/g, ' ') to remove markup, then run your email regex on the resulting plain text. This is fast, has no dependencies, and works in both browser and Node.js environments. For large documents or when you also need to navigate the DOM structure, a proper HTML parser like DOMParser or cheerio gives more control.
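
A minimal sketch of that strip-then-match approach follows. The helper name `extractEmailsFromHtml` is hypothetical, and decoding only the numeric at-sign entities (`&#64;` and `&#x40;`) is a simplifying assumption; real pages can obfuscate addresses in other ways that this does not handle.

```javascript
const EMAIL_PATTERN = /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g;

// Hypothetical helper: strips tags, decodes encoded at-signs, then matches.
function extractEmailsFromHtml(html) {
  const text = html
    .replace(/<[^>]+>/g, ' ')          // drop tags, including their attributes
    .replace(/&#(?:64|x40);/gi, '@');  // decode the numeric at-sign entities only
  return text.match(EMAIL_PATTERN) || [];
}

const html = '<p>Contact us: <a href="mailto:hello@startup.io">hello@startup.io</a> or billing@startup.io. &copy; 2024</p>';
console.log(extractEmailsFromHtml(html));
// → [ 'hello@startup.io', 'billing@startup.io' ]
console.log(extractEmailsFromHtml('<span>info&#64;example.org</span>'));
// → [ 'info@example.org' ]
```

Because the tags are dropped, an address that appears only inside an attribute (for example a mailto: link whose text is "Email us") will not be returned.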

How do I remove duplicate emails after extracting them?

Wrap the match result in a Set and spread it back: const unique = [...new Set(emails)]. Set automatically discards duplicate values. If you need case-insensitive deduplication, lowercase the array first: const unique = [...new Set(emails.map(e => e.toLowerCase()))] to treat 'User@Example.com' and 'user@example.com' as the same address.
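
If you would rather keep each address as it first appeared instead of lowercasing everything, a Map keyed on the lowercased form is one option. In this sketch `dedupeEmails` is a hypothetical helper name, and treating the whole address as case-insensitive is a practical assumption, not something every mail server guarantees.

```javascript
// Hypothetical helper: case-insensitive dedupe that keeps the first-seen spelling.
function dedupeEmails(emails) {
  const seen = new Map();
  for (const email of emails) {
    const key = email.toLowerCase();
    // Only the first occurrence of each key is stored.
    if (!seen.has(key)) seen.set(key, email);
  }
  // Map preserves insertion order, so results stay in first-seen order.
  return [...seen.values()];
}

console.log(dedupeEmails(['User@Example.com', 'user@example.com', 'dev@corp.net']));
// → [ 'User@Example.com', 'dev@corp.net' ]
```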

Is it safe to run email extraction regex on user-supplied input?

It depends on the pattern. Simple character-class-based regexes like the one recommended here are generally safe. Avoid patterns with nested quantifiers or alternation groups on unbounded input, as they can trigger ReDoS and hang your JavaScript event loop. Always test candidate patterns against pathological inputs like long strings of repeated characters before deploying to production.

How do I extract emails from a file in Node.js?

Read the file with fs.readFile() or fs.readFileSync(), convert the buffer to a string, then apply your regex with match(). For large files, use a readline stream to process line by line and push matches into an array to avoid loading the entire file into memory at once. Memory usage then grows with the number of matches rather than with the size of the file.
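
The line-by-line approach might look like the sketch below, which uses Node's built-in fs and readline modules. The function name `extractEmailsFromFile` and the log path in the usage comment are hypothetical, and the sketch assumes no address is split across a line break.

```javascript
const fs = require('fs');
const readline = require('readline');

const EMAIL_PATTERN = /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g;

// Hypothetical helper: streams a file and collects matches one line at a time.
async function extractEmailsFromFile(filePath) {
  const lines = readline.createInterface({
    input: fs.createReadStream(filePath, { encoding: 'utf8' }),
    crlfDelay: Infinity, // treat \r\n as a single line break
  });
  const emails = [];
  for await (const line of lines) {
    // match() returns null for lines without an address.
    emails.push(...(line.match(EMAIL_PATTERN) || []));
  }
  return emails;
}

// Usage (hypothetical path):
// extractEmailsFromFile('./access.log').then((emails) => console.log(emails));
```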
