Format and Clean Phone Numbers in Excel with AI

Tested prompts for clean up phone numbers in excel compared across 5 leading AI models.

BEST BY JUDGE SCORE Claude Opus 4.7 9/10

You have a spreadsheet full of phone numbers and none of them match. Some have country codes, some don't. Some use dashes, some use dots, some are jammed together as raw digits. A few have extensions tacked on, parentheses around area codes, or sneaky whitespace that breaks your VLOOKUPs. You need them all in one clean, consistent format so you can dedupe contacts, import to a CRM, or send an SMS campaign without errors.

Excel can do this with nested SUBSTITUTE, TEXT, and REGEX formulas, but writing them for every edge case is painful. A few malformed rows and your formula chain collapses. This page shows how to hand the job to an AI model: paste the messy column, describe the target format (E.164, US 10-digit, dashed, whatever), and get back a clean list you can drop straight back into your sheet.

Below you'll find the exact prompt that works, outputs from four different models so you can pick the best fit, and guidance on when this approach beats a formula and when it doesn't.

When to use this

Use AI to clean phone numbers when the mess is varied, the volume is manageable (under a few thousand rows), and you need judgment calls about country codes, extensions, or ambiguous formats. It's faster than writing regex for every variation and handles edge cases a formula would miss.

  • Merging contact lists from multiple sources with different formatting conventions
  • Preparing a CRM import where the target field requires E.164 format (+14155551234)
  • Cleaning a lead list before a Twilio, HubSpot, or Salesforce upload
  • Standardizing numbers scraped from forms where users typed freely
  • Flagging invalid or incomplete numbers so you can review them before sending

When this format breaks down

  • Datasets over 10,000 rows where token costs and latency make a Power Query or Python script cheaper
  • Regulated data (HIPAA, financial PII) where sending phone numbers to a third-party API violates policy
  • Cases where you need 100% deterministic output for audit trails, since models can occasionally hallucinate digits
  • Real-time validation inside a live workflow, where a dedicated library like libphonenumber is faster and more accurate

The prompt we tested

You are a data-cleaning assistant specializing in Excel phone number formatting. Clean and standardize the phone number data provided below.

Instructions: Standardize all phone numbers to the format (XXX) XXX-XXXX for US numbers, or +CC XXX XXX XXXX for international. Strip extensions, letters, and extra characters; flag any invalid or incomplete numbers as 'INVALID'. Always include at least one ready-to-use Excel formula (using SUBSTITUTE, TEXT, or REGEXREPLACE) and keep the explanation under 150 words.

Phone number data to clean:
Here are the phone numbers from my customer list column B: '555.123.4567', '(555) 123 4567 ext 42', '15551234567', '1-800-CALL-NOW', '+44 20 7946 0958', and '555-CALL'. I need them all cleaned up and consistent so I can import them into our CRM.

Return a cleaned table with the original value and the cleaned value side by side, then provide the exact Excel formula(s) a user can paste into their spreadsheet to automate this cleanup across a full column.

Model comparison

ModelQualityCostLatencyTokens
Claude Opus 4.7 9 $0.0579 11.2s 384 / 695
Claude Haiku 4.5 6 $0.0054 9.0s 267 / 1032
GPT-5.4Not run yet
Gemini 2.5 ProNot run yet
Grok 4.1 Fast Reasoning 8 $0.0003 86.8s 386 / 438

Outputs

Claude Opus 4.7 9/10
# Phone Number Cleanup Results

## Cleaned Table

| Original Value | Cleaned Value |
|---|---|
| `555.123.4567` | `(555) 123-4567` |
| `(555) 123 4567 ext 42` | `(555) 123-4567` |
| `15551234567` | `(555) 123-4567` |
| `1-800-CALL-NOW` | `INVALID` |
| `+44 20 7946 0958` | `+44 20 7946 0958` |
| `555-CALL` | `INVALID` |

## Excel Formula (Microsoft 365 / Excel 2021+)

Paste into **C2** and drag down:

```excel
=LET(
  raw, B2,
  digits, IFERROR(REGEXREPLACE(raw,"[^\d+]",""), ""),
  intl, LEFT(digits,1)="+",
  core, IF(intl, MID(digits,2,99), IF(LEN(digits)=11, MID(digits,2,10), digits)),
  IF(intl,
     "+" & LEFT(core,2) & " " & MID(core,3,2) & " " & MID(core,5,4) & " " & MID(core,9,4),
     IF(LEN(core)=10,
        "("&LEFT(core,3)&") "&MID(core,4,3)&"-"&RIGHT(core,4),
        "INVALID"))
)
```

## Legacy Excel Alternative (no REGEX)

```excel
=IFERROR(
  TEXT(VALUE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(B2,"-",""),".","")," ",""),"(",""),")","")),"(000) 000-0000"),
  "INVALID")
```

**Notes:** Letters (e.g., `CALL-NOW`) are rejected as ambiguous vanity numbers—flagged `INVALID` for manual review. International numbers keep their `+CC` prefix with spaced grouping.
Judge notes: Excellent execution with accurate cleaned results, two practical formulas (modern and legacy), proper handling of edge cases, and clear formatting—minor deduction only because the legacy formula doesn't perfectly handle all international formats and the explanation could briefly mention the LET function limitation in older Excel versions.
Claude Haiku 4.5 6/10
# Phone Number Cleaning Results

| Original | Cleaned |
|----------|---------|
| 555.123.4567 | (555) 123-4567 |
| (555) 123 4567 ext 42 | (555) 123-4567 |
| 15551234567 | (555) 123-4567 |
| 1-800-CALL-NOW | INVALID |
| +44 20 7946 0958 | +44 20 7946 0958 |
| 555-CALL | INVALID |

---

## Excel Formula

Paste this formula into column C to automate cleanup:

```excel
=IF(OR(ISNUMBER(SEARCH("[A-Z]",B2)),LEN(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(B2,"+",""),"-","")," ",""),".",""),"(",""),")",""),"ext",""))NOT IN (10,11,12)),"INVALID",IF(LEFT(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(B2,"+",""),"-","")," ",""),".",""),"(",""),")",""),"ext",""),1)="+","+"&MID(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(B2,"+",""),"-","")," ",""),".",""),"(",""),")",""),"ext",""),1,2)&" "&MID(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(B2,"+",""),"-","")," ",""),".",""),"(",""),")",""),"ext",""),3,2)&" "&MID(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(B2,"+",""),"-","")," ",""),".",""),"(",""),")",""),"ext",""),5,4)&" "&MID(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(B2,"+",""),"-","")," ",""),".",""),"(",""),")",""),"ext",""),9,4),"("&MID(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(IF(LEFT(B2,1)="1",MID(B2,2,20),B2),"+",""),"-","")," ",""),".",""),"(",""),")",""),"ext",""),1,3)&") "&MID(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(IF(LEFT(B2,1)="1",MID(B2,2,20),B2),"+",""),"-","")," ",""),".",""),"(",""),")",""),"ext",""),4,3)&"-"&MID(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(IF(LEFT(B2,1)="1",MID(B2,2,20),B2),"+",""),"-","")," ",""),".",""),"(",""),")",""),"ext",""),7,4)))
```

**Simpler Alternative** (for US numbers only):
```excel
=IF(LEN(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(B2,"-","")," ",""),".",""),"(",""),")",""),"ext",""))<>10,"INVALID","("&MID(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(B2,"-","")," ",""),".",""),"(",""),")",""),"ext",""),1,3)&") "&MID(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(B2,"-","")," ",""),".",""),"(",""),")",""),"ext",""),4,3)&"-"&RIGHT(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(B2,"-","")," ",""),".",""),"(",""),")",""),"ext",""),4))
```

The formula strips non-numeric characters, removes extensions, detects letters (invalid), validates length, and reformats accordingly.
Judge notes: The output delivers a correctly cleaned table with good results, but the Excel formulas are overly complex, difficult to implement, and lack practical usability guidance that most users would need to successfully apply them.
Grok 4.1 Fast Reasoning 8/10
### Cleaned Phone Numbers

| Original              | Cleaned           |
|-----------------------|-------------------|
| 555.123.4567         | (555) 123-4567   |
| (555) 123 4567 ext 42| (555) 123-4567   |
| 15551234567          | (555) 123-4567   |
| 1-800-CALL-NOW       | INVALID          |
| +44 20 7946 0958     | +44 207 946 0958 |
| 555-CALL             | INVALID          |

### Excel Formula
Paste this into C1 (assuming data in A1:A6 or column A) and drag down. Requires Excel 365 (REGEX/LET). Handles stripping extensions/letters/extra chars, US normalization (drop leading 1), intl spacing, flags invalid lengths.

```
=LET(
  step1, REGEXREPLACE(UPPER(A1), "\s*EXT.*", ""),
  step2, REGEXREPLACE(step1, "[A-Z]", ""),
  cleaned, REGEXREPLACE(step2, "[^0-9+]", ""),
  IF(LEFT(cleaned,1)="+",
    IF(LEN(cleaned)<13,"INVALID",
      "+" & LEFT(cleaned,3) & " " & MID(cleaned,4,3) & " " & MID(cleaned,7,3) & " " & MID(cleaned,10,4)
    ),
    LET(
      us_core, IF(AND(LEN(cleaned)=11,LEFT(cleaned,1)="1"),MID(cleaned,2,10),cleaned),
      IF(LEN(us_core)=10,
        "(" & LEFT(us_core,3) & ") " & MID(us_core,4,3) & "-" & RIGHT(us_core,4),
        "INVALID"
      )
    )
  )
)
```

(78 words: Formula strips invalid chars first, detects + for intl (+CC XXX XXX XXXX assuming 2-digit CC +10 digits), else US 10-digit format. Invalid if wrong length.)
Judge notes: Excellent delivery with accurate cleaned table, proper formatting standards, and a sophisticated Excel formula that handles both US and international numbers; minor issue is the international format assumes 2-digit country codes which may not cover all cases, and the formula complexity might challenge some users despite being correct.

What makes these work

  1. 01
    Specify the exact target format

    Say 'E.164' or 'XXX-XXX-XXXX' or '+CC XXX XXX XXXX' explicitly. Models default to whatever looks common in your input, which is rarely what you want. Show one example of the desired output in the prompt.

  2. 02
    Handle country codes deliberately

    Tell the model what to assume when a country code is missing. 'Assume US (+1) if no country code is present' prevents it from guessing or leaving numbers ambiguous. For international lists, point it to a country column.

  3. 03
    Ask for a flag column on invalid rows

    Request a second column with 'INVALID' or 'REVIEW' for numbers that are too short, spelled out, or contain multiple numbers. This beats having the model silently drop or fabricate digits.

  4. 04
    Process in batches of 200-500 rows

    Larger batches hit token limits and model attention drops, meaning more errors toward the end. Split the column, run each batch, paste results back. Most models stay accurate at this size.

More example scenarios

#01 · Mixed US formats to E.164
Input
Column A has: (415) 555-1234, 415.555.9876, 4155550001, 1-415-555-2222, +1 415 555 3333, 415/555/4444 ext 12. Convert all to E.164 format for a Twilio upload.
Expected output
+14155551234, +14155559876, +14155550001, +14155552222, +14155553333, +14155554444. Note: row 6 had an extension (ext 12) which was dropped since E.164 does not support extensions. Flag this row if the extension matters for your use case.
#02 · International contacts with missing country codes
Input
A European sales list: 020 7946 0958 (UK), 01 42 86 82 00 (France), +49-30-12345678, 3 5555 1234 (Japan, no code), 415-555-1234 (US). Assume country based on context column B.
Expected output
+442079460958, +33142868200, +493012345678, +81355551234, +14155551234. Leading zeros from national trunk prefixes were stripped. Row 4 required inferring +81 from the Japan context label before prepending.
#03 · Scraped form data with garbage characters
Input
User-submitted phone field: 'call me at 555.123.4567 please', '(555) 123 4567 x205', '555-123-4567/555-987-6543', 'five five five one two three four five six seven', '+1 (555) 123-4567'.
Expected output
5551234567, 5551234567 ext 205, 5551234567 (second number 5559876543 flagged for review), INVALID (spelled out digits, needs manual entry), +15551234567. Two rows need human review before import.
#04 · Dashed format for a mail merge
Input
Raw digits: 4155551234, 2125559876, 13105550001, 4155552222extension45. Output format: XXX-XXX-XXXX, drop country code if it's 1, keep extensions in a separate column.
Expected output
Column A: 415-555-1234, 212-555-9876, 310-555-0001, 415-555-2222. Column B (extension): blank, blank, blank, 45. Country code 1 was stripped from row 3.
#05 · Deduping after cleanup
Input
Clean these and flag duplicates: +1-415-555-1234, (415) 555-1234, 4155551234, 415.555.1235, 14155551234.
Expected output
All rows except row 4 normalize to +14155551234 and are duplicates of each other. Row 4 is +14155551235, a distinct number. Keep one instance of +14155551234 and the single +14155551235 entry.

Common mistakes to avoid

  • Not specifying extension handling

    E.164 does not support extensions but your users entered them. If you don't tell the model what to do, it may drop them silently or concatenate them into the main number. Always specify: drop, keep in separate column, or flag.

  • Trusting the output without spot-checking

    Models occasionally transpose or hallucinate a digit, especially on long or garbled inputs. Spot-check 10-20 rows against the source before importing to a CRM or sending any SMS campaign.

  • Mixing regions without context

    Pasting a list with US, UK, and Brazilian numbers and asking for 'E.164' without telling the model which row is which country produces wrong country codes. Include a country column or split the list by region.

  • Ignoring leading zeros

    Excel strips leading zeros from numeric cells, which breaks UK and European numbers. Format the column as Text before pasting cleaned output back, or the '+44' becomes '44' and your dialing breaks.

  • Running it on 10,000 rows at once

    Most chat models will truncate output, skip rows, or slow down dramatically. You'll think it worked until you scroll to the bottom. Batch it or switch to a script-based approach for large files.

Related queries

Frequently asked questions

What's the best Excel formula to clean phone numbers without AI?

For simple cleanup, =SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1,"(",""),")",""),"-","")," ","") strips common characters. For more control, Excel 365 has REGEXREPLACE: =REGEXREPLACE(A1,"[^0-9]","") keeps only digits. These work for uniform input but struggle with extensions, country codes, and mixed formats.

How do I format phone numbers as (XXX) XXX-XXXX in Excel?

Select the column, open Format Cells (Ctrl+1), choose Special, and pick Phone Number. This only works if the cells contain exactly 10 digits with no other characters. Clean the raw data first (strip non-digits) and then apply the format.

How do I add a country code to phone numbers in Excel?

Use ="+1"&A1 for a single country, assuming A1 already has 10 clean digits. For mixed countries, use an IF or VLOOKUP against a country column: ="+"&VLOOKUP(B1,CountryCodes,2,FALSE)&A1. AI is faster when your country assignments are implicit rather than in a separate column.

Can AI handle international phone numbers correctly?

Yes, if you give it the country context. GPT-4, Claude, and Gemini all know national trunk prefixes (the leading 0 in UK and most European numbers) and will strip them when converting to E.164. Without a country column or explicit instruction, they guess based on length, which is unreliable.

Will AI make up digits when cleaning phone numbers?

It can, rarely. The risk is higher with messy input (spelled-out digits, numbers split across words, OCR artifacts). Always spot-check and ask the model to flag rows it's uncertain about rather than forcing an answer. Never send a cleaned list to an SMS campaign without reviewing the flagged rows.

How do I remove duplicate phone numbers after cleaning?

Once every number is in a single consistent format (E.164 is best), use Excel's Remove Duplicates feature under the Data tab. Normalization is the hard part because (415) 555-1234 and +14155551234 are the same number but Excel sees them as different strings. Clean first, dedupe second.