# Excel Copilot Cleanup Guide: CRM Contact List
1. **Back up your data first.** Duplicate the sheet (right-click tab → Move or Copy → check "Create a copy") so Copilot edits happen on a working copy, not your original export.
2. **Convert the range to a Table.** Select your data and press `Ctrl+T`; Copilot works far more reliably on structured tables than raw ranges.
3. **Standardize the Name column into First and Last.** Open the Copilot pane and prompt: *"Split the Name column into two new columns called First Name and Last Name. If a value contains a comma, treat it as 'Last, First'; otherwise treat it as 'First Last'. Trim any extra spaces."* Manual fallback: insert two columns and use `=TRIM(IFERROR(MID(A2,FIND(", ",A2)+2,99),LEFT(A2,FIND(" ",A2)-1)))` for First and a mirrored formula for Last.
4. **Normalize phone numbers to a clean format.** Prompt Copilot: *"In the Phone column, remove all parentheses, dashes, spaces, and dots, then reformat every value as (XXX) XXX-XXXX. Leave blanks as blanks."* Manual fallback: use a helper column with `=TEXT(VALUE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(B2,"(",""),")",""),"-","")," ","")),"(000) 000-0000")`.
5. **Lowercase all emails and trim whitespace.** Prompt: *"Convert every value in the Email column to lowercase and trim leading or trailing spaces."* Manual fallback: helper column with `=LOWER(TRIM(C2))`, then paste-values back over the original.
6. **Remove duplicate emails.** Prompt: *"Remove duplicate rows based on the Email column, keeping the first occurrence."* Manual fallback: Data tab → Remove Duplicates → check only the Email column.
7. **Flag missing critical fields for mail merge.** Prompt: *"Add a column called Issues that lists any of the following problems per row: missing first name, missing last name, missing or invalid email, phone not 10 digits."* Review and fix flagged rows manually.
8. **Sort and finalize for mail merge.** Prompt: *"Sort the table alphabetically by Last Name, then First Name, and hide any helper columns."* Save the file as a new `.xlsx` specifically for the merge.
---
**Verification checklist**
- ✅ Spot-check 10 random rows: names split correctly regardless of original format, and phone numbers all match `(XXX) XXX-XXXX`.
- ✅ Run `=COUNTA(Email)` vs `=SUMPRODUCT(1/COUNTIF(Email,Email))` — the two numbers should now be equal (no duplicates).
- ✅ Confirm the Issues column is empty (or resolved) for every row before starting the mail merge.
Use Microsoft Copilot to Clean Excel Data Automatically
Tested prompts for Excel Copilot data cleanup, compared across 5 leading AI models.
If you're searching for 'excel copilot data cleanup', you probably have a spreadsheet with inconsistent formatting, duplicate rows, mixed date formats, trailing spaces, or mismatched casing, and you want Microsoft Copilot in Excel to fix it without writing formulas or Power Query steps yourself. The good news: Copilot can identify dirty data patterns and then suggest formulas, trigger built-in cleanup actions, or generate Power Query M code for repeatable cleanup.
The catch is that Copilot's quality depends heavily on how you describe the mess. 'Clean this data' gets you vague suggestions. 'Standardize the Phone column to (XXX) XXX-XXXX format, remove rows where Email is blank, and trim whitespace from all text columns' gets you working output. This page shows tested prompts and compares how four models handle the same cleanup task, so you can see which one actually ships usable results for your sheet.
Below you'll find when this workflow fits, when you should reach for Power Query or a Python script instead, realistic cleanup examples across different data types, and the mistakes that waste the most time.
When to use this
Use Copilot in Excel for data cleanup when your dataset sits in a formatted Excel table, the problems are describable in plain English, and you want results in minutes rather than building a repeatable pipeline. It shines for one-off cleanups, exploratory tidying before analysis, and generating formulas you can reuse.
- Standardizing inconsistent text entries (casing, whitespace, abbreviations) across a column
- Finding and flagging duplicates based on fuzzy criteria like name plus email
- Splitting or merging columns when the pattern is describable (full name into first/last, address into parts)
- Converting mixed date or number formats into a single standard
- Generating formulas or conditional formatting rules to catch future dirty entries
When this format breaks down
- Datasets that approach or exceed Excel's worksheet limit of 1,048,576 rows, or anything Copilot refuses to process, where Power Query or Python pandas is faster and more reliable
- Cleanup that needs to run on a schedule or against new files weekly, where a Power Query or Office Script is more maintainable
- Highly regulated data (PHI, financial records) where sending content to cloud AI violates your org's policy
- Cleanup that requires external lookup, like validating addresses against USPS or matching company names to a CRM
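For the cases above where a script is the better fit, the contact-list cleanup used on this page (split names, normalize phones, lowercase and dedupe emails) can be sketched in plain Python. This is a minimal illustration, not a drop-in tool: the function names are hypothetical, the column names `Name`, `Phone`, and `Email` are assumed to match your export, and phone numbers are assumed to be 10-digit US numbers.

```python
import re

def split_name(raw: str) -> tuple:
    """Return (first, last) from 'Last, First' or 'First Last' input."""
    name = " ".join(raw.split())            # trim and collapse repeated spaces
    if "," in name:
        last, _, first = name.partition(",")
        return first.strip(), last.strip()
    first, _, last = name.rpartition(" ")   # split on the final space
    return (first, last) if first else (last, "")

def format_phone(raw: str) -> str:
    """Format as (XXX) XXX-XXXX; return '' when there are not exactly 10 digits."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) != 10:
        return ""
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"

def clean_contacts(rows: list) -> list:
    """Normalize each row and keep the first occurrence of every email."""
    seen = set()
    cleaned = []
    for row in rows:
        email = row["Email"].strip().lower()
        if email and email in seen:
            continue                         # case-insensitive duplicate
        if email:
            seen.add(email)
        first, last = split_name(row["Name"])
        cleaned.append({
            "First Name": first,
            "Last Name": last,
            "Phone": format_phone(row["Phone"]),
            "Email": email,
        })
    return cleaned
```

Rows with a blank email are kept rather than collapsed into one, so they can still be reviewed by hand before a mail merge.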
The prompt we tested
You are an expert Microsoft Excel Copilot assistant specializing in data cleanup. A user will describe their messy Excel dataset and what they need fixed. Your job is to produce a clear, step-by-step Copilot workflow they can follow inside Excel to clean the data.
Follow these instructions strictly: Respond with a numbered step-by-step guide (5-10 steps) that includes the exact Copilot prompts to type into the Excel Copilot pane (in quotes), plus any manual fallback actions if Copilot cannot perform the step. Cover issues like duplicates, inconsistent formatting, missing values, split/merged columns, and trimming whitespace when relevant. Keep each step under 2 sentences and end with a short 'Verification checklist' of 3 bullet points.
User's data situation: I have an Excel sheet with 4,000 rows of customer contacts exported from our CRM. The Name column mixes 'Last, First' and 'First Last' formats, phone numbers have random dashes, spaces, and parentheses, and there are obvious duplicate emails with different capitalization. I need it cleaned up for a mail merge.
Produce the Copilot data cleanup guide now.
Model comparison
| Model | Quality | Cost | Latency | Tokens |
|---|---|---|---|---|
| Claude Opus 4.7 | 9 | $0.0832 | 16.9s | 407 / 1028 |
| Claude Haiku 4.5 | 8 | $0.0046 | 9.6s | 281 / 864 |
| GPT-5.4 | Not run yet | | | |
| Gemini 2.5 Pro | 8 | $0.0027 | 17.7s | 240 / 487 |
| Grok 4.1 Fast Reasoning | 8 | $0.0003 | 9.0s | 383 / 446 |
Outputs
# Excel Copilot Data Cleanup Guide: Customer Contacts
## Step-by-Step Workflow
**Step 1: Assess and Organize Your Data**
First, select your entire dataset (Ctrl+A) and apply AutoFilter (Data > Filter) to see column headers clearly. Type this prompt into the Copilot pane: *"Show me a summary of the data quality issues in my customer contacts sheet—count duplicates, inconsistent formats, and missing values."*
**Step 2: Standardize Name Format**
Create two helper columns next to your Name column labeled "First" and "Last." Type: *"Split the Name column into First and Last name, handling both 'Last, First' and 'First Last' formats. Put results in columns B and C."* If Copilot struggles, manually use Data > Text to Columns on a subset first to show the pattern.
**Step 3: Merge Names Back to Consistent Format**
In a new column, type: *"Create a formula that combines the First and Last name columns into 'First Last' format for all 4,000 rows, then convert the formulas to values."* Copy the new column and paste as values over your original Name column, then delete the helper columns.
**Step 4: Standardize Phone Numbers**
Select your Phone column and prompt: *"Remove all dashes, spaces, and parentheses from phone numbers and format them consistently as (XXX) XXX-XXXX for a US phone list."* If Copilot uses formulas, convert results to values before deleting the original column.
**Step 5: Remove Duplicate Emails with Case-Insensitive Matching**
Highlight your Email column and type: *"Identify and flag duplicate emails regardless of uppercase/lowercase differences, keeping only the first occurrence of each unique email address."* Manually review flagged rows before deletion, then use Data > Remove Duplicates if needed for final cleanup.
**Step 6: Trim Whitespace from All Text Columns**
Select all text columns (Name, Email, Phone, Address) and prompt: *"Remove leading and trailing spaces from all cells in these columns across all 4,000 rows."* Verify the action completed by clicking a few cells to confirm no extra spaces remain.
**Step 7: Handle Missing Values**
Type: *"Show me which rows have blank Email or Phone fields and highlight them for review."* Manually decide whether to delete incomplete records or request missing data from your CRM; mark any retained blanks with "N/A" for consistency.
**Step 8: Convert Email to Lowercase**
Select the Email column and prompt: *"Convert all email addresses to lowercase to ensure consistency for mail merge."* Convert formulas to values once complete.
**Step 9: Final Deduplication and Sort**
Type: *"Remove any remaining duplicate rows based on Email address, and sort the final dataset alphabetically by Last Name."* Review the row count before/after to confirm duplicates were removed.
**Step 10: Verification and Export**
Prompt: *"Verify data quality: confirm no blank emails exist, all phone numbers follow (XXX) XXX-XXXX format, and all names are in 'First Last' format."* Save as .xlsx and test a sample mail merge with 10 rows to confirm formatting works.
---
## Verification Checklist
- ✓ **Name Format**: All names display as "First Last" with consistent capitalization and no commas
- ✓ **Phone Numbers**: All phones match (XXX) XXX-XXXX format with no dashes, spaces, or parentheses irregularities
- ✓ **Email Duplicates Removed**: No duplicate emails exist (case-insensitive), and all emails are lowercase with no leading/trailing spaces
Of course. Here is a step-by-step Copilot workflow to clean your customer contact data for a mail merge.
***
1. **Select your data and trim whitespace.**
First, click on any cell within your data range and press Ctrl+A to select it, then click "Format as Table" on the Home tab. Next, ask Copilot: `"Trim whitespace from all columns in the table."` This removes leading or trailing spaces that can cause issues with duplicates and formatting.
2. **Standardize and deduplicate emails.**
Ask Copilot to normalize the email addresses to prevent case-sensitive duplicates. Prompt: `"Create a new column named 'Clean_Email' that converts the 'Email' column to all lowercase, then remove duplicate rows based on this new column, keeping the first instance."`
3. **Split the mixed-format Name column.**
Copilot can recognize different name patterns and separate them correctly. Prompt: `"Split the 'Name' column into two new columns, 'First_Name' and 'Last_Name'."` If Copilot struggles, try giving it an example: `"Split the 'Name' column like this: 'Smith, John' becomes 'John' in 'First_Name' and 'Smith' in 'Last_Name'."`
4. **Standardize all phone numbers.**
This step will strip out all formatting, leaving only digits for a consistent phone list. Ask Copilot: `"Create a new 'Clean_Phone' column by removing all non-numeric characters from the 'Phone' column."`
5. **Identify any missing contact information.**
Find empty cells that could cause errors in your mail merge before you proceed. Prompt: `"Highlight all blank cells in the table in yellow."` For a manual fallback, use the Home tab > Find & Select > Go To Special > Blanks.
6. **Finalize your clean dataset.**
To finish, hide the old, messy columns, leaving only the clean data for your mail merge. Prompt: `"Hide the original 'Name', 'Email', and 'Phone' columns."`
***
### Verification Checklist:
* The `First_Name` and `Last_Name` columns are correctly separated for all rows.
* The `Clean_Phone` column
1. **Trim whitespace across all columns**: In the Copilot pane, type: "Trim leading and trailing spaces from all text columns in my table." If Copilot doesn't apply to all, manually select data > Data tab > Text to Columns > Finish (as a delimiter reset), then use Find & Replace for spaces.
2. **Standardize emails to lowercase**: In Copilot, type: "Create a new column converting the Email column to lowercase using LOWER function, then copy values back to original." Verify by sorting the Email column for consistency.
3. **Remove duplicate emails**: In Copilot, type: "Remove duplicate rows based on the Email column, keeping the first occurrence." Manually fallback: Data tab > Remove Duplicates > select Email column only.
4. **Clean phone numbers**: In Copilot, type: "Standardize the Phone column by removing dashes, spaces, parentheses, and formatting as (XXX) XXX-XXXX using SUBSTITUTE and TEXT functions in a new column." Paste values back and delete original if needed.
5. **Split Name column into First and Last**: In Copilot, type: "Split the Name column into First Name and Last Name: if comma present, split after comma and reverse; otherwise split on last space. Create new columns." Manually fallback: Add columns, use formulas like =TRIM(LEFT(Name,FIND(",",Name)-1)) for Last if comma.
6. **Fill missing values (if any)**: In Copilot, type: "Identify rows with missing values in key columns (Name, Email, Phone) and suggest filling with 'Unknown' or averages where appropriate." Manually: Use Go To Special > Blanks, then type formula like ="Unknown".
7. **Final deduplication and sort**: In Copilot, type: "Remove any remaining duplicates based on Email and Phone, then sort by Last Name ascending." Convert table to range if needed for mail merge.
**Verification checklist**:
- All emails are lowercase with no duplicates (check unique count).
- Phones match (XXX) XXX-XXXX format (spot-check 10 rows).
- Names split correctly into First/Last with no 'Last, First' leftovers (sort and scan).
What makes these work
1. Format your range as a table first
   Copilot works dramatically better on Excel tables (Ctrl+T) than on raw ranges because it can reference columns by name. Convert before prompting, and give columns clear headers like 'CustomerEmail' instead of 'Column B'.
2. Describe the mess, not just the goal
   Say what's wrong ('some entries have trailing spaces, some use abbreviations like Corp. vs Corporation') before saying what you want. Copilot's formula choices depend on knowing the specific failure modes, not just the target state.
3. Ask for a preview column, not in-place edits
   Have Copilot generate a new column with the cleaned value so you can compare side by side. Once you verify the output matches expectations across edge cases, copy-paste-values over the original. This prevents silent data loss.
4. Request Power Query for anything recurring
   If you'll clean this same export next month, explicitly ask 'give me the Power Query M code for this' rather than a formula. You get a refreshable pipeline instead of a one-shot fix.
More example scenarios
A 4,000-row Excel table from HubSpot with a Company column containing entries like 'acme corp', 'ACME Corporation', 'Acme Corp.', 'acme  corp' (double space). I want to normalize these so all Acme variants become 'Acme Corp' and all other companies use Title Case with trailing punctuation removed.
Copilot suggests adding a column with =PROPER(TRIM(SUBSTITUTE([@Company],".",""))) for general normalization, then a nested IF or SWITCH to map known variants: =IF(ISNUMBER(SEARCH("acme",[@Company])),"Acme Corp",PROPER(TRIM(SUBSTITUTE([@Company],".","")))). It also offers a Power Query alternative using Text.Trim, Text.Proper, and a Replace Values step.
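The same mapping logic can be sketched outside Excel to check what the formula should produce on a few sample values. This is an illustrative Python equivalent, not Copilot's output: `normalize_company` and the `aliases` mapping are hypothetical names, and unlike the formula it strips only trailing punctuation instead of every period.

```python
def normalize_company(raw: str, aliases: dict) -> str:
    """Map known variants to a canonical name; Title Case everything else."""
    name = " ".join(raw.split())     # trim and collapse double spaces
    name = name.rstrip(".,;")        # drop trailing punctuation only
    lowered = name.lower()
    for needle, canonical in aliases.items():
        if needle in lowered:        # substring match, like SEARCH("acme", ...)
            return canonical
    return name.title()

aliases = {"acme": "Acme Corp"}
for value in ["acme corp", "ACME Corporation", "Acme Corp.", "acme  corp", "globex inc."]:
    print(repr(value), "->", normalize_company(value, aliases))
```

A substring match is deliberately crude: it would also rewrite an unrelated company whose name happens to contain "acme", so the alias list is worth reviewing against the real data.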
Column B has 2,300 submission dates in formats like '3/5/2024', '2024-03-05', 'March 5, 2024', and '5-Mar-24'. Convert all to YYYY-MM-DD so I can sort chronologically.
Copilot recommends a helper column with =TEXT(DATEVALUE([@SubmittedAt]),"yyyy-mm-dd") and flags rows where DATEVALUE fails so you can fix them manually. For the problem rows, it suggests an IFERROR wrapper and a separate 'NeedsReview' column marking which entries to inspect.
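For comparison, a small Python sketch of the same idea: try a fixed list of formats in order and return nothing when none matches, so failures can be routed to a review column. The format list is an assumption based on the four examples in the prompt, slash dates are read month-first (US), and `to_iso` is a hypothetical helper name. Month-name parsing with `%B` and `%b` depends on the process locale and is assumed to be English here.

```python
from datetime import datetime
from typing import Optional

# Assumed input formats, tried in order. Slash dates are read as month/day/year.
FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y", "%d-%b-%y")

def to_iso(raw: str) -> Optional[str]:
    """Return the date as YYYY-MM-DD, or None if no assumed format matches."""
    text = raw.strip()
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue                 # try the next format
    return None                      # caller marks the row as needing review

for value in ["3/5/2024", "2024-03-05", "March 5, 2024", "5-Mar-24", "next Tuesday"]:
    print(value, "->", to_iso(value))
```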
3,100 rows with Name, Email, and Phone. Same customer appears multiple times with slight variations: 'John Smith' vs 'john smith' vs 'J. Smith', same email with different capitalization. Flag duplicates based on normalized email, keep the row with the most complete data.
Copilot adds a NormalizedEmail column with =LOWER(TRIM([@Email])), then a CompletenessScore column counting non-blank fields, then uses COUNTIF on NormalizedEmail to flag duplicates and RANK on score within each duplicate group. It suggests filtering where Rank>1 and deleting, or using Remove Duplicates on NormalizedEmail after sorting by score descending.
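The "keep the most complete row per normalized email" rule can be expressed compactly in Python, which may help when checking what the Excel columns are supposed to compute. This is a sketch under assumptions: rows are dictionaries, completeness is simply the count of non-blank fields, ties keep the earlier row, and `dedupe_keep_most_complete` is a hypothetical name.

```python
def dedupe_keep_most_complete(rows: list, key: str = "Email") -> list:
    """Keep one row per normalized key value: the one with the most non-blank fields."""
    best = {}                                    # normalized key -> (score, row)
    for row in rows:
        norm = row[key].strip().lower()          # same idea as LOWER(TRIM(...))
        score = sum(1 for v in row.values() if str(v).strip())
        if norm not in best or score > best[norm][0]:
            best[norm] = (score, row)            # strictly better rows replace earlier ones
    return [row for _, row in best.values()]

rows = [
    {"Name": "J. Smith", "Email": "John@X.com", "Phone": ""},
    {"Name": "John Smith", "Email": "john@x.com ", "Phone": "555-0100"},
    {"Name": "Ann Lee", "Email": "ann@x.com", "Phone": ""},
]
for row in dedupe_keep_most_complete(rows):
    print(row["Name"], row["Email"])
```

Groups come back in the order their email first appeared, so the result stays easy to compare against the original sheet.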
Column D contains values like '$1,234.56', '1234.56 USD', '€1.234,56', and '-$500.00'. I need a clean numeric column so I can sum and pivot.
Copilot generates =IFERROR(VALUE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE([@Amount],"$",""),",","")," USD","")),0) for USD-formatted values, and flags the Euro-formatted rows (which use comma as decimal) for separate handling with a conditional formula that swaps comma and period. It also suggests adding a Currency column to preserve the original unit.
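As a cross-check on the formula logic, here is a hedged Python sketch of the same parsing. `parse_amount` is a hypothetical helper, and the `decimal_comma` flag is an assumption about how you would mark the Euro-style rows, for example from a separate Currency column.

```python
import re
from decimal import Decimal, InvalidOperation
from typing import Optional

def parse_amount(raw: str, decimal_comma: bool = False) -> Optional[Decimal]:
    """Parse a currency string into a Decimal, or None if it cannot be parsed."""
    text = re.sub(r"[^\d.,\-]", "", raw)     # drop symbols, letters, spaces
    if decimal_comma:
        # European style: '.' groups thousands, ',' is the decimal mark
        text = text.replace(".", "").replace(",", ".")
    else:
        # US style: ',' groups thousands
        text = text.replace(",", "")
    try:
        return Decimal(text)
    except InvalidOperation:
        return None

print(parse_amount("$1,234.56"))
print(parse_amount("1234.56 USD"))
print(parse_amount("€1.234,56", decimal_comma=True))
print(parse_amount("-$500.00"))
```

Returning `None` rather than 0 keeps unparseable rows distinguishable from genuine zero amounts when you later sum or pivot.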
SKU column has 12,000 entries that should all be format 'ABC-12345' but many are 'abc12345', 'ABC 12345', 'ABC_12345', or have leading zeros missing. Standardize to uppercase letters, dash, 5-digit number with leading zeros.
Copilot suggests a regex-style cleanup using multiple nested SUBSTITUTE calls plus UPPER, then TEXT to pad the numeric portion: =UPPER(LEFT([@SKU],3))&"-"&TEXT(VALUE(RIGHT(SUBSTITUTE(SUBSTITUTE([@SKU],"-",""),"_",""),5)),"00000"). For edge cases it recommends flagging rows where the letter prefix isn't exactly 3 characters.
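Where a real regular expression is available, the target format can be stated directly. The sketch below is an illustration in Python rather than anything Copilot produced: `normalize_sku` is a hypothetical name, and it assumes a valid SKU is exactly three letters followed by one to five digits, with optional spaces, dashes, or underscores in between.

```python
import re
from typing import Optional

# Three letters, optional separators, then 1-5 digits (short numbers get zero-padded).
SKU_PATTERN = re.compile(r"\s*([A-Za-z]{3})[\s\-_]*(\d{1,5})\s*")

def normalize_sku(raw: str) -> Optional[str]:
    """Return the SKU as 'ABC-12345', or None if it does not fit the assumed pattern."""
    match = SKU_PATTERN.fullmatch(raw)
    if match is None:
        return None                          # flag for manual review
    prefix, number = match.groups()
    return f"{prefix.upper()}-{int(number):05d}"

for value in ["abc12345", "ABC 12345", "ABC_12345", "abc-123", "AB-12345"]:
    print(repr(value), "->", normalize_sku(value))
```

Matching the digits separately from the prefix means entries with missing leading zeros, such as 'abc-123', are padded instead of pulling letters into the numeric part.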
Common mistakes to avoid
- Prompting on unformatted ranges
  Copilot often refuses or gives generic advice when data isn't in a table. You lose 10 minutes trying to debug prompts when the real fix is one Ctrl+T away.
- Trusting the first output without spot-checks
  Copilot formulas frequently work on the first 50 rows and silently fail on edge cases (blank cells, non-English characters, negative numbers). Always sort by the cleaned column ascending and descending and scan the extremes.
- Asking for too many cleanups in one prompt
  'Clean dates, fix names, dedupe emails, standardize phones' produces mediocre output on all four. Split into separate prompts and chain the results, verifying each step before moving on.
- Ignoring the locale problem
  European vs US number and date formats look similar but parse differently. If you tell Copilot 'these are dates' without specifying the source locale, you'll swap months and days on ambiguous entries like 03/05/2024 and not notice for weeks.
- Overwriting the source column
  Telling Copilot to 'replace the values in column C' means you lose the original. If the cleanup is wrong, you can't recover without re-importing. Always output to a new column first.
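To make the locale problem above concrete, this short Python sketch parses a slash date both ways and reports whether the two readings disagree. The helper names are hypothetical, and it assumes four-digit years with `/` as the separator.

```python
from datetime import datetime

def slash_date_readings(raw: str) -> dict:
    """Return every valid reading of a slash date, keyed by field order."""
    readings = {}
    for label, fmt in (("month_first", "%m/%d/%Y"), ("day_first", "%d/%m/%Y")):
        try:
            readings[label] = datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            pass                      # this ordering is impossible for the value
    return readings

def is_ambiguous(raw: str) -> bool:
    """True when month-first and day-first parsing give different dates."""
    return len(set(slash_date_readings(raw).values())) > 1

for value in ["03/05/2024", "13/05/2024", "05/05/2024"]:
    print(value, slash_date_readings(value), "ambiguous:", is_ambiguous(value))
```

Counting how many entries in a column are ambiguous is a quick way to judge how much damage a wrong locale assumption could do before any conversion runs.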
Frequently asked questions
Does Excel Copilot actually clean data or just suggest formulas?
Both, depending on your prompt. For simple operations like 'remove duplicates' or 'trim whitespace' it can execute directly. For anything requiring logic it generates formulas, Power Query steps, or conditional formatting rules that you apply. It doesn't reach into arbitrary cells and overwrite them without showing you what it's doing.
Do I need a Microsoft 365 Copilot license to clean Excel data this way?
Yes, the in-app Copilot pane in Excel requires a Microsoft 365 Copilot add-on license (around $30/user/month as of 2024). The free Copilot web chat can help you write formulas if you paste sample data in, but it can't act on your workbook directly.
Can Copilot handle cleanup on 100,000+ row spreadsheets?
It can suggest formulas and Power Query steps for large tables, but direct actions may time out or be refused on very large ranges. For anything above a few hundred thousand rows, have Copilot generate Power Query M code and run the transformation through the Query Editor instead.
How do I clean data with Copilot without exposing it to Microsoft's cloud?
You can't, fully. Copilot in Excel sends your prompt and relevant data context to Microsoft's cloud endpoints. If you're handling regulated data, either use Power Query with manual transformations, write VBA or Office Scripts, or process the file locally with Python and openpyxl.
What's better for recurring data cleanup, Copilot or Power Query?
Power Query for recurring jobs, every time. Copilot is faster for one-off cleanup and exploring what's wrong with a dataset, but Power Query gives you a refreshable, auditable pipeline. The best workflow is using Copilot to draft the M code, then saving it as a Power Query step.
Why does Copilot sometimes say it can't process my table?
Usually one of three reasons: the range isn't formatted as a table, the sheet has merged cells or blank header rows, or the data exceeds its processing limit for that operation. Convert to a table, unmerge cells, ensure every column has a header, and try again.