Automate Resume Shortlisting From Application to Interview

Tested prompts for automated resume shortlisting compared across 5 leading AI models.

Best by judge score: Claude Haiku 4.5 (9/10)

You have 200 resumes sitting in your inbox and interviews scheduled for next week. Manual screening means you or your HR team spending hours reading through applications, most of which are clearly unqualified, just to find the 10 worth calling. Automated resume shortlisting uses AI to do that first pass for you, scoring and ranking candidates against your job requirements in minutes instead of days.

The approach works by feeding a structured prompt to an AI model along with resume text and a job description. The model evaluates each candidate against your defined criteria, produces a ranked shortlist, and explains its reasoning. You get consistent, documented decisions instead of gut-feel screening that varies depending on who reviewed the pile that afternoon.

This page shows you the exact prompt format that produces reliable shortlists, compares how five different AI models handle the same screening task, and gives you the practical details you need to run this in your actual hiring workflow. Whether you are screening 20 applications or 2,000, the core method is the same.

When to use this

Automated resume shortlisting fits best when you have a defined role with clear, scorable requirements and more applications than your team can review manually in a reasonable time. It works for recurring roles where you screen frequently, roles with high application volume, and situations where you need a consistent scoring rubric across multiple reviewers or hiring rounds.

A job posting received 150+ applications and interviews need to be scheduled within 5 business days
You are hiring for the same role across multiple locations or departments and need consistent scoring criteria
A small recruiting team is handling multiple open positions simultaneously and manual screening creates a bottleneck
You want a documented audit trail showing why candidates were advanced or rejected
You are screening for technical roles where specific skills, certifications, or years of experience are hard requirements

When this format breaks down

Roles where culture fit, portfolio quality, or creative work samples are the primary screening criteria and cannot be captured in resume text alone
Highly senior or executive positions where nuanced career narrative and context matter more than keyword matching against a rubric
Applications under 15 resumes, where manual review is faster than setting up and running the prompt workflow
Situations where your job description is vague or incomplete: the AI will shortlist against whatever criteria you give it, and garbage input produces garbage output

The prompt we tested

You are an expert technical recruiter and resume screener. Your task is to automatically shortlist candidates by evaluating each resume against the provided job requirements.

Instructions: For each candidate, output a structured evaluation with: Candidate Name, Match Score (0-100), Shortlist Decision (SHORTLIST / MAYBE / REJECT), Key Matching Qualifications (bulleted), Gaps or Red Flags (bulleted), and a 1-sentence justification. After individual evaluations, provide a final ranked shortlist table of only SHORTLIST and MAYBE candidates with columns: Rank, Name, Score, Top Strength, Recommended Next Step. Be objective, evidence-based, and ignore demographic or non-job-relevant factors.

Job Requirements and Candidate Resumes:
Job: Senior Backend Engineer — Required: 5+ years Python, AWS, microservices, PostgreSQL; Nice-to-have: Kubernetes, fintech experience.

Resume 1 — Priya Shah: 7 years Python at a fintech startup, built microservices on AWS (ECS + Lambda), deep PostgreSQL expertise, some Kubernetes exposure.
Resume 2 — Marcus Lee: 3 years Java/Spring, 1 year Python side projects, no cloud experience listed, MongoDB background.
Resume 3 — Elena Rossi: 6 years Python, GCP (not AWS), strong microservices and Kubernetes, worked at a payments company, PostgreSQL and MySQL.

Produce the shortlisting report now.
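
To reuse this prompt across roles, it can be assembled from structured inputs rather than edited by hand. The sketch below is a minimal illustration and was not part of the tested setup: `build_prompt` and its parameters are hypothetical names, and the separator in the job and resume lines is simplified to a hyphen. The role and instruction text is copied from the prompt above. Keeping required and nice-to-have criteria as separate lists makes it harder to blur the two when writing the job line.

```python
# Hypothetical helper: assembles the screening prompt from structured inputs.
ROLE = (
    "You are an expert technical recruiter and resume screener. Your task is to "
    "automatically shortlist candidates by evaluating each resume against the "
    "provided job requirements."
)
INSTRUCTIONS = (
    "Instructions: For each candidate, output a structured evaluation with: "
    "Candidate Name, Match Score (0-100), Shortlist Decision (SHORTLIST / MAYBE / "
    "REJECT), Key Matching Qualifications (bulleted), Gaps or Red Flags (bulleted), "
    "and a 1-sentence justification. After individual evaluations, provide a final "
    "ranked shortlist table of only SHORTLIST and MAYBE candidates with columns: "
    "Rank, Name, Score, Top Strength, Recommended Next Step. Be objective, "
    "evidence-based, and ignore demographic or non-job-relevant factors."
)


def build_prompt(title, required, nice_to_have, resumes):
    """Return the full prompt text.

    resumes is a list of (candidate_name, resume_text) pairs.
    """
    job = (
        f"Job: {title} - Required: {', '.join(required)}; "
        f"Nice-to-have: {', '.join(nice_to_have)}."
    )
    lines = [ROLE, "", INSTRUCTIONS, "", "Job Requirements and Candidate Resumes:", job, ""]
    for index, (name, text) in enumerate(resumes, start=1):
        lines.append(f"Resume {index} - {name}: {text}")
    lines += ["", "Produce the shortlisting report now."]
    return "\n".join(lines)


prompt = build_prompt(
    "Senior Backend Engineer",
    required=["5+ years Python", "AWS", "microservices", "PostgreSQL"],
    nice_to_have=["Kubernetes", "fintech experience"],
    resumes=[("Priya Shah", "7 years Python at a fintech startup."),
             ("Marcus Lee", "3 years Java/Spring, 1 year Python side projects.")],
)
```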

Model comparison

Model	Quality (judge score, /10)	Cost	Latency	Tokens
Claude Opus 4.7	8	$0.0925	13.9s	548 / 1124
Claude Haiku 4.5	9	$0.0038	7.3s	340 / 698
GPT-5.4	8	$0.0089	8.5s	291 / 542
Gemini 2.5 Pro	8	$0.0011	17.3s	290 / 154
Grok 4.1 Fast Reasoning	8	$0.0003	10.0s	440 / 509
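
For budgeting, the per-run figures can be scaled to a real applicant pool. The sketch below rests on two assumptions that the table does not confirm: that the Cost column is the price of one run of the three-resume prompt above, and that cost grows linearly with the number of runs. Real resumes are much longer than the one-line summaries tested here, so treat the result as a rough lower bound. `projected_cost` is a hypothetical helper name.

```python
# Cost per run, copied from the comparison table (assumed: one run = 3 resumes).
COST_PER_RUN = {
    "Claude Opus 4.7": 0.0925,
    "Claude Haiku 4.5": 0.0038,
    "GPT-5.4": 0.0089,
    "Gemini 2.5 Pro": 0.0011,
    "Grok 4.1 Fast Reasoning": 0.0003,
}
RESUMES_PER_RUN = 3


def projected_cost(model, n_resumes):
    """Estimate total cost in dollars for n_resumes, rounding runs up."""
    runs = -(-n_resumes // RESUMES_PER_RUN)  # ceiling division
    return round(runs * COST_PER_RUN[model], 4)


for model in COST_PER_RUN:
    print(f"{model}: ${projected_cost(model, 200):.4f} for 200 resumes")
```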

Outputs

Claude Opus 4.7 8/10

# Candidate Shortlisting Report
## Role: Senior Backend Engineer

---

### Candidate 1: Priya Shah

- **Match Score:** 95/100
- **Shortlist Decision:** ✅ **SHORTLIST**
- **Key Matching Qualifications:**
  - 7 years of Python experience (exceeds 5+ year requirement)
  - Built microservices on AWS using ECS and Lambda (direct match)
  - Deep PostgreSQL expertise (direct match)
  - Fintech domain experience (nice-to-have ✓)
  - Some Kubernetes exposure (nice-to-have ✓)
- **Gaps or Red Flags:**
  - Kubernetes experience described as "some exposure" rather than deep expertise (minor, as it is only nice-to-have)
- **Justification:** Priya meets every required criterion with depth and also satisfies both nice-to-haves, making her a near-ideal fit for this role.

---

### Candidate 2: Marcus Lee

- **Match Score:** 20/100
- **Shortlist Decision:** ❌ **REJECT**
- **Key Matching Qualifications:**
  - Some Python exposure via side projects
  - Backend engineering background (Java/Spring)
- **Gaps or Red Flags:**
  - Only ~1 year of Python (vs. 5+ required)
  - No AWS or cloud experience listed
  - No PostgreSQL experience (uses MongoDB)
  - No evidence of microservices architecture work
  - Neither nice-to-have qualification present
- **Justification:** Marcus falls significantly short on every required qualification and is not a viable candidate for a senior-level role.

---

### Candidate 3: Elena Rossi

- **Match Score:** 78/100
- **Shortlist Decision:** 🟡 **MAYBE**
- **Key Matching Qualifications:**
  - 6 years of Python (exceeds 5+ year requirement)
  - Strong microservices experience (direct match)
  - PostgreSQL experience (direct match)
  - Strong Kubernetes skills (nice-to-have ✓)
  - Payments/fintech domain experience (nice-to-have ✓)
- **Gaps or Red Flags:**
  - Cloud experience is on GCP, not AWS — would require ramp-up on AWS-specific services
- **Justification:** Elena is technically strong and domain-relevant, but her lack of direct AWS experience is a notable gap against a required qualification, warranting a screening call to assess transferability.

---

## Final Ranked Shortlist

| Rank | Name         | Score | Top Strength                                                       | Recommended Next Step                                                                 |
|------|--------------|-------|--------------------------------------------------------------------|---------------------------------------------------------------------------------------|
| 1    | Priya Shah   | 95    | Complete match: Python, AWS microservices, PostgreSQL, fintech     | Fast-track to technical interview with hiring manager                                 |
| 2    | Elena Rossi  | 78    | Deep microservices + Kubernetes + payments domain experience       | Initial screening call to evaluate AWS ramp-up ability and cross-cloud transferability |

*Marcus Lee not included — does not meet baseline requirements.*

Claude Haiku 4.5 9/10

# Technical Recruiter Screening Report
## Senior Backend Engineer Position

---

## Individual Candidate Evaluations

### Candidate 1: Priya Shah

**Match Score:** 95/100

**Shortlist Decision:** SHORTLIST

**Key Matching Qualifications:**
- 7 years Python experience (exceeds 5+ requirement)
- AWS microservices architecture (ECS + Lambda)
- Deep PostgreSQL expertise
- Fintech industry background (nice-to-have)
- Kubernetes exposure documented

**Gaps or Red Flags:**
- Kubernetes experience noted as "some" rather than demonstrating production-level proficiency

**Justification:** Priya exceeds all core requirements with strong fintech domain expertise and comprehensive AWS/microservices experience; minor gap in Kubernetes depth is negligible.

---

### Candidate 2: Marcus Lee

**Match Score:** 22/100

**Shortlist Decision:** REJECT

**Key Matching Qualifications:**
- 1 year Python exposure (side projects only)

**Gaps or Red Flags:**
- Only 3 years total experience; 5+ years required
- Primary expertise is Java/Spring, not Python
- No AWS or cloud platform experience documented
- MongoDB background (not PostgreSQL)
- No microservices or Kubernetes experience evident
- Does not meet 4 of 5 core requirements

**Justification:** Marcus lacks sufficient seniority, Python depth, and essential cloud/database skills required for a senior backend role.

---

### Candidate 3: Elena Rossi

**Match Score:** 78/100

**Shortlist Decision:** SHORTLIST

**Key Matching Qualifications:**
- 6 years Python experience (meets 5+ requirement)
- Strong microservices architecture background
- Production Kubernetes experience
- Payments/fintech industry experience (nice-to-have)
- PostgreSQL proficiency (plus MySQL)

**Gaps or Red Flags:**
- GCP background instead of AWS (core requirement mismatch)
- AWS-specific tools/services not demonstrated
- May require AWS ramp-up time despite equivalent cloud maturity

**Justification:** Elena meets experience and domain requirements with strong Kubernetes credentials, but GCP vs. AWS skill gap represents a meaningful qualification miss on a core requirement.

---

## Final Ranked Shortlist

| Rank | Name | Score | Top Strength | Recommended Next Step |
|------|------|-------|--------------|----------------------|
| 1 | Priya Shah | 95 | Fintech + AWS + Deep PostgreSQL | Proceed to technical interview; assess Kubernetes depth in detail |
| 2 | Elena Rossi | 78 | Kubernetes + Fintech Domain | Phone screening to assess AWS transferability and learning timeline |

---

## Summary
- **Shortlisted:** 2 candidates
- **Rejected:** 1 candidate
- **Recommendation:** Prioritize Priya Shah (ideal fit); Elena Rossi is viable if AWS onboarding timeline is acceptable.

GPT-5.4 8/10

## Candidate Evaluations

### 1) Candidate Name: Priya Shah
**Match Score:** 96/100  
**Shortlist Decision:** SHORTLIST

**Key Matching Qualifications**
- 7 years of Python experience, exceeding the 5+ years requirement
- Built microservices on AWS using ECS and Lambda
- Deep PostgreSQL expertise
- Fintech startup background aligns strongly with nice-to-have domain experience
- Some Kubernetes exposure

**Gaps or Red Flags**
- Kubernetes experience appears limited rather than deep

**Justification:** Priya is an excellent fit, meeting all required qualifications directly and also bringing relevant fintech experience plus some Kubernetes exposure.

---

### 2) Candidate Name: Marcus Lee
**Match Score:** 22/100  
**Shortlist Decision:** REJECT

**Key Matching Qualifications**
- Some Python exposure through side projects

**Gaps or Red Flags**
- Only 1 year of Python experience, far below the required 5+ years
- Primary background is Java/Spring rather than Python backend engineering
- No AWS or other cloud experience listed
- No microservices experience explicitly listed
- No PostgreSQL experience listed
- Database background is MongoDB, which does not match the PostgreSQL requirement
- No fintech experience mentioned
- No Kubernetes experience mentioned

**Justification:** Marcus does not meet several core requirements, especially Python tenure, AWS, PostgreSQL, and relevant backend stack alignment.

---

### 3) Candidate Name: Elena Rossi
**Match Score:** 84/100  
**Shortlist Decision:** SHORTLIST

**Key Matching Qualifications**
- 6 years of Python experience, meeting the 5+ years requirement
- Strong microservices experience
- Strong Kubernetes experience
- PostgreSQL experience
- Payments company background is highly relevant to fintech

**Gaps or Red Flags**
- Cloud experience is in GCP rather than AWS
- PostgreSQL expertise is present, but depth relative to “deep expertise” is not specified

**Justification:** Elena meets most core requirements and both nice-to-have areas, with the main gap being GCP experience instead of AWS.

---

## Final Ranked Shortlist

| Rank | Name         | Score | Top Strength                                      | Recommended Next Step                  |
|------|--------------|-------|---------------------------------------------------|----------------------------------------|
| 1    | Priya Shah   | 96    | Direct match across Python, AWS, microservices, PostgreSQL, and fintech | Fast-track to technical interview |
| 2    | Elena Rossi  | 84    | Strong Python, microservices, Kubernetes, PostgreSQL, and payments experience | Screen for AWS adaptability and depth |

Gemini 2.5 Pro 8/10

Of course. As an expert technical recruiter, here is the candidate shortlisting report.

***

### **Individual Candidate Evaluations**

---

**Candidate Name:** Priya Shah
**Match Score:** 95/100
**Shortlist Decision:** SHORTLIST

**Key Matching Qualifications:**
*   **Years of Experience:** 7 years of Python experience, exceeding the 5+ year requirement.
*   **Core Technology Stack:** Direct, professional experience with Python, AWS (ECS + Lambda), microservices, and PostgreSQL.
*   **Domain Experience:** Strong match with experience at a fintech startup.
*   **Nice-to-Haves:** Meets both nice-to-haves with fintech experience and some Kubernetes exposure.

Grok 4.1 Fast Reasoning 8/10

### Candidate Evaluations

**Candidate Name:** Priya Shah  
**Match Score:** 98  
**Shortlist Decision:** SHORTLIST  
**Key Matching Qualifications:**  
- 7 years Python experience exceeds 5+ years requirement  
- Built microservices on AWS (ECS + Lambda), directly matching AWS and microservices requirements  
- Deep PostgreSQL expertise fully aligns with database requirement  
- Fintech startup experience and some Kubernetes exposure cover both nice-to-haves  

**Gaps or Red Flags:**  
- Kubernetes exposure described as "some," not deep expertise  

**Justification:** Priya is an excellent match with all required skills demonstrated in a fintech context on AWS, making her a top candidate for immediate consideration.

**Candidate Name:** Marcus Lee  
**Match Score:** 25  
**Shortlist Decision:** REJECT  
**Key Matching Qualifications:**  
- 1 year Python via side projects shows basic familiarity  

**Gaps or Red Flags:**  
- Only 3 years Java/Spring primary experience with 1 year Python side projects, falling short of 5+ years Python requirement  
- No AWS or cloud experience listed  
- No mention of microservices  
- MongoDB background instead of PostgreSQL  

**Justification:** Marcus lacks the required 5+ years Python, AWS, microservices, and PostgreSQL experience, with his profile centered on Java and non-matching database tech.

**Candidate Name:** Elena Rossi  
**Match Score:** 82  
**Shortlist Decision:** MAYBE  
**Key Matching Qualifications:**  
- 6 years Python experience meets 5+ years requirement  
- Strong microservices experience aligns with requirement  
- PostgreSQL (plus MySQL) satisfies database requirement  
- Kubernetes expertise covers nice-to-have; payments company experience aligns with fintech nice-to-have  

**Gaps or Red Flags:**  
- GCP experience instead of required AWS (skills may transfer but not directly matching)  

**Justification:** Elena matches most requirements including Python tenure, microservices, PostgreSQL, and nice-to-haves, but requires validation of AWS transferability from her GCP background.

### Final Ranked Shortlist
| Rank | Name        | Score | Top Strength                          | Recommended Next Step          |
|------|-------------|-------|---------------------------------------|--------------------------------|
| 1    | Priya Shah | 98    | Fintech microservices on AWS + PostgreSQL | Schedule technical phone screen |
| 2    | Elena Rossi| 82    | Strong microservices/Kubernetes in payments | Assess AWS knowledge in interview |
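
The five reports above contain the requested fields but format them differently (bold labels, bullets, an emoji before the decision), so scores have to be extracted before they can be aggregated. A minimal sketch, assuming the field labels stay as worded in the prompt: `parse_report` and its regular expressions are illustrative, tuned only to the outputs shown on this page, and would need re-checking against any other model or prompt.

```python
import re

# Matches "### Candidate 1: Priya Shah" and "**Candidate Name:** Priya Shah".
NAME_RE = re.compile(r"Candidate(?: Name| \d+):\**\s*(.+?)\s*$", re.MULTILINE)
SCORE_RE = re.compile(r"Match Score:[^\d\n]*(\d{1,3})")
DECISION_RE = re.compile(r"Shortlist Decision:[^A-Z\n]*(SHORTLIST|MAYBE|REJECT)")


def parse_report(report):
    """Return one dict per candidate found in a model's report text."""
    headers = list(NAME_RE.finditer(report))
    results = []
    for i, header in enumerate(headers):
        # A candidate's block runs until the next candidate header.
        end = headers[i + 1].start() if i + 1 < len(headers) else len(report)
        block = report[header.end():end]
        score = SCORE_RE.search(block)
        decision = DECISION_RE.search(block)
        if score is None or decision is None:
            continue  # skip incomplete blocks instead of guessing
        results.append({
            "name": header.group(1),
            "score": int(score.group(1)),
            "decision": decision.group(1),
        })
    return results


sample = """### Candidate 1: Priya Shah

- **Match Score:** 95/100
- **Shortlist Decision:** ✅ **SHORTLIST**

**Candidate Name:** Marcus Lee  
**Match Score:** 25  
**Shortlist Decision:** REJECT  
"""
parsed = parse_report(sample)
```

Skipping blocks with a missing score or decision keeps a truncated report from producing half-filled rows.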

What makes these work

01

Define hard vs. soft criteria upfront
Your prompt needs to tell the model which requirements are eliminators and which are nice-to-haves. If a certification is mandatory, label it as a hard requirement so the model does not score around it. Mixing mandatory and preferred criteria in one undifferentiated list produces inconsistent shortlists where deal-breaker gaps get smoothed over.
02

Request a numeric score with a rationale
Asking for a score out of 100 plus a written reason forces the model to be explicit about its reasoning rather than giving you a vague recommendation. This also makes it easy to set thresholds, such as auto-advance everyone at 80 or above, flag 60-79 for human review, and decline below 60, without reading every output manually.
03

Paste the full job description, not a summary
The more specific your job description input, the more specific the evaluation output. A two-sentence job summary produces generic scoring. Including the actual responsibilities, required qualifications, and preferred qualifications gives the model real criteria to work against and reduces hallucinated requirements.
04

Process resumes one at a time for accuracy
Feeding multiple resumes in a single prompt increases the chance of the model confusing details between candidates or producing compressed evaluations. Process one resume per prompt call, then aggregate the scores in a spreadsheet. For high-volume workflows, this is easy to automate with a simple loop in your preferred scripting language or no-code tool.
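
The one-resume-per-call loop can be sketched as follows. `score_resume` stands in for whatever function calls your model and parses its reply; it is a hypothetical placeholder, stubbed here with `fake_scorer` so the example runs without an API key. The CSV output is what you would open in a spreadsheet.

```python
import csv
import io


def screen_batch(resumes, score_resume):
    """Score each resume in its own call and return rows sorted by score.

    resumes maps candidate name to resume text; score_resume(text) must
    return a dict with "score" and "rationale" keys.
    """
    rows = []
    for name, text in resumes.items():
        result = score_resume(text)  # exactly one resume per call
        rows.append({"name": name, "score": result["score"],
                     "rationale": result["rationale"]})
    rows.sort(key=lambda row: row["score"], reverse=True)
    return rows


def to_csv(rows):
    """Serialize the rows as CSV text for import into a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "score", "rationale"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def fake_scorer(text):
    """Stand-in for a real model call; replace with your own client code."""
    return {"score": 90 if "Python" in text else 20, "rationale": "stub"}


rows = screen_batch(
    {"Marcus Lee": "3 years Java/Spring", "Priya Shah": "7 years Python"},
    fake_scorer,
)
print(to_csv(rows))
```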

More example scenarios

#01 · Software engineering role with specific stack requirements

Input

Job: Senior Backend Engineer. Requirements: 5+ years Python, experience with PostgreSQL and AWS, prior fintech or payments domain preferred. Resume: Maria Chen, 7 years Python development, led backend for Stripe integration at two startups, AWS Certified Developer, PostgreSQL and Redis, B.S. Computer Science Stanford. No fintech listed explicitly but payment systems mentioned twice.

Expected output

Score: 88/100. Advance: Yes. Python and AWS requirements met and exceeded. PostgreSQL confirmed. Fintech domain not explicitly stated but direct payments infrastructure experience at two companies is functionally equivalent. Recommend tier-1 interview slot. Gap: no explicit fintech employer, low risk given payments context.

#02 · High-volume retail manager screening

Input

Job: Store Manager, 50-person team, P&L responsibility, 3+ years retail management required. Resume: James Okafor, assistant manager 2 years at Target, shift supervisor 1 year Walmart, no mention of P&L ownership, managed teams of 8-12, applied to 6 other roles at same company in same week.

Expected output

Score: 54/100. Advance: No. Falls short on tenure: the role requires 3+ years of retail management, and the candidate has 3 years of total experience, none of it in a full manager role. No demonstrated P&L ownership. Team sizes managed (8-12) are significantly below the 50-person role scope. Consider for assistant manager pipeline if that position opens.

#03 · Healthcare compliance analyst with certification requirements

Input

Job: HIPAA Compliance Analyst, CHC certification required, healthcare data experience mandatory, 2+ years in compliance role. Resume: Sandra Torres, Certified in Healthcare Compliance (CHC) since 2021, 3 years compliance analyst at regional hospital system, led HIPAA audit remediation for 4 departments, experience with Epic EHR documentation.

Expected output

Score: 95/100. Advance: Yes. All hard requirements met. CHC certified, exceeds experience minimum, and HIPAA audit remediation experience is directly relevant. Epic familiarity is a bonus not listed in JD. Priority candidate, recommend fast-track to hiring manager review.

#04 · Marketing role where candidate over-qualifies

Input

Job: Marketing Coordinator, entry-level, social media scheduling and reporting, salary band $42,000-$50,000. Resume: David Park, 8 years marketing experience, former VP Marketing at Series B startup, managed $2M ad budget, now seeking coordinator role, no reason given for downward move.

Expected output

Score: 71/100. Advance: Flag for review, do not auto-advance. Skills exceed requirements significantly. Overqualification risk is high for a coordinator role at this salary band. Recommend recruiter screen call specifically to clarify motivation before committing an interview slot. If reason is credible, candidate could be strong.

#05 · Screening for a bilingual customer support role

Input

Job: Customer Support Specialist, fluent Spanish required, Zendesk experience preferred, 1+ year support experience. Resume: Aisha Reyes, 2 years customer support at SaaS company, Zendesk certified user, English native, Spanish conversational per resume self-assessment, handled Latin American customer accounts.

Expected output

Score: 78/100. Advance: Conditional. Experience and Zendesk requirements met. Spanish proficiency is self-reported as conversational, not fluent. Latin American account history suggests practical use. Recommend a Spanish-language portion in the phone screen to verify fluency before advancing to full interview.

Common mistakes to avoid

Using a vague or generic job description
The most common failure is inputting a job description that says things like 'strong communication skills' and 'team player' without concrete, measurable requirements. The AI will evaluate candidates against whatever you give it, and a weak job description produces meaningless scores. Rewrite your JD to include specific skills, experience levels, and certifications before you automate screening.
Treating AI scores as final decisions
Automated shortlisting is a filter, not a hiring decision. Using scores to automatically reject candidates without any human review creates legal exposure and misses edge cases the model flags incorrectly. Set your workflow so that AI output feeds a human decision point, especially for candidates scored near your advance threshold.
Not accounting for resume formatting variation
Resumes submitted as PDFs, Word docs, or through ATS exports often lose formatting when converted to plain text. A resume that reads cleanly in PDF may have garbled section headers as raw text, causing the model to miss sections entirely. Standardize your text extraction step before feeding resumes into the prompt.
Ignoring potential bias amplification
If your prompt includes criteria that correlate with protected characteristics, such as specific school names or location requirements, the model will score against them consistently at scale. Review your criteria for disparate impact before automating. Document your scoring rubric so you can demonstrate the process is based on job-relevant qualifications.
Skipping calibration on a test batch
Running your prompt on 200 resumes without first testing it on 10 resumes you have already reviewed manually is a mistake that costs you trust in the output. Calibrate first: compare AI scores to your own assessments on a small sample, adjust the criteria weighting in the prompt, then run at scale.
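
The calibration step can be made concrete with two numbers: how far the AI scores sit from yours on average, and how often both would make the same advance decision. A minimal sketch, assuming you have scored the same sample by hand on the same 0-100 scale; `calibration_report` and the threshold of 80 are illustrative choices, not values tested on this page.

```python
def calibration_report(human_scores, ai_scores, advance_at=80):
    """Compare human and AI scores for the candidates present in both dicts."""
    shared = sorted(set(human_scores) & set(ai_scores))
    if not shared:
        raise ValueError("no candidates were scored by both human and AI")
    # Average absolute gap between the two scores.
    mean_abs_error = sum(
        abs(human_scores[n] - ai_scores[n]) for n in shared
    ) / len(shared)
    # Share of candidates where both land on the same side of the threshold.
    agreement = sum(
        (human_scores[n] >= advance_at) == (ai_scores[n] >= advance_at)
        for n in shared
    ) / len(shared)
    return {"n": len(shared), "mean_abs_error": mean_abs_error,
            "decision_agreement": agreement}


report = calibration_report(
    human_scores={"a": 90, "b": 50, "c": 75},
    ai_scores={"a": 95, "b": 40, "c": 85},
)
```

In this toy sample the scores differ by about 8 points on average and the decisions agree for two of three candidates; candidate "c" is the kind of disagreement worth reading before adjusting the prompt.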

Frequently asked questions

Is automated resume shortlisting legal for hiring?

It is legal in most jurisdictions if the criteria used are job-relevant and applied consistently. However, some U.S. jurisdictions, including New York City, have laws requiring bias audits for automated employment decision tools. Check local regulations, ensure your screening criteria do not create disparate impact on protected classes, and keep documentation of your scoring rubric.
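
One common first check for disparate impact is the four-fifths rule from the U.S. EEOC Uniform Guidelines: a group whose selection rate is below 80% of the highest group's rate is generally treated as showing evidence of adverse impact. The sketch below assumes you hold voluntarily self-reported group data separately from the screening prompt; `impact_ratios` and the group labels are hypothetical. It is a screening heuristic, not a bias audit.

```python
def impact_ratios(outcomes):
    """Return each group's selection rate divided by the highest group's rate.

    outcomes maps a group label to (number advanced, number screened).
    """
    rates = {
        group: advanced / total
        for group, (advanced, total) in outcomes.items()
        if total > 0
    }
    if not rates:
        raise ValueError("no group has any screened candidates")
    top_rate = max(rates.values())
    if top_rate == 0:
        return {group: 0.0 for group in rates}
    return {group: rate / top_rate for group, rate in rates.items()}


ratios = impact_ratios({"group_a": (40, 100), "group_b": (24, 100)})
# Groups below the four-fifths (0.8) line deserve a closer look.
flagged = [group for group, ratio in ratios.items() if ratio < 0.8]
```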

How accurate is AI resume screening compared to human reviewers?

For matching explicit, measurable criteria such as certifications, years of experience, or specific tools, AI screening tends to be highly consistent, and it can be more consistent than human reviewers, whose judgments vary with fatigue and personal bias. For nuanced judgment calls involving career narrative or leadership potential, human review still performs better. The two work well in combination.

Which AI model works best for resume shortlisting?

The comparison table on this page shows how Claude Opus 4.7, Claude Haiku 4.5, GPT-5.4, Gemini 2.5 Pro, and Grok 4.1 Fast Reasoning handle the same screening prompt. Claude Haiku 4.5 earned the highest judge score (9/10); the other four scored 8/10. The two Claude models produced the longest reports, which is useful when you want to understand the score. For raw throughput on large batches via API, any of the top-tier models performs well given a structured prompt.

Can I use this with my existing ATS?

Most modern ATS platforms including Greenhouse, Lever, and Workday can export application data as CSV or via API. You extract the resume text, pass it through the AI screening prompt using a tool like Zapier, Make, or a custom script, and then feed the scores back into your ATS as a custom field or tag. You do not need to replace your ATS to add AI screening on top of it.

What should I do with candidates the AI scores in the middle range?

Candidates scored in a middle band, for example 60 to 79 out of 100, are the ones most worth a quick human look. The model is signaling partial fit, and the written rationale will tell you whether the gap is on a hard requirement or a preferred one. A 10-minute recruiter phone screen on borderline candidates costs less than a mis-hire from skipping them entirely.
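
Routing by score band can be sketched as a small function. The default cutoffs follow the example thresholds suggested under "Request a numeric score with a rationale" (advance at 80, review from 60); treating exactly 80 as an advance is an assumption, and `triage` is a hypothetical name. Both cutoffs are parameters so you can tune them after calibration.

```python
def triage(score, advance_at=80, review_at=60):
    """Map a 0-100 match score to a workflow bucket."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    if score >= advance_at:
        return "advance"
    if score >= review_at:
        return "human_review"
    return "decline"


# Scores taken from the Claude Haiku 4.5 report on this page.
buckets = {name: triage(score) for name, score in
           [("Priya Shah", 95), ("Elena Rossi", 78), ("Marcus Lee", 22)]}
# buckets == {"Priya Shah": "advance", "Elena Rossi": "human_review",
#             "Marcus Lee": "decline"}
```

The "decline" bucket is best read as a proposed decline that a person confirms, in line with treating AI output as a filter and not a final decision.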

How do I handle resumes that are mostly images or have unusual formatting?

Image-based resumes cannot be read by the AI without first running them through an OCR tool such as Adobe Acrobat, Amazon Textract, or Google Document AI to extract text. Unusual formatting like multiple columns or heavy design templates often breaks OCR output. For high-volume screening, require applicants to submit text-based (not scanned) PDF resumes or use your ATS plain-text export to standardize input before the AI step.
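
Once text has been extracted, a light cleanup pass makes the input more uniform before it reaches the prompt. A minimal sketch using only the Python standard library; `normalize_resume_text` is a hypothetical helper, and the hyphen-rejoining rule is a heuristic that will also merge genuinely hyphenated words split across lines (for example "full-" / "stack").

```python
import re
import unicodedata


def normalize_resume_text(raw):
    """Clean common extraction artifacts from resume text."""
    # Fold ligatures and non-breaking spaces into plain characters.
    text = unicodedata.normalize("NFKC", raw)
    # Use one newline convention.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Rejoin words hyphenated across a line break (heuristic).
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Collapse runs of spaces and tabs, then trim spaces around newlines.
    text = re.sub(r"[ \t]+", " ", text)
    text = re.sub(r" *\n *", "\n", text)
    # Keep at most one blank line between sections.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


cleaned = normalize_resume_text(
    "EXPERI-\nENCE\r\n\r\n\r\n\r\nPython   \u00a0developer, \ufb01ntech  \n"
)
# cleaned == "EXPERIENCE\n\nPython developer, fintech"
```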