Translate an Arabic PDF to English While Keeping RTL Layout

Tested prompts for translating an Arabic PDF to English, compared across 5 leading AI models.

Best by judge score: Claude Opus 4.7 (9/10)

If you have an Arabic PDF and need it in English, the challenge is not just translation. Arabic runs right-to-left, uses a different character set, and PDFs often embed text as images or in ways that strip structure. Most generic translation tools either mangle the layout, skip chunks of text, or produce broken output that requires hours of cleanup before it is usable.

What you actually need is a workflow that extracts the Arabic text accurately, translates it with contextual fluency, and preserves enough of the original structure so the English output makes sense without manual reformatting. This page walks through exactly that, using an AI-assisted prompt approach tested against multiple models.

Whether you are dealing with a legal contract, a medical report, a government document, or a research paper sourced from an Arabic-speaking country, the core problem is the same: you need accurate English you can actually use. The examples and tips here are built around that specific goal.

When to use this

This approach works best when you have a scanned or digital Arabic PDF and need a readable, accurate English translation that preserves the logical flow of the original document. It is suited for single documents or small batches where quality matters more than raw speed, and where the output will be read, submitted, or acted on by an English-speaking person.

  • Translating an Arabic legal contract or agreement before signing or submitting to a court
  • Converting Arabic medical records or lab reports for a doctor or insurance provider in an English-speaking country
  • Reading Arabic academic papers or research reports without access to a bilingual colleague
  • Translating Arabic government-issued documents such as certificates, licenses, or official letters for immigration or visa applications
  • Extracting and translating Arabic product manuals or technical specifications for engineering or procurement teams

When this format breaks down

  • The PDF is a scanned image with low resolution or heavy handwriting. OCR accuracy will be poor and the translation will inherit every recognition error, producing unusable output.
  • You need a certified or notarized translation. AI-generated translations are not legally certified and will not be accepted by courts, immigration authorities, or official bodies that require a human translator's stamp.
  • The document is extremely long, over 50 pages, and contains highly specialized terminology such as advanced legal clauses or clinical trial data. Without domain-specific fine-tuning, critical nuance will be lost at scale.
  • The Arabic text relies heavily on dialectal or colloquial forms rather than Modern Standard Arabic (MSA). Most models are trained mainly on MSA and often misread Gulf, Levantine, or Maghrebi dialect.

The prompt we tested

You are a professional Arabic-to-English translator specializing in document translation while preserving original formatting. Translate the Arabic PDF content provided below into fluent, natural English.

Follow these rules carefully:
Produce accurate, idiomatic English that reads naturally to a native speaker while staying faithful to the source meaning and tone. Preserve all document structure (headings, numbered lists, tables, footnotes) and keep proper nouns, dates, and numbers intact; transliterate names when no standard English form exists. After the translation, include a short 'Translator Notes' section listing any ambiguous terms, cultural references, or RTL-to-LTR layout adjustments the user should review.

Arabic PDF content to translate:
عقد إيجار سكني
المادة الأولى: يتعهد المستأجر السيد أحمد بن محمد العلي بدفع مبلغ إيجار شهري قدره 3000 ريال سعودي عن الشقة رقم 12 في حي النزهة بمدينة الرياض.
المادة الثانية: مدة العقد سنة واحدة تبدأ من تاريخ 1 محرم 1446هـ وتنتهي في 30 ذو الحجة 1446هـ، قابلة للتجديد بموافقة الطرفين.

Provide the English translation now, preserving the document's structure, headings, lists, tables, and paragraph breaks. Note any RTL layout elements (tables, columns, bullet alignment) that should be mirrored to LTR in the English version, and flag any untranslatable terms, proper nouns, or ambiguous passages in a brief notes section at the end.

Model comparison

| Model | Quality (/10) | Cost | Latency | Tokens |
|---|---|---|---|---|
| Claude Opus 4.7 | 9 | $0.0699 | 13.5s | 553 / 822 |
| Claude Haiku 4.5 | 8 | $0.0028 | 4.9s | 436 / 470 |
| GPT-5.4 | 8 | $0.0047 | 4.7s | 304 / 261 |
| Gemini 2.5 Pro | 8 | $0.0024 | 17.1s | 322 / 403 |
| Grok 4.1 Fast Reasoning | 8 | $0.0003 | 7.2s | 498 / 311 |

Outputs

Claude Opus 4.7 9/10
# Residential Lease Agreement

**Article One:** The tenant, Mr. Ahmed bin Mohammed Al-Ali, undertakes to pay a monthly rent of 3,000 Saudi Riyals (SAR) for Apartment No. 12, located in Al-Nuzha District, Riyadh.

**Article Two:** The term of this contract is one (1) year, commencing on 1 Muharram 1446 AH and ending on 30 Dhu al-Hijjah 1446 AH, renewable by mutual agreement of both parties.

---

## Translator Notes

- **Proper Nouns / Transliteration:**
  - "أحمد بن محمد العلي" transliterated as "Ahmed bin Mohammed Al-Ali." Alternative spellings (e.g., Ahmad, Muhammad, Al-Ali/Al-Aly) may appear in official Saudi documentation; the client should verify the tenant's preferred Latin-script spelling (e.g., as it appears on his national ID or passport).
  - "حي النزهة" rendered as "Al-Nuzha District" (a well-known neighborhood in Riyadh).
  - "الرياض" rendered as "Riyadh" (standard English form).

- **Dates:** Hijri (lunar) dates have been retained as in the original, with "هـ" expanded to "AH" (Anno Hegirae). If the contract will be used in a context requiring Gregorian equivalents, note:
  - 1 Muharram 1446 AH ≈ 7 July 2024 CE
  - 30 Dhu al-Hijjah 1446 AH ≈ 26 June 2025 CE
  (Approximate; verify with an official Hijri–Gregorian converter before legal use.)

- **Currency:** "ريال سعودي" translated as "Saudi Riyals" with the ISO code "SAR" added for clarity in an international/English-language context.

- **Layout (RTL → LTR):** The source is right-to-left Arabic. In the English version, text flows left-to-right; article headings ("Article One," "Article Two") remain left-aligned. No tables or columns are present in this excerpt, so no mirroring adjustments were required beyond standard paragraph direction.

- **Ambiguities:** None substantive. The text is a standard contractual excerpt; however, this translation is for reference only and should be reviewed by a qualified legal professional before being relied upon for any binding purpose.
Judge notes: Nearly flawless — the translation is accurate, idiomatic, and the Translator Notes are exceptionally thorough with Hijri-Gregorian conversions and legal disclaimers; the only minor nitpick is that "bin" (بن) could be flagged as a naming convention that may need standardization per Saudi ID conventions, but this is a marginal concern.
Claude Haiku 4.5 8/10
# Residential Lease Agreement

**Article One:** The tenant, Mr. Ahmad bin Muhammad Al-Ali, undertakes to pay a monthly rental amount of 3,000 Saudi Riyals for Apartment No. 12 in Al-Nuzha District, Riyadh City.

**Article Two:** The contract duration is one year, commencing on 1 Muharram 1446 AH and concluding on 30 Dhu al-Hijjah 1446 AH, and is renewable by mutual consent of both parties.

---

## Translator Notes

1. **Proper Nouns:** "Ahmad bin Muhammad Al-Ali" (أحمد بن محمد العلي) maintains the Arabic naming convention with patronymic structure; this is the standard transliteration used in English legal documents.

2. **Currency:** "Saudi Riyals" (ريال سعودي) is the standard English term for the currency; abbreviation is SAR.

3. **Location Names:** "Al-Nuzha" (النزهة) and "Riyadh" (الرياض) are widely recognized English transliterations of these Saudi Arabian district and city names.

4. **Islamic Calendar:** The Hijri dates (1 Muharram 1446 AH and 30 Dhu al-Hijjah 1446 AH) are presented in their standard English transliteration. The abbreviation "AH" (Anno Hegirae) indicates the Islamic calendar; these dates correspond to approximately August 9, 2024 – August 27, 2025 in the Gregorian calendar.

5. **Document Type:** "عقد إيجار سكني" literally translates to "residential lease contract/agreement"; "Residential Lease Agreement" is the standard English legal terminology.

6. **RTL Adjustment:** The original Arabic document structure with right-to-left formatting has been converted to standard left-to-right English formatting, with article headings and body text aligned accordingly.
Judge notes: The translation is accurate, well-structured, and the Translator Notes are genuinely useful (especially the Gregorian date conversions), but it stops short of a 9 because it doesn't flag that a real-world lease would typically require Gregorian dates alongside Hijri dates in bilingual legal contexts, and the RTL note is surface-level rather than actionable.
GPT-5.4 8/10
**Residential Lease Agreement**

**Article One:** The Tenant, Mr. Ahmad bin Muhammad Al-Ali, undertakes to pay a monthly rent of **3,000 Saudi riyals** for **Apartment No. 12** in **Al-Nuzhah District**, **Riyadh**.

**Article Two:** The term of this Agreement is **one year**, commencing on **1 Muharram 1446 AH** and ending on **30 Dhu al-Hijjah 1446 AH**, renewable upon the agreement of both parties.

**Translator Notes:**
- **أحمد بن محمد العلي** transliterated as **Ahmad bin Muhammad Al-Ali**; alternative spellings may exist depending on official ID/passport records.
- **حي النزهة** translated as **Al-Nuzhah District**; it may also appear as **Al Nuzha** in some English-language address systems.
- Hijri dates (**1 Muharram 1446 AH** and **30 Dhu al-Hijjah 1446 AH**) were preserved as in the original and not converted to Gregorian dates.
- No tables or columns were present in the source. RTL-to-LTR adjustment applies only to normal sentence flow and punctuation alignment in the English version.
Judge notes: The translation is accurate and well-structured with useful translator notes, but the Hijri dates are preserved without offering optional Gregorian equivalents, which would be a practical addition for most users needing this kind of document translated.
Gemini 2.5 Pro 8/10
**Residential Lease Agreement**

**Article One:** The Tenant, Mr. Ahmed bin Mohammed Al-Ali, undertakes to pay a monthly rent of 3,000 Saudi Riyals for Apartment No. 12, located in the Al-Nuzha district of Riyadh.

**Article Two:** The term of the contract is one year, commencing on 1 Muharram 1446 AH and ending on 30 Dhu al-Hijjah 1446 AH, and is renewable by mutual consent of both parties.

***

### Translator Notes

*   **Proper Nouns:**
    *   `أحمد بن محمد العلي` has been transliterated as "Ahmed bin Mohammed Al-Ali". The term "bin" means "son of" and is a standard part of Arabic patronymic names.
    *   `حي النزهة` is transliterated as "Al-Nuzha district."
    *   `الرياض` is the city of "Riyadh."
    *   `ريال سعودي` is translated as "Saudi Riyals."

*   **Cultural/Technical Terms:**
    *   **Hijri Calendar:** The dates provided (`1 محرم 1446هـ` and `30 ذو الحجة 1446هـ`) are based on the Islamic (Hijri) lunar calendar. The abbreviation `هـ` (H) for Hijri has been translated to its standard English equivalent, "AH" (Anno Hegirae, "in the year of the Hijra"). The user should be aware that these dates do not correspond directly to a Gregorian calendar year (e.g., January 1 to December 31).

*   **Layout Adjustments:**
    *   The original document's Right-to-Left (RTL) structure, including headings and numbered articles, has been mirrored to
Judge notes: The output is production-ready and accurate, but the Translator Notes section is truncated mid-sentence ("has been mirrored to"), which is a clear defect that would require a fix before professional use.
Grok 4.1 Fast Reasoning 8/10
Residential Lease Contract

Article One: The tenant, Mr. Ahmed bin Mohammed Al-Ali, undertakes to pay a monthly rent amount of 3,000 Saudi Riyals for Apartment No. 12 in the Al-Nuzha neighborhood in Riyadh city.

Article Two: The contract duration is one year, starting from 1 Muharram 1446 AH and ending on 30 Dhu al-Hijjah 1446 AH, renewable with the approval of both parties.

**Translator Notes:**
- Proper nouns transliterated standardly: "أحمد بن محمد العلي" as "Ahmed bin Mohammed Al-Ali"; "حي النزهة" as "Al-Nuzha neighborhood"; "مدينة الرياض" as "Riyadh city". No established alternative English forms exist.
- Hijri dates (e.g., "1 محرم 1446هـ") preserved exactly as in source, with "AH" added for clarity in English context; these correspond to Gregorian approx. July 2024 to June 2025 but not converted to avoid altering legal precision.
- Cultural reference: "هـ" denotes Hijri (Islamic lunar calendar), standard in Saudi legal documents.
- RTL-to-LTR adjustments: Original Arabic flows right-to-left; English version mirrors to left-to-right naturally. No tables, columns, or complex lists present, so no special mirroring needed beyond standard paragraph/heading alignment. Numbers (e.g., "3000", "12") formatted with Western commas for readability in English.
Judge notes: The translation is accurate and the notes section is thorough, but "Riyadh city" reads slightly awkwardly versus the more natural "the city of Riyadh," and the Gregorian date approximations given parenthetically in the notes (July 2024–June 2025) are slightly off since 1 Muharram 1446 AH falls in July 2024 and 30 Dhu al-Hijjah 1446 AH falls in late June 2025—minor but worth flagging in a legal document context.

What makes these work

  1. Specify tone and register upfront

    Arabic documents vary widely in formality, from legal MSA to informal colloquial text. Tell the model exactly what register to use in the output, such as formal legal English, clinical medical English, or plain readable English. Without this instruction, models default to a mid-register that often feels wrong for the context.

  2. Preserve structure with explicit instructions

    PDFs often have numbered clauses, tables, and section headers. Explicitly instruct the model to keep paragraph numbers, list formatting, and section labels intact. If you skip this, the output arrives as a wall of text and you lose the document's navigational structure entirely.

  3. Handle proper nouns with a clear rule

    Names of people, cities, and organizations in Arabic can be transliterated in multiple ways. Give the model a consistent rule: either transliterate as written, use the official English spelling if known, or flag uncertain names in brackets. Inconsistent name handling is one of the most common sources of confusion in translated official documents.

  4. Break long documents into logical sections

    Feeding an entire long PDF as one prompt tends to produce lower-quality output and can hit context limits. Split the document by section or page range and translate each chunk with the same prompt template. This also makes it easier to spot errors or re-run a single section without reprocessing the whole document.
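
The tips above can be combined into a small script. This is a minimal sketch, not a definitive implementation: the template wording, the function names, and the rule of splitting at lines that begin with "المادة" (Article) are all assumptions made for illustration, and the sample text is abbreviated from the lease used earlier on this page. Sending each prompt to a model is left out.

```python
# Hypothetical prompt template: register and proper-noun rule are filled in per job.
PROMPT_TEMPLATE = (
    "You are a professional Arabic-to-English translator.\n"
    "Output register: {register}.\n"
    "Preserve headings, numbered lists, tables, and footnotes.\n"
    "Proper nouns: {name_rule}\n\n"
    "Arabic content to translate:\n{chunk}"
)

# Abbreviated sample based on the lease excerpt used on this page.
SAMPLE = "\n".join([
    "عقد إيجار سكني",
    "المادة الأولى: يتعهد المستأجر بدفع مبلغ إيجار شهري قدره 3000 ريال سعودي.",
    "المادة الثانية: مدة العقد سنة واحدة قابلة للتجديد بموافقة الطرفين.",
])


def split_into_sections(text, marker="المادة"):
    """Start a new section at every line that begins with the marker word.

    The marker is an assumption; other documents may need page ranges
    or a different heading keyword.
    """
    sections, current = [], []
    for line in text.splitlines():
        if line.strip().startswith(marker) and current:
            sections.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        sections.append("\n".join(current).strip())
    return [s for s in sections if s]


def build_prompts(text, register, name_rule):
    """Apply the same template to every section so the rules stay consistent."""
    return [
        PROMPT_TEMPLATE.format(register=register, name_rule=name_rule, chunk=chunk)
        for chunk in split_into_sections(text)
    ]


if __name__ == "__main__":
    prompts = build_prompts(
        SAMPLE,
        register="formal legal English",
        name_rule="transliterate as written and flag uncertain names in brackets",
    )
    print(len(prompts), "prompts built")
```

Keeping the register and the proper-noun rule as parameters means every chunk receives identical instructions, which is what keeps names and tone consistent across a multi-section document.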

More example scenarios

#01 · Legal contract translation for business use
Input
Translate the following Arabic contract clause to English, preserving formal legal tone and structure. Do not paraphrase. Keep paragraph numbering intact. Arabic text: 'يحق للطرف الأول إنهاء هذا العقد في حالة إخلال الطرف الثاني بأي من التزاماته المنصوص عليها في المادة الخامسة، بعد إخطاره كتابياً بمدة لا تقل عن ثلاثين يوماً.'
Expected output
The First Party shall have the right to terminate this Agreement in the event that the Second Party breaches any of its obligations stipulated in Article Five, provided that written notice of no less than thirty (30) days has been served upon the Second Party prior to termination.
#02 · Medical report for overseas patient records
Input
Translate this Arabic radiology report summary to English for a hospital in the United States. Preserve all clinical terminology and measurement values exactly. Arabic text: 'الفحص: أشعة مقطعية على البطن. النتائج: كتلة في الفص الأيمن من الكبد بقياس 3.2 × 2.8 سم، ذات حدود غير منتظمة. يُنصح بإجراء فحص MRI للمتابعة.'
Expected output
Examination: CT scan of the abdomen. Findings: A mass measuring 3.2 x 2.8 cm is identified in the right lobe of the liver, with irregular margins. MRI follow-up is recommended for further evaluation.
#03 · Academic research paper abstract
Input
Translate the following Arabic academic abstract to English. Maintain scholarly tone and preserve all technical terms. Arabic text: 'تهدف هذه الدراسة إلى تحليل تأثير تغير المناخ على إنتاجية محاصيل القمح في منطقة الشرق الأوسط خلال الفترة من 2000 إلى 2020، مع الاستناد إلى بيانات الأقمار الاصطناعية ونماذج المحاكاة المناخية.'
Expected output
This study aims to analyze the impact of climate change on wheat crop productivity in the Middle East region during the period from 2000 to 2020, drawing on satellite data and climate simulation models.
#04 · Government certificate for immigration application
Input
Translate this Arabic birth certificate text to English, keeping all names, dates, and places exactly as listed. Do not translate proper nouns. Arabic text: 'شهادة الميلاد: الاسم: أحمد يوسف العمري. تاريخ الميلاد: 14 مارس 1990. مكان الميلاد: جدة، المملكة العربية السعودية. اسم الأب: يوسف محمد العمري.'
Expected output
Birth Certificate: Name: Ahmad Yusuf Al-Omari. Date of Birth: March 14, 1990. Place of Birth: Jeddah, Kingdom of Saudi Arabia. Father's Name: Yusuf Mohammad Al-Omari.
#05 · Technical product manual for engineering team
Input
Translate the following Arabic safety instructions from a machinery manual to English. Keep numbered list format and preserve all unit measurements. Arabic text: '1. تأكد من إيقاف تشغيل الجهاز قبل الصيانة. 2. ارتدِ معدات الحماية الشخصية المناسبة. 3. لا تتجاوز الحمل الأقصى البالغ 500 كيلوغرام.'
Expected output
1. Ensure the device is powered off before performing any maintenance. 2. Wear appropriate personal protective equipment at all times. 3. Do not exceed the maximum load capacity of 500 kilograms.

Common mistakes to avoid

  • Ignoring OCR quality before translating

    If your PDF is a scan, the text layer may contain garbled characters or missing words before it ever reaches the translation model. Always verify the extracted Arabic text looks correct before running translation. Translating corrupted source text produces confident-sounding but wrong English output that is hard to catch without a bilingual reviewer.

  • Translating proper nouns inconsistently

    Arabic names transliterate differently depending on dialect and convention. Without explicit instructions, a single person's name may appear three different ways across a multi-page document. This creates serious problems in legal and medical contexts where name accuracy is non-negotiable.

  • Skipping domain terminology instructions

    General-purpose translation models will substitute common English words for technical Arabic terms when they are uncertain. In medical or legal documents this is dangerous. Include a note in your prompt such as 'retain Arabic medical terms if no standard English equivalent exists' to avoid silent substitutions.

  • Assuming the output is certified or final

    AI translation is a high-quality draft, not a certified translation. Immigration offices, courts, and medical boards that require certification will typically reject AI-translated documents submitted without human review and a certified translator's signature. Use this workflow to understand the document or prepare a draft, not as the final submission.

  • Sending the full PDF file without extracting text first

    Most text-only AI models cannot parse a raw PDF file. Extract the text first with a PDF tool, then pass the Arabic text into the prompt. Pasting or uploading an unprocessed PDF often results in the model seeing nothing or receiving garbled characters.
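
A quick automated check can catch the worst extraction failures before any text reaches the model. The sketch below is one possible heuristic under stated assumptions: the function names and the 0.8 threshold are hypothetical, and a document that legitimately mixes in Latin terms (for example "MRI") will score lower and may need a looser threshold. It does not replace reading the extracted Arabic yourself.

```python
def arabic_letter_ratio(text):
    """Share of alphabetic characters in the main Arabic block (U+0600 to U+06FF)."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    arabic = sum(1 for ch in letters if "\u0600" <= ch <= "\u06FF")
    return arabic / len(letters)


def looks_suspicious(text, min_ratio=0.8):
    """Flag text that contains U+FFFD replacement characters or too few Arabic letters.

    min_ratio is an illustrative threshold, not a recommended value.
    """
    return "\ufffd" in text or arabic_letter_ratio(text) < min_ratio


if __name__ == "__main__":
    clean = "عقد إيجار سكني"
    mojibake = "Ø¹Ù‚Ø¯"  # typical result of decoding UTF-8 Arabic with the wrong codec
    print(looks_suspicious(clean), looks_suspicious(mojibake))
```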

Frequently asked questions

Can I translate a scanned Arabic PDF to English using AI?

Yes, but you need an OCR step first. Tools like Adobe Acrobat, Google Drive, or dedicated OCR software can extract Arabic text from a scanned image. Once you have clean Arabic text output from OCR, you can pass it to an AI translation prompt. The translation quality will only be as good as the OCR accuracy, so always check the extracted text before translating.

How do I keep the original layout when translating an Arabic PDF?

AI text translation does not automatically preserve PDF layout. For basic structure, instruct the model to keep paragraph numbers, headers, and list formatting in the output text. For full visual layout preservation including columns and page design, you would need to use a dedicated document translation tool like DeepL PDF, Adobe Acrobat's translation feature, or a professional DTP service after the text is translated.
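
For tables, the RTL-to-LTR mirroring mentioned in the prompt usually comes down to column order. A minimal sketch, assuming the table rows were extracted in visual left-to-right order from a right-to-left page (so the label column, which sits on the right in the Arabic original, arrives last); if your extractor already returns cells in logical order, no reversal is needed. The function name and sample rows are illustrative.

```python
def mirror_columns(rows):
    """Reverse the cell order in every row so the rightmost source column comes first."""
    return [list(reversed(row)) for row in rows]


if __name__ == "__main__":
    # Translated cells, still in the visual left-to-right order of the Arabic page.
    extracted = [
        ["Value", "Item"],
        ["3,000 SAR", "Monthly rent"],
        ["One year", "Contract term"],
    ]
    for row in mirror_columns(extracted):
        print(" | ".join(row))
```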

What is the best AI tool to translate Arabic PDF to English?

There is no single best tool for every case. GPT-4 and Claude perform well on formal MSA documents with clear structure. DeepL supports Arabic and handles PDF uploads directly, making it convenient for straightforward documents. For scanned PDFs, combining an OCR tool with a large language model prompt gives the most control over output quality and formatting.

Is AI Arabic to English translation accurate enough for legal documents?

AI translation of formal Modern Standard Arabic legal text is generally accurate for understanding the meaning and intent of clauses. However, legal documents often contain jurisdiction-specific terms, precedent references, and phrasing where subtle word choices carry legal weight. Always have a qualified legal translator review the output before using it in any binding or official context.

How do I translate an Arabic PDF to English for free?

Google Translate supports Arabic PDF uploads and is free, making it a quick option for getting the gist of a document. For higher quality free options, you can paste extracted Arabic text into the Claude or ChatGPT free tiers using a structured prompt. The paid tiers of these tools offer better accuracy and longer context windows for multi-page documents.

Why does my Arabic PDF translation come out with reversed or jumbled text?

This is usually a text extraction problem, not a translation problem. Arabic PDFs sometimes store characters in visual order rather than logical order, causing extraction tools to output reversed or scrambled strings. Try a different PDF reader or OCR tool to extract the text, or use Adobe Acrobat, which has strong Arabic text extraction. Once the extracted text reads correctly in Arabic, the translation should come out normally.
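
Two cheap checks can help confirm this diagnosis before you switch tools. The sketch below is a rough heuristic, not a full bidirectional-text implementation: it folds Arabic presentation-form glyphs back to base letters with Unicode NFKC normalization (which also alters other compatibility characters, so apply it only to the Arabic source), then compares how many words start with the definite article "ال" against how many end with its mirror image. The function names and the decision rule are assumptions for illustration.

```python
import unicodedata

AL = "\u0627\u0644"        # alef + lam, the definite article in logical order
AL_REVERSED = "\u0644\u0627"  # the same two letters as they appear in reversed text


def normalize_presentation_forms(text):
    """Map presentation-form glyphs (U+FB50 to U+FDFF, U+FE70 to U+FEFF) to base letters."""
    return unicodedata.normalize("NFKC", text)


def looks_visually_ordered(text):
    """Guess whether extracted Arabic is stored reversed (visual order).

    Heuristic only: logical-order text has many words starting with the
    article, while reversed text has many words ending with its mirror.
    """
    words = normalize_presentation_forms(text).split()
    starts = sum(1 for w in words if len(w) > 2 and w.startswith(AL))
    ends = sum(1 for w in words if len(w) > 2 and w.endswith(AL_REVERSED))
    return ends > starts


if __name__ == "__main__":
    logical = "حي النزهة بمدينة الرياض"
    print(looks_visually_ordered(logical))        # logical order
    print(looks_visually_ordered(logical[::-1]))  # simulated visual-order extraction
```

If the check fires, re-extract with another tool rather than reversing strings yourself, since mixed Arabic and Latin or numeric content does not reverse cleanly.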