PDF Remediation

Six Strategies. One Goal:
Every PDF Accessible.

One platform with six built-in fix strategies. OCR, AI vision, structure rebuilds — everything is included. The system automatically selects the right approach for each document and escalates until Section 508 verification passes or review is required.

2,400+ PDFs Fixed

99%+ Compliance

6 Fix Strategies

The 6-Stage Remediation Pipeline

Quick Fix

Repairs document metadata, language tags, structure tree, and PDF/UA identifiers. Handles the most common compliance failures — missing titles, incorrect role mappings, broken tag hierarchies — without altering page content.

OCR Remediation

Extracts text from scanned or image-heavy PDFs using optical character recognition, then rebuilds the document with a proper structure tree, tagged content, and reading order. Turns flat images into searchable, screen-reader-accessible documents.

Vision AI Analysis

When structural repairs aren’t enough, a vision model analyzes each page visually — understanding headings, tables, lists, and reading flow from the rendered layout. Generates a semantic structure informed by what a human reader would see, not just what the file’s byte stream contains.

Structure Rebuild

For documents with deeply corrupted tag trees, the structure is linearized and rebuilt from scratch. Content is re-extracted, re-tagged, and re-validated, preserving the original appearance while replacing the accessibility layer entirely.

Strip & Rebuild

The most aggressive structural fix. All existing markup, annotations, and metadata are stripped to raw content streams, then rebuilt with a clean structure tree, proper artifact wrapping, and Section 508 validation evidence — effectively a fresh accessibility pass on the original visual content.

Render & Rebuild

The last resort for documents that resist all other strategies. Each page is rendered to a high-fidelity image, then OCR and vision AI reconstruct the document from pixels up — new text layer, new structure, new metadata. The visual appearance is preserved pixel-for-pixel.

Adaptive Escalation, Not Brute Force

Fix → Validate → Pass? If yes: Done. If no: Escalate to the next strategy and validate again.

The system validates after every stage using the same 104-rule Matterhorn Protocol engine used by national archives. If a fix doesn’t satisfy the Section 508 gate, the next useful strategy takes over — each one more powerful than the last.
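The fix, validate, escalate loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the platform's implementation: the `Document` dict, the `validate` stand-in, the `Result` type, and the two toy strategies are all hypothetical, and the sketch assumes each strategy works on the output of the previous one.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# Hypothetical model: a document is a dict whose "failures" list holds the
# rule IDs that still fail validation. A real validator would inspect the PDF.
Document = dict
Strategy = Tuple[str, Callable[[Document], Document]]


@dataclass
class Result:
    status: str                      # "passed" or "needs_review"
    strategy: Optional[str]          # strategy that produced the passing file
    remaining: List[str] = field(default_factory=list)


def validate(doc: Document) -> List[str]:
    """Stand-in validator: return the rule IDs that still fail."""
    return sorted(doc.get("failures", []))


def remediate(doc: Document, strategies: List[Strategy]) -> Result:
    """Apply strategies in order, validating after each; stop at the first pass."""
    remaining = validate(doc)
    if not remaining:
        return Result("passed", None)
    for name, fix in strategies:
        doc = fix(doc)               # assumption: each stage builds on the last
        remaining = validate(doc)
        if not remaining:
            return Result("passed", name)
    # Every strategy ran and failures remain: hand off to a human reviewer.
    return Result("needs_review", None, remaining)


def drop_prefix(prefix: str) -> Callable[[Document], Document]:
    """Build a toy strategy that resolves failures whose ID starts with prefix."""
    def fix(doc: Document) -> Document:
        kept = [f for f in doc.get("failures", []) if not f.startswith(prefix)]
        return {**doc, "failures": kept}
    return fix


PIPELINE: List[Strategy] = [
    ("quick_fix", drop_prefix("meta")),
    ("structure_rebuild", drop_prefix("struct")),
]

print(remediate({"failures": ["meta-title", "struct-tree"]}, PIPELINE))
print(remediate({"failures": ["meta-title", "font-embed"]}, PIPELINE))
```

The first call stops as soon as a strategy clears validation; the second exhausts the pipeline and returns the leftover failures for review, which mirrors the "passes or review is required" outcome described earlier.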

Clause-Targeted Refinement

After initial remediation, remaining failures are analyzed by specific PDF/UA clause. The system generates targeted fixes for each individual violation, re-validates, and iterates until Section 508 verification passes or the document is flagged for human review.

This iterative, clause-level approach is what pushes compliance from ~90% to 99%+. Instead of applying broad fixes that may introduce new issues, each iteration addresses exactly the violations that remain — surgical precision at scale.
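As a rough sketch of what clause-level iteration could look like, the Python below groups remaining failures by clause, applies a fixer only where one exists for that clause, and re-validates until the document is clean or no further progress is made. Everything here is an assumption for illustration: the failure records, the clause labels, the `fixers` mapping, and the stopping rule are hypothetical and are not taken from the product.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Document = dict
Failure = dict  # hypothetical record, e.g. {"clause": "7.1", "id": 1}
Fixer = Callable[[Document, List[Failure]], Document]


def group_by_clause(failures: List[Failure]) -> Dict[str, List[Failure]]:
    """Bucket failures by the clause they violate."""
    grouped: Dict[str, List[Failure]] = defaultdict(list)
    for failure in failures:
        grouped[failure["clause"]].append(failure)
    return dict(grouped)


def refine(
    doc: Document,
    validate: Callable[[Document], List[Failure]],
    fixers: Dict[str, Fixer],
    max_rounds: int = 5,
) -> Tuple[Document, str, List[Failure]]:
    """Fix only the clauses that still fail; stop when clean or stalled."""
    failures = validate(doc)
    for _ in range(max_rounds):
        if not failures:
            break
        for clause, items in group_by_clause(failures).items():
            fixer = fixers.get(clause)
            if fixer is not None:
                doc = fixer(doc, items)
        after = validate(doc)
        stalled = len(after) >= len(failures)
        failures = after
        if stalled:
            break  # no improvement this round: flag for human review
    status = "passed" if not failures else "needs_review"
    return doc, status, failures


def drop_clause(clause: str) -> Fixer:
    """Toy fixer that resolves every failure reported for one clause."""
    def fixer(doc: Document, items: List[Failure]) -> Document:
        kept = [f for f in doc["failures"] if f["clause"] != clause]
        return {**doc, "failures": kept}
    return fixer


def toy_validate(doc: Document) -> List[Failure]:
    return list(doc["failures"])


sample = {"failures": [
    {"clause": "7.1", "id": 1},
    {"clause": "7.2", "id": 2},
    {"clause": "7.18", "id": 3},
]}
fixers = {"7.1": drop_clause("7.1"), "7.2": drop_clause("7.2")}
_, status, left = refine(sample, toy_validate, fixers)
print(status, left)
```

Because each round touches only the clauses that still fail, a fix for one violation is less likely to disturb parts of the document that already pass.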

Want to understand the specific checks we validate against? Our Remediation Guide covers all 28 WCAG checks, the Matterhorn Protocol's 136 failure conditions, and why most PDF checkers miss critical issues.

What You Get Back

.zip Archive or Individual Downloads

Every verified PDF is available in a single .zip archive or through individual file links. Each is a drop-in replacement: same appearance, now with Section 508 validation evidence.

Compliance Verification Reports

Each PDF includes a validation report showing pass/fail against all 104 Matterhorn Protocol rules — documentation for auditors, legal, and procurement.

Ongoing Monitoring

Weekly rescans detect new PDFs and regressions automatically. New documents enter the same pipeline — compliance is maintained, not just achieved once.
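One simple way to picture the rescan comparison is a diff between two scan snapshots. The sketch below assumes each scan is a mapping from PDF URL to a pass/fail flag; that data shape, the `ScanDiff` type, and the example URLs are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, NamedTuple


class ScanDiff(NamedTuple):
    new: List[str]         # present now, absent from the previous scan
    regressed: List[str]   # passed in the previous scan, fails now


def diff_scans(previous: Dict[str, bool], current: Dict[str, bool]) -> ScanDiff:
    """Compare two scans that map a PDF URL to whether it passed validation."""
    new = sorted(url for url in current if url not in previous)
    regressed = sorted(
        url for url, passed in current.items()
        if not passed and previous.get(url) is True
    )
    return ScanDiff(new, regressed)


last_week = {"https://example.com/a.pdf": True, "https://example.com/b.pdf": True}
this_week = {
    "https://example.com/a.pdf": True,
    "https://example.com/b.pdf": False,   # was passing, now failing
    "https://example.com/c.pdf": False,   # newly discovered
}
diff = diff_scans(last_week, this_week)
print(diff.new)
print(diff.regressed)
```

Both lists would then be queued for the same remediation pipeline as any other document.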

All six strategies — including OCR, AI vision analysis, and full page rendering — are built into one platform. The Agent uses deeper strategies only when validation evidence shows they are useful.

Proven at Scale

2,400+ PDFs fixed and verified

7,000+ PDFs discovered & scanned

99%+ Section 508 compliance

Every fix is verified against veraPDF, the same validation engine used by the Library of Congress and European national archives.

Ready to Make Your PDFs Accessible?

Start a free scan and see how many of your PDFs need remediation.
