Six Strategies. One Goal:
Every PDF Accessible.
One platform with six built-in fix strategies. OCR, AI vision, structure rebuilds — everything is included. The system automatically selects the right approach for each document and escalates until Section 508 verification passes or review is required.
The 6-Stage Remediation Pipeline
Quick Fix
Repairs document metadata, language tags, structure tree, and PDF/UA identifiers. Handles the most common compliance failures — missing titles, incorrect role mappings, broken tag hierarchies — without altering page content.
OCR Remediation
Extracts text from scanned or image-heavy PDFs using optical character recognition, then rebuilds the document with a proper structure tree, tagged content, and reading order. Turns flat images into searchable, screen-reader-accessible documents.
Vision AI Analysis
When structural repairs aren’t enough, a vision model analyzes each page visually — understanding headings, tables, lists, and reading flow from the rendered layout. Generates a semantic structure informed by what a human reader would see, not just what the file’s byte stream contains.
Structure Rebuild
For documents with deeply corrupted tag trees, the structure is linearized and rebuilt from scratch. Content is re-extracted, re-tagged, and re-validated, preserving the original appearance while replacing the accessibility layer entirely.
Strip & Rebuild
The most aggressive structural fix. All existing markup, annotations, and metadata are stripped, leaving only the raw content streams. The document is then rebuilt with a clean structure tree, proper artifact wrapping, and Section 508 validation evidence. The result is effectively a fresh accessibility pass on the original visual content.
Render & Rebuild
The last resort for documents that resist all other strategies. Each page is rendered to a high-fidelity image, then OCR and vision AI reconstruct the document from pixels up — new text layer, new structure, new metadata. The visual appearance is preserved pixel-for-pixel.
Adaptive Escalation, Not Brute Force
The system validates after every stage using the same 104-rule Matterhorn Protocol engine used by national archives. If a fix doesn’t satisfy the Section 508 gate, the next useful strategy takes over — each one more powerful than the last.
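The validate-then-escalate loop described above can be sketched in a few lines of Python. This is an illustration only, not the platform's actual code: the `Outcome` type, the `remediate` function, and the toy validator and strategies are hypothetical names, and the sketch assumes each strategy starts from the original file, which the description above does not specify.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical stand-ins: a "document" is a plain dict, and a validator
# returns the list of checks that still fail (empty list = gate passed).
Document = dict
Strategy = Callable[[Document], Document]
Validator = Callable[[Document], List[str]]


@dataclass
class Outcome:
    document: Document
    passed: bool
    strategy_used: Optional[str]  # stage that produced the pass, if any
    remaining_failures: List[str]


def remediate(original: Document,
              strategies: List[Tuple[str, Strategy]],
              validate: Validator) -> Outcome:
    """Try strategies in order, validating after every stage."""
    failures = validate(original)
    if not failures:
        return Outcome(original, True, None, [])
    candidate = original
    for name, apply_fix in strategies:
        # Assumption: every stage works from the original file.
        candidate = apply_fix(original)
        failures = validate(candidate)
        if not failures:
            return Outcome(candidate, True, name, [])
    # No stage satisfied the gate: hand the document to human review.
    return Outcome(candidate, False, None, failures)


# Toy validator and two toy stages, purely for demonstration.
def toy_validate(doc: Document) -> List[str]:
    failures = []
    if not doc.get("title"):
        failures.append("missing-title")
    if not doc.get("text_layer"):
        failures.append("no-text-layer")
    return failures


def quick_fix(doc: Document) -> Document:
    return {**doc, "title": doc.get("title") or "Untitled"}


def ocr(doc: Document) -> Document:
    return {**doc, "title": doc.get("title") or "Untitled", "text_layer": True}


result = remediate({"title": "", "text_layer": False},
                   [("quick_fix", quick_fix), ("ocr", ocr)],
                   toy_validate)
print(result.passed, result.strategy_used)  # prints: True ocr
```

In the toy run, the first stage repairs the title but leaves the missing text layer, so the loop escalates to the second stage, which passes the gate.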
Clause-Targeted Refinement
After initial remediation, remaining failures are analyzed by specific PDF/UA clause. The system generates targeted fixes for each individual violation, re-validates, and iterates until Section 508 verification passes or the document is flagged for human review.
This iterative, clause-level approach is what pushes compliance from roughly 90% to above 99%. Instead of applying broad fixes that may introduce new issues, each iteration addresses only the violations that remain. Surgical precision, at scale.
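A minimal sketch of that refinement loop, under stated assumptions: the `refine` function, the clause labels, and the fixers are all hypothetical placeholders rather than real PDF/UA clause numbers or the product's API. It shows the shape of the iteration, which is to fix only what still fails, re-validate, and stop when the gate passes or no targeted fix remains.

```python
from typing import Callable, Dict, List, Tuple

Document = dict
Fixer = Callable[[Document], Document]
Validator = Callable[[Document], List[str]]


def refine(doc: Document,
           validate: Validator,
           fixers: Dict[str, Fixer],
           max_rounds: int = 5) -> Tuple[Document, List[str]]:
    """Return (document, remaining clauses); an empty list means the gate passed."""
    remaining = validate(doc)
    for _ in range(max_rounds):
        if not remaining:
            break  # gate passed
        fixable = [clause for clause in remaining if clause in fixers]
        if not fixable:
            break  # no targeted fix left to try: flag for human review
        for clause in fixable:
            doc = fixers[clause](doc)
        after = validate(doc)
        made_progress = set(after) != set(remaining)
        remaining = after
        if not made_progress:
            break  # fixes changed nothing: stop instead of looping
    return doc, remaining


# Toy validator keyed by illustrative clause labels.
def toy_validate(doc: Document) -> List[str]:
    failing = []
    if not doc.get("lang"):
        failing.append("document-language")
    if doc.get("figures_without_alt", 0) > 0:
        failing.append("figure-alt-text")
    if not doc.get("fonts_embedded"):
        failing.append("font-embedding")
    return failing


fixers: Dict[str, Fixer] = {
    "document-language": lambda d: {**d, "lang": "en-US"},
    "figure-alt-text": lambda d: {**d, "figures_without_alt": 0},
}

doc, remaining = refine(
    {"lang": "", "figures_without_alt": 3, "fonts_embedded": False},
    toy_validate,
    fixers,
)
print(remaining)  # prints: ['font-embedding']
```

Here two violations are repaired in the first round, and the third has no targeted fixer, so the loop ends with one clause left over, which is the case that would be flagged for human review.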
What You Get Back
.zip Archive or Individual Downloads
Every verified PDF is available in a single .zip download or through individual file links. Each is a drop-in replacement: same appearance, now with Section 508 validation evidence.
Compliance Verification Reports
Each PDF includes a validation report showing pass/fail against all 104 Matterhorn Protocol rules — documentation for auditors, legal, and procurement.
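To make the idea of a per-file pass/fail report concrete, here is a small sketch of how per-rule results could be rolled up into a summary. The field names, the `RuleResult` type, and the rule IDs are invented for illustration; the real report format is not described here.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass
class RuleResult:
    rule_id: str      # placeholder ID, not a real Matterhorn identifier
    passed: bool
    detail: str = ""


def build_report(filename: str, results: List[RuleResult]) -> Dict:
    """Summarize per-rule results into one pass/fail record for a file."""
    failed = [r for r in results if not r.passed]
    return {
        "file": filename,
        "rules_checked": len(results),
        "rules_passed": len(results) - len(failed),
        "rules_failed": len(failed),
        "verdict": "pass" if not failed else "fail",
        "failures": [asdict(r) for r in failed],
    }


report = build_report("handbook.pdf", [
    RuleResult("rule-001", True),
    RuleResult("rule-002", False, "figure on page 3 has no alternate text"),
    RuleResult("rule-003", True),
])
print(json.dumps(report, indent=2))
```

A record like this lists every failing rule with its detail, so a reviewer can see why a file did not pass without rerunning the validator.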
Ongoing Monitoring
Weekly rescans detect new PDFs and regressions automatically. New documents enter the same pipeline — compliance is maintained, not just achieved once.
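One way to picture the rescan step is as a comparison of two snapshots. This is a sketch under assumptions: the `ScanEntry` type, the `diff_scans` function, and the sample paths are hypothetical. A file counts as new if it was absent last week, and as a regression if it passed last week and fails now.

```python
from typing import Dict, List, NamedTuple


class ScanEntry(NamedTuple):
    content_hash: str  # hash of the PDF bytes at scan time
    passed: bool       # did the file pass validation in that scan?


def diff_scans(previous: Dict[str, ScanEntry],
               current: Dict[str, ScanEntry]) -> Dict[str, List[str]]:
    """Find files that are new, or that passed before and fail now."""
    new_files = sorted(url for url in current if url not in previous)
    regressions = sorted(
        url for url, entry in current.items()
        if url in previous and previous[url].passed and not entry.passed
    )
    return {"new": new_files, "regressions": regressions}


last_week = {
    "/docs/handbook.pdf": ScanEntry("a1", True),
    "/docs/policy.pdf": ScanEntry("b2", True),
}
this_week = {
    "/docs/handbook.pdf": ScanEntry("a1", True),
    "/docs/policy.pdf": ScanEntry("c3", False),   # replaced, now failing
    "/docs/minutes.pdf": ScanEntry("d4", False),  # newly published
}

queue = diff_scans(last_week, this_week)
print(queue)
# prints: {'new': ['/docs/minutes.pdf'], 'regressions': ['/docs/policy.pdf']}
```

Both lists would then feed back into the same remediation pipeline as any first-time upload.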
All six strategies, including OCR, AI vision analysis, and full page rendering, are built into one platform. The system uses deeper strategies only when validation evidence shows they are useful.
Proven at Scale
Every fix is verified with veraPDF, the same validation engine used by the Library of Congress and European national archives.
Ready to Make Your PDFs Accessible?
Start a free scan and see how many of your PDFs need remediation.