In a previous post on general PDF accessibility, we talked about the spectrum of PDFs and why starting with an accessible source file matters. This post focuses on one specific and especially challenging type of PDF: the scanned image PDF.
A scanned image PDF is a picture of a document saved as a PDF. These usually come from physical books, articles, or handwritten documents placed on a scanner or copier. While these PDFs may look fine on screen, they are often completely inaccessible to screen readers and other assistive technologies.
Scanned image PDFs are the hardest type of PDF to remediate because they start with no underlying structure. There is no real text, no headings, no reading order, and no tags. Everything that makes a PDF accessible must be added after the fact.
Adding these features requires either an automated tool, such as Panorama, or advanced manual remediation skills. Automated tools can help, but they have real limitations. The more complex the layout, such as multiple columns, tables, or footnotes, the harder it is for automated tools to make accurate guesses. Manual PDF remediation is possible, but it is a specialized skill that takes time and practice to develop.
We continue to work on obtaining better tools and creating better processes to help faculty obtain fully accessible PDFs. In the meantime, scanned PDFs used in courses should meet these minimum expectations:
- OCR (optical character recognition) so the text can be read, searched, and selected
- Basic tags so assistive technology can understand the structure and reading order
- Alternative text for images, graphs, diagrams, etc. within the PDF so important visuals are accessible
These minimums alone are not sufficient to reach full accessibility, but as we continue to improve in this area, these are steps you can take now to improve accessibility. Faculty can find instructions on how to generate OCR, tags, and alt text in the Accessible PDFs guide.
As a reminder, before using a scanned PDF, faculty should first try to locate a more accessible version of the content, such as a digital article from the library, a publisher-provided PDF, or alternative materials that are already accessible. Simpson Library staff will also work with faculty who are trying to find more accessible versions of their course materials. Please reach out to them at umwlibaries@umw.edu.
Best Practices for Scanning
One place where accessibility can be improved from the start is at the point of scanning. The library’s public scanners include OCR capabilities and other tools for creating a clean scan, which makes them a much better choice than a departmental copier or a mobile scanning app.
When creating a new scan we encourage faculty to:
- Follow the library’s best practices guide for creating scanned materials. The guide covers general best practices as well as instructions for using the library’s scanners.
- If the material is already owned by the Library, make a digitization request and they will scan it for you.
Get Support
If you have questions about creating accessible scanned image PDFs, you can:
- Book an Accessibility Consultation
- Send an e-mail to t2access@umw.edu
