pdf_access_scanner
Code

Andrew Weymouth
Winter 2025
Appears in Data Repository

Abstract

Accessibility, Data Analysis, Database Management

A Python tool for batch-surveying PDF files for missing alt text, title metadata, tagging structure, and correctly ordered headings. The script was created so that PDF files in the University of Idaho's VERSO institutional repository can be assessed more transparently for the remediation work needed to meet WCAG 2.1 standards. Folders are surveyed using the pikepdf library, and results are written to a CSV file that lists each filename alongside a pass or fail judgment for each of the above measures. Although meeting WCAG 2.1 involves more complex requirements, this tool provides a starting point for understanding a collection's accessibility baseline holistically, rather than approaching remediation linearly, file by file.
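The survey described above might look roughly like the following. This is a minimal sketch, not the repository's actual code: the function names, the CSV column names, and the specific pass/fail rules are all assumptions made for illustration. It assumes the title lives in the document Info dictionary (not XMP), that a file counts as tagged when the catalog has a /StructTreeRoot, that alt text is checked only on /Figure elements, and that headings use the standard /H1 to /H6 tags with no role mapping.

```python
import csv
import re
from pathlib import Path

# Hypothetical column names for the four measures named in the abstract.
CHECKS = ["title", "tagged", "alt_text", "heading_order"]


def headings_in_order(levels):
    """True if headings start at level 1 and never skip a level going down.

    An empty list passes, since there is nothing out of order.
    """
    previous = 0
    for level in levels:
        if level > previous + 1:
            return False
        previous = level
    return True


def walk_structure(root):
    """Yield dictionary nodes reachable through /K, in document order.

    Assumes the structure tree is acyclic; only /K links are followed.
    """
    import pikepdf
    stack = [root]
    while stack:
        node = stack.pop()
        if isinstance(node, pikepdf.Array):
            stack.extend(reversed(list(node)))
        elif isinstance(node, pikepdf.Dictionary):
            yield node
            if "/K" in node:
                stack.append(node["/K"])


def survey_pdf(path):
    """Return a dict of booleans, one per entry in CHECKS, for one PDF."""
    import pikepdf  # third-party; imported lazily so the helpers run without it
    with pikepdf.open(path) as pdf:
        title = str(pdf.docinfo.get("/Title", "")).strip()
        tagged = "/StructTreeRoot" in pdf.Root
        figures_ok, levels = True, []
        if tagged:
            for node in walk_structure(pdf.Root["/StructTreeRoot"]):
                tag = str(node.get("/S", ""))
                if tag == "/Figure" and "/Alt" not in node:
                    figures_ok = False
                match = re.fullmatch(r"/H([1-6])", tag)
                if match:
                    levels.append(int(match.group(1)))
    return {
        "title": bool(title),
        "tagged": tagged,
        "alt_text": tagged and figures_ok,
        "heading_order": tagged and headings_in_order(levels),
    }


def write_report(rows, csv_path):
    """Write one CSV row per file with pass/fail for each check."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["filename"] + CHECKS)
        writer.writeheader()
        for row in rows:
            verdicts = {c: "pass" if row[c] else "fail" for c in CHECKS}
            writer.writerow({"filename": row["filename"], **verdicts})


def survey_folder(folder, csv_path):
    """Survey every .pdf directly inside folder and write the report."""
    rows = [{"filename": p.name, **survey_pdf(p)}
            for p in sorted(Path(folder).glob("*.pdf"))]
    write_report(rows, csv_path)
```

Calling `survey_folder("pdfs", "report.csv")` (both paths hypothetical) would produce one row per file, which can then be sorted or filtered to see which measures fail most often across a collection.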

url
https://github.com/Scholarly-Projects/pdf_access_scanner

