pdf_access_scanner

Andrew Weymouth

Code

Winter 2025

Appears in Data Repository

Abstract

accessibility

Data Analysis

Database Management

A Python tool for batch-surveying PDF files for missing alt text fields, title metadata and tagging structure, and for incorrectly ordered headers. The script was created so that PDF files in the University of Idaho's VERSO institutional repository can be assessed more transparently for the remediation work needed to meet WCAG 2.1 standards. Folders are surveyed using the pikepdf library, and results are written to a CSV that lists each filename alongside a pass or fail judgement for each of the above measures. While meeting WCAG 2.1 involves more complex qualifications, this tool provides a good starting point for understanding a collection's accessibility benchmarks holistically, as opposed to approaching remediation linearly on a file-by-file basis.
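
The workflow described in the abstract could be sketched roughly as follows. This is not the repository's actual code: the function names, CSV column names and pass/fail rules are all hypothetical, chosen only to illustrate the four checks. It assumes that a file without a structure tree fails the alt text and header checks, and that headers must start at H1 and never skip a level. Custom tags mapped through a role map are not resolved.

```python
import csv
from pathlib import Path

# Hypothetical column names; the real tool's CSV layout may differ.
FIELDS = ["filename", "title", "tagged", "alt_text", "heading_order"]


def headings_in_order(levels):
    """Return True if header levels start at 1 and never skip a level going down."""
    previous = 0
    for level in levels:
        if level > previous + 1:  # e.g. H1 followed directly by H3
            return False
        previous = level
    return True


def walk_structure(node, figures, levels):
    """Collect /Figure elements and H1-H6 levels by following /K (kids) entries."""
    import pikepdf  # third-party; imported lazily so the pure helpers need no install

    if isinstance(node, pikepdf.Array):
        for kid in node:
            walk_structure(kid, figures, levels)
    elif isinstance(node, pikepdf.Dictionary):
        tag = str(node.get("/S", ""))  # structure type, e.g. "/Figure" or "/H2"
        if tag == "/Figure":
            figures.append(node)
        elif len(tag) == 3 and tag.startswith("/H") and tag[2].isdigit():
            levels.append(int(tag[2]))
        if "/K" in node:
            walk_structure(node["/K"], figures, levels)
    # Integers (marked-content IDs) and other leaf objects are ignored.


def scan_pdf(path):
    """Run the four checks on one file and return a row of booleans."""
    import pikepdf

    with pikepdf.open(path) as pdf:
        title = str(pdf.docinfo.get("/Title", "")).strip()
        tagged = "/StructTreeRoot" in pdf.Root
        figures, levels = [], []
        if tagged:
            walk_structure(pdf.Root["/StructTreeRoot"], figures, levels)
        # Assumption: untagged files cannot be verified, so they fail both checks.
        alt_ok = tagged and all("/Alt" in fig for fig in figures)
        order_ok = tagged and headings_in_order(levels)
    return {
        "filename": Path(path).name,
        "title": bool(title),
        "tagged": tagged,
        "alt_text": alt_ok,
        "heading_order": order_ok,
    }


def write_report(rows, csv_path):
    """Write one CSV line per file, turning booleans into pass/fail."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        for row in rows:
            out = {"filename": row["filename"]}
            for key in FIELDS[1:]:
                out[key] = "pass" if row[key] else "fail"
            writer.writerow(out)


def scan_folder(folder, csv_path):
    """Survey every PDF directly inside `folder` and write the report."""
    rows = [scan_pdf(p) for p in sorted(Path(folder).glob("*.pdf"))]
    write_report(rows, csv_path)
```

Under these assumptions, calling `scan_folder("pdfs", "report.csv")` with placeholder paths would produce one row per file. Keeping the header-order rule in its own function makes it easy to swap in a stricter or looser policy.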

Files and links (1)

url

https://github.com/Scholarly-Projects/pdf_access_scanner

Details

Title: pdf_access_scanner
Creators: Andrew Weymouth - University of Idaho, University of Idaho Library
Identifiers: 996944134201851
Academic Unit: University of Idaho Library
Language: English
Resource Type: Code