Transparent Practices: OCR and AI in the Archives

Rebecca Hastings; Andrew Weymouth

doi:10.1177/15501906261439241

Journal article

Open access

Peer reviewed

Collections (Walnut Creek, Calif.), Vol.22(2), pp.130-152

06/01/2026

DOI: https://doi.org/10.1177/15501906261439241

Appears in Artificial Intelligence and Machine Learning Research

Keywords: optical character recognition; archival ethics; digital preservation; artificial intelligence; accessibility; digital stewardship; sustainable digital practices; Computer Vision

Abstract

This paper examines optical character recognition (OCR) through the lens of archival ethics, as outlined in the Society of American Archivists (SAA) Core Values Statement and Code of Ethics, in light of current debates surrounding artificial intelligence (AI). A literature review highlights persistent challenges of authenticity and integrity, transparency and accountability, access and equity, and responsible stewardship and sustainability, as well as new concerns about bias, sustainability, and accountability raised by the use of large language models (LLMs). A case study describes systematic testing of LLM, transformer model (TM), and neural network (NN) architectures and examines the challenges of creating a reliable, scalable in-house OCR tool named Opticolumn. The case study finds that NN approaches align better with archival ethics than LLM tools, which may generate fabrications, but that the choice of OCR tool will depend on the capacities and preferences of individual institutions.

Details

Title: Transparent Practices: OCR and AI in the Archives
Creators: Rebecca Hastings (University of Idaho); Andrew Weymouth (University of Idaho)
Publication Details: Collections (Walnut Creek, Calif.), Vol.22(2), pp.130-152
Publisher: Sage
Identifiers: 996944134401851
Academic Unit: University of Idaho Library
Language: English
Resource Type: Journal article