HiPSTAS NEH R&D Project: Automated Extraction of Metadata From Large Collections.
For over a decade, organizations with legacy audio holdings have focused their resources on digitization. Although hundreds of thousands of hours have been digitized, searching audio content has been limited largely to text-based description, which greatly restricts discovery. To address this, the High Performance Sound Technologies for Analysis and Scholarship (HiPSTAS) project is applying advanced computational techniques, such as spectral analysis and machine learning, to expand opportunities for discovery and research insights across audio collections. This presentation explores the first results of the HiPSTAS project as applied to two bodies of material. The first is the University of Texas Folklore Center Archives, containing collections from John and Alan Lomax, in which HiPSTAS enables discovery based on genre. The second is PennSound, containing poetry read by Allen Ginsberg, Robert Creeley, Cecilia Vicuña, and many others, in which HiPSTAS enables discovery based on date, speaker, and venue.
Tanya Clement is an Assistant Professor in the School of Information at the University of Texas at Austin. Presented at DAS: New York on May 8, 2015.
An Interview with Tanya Clement
Tanya Clement discusses her DAS: New York presentation on the HiPSTAS NEH R&D Project. Interviewed at AMIA's DAS: New York on May 8, 2015 by Snowden Becker, UCLA Moving Image Archiving Studies.