Transcription of British Museum Audio outputs

Transcribing the British Museum's audio archive from SoundCloud

The British Museum runs a very active programme of lectures, podcasts and other primarily audio activities, which are usually published to our SoundCloud profile.

These recordings have not always been transcribed. This application will allow us to complete that task and open up the archive to scholarly research, to more accessible formats, and to serendipitous uses that we may not have thought of. Uses we have thought of include text mining, topic models, and searchable PDF or RTF files.

To achieve this, each episode was downloaded using the SoundScrape Python package and then chopped into 10-second chunks using PyDub. The chunks are made available here, and we would like you to listen to each clip and attempt to capture what is being said (and, if possible, by whom).
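
For the curious, the chunking step can be sketched roughly as follows. This is a minimal illustration rather than the script actually used for this project: it assumes an episode has already been downloaded as an MP3 file, and the function names, the output folder and the clip file-naming scheme are all hypothetical. PyDub measures audio length in milliseconds and lets a recording be sliced like a list, which is all the sketch relies on.

```python
from pathlib import Path

CHUNK_MS = 10_000  # 10 seconds, expressed in milliseconds


def chunk_bounds(total_ms, chunk_ms=CHUNK_MS):
    """Return (start, end) millisecond offsets that cover the whole recording."""
    return [
        (start, min(start + chunk_ms, total_ms))
        for start in range(0, total_ms, chunk_ms)
    ]


def split_episode(mp3_path, out_dir, chunk_ms=CHUNK_MS):
    """Write consecutive clips of one episode to out_dir and return their paths.

    The naming scheme (<episode>_0000.mp3, <episode>_0001.mp3, ...) is an
    assumption made for this example only.
    """
    # Imported here so that chunk_bounds can be used without PyDub installed.
    from pydub import AudioSegment

    audio = AudioSegment.from_file(mp3_path, format="mp3")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(mp3_path).stem

    paths = []
    for i, (start, end) in enumerate(chunk_bounds(len(audio), chunk_ms)):
        clip_path = out_dir / f"{stem}_{i:04d}.mp3"
        # Slicing an AudioSegment by milliseconds returns a new AudioSegment.
        audio[start:end].export(str(clip_path), format="mp3")
        paths.append(clip_path)
    return paths
```

Calling `split_episode("episode.mp3", "clips")` on a 25-second recording would produce three clips, the last one only 5 seconds long, which is why the final clip of an episode may be shorter than the others. Note that PyDub needs ffmpeg (or libav) installed to read and write MP3 files.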

It is a very simple application!

These podcasts and associated data are provided under a CC-BY-NC-SA licence. This project uses the excellent WaveSurfer.js package to render our MP3 content for listening.

UCL Logo
University of Cambridge Museums logo
University of Stirling logo
British Museum logo
AHRC logo