CushVlog Transcription Workflow & Metadata Guidelines
Table of Contents
- Purpose and Scope
- Step 1: Transcription
- Step 2: Sectioning and Subheads
- Step 3: Metadata and Dublin Core
- Step 4: Data Entry in Google Sheets
- Best Practices
- Examples
---
Purpose and Scope
Purpose: This project aims to transcribe, archive, and tag the CushVlogs, with the goal of creating a series of essays to be published in book form. Each member will be assigned a handful of CushVlogs; quality matters as much as quantity.
Workflow: For each CushVlog assigned, the volunteer must:
1. Transcribe the vlog in full. Any tool you like is fine (Google Docs, Microsoft Word, etc.), so long as the result can be saved as a PDF and uploaded to Google Drive. Proper grammar and sentence structure matter less than getting all the words Matt says in order, but if you can, try to make the text clearly readable for a potential editor.
2. Create subheads based on subject matter, each with a timestamp for the start of its topic (see Step 2: Sectioning and Subheads).
3. Create metadata following Dublin Core guidelines and upload the transcription to Google Drive (see Step 3: Metadata and Dublin Core for the required fields).
4. Enter the same metadata in Google Sheets, then mark your work as “done” (see Step 4: Data Entry in Google Sheets).
Step 1: Transcription
Transcribe the spoken content of each CushVlog in full. Capture every word Matt says, in order, then save the transcript as a PDF (or another accessible format) and upload it to Google Drive. Completeness comes first; grammatical polish is secondary but welcome where feasible.
Step 2: Sectioning and Subheads
Because the book will be organized into essays, divide each finished transcription into sections by topic. Give each section a subhead that names its topic and includes the timestamp at which that topic begins. Where each “section” begins and ends is up to the transcriber; use the controlled vocabulary for subhead terms whenever possible.
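As an illustration of a timestamped subhead, here is a minimal Python sketch. The HH:MM:SS layout, the "[timestamp] Topic" format, and both function names are assumptions made for this example; the guidelines only require that each subhead carry the start time of its topic.

```python
def format_timestamp(total_seconds: int) -> str:
    """Render a non-negative offset in seconds as HH:MM:SS."""
    if total_seconds < 0:
        raise ValueError("timestamp cannot be negative")
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def make_subhead(topic: str, start_seconds: int) -> str:
    """Build a subhead line: the section's start time, then its topic."""
    return f"[{format_timestamp(start_seconds)}] {topic.strip()}"


# Placeholder topic; a real subhead would use a controlled-vocabulary term.
print(make_subhead("Example Topic", 754))  # [00:12:34] Example Topic
```

If the example document uses a different timestamp style, follow that style instead.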
Step 3: Metadata and Dublin Core
When the transcription is ready, upload the document to the project's Google Drive folder and write its metadata following Dublin Core guidelines. The metadata should include:
- Title
- Subjects (a list of one- or two-word subjects Matt touches on; these will overlap with the sections but need not match them 1:1. Consult the controlled vocabulary for these).
- Description (a brief description of the vlog, under 200 words, with timestamps included).
- Intro song(s) (write N/A if none).
- Date (be sure it's the date of the vlog itself, not when it was uploaded to YouTube!)
- Source (just a URL is fine).
- Transcriber (your name/handle).
- Extra Notes (anything miscellaneous, such as “Matt’s mic cuts out,” “most of this is interactions with chat,” etc.; write N/A if nothing applies).
Include this metadata both in the document and on the Google Sheet. Subjects should conform to the controlled vocabulary, which will be established as the project goes on.
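The field list above can be sketched as a small Python record with a few sanity checks. This is only an illustration: the class name, attribute names, and `check_metadata` helper are hypothetical, the rule that at least one subject is expected is an assumption, and the vocabulary is passed in as a plain set because the controlled vocabulary is still being established.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class VlogMetadata:
    """One metadata record; fields mirror the list in these guidelines."""
    title: str
    subjects: list[str]
    description: str
    vlog_date: date      # date of the vlog itself, not the YouTube upload
    source: str          # a URL
    transcriber: str
    intro_songs: str = "N/A"
    extra_notes: str = "N/A"


def check_metadata(meta: VlogMetadata, vocabulary: set[str]) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    for name in ("title", "description", "source", "transcriber"):
        if not getattr(meta, name).strip():
            problems.append(f"{name} is empty")
    if len(meta.description.split()) >= 200:
        problems.append("description should be under 200 words")
    if not meta.source.startswith(("http://", "https://")):
        problems.append("source should be a URL")
    if not meta.subjects:
        # Assumption: every vlog gets at least one subject.
        problems.append("at least one subject is expected")
    for subject in meta.subjects:
        if subject not in vocabulary:
            problems.append(f"subject not in controlled vocabulary: {subject}")
    return problems
```

A subject flagged here is not necessarily wrong; since the vocabulary grows with the project, it may simply be a term that still needs to be proposed.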
Step 4: Data Entry in Google Sheets
For each CushVlog, open the project Google Sheet and enter the same metadata there. Then mark your work as “done.”
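For anyone who prefers to prepare a row before pasting it into the sheet, the following Python sketch builds one tab-separated line, which spreadsheet applications typically split into cells on paste. The column order, the trailing status cell, and the `to_sheet_row` name are all assumptions; check them against the actual sheet before relying on this.

```python
import csv
import io

# Hypothetical column order; match it to the actual sheet before use.
COLUMNS = ["Title", "Subjects", "Description", "Intro song(s)", "Date",
           "Source", "Transcriber", "Extra Notes"]
OPTIONAL = {"Intro song(s)", "Extra Notes"}  # these fall back to N/A


def to_sheet_row(metadata: dict, status: str = "done") -> str:
    """Build one tab-separated line: the metadata columns, then a status cell."""
    cells = []
    for column in COLUMNS:
        if column in OPTIONAL:
            value = metadata.get(column) or "N/A"
        else:
            value = metadata[column]  # KeyError flags a missing required field
        if isinstance(value, (list, tuple)):
            value = "; ".join(value)  # several subjects share one cell
        cells.append(str(value))
    cells.append(status)
    buffer = io.StringIO()
    csv.writer(buffer, delimiter="\t", lineterminator="\n").writerow(cells)
    return buffer.getvalue().rstrip("\n")
```

Typing the values directly into the sheet is equally valid; the sketch only shows how the document metadata maps onto one row.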
Best Practices
- Err on the side of being “too accurate.” That said, cutting uhms, uhhs, and stammers is fine, as is slightly condensing or trimming sentences where Matt has a false start or interrupts himself.
- If you can’t make out a word, put your best guess in brackets with a question mark, like [this?].
- Break chunks into paragraphs if possible.
- You may use AI tools to help with transcription, but we want a “human touch” regardless. If you do use AI, go back and check that the transcription is accurate, legible, and readable: fix errors, make complete sentences, and use human judgement.
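Two of the practices above lend themselves to a rough first pass in code: stripping filler words and listing the bracketed guesses that still need a second listen. The Python sketch below is an assumption-laden helper, not part of the required workflow. The filler pattern covers only "uh"/"um" variants, and the output still needs a human read, since removing a filler can leave stray punctuation behind.

```python
import re

# Matches "uh", "uhh", "um", "umm", "uhm" and similar, plus a trailing comma.
FILLER = re.compile(r"\b(?:u+h*m+|u+h+)\b,?\s*", flags=re.IGNORECASE)

# Matches bracketed guesses such as [this?] and captures the guessed word.
UNCERTAIN = re.compile(r"\[([^\[\]]+)\?\]")


def strip_fillers(text: str) -> str:
    """Remove simple filler words; everything else is left untouched."""
    return FILLER.sub("", text)


def find_uncertain(text: str) -> list[str]:
    """List every bracketed guess so it can be re-checked against the audio."""
    return UNCERTAIN.findall(text)


print(strip_fillers("So, uh, the thing is, um, complicated."))
# So, the thing is, complicated.
print(find_uncertain("He mentions [this?] and later [that?]."))
# ['this', 'that']
```

Stammers, false starts, and paragraph breaks are judgement calls and are deliberately left to the transcriber.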
Example:
https://docs.google.com/document/d/1xqdBRgagPwD7GjvdPgqRgtF1pukr5FLiT5oQYvaBWyc/edit?tab=t.0
Blank Doc to Copy:
https://docs.google.com/document/d/1Hmov1Uej1mKi06PgfpG6Lr_tRXjeSmf7IUWpaLXazCw/edit?tab=t.0