For the last two weeks, I have been trying out online tools that I may use for my research project, and in this post I’m going to discuss my two favorites.
When I interviewed for this post-bac position, I talked a lot about my frustration with the lack of transcription tools available online while I was working on the Digital Life Stories Archive for Regina Martin. My struggles with voice-to-text software were a learning experience. I pirated old versions of Premiere in order to use its now-abandoned transcription tool, only to find out it no longer functioned, even on older versions. I transcribed an hour-long interview by hand, which took me nearly six hours. And I spent multiple hours online searching for any tool, free or not, that would cut down on transcription time. We laughed during my interview about the fact that my expectations were too high: I knew Apple and Google had created voice-to-text programs for Siri and Google Assistant, and I expected there to be a “magical” solution to this transcription issue on Regina’s project.
Well, laugh no longer, because there is a magical solution out there called Trint. This is an online program that you have to pay for (the rate is $12/hour right now). I conducted a practice interview with my brother, Spencer, to feed into Trint, and honestly, I held off trying it because I was so sure I would be disappointed in the results. I’m writing this now to tell you I most certainly was not disappointed. Trint transcribed a half-hour interview in about 2 minutes. The audio was decent but not high-quality by any means, because we conducted the interview over Zoom and Spencer’s internet was a little spotty. The results were incredible. I didn’t think it would function half as well as it did.
I don’t mean to suggest that Trint was perfect. Certainly, there were some words that Trint mixed up. For example, Spencer and I have Midwestern accents and we tend to slur words together, so depending on the context, we might say “a”, “I”, and “uh” the exact same way. A sentence like “And uh, I was a little embarrassed” might come out like “And I I was uh little embarrassed.” This isn’t a hard fix. As you listen to the interview, Trint darkens the words being played, and you can easily correct the text by just clicking and editing the transcript like a Word document.
Another aspect that takes longer to edit is punctuation, as Trint only puts in periods at hard stops in the conversation. In order to better textually represent what’s being said, sometimes it’s necessary to put in additional punctuation. For example, Trint might write “You know I was at work and the dude well he so. He’s not nice.” Without punctuation we might understand the speaker’s intended meaning of these sentences, but punctuation would better capture the speaker’s actual phrasing, so that sentence becomes “You know, I was at work and the dude, well, he–so… he’s not nice.”
I will say that there were a number of difficult or unlikely phrases that Trint did accurately capture. Trint caught Spencer’s use of “sorta” and my use of “gotcha”, automatically capitalized “Black Lives Matter”, and cut out filler noises like “uh” and “um.” And aside from a few mix-ups with homophones like “by” and “buy”, it was mostly correct.
I don’t think I can say enough positive things about Trint, so I’ll just say if you have an interview project or any audio or video that you want transcribed, try Trint out.
The second application I’m going to talk about is webscraper.io. This is an incredibly useful and simple tool. It appears really complex at first because, for whatever reason, the actual web scraper is the very last tab on the tool. But the videos provided on the webscraper.io website (though quiet and oddly monotone) are very easy to follow and tell you everything you need to know in order to scrape websites.
I tend to resist watching video tutorials because I find them boring, as I’m sure many people do, and I enjoy figuring things out by playing with the tools on my own. However, web scraping, and webscraper.io in particular, uses its own language, and without any context I totally failed at my first scrape (I wouldn’t even call it a scrape; I’d call it random clicking with nothing happening). Once I gave in and watched the short video, I completed a web scrape of The Denisonian website in just a few minutes.
As I moved on to more complicated, slightly newer-looking websites, I found that having a little bit of knowledge of HTML was helpful: some of the newer websites disabled the program’s point-and-click feature, and I had to edit the JSON code for the web scrape by hand. This was both frustrating and rewarding, but I’m very glad I was presented with that challenge, because it gave me a much more in-depth understanding of what web scraping is.
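For anyone curious what that hand-edited JSON roughly looks like, here is a minimal Python sketch that builds a sitemap-style configuration and prints it as JSON. Treat everything in it as an assumption: the field names (`_id`, `startUrl`, `selectors`, `parentSelectors`, `multiple`) and the `SelectorText` type are modeled on the kind of sitemap the tool exports and should be checked against a real export, `build_sitemap` is a hypothetical helper, and the URL and CSS selector are placeholders rather than anything from my actual scrapes.

```python
import json


def build_sitemap(sitemap_id, start_url, selectors):
    """Assemble a sitemap-style dict (hypothetical helper).

    `selectors` is a list of (id, type, css_selector, multiple) tuples.
    The key names are assumptions modeled on an exported sitemap.
    """
    return {
        "_id": sitemap_id,
        "startUrl": [start_url],
        "selectors": [
            {
                "id": sel_id,
                "type": sel_type,
                # "_root" is assumed to mean "search from the start page"
                "parentSelectors": ["_root"],
                # The CSS selector is where a little HTML knowledge pays off
                "selector": css,
                "multiple": multiple,
            }
            for sel_id, sel_type, css, multiple in selectors
        ],
    }


# Placeholder site and selector: grab the text of every headline link.
sitemap = build_sitemap(
    "example-news",
    "https://example.com/news",
    [("headline", "SelectorText", "h2.entry-title a", True)],
)

sitemap_json = json.dumps(sitemap, indent=2)
print(sitemap_json)
```

The part I ended up editing by hand was the equivalent of the `selector` string: when clicking on the page doesn’t work, you inspect the page’s HTML, find the tag and class that wrap the text you want, and type that in yourself.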
Both of these applications took time: one in terms of waiting around for the application to be developed, the other in terms of my ability to learn a new program. But what I get out of both of these experiences is an appreciation for the fluid learning skills that I’ve gained through a liberal arts education.
Right now, to me, digital scholarship is another branch or level of understanding and interpreting the world, and it’s one that often frustrates the hell out of me. I spent weeks searching for a program that didn’t exist, only to find the best version of it possible years later (thanks for the recommendation, Ben). I spent hours teaching myself how to scrape different websites, probably longer than I would’ve spent had I just copied and pasted the data I wanted. But it’s also incredibly rewarding and gratifying. I stood up in triumph in the computer lab at OWU when my web scraping code worked after the 30th try. I called my mom to tell her about Trint because I was so excited it existed at the same time I was planning to take on another interview project.
I can’t say the digital liberal arts alone have instilled perseverance in me. I’ve always been stubborn. But they have transformed that stubbornness into something productive: the ability to try new methods, learn new ways, and, maybe best of all, know when to quit (e.g., don’t illegally download an old version of Premiere that will crash your 13-year-old computer).