Triptych, Audio Books, and AI

Book cover for the audio book "Triptych"

I am a huge fan of audio books. I eat through them on my daily commute. I also appreciate that many folks either want or need audio books to access content. So, I’ve just finished recording, editing, and submitting an audiobook version of Triptych. It should be available in a week or so, after it gets reviewed by Audible’s quality control process.

This is not the first time I’ve made an audio book version of my work. Both Boring Patient and Expect More (the first edition) are available in audio format. This is, however, the first time I’ve had to think about AI and audio book production.

Triptych is published using Amazon’s KDP platform. When I uploaded an ebook version, Amazon asked if I wanted to make it an audio book. Yes, I said after hearing folks would be interested in one. Great, here’s a link to upload audio files, or find a voice talent to record it for you (normally for a percentage of sales)…or try our new beta of “virtual voice.”

Virtual voice? Yup, you guessed it, an AI-produced audio version. You pick the “voice” and it makes the narration. You can even go in and edit the pronunciations. So in minutes, you have an audio book, with a nice little disclaimer about the use of AI.

I thought about it. My main goal with an audio book is to make the work more accessible, particularly to those who have trouble reading text. But, in the end, I thought I had already included as much AI in the book as I was comfortable with. Also, folks on Facebook said they wanted my voice.

So, down to the first floor of the iSchool I went to use one of two recording rooms we have set up.

While I sat in the small room reading text into my laptop (and marking all the edits I needed to make), Quinn was next door recording a screencast of a new software rollout. Quinn is the school’s long-standing IT/instructional tech guru. He talked while a screen capture recorded the software in use.

Screenshot of an Avatar sold by Synthesia

Here’s the big difference. Once he was done, he had AI create a transcript of his narration, then fed that transcript into a new AI system that used it as a script to create a virtual avatar providing the blow-by-blow. When I messed up a line, it was a process of editing and re-recording. Same for looking for breaths and bad pacing. When he found an error, he just edited the text of the transcript, and the video avatar re-rendered.

While folks asked for my voice, he wanted to use a voice without his Texas twang. Also, if he wanted it in Spanish, or French, or Japanese, that was a click away (BIG grain of salt on the effectiveness of AI language translation).

So folks wanted my voice, but did they care if it was really me in the recording room? There are services that will actually clone my voice and my image. When my wife asked me if I was OK with some AI company having a copy of my voice (Little Mermaid anyone), my initial thought was no. Then I remembered the hundreds of hours of video presentations I have across the internet. It seems like they already had it if they wanted it.

Let me go back to my original reason for making the audio book: accessibility and format preference for my readers. Does it matter that I sweated in a room with a microphone (foam walls and computer equipment in a small space equals hot)? Was it better to be able to quickly fix errors? Or was this another step in dehumanizing the connection between author and reader? Or professor and student? Are the audio equivalents of typos more desirable than listening to a soothing AI voice?

I don’t have a final great moral answer here. I will note, however, that all the books Amazon lists as using virtual voice are in the self-published genre…not a Stephen King or John Scalzi to be found. Perhaps this is an option for the small fry author who simply can’t afford to have great voice talent enrich their book.

I have made my choice – my voice and sweat instead of yet another AI-coated bit of content. Still, I know me. I’m guessing a virtual Virtual Dave is coming so I can play and poke and try. Just not reading my book.