Digital Voices: Good, Bad, or the Devil?
I’ve mentioned before that I am not keen on the idea of computer-generated audiobooks. To sum up, I think that the technology will create a sound that is a perfect mimic of a human voice but one that will still land in the uncanny valley territory. I do think that the technology will have its uses, but I don’t think it will replace human narration.
I also dislike the idea on a philosophical level. The Big Tech moguls seem to want to create a world of perfect automation. The trouble is this vision of the future is a complete fantasy – automation is fine and dandy until something breaks or the power goes off or the Internet goes down. (As the current state of the US economy shows, we could use a lot more truck drivers and plumbers and a lot fewer Big Tech CEOs.) It’s like when non-farmers offer various Galaxy Brain Energy ideas on how to improve farming, and the actual farmers laugh themselves sick because they know the idea is 1.) bad, and 2.) will fail catastrophically. AI-narrated audiobooks seem to fall into that category of Galaxy Brain idea.
Anyway. That was a bit of a digression. Back to audiobooks!
All that said, I freely admit that my dislike of AI-narrated audiobooks could be curmudgeonly bias on my part. A good example is Alexa. I don’t use Alexa, Siri, or any of the other voice assistants, because I don’t like the idea of spending my time around an always-on microphone. Indeed, I find it slightly sinister. And my formative experiences with computers were back in the command line days, so deep down part of my brain thinks we should all go back to the command line.
However, I cannot deny that I have met many people – particularly elderly people or people with chronic illness that impairs their vision or mobility – for whom voice assistants like Alexa and Siri have considerably improved their quality of life. This was especially true in 2020 and 2021 when the worst of the pandemic restrictions were in full force. So it’s always good to remind myself that my opinion isn’t necessarily objective fact.
This relates to audiobooks because Google Play Books launched a free trial of an AI audiobook narration service. If you have a book on Google Play Books that doesn’t have an audiobook associated with it, you can use Google’s AI narration to generate an audiobook of it that you can then sell on the Google Play store. You can also sell the resultant MP3 files elsewhere, though that may be a moot point, since Audible and Findaway do not allow computer-generated audiobooks on their storefronts at the moment.
In the interests of 1.) testing my own preconceptions, and 2.) seeing if it was any good, I decided to give it a try.
I chose SILENT ORDER: IRON HAND for the experiment, because out of all my main fictional settings (FROSTBORN, THE GHOSTS, CLOAK GAMES, DEMONSOULED, and SILENT ORDER) SILENT ORDER definitely has the weakest sales, so I am very unlikely to ever make a human-narrated audiobook for it because the sales for the rest of the series wouldn’t justify it. To put it in perspective, the ebook sales of all ten SILENT ORDER books for June 2022 combined would cover about 45% of the cost of narrating just the first book in audio. So it’s not like the experiment would have screwed anyone out of narration work on SILENT ORDER because there most likely wasn’t going to ever be any narration work on SILENT ORDER.
Anyway, the process of generating an AI audiobook is very simple. Basically, you point and click a few times. You choose the book, select the voice and accent you wish to use, make sure that you have the appropriate sections of the book selected for narration, and that’s pretty much it. The servers grind away for a few hours, and then the Google Play dashboard generates an AI-narrated audiobook.
My opinion of the result…it is okay-ish. It sounds just like someone reading the book in a normal voice devoid of emotion. Listening to a few sentences, I suspect most people would not be able to tell that it’s not a human voice. However, if anyone listens for longer than just a few sentences, it becomes immediately apparent that it’s a computer-generated voice because of the complete lack of emotion and absence of variation in the inflection. The voice I chose sounds a lot like the guy who used to do the voiceovers of the Allstate Insurance commercials here in the US, and the trouble is that this is exactly what it sounds like – someone calmly reading the voiceover for an insurance commercial. This includes the violent or emotional scenes in SILENT ORDER: IRON HAND, of which there are many, so in the middle of an angry conversation or a gunfight, it still sounds like someone offering you a good deal on insurance over the radio. (“Surrender or die, Captain March! Also, did you know that you could save 17% on your starship insurance?”)
So the technology is better than it was, but it still lands in the uncanny valley – something that is close enough to human that it triggers an alarmed response in the lizard brain sector of the human mind.
My frank opinion is that the majority of serious audiobook listeners will hate AI-narrated audiobooks and refuse to buy them, and I don’t think the end product is good enough that I would feel comfortable charging money for it.
Granted, I expect the technology will have its uses. I heard about the new Google Play Books program on the Creative Penn podcast, and Joanna Penn argued that she expects to see the “bifurcation” of audio rights. Like, right now, a book has “audio rights”, and that covers any audiobooks. Eventually, there will probably be different kinds of audio rights – machine-read and human-read. A vast majority of books will never have human-narrated audiobooks, so machine-read ones could fill the gap. Relating to the way that audio assistants like Alexa and Siri have helped people with vision and mobility challenges, I can definitely see how this kind of technology would be of immense benefit for people in those situations. In fact, given disability access laws, maybe the endpoint will be that eventually all ereader apps will have a “read aloud” button, and then you can choose between a human-narrated book (if available) or a computer-generated one.
But we all must make up our own minds on the topic. I put the AI-narrated audiobook of SILENT ORDER: IRON HAND on YouTube, and so you can listen and decide for yourself!
-JM
I find the concept of someone trying to kill someone whilst selling them starship insurance extremely amusing for some reason.
You should get Tom Stranger, Interdimensional Insurance Agent by Larry Correia.
I’ve heard of that, but I haven’t had a chance to check it out yet.
My late uncle was interested in this technology after glaucoma took his eyesight. His problem was that his military hearing loss made the WALL-E early computer voices very difficult to understand.
My brother uses Siri to read books to him while driving, but those books are reference materials and science articles, so the lack of emphasis isn’t a big issue.
I read much faster than the spoken word, so prefer to read versus listen. When driving, I prefer music as it is less distracting. Don’t expect me to purchase any audio books (but I’ll continue to buy all your ebooks).
Audio voices are just in their infancy now. But it is just a matter of time before you get thousands of standard speaking voices indistinguishable from real humans, with software packages allowing you to customize speaking tones, accents, pitch, flow, etc. You will probably have career jobs like [Voice Engineer] then.
The RPG industry would also be revolutionised. Instead of games or game expansions being postponed or under-delivering because the expensive VA is unavailable (or died), your Gimli can now talk endless dialogues thought up by people. Games like Skyrim can now generate infinite quests with generated voice acting too!