AUDIOBOOK PERFORMANCE BELIEVABILITY
Lorene Shyba
From the book "Ascenti: Humans Opening to AI"
My purpose with these Ascenti audiobook case studies is to gauge the believability of performances when AI voices are used for audiobook narration. The experiments I am running involve live and AI performances of:
1. Solo-performer texts, both nonfiction and fiction
2. Multi-cast, previously staged scripts
3. AI versions of a multi-cast script based on “The Totalizator.”
EXPERIMENT 1. SOLO PERFORMANCE NON-FICTION and FICTION
Project 1a. Non-fiction. Text from "Introduction," Indigenous Justice (Shyba, Yakeleya eds) Durvile 2023. “Returning sacred objects from museums to First Nations aligns with early steps of support from the Vatican to reflect on the dignity and rights of Indigenous Peoples. The Vatican’s recent rejection of the Doctrine of Discovery, a legal concept that justified Europeans claiming Indigenous lands, shows that dispossession of land was not legal and calls into question the manner of colonization.”
Above. Original human narration (Lorene Shyba)
Above. AI out-of-the-box US American voice ("Bill")
Above. AI out-of-the-box British voice ("George")
Above. French ("Judith") and Ukrainian ("Antoni") language versions
CONCLUSIONS, Indigenous Justice
As an experiment, two male AI voices were selected: British and American. Because of the connection of the text with colonization by the British Crown, the British voice (George) was more realistic than the American. Overall liveness believability, 4.5 stars. Industrial practicality, 4.5 stars.
---------
Project 1b. Fiction. Text from "Mother Son," Embrace Your Divine Flow (Hobson, Shyba eds) Durvile 2023. “I continue until I hear the first voices. I open my eyes slowly without moving a muscle in my body, completely still, like a lioness before she pounces. I am alert. I am back with the ego; theirs, mine. The eternal infinite experience is over, for now. It is time to learn. Learning through pain and suffering, finding wisdom and knowledge, and then finding further ways to evolve my self, which is integrated in permanence with my tissue. Every time I close my eyes. I know this, I feel this.”
Above. Original human narration (Julian Hobson)
Above. AI out-of-the-box US American voice ("Nicole")
Above. AI out-of-the-box Indian voice with British accent ("Aaryan")
Above. "Nicole's" AI voice, modified with pauses, speed, and music
Above. "Aaryan" speaking a Hindi language version
CONCLUSIONS, Embrace Your Divine Flow
In this experiment, a North American female AI voice and an Indian male voice were selected. Both were suitable on a basic level, but when the female voice was embellished with breaths, pacing, EQ filters, and music, the result was excellent. Overall liveness believability, 5 stars. Industrial practicality, 4.5 stars. Bonus: listen to the Indian male voice read the same passage in Hindi.
2a. "Spies in the Oilsands," by Lorene [as “Lori”] Shyba.
Video clip above, pay attention to the audio.
The play was a Forum Theatre piece where Tarzana is a stereotypical redneck oilsands worker; Terra is a stereotypical eco-zealot; and Ajay is a bystander.
The play, a component part of the dissertation, “Beyond Fun and Games,” was performed at the Calgary One-Act Play Festival, March 9, 2007.
These illustrations were created in AI with the thought they might help with characterizations. (They didn't much but it was an amusing exercise.)
The Script snippet for analysis:
Tarzana: (To Ajay) You hungry? (To Terra) I’m hungry!
You bring any sandwiches for us?
Terra: Dream on. I’m the cook, not the waitress.
Tarzana: (To Ajay) Come on let’s go get lunch.
Ajay: Your truck or mine?
Tarzana: Take mine, she’s all warmed up. See, lucky I kept her runnin’. Way easier on the ignition.
Terra: Drive??!! No-no-no-no, you’re not taking the truck.
The kitchen is 50 metres away.
Tarzana: Grab yer boots. Get in the fucking truck.
Terra: Get out of that truck. Get out of that truck right fucking now.
(Tarzana and Ajay get in the truck. Terra blocks their way. Truck revving SFX)
Terra: I can’t believe this (starts singing Kumbaya).
Ajay: Geez, Tarzana you’re going to run her over??!!
Tarzana: Nah, she always gets out of the way, every time.
"Spies" Live Performance Narration (from the above video)
Two female voices, one male voice
"Spies" AI Multicast Narration: two female voices, one male voice, assembled with audience reactions, music, and SFX.
CONCLUSION, Spies in the Oilsands
In comparison with the staged version, the AI actors lacked believability as an ensemble performance, and the humour was lost.
Overall liveness believability, 2 stars. Industrial practicality, 2 stars.
The upside is that, with multiple AI voices, the dreaded "he said, she said" would not be required after each line of dialogue in an audiobook recording; it is a hazard to the storyline when there is only one narrator. The notable downside is that the characters are obviously not listening to each other in their responses.
------
2b. "The Last Dance," by Eugene Stickland, from No Harm Done
This video was recorded at the recording session for the audiobook with actors Eugene Stickland and Liz Strom.
The Script Snippet for analysis:
RONNY: Tell me how it is.
LEANNE: You don’t want to know.
RONNY: I do. Go ahead.
LEANNE: I don’t know. It obviously sucks. Some days are better than others, I guess. But it creeps in and takes hold and you just think shoot me I want to die then it relaxes a little but it never really goes away, of course. It’s hard to get around, just even to walk anywhere, you’re always worried you’re going to fall. There’s spasms that just come over you, wrack your body with pain; you wonder what you ever did wrong to have this happen to you. Nights are the worst. I’m afraid to go to sleep for the dreams I could have. The hours crawl by. Can’t even read because the print jumps around the page. I won’t lie. It’s not easy.
RONNY: I’m sorry. That really sucks.
LEANNE: Yes. It does. It really sucks. Pause. He comes over and hugs her.
RONNY: You have Parkinson’s so you think you can’t dance.
LEANNE: I know I can’t.
RONNY: They have all kinds of research going on, don’t they?
LEANNE: I guess they do. It doesn’t seem to happen fast enough.
But yeah. There’s hope things will get better.
RONNY: Well, be that as it may. I’ve come a long way, and we are going to dance.
LEANNE: I can’t.
RONNY: I’ll hold you.
LEANNE: What would we even dance to?
RONNY: The last song they played that night at the senior prom. I asked around in the manner of Sherlock Holmes and found out what it was. Kind of an obvious choice, but effective, I thought.
LEANNE: But we don’t have any music.
RONNY: That’s what you think. I’m in show business, remember? It’s the magic of the theatre!
Above. Live Reading from the Audiobook recording: a female voice (Liz Strom), a male voice (Eugene Stickland).
Above. AI Multicast Narration: a female AI voice, a male AI voice.
CONCLUSION, The Last Dance
Overall liveness believability, 3 stars. Industrial practicality, 2.5 stars.
The upside would be that, as with Spies, the dreaded "he said, she said" would not be required after each line of dialogue if using AI voices. The notable downside is that the characters are obviously not listening to each other.
The play “The Last Dance” was performed on May 17 and 18, 2018, at the cSpace Theatre, Calgary, Alberta. It was commissioned by the Branch Out Foundation.
PROJECT 3. SNIPPET OF an AI-WRITTEN scene, based on Clem Martini's "TOTALIZATOR"
3. "Takeover in Hell," by AI
MALACHI (whispering) My Lord, I bring news of a potential takeover in Hell.
Lucifer’s eyes narrow, intrigued. He leans forward, his voice dripping with authority.
LUCIFER Speak, Malachi. Who dares challenge my dominion?
MALACHI It is a group of rogue demons, led by a powerful fallen angel named Azazel. They have amassed a considerable following and plan to overthrow you, my Lord.
Lucifer’s face contorts with anger, his voice seething with fury.
LUCIFER Azazel... I should have known. No one defies me and gets away with it.
Lucifer rises from his throne, pacing back and forth, deep in thought. Malachi watches, eagerly awaiting his orders.
LUCIFER Summon my most trusted lieutenants. I want them here immediately.
Malachi bows, quickly exits the Throne Room.
"Takeover in Hell" video performed in Russian with English translation, and stage directions.
CONCLUSION, Totalizator AI scripts
In the end, this experiment scored 4 for believability and 3 for utility. Utility is low because of the massive amount of post-production work needed. This may change as AI develops over the years, but for now, juxtaposing the voices takes longer with AI voices than if actors were in the same room.
About Lorene Shyba
Lorene Shyba MFA PhD is publisher at Durvile & UpRoute and creative director at the audiobook production arm, Pepper Ranch Studios. She has performed, directed, and produced over 25 audiobook titles in many genres. Her doctoral degree is in Interactive Media and she incorporates multimedia in every corner of her work.
“The cultural and artistic roles of AI have received little attention so far. The Ascenti book is a welcome opening in that direction. It deals with AI used in creating visual arts, literature, and computer games, and analyzes both the new opportunities and limitations in these areas.” —Frans Mäyrä, PhD, Researcher of Culture and Society