Voices

Charisma uses text-to-speech to enable characters to speak the lines that you write.

Select the voice for each character using the voice selector on their character page. This voice is used by default in all of that character's character nodes. To override the voice for a specific character node, click the cog next to the selected character name.

Available voices

Over 1000 voices are available through our partnerships with leading text-to-speech companies.

We offer the following additional providers for Pro stories:

Cereproc
Deepgram
Replica
Resemble

Some extra voices are only available to enterprise customers. If you're interested in voices from any of the providers below, or have other requirements such as custom voices, please reach out and let us know:

ElevenLabs

Custom voices

You can use voices from your ElevenLabs or Cartesia account by adding your API keys to a story.

On the Story Overview page, story managers can save each API key. Once a valid key is saved, your voices will be displayed in the voice selector, categorised as 'Cartesia (Custom)' and 'ElevenLabs (Custom)'.

If you are having trouble generating custom voices for your stories or in the character creator, check that your API key is correct and that you have enough credit on your Cartesia or ElevenLabs account.

Using custom voices will not consume your Charisma credits.

SSML

Many text-to-speech services support SSML (Speech Synthesis Markup Language) tags, which let you fine-tune how a particular line is spoken, for example by changing the pitch of a word or how a word is pronounced.

In Charisma, opening and closing SSML tags are prefixed with the string voice: to distinguish them from Charisma memories, which also use the angle bracket syntax, but do not allow colons.

For example, in regular SSML, you might write:

This is my <sub alias="ex-zam-pool">example</sub> of SSML!

But in Charisma-style SSML, you would write:

This is my <voice:sub alias="ex-zam-pool">example</voice:sub> of SSML!
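If you already have lines written in regular SSML, the prefixing rule above can be applied mechanically. The following is a minimal sketch, not part of Charisma itself: the function name to_charisma_ssml and the SSML_TAGS allowlist are hypothetical, and the set of tags that a given voice provider actually supports is an assumption you should check. An allowlist is used so that Charisma memories, which share the angle bracket syntax, are left untouched.

```python
import re

# Hypothetical allowlist of SSML element names to convert. Which tags your
# chosen voice provider supports is an assumption; adjust as needed.
SSML_TAGS = ("break", "emphasis", "phoneme", "prosody", "say-as", "sub")

# Matches "<name" or "</name" for an allowlisted name, but only when the name
# is followed by whitespace, "/" or ">". Tags that are already prefixed
# ("<voice:sub") and memories ("<player_name>") do not match.
_TAG = re.compile(
    r"<(/?)(" + "|".join(re.escape(t) for t in SSML_TAGS) + r")(?=[\s/>])"
)


def to_charisma_ssml(line: str) -> str:
    """Prefix allowlisted SSML tag names in `line` with "voice:"."""
    return _TAG.sub(r"<\1voice:\2", line)


regular = 'This is my <sub alias="ex-zam-pool">example</sub> of SSML!'
converted = to_charisma_ssml(regular)
```

Because prefixed tags no longer match the pattern, running the function on an already converted line leaves it unchanged.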

Using uploaded clips

Sometimes, you'll want even more control than SSML can give you, for example when delivering a very emotional line.

Audio clips can be uploaded directly onto a particular line in a Character node by right-clicking the text and selecting Upload voice clip. Once uploaded, the clip replaces the text-to-speech synthesis for that line. It can be removed at any time by right-clicking the text again and selecting Remove voice clip.

This is great for when you're using a voice service that supports speech-to-speech conversion. You could speak the line how you want it to be delivered in your voice, have it converted into the character's voice, and upload the resulting audio clip into Charisma.

If you are using an SDK, make sure the encoding of the voice clip (such as MP3, OGG or WAV) matches the encoding you're using for text-to-speech clips, as uploaded clips are not dynamically transcoded to the requested encoding when playing the story.
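One way to catch a mismatch before uploading is to inspect the first few bytes of the clip. The sketch below is a heuristic under stated assumptions, not a Charisma feature: the function names detect_audio_encoding and clip_matches are hypothetical, and only the three container formats mentioned above are recognised. It relies on the well-known file signatures for Ogg ("OggS"), WAV ("RIFF" followed by "WAVE") and MP3 (an "ID3" tag or an MPEG frame sync).

```python
from typing import Optional


def detect_audio_encoding(data: bytes) -> Optional[str]:
    """Guess "ogg", "wav" or "mp3" from the leading bytes, else None."""
    if data[:4] == b"OggS":
        return "ogg"
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    if data[:3] == b"ID3":
        # MP3 file starting with an ID3v2 metadata tag.
        return "mp3"
    if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0:
        # MPEG audio frame sync: eleven set bits. Heuristic only, since
        # other MPEG-family streams can start the same way.
        return "mp3"
    return None


def clip_matches(data: bytes, expected: str) -> bool:
    """True if the clip appears to use the expected encoding."""
    return detect_audio_encoding(data) == expected.lower()
```

For example, reading a file with open(path, "rb").read(16) and passing the result to clip_matches along with the encoding your SDK requests would flag an OGG clip uploaded to a story that plays MP3.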
