Text-to-speech converts written text into spoken audio. You provide the text and optionally choose a voice. The agent generates an MP3 file you can play and download.

How to generate speech

1. Describe what you need

Tell the agent what text to convert. You can type the text directly or ask the agent to write it first.

Example prompt: "Convert a product description to speech."

2. Choose a voice (optional)

If you do not specify a voice, the agent uses Rachel (the default). To use a different voice, mention it in your prompt.

Voice             Description
Rachel (default)  Neutral, clear, conversational
George            Male, warm tone
Sarah             Female, professional
Charlie           Male, casual
Lily              Female, friendly
Chris             Male, energetic

You can also use a cloned voice by its ID. See Voice Cloning.

3. Review and download

The audio file appears in the chat with a waveform player. Click play to preview, then click Download to save the MP3 to your device.
[Figure: Chat showing a generated audio file with waveform bars, a play button, and a download option]

What you cannot do

  • You cannot change the speed or pitch of the generated speech. The voice model determines pacing naturally.
  • You cannot generate speech longer than a few minutes in a single operation. For longer content, break it into sections.
  • You cannot mix speech with background music in one step. Generate them separately.
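
If you need to break longer content into sections before asking for speech, a short script can do the splitting for you. The Python sketch below is one possible approach, not part of the product: the function name split_into_sections and the max_chars value of 1500 are assumptions chosen for illustration, since this page does not state an exact length limit. It cuts only at sentence boundaries so each section reads naturally on its own.

```python
import re


def split_into_sections(text: str, max_chars: int = 1500) -> list[str]:
    """Split text into sections at sentence boundaries.

    Each section is at most max_chars long, except that a single sentence
    longer than max_chars is kept whole as its own section.
    max_chars is an illustrative assumption, not a documented limit.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    sections: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            # The sentence still fits in the section being built.
            current = candidate
        else:
            # Close the section being built and start a new one.
            if current:
                sections.append(current)
            current = sentence
    if current:
        sections.append(current)
    return sections


if __name__ == "__main__":
    sample = "First sentence. Second sentence! Third sentence?"
    for number, section in enumerate(split_into_sections(sample, max_chars=35), start=1):
        print(f"Section {number}: {section}")
```

You can then paste each section into the chat as a separate request and download the resulting MP3 files one by one. If a section still turns out too long, lower max_chars and split again.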

Next steps