
Voice replies

Voice is optional and can be enabled per verb. Start with text only, then add voice once the personality feels right.

Voice cloning

Upload a short, clean sample to create a custom voice. The best results come from 6 to 12 seconds of clear speech with minimal background noise. Reference text is optional, but if you can, paste the exact transcript of the sample: it improves similarity and keeps the voice more stable across replies.
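
If you want to check a sample's length before uploading, a minimal sketch like the one below can help. It assumes the sample is an uncompressed WAV file and uses Python's standard wave module. The helper names and the constants are illustrative only and are not part of the product.

```python
import wave

# Recommended sample length from the guidance above, in seconds.
MIN_SECONDS = 6.0
MAX_SECONDS = 12.0


def wav_duration_seconds(path):
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_recommended_length(seconds):
    """True when the duration falls inside the 6 to 12 second range."""
    return MIN_SECONDS <= seconds <= MAX_SECONDS
```

For example, is_recommended_length(wav_duration_seconds("sample.wav")) tells you whether a local file named sample.wav (a placeholder name) is in the suggested range. The check covers length only; it says nothing about background noise or clarity.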

Supported languages

The voice engine supports a focused language set:
  • Auto (recommended default)
  • English
  • Chinese
  • Japanese
  • Korean
  • German
  • French
  • Russian
  • Portuguese
  • Spanish
  • Italian
If you pick a language outside this list, the engine falls back to Auto.
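
The fallback rule can be pictured with a short sketch. This is an illustration of the behavior described above, not the engine's actual code: the function name is hypothetical, and the case-insensitive matching is an assumption.

```python
# The language set listed above. "Auto" is the recommended default.
SUPPORTED_LANGUAGES = (
    "Auto", "English", "Chinese", "Japanese", "Korean", "German",
    "French", "Russian", "Portuguese", "Spanish", "Italian",
)


def resolve_language(requested):
    """Return the matching supported language, or "Auto" if there is none."""
    if not requested:
        return "Auto"
    wanted = requested.strip().casefold()
    for name in SUPPORTED_LANGUAGES:
        # Assumption: names are compared without regard to case.
        if name.casefold() == wanted:
            return name
    return "Auto"
```

With this sketch, resolve_language("french") gives "French", while a language outside the list, such as "Dutch", gives "Auto".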

Keep it natural

  • Short replies sound better.
  • Avoid long paragraphs in voice mode.
  • Set a voice reply frequency that feels human.
If the voice feels off, shorten the replies and reduce randomness, for example by lowering the temperature.
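
If you prepare reply text yourself before it is spoken, one way to keep it short is to cut at sentence boundaries. The sketch below is a hypothetical helper, not a product feature. The 200-character default is an arbitrary assumption, and the sentence split is a simple rule that looks for ., ! or ? followed by whitespace.

```python
import re


def shorten_for_voice(text, max_chars=200):
    """Keep whole sentences from the start of text, up to max_chars."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    kept = []
    length = 0
    for sentence in sentences:
        # Count the joining space for every sentence after the first.
        extra = len(sentence) + (1 if kept else 0)
        if kept and length + extra > max_chars:
            break
        kept.append(sentence)
        length += extra
    # A single overlong first sentence is cut hard at the limit.
    return " ".join(kept)[:max_chars].rstrip()
```

The first sentence is always kept, so a reply is never reduced to nothing. Later sentences are dropped whole rather than cut mid-sentence, which tends to sound more natural when read aloud.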

Safety and permissions

Only upload audio you own or have permission to use.

AI engine

Lower temperature for clearer voice output.