Voice replies
Voice is optional and can be enabled per verb. Start with text only, then add voice once the personality feels right. If you want voice on every eligible reply, set the voice response frequency to 100. Lower values make voice replies intermittent on purpose.
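One way to picture the frequency setting is as a percentage chance applied to each eligible reply. The sketch below illustrates that reading only. It is an assumption about how such a setting could behave, not Verba's actual implementation, and `should_send_voice` and `frequency_percent` are hypothetical names.

```python
import random


def should_send_voice(frequency_percent: int, rng: random.Random) -> bool:
    """Decide whether one eligible reply is sent as voice (hypothetical helper)."""
    if not 0 <= frequency_percent <= 100:
        raise ValueError("frequency_percent must be between 0 and 100")
    # rng.random() returns a float in [0.0, 1.0), so 100 always passes
    # and 0 never does. Values in between pass that share of the time.
    return rng.random() * 100 < frequency_percent


if __name__ == "__main__":
    rng = random.Random()
    # Assumed example: a verb set to 30 speaks on roughly 3 in 10 eligible replies.
    sample = [should_send_voice(30, rng) for _ in range(10)]
    print(sample)
```

Under this reading, a value such as 30 gives an occasional voice reply rather than a fixed pattern, which matches the "intermittent on purpose" description.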
Voice attachment visual
The Voice Engine settings also include a Voice Attachment Visual section. You can upload a custom still image for voice message videos instead of using the default Verba image.
- The live preview uses the same 20:7 crop as the final voice attachment
- The generated video output is framed to 400x140
- You can replace or remove the image at any time
- If no custom image is uploaded, Verba falls back to the default voice banner
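To check how an image will fit before uploading it, you can work out the 20:7 crop yourself. The sketch below computes the largest centered 20:7 region of an image. Centering is an assumption made for illustration, since the crop position Verba uses is not stated here, and `center_crop_box` is a hypothetical helper.

```python
def center_crop_box(width: int, height: int, ratio_w: int = 20, ratio_h: int = 7):
    """Return (left, top, right, bottom) of the largest centered ratio_w:ratio_h crop."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Compare aspect ratios by cross-multiplying to stay in integer arithmetic.
    if width * ratio_h > height * ratio_w:
        # Image is wider than the target ratio: keep full height, trim the sides.
        crop_h = height
        crop_w = height * ratio_w // ratio_h
    else:
        # Image is taller than (or equal to) the target ratio: keep full width.
        crop_w = width
        crop_h = width * ratio_h // ratio_w
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return (left, top, left + crop_w, top + crop_h)


if __name__ == "__main__":
    print(center_crop_box(1920, 1080))  # (0, 204, 1920, 876)
```

A 1920x1080 image keeps a 1920x672 band under this assumption, so important detail is safest near the vertical middle. An image that is already 400x140 needs no cropping, because 400:140 reduces to 20:7.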
Voice cloning
Upload a short, clean sample to create a custom voice. The best results come from 6 to 12 seconds of clear speech with minimal background noise. Reference text is optional. If you can, paste the exact transcript of the sample. It improves similarity and keeps the voice more stable across replies.
Supported languages
The voice engine supports a focused language set:
- Auto (recommended default)
- English
- Chinese
- Japanese
- Korean
- German
- French
- Russian
- Portuguese
- Spanish
- Italian
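The cloning guidance and the language list can be combined into a quick pre-upload check. The sketch below is illustrative only: the lowercase language names, the `review_cloning_request` helper, and its parameters are assumptions, and the 6 to 12 second range is treated as a recommendation rather than a hard limit.

```python
# Assumed identifiers: the product may use different names or language codes.
SUPPORTED_LANGUAGES = {
    "auto", "english", "chinese", "japanese", "korean", "german",
    "french", "russian", "portuguese", "spanish", "italian",
}


def review_cloning_request(duration_seconds: float,
                           language: str = "auto",
                           has_transcript: bool = False) -> list:
    """Return a list of advisory notes; an empty list means nothing to flag."""
    notes = []
    if not 6 <= duration_seconds <= 12:
        notes.append("sample length is outside the recommended 6 to 12 seconds")
    if language.strip().lower() not in SUPPORTED_LANGUAGES:
        notes.append(f"language {language!r} is not in the supported set")
    if not has_transcript:
        notes.append("no reference text; an exact transcript may improve similarity")
    return notes


if __name__ == "__main__":
    for note in review_cloning_request(4.5, "English"):
        print("-", note)
```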
Premium voice model access
Voice model availability is plan-based in the same way as the AI and Image engines. When a selected premium voice model is outside your current tier, Verba shows an upgrade prompt with the number of additional premium voice models available on a higher plan. That count is dynamic and can change as the voice catalog changes.
Discord voice chat
On Discord, Ultra verbs can join voice chat with /vc-join, and they can also
join from a normal mention request such as asking the bot to join VC/call in
server chat.
Lower-tier verbs can still keep those commands enabled, but they respond with
an in-character upgrade message instead of joining live VC. Normal generated
voice messages stay free.
When the live voice path is healthy, the bot can:
- Listen in the connected voice channel
- Transcribe incoming speech
- Generate a reply
- Speak the reply back into VC
Live voice chat depends on all of the following:
- Voice Engine is enabled for the verb
- The selected voice model is available to that plan
- The bot can access and speak in the target voice channel
- The active speech provider is healthy and has available quota
If the selected live speech provider is unavailable or out of quota, Discord
VC can join successfully but still fail to transcribe or speak until that
provider becomes available again.
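The conditions above can be read as a small decision table. The sketch below is one possible model of that behavior, not Verba's code: the `LiveVoiceState` fields, the outcome labels, and the split between conditions that block joining and conditions that only block speech are assumptions drawn from the description.

```python
from dataclasses import dataclass


@dataclass
class LiveVoiceState:
    """Hypothetical snapshot of the conditions described in this section."""
    is_ultra: bool
    voice_engine_enabled: bool
    voice_model_on_plan: bool
    bot_can_access_channel: bool
    provider_healthy: bool
    provider_has_quota: bool


def live_voice_outcome(state: LiveVoiceState) -> str:
    """Map a state to one of four assumed outcome labels."""
    if not state.is_ultra:
        # Lower-tier verbs reply with an in-character upgrade message.
        return "upgrade_message"
    if not (state.voice_engine_enabled
            and state.voice_model_on_plan
            and state.bot_can_access_channel):
        # Assumption: these conditions prevent the join itself.
        return "cannot_join"
    if not (state.provider_healthy and state.provider_has_quota):
        # The join succeeds, but transcription and speech fail for now.
        return "joined_but_silent"
    return "live"


if __name__ == "__main__":
    state = LiveVoiceState(True, True, True, True, False, True)
    print(live_voice_outcome(state))  # joined_but_silent
```

Modelling the provider check separately reflects the case where the bot is present in the channel but silent, which is easy to mistake for a permissions problem.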
Keep it natural
- Short replies sound better.
- Avoid long paragraphs in voice mode.
- Set a frequency that feels human.
Safety and permissions
Only upload audio you own or have permission to use.
AI engine
Lower temperature for clearer voice output.

