Questions this guide answers

  • Which settings affect creativity vs consistency?
  • Why are responses short or long?
  • How much context and max tokens can my plan use?
  • What does web search actually do?

Core controls

Setting         Range           What it changes
Temperature     0..2            Randomness and creativity
Top-p           0..1            Diversity of token selection
Model Context   plan-limited    Number of recent messages included
Reply Style     preset          Voice/format tendency
Multi-message   on/off + delay  Sends split follow-up replies
Web search      on/off          Allows live web-grounded replies
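
The ranges in the table can be checked before a settings change is saved. The following is a minimal sketch, not Verba's own validation: the class name, field names, and error messages are hypothetical, and only the numeric ranges come from this guide.

```python
from dataclasses import dataclass


@dataclass
class CoreSettings:
    # Hypothetical field names; only the ranges below come from the guide.
    temperature: float
    top_p: float
    multi_message: bool
    multi_message_delay_ms: int
    web_search: bool


def validate(settings: CoreSettings) -> list[str]:
    """Return a list of range problems; an empty list means all values fit."""
    problems = []
    if not 0.0 <= settings.temperature <= 2.0:
        problems.append("temperature must be within 0..2")
    if not 0.0 <= settings.top_p <= 1.0:
        problems.append("top_p must be within 0..1")
    if not 0 <= settings.multi_message_delay_ms <= 10000:
        problems.append("multi_message_delay_ms must be within 0..10000")
    return problems
```

Model Context is left out here because its upper bound depends on the plan.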

System instructions (behavior prompt)

System instructions are your bot's persistent behavior rules.
Where to set:
  • Dashboard -> Bot -> AI Engine -> Behavior
  • Field name: systemInstructions
Limit:
  • Up to 8000 characters
How they interact with other layers:
  • systemInstructions: behavior/rules/tone constraints
  • Training examples: style shaping and response pattern hints
  • Long-term memory: durable facts/preferences
  • Knowledge entries: factual/reference content
  • Conversation context: recent turns in current chat/thread/session
Practical rule:
  • Put “how to behave” in system instructions.
  • Put “facts to remember” in knowledge/memory.
  • Put “how to phrase outputs” in training examples.
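
The 8000-character limit is easy to check before pasting text into the Behavior field. A minimal sketch, assuming the limit counts characters the way Python's `len()` does (the guide does not say how Verba counts them); the function name is hypothetical.

```python
MAX_SYSTEM_INSTRUCTIONS_CHARS = 8000  # limit stated in this guide


def remaining_chars(system_instructions: str) -> int:
    """Return how many characters are left; raise ValueError if over the limit."""
    remaining = MAX_SYSTEM_INSTRUCTIONS_CHARS - len(system_instructions)
    if remaining < 0:
        raise ValueError(
            f"systemInstructions is {-remaining} characters over the limit"
        )
    return remaining
```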

Plan-based limits

Current default limits:
Plan    Max model context   Max response tokens
Free    50                  4096
Plus    75                  8192
Pro     100                 16384
Ultra   150                 32768
If you request settings above your tier limits, they are clamped or rejected depending on the endpoint.
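
The clamping case can be illustrated as follows. This is a sketch only: the function name is hypothetical, the limits are copied from the table above, and it does not model endpoints that reject out-of-range values instead of clamping them.

```python
# (max model context, max response tokens) per plan, from the table above.
PLAN_LIMITS = {
    "Free": (50, 4096),
    "Plus": (75, 8192),
    "Pro": (100, 16384),
    "Ultra": (150, 32768),
}


def clamp_to_plan(plan: str, model_context: int, max_tokens: int) -> tuple[int, int]:
    """Cap the requested values at the plan's limits."""
    max_context, max_response_tokens = PLAN_LIMITS[plan]
    return min(model_context, max_context), min(max_tokens, max_response_tokens)
```

For example, `clamp_to_plan("Free", 80, 8192)` returns `(50, 4096)`, while a request already inside the limits is returned unchanged.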

Reliable support

Temperature 0.4-0.7, top-p 0.7-0.9, moderate context.

Creative roleplay

Temperature 0.8-1.1, top-p 0.9-1.0, higher context.

Cost-aware

Lower context, web search off, shorter reply style.

Fast troubleshooting

Lower context + deterministic settings to reduce latency variance.
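
The two presets above that give numeric ranges can be written down as data and used to check a configuration. A minimal sketch; the preset keys and the function name are hypothetical, and the "Cost-aware" and "Fast troubleshooting" presets are omitted because the guide gives no numbers for them.

```python
# (low, high) ranges taken from the presets above.
PRESET_RANGES = {
    "reliable_support": {"temperature": (0.4, 0.7), "top_p": (0.7, 0.9)},
    "creative_roleplay": {"temperature": (0.8, 1.1), "top_p": (0.9, 1.0)},
}


def fits_preset(preset: str, temperature: float, top_p: float) -> bool:
    """Return True if both values fall inside the preset's suggested ranges."""
    ranges = PRESET_RANGES[preset]
    t_low, t_high = ranges["temperature"]
    p_low, p_high = ranges["top_p"]
    return t_low <= temperature <= t_high and p_low <= top_p <= p_high
```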

Multi-message behavior

When enabled:
  • Verba can split a response into multiple shorter messages.
  • Delay between parts is configurable (0..10000 ms).
Use this for natural chat pacing; disable if you want one compact answer.
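
To make the pacing idea concrete, here is a rough sketch of splitting a reply and sending the parts with a delay. It is not how Verba splits messages internally, which this guide does not describe: splitting on blank lines is an assumption, and `split_into_parts`, `send_parts`, and the `send` callback are hypothetical names.

```python
import time
from typing import Callable

MAX_DELAY_MS = 10000  # upper bound of the delay range stated in this guide


def split_into_parts(reply: str) -> list[str]:
    """Split a reply on blank lines, dropping empty parts (assumed strategy)."""
    return [part.strip() for part in reply.split("\n\n") if part.strip()]


def send_parts(parts: list[str], send: Callable[[str], None], delay_ms: int) -> None:
    """Send each part in order, pausing between parts but not before the first."""
    delay_ms = max(0, min(delay_ms, MAX_DELAY_MS))
    for index, part in enumerate(parts):
        if index > 0:
            time.sleep(delay_ms / 1000)
        send(part)
```

With multi-message disabled, the equivalent behavior is a single `send(reply)` call.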

Web search behavior

When enabled, Verba may perform model-driven search planning before the final answer. This improves freshness, but can increase:
  • Latency
  • Token usage
  • Cost
Use web search for:
  • News
  • Live pricing
  • Fast-changing product details
Keep it off for:
  • Roleplay
  • Stable canon/lore
  • Deterministic support flows

System instructions best practices

  • Keep instructions concrete and scoped.
  • Avoid contradictory rules.
  • Prefer short imperative bullets over long prose.
  • Include a failure policy (what to do when the answer is unknown).
  • Specify output format explicitly (headings, bullets, steps, code blocks).
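
As an illustration of these practices, the sketch below assembles system instructions from short imperative rules. The rules describe a made-up support bot and the function name is hypothetical; adapt the wording to your own use case.

```python
# Hypothetical rules for a support bot: scoped, non-contradictory, imperative.
RULES = [
    "Answer only questions about the product.",
    "If you do not know the answer, say so and suggest contacting support.",
    "Format multi-step answers as numbered steps.",
    "Put commands and code in code blocks.",
]


def build_system_instructions(rules: list[str]) -> str:
    """Join the rules into one bulleted block, one rule per line."""
    return "\n".join(f"- {rule}" for rule in rules)


system_instructions = build_system_instructions(RULES)
```

The second rule is the failure policy, and the last two state the output format explicitly.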

Quick diagnostics

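  • For more consistent replies: lower temperature, tighten system instructions, and add targeted training examples.
  • To keep more of the earlier conversation in view: increase model context (within your plan limit).
  • For longer replies: increase max tokens and use a fuller reply style.
  • To reduce latency and cost: reduce context and disable web search unless needed.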

Bot Settings Reference

See all AI-related fields and limits in one place.