AI engine settings - Verba Documentation

Questions this guide answers

Which settings affect creativity vs consistency?
Why are responses short or long?
How much context and max tokens can my plan use?
What does web search actually do?

Core controls

Setting	Range	What it changes
Temperature	`0..2`	Randomness and creativity
Top-p	`0..1`	Diversity of token selection
Model Context	plan-limited	Number of recent messages included
Reply Style	preset	Voice/format tendency
Multi-message	on/off + delay	Sends split follow-up replies
Web search	on/off	Allows live web-grounded replies
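
As a rough illustration of the numeric ranges in the table, the sketch below checks a settings object before it is saved. The class and field names (`EngineSettings`, `top_p`, `multi_message`, `web_search`) and the default values are assumptions made for this example, not Verba's actual schema; only the `0..2` and `0..1` ranges come from the table.

```python
from dataclasses import dataclass


@dataclass
class EngineSettings:
    # Hypothetical field names and defaults; only the ranges are from the table.
    temperature: float = 0.7
    top_p: float = 0.9
    multi_message: bool = False
    web_search: bool = False


def validate_settings(s: EngineSettings) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    if not 0.0 <= s.temperature <= 2.0:
        problems.append(f"temperature {s.temperature} is outside 0..2")
    if not 0.0 <= s.top_p <= 1.0:
        problems.append(f"top_p {s.top_p} is outside 0..1")
    return problems
```

For example, `validate_settings(EngineSettings(temperature=2.5))` returns one problem string, while the defaults return an empty list.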

System instructions (behavior prompt)

System instructions are your bot's persistent behavior rules.

Where to set:

Dashboard -> Bot -> AI Engine -> Behavior
Field name: systemInstructions

Limit:

Up to 8000 characters
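
A small pre-flight check can catch over-long instructions before you paste them into the dashboard. This is a sketch only: the helper name is hypothetical, and it assumes the limit is counted the way Python's `len()` counts characters (Unicode code points), which this guide does not specify.

```python
MAX_SYSTEM_INSTRUCTIONS_CHARS = 8000  # limit stated in this guide


def check_system_instructions(text: str) -> int:
    """Return the number of characters left; raise ValueError if over the limit."""
    remaining = MAX_SYSTEM_INSTRUCTIONS_CHARS - len(text)
    if remaining < 0:
        raise ValueError(
            f"systemInstructions is {-remaining} characters over the limit"
        )
    return remaining
```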

How they interact with other layers:

systemInstructions: behavior/rules/tone constraints
Training examples: style shaping and response pattern hints
Long-term memory: durable facts/preferences
Knowledge entries: factual/reference content
Conversation context: recent turns in current chat/thread/session

Practical rule:

Put “how to behave” in system instructions.
Put “facts to remember” in knowledge/memory.
Put “how to phrase outputs” in training examples.

Plan-based limits

Current default limits:

Plan	Max model context	Max response tokens
Free	`50`	`4096`
Plus	`75`	`8192`
Pro	`100`	`16384`
Ultra	`150`	`32768`

If you request settings above your tier limits, they are clamped or rejected depending on the endpoint.
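
The clamping case can be sketched as follows. The numbers are taken from the table above; the dictionary keys and the function name are assumptions for illustration, and endpoints that reject instead of clamping would return an error rather than adjusted values.

```python
# Limits copied from the plan table; key names are hypothetical.
PLAN_LIMITS = {
    "free": {"max_context": 50, "max_response_tokens": 4096},
    "plus": {"max_context": 75, "max_response_tokens": 8192},
    "pro": {"max_context": 100, "max_response_tokens": 16384},
    "ultra": {"max_context": 150, "max_response_tokens": 32768},
}


def clamp_to_plan(plan: str, context: int, response_tokens: int) -> tuple:
    """Reduce requested values to the plan's maximums (never raises them)."""
    limits = PLAN_LIMITS[plan.lower()]
    return (
        min(context, limits["max_context"]),
        min(response_tokens, limits["max_response_tokens"]),
    )
```

With these values, `clamp_to_plan("Free", 80, 8192)` yields `(50, 4096)`, while a request already within the Pro limits is returned unchanged.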

Premium AI model access

AI model availability is plan-based. If you open the AI Engine and choose a model that is outside your current tier, Verba shows an upgrade prompt with the number of additional premium AI models available above your current plan. That upgrade hint is dynamic, so the exact count can differ depending on:

Your current plan
The active model catalog
Whether new premium models have been added since your last visit
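
To see why the count in the upgrade hint moves, here is a minimal sketch that assumes each catalog entry records the lowest plan that can use it. The catalog structure, the model names, and the function are all hypothetical; only the plan names and their order come from this guide.

```python
PLAN_ORDER = ["free", "plus", "pro", "ultra"]  # tiers from the plan table, lowest first

# Hypothetical catalog: model name -> lowest plan that can use it.
CATALOG = {
    "model-a": "free",
    "model-b": "plus",
    "model-c": "pro",
    "model-d": "ultra",
}


def premium_models_above(plan: str, catalog: dict) -> int:
    """Count catalog models that require a higher tier than `plan`."""
    rank = PLAN_ORDER.index(plan.lower())
    return sum(
        1 for required in catalog.values() if PLAN_ORDER.index(required) > rank
    )
```

In this toy catalog a Free user sees 3 models above their plan and a Pro user sees 1; adding a new Ultra-only entry raises both counts by one.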

Recommended presets

Reliable support

Temperature 0.4-0.7, top-p 0.7-0.9, moderate context.

Creative roleplay

Temperature 0.8-1.1, top-p 0.9-1.0, higher context.

Cost-aware

Lower context, web search off, shorter reply style.

Fast troubleshooting

Lower context and more deterministic settings (lower temperature and top-p) to reduce latency variance.
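
The two presets with numeric ranges can be written down as data and used to check a configuration. This is a sketch under assumptions: the preset keys and the helper are invented for the example, and the Cost-aware and Fast troubleshooting presets are left out because this guide describes them only qualitatively.

```python
# Ranges copied from the presets above as (low, high), both ends inclusive.
PRESET_RANGES = {
    "reliable_support": {"temperature": (0.4, 0.7), "top_p": (0.7, 0.9)},
    "creative_roleplay": {"temperature": (0.8, 1.1), "top_p": (0.9, 1.0)},
}


def fits_preset(name: str, temperature: float, top_p: float) -> bool:
    """Return True if both values fall inside the preset's recommended ranges."""
    ranges = PRESET_RANGES[name]
    t_low, t_high = ranges["temperature"]
    p_low, p_high = ranges["top_p"]
    return t_low <= temperature <= t_high and p_low <= top_p <= p_high
```

For instance, a temperature of 0.5 with top-p 0.8 fits the support preset but not the roleplay one.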

Multi-message behavior

When enabled:

Verba can split a response into multiple shorter messages.
Delay between parts is configurable (`0..10000` ms).

Use this for natural chat pacing; disable if you want one compact answer.
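
Verba's actual splitting logic is not documented here. The sketch below only illustrates the general idea: group sentences into shorter parts and give each part a send offset, with the delay kept inside the `0..10000` ms range. The function names and the `max_chars` parameter are assumptions.

```python
import re

MAX_DELAY_MS = 10_000  # upper bound of the configurable delay in this guide


def split_reply(text: str, max_chars: int = 200) -> list:
    """Group sentences into parts of at most max_chars (a long sentence stays whole)."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    parts, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow the part, so start a new one.
            parts.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        parts.append(current)
    return parts


def schedule(parts: list, delay_ms: int) -> list:
    """Pair each part with its send offset in milliseconds."""
    delay_ms = max(0, min(delay_ms, MAX_DELAY_MS))
    return [(i * delay_ms, part) for i, part in enumerate(parts)]
```

A delay of 0 sends all parts back to back, and a requested delay above 10000 ms is treated as 10000 ms in this sketch.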

Web search behavior

When enabled, Verba may run a model-driven search planning step before producing the final answer. This improves freshness but can increase:

Latency
Token usage
Cost

Use web search for:

News
Live pricing
Fast-changing product details

Keep it off for:

Roleplay
Stable canon/lore
Deterministic support flows

System instructions best practices

Keep instructions concrete and scoped.
Avoid contradictory rules.
Prefer short imperative bullets over long prose.
Include a failure policy (what to do when the answer is unknown).
Specify output format explicitly (headings, bullets, steps, code blocks).
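
As an illustration of these practices, here is what a short set of instructions might look like for an imaginary support bot. The content is entirely hypothetical; it is stored in a Python string only so its length can be checked against the 8000-character limit.

```python
# Hypothetical instructions for an imaginary support bot; adapt to your own bot.
SYSTEM_INSTRUCTIONS = """\
- Answer questions about order status and returns only.
- Use a friendly, concise tone.
- If you do not know the answer, say so and suggest contacting a human agent.
- Format multi-step answers as numbered steps.
- Put commands and code in code blocks.
"""

# Stay within the documented limit for the systemInstructions field.
assert len(SYSTEM_INSTRUCTIONS) <= 8000
```

Each bullet is a single imperative rule: one for scope, one for tone, one for the failure policy, and two for output format.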

Quick diagnostics

Replies are too generic

Lower temperature, tighten system instructions, and add targeted training examples.
