Make Voice Feel Natural
Tune SOUL.md and prompt design so your agent sounds good when spoken aloud.
By the end of this page, your agent's voice responses will sound natural — not like it's reading a list of bullet points out loud.
Time: ~15 minutes
The problem with default responses in voice mode
Your agent was probably configured for text. Text responses use markdown — bullet points, headers, code blocks. Read aloud, these sound like:
"Dash dash dash. Number one, install the package. Number two, run the command. Number three..."
You need a voice-specific SOUL.md configuration.
Voice-specific SOUL.md section
Add a section to ~/.openclaw/SOUL.md specifically for voice:
## Voice mode (Talk Mode)
When responding in Talk Mode (voice):
- No markdown. No bullet points, no headers, no asterisks.
- Write in natural spoken sentences. "First you'll want to... then you..." not "1. ... 2. ..."
- Keep responses short. 2-3 sentences for simple questions. 4-5 for complex ones.
- Never read out URLs. Say "I'll send you the link" or "check the link I'm sending" instead.
- End responses naturally, not with "Is there anything else?"
- For lists, use "and" and "also", not enumeration.

OpenClaw automatically passes a voice: true flag when Talk Mode is active, so your agent knows to apply these rules.
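To check whether a draft response follows these rules before you hear it spoken, a small script can flag the usual offenders. The following is a minimal sketch, not part of OpenClaw: the names `voice_lint` and `count_sentences`, the patterns, and the five-sentence ceiling are all assumptions chosen to mirror the rules above.

```python
import re

# Hypothetical helper for checking drafts by hand; not an OpenClaw API.
# Each pattern flags something that sounds wrong when read aloud.
SPEECH_PROBLEMS = {
    "bullet": re.compile(r"^\s*[-*+]\s+", re.MULTILINE),
    "numbered list": re.compile(r"^\s*\d+[.)]\s+", re.MULTILINE),
    "header": re.compile(r"^\s*#{1,6}\s+", re.MULTILINE),
    "emphasis": re.compile(r"\*\*|__|`"),
    "url": re.compile(r"https?://\S+"),
}


def count_sentences(text: str) -> int:
    """Rough count: split on ., ! or ? followed by whitespace or end of text."""
    parts = re.split(r"[.!?]+(?:\s+|$)", text.strip())
    return len([p for p in parts if p])


def voice_lint(text: str, max_sentences: int = 5) -> list[str]:
    """Return the names of the voice rules that the text appears to break."""
    problems = [name for name, pattern in SPEECH_PROBLEMS.items() if pattern.search(text)]
    if count_sentences(text) > max_sentences:
        problems.append("too long")
    return problems


if __name__ == "__main__":
    print(voice_lint("1. Install the package\n2. Run the command"))
    print(voice_lint("First you'll want to install the package. Then run the command."))
```

An empty list means the text should be safe to speak. The sentence count is deliberately crude, so treat "too long" as a hint, not a verdict.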
Prompt design for voice
For questions where you'd normally get a structured response, prime the format:
Instead of:
"What are the steps to deploy a Docker container?"
Try:
"Walk me through deploying a Docker container like you're explaining it to me out loud"
The two prompts produce noticeably different output: the second gets natural speech, while the first tends to get a numbered list.
You can also add to SOUL.md:
When I ask "how do I [X]?" in voice mode, give me a narrative walkthrough, not steps.

Response length tuning
Voice responses that are too long interrupt the flow. Tune for length:
## Voice response length
- Single question → 1-2 sentences
- "Explain [topic]" → 3-4 sentences max
- "Walk me through [process]" → 5-6 sentences, then pause and ask if I want to continue

If you want the agent to pause mid-response for long topics:
For long explanations in voice: give the first part, end with "Want me to continue?" and wait.

Sending follow-up links via text
When your agent mentions something that has a URL, such as documentation or an article, you can have it send the link as a text message after speaking:
Add to config:
{
  "talk_mode": {
    "fallbackChannel": "telegram"
  }
}

With this set, when your agent says "I'll send you the link", it automatically sends the URL to your Telegram. Voice for conversation, text for links and long content.
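One way to picture the voice/text split is as a function that pulls URLs out of a response, leaving a speakable sentence for voice and a list of links for the text channel. This is only an illustrative sketch: `split_for_voice` and its placeholder phrase are hypothetical, and it does not describe how OpenClaw implements the fallback internally.

```python
import re

# Matches http(s) URLs up to the next whitespace or closing bracket/quote.
URL_RE = re.compile(r"https?://[^\s<>\"')\]]+")


def split_for_voice(text: str, placeholder: str = "the link I'm sending") -> tuple[str, list[str]]:
    """Return (spoken_text, urls): URLs are replaced by a spoken placeholder."""
    urls: list[str] = []

    def _swap(match: re.Match) -> str:
        raw = match.group(0)
        # Trailing punctuation belongs to the sentence, not to the URL.
        url = raw.rstrip(".,;:!?")
        urls.append(url)
        return placeholder + raw[len(url):]

    spoken = URL_RE.sub(_swap, text)
    return spoken, urls


if __name__ == "__main__":
    spoken, links = split_for_voice("The docs are at https://example.com/docs. Check them out.")
    print(spoken)  # text to speak
    print(links)   # URLs for the text channel
```

The spoken text keeps its sentence punctuation, and the collected URLs are what would go to the fallback channel.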
A good voice SOUL.md setup
Here's a complete voice section that works well:
## Voice mode
In Talk Mode:
- Respond in natural spoken language. No markdown, no lists, no asterisks.
- Keep it short. 2-3 sentences unless I ask for more.
- Never read URLs aloud — say "sending you the link" and use the fallback channel.
- For step-by-step instructions: narrate them like a person, not like a numbered list.
- Sound like a calm, knowledgeable person, not a text-to-speech demo.
- Don't end with "Is there anything else I can help with?"

Goal complete
Your voice setup:
- Press ctrl+space (or say your wake word) → speak → hear a natural response
- Responses are tuned for speech, not reading
- Links and long content fall back to your text channel
Where to go next
- Level Up: All Your Channels → — voice input from mobile
- Level Up: Run It 24/7 → — so voice works even when the laptop lid is closed
- Pick another goal →