Make Voice Feel Natural

By the end of this page, your agent's voice responses will sound natural, not like a list of bullet points read aloud.

Time: ~15 minutes

The problem with default responses in voice mode

Your agent was probably configured for text. Text responses use markdown — bullet points, headers, code blocks. Read aloud, these sound like:

"Dash dash dash. Number one, install the package. Number two, run the command. Number three..."

You need a voice-specific SOUL.md configuration.

Voice-specific SOUL.md section

Add a section to ~/.openclaw/SOUL.md specifically for voice:

## Voice mode (Talk Mode)
When responding in Talk Mode (voice):
- No markdown. No bullet points, no headers, no asterisks.
- Write in natural spoken sentences. "First you'll want to... then you..." not "1. ... 2. ..."
- Keep responses short. 2-3 sentences for simple questions. 4-5 for complex ones.
- Never read out URLs. Say "I'll send you the link" or "check the link I'm sending" instead.
- End responses naturally, not with "Is there anything else?"
- For lists, use "and" and "also" — not enumeration.

OpenClaw automatically passes a voice: true flag when Talk Mode is active, so your agent knows to apply these rules.
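To check whether a reply would sound awkward when spoken, you could scan it for the formatting the rules above forbid. The following is a minimal sketch, not part of OpenClaw: the function name `find_voice_issues` and the patterns are assumptions made for illustration, and the regular expressions are deliberately rough.

```python
import re

# Hypothetical checker: these patterns mirror the voice rules above.
# They are illustrative and are not an OpenClaw feature.
VOICE_CHECKS = {
    "bullet point": re.compile(r"^\s*[-*+]\s+", re.MULTILINE),
    "numbered list": re.compile(r"^\s*\d+[.)]\s+", re.MULTILINE),
    "header": re.compile(r"^\s*#{1,6}\s+", re.MULTILINE),
    "emphasis asterisks": re.compile(r"\*{1,2}[^*\n]+\*{1,2}"),
    "code fence": re.compile(r"`{3}"),
    "URL": re.compile(r"https?://\S+"),
}


def find_voice_issues(response: str) -> list[str]:
    """Return the names of the voice rules that the response text breaks."""
    return [name for name, pattern in VOICE_CHECKS.items() if pattern.search(response)]


if __name__ == "__main__":
    sample = "## Steps\n1. Install the package\n- see https://example.com"
    print(find_voice_issues(sample))
```

Pasting a few transcribed replies through a check like this is a quick way to see which SOUL.md rule the agent is ignoring.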

Prompt design for voice

For questions that would normally get a structured response, phrase the request so it asks for spoken-style output:

Instead of:

"What are the steps to deploy a Docker container?"

Try:

"Walk me through deploying a Docker container like you're explaining it to me out loud"

The first prompt tends to produce a numbered list; the second tends to produce conversational sentences.

To avoid rephrasing every question, you can also add a rule to SOUL.md:

When I ask "how do I [X]?" in voice mode, give me a narrative walkthrough, not steps.

Response length tuning

Long voice responses break the flow of a conversation because you can't skim them the way you skim text. Set length limits by request type:

## Voice response length
- Single question → 1-2 sentences
- "Explain [topic]" → 3-4 sentences max
- "Walk me through [process]" → 5-6 sentences, then pause and ask if I want to continue

If you want the agent to pause partway through long topics, add a rule like this:

For long explanations in voice: give the first part, end with "Want me to continue?" and wait.
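To see whether replies actually stay inside these limits, a rough sentence counter is enough. This sketch assumes the budgets from the length rules above; the request-type labels (`single`, `explain`, `walkthrough`) and both function names are hypothetical, and the splitting is approximate (abbreviations such as "e.g." will be overcounted).

```python
import re

# Sentence budgets taken from the length rules above. The keys are
# illustrative labels, not OpenClaw settings.
MAX_SENTENCES = {"single": 2, "explain": 4, "walkthrough": 6}


def count_sentences(text: str) -> int:
    """Rough count: split on ., ! or ? followed by whitespace or end of text."""
    parts = re.split(r"[.!?]+(?:\s+|$)", text.strip())
    return len([part for part in parts if part])


def within_budget(text: str, kind: str) -> bool:
    """True if the reply fits the sentence budget for this kind of request."""
    return count_sentences(text) <= MAX_SENTENCES[kind]


if __name__ == "__main__":
    reply = "Install the package. Then run the command. Want me to continue?"
    print(count_sentences(reply), within_budget(reply, "explain"))
```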

Sending follow-up links via text

When your agent mentions something that has a URL (documentation, an article), you can have it send the link as a text message after speaking. Add this to your config:

{
  "talk_mode": {
    "fallbackChannel": "telegram"
  }
}

With this set, when your agent says "I'll send you the link", it automatically sends the URL to your Telegram. Voice for conversation, text for links and long content.
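Conceptually, the split between voice and text looks like the sketch below: URLs are pulled out of the reply, a spoken placeholder takes their place, and the extracted links go to the text channel. This only illustrates the idea and is not how OpenClaw implements the fallback; `split_for_voice` and the placeholder wording are assumptions.

```python
import re

URL_PATTERN = re.compile(r"https?://\S+")


def split_for_voice(response: str, placeholder: str = "the link I'm sending"):
    """Return (text to speak, list of URLs to send over the text channel)."""
    links = []

    def swap(match):
        url = match.group(0)
        # Keep sentence punctuation in the spoken text, not in the link.
        trimmed = url.rstrip(".,;:!?)")
        links.append(trimmed)
        return placeholder + url[len(trimmed):]

    spoken = URL_PATTERN.sub(swap, response)
    return spoken, links


if __name__ == "__main__":
    spoken, links = split_for_voice("Check https://example.com/docs.")
    print(spoken)
    print(links)
```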

A good voice SOUL.md setup

Here's a complete voice section that works well:

## Voice mode
In Talk Mode:
- Respond in natural spoken language. No markdown, no lists, no asterisks.
- Keep it short. 2-3 sentences unless I ask for more.
- Never read URLs aloud — say "sending you the link" and use the fallback channel.
- For step-by-step instructions: narrate them like a person, not like a numbered list.
- Sound like a calm, knowledgeable person, not a text-to-speech demo.
- Don't end with "Is there anything else I can help with?"
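If you'd rather script the change than edit the file by hand, a small helper can append the section only when it is missing. This is a sketch under the assumption that SOUL.md is a plain markdown file at the path mentioned earlier; `add_voice_section` is a hypothetical helper, and the section text here is abbreviated, so paste in the full version from above.

```python
from pathlib import Path

# Abbreviated for the example; use the full voice section shown above.
VOICE_SECTION = """## Voice mode
In Talk Mode:
- Respond in natural spoken language. No markdown, no lists, no asterisks.
- Keep it short. 2-3 sentences unless I ask for more.
"""


def add_voice_section(soul_path: Path, section: str = VOICE_SECTION) -> bool:
    """Append the section unless its heading line already exists.

    Returns True if the file was written, False if nothing changed.
    """
    existing = soul_path.read_text(encoding="utf-8") if soul_path.exists() else ""
    heading = section.splitlines()[0]
    if heading in existing.splitlines():
        return False
    soul_path.parent.mkdir(parents=True, exist_ok=True)
    # Leave exactly one blank line between the old content and the new section.
    prefix = existing.rstrip("\n") + "\n\n" if existing else ""
    soul_path.write_text(prefix + section, encoding="utf-8")
    return True


# Example call (assumes the path used earlier on this page):
# add_voice_section(Path.home() / ".openclaw" / "SOUL.md")
```

Running it twice is safe: the second call finds the heading and leaves the file untouched.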

Goal complete

Your voice setup:

Press ctrl+space (or say your wake word) → speak → hear a natural response
Responses are tuned for speech, not reading
Links and long content fall back to your text channel

Where to go next

Level Up: All Your Channels → — voice input from mobile
Level Up: Run It 24/7 → — so voice works even when the laptop lid is closed
Pick another goal →
