Getting Started with OpenVox

Welcome to OpenVox – Local Voice AI
Transform text into natural speech with 300+ AI voices across 23 languages, completely private and offline on your Mac.

Quick Start (5 Minutes)

1. Install OpenVox

  • Download OpenVox from the Mac App Store
  • Open from your Applications folder
  • Grant any required permissions (microphone for voice cloning, file access for AudioBooks)

2. Your First Generation

Step 1: Enter Your Text

  • Type or paste any text into the main text area
  • Try something simple like: "Hello! This is my first audio generation with OpenVox."

Step 2: Choose a Voice

  • Click the voice selector or browse voices
  • For your first try, use Kokoro-82M voices (fast generation)
  • Popular choices: "Bella" (US Female) or "Adam" (US Male)

Step 3: Generate

  • Click "Generate" or press ⌘+Enter
  • On first use, OpenVox will download the AI model (~2-5 minutes, one-time only)
  • Watch the progress bar with real-time ETA
  • Audio plays automatically when ready!

Step 4: Export (Optional)

  • Click "Export" to save your audio
  • Choose WAV (high quality) or MP3 (smaller file)
  • Save to your desired location

🎉 Congratulations! You've created your first AI-generated audio.

Understanding OpenVox

Two Powerful AI Models

OpenVox includes two complementary models, each optimized for different needs:

🚀 Kokoro-82M (Fast & Efficient)

  • 60+ voices in 9 languages
  • ⚡ Optimized for speed and long documents
  • 📄 Best for: Articles, scripts, batch processing
  • 🌍 Languages: US English, British English, Japanese, Mandarin Chinese, Spanish, French, Hindi, Italian, Portuguese

🎨 Chatterbox (Quality & Versatility)

  • 240+ voices in 23 languages
  • 🎙️ Optimized for voice cloning and premium audio
  • ✨ Best for: High-quality voiceovers, audiobooks, professional projects
  • 🌍 Languages: All Kokoro languages (with US and British English counted once, as English) plus Arabic, Danish, Dutch, Finnish, German, Greek, Hebrew, Korean, Malay, Norwegian, Polish, Russian, Swahili, Swedish, Turkish

Choosing the Right Model:

  • Need speed? → Use Kokoro-82M
  • Need quality? → Use Chatterbox
  • Need voice cloning? → Use Chatterbox (only model that supports cloning)

Main Features Tour

🎤 AI Speech Generation (Main Tab)

The primary feature for converting text to speech.

Basic Controls:

  • Text Input: Enter or paste your text (no length limits)
  • Voice Selector: Browse 300+ voices by model, language, gender, or favorites
  • Speed: Adjust from 0.5x to 2.0x (default: 1.0x)
  • Model Switcher: Choose between Kokoro-82M and Chatterbox

Advanced Controls:

  • Temperature: Control randomness (higher = more variation)
  • CFG Weight: Classifier-free guidance (Chatterbox only)
  • Exaggeration: Voice characteristic intensity (Chatterbox only)
  • Post-Processing: Silence removal, audio normalization

Tips for Best Results:

  • Use proper punctuation for natural pacing
  • Break very long texts into paragraphs
  • Use commas for pauses
  • Avoid excessive ALL CAPS or exclamation marks!!!
  • For technical terms, use phonetic spelling if mispronounced

📖 AI AudioBook Generation

Create complete audiobooks from PDF or text files.

How to Use:

  1. Click the AudioBook tab in the sidebar
  2. Click "New AudioBook" or import a PDF/text file
  3. OpenVox auto-detects chapters (or create manually)
  4. Set voice and settings per chapter (or use same for all)
  5. Generate individual chapters or batch process entire book
  6. Export final audio when complete

Features:

  • Chapter-by-chapter management
  • Per-chapter voice customization
  • Batch processing for entire books
  • Integrated audio player for preview
  • Chapter reordering and deletion

Best For: Converting books to audio format, creating narrated content, long-form content with multiple sections.

🔄 AI Voice Changer

Transform existing audio to different voice characteristics.

How to Use:

  1. Click the Voice Changer tab
  2. Import source audio (MP3 or WAV)
  3. Select target voice from 300+ options
  4. Adjust exaggeration control (how much to transform)
  5. Click "Convert"
  6. Audio longer than 30 seconds is split into chunks automatically
  7. Export transformed audio
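
The automatic chunking in step 6 can be pictured as cutting the clip's timeline into consecutive windows of at most 30 seconds. The sketch below is illustrative only: how OpenVox actually chunks audio is not described in this guide, and the function name `chunk_spans` and the fixed, non-overlapping windows are assumptions.

```python
def chunk_spans(duration_s: float, max_chunk_s: float = 30.0) -> list:
    """Split [0, duration_s] into consecutive (start, end) spans of at most max_chunk_s seconds."""
    if duration_s < 0 or max_chunk_s <= 0:
        raise ValueError("duration must be >= 0 and chunk length must be > 0")
    spans = []
    start = 0.0
    while start < duration_s:
        # The last span is shorter when the duration is not a multiple of the chunk size.
        end = min(start + max_chunk_s, duration_s)
        spans.append((start, end))
        start = end
    return spans
```

Under these assumptions, a 75-second clip yields three spans: 0-30, 30-60, and 60-75 seconds.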

Best For: Character voice variations, podcast voice consistency, audio enhancement, creative voice effects.

🎙️ Voice Cloning

Clone voices from your own audio samples (Chatterbox only).

How to Use:

  1. Click the Voice Clone tab
  2. Click "New Voice"
  3. Choose Language: 23 languages supported
  4. Select Gender: Male or Female
  5. Provide Audio: Upload audio file (15-30 seconds recommended) OR record directly in the app
  6. Add Transcript: Type what the audio says (improves accuracy)
  7. Click "Create Voice"
  8. Use your cloned voice in AI Speech or AudioBook

Requirements:

  • Clear audio sample (at least 15 seconds; 15-30 seconds recommended)
  • One speaker only (no background voices)
  • Reference transcript matching the audio
  • Supported formats: MP3 or WAV
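
If you want to check a sample before importing it, the length and format requirements above can be tested with a few lines of Python. This is a sketch outside the app, not part of OpenVox: the function names are hypothetical, only WAV length is measured (the standard library cannot decode MP3), and the 15-second threshold is taken from the list above.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def sample_problems(path: str, min_s: float = 15.0) -> list:
    """List reasons a file may not meet the requirements (an empty list means no issue was found)."""
    lower = path.lower()
    if not lower.endswith((".wav", ".mp3")):
        return ["unsupported format: expected MP3 or WAV"]
    if lower.endswith(".mp3"):
        # MP3 length needs a third-party decoder, so it is not checked in this sketch.
        return []
    duration = wav_duration_seconds(path)
    if duration < min_s:
        return [f"sample is {duration:.1f}s long; at least {min_s:.0f}s is needed"]
    return []
```

The check cannot tell whether the recording has a single speaker or background noise, so listen to the sample as well.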

Tips:

  • Use high-quality audio (no background noise)
  • Speak naturally at a normal pace
  • Provide an accurate transcript
  • Longer samples (30-60 seconds) give better results

💾 Generation History

All your generations are automatically saved locally.

Features:

  • Search by text content or voice name
  • Filter by AI Speech or Voice Changer
  • Grid or list view
  • Replay audio instantly
  • Reuse settings from previous generations
  • Export or delete past generations

Access:

  • Click the History tab in the sidebar
  • Use the search bar at the top for quick filtering
  • Click any item to replay its audio
  • Right-click for export or delete options

Voice Library

OpenVox includes 300+ professional voices optimized for different use cases.

Browsing Voices

By Model:

  • Kokoro-82M: Fast generation, 60+ voices, 9 languages
  • Chatterbox: High quality, 240+ voices, 23 languages

By Language:

  • Filter by your target language
  • See voice count per language
  • Preview voices with sample audio

By Characteristics:

  • Gender: Male, Female
  • Age: Young, Middle-aged, Old (Chatterbox only)
  • Accent: American, British (Chatterbox only)

Favorites:

  • Click the star icon to save favorites
  • Quick access to your preferred voices
  • Works across all features

Voice Previews

  • Click the play icon next to any voice
  • Listen to sample audio before generating
  • Preview shows voice characteristics
  • Helps you choose the right voice for your project

Managing Models

Model Library

Access via Sidebar → Manage Models button.

Available Models:

  • Kokoro-82M: ~327MB (fast, 9 languages)
  • Chatterbox (Standard): ~1.2GB (high quality)
  • Chatterbox (8-bit): ~600MB (balanced)
  • Chatterbox (4-bit): ~400MB (memory efficient)
  • Chatterbox Multilingual: ~800MB-1.5GB (23 languages)

Actions:

  • View download status
  • Download models in advance
  • Delete unused models to free space
  • Switch between quantization levels

Choosing Quantization:

  • Standard: Best quality, largest size
  • 8-bit: Balanced quality/size
  • 4-bit: Smallest size, slightly lower quality
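
One way to reason about this trade-off is to pick the highest-quality Chatterbox variant that fits the space you are willing to spend. The sketch below uses the approximate sizes from the Available Models list; the function name and the idea of using a size budget as the deciding factor are assumptions for illustration, not how the app chooses a model.

```python
from typing import Optional

# Approximate download sizes in MB, from the Available Models list above,
# ordered from best quality to smallest size.
CHATTERBOX_VARIANTS = [
    ("Standard", 1200),
    ("8-bit", 600),
    ("4-bit", 400),
]


def best_variant_within(budget_mb: float) -> Optional[str]:
    """Return the highest-quality variant whose size fits the budget, or None if none fits."""
    for name, size_mb in CHATTERBOX_VARIANTS:
        if size_mb <= budget_mb:
            return name
    return None
```

With a 700 MB budget this returns "8-bit"; with less than 400 MB it returns None, in which case Kokoro-82M (~327MB) may be the better fit.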

First-Time Model Download

On your first generation, OpenVox automatically downloads the required model:

  • Kokoro-82M: 2-5 minutes
  • Chatterbox: 5-15 minutes (varies by version)

Progress Tracking:

  • Real-time download progress
  • ETA displayed
  • Can use app while downloading

Internet Required:

  • One-time download only
  • Downloaded from Hugging Face
  • Models cached locally in ~/.cache/huggingface/
  • After download: Completely offline!
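
To see how much space the cached models take, you can total the file sizes under the cache folder named above. This is a generic directory-size sketch rather than an OpenVox feature; the function name is hypothetical, and symbolic links are skipped on the assumption that the cache may link one stored file from several places.

```python
import os
from pathlib import Path


def directory_size_bytes(root: Path) -> int:
    """Sum the sizes of all regular files under root; returns 0 if root does not exist."""
    total = 0
    # os.walk does not descend into symlinked directories by default.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if not path.is_symlink():  # skip links so a shared file is counted once
                total += path.stat().st_size
    return total
```

For example, directory_size_bytes(Path.home() / ".cache" / "huggingface") / 1e6 gives the cache size in megabytes.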

Keyboard Shortcuts

Speed up your workflow with these shortcuts:

General

  • ⌘+Enter – Generate audio
  • ⌘+L – Focus text input
  • Space – Play/Pause audio
  • ⌘+H – Open History

Text Editing

  • ⌘+A – Select all text
  • ⌘+C – Copy
  • ⌘+V – Paste
  • ⌘+Z – Undo

Navigation

  • ⌘+1 – AI Speech tab
  • ⌘+2 – AudioBook tab
  • ⌘+3 – Voice Changer tab
  • ⌘+4 – Voice Clone tab
  • ⌘+5 – History tab

Tips for Best Results

Text Input

Do:

  • ✅ Use proper punctuation for natural pacing
  • ✅ Break long texts into paragraphs
  • ✅ Use commas for pauses
  • ✅ Write in complete sentences
  • ✅ Use quotes for dialogue: "Hello," she said.

Don't:

  • ❌ Use excessive exclamation marks!!!
  • ❌ Write in ALL CAPS (unless emphasizing)
  • ❌ Include URLs or code (spell them out instead)
  • ❌ Use special characters excessively ($$$, ***, etc.)
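
If you prepare text in bulk, a few of these tips can be applied automatically before pasting it into OpenVox. The sketch below is a minimal pre-cleaning pass under stated assumptions: the function name `tidy_for_speech` is hypothetical, it only handles repeated "!" or "?", runs of "$", "*", or "#", and extra spaces, and it does nothing about ALL CAPS or URLs.

```python
import re


def tidy_for_speech(text: str) -> str:
    """Apply a few of the Do/Don't tips to raw input text."""
    text = re.sub(r"([!?])\1+", r"\1", text)  # "!!!" becomes "!"
    text = re.sub(r"[$*#]{2,}", " ", text)    # drop runs such as "$$$" or "***"
    text = re.sub(r"[ \t]+", " ", text)       # collapse repeated spaces and tabs
    return text.strip()
```

A single "$" (as in "$5") is left alone, since only runs of two or more special characters are removed.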

Voice Selection

For Narration:

  • Use Narrative voices (Kokoro)
  • Professional, clear, storytelling tone

For Conversational:

  • Use Conversational A/B voices (Kokoro)
  • Natural, friendly tone

For Professional:

  • Use Professional voices (Chatterbox)
  • Formal, clear, business-appropriate

For Character Voices:

  • Use Chatterbox with exaggeration control
  • Experiment with different ages and accents

Speed Settings

  • 0.5x-0.8x: Slow, deliberate (learning content)
  • 1.0x: Natural pace (default)
  • 1.2x-1.5x: Faster (podcast-style)
  • 1.5x-2.0x: Very fast (time-saving)
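
As a rough guide to what these settings mean for listening time, the sketch below divides the length at 1.0x by the speed factor. It assumes the speed control behaves like a simple rate multiplier, which this guide does not confirm, and the function name is hypothetical.

```python
def duration_at_speed(base_seconds: float, speed: float) -> float:
    """Estimate audio length at a given speed, given its length at 1.0x."""
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    return base_seconds / speed
```

Under that assumption, ten minutes of audio at 1.0x becomes about 6 minutes 40 seconds at 1.5x and 20 minutes at 0.5x.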

Long Documents

For texts over 5,000 words:

  • Kokoro-82M: Fast generation, ideal for long docs
  • AudioBook Feature: Best for books with chapters
  • Batch Processing: Generate multiple sections at once

Export & File Management

Export Formats

WAV (Recommended for Quality)

  • 24kHz, 16-bit, lossless
  • Best for editing or professional use
  • Larger file size
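
To estimate how large a WAV export will be, multiply the sample rate by the bytes per sample, the channel count, and the length in seconds. The sketch below uses the 24kHz, 16-bit figures above; mono output is an assumption (the guide does not state the channel count), and the small file header is ignored.

```python
def wav_payload_bytes(seconds: float, sample_rate: int = 24_000,
                      bit_depth: int = 16, channels: int = 1) -> int:
    """Approximate size of uncompressed PCM audio data, excluding the WAV header."""
    bytes_per_sample = bit_depth // 8
    return int(seconds * sample_rate * bytes_per_sample * channels)
```

With these defaults, one minute of audio is 2,880,000 bytes (about 2.9 MB) and one hour is about 173 MB.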

MP3 (Recommended for Sharing)

  • Compressed, widely compatible
  • Smaller file size
  • Good for podcasts, videos, web

Export Options

From Generation View:

  • Click "Export" button after generation
  • Choose format and location
  • Audio saved with timestamp

From History:

  • Right-click any generation
  • Select "Export"
  • Batch export multiple items

Drag & Drop:

  • Drag audio from History to Finder
  • Quick export without dialogs

Privacy & Offline Use

100% Private

Your data never leaves your Mac:

  • ✅ All AI processing happens locally
  • ✅ No cloud services or servers
  • ✅ No analytics or tracking
  • ✅ No account required
  • ✅ No internet after model download

You can verify:

  • Use Activity Monitor or Little Snitch
  • After the initial model download, you should see zero network activity
  • All data is stored in the local app container

Completely Offline

After initial setup:

  • ✅ No internet required for generation
  • ✅ Perfect for travel
  • ✅ Works on planes, trains, remote areas
  • ✅ No API rate limits
  • ✅ Unlimited generations

Internet is needed only for:

  • Initial model download (one-time)
  • App updates from the Mac App Store

Data Storage

Where your data lives:

  • Generation History: Local SwiftData database
  • Cloned Voices: Local app container
  • Preferences: macOS UserDefaults
  • AI Models: ~/.cache/huggingface/hub/
  • Exported Audio: Your chosen location
  • Nothing in the cloud!

Troubleshooting

Model Download Issues

Problem: Download stuck or slow

Solutions:

  • Check internet connection
  • Try smaller model (4-bit vs standard)
  • Check disk space (need 1-3GB free)
  • Restart app and retry
  • Check Model Library for progress

Generation Issues

Problem: Audio sounds robotic

Solutions:

  • Reset speed to 1.0x
  • Try different voice
  • Simplify punctuation
  • Break long sentences

Problem: Words mispronounced

Solutions:

  • Use phonetic spelling (e.g., "Nee-chuh" for "Nietzsche")
  • Add hyphens (e.g., "data-base")
  • Use commas for pacing

Performance Issues

Problem: Generation is slow

Solutions:

  • Close other intensive apps
  • Ensure Mac is plugged in (not low-power mode)
  • Use Kokoro-82M for faster generation
  • Restart Mac to clear memory
  • Check Activity Monitor for runaway processes

Can't Find Features

Problem: Where is voice cloning?

Answer: Click "Voice Clone" tab in sidebar

Problem: How to import PDF for audiobook?

Answer: Click "AudioBook" tab → "New AudioBook" → Import PDF

Problem: Where are advanced controls?

Answer: Expand "Advanced" section below voice selector

System Requirements

Minimum Requirements

  • Mac: Apple Silicon (M1, M2, M3, M4, or later)
  • macOS: macOS 15.0 (Sequoia) or later
  • Disk Space: 1-3GB (varies by models installed)
  • RAM: 8GB minimum (16GB recommended)
  • Internet: For one-time model download only
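
The first three requirements can be checked from a terminal with Python's standard platform module. This is a sketch, not an official compatibility checker: the function name is hypothetical, RAM must be supplied by hand, and it relies on natively running Python reporting "arm64" as the machine type on Apple Silicon.

```python
import platform


def meets_minimum(machine: str, macos_version: str, ram_gb: float) -> bool:
    """Check chip, macOS major version, and RAM against the minimums listed above."""
    # An empty version string (for example on a non-Mac system) counts as version 0.
    major = int(macos_version.split(".")[0]) if macos_version else 0
    return machine == "arm64" and major >= 15 and ram_gb >= 8


if __name__ == "__main__":
    # ram_gb is entered manually here; replace 8 with your Mac's memory size.
    print(meets_minimum(platform.machine(), platform.mac_ver()[0], ram_gb=8))
```

Python running under Rosetta may report "x86_64" even on Apple Silicon, so treat a False result as a prompt to check About This Mac.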

Intel Macs Not Supported

OpenVox requires Apple Silicon and uses Apple's MLX framework, which is not available on Intel Macs. There are no plans for Intel support as MLX is Apple Silicon-only.

Next Steps

Explore More Features

Try AudioBook Generation:

  1. Click AudioBook tab
  2. Import a PDF or text file
  3. Generate chapter by chapter
  4. Create your first audiobook!

Experiment with Voice Cloning:

  1. Click Voice Clone tab
  2. Record or upload 30 seconds of audio
  3. Create your custom voice
  4. Use it in AI Speech

Use Voice Changer:

  1. Click Voice Changer tab
  2. Import existing audio
  3. Transform to different voice
  4. Export transformed audio

Optimize Your Workflow

Set Favorites:

  • Star your preferred voices
  • Quick access across all features

Use Keyboard Shortcuts:

  • ⌘+Enter to generate
  • Space to play/pause
  • ⌘+L to focus text

Organize History:

  • Use search to find past generations
  • Reuse settings from history
  • Export batches for projects

Learn Advanced Techniques

Fine-Tune Generation:

  • Experiment with Temperature
  • Adjust CFG Weight (Chatterbox)
  • Use Exaggeration for character voices

Optimize for Use Case:

  • Podcasts: Conversational voices at 1.2x
  • Audiobooks: Narrative voices with chapters
  • Professional: Professional voices at 1.0x
  • Character Work: Chatterbox with high exaggeration

Batch Processing:

  • Use AudioBook for multi-chapter content
  • Clone voices for consistency
  • Export in bulk from History

Getting Help

Documentation

  • FAQ: Comprehensive answers to common questions
  • Support Docs: Detailed technical information
  • This Guide: Getting started and feature overview

Contact Support

  • Email: support@theoracleguy.in
  • Website: theoracleguy.in/support
  • Response Time: 24-48 hours

Report Issues

Include in your report:

  • Description of the problem
  • Steps to reproduce
  • Sample text (if generation-related)
  • System info (macOS version, Mac model)
  • Screenshots or error messages

Welcome to OpenVox!

You're now ready to transform text into natural speech with complete privacy and control.

Remember:

  • 🔒 100% private – No cloud, no tracking
  • 🌐 Completely offline after setup
  • 🎙️ 300+ voices across 23 languages
  • 🚀 Two models: Kokoro-82M (speed) and Chatterbox (quality)
  • ⚡ Apple Silicon accelerated via MLX
  • 🎨 Voice cloning, AudioBooks, Voice Changer included

Need Help?

  • Check the FAQ for quick answers
  • Visit theoracleguy.in/support for more information
  • Email support@theoracleguy.in

Enjoy creating amazing audio with OpenVox!

*Last Updated: January 27, 2026 | Version 1.0.0*
