Product roadmap

OpenVox roadmap.

What shipped, what is next, and how OpenVox is evolving.

Active milestone

2.3.0

Now shipping

More Select & Read voices and smarter pronunciation

200+ New Voices

Voice Avatars

Smart Numbers & Dates

Reading Improvements

Release spotlight

This update expands voice choices for Select & Read and improves reading across OpenVox.

Added 200+ new Supertonic voices with support for 30+ languages in Select & Read mode.
Added avatar images across the Voice Library for easier voice recognition.
Improved number, date, percentage, currency, and decimal pronunciation with Smart Numbers & Dates, previously called Convert Numbers to Words.
Fixed previous AI models remaining in memory after switching to Supertonic.
Improved EPUB imports containing decorative drop caps.
Improved word replacements containing comma-formatted numbers.
Added UI fixes and improvements across OpenVox.
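
To illustrate what number normalization of this kind involves, here is a minimal Python sketch that spells out integers, decimals, comma-formatted numbers, and percentages before text reaches a speech model. It is not OpenVox's implementation: the function names, the English-only word lists, and the one-million cutoff are all assumptions made for the example, and dates and currency are left out.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

# Matches 45, 1,250, 3.14 and 45% (hypothetical pattern, English formatting only).
NUMBER = re.compile(r"\d+(?:,\d{3})*(?:\.\d+)?%?")


def int_to_words(n: int) -> str:
    """Spell out a non-negative integer below one million."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        return ONES[hundreds] + " hundred" + (" " + int_to_words(rest) if rest else "")
    thousands, rest = divmod(n, 1000)
    return int_to_words(thousands) + " thousand" + (" " + int_to_words(rest) if rest else "")


def spell_number(token: str) -> str:
    """Spell out one numeric token; leave it unchanged if it is out of range."""
    percent = token.endswith("%")
    body = token.rstrip("%").replace(",", "")  # drop thousands separators
    whole, _, frac = body.partition(".")
    if int(whole) >= 1_000_000:
        return token
    words = int_to_words(int(whole))
    if frac:
        # Decimals are read digit by digit after "point".
        words += " point " + " ".join(ONES[int(d)] for d in frac)
    return words + (" percent" if percent else "")


def normalize(text: str) -> str:
    """Replace every numeric token in the text with its spoken form."""
    return NUMBER.sub(lambda m: spell_number(m.group()), text)
```

With these assumptions, `normalize("Sales rose 45% to 1,250 units.")` yields "Sales rose forty-five percent to one thousand two hundred fifty units."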

Direction

From voice app to local voice infrastructure.

Future roadmap

Next up

Planned

Watch Folder Automation

A watch folder that auto-converts any text file or document dropped into it using your preset voice and model, with no UI interaction needed.
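
Since this feature is only planned, the following is a rough Python sketch of how a watch folder can work in general, not a description of OpenVox's design. It polls a directory and hands each newly seen text file to a callback. The suffix list, the function names, and the `convert` callback (a stand-in for "generate speech with the preset voice and model") are assumptions.

```python
import time
from pathlib import Path
from typing import Callable, Optional

# Assumed set of file types to pick up; the real feature may support others.
TEXT_SUFFIXES = {".txt", ".md", ".rtf"}


def find_new_files(folder: Path, seen: set) -> list:
    """Return text files in the folder that have not been seen yet, and remember them."""
    new_files = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in TEXT_SUFFIXES and p not in seen
    )
    seen.update(new_files)
    return new_files


def watch(folder: Path, convert: Callable[[Path], None],
          interval: float = 2.0, max_polls: Optional[int] = None) -> None:
    """Poll the folder and call convert(path) once for each new file.

    convert is a hypothetical hook standing in for the actual conversion step.
    max_polls=None keeps polling until the process is stopped.
    """
    seen: set = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for path in find_new_files(folder, seen):
            convert(path)
        polls += 1
        time.sleep(interval)

# Example use (runs until interrupted):
#   watch(Path("~/OpenVoxInbox").expanduser(), print)
```

Polling keeps the sketch dependency-free. A production version would more likely use file system events and wait until a file has finished copying before converting it.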

Planned

SRT / Subtitle Export

Native .srt export so generated speech can drop straight into video editing timelines without hours of manual subtitle work in post-production.
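
For readers unfamiliar with the format, the sketch below shows what an .srt file contains: numbered cues, each with a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range and the spoken text. It assumes each generated segment comes with its text and audio duration in seconds. That input shape and the function names are illustrative only and say nothing about how OpenVox will implement the export.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def build_srt(segments: list) -> str:
    """Build SRT text from (text, duration_seconds) pairs played back to back."""
    blocks = []
    start = 0.0
    for index, (text, duration) in enumerate(segments, start=1):
        end = start + duration
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
        start = end  # the next cue begins where this one ends
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```

Calling `build_srt([("Hello.", 1.5), ("World.", 2.0)])` produces two cues, the first spanning `00:00:00,000 --> 00:00:01,500` and the second `00:00:01,500 --> 00:00:03,500`.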

Planned

Release history

Recent version history

Recent releases focused on MCP support, Select & Read, audiobook workflow improvements, Supertonic 3 fixes, and model parameter controls.

2.2.0

Automation and read-aloud improvements

Introduced an all-new, revamped menu bar with quick access to new actions and resource status.
Introduced MCP support, making it easier to connect OpenVox with Claude Code, Cursor, and other AI agents.
Improved the Select & Read page layout and the handling of larger models, and added a model loading animation to the notch player.
Fixed model generation defaults so each model keeps its own parameter settings.
Fixed Voice Changer so that it respects the "Auto-play generated audio" setting.
Fixed audiobook play buttons remaining after generated audio or local data was cleared.

2.1.0

Select & Read and audiobook workflow

Added Select & Read to read selected text aloud from any Mac app using a global shortcut.
Added a compact notch-style player for Select & Read playback.
Added menu bar mode with quick access to show the app, change voice/speed, and start or stop the local API.
Added startup options for launching OpenVox in window mode or menu bar mode.
Added Start OpenVox at Login setting.
Improved audiobook generation and export progress overlays with clearer chapter and full-book progress.
Improved audiobook delete options so you can delete the full book or only generated audio.
Added global word replacement settings under Text Preprocessing, including support for pause tags.
Fixed multiple bugs in the audiobook workflow.

2.0.2

Audiobook editing polish

Polished UI elements across the AI Speech page and Voice Library.
Improved AI Audiobook editing with manual chapter creation and drag-and-drop chapter reordering.
Refined audiobook controls with clearer hover states, better selection styling, and play/pause feedback per chapter.
Added a larger full-column settings view on the audiobook page for easier model, voice, and generation tuning.

2.0.1

Audiobook and word replacement fixes

Fixed a bug where voices did not show on the Audiobook page for the Supertonic 3 model.
Fixed reported issues in the word replacement feature.

2.0.0

Supertonic 3

Added Supertonic 3, a blazing-fast multilingual TTS model that supports 31 languages and can run up to 2-3x faster than Kokoro.
Near real-time model loading makes it easier to jump into speech generation instantly.
Ultra-fast replies are ideal for voice agents, assistants, API workflows, and rapid iteration.
Great fit for long-form generation where speed matters, including audiobooks and batch voice work.

1.9.1

Voice Clone trimming and Local API playback

Improved Voice Clone trimming with more accurate waveform alignment at all zoom levels.
Added Local API playback support with play: true, so agents and scripts can ask OpenVox to generate speech and play it directly in the app.
Improved Local API parameter handling so speed now works consistently across all models.
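
As a rough illustration of the two parameters named above, the Python sketch below builds a JSON request body that carries `play` and `speed`. Only those two parameter names come from the release notes. The `text` and `voice` fields and the function name are placeholders, and the sketch deliberately sends nothing, because the endpoint and the exact request schema are not given here. Check the Local API documentation for the real shape.

```python
import json


def build_speech_request(text: str, voice: str,
                         speed: float = 1.0, play: bool = True) -> str:
    """Build a JSON body for a speech request (hypothetical schema).

    "play" and "speed" are the parameters mentioned in the release notes;
    "text" and "voice" are assumed field names used for illustration.
    """
    payload = {
        "text": text,
        "voice": voice,
        "speed": speed,  # playback/generation speed multiplier
        "play": play,    # ask the app to play the audio after generating it
    }
    return json.dumps(payload)
```

A script or agent would send this body to the Local API with whatever HTTP client it already uses.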

1.9.0

Voice Clone workflow improvements

Improved Voice Clone with guided trimming for longer audio clips, including waveform preview, zoom, drag handles, and quality checks for clean start/end points.
Updated Voice Clone sample requirements to 5-12 seconds for more stable results.
Added a confirmation step to ensure the Reference Script matches the audio clip before saving a cloned voice.
Improved Voice Changer voice selection so custom voices from all languages appear correctly.
Added M4A and AIFF source audio selection support in Voice Changer.
Added model parameter controls to the Local API.

1.8.1

Pocket TTS reliability

Fixed a Pocket TTS loading issue that could affect some macOS user accounts.
Added the ability to create new chapters inside an existing audiobook.
Chapter title is now editable in the chapter detail editor.

1.8.0

Pocket TTS

Added Pocket TTS with support for English, French, German, Italian, Portuguese, and Spanish.
Delivers fast, low-latency generation for AI agents and Local API workflows using cloned or custom voices.
Efficiently handles longer generations, including conversations and audiobooks.
Works with voices from your Voice Library and custom voice recordings.
Save your preferred model, language, and voice as defaults across AI Speech, Conversations, and Audiobooks.

1.7.0

Voice creation

Speed controls now support precise 0.05 adjustments.
Voice Clone now accepts only 5–15 second audio samples, for more stable cloned voice output.
Voice Design saving is now integrated into the generation results.
Improved MLX performance tuning for Macs with higher memory.
Added an interactive guided tour to help explore OpenVox.

1.6.6

Script pauses & presets

Add precise pauses using tags.
Save reusable parameter presets for your favourite voice and model combinations.
Add custom audiobook cover artwork and embed it in exported audiobooks.
Fixed crashes and improved generation and export reliability.

1.6.5

Backup & API

Added portable ZIP backup support so you can transfer saved voices between Macs or restore them after reinstalling OpenVox.
Added an auto-start option for the Local API server, making it easier to use OpenVox with agents, automations, and local workflows.
Expanded Conversations support from 4 speakers to up to 16 speakers, making it much better for podcasts, scripts, plays, audiobooks, and larger dialogue projects.

1.6.4

Audiobook export

Added M4B export for better audiobook compatibility with chapters and timeline support, and removed the slower MP3 export option.
Improved reliability and reduced memory usage when merging and exporting large audiobooks, with exports now up to 10x faster.
Added EPUB cover support, including thumbnail display inside the app.
Added cover artwork embedding for exported M4A and M4B audiobooks.
Fixed an issue where batch audiobook generation could show 100% progress before the final chapter had actually finished.

1.6.3

Export speed

Added an audiobook export format picker, with M4A selected by default for faster and smaller exports.
Added real export progress, showing actual bytes processed instead of a generic loading animation.
Audiobook chapters now use efficient M4A intermediates instead of large temporary WAV files, reducing storage usage by up to 10x.
Improved cleanup so temporary audiobook files are removed when books or chapters are deleted.
Added a completion dialog after successful export, with a Show in Finder button to reveal the exported file.
Polished the export save dialog for a cleaner, more consistent experience.

1.6.2

Settings access

Settings are now easier to access from the top title bar, helping you adjust preferences faster and keep your workflow smoother.

1.6.1

Conversations fix

Cloned voices and Voice Design voices now work correctly in Conversations and through the Local API.

1.5.0

Local API

Added OpenVox Local API support for AI agents and external tools.
Enhanced Voice Templates so gender tags stay in sync with the voice design input.
Added PDF import in AI Speech single mode for faster text-to-audio workflows.
Improved word replacements under Text Preprocessing so they apply more reliably.

1.4.1

Stability

Fixed reversed typing in Batch mode on older Macs.
Fixed microphone permission detection and recording access for voice cloning.

1.4.0

OmniVoice

Added OmniVoice support for more natural, expressive, and context-aware speech.
Expanded language coverage to 600+ languages with stronger multilingual performance.
Added new library voices for 25 more languages using the OmniVoice model.

1.3.0

Audiobooks

Added speed control support to more TTS models beyond Kokoro.
Improved the AI Audiobook experience across import, chapter handling, and export.
Added EPUB import for AI Audiobook.
Large ebook imports now show a proper loading animation to keep long imports responsive.
Merge and export now include processing feedback for long audiobook exports.
Individual chapters can now be deleted after ebook import.
Exported chapter filenames are cleaner and easier to sort.
Fixed batch paragraph editor text direction behavior for RTL languages on older macOS versions.
The Daily Power badge is now hidden when Pro is active.

1.2.3

Batch mode

Added support for custom model storage paths in Settings, including moving existing model files to a new location.
Batch Mode now supports importing text from .txt, .rtf, and .csv files.
Added a downloadable CSV template and faster cleanup with the Clear All button.

1.2.2

Fixes

Fixed a launch crash affecting users with M1 and M1 Max Macs.
Model downloads now persist across updates and do not need to be downloaded again after updating.
Paste now always inserts plain text, avoiding unwanted formatting from browsers and docs.
Fixed post-processing failures affecting silence removal and normalization.

1.2.1

Voice changer

Fixed an issue where AI Voice Changer could fail for certain audio clip lengths.
Improved AI Voice Changer processing for longer audio clips.

1.2

Voice design

Added Qwen3 TTS support with high-quality reference voice cloning across 10 languages.
Added Voice Design so you can describe a voice in natural language and generate a brand new reusable voice locally on your Mac.
Improved output generation times across the app.
Added a speech recognition helper for Voice Cloning that can detect the spoken reference script from uploaded samples.
Improved Model Manager download and setup flows.
Improved model memory management.

1.1

Core fixes

Fixed a Voice Conversion issue where MP3 inputs could fail during processing with a format error.
Fixed an issue where Manage Models opened the Kokoro model page instead of the main Model Manager page.
Fixed Resource Manager status so the currently loaded model is shown correctly instead of displaying "no model loaded" when a model is active.