text-to-speech
Realtime speech for AI agents. Studio-quality voices at $0.40 an hour.
we're cheaper
The same realtime speech, at a fraction of the cost.
| Provider | Cost per hour of audio | Latency (time to first audio) |
|---|---|---|
| Kova | $0.40 / hr | TBD |
| Google Chirp 3 HD | $1.35 / hr | |
| ElevenLabs v3 | $4.50 / hr | |
| OpenAI tts-1-hd | $1.35 / hr | |
| Inworld TTS-2 | $1.13 / hr | <200 ms |
Cost converts each provider's published list price for the noted model to dollars per hour of audio, assuming about 150 words per minute (45,000 characters per hour). Latency, where shown, is the vendor's reported server-side time to first audio (excludes network); left blank where no comparable figure is published.
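The conversion described in the note can be sketched as follows. This is a minimal illustration, not the page's actual calculation: the figure of 5 characters per word is an assumption inferred from 150 words per minute equalling 45,000 characters per hour, and the $30 per million characters input is a hypothetical list price chosen only because it reproduces the $1.35 / hr figure in the table.

```python
# Sketch of the per-hour cost conversion, under the stated assumptions.
WORDS_PER_MINUTE = 150   # speaking rate given in the note
CHARS_PER_WORD = 5       # assumption: implied by 45,000 characters per hour
CHARS_PER_HOUR = WORDS_PER_MINUTE * 60 * CHARS_PER_WORD  # 45,000


def cost_per_hour(price_per_million_chars: float) -> float:
    """Convert a price per 1M characters into dollars per hour of audio."""
    return price_per_million_chars * CHARS_PER_HOUR / 1_000_000


# Hypothetical list price of $30 per 1M characters (illustrative only).
print(f"${cost_per_hour(30.0):.2f} / hr")  # prints "$1.35 / hr"
```

Providers that price per minute or per token would need a different conversion step before the figures are comparable.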
...and better
The same script, in each provider's voice. Kova first.
Originally, we built our own voice model for internal use because we couldn’t find one that made the unit economics work. Over time, we realized the technology had potential far beyond our own needs. So, we decided to make it public—hoping it could unlock entirely new categories of businesses and experiences that simply haven’t been possible at current industry pricing.
Samples: Kova and Inworld are available now; Google, ElevenLabs, and OpenAI are coming soon.