text-to-speech

Get started

Realtime speech for AI agents. Studio-quality voices at $0.40 an hour.

we're cheaper

The same realtime speech, at a fraction of the cost.

Provider            Cost          Latency
Kova                $0.40 / hr    TBD
Google Chirp 3 HD   $1.35 / hr
ElevenLabs v3       $4.50 / hr
OpenAI tts-1-hd     $1.35 / hr
Inworld TTS-2       $1.13 / hr    <200 ms

Cost converts each provider's published list price for the noted model to dollars per hour of audio, assuming about 150 words per minute at roughly 5 characters per word (45,000 characters per hour). Latency, where shown, is the vendor's reported server-side time to first audio (excludes network); it is left blank where no comparable figure is published.
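
For anyone who wants to check the math, here is a minimal sketch of that conversion. It assumes the list price is quoted per million characters; the function names and the $30 input are illustrative, not any provider's confirmed price.

```python
# Assumptions taken from the footnote: about 150 spoken words per minute and
# 45,000 characters per hour, which implies roughly 5 characters per word.
WORDS_PER_MINUTE = 150
CHARS_PER_WORD = 5
CHARS_PER_HOUR = WORDS_PER_MINUTE * 60 * CHARS_PER_WORD  # 45,000


def cost_per_hour(price_per_million_chars: float) -> float:
    """Convert a list price in USD per 1M characters to USD per hour of audio."""
    return price_per_million_chars * CHARS_PER_HOUR / 1_000_000


def price_per_million_chars(cost_per_hour_usd: float) -> float:
    """Inverse: the per-1M-character price implied by an hourly cost."""
    return cost_per_hour_usd * 1_000_000 / CHARS_PER_HOUR


if __name__ == "__main__":
    # Hypothetical list price of $30 per 1M characters.
    print(f"${cost_per_hour(30.0):.2f} / hr")  # $1.35 / hr
    # What $0.40 an hour works out to per 1M characters.
    print(f"${price_per_million_chars(0.40):.2f} per 1M characters")  # $8.89 per 1M characters
```

Providers that bill per minute or per token would need a different conversion; this sketch only covers per-character pricing.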


...and better

The same script, in each provider's voice. Kova first.

Originally, we built our own voice model for internal use because we couldn’t find one that made the unit economics work. Over time, we realized the technology had potential far beyond our own needs. So, we decided to make it public—hoping it could unlock entirely new categories of businesses and experiences that simply haven’t been possible at current industry pricing.
Kova
Google (soon)
ElevenLabs (soon)
OpenAI (soon)
Inworld