Sesame
Conversational speech model that achieves voice presence
About
The page presents Sesame AI's research on conversational AI, focusing on its CSM (Conversational Speech Model) and the model's evaluation on the Expresso dataset. It describes two CMOS (comparative mean opinion score) studies. The first, run without context, assessed naturalness; generated and human speech were indistinguishable. The second gave listeners 90 seconds of audio and text context and assessed appropriateness; human speech was consistently favored, which points to a remaining gap in prosody. The article announces that the work is being open-sourced under an Apache 2.0 license and discusses current limitations: the model is English-centric and does not make use of pre-trained language models. Future plans include scaling up, expanding to more languages, and developing fully duplex multimodal models. The article closes with an invitation to join the team.
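To make the CMOS comparison concrete, here is a minimal sketch of how such a score could be aggregated from listener ratings. Everything in it is an assumption for illustration: the rating scale (integers from -3 to +3, positive meaning the listener preferred the human recording), the function name `cmos`, and the sample ratings are hypothetical and are not taken from Sesame's study.

```python
from math import sqrt
from statistics import mean, stdev


def cmos(ratings):
    """Return (mean rating, approximate 95% confidence half-width).

    Assumption: each rating is one listener's preference for a pair of
    clips, on a scale from -3 (generated clearly better) to +3 (human
    clearly better). The scale is illustrative, not the study's.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    score = mean(ratings)
    # Normal approximation to the confidence interval of the mean.
    half_width = 1.96 * stdev(ratings) / sqrt(len(ratings))
    return score, half_width


# Hypothetical ratings, invented purely for illustration.
no_context = [0, 1, -1, 0, 0, 1, -1, 0]
with_context = [1, 2, 0, 1, 2, 1, 0, 1]

for name, ratings in [("no context", no_context), ("with context", with_context)]:
    score, half_width = cmos(ratings)
    print(f"{name}: CMOS = {score:+.2f} +/- {half_width:.2f}")
```

Under these assumptions, a score whose interval includes zero means listeners showed no measurable preference, while a clearly positive score means the human recording was favored.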
Color Palette
Text Black: #000000
Background White: #FFFFFF
Link Blue: #0000EE
Light Grey: #CCCCCC
Typography
Body Text: Sans-serif
Headings: Sans-serif