Speech-powered agents

FoundationaLLM makes it easy to interact with LLM-powered agents. In addition to text chat, FoundationaLLM supports interaction over voice. This is accomplished by integrating speech-to-text models (like OpenAI Whisper or Microsoft AI Speech) that transcribe spoken audio into text, which is then submitted to the agent conversation on behalf of the user. Similarly, when the agent responds, a text-to-speech model (like Microsoft AI Speech) can speak the LLM's response aloud. You can also give the agent a voice that sounds like somebody you know using services like Microsoft Custom Neural Voice.
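
The flow described above (spoken audio to text, text to the agent, agent reply back to audio) can be sketched as a single conversational turn. This is an illustrative outline only, not FoundationaLLM's actual API: `run_voice_turn`, `VoiceTurn`, and the three `fake_*` functions are hypothetical names, and the stubs stand in for a real speech-to-text service, agent endpoint, and text-to-speech service.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical callable types; none of these are FoundationaLLM APIs.
SpeechToText = Callable[[bytes], str]   # e.g. a wrapper around a speech-to-text model
Agent = Callable[[str], str]            # e.g. a call to the agent conversation
TextToSpeech = Callable[[str], bytes]   # e.g. a wrapper around a text-to-speech model


@dataclass
class VoiceTurn:
    transcript: str     # what the user said, as text
    reply_text: str     # what the agent answered
    reply_audio: bytes  # the answer rendered as audio


def run_voice_turn(
    audio: bytes,
    transcribe: SpeechToText,
    ask_agent: Agent,
    synthesize: TextToSpeech,
) -> VoiceTurn:
    """Run one voice interaction: audio in, agent reply as audio out."""
    # Step 1: convert the spoken audio into text.
    transcript = transcribe(audio).strip()
    if not transcript:
        raise ValueError("no speech recognized in the audio")
    # Step 2: submit the text to the agent on behalf of the user.
    reply_text = ask_agent(transcript)
    # Step 3: convert the agent's text reply into audio.
    reply_audio = synthesize(reply_text)
    return VoiceTurn(transcript, reply_text, reply_audio)


# Stand-in stubs so the sketch runs without any external service.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_agent(text: str) -> str:
    return f"You said: {text}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


if __name__ == "__main__":
    turn = run_voice_turn(b"hello agent", fake_transcribe, fake_agent, fake_synthesize)
    print(turn.reply_text)
```

Passing the three stages in as callables keeps the turn logic independent of any particular speech provider, so a different speech-to-text or text-to-speech service could be swapped in without touching the rest.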