
Speech Services


Speech services connect Speech-to-Text (STT) and Text-to-Speech (TTS) vendors to the Voice Gateway Self-Service Portal. For the AI Agent to have a voice, a speech service must be selected in the Application. By connecting a speech vendor of your choice, you can choose among multiple voices, genders, accents, and languages. To switch quickly between setups, you can add multiple speech vendors to the Voice Gateway Self-Service Portal or add multiple configurations of the same vendor.

Voice Gateway supports cloud-based, on-premises, and custom speech vendor configurations.

For the list of supported vendors and their Speech-To-Text and Text-To-Speech capabilities, see the TTS and STT Vendors reference.

If you need to create multiple speech services from the same vendor, use the Label field to create a unique speech service.
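
As a rough illustration of why the label matters, the sketch below models speech services in a small in-memory registry keyed by vendor and label. The `SpeechService` and `SpeechServiceRegistry` names are hypothetical and are not part of Voice Gateway; the sketch only assumes that two services from the same vendor must differ in their label.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class SpeechService:
    """Hypothetical record for one configured speech service."""
    vendor: str
    label: Optional[str] = None  # left empty when the vendor is used only once


class SpeechServiceRegistry:
    """Illustrative registry: (vendor, label) must be unique."""

    def __init__(self) -> None:
        self._services: Dict[Tuple[str, Optional[str]], SpeechService] = {}

    def add(self, service: SpeechService) -> None:
        key = (service.vendor, service.label)
        if key in self._services:
            raise ValueError(
                f"A {service.vendor} service with label {service.label!r} already exists"
            )
        self._services[key] = service

    def get(self, vendor: str, label: Optional[str] = None) -> SpeechService:
        return self._services[(vendor, label)]


registry = SpeechServiceRegistry()
registry.add(SpeechService("microsoft"))                        # first service, no label
registry.add(SpeechService("microsoft", label="custom-voice"))  # second one needs a label
```

Without the label, the second `add` call would collide with the first, which is the situation the Label field is meant to avoid.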

After creating a speech service, you can edit or delete it.

Warning

Users with the Account scope can edit only the speech services that they or other users with the same scope have created. They can still view and use speech services shared by Service providers or Admins.


Cloud-Based Speech Services

To configure the connection for a cloud-based speech service:

  1. Open the Voice Gateway interface.
  2. In the left-side menu, select Speech.
  3. Click Add Speech Service and select your preferred vendor.
  4. Select the account you want to use it with.
  5. (Optional) If you need to create multiple speech services from the same vendor, use the Label field to create a unique speech service.
  6. Select Speech-To-Text, Text-To-Speech or both, depending on your use case.
  7. Follow the steps for your selected vendor:

     AWS:
       1. Enter the Access Key in the Access key ID field. For more information on AWS Access Keys, read the Amazon AWS documentation.
       2. Enter the Secret Access Key in the Secret access key field.
       3. Select a region from the Region list.

     Deepgram:
       1. Enter an API key in the API key field. For more information on API keys in Deepgram, read the Deepgram documentation.

     ElevenLabs:
       1. Enter an API key in the API key field. For more information on ElevenLabs API keys, read the ElevenLabs documentation.
       2. Select your language model from the Model list.
       3. (Optional) Select Extra Options to edit the JSON code for additional options.

     Google:
       1. Upload your Service Key to the Service key field. For more information on creating Service Keys in Google Cloud, read the Google Cloud documentation.

     Microsoft Azure:
       1. Select Use hosted Azure service.
       2. Select a region from the Region list.
       3. Enter an API key in the API key field. For more information on linking API keys, read the Microsoft Speech Services Billing documentation.
       4. (Optional) Select a custom voice model for TTS by providing a custom voice endpoint ID in the Custom voice deployment ID field.
       5. (Optional) Select a custom speech model for STT by providing a custom speech endpoint ID in the Custom speech endpoint ID field.

     Nuance:
       1. Enter the client ID in the Client ID field. You can obtain the client ID and the secret key from your Nuance program manager. For more information, read the Nuance documentation.
       2. Enter the secret key in the Secret field.

     Soniox:
       1. Enter an API key in the API key field. For more information on Soniox API keys, read the Soniox Quick Start Guide.

Click Save to save your changes. After you have created the speech service, add it to the Application.
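
Before opening the form, it can help to check that you have every credential your vendor needs. The sketch below restates the required fields from the vendor steps above as a simple checklist. The vendor keys, the field identifiers, and the `missing_fields` helper are hypothetical names chosen for this example; they are not Voice Gateway API names, and optional fields are left out.

```python
from typing import Any, Dict, List

# Required inputs per cloud vendor, as listed in the steps above.
# All keys and field identifiers are illustrative, not official names.
REQUIRED_FIELDS: Dict[str, List[str]] = {
    "aws": ["access_key_id", "secret_access_key", "region"],
    "deepgram": ["api_key"],
    "elevenlabs": ["api_key", "model"],
    "google": ["service_key"],
    "microsoft": ["region", "api_key"],
    "nuance": ["client_id", "secret"],
    "soniox": ["api_key"],
}


def missing_fields(vendor: str, values: Dict[str, Any]) -> List[str]:
    """Return the required fields that are absent or blank for the vendor."""
    required = REQUIRED_FIELDS[vendor.lower()]
    missing = []
    for name in required:
        value = values.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# Example: an AWS entry where the secret access key was forgotten.
print(missing_fields("AWS", {"access_key_id": "example-key-id", "region": "eu-central-1"}))
```

An empty list means all required values for that vendor are present.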

On-Premises Speech Services

To configure the connection for an on-premises speech service:

  1. Open the Voice Gateway interface.
  2. In the left-side menu, select Speech.
  3. Click Add Speech Service and select your preferred vendor.
  4. Select the account you want to use it with.
  5. (Optional) If you need to create multiple speech services from the same vendor, use the Label field to create a unique speech service.
  6. Select Speech-To-Text, Text-To-Speech or both, depending on your use case.
  7. Follow the steps for your selected vendor and on-premises option:

     Deepgram container:
       1. Select Use on-prem Deepgram container.
       2. Enter the container URI for TTS in the Container URI field.
       3. Select Use TLS, if required.

     On-prem TTS and STT:
       1. Select Use on-prem TTS and enter the IP address and port in the TTS URI field.
       2. Select Use on-prem STT and enter the IP address and port in the STT URI field.

     Azure Docker container:
       1. Select Use Azure Docker container (on-prem).
       2. Enter the container URL for TTS in the Container URL for TTS field.
       3. Enter the container URL for STT in the Container URL for STT field.
       4. Enter an API key in the Subscription key field, if required. Whether the subscription key is required depends on your custom on-premises setup. For more information on Microsoft Azure Subscriptions, read the Subscriptions in Azure API Management documentation.

Click Save to save your changes. After you have created the speech service, add it to the Application.

Add a Custom Speech Vendor

If the desired vendor is not included in the list of preinstalled vendors, or if you want to modify the configuration of an existing one, you can add a custom vendor.

Before adding a vendor to Voice Gateway, you need to create it. To do this, use the custom-speech-example template on GitHub. With this template, you can create a new vendor or customize the vendors it includes as examples, such as Google, AssemblyAI, and Vosk. After you have created the custom provider, deploy it on a server, for example, in the AWS Cloud, and then copy the address of the custom provider for use in Voice Gateway.

To add a custom speech vendor, follow these steps:

  1. Open the Cognigy Voice Gateway Self-Service Portal.
  2. In the left-side menu, select Speech.
  3. On the Speech services page, click Add speech service.
  4. On the Add a speech service page, select Custom from the Vendor list.
  5. In the Name field, specify a unique name for your provider. You need to reuse this name in the Node configuration.
  6. From the Account list, select a specific account, or leave the All accounts value if you want the custom speech provider to be available for all accounts.
  7. In the Label field, create a label only if you need to create multiple speech services from the same vendor. Then, use the label in your application to specify which service to use.
  8. Activate the Use for text-to-speech setting to use this provider as a TTS vendor. Enter the TTS HTTP URL of the server where your custom vendor is deployed.
  9. Activate the Use for speech-to-text setting to use this provider as an STT vendor. Enter the STT websocket URL of the server where your custom vendor is deployed.
  10. In the Authentication Token field, enter the key that you get from your TTS or STT vendor to set up a connection.
  11. Click Save.
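
The values entered in the steps above can be sanity-checked before you save them. The following is a minimal sketch, assuming only what the steps state: the TTS endpoint is an HTTP URL and the STT endpoint is a websocket URL. The `CustomSpeechVendor` class, the `validate` function, and the example URLs are hypothetical and are not part of Voice Gateway or the custom-speech-example template.

```python
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import urlparse


@dataclass
class CustomSpeechVendor:
    """Hypothetical container for the custom vendor form values."""
    name: str
    tts_url: Optional[str] = None  # set when "Use for text-to-speech" is active
    stt_url: Optional[str] = None  # set when "Use for speech-to-text" is active
    auth_token: str = ""


def validate(vendor: CustomSpeechVendor) -> List[str]:
    """Return a list of problems; an empty list means the values look plausible."""
    problems = []
    if not vendor.name.strip():
        problems.append("Name must not be empty")
    if vendor.tts_url is not None:
        parsed = urlparse(vendor.tts_url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append("TTS URL must use http:// or https://")
    if vendor.stt_url is not None:
        parsed = urlparse(vendor.stt_url)
        if parsed.scheme not in ("ws", "wss") or not parsed.netloc:
            problems.append("STT URL must use ws:// or wss://")
    return problems


vendor = CustomSpeechVendor(
    name="my-custom-vendor",
    tts_url="https://speech.example.com/tts",
    stt_url="wss://speech.example.com/stt",
    auth_token="example-token",
)
print(validate(vendor))
```

A common mistake this catches is pasting the HTTP address into the STT field, where a websocket address is expected.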

To start using your speech provider, you need to specify the provider name in the Custom parameter of the relevant Nodes, such as Set Session Config, Say, Question or Optional Question, or Session Speech Parameters Config.
