Voice Preview
The Voice Preview feature lets you quickly test voice output without running a mock call through the entire Flow. It supports cloud speech-to-text and text-to-speech services from Google, Microsoft, and Amazon AWS. To use this feature, enter raw text or SSML, then select the language and voice. Note that the SSML must be supported by the chosen Voice Preview provider.
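As an illustration of SSML input, the following Python sketch builds a small SSML string and checks that it is well-formed XML with `<speak>` as its root. The sample text and the helper name `is_well_formed_ssml` are hypothetical. The `break` and `prosody` tags come from the W3C SSML standard, but providers differ in which tags and attributes they accept or require, so treat this as a first sanity check rather than proof that a provider will accept the input.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample text; <break> and <prosody> are standard SSML tags.
SSML_SAMPLE = (
    "<speak>"
    "Welcome to our service."
    '<break time="500ms"/>'
    '<prosody rate="slow">Please hold while I look that up.</prosody>'
    "</speak>"
)


def is_well_formed_ssml(text: str) -> bool:
    """Return True if text parses as XML with <speak> as its root element."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        return False
    # Drop a namespace prefix such as "{http://www.w3.org/2001/10/synthesis}".
    return root.tag.rsplit("}", 1)[-1] == "speak"
```

Raw text without any tags fails this check by design, because it is not SSML. You can enter it in Voice Preview as plain text instead.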
Restrictions
Only Google, Microsoft, and Amazon AWS speech providers support the Voice Preview feature. When testing a voice Flow through the Interaction Panel, check the STT and TTS settings of the voice Nodes to ensure that one of these providers is specified. If any other provider, such as ElevenLabs, is specified in the Nodes, the Interaction Panel won't support the voice conversation.
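The check described above can be sketched in Python. Everything here is hypothetical: the node dictionaries, the `stt` and `tts` keys, and the provider identifiers are illustrative stand-ins, not the Cognigy.AI data model or API. The sketch only shows the logic of flagging any Node whose provider falls outside the supported set.

```python
# Hypothetical identifiers for the three supported speech providers.
SUPPORTED_PROVIDERS = {"google", "microsoft", "aws"}


def unsupported_nodes(nodes):
    """Return (node name, setting, provider) for each unsupported provider."""
    problems = []
    for node in nodes:
        for setting in ("stt", "tts"):
            provider = node.get(setting, "").strip().lower()
            # An empty value means the Node does not override the provider.
            if provider and provider not in SUPPORTED_PROVIDERS:
                problems.append((node["name"], setting, provider))
    return problems
```

For example, a Node with `"tts": "ElevenLabs"` would be reported, while Nodes that use Google, Microsoft, or AWS for both settings would not.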
Access STT and TTS Providers
If your environment has network connection restrictions or requires strict security rules, make sure both Speech-to-Text (STT) and Text-to-Speech (TTS) providers are accessible. Add the IP address ranges of your STT and TTS providers to the firewall's whitelist. This lets the Cognigy.AI server reach both providers, which the Voice Preview feature requires to work as expected.
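To verify the firewall rules, you could run a simple TCP reachability check from the host that runs the Cognigy.AI server. The Python sketch below is one way to do this with the standard library. The function name `check_endpoints` is hypothetical, and the hostnames in `EXAMPLE_HOSTS` are assumptions based on the providers' public endpoint naming. Confirm the exact hostnames and regions in each provider's documentation.

```python
import socket

# Assumed example endpoints; replace regions and hosts with the ones you use.
EXAMPLE_HOSTS = [
    "texttospeech.googleapis.com",
    "speech.googleapis.com",
    "westeurope.tts.speech.microsoft.com",
    "polly.us-east-1.amazonaws.com",
]


def check_endpoints(hosts, port=443, timeout=3.0, connect=socket.create_connection):
    """Map each host to True if a TCP connection on `port` succeeds."""
    results = {}
    for host in hosts:
        try:
            conn = connect((host, port), timeout=timeout)
        except OSError:
            # Covers DNS failures, refused connections, and timeouts.
            results[host] = False
        else:
            conn.close()
            results[host] = True
    return results
```

Calling `check_endpoints(EXAMPLE_HOSTS)` returns a dictionary of host names and booleans. A `False` value only shows that the TCP connection failed from that machine. It doesn't identify which firewall rule is responsible.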
Set up a Voice Preview Provider
To set up a voice provider to test your voice agents, follow these steps:
- In the left-side menu of your project, select Manage > Settings.
- Go to the Voice Preview Settings section and select one of the following providers:

  Microsoft Azure
  2.1 Next to the Speech Connection field, click + to enter credentials.
  2.2 In the New Connection window, fill in the following fields:
  - Connection: specify a unique connection name.
  - Api Key: specify an API key for Microsoft Azure Speech Services. Log in to the Azure portal, navigate to your Speech Services resource, and copy the API key labeled as Key1 from the Keys and Endpoint section. For more information, read the Microsoft Azure AI documentation.
  - Region: this parameter is optional. Enter a specific region if necessary.

  Google
  2.1 Next to the Speech Connection field, click + to enter credentials.
  2.2 In the New Connection window, specify a unique name for your connection in the Connection field.
  2.3 Click Upload JSON File and upload the JSON key file for Google Speech Services. To obtain this file, first create a service account in the Google Cloud Console under IAM & Admin. Assign the appropriate roles, generate a JSON key file for the service account, and download this key.

  Amazon AWS
  2.1 Next to the Speech Connection field, click + to enter credentials.
  2.2 In the New Connection window, fill in the following fields:
  - Access Key ID: specify an Access Key ID. Log in to the AWS Management Console, go to the IAM dashboard, select Users, and choose the IAM user. Navigate to the Security credentials tab, and under Access keys, create a new access key if one hasn't been created. Copy the Access Key ID provided after creation.
  - Secret Access Key: specify a Secret Access Key. After creating the access key, copy the Secret Access Key or download the file containing the Access Key ID and the Secret Access Key. AWS shows the Secret Access Key only at creation time, so create a new access key if you no longer have it.
  - Session Token: this parameter is optional. If you use temporary security credentials, enter the session token that AWS STS (Security Token Service) returns when you assume a role or federate users.
  - Region: this parameter is optional. Enter the AWS region where your Amazon Polly resources are located, for example, us-east-1 for the US East (N. Virginia) region.
- Click Create.
- To check the connection, click Test.
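Before uploading a Google JSON key file in step 2.3, it can help to confirm that the file is a service account key. The Python sketch below checks for a few fields that Google service account key files contain (`type`, `project_id`, `private_key`, `client_email`). The helper name `missing_key_fields` is hypothetical, and the check is a local sanity test only. It doesn't verify that the key is valid or that the account has the right roles.

```python
import json

# Fields present in a Google service account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def missing_key_fields(raw_json: str):
    """Return a list of problems found in the key file text; empty if none."""
    try:
        data = json.loads(raw_json)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]
    problems = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if not data.get(name)
    ]
    # A key of another kind (for example, user credentials) has a different type.
    if data.get("type") not in (None, "", "service_account"):
        problems.append('"type" is not "service_account"')
    return problems
```

To use it, read the downloaded file as text and pass the contents to `missing_key_fields`. An empty list means the basic structure looks right, and the Test button in the connection settings remains the actual check.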
Use Voice Preview
You can access the Voice Preview feature in three ways:
- In the Flow editor, press Ctrl+Alt+P (Cmd+Option+P on macOS) to open the Voice Panel.
- In the Flow editor, click the Voice Preview button.
- In the Interaction Panel, click the Voice Preview button in the interactive tooltip of a message output. This action copies the output text to the Voice Preview input field. The Voice Preview button appears for certain types of output, including regular text output, fallback text, and text or SSML output from channels that support voice.