Voice Gateway Parameter Details
Cognigy Voice Gateway has many configuration settings that you control directly from within your Flow. You can apply these settings at two scopes:
- Setting Session Parameters. Session parameters are set with the Set Session Config Node. Once the Node is executed, the settings apply for the remainder of the session.
- Setting Activity Parameters. Activity parameters are set per activity (Node). For example, if barge-in is enabled on the Play Node, barge-in is active only while that Node executes. The user can therefore interrupt the virtual agent during this output but not afterward. These configurations are also available in the Say, Question, and Optional Question Nodes.
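The relationship between the two scopes can be pictured with a small sketch. This is not Voice Gateway code: the helper `effective_settings` and the key `barge_in_on_speech` are hypothetical names chosen for illustration, and the sketch assumes that activity values simply take precedence over session values while a Node runs.

```python
from typing import Any, Dict, Optional


def effective_settings(
    session: Dict[str, Any], activity: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Return the settings in force while a single Node executes."""
    merged = dict(session)  # session values persist for the rest of the session
    if activity:
        merged.update(activity)  # activity values win, but only for this Node
    return merged


# Hypothetical key name, used for illustration only.
session_settings = {"barge_in_on_speech": False}

during_play = effective_settings(session_settings, {"barge_in_on_speech": True})
after_play = effective_settings(session_settings)

print(during_play["barge_in_on_speech"])  # True
print(after_play["barge_in_on_speech"])   # False
```

The session dictionary is never modified, which mirrors the behavior described above: once the Node with the activity parameter finishes, the session value applies again.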
Settings
Synthesizer - Text-To-Speech
The TTS settings can be chosen from a pre-filled dropdown for Microsoft Azure, AWS, Google, Nuance, or a custom vendor.
Parameter | Type | Description |
---|---|---|
TTS Vendor | Dropdown | Defines the desired TTS Vendor. You can select a custom vendor. |
Custom (Vendor) | CognigyScript | Allows for specifying a TTS vendor that is not in the dropdown list. This option is only available on Voice Gateway. For pre-installed providers, use all lowercase letters, for example, microsoft, google, aws. For custom providers, use the name that you specified on the Speech Service page in Voice Gateway. The Custom field appears if you selected Custom from the TTS Vendor list. |
TTS Language | Dropdown | Defines the language of the Voice virtual agent output. |
Custom (Language) | CognigyScript | Allows for choosing a TTS language that is not in the dropdown list. Defines the language of the virtual agent output. The format to use depends on the TTS Vendor, for example, de-DE, fr-FR, en-US. The Custom field appears if you selected Custom from the TTS Language list. |
TTS Voice | Dropdown | Defines the voice that should be used for the Voice virtual agent output. |
Custom (Voice) | CognigyScript | Allows for choosing a TTS voice that is not in the dropdown list. This can be useful for region-specific voices. The format to use depends on the TTS Vendor, for example, de-DE-ConradNeural. |
TTS Label | CognigyScript | The alternative name of the vendor, as specified in the Voice Gateway Self-Service Portal. If you have created multiple speech services from the same vendor, use the label to specify which service to use. |
Enable Advanced TTS Config | Toggle | Enables the addition of a URL for an Azure Custom Voice Endpoint. |
Disable TTS Audio Caching | Toggle | Disables TTS audio caching. By default, the setting is deactivated. In this case, previously requested TTS audio results are stored in the virtual agent cache. When a new TTS request is made, and the audio text has been previously requested, the virtual agent retrieves the cached result instead of sending another request to the TTS provider. When the setting is activated, the virtual agent no longer caches TTS results. In this case, each request is directly sent to your speech provider. Note that disabling caching can increase TTS costs. For detailed information, contact your speech provider. |
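The effect of Disable TTS Audio Caching on provider requests can be illustrated with a minimal sketch. The `TtsClient` class and the fake provider are hypothetical, and the cache here is keyed by text alone as a simplification; a real cache would presumably also distinguish vendor, language, and voice.

```python
from typing import Callable, Dict


class TtsClient:
    """Toy TTS client that counts how often the speech provider is called."""

    def __init__(self, synthesize: Callable[[str], bytes], disable_caching: bool = False) -> None:
        self._synthesize = synthesize
        self._disable_caching = disable_caching
        self._cache: Dict[str, bytes] = {}
        self.provider_requests = 0

    def speak(self, text: str) -> bytes:
        # With caching active, a previously requested text is served from the cache.
        if not self._disable_caching and text in self._cache:
            return self._cache[text]
        self.provider_requests += 1  # each provider request may incur cost
        audio = self._synthesize(text)
        if not self._disable_caching:
            self._cache[text] = audio
        return audio


def fake_provider(text: str) -> bytes:
    # Stand-in for a real speech provider (assumption for this sketch).
    return text.encode("utf-8")


cached = TtsClient(fake_provider)
uncached = TtsClient(fake_provider, disable_caching=True)
for client in (cached, uncached):
    client.speak("Welcome")
    client.speak("Welcome")

print(cached.provider_requests)    # 1
print(uncached.provider_requests)  # 2
```

The doubled request count in the uncached case is the reason disabling caching can increase TTS costs.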
Recognizer - Speech-To-Text
The STT settings can be chosen from a pre-filled dropdown for Microsoft Azure, AWS, Google, Nuance, Soniox, or a custom vendor.
Parameter | Type | Description |
---|---|---|
STT Vendor | Dropdown | Defines the desired STT Vendor. You can select a custom vendor. |
Custom (Vendor) | CognigyScript | Allows for specifying an STT vendor that is not in the dropdown list. This option is only available on Voice Gateway. For pre-installed providers, use all lowercase letters, for example, microsoft, google, aws. For custom providers, use the name that you specified on the Speech Service page in Voice Gateway. The Custom field appears if you selected Custom from the STT Vendor list. |
STT Language | Dropdown | Defines the language that should be recognized. |
Custom (Language) | CognigyScript | Allows for choosing an STT language that is not in the dropdown list. This can be useful for region-specific language variants. The format to use depends on the STT Vendor, for example, de-DE, fr-FR, en-US. The Custom field appears if you selected Custom from the STT Language list. |
Deepgram Tier | Dropdown | This parameter is active only when Deepgram is selected in the STT Vendor setting. Choose a tier for your API request, and ensure that the model is available for the chosen STT language. For detailed information about Deepgram tiers, refer to the Deepgram documentation. |
Deepgram Model | Dropdown | This parameter is active only when Deepgram is selected in the STT Vendor setting. Choose a model for processing submitted audio. Each model is associated with a tier. Ensure that the selected tier is available for the chosen STT language. For detailed information about Deepgram models, refer to the Deepgram documentation. |
Endpointing | Toggle | This parameter is active only when Deepgram is selected in the STT Vendor setting. Deepgram's Endpointing feature watches streaming audio for long pauses that signal the end of speech. When it spots an endpoint, it finalizes predictions and returns the transcript, marking it as complete with the speech_final parameter set to true. For detailed information about Deepgram Endpointing, refer to the Deepgram documentation. The duration for detecting the end of speech is preconfigured with a default value (10 milliseconds). If you want to change this value, use the Endpointing Time setting. |
Endpointing Time | Number | This parameter is active only when Deepgram is selected in the STT Vendor setting and the Endpointing toggle is enabled. Customize the duration (in milliseconds) for detecting the end of speech. The default is 10 milliseconds of silence. Transcripts are sent after detecting silence, and the system waits until the speaker resumes or the required silence time is reached. Once either condition is met, a transcript is sent back with speech_final set to true. |
Smart Formatting | Toggle | This parameter is active only when Deepgram is selected in the STT Vendor setting. Deepgram's Smart Format feature applies additional formatting to transcripts to optimize them for human readability. Smart Format capabilities vary between models. When Smart Formatting is turned on, Deepgram will always apply the best-available formatting for your chosen model, tier, and language combination. For detailed examples, refer to the Deepgram documentation. Note that when Smart Formatting is turned on, punctuation will be activated, even if you have the Disable STT Punctuation setting enabled. |
STT Hints | Text | Array of words or phrases to assist speech detection. Note: This requires support from the STT engine. The field is not available for the Nuance speech vendor. |
Dynamic Hints | CognigyScript | Adds an array of hints taken from the context or input, for example, {{context.hints}} or {{input.hints}}. You can override these settings using Advanced parameters. |
STT Label | CognigyScript | The alternative name of the vendor, as specified in the Voice Gateway Self-Service Portal. If you have created multiple speech services from the same vendor, use the label to specify which service to use. |
Google Model | Dropdown | This parameter is active only when Google is selected in the STT Vendor setting. Uses one of the Google Cloud Speech-to-Text transcription models, with the latest_short model being the default choice. For a detailed list of Google models, refer to the Transcription models section in the Google Documentation. Keep in mind that the default value is a Google Model type that can be used if other models don't suit your specific scenario. |
Enable Voice Activity Detection | Toggle | Delays the connection to the cloud recognizer until speech is detected. |
VAD Sensitivity | Slider | Defines the detection sensitivity. The lowest value has the highest sensitivity. |
Minimal Voice Duration | Slider | Milliseconds of speech activity required before connecting to the cloud recognizer. |
Disable STT Punctuation | Toggle | This parameter is active only when Google or Deepgram is selected in the STT Vendor setting. Prevents the STT response from including punctuation marks. |
Enable Advanced STT Config | Toggle | Enables the addition of an ID for an Azure Custom Speech model deployment. |
Enable Audio Logging | Toggle | Enables recording and logging of audio from the user on Azure. |
Recognize Language | Toggle | Enables the addition of alternative languages for recognition. You can select a maximum of 3 languages. To reuse these languages in other Nodes, such as the child Nodes of the Lookup Node, use the following format: de-DE, fr-FR, en-US. |
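To make the interplay of Enable Voice Activity Detection and Minimal Voice Duration more concrete, here is a rough sketch. It is not the Voice Gateway implementation. It assumes that audio arrives in fixed 20 ms frames that are already labeled as speech or silence, and that the required speech activity must be uninterrupted; both points are assumptions, and `connect_frame_index` is a hypothetical helper.

```python
from typing import Optional, Sequence


def connect_frame_index(
    is_speech: Sequence[bool], min_voice_ms: int, frame_ms: int = 20
) -> Optional[int]:
    """Return the index of the frame at which the recognizer would be connected.

    Returns None if the minimal voice duration is never reached.
    """
    voiced_ms = 0
    for index, speech in enumerate(is_speech):
        # Assumption: a silent frame resets the running speech duration.
        voiced_ms = voiced_ms + frame_ms if speech else 0
        if voiced_ms >= min_voice_ms:
            return index
    return None


frames = [False, True, False, True, True, True, True]
connect_at = connect_frame_index(frames, min_voice_ms=60)
never = connect_frame_index(frames, min_voice_ms=100)

print(connect_at)  # 5
print(never)       # None
```

A short click or cough (the single speech frame at index 1) does not open a connection in this sketch, which is the practical benefit of requiring a minimal voice duration.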
Barge In
Barge In enables the interruption of the virtual agent.
Parameter | Type | Description |
---|---|---|
Barge In On Speech | Toggle | Enables interrupting the virtual agent with speech. |
Barge In On DTMF | Toggle | Enables interrupting the virtual agent with DTMF digits. |
Barge In Minimum Words | Slider | Defines the minimum number of words that the user must say for the Voice Gateway to consider it a barge-in. |
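The Barge In Minimum Words check can be sketched as a simple word count. The function name `is_barge_in` is hypothetical, and counting words by splitting on whitespace is an assumption; the actual recognizer may tokenize differently.

```python
def is_barge_in(transcript: str, minimum_words: int) -> bool:
    """Treat recognized speech as a barge-in only if it has enough words."""
    # Assumption: words are separated by whitespace.
    return len(transcript.split()) >= minimum_words


print(is_barge_in("stop", minimum_words=2))         # False
print(is_barge_in("please stop", minimum_words=2))  # True
```

A higher minimum makes the virtual agent less likely to be interrupted by short utterances such as "okay" or background noise transcribed as a single word.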
User Input Timeout
Defines what should happen when there is no input from the user.
Parameter | Type | Description |
---|---|---|
User No Input Mode | Dropdown | Defines the action if a user does not provide an input to the virtual agent in time. |
User No Input Timeout | Number | Defines the timeout for user input in milliseconds. |
User No Input Retries | Number | Defines how many times the virtual agent should retry to get an input from a user before completing the call. |
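The retry behavior can be sketched as a loop over prompt attempts. This is an illustration only: `run_prompt` is a hypothetical helper, `None` stands for a timeout, and the exact counting (the initial prompt plus the configured number of retries) is an assumption about how the retry setting is interpreted.

```python
from typing import Iterable, Optional, Tuple


def run_prompt(responses: Iterable[Optional[str]], retries: int) -> Tuple[str, int]:
    """Simulate prompting a user; None in `responses` means the timeout elapsed.

    Returns the outcome and the number of prompts that were played.
    """
    prompts = 0
    for response in responses:
        prompts += 1
        if response is not None:
            return ("input:" + response, prompts)
        if prompts > retries:  # initial prompt plus all retries are used up
            return ("call_ended", prompts)
    return ("call_ended", prompts)


answered = run_prompt([None, None, "yes"], retries=2)
silent = run_prompt([None, None, None], retries=2)

print(answered)  # ('input:yes', 3)
print(silent)    # ('call_ended', 3)
```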
DTMF
Enables DTMF collection.
Parameter | Type | Description |
---|---|---|
Capture DTMF signals | Toggle | Enables capturing DTMF signals by the virtual agent. |
DTMF Inter Digit Timeout | Number | Defines the timeout between collected DTMF digits. |
DTMF Max Digits | Number | Defines the maximum number of digits the user can enter. The digits are submitted automatically once this limit is reached. |
DTMF Min Digits | Number | Defines the minimum number of digits before they are forwarded to the virtual agent. A submit digit can override this. |
DTMF Submit Digit | CognigyScript | Defines the DTMF submit digit, which is used for submitting the previously entered digits. This action overrides the minimum digits validation. |
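How the DTMF parameters interact can be sketched with a small collector. The `DtmfCollector` class is hypothetical and only models the rules stated in the table: automatic submission at the maximum, a minimum that must be met before digits are forwarded, and a submit digit that overrides the minimum. Keeping the buffer when the inter-digit timeout fires with too few digits is an assumption.

```python
from typing import Optional


class DtmfCollector:
    """Toy model of DTMF collection with min, max, and submit digit rules."""

    def __init__(self, min_digits: int = 1, max_digits: int = 4, submit_digit: str = "#") -> None:
        self.min_digits = min_digits
        self.max_digits = max_digits
        self.submit_digit = submit_digit
        self._buffer = ""

    def on_digit(self, digit: str) -> Optional[str]:
        """Return the collected digits once they are submitted, else None."""
        if digit == self.submit_digit:
            return self._flush()  # the submit digit overrides the minimum
        self._buffer += digit
        if len(self._buffer) >= self.max_digits:
            return self._flush()  # submitted automatically at the maximum
        return None

    def on_inter_digit_timeout(self) -> Optional[str]:
        """Forward the digits after a pause only if the minimum is met."""
        if len(self._buffer) >= self.min_digits:
            return self._flush()
        return None  # assumption: too few digits are kept, not discarded

    def _flush(self) -> str:
        digits, self._buffer = self._buffer, ""
        return digits


collector = DtmfCollector(min_digits=3, max_digits=4, submit_digit="#")
collector.on_digit("1")
collector.on_digit("2")
after_timeout = collector.on_inter_digit_timeout()  # only 2 digits, minimum is 3
after_submit = collector.on_digit("#")              # submit digit overrides minimum
auto_submitted = [collector.on_digit(d) for d in "9876"][-1]

print(after_timeout)   # None
print(after_submit)    # 12
print(auto_submitted)  # 9876
```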
Continuous ASR
Continuous ASR enables the Voice Gateway to concatenate multiple STT recognitions of the user and then send them as a single textual message to the virtual agent.
Parameter | Type | Description |
---|---|---|
Enable Continuous ASR | Toggle | Enables or disables Continuous ASR. |
Continuous ASR Submit Digit | CognigyScript | Defines a special DTMF key, which sends the accumulated recognitions to the Flow. |
Continuous ASR Timeout | Number | Defines the number of milliseconds of silence before the accumulated recognitions are sent to the Flow. |
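The accumulation that Continuous ASR performs can be sketched as a buffer that is flushed either by the submit digit or by the silence timeout. The `ContinuousAsrBuffer` class is hypothetical; it does not run a timer itself and assumes that something else calls `on_silence_timeout` once the configured silence has elapsed. Joining the recognitions with a single space is also an assumption.

```python
from typing import List, Optional


class ContinuousAsrBuffer:
    """Toy buffer that concatenates several recognitions into one message."""

    def __init__(self, submit_digit: str = "#") -> None:
        self.submit_digit = submit_digit
        self._parts: List[str] = []

    def on_recognition(self, text: str) -> None:
        self._parts.append(text.strip())

    def on_dtmf(self, digit: str) -> Optional[str]:
        # Only the configured submit digit sends the accumulated text.
        return self._flush() if digit == self.submit_digit else None

    def on_silence_timeout(self) -> Optional[str]:
        return self._flush()

    def _flush(self) -> Optional[str]:
        if not self._parts:
            return None
        message = " ".join(self._parts)
        self._parts = []
        return message


asr = ContinuousAsrBuffer(submit_digit="#")
asr.on_recognition("my customer number is")
asr.on_recognition("four two seven")
first_message = asr.on_silence_timeout()
asr.on_recognition("that is all")
second_message = asr.on_dtmf("#")

print(first_message)   # my customer number is four two seven
print(second_message)  # that is all
```

This pattern is useful when users pause while reading out long values such as account numbers, because the Flow receives one complete message instead of several fragments.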
Advanced
Parameter | Type | Description |
---|---|---|
Additional Session Parameters | JSON | Allows for configuring settings using JSON. If you have already made changes using the UI settings above, this field will overwrite them. You can also specify additional parameters in the JSON that are unavailable in the UI, such as vendor credentials. If you want to specify a custom TTS or STT provider in the vendor parameter, use the custom:<provider-name> format, for example, "vendor": "custom:My Speech provider". |
JSON example:
{
  "synthesizer": {
    "vendor": "microsoft",
    "language": "de-DE",
    "voice": "en-US-JennyNeural"
  },
  "recognizer": {
    "vendor": "google",
    "language": "de-DE",
    "hints": [
      "help",
      "skip",
      "confirm"
    ],
    "hintBoost": 20
  }
}
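The following sketch shows one way to assemble such a JSON payload programmatically and to picture how it overwrites UI settings. The helpers `custom_vendor` and `apply_overrides` are hypothetical. Only keys that appear in the example above are used, and treating the overwrite as a key-by-key merge (rather than a replacement of whole sections) is an assumption.

```python
import json
from typing import Any, Dict


def custom_vendor(provider_name: str) -> str:
    """Build the custom:<provider-name> value for the vendor parameter."""
    return f"custom:{provider_name}"


def apply_overrides(ui: Dict[str, Any], extra: Dict[str, Any]) -> Dict[str, Any]:
    """Merge two settings dictionaries; values from `extra` overwrite `ui`."""
    merged = dict(ui)
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_overrides(merged[key], value)  # merge nested sections
        else:
            merged[key] = value
    return merged


# Settings as they might have been chosen in the UI.
ui_settings = {"recognizer": {"vendor": "google", "language": "de-DE"}}

# Content of the Additional Session Parameters field.
additional = {
    "recognizer": {
        "vendor": custom_vendor("My Speech provider"),
        "hints": ["help", "skip", "confirm"],
    }
}

result = apply_overrides(ui_settings, additional)
print(json.dumps(result, indent=2))
```

In this sketch the vendor from the JSON replaces the one chosen in the UI, while the language that the JSON does not mention is kept.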