The Answering Machine Detection (AMD) feature can be enabled on outbound calls to indicate whether a call has been answered by a person or a machine. To use this feature, add the `amd` property to a dial verb. In the following example, Answering Machine Detection starts as soon as the call is answered and then sends its result to the `amd` webhook, indicating whether a human or a machine answered the call.

{
  "verb": "dial",
  "actionHook": "dial",
  "callerId": "+49XXXXXXXXXXX",
  "target": [
    {
      "type": "phone",
      "number": "+49XXXXXXXXXXX",
      "trunk": "Twilio"
    }
  ],
  "amd": {
    "actionHook": "amd",
    "recognizer": {
      "vendor": "microsoft",
      "language": "en-US"
    }
  }
}

Examples of webhook payloads:

{"type":"amd_human_detected"} 

{"type":"amd_machine_detected","reason":"hint","hint":"call has been forwarded","language":"en-us"}

{"type":"amd_no_speech_detected"}
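
The sample payloads above can be read with a few lines of code. The following is a minimal sketch, assuming the webhook body arrives as a JSON string; the function name `parse_amd_event` is hypothetical and not part of Voice Gateway.

```python
import json


def parse_amd_event(body: str) -> dict:
    """Parse a raw webhook body and return the AMD event as a dict.

    Raises ValueError if the body is not a JSON object with a string `type`.
    """
    event = json.loads(body)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(event, dict) or not isinstance(event.get("type"), str):
        raise ValueError("AMD payload must be a JSON object with a 'type' string")
    return event


# The three sample payloads shown above.
samples = [
    '{"type":"amd_human_detected"}',
    '{"type":"amd_machine_detected","reason":"hint",'
    '"hint":"call has been forwarded","language":"en-us"}',
    '{"type":"amd_no_speech_detected"}',
]

for raw in samples:
    event = parse_amd_event(raw)
    # `reason` is only present on some event types, so use .get().
    print(event["type"], event.get("reason"))
```

Only the `type` property is checked here, because it is the only property that every payload contains.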

Configuration

The following table lists the available parameters:

Parameter	Type	Description	Required
actionHook	string | object	A webhook to receive an HTTP POST for AMD events. The default value is `amd`.	Yes
thresholdWordCount	number	The number of spoken words in a greeting that triggers an `amd_machine_detected` result. The default value is `9`.	No
recognizer	object	Speech recognition parameters, as used by the gather and transcribe verbs. The default value is `application`.	No
recognizer.vendor	string	The speech recognition provider to use, for example, Google, Amazon, or Azure. The vendor determines transcription quality, supported languages, and feature availability.	Yes
recognizer.label	string	A custom label to identify this recognizer instance in logs or dashboards. Helpful when multiple recognizers are configured.	No
recognizer.language	string	The primary language code for transcription, for example, `en-US` for English, `fr-FR` for French. Determines how speech is interpreted.	No
recognizer.hints	array	An array of words or phrases that may appear in the audio and should be recognized more accurately. Useful for domain-specific terms, names, or technical vocabulary.	No
recognizer.hintsBoost	number	A numeric value specifying how strongly the recognizer should prioritize the hint words. Higher numbers give stronger emphasis, improving accuracy for key terms.	No
recognizer.altLanguages	array	An array of additional language codes that the recognizer can use for multilingual audio. Allows recognition of mixed-language content.	No
recognizer.profanityFilter	boolean	If `true`, the recognizer will automatically remove or mask profanity from the transcription output.	No
recognizer.interim	boolean	If `true`, returns partial transcription results as the audio is being processed. Useful for live captions or real-time feedback.	No
recognizer.punctuation	boolean	If `true`, punctuation marks, for example, periods or commas, are included in the transcription to improve readability.	No
recognizer.diarization	boolean	If `true`, enables speaker diarization, which assigns segments of the transcript to individual speakers.	No
recognizer.diarizationMinSpeakers	number	The minimum number of speakers expected in the audio. Helps the diarization algorithm distinguish between speakers accurately.	No
recognizer.diarizationMaxSpeakers	number	The maximum number of speakers expected in the audio. Prevents the algorithm from splitting speech unnecessarily.	No
recognizer.vad	object	Voice Activity Detection settings. Determines how the system detects when someone is speaking vs. silence, improving transcription timing and accuracy.	No
recognizer.fallbackVendor	string	Specifies an alternative transcription vendor to use if the primary vendor fails. Ensures reliability in critical workflows.	No
recognizer.fallbackLanguage	string	Language code to use for the fallback vendor. Must match a language supported by the fallback provider.	No
timers	object	An object containing various timeouts.	No
timers.noSpeechTimeoutMs	number	The time in milliseconds to wait for speech before returning `amd_no_speech_detected`. The default value is `5000`.	No
timers.decisionTimeoutMs	number	The time in milliseconds to wait before returning `amd_decision_timeout`. The default value is `15000`.	No
timers.toneTimeoutMs	number	The time in milliseconds to wait to hear a tone. The default value is `20000`.	No
timers.greetingCompletionTimeoutMs	number	The duration of silence, in milliseconds, to wait for during a greeting before returning `amd_machine_stopped_speaking`. The default value is `2000`.	No
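
To illustrate how the parameters in the table fit together, here is a minimal sketch that assembles an `amd` object and rejects misspelled timer names. The helper `build_amd` is hypothetical, and the override values (`6`, `3000`, `10000`) are arbitrary illustrations rather than recommendations.

```python
import json

# Timer names and default values as listed in the table above.
TIMER_DEFAULTS = {
    "noSpeechTimeoutMs": 5000,
    "decisionTimeoutMs": 15000,
    "toneTimeoutMs": 20000,
    "greetingCompletionTimeoutMs": 2000,
}


def build_amd(action_hook="amd", threshold_word_count=None, timers=None, recognizer=None):
    """Return an `amd` object containing only the settings that were overridden."""
    amd = {"actionHook": action_hook}
    if threshold_word_count is not None:
        amd["thresholdWordCount"] = threshold_word_count
    if timers:
        # Catch typos early instead of sending an unrecognized timer name.
        unknown = set(timers) - set(TIMER_DEFAULTS)
        if unknown:
            raise ValueError(f"unknown timer(s): {sorted(unknown)}")
        amd["timers"] = dict(timers)
    if recognizer:
        amd["recognizer"] = dict(recognizer)
    return amd


amd = build_amd(
    threshold_word_count=6,
    timers={"noSpeechTimeoutMs": 3000, "decisionTimeoutMs": 10000},
    recognizer={"vendor": "microsoft", "language": "en-US"},
)
print(json.dumps({"amd": amd}, indent=2))
```

The printed object can be placed under the `amd` key of a dial verb. Parameters that are left out fall back to the defaults listed in the table.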

Events

The payload sent to the `actionHook` always contains a `type` property describing the event type. Some event types include additional properties.

Event	Description	Additional Properties
amd_human_detected	A human is speaking.	`{reason, greeting, language}`, where `reason` indicates a short greeting, `greeting` is the recognized greeting, and `language` is the recognized language.
amd_machine_detected	A machine is speaking.	`{reason, hint, transcript, language}`, where `reason` indicates a hint or a long greeting, `hint` is the recognized hint, `transcript` is the recognized greeting, and `language` is the recognized language.
amd_no_speech_detected	No speech was detected.	-
amd_decision_timeout	No decision could be made within the allotted time.	-
amd_machine_stopped_speaking	The machine has completed the greeting.	-
amd_tone_detected	A beep was detected.	-
amd_error	An error has occurred.	An error message.
amd_stopped	Answering Machine Detection was stopped.	-
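
Because the additional properties depend on the event type, a handler may want to pick out only the ones documented for that type. The sketch below is one possible approach; `EXTRA_PROPERTIES` and `extra_properties` are hypothetical names. The property name used for the `amd_error` message is not given in the table, so that event is deliberately left out.

```python
# Additional properties per event type, as listed in the table above.
EXTRA_PROPERTIES = {
    "amd_human_detected": ("reason", "greeting", "language"),
    "amd_machine_detected": ("reason", "hint", "transcript", "language"),
}


def extra_properties(event: dict) -> dict:
    """Return only the documented additional properties present in `event`."""
    names = EXTRA_PROPERTIES.get(event.get("type"), ())
    # A payload may omit some properties, so keep only the ones that are there.
    return {name: event[name] for name in names if name in event}


event = {
    "type": "amd_machine_detected",
    "reason": "hint",
    "hint": "call has been forwarded",
    "language": "en-us",
}
print(extra_properties(event))
print(extra_properties({"type": "amd_tone_detected"}))
```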

Multiple events can occur during a single call. For example, on a call to an answering machine, the sequence could be:

amd_machine_detected
amd_tone_detected
amd_machine_stopped_speaking
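
Since several events can arrive for one call, a handler usually needs to keep per-call state. The following sketch records the events seen so far and reports when a machine greeting appears to be over. The class `AmdTracker` and the rule "machine detected, then tone or end of greeting" are illustrative assumptions, not behavior defined by Voice Gateway.

```python
from dataclasses import dataclass, field


@dataclass
class AmdTracker:
    """Collects the AMD event types received for a single call."""

    seen: list = field(default_factory=list)

    def record(self, event_type: str) -> None:
        self.seen.append(event_type)

    @property
    def is_machine(self) -> bool:
        return "amd_machine_detected" in self.seen

    @property
    def ready_for_message(self) -> bool:
        # Assumed rule: a machine answered and its greeting has ended,
        # signalled either by a beep or by the end-of-greeting event.
        return self.is_machine and (
            "amd_tone_detected" in self.seen
            or "amd_machine_stopped_speaking" in self.seen
        )


tracker = AmdTracker()
for event_type in (
    "amd_machine_detected",
    "amd_tone_detected",
    "amd_machine_stopped_speaking",
):
    tracker.record(event_type)
    print(event_type, tracker.ready_for_message)
```

In a real deployment, one tracker would be kept per call, keyed by whatever call identifier the webhook request provides.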

Inbound calls

You can use Answering Machine Detection for incoming calls by adding an `amd` property to a config verb. This is useful when Voice Gateway is located behind a dialer: the dialer initiates the outbound call and then connects it to Voice Gateway via an INVITE request.
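
As a rough sketch of the inbound case, the snippet below builds a config verb carrying an `amd` property. It assumes the `amd` object accepts the same parameters as in the dial verb; the helper name `inbound_amd_config` is hypothetical.

```python
import json


def inbound_amd_config(action_hook="amd", **amd_options):
    """Build a `config` verb that enables AMD on an inbound call."""
    return {"verb": "config", "amd": {"actionHook": action_hook, **amd_options}}


# Assumption: timers are passed exactly as they would be for a dial verb.
verb = inbound_amd_config(timers={"noSpeechTimeoutMs": 5000})
print(json.dumps(verb, indent=2))
```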

More information

ANSWERING_MACHINE_DETECTION

Overview

Self-Service Portal

Outbound Calls

References
