# Answering Machine Detection

The *Answering Machine Detection* feature can be enabled on outbound calls to provide an indication of whether a call has been answered by a person or a machine. To use this feature, provide the `amd` property in a [dial](/voice-gateway/references/verbs/dial) verb.

In this example, Answering Machine Detection starts as soon as the call is answered. When a result is available, an event is sent to the `amd` webhook, indicating whether a human or a machine answered the call.

```json theme={null}
{
  "verb": "dial",
  "actionHook": "dial",
  "callerId": "+49XXXXXXXXXXX",
  "target": [
    {
      "type": "phone",
      "number": "+49XXXXXXXXXXX",
      "trunk": "Twilio"
    }
  ],
  "amd": {
    "actionHook": "amd",
    "recognizer": {
      "vendor": "microsoft",
      "language": "en-US"
    }
  }
}
```

Examples of webhook payloads (each object is a separate payload):

```json theme={null}
{"type":"amd_human_detected"} 

{"type":"amd_machine_detected","reason":"hint","hint":"call has been forwarded","language":"en-us"}

{"type":"amd_no_speech_detected"}
```

## Configuration

The following table lists the available parameters:

| Parameter                          | Type             | Description                                                                                                                                                                                             | Required |
| ---------------------------------- | ---------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------- |
| actionHook                         | string \| object | A webhook to receive an HTTP POST for AMD events. The default value is `amd`.                                                                                                                           | Yes      |
| thresholdWordCount                 | number           | The number of spoken words in a greeting that triggers an `amd_machine_detected` result. The default value is `9`.                                                                                      | No       |
| recognizer                         | object           | Speech recognition parameters, used as per the [gather](/voice-gateway/references/verbs/gather) and [transcribe](/voice-gateway/references/verbs/transcribe) verbs. The default value is `application`. | No       |
| recognizer.vendor                  | string           | The speech recognition provider to use, for example, Google, Amazon, or Microsoft Azure (`microsoft` in the example above). The vendor determines transcription quality, supported languages, and feature availability. | Yes      |
| recognizer.label                   | string           | A custom label to identify this recognizer instance in logs or dashboards. Helpful when multiple recognizers are configured.                                                                            | No       |
| recognizer.language                | string           | The primary language code for transcription, for example, `en-US` for English, `fr-FR` for French. Determines how speech is interpreted.                                                                | No       |
| recognizer.hints                   | array            | An array of words or phrases that may appear in the audio and should be recognized more accurately. Useful for domain-specific terms, names, or technical vocabulary.                                   | No       |
| recognizer.hintsBoost              | number           | A numeric value specifying how strongly the recognizer should prioritize the hint words. Higher numbers give stronger emphasis, improving accuracy for key terms.                                       | No       |
| recognizer.altLanguages            | array            | An array of additional language codes that the recognizer can use for multilingual audio. Allows recognition of mixed-language content.                                                                 | No       |
| recognizer.profanityFilter         | boolean          | If `true`, the recognizer will automatically remove or mask profanity from the transcription output.                                                                                                    | No       |
| recognizer.interim                 | boolean          | If `true`, returns partial transcription results as the audio is being processed. Useful for live captions or real-time feedback.                                                                       | No       |
| recognizer.punctuation             | boolean          | If `true`, punctuation marks, for example, periods or commas, are included in the transcription to improve readability.                                                                                 | No       |
| recognizer.diarization             | boolean          | If `true`, enables speaker diarization, which assigns segments of the transcript to individual speakers.                                                                                                | No       |
| recognizer.diarizationMinSpeakers  | number           | The minimum number of speakers expected in the audio. Helps the diarization algorithm distinguish between speakers accurately.                                                                          | No       |
| recognizer.diarizationMaxSpeakers  | number           | The maximum number of speakers expected in the audio. Prevents the algorithm from splitting speech unnecessarily.                                                                                       | No       |
| recognizer.vad                     | object           | Voice Activity Detection settings. Determines how the system detects when someone is speaking vs. silence, improving transcription timing and accuracy.                                                 | No       |
| recognizer.fallbackVendor          | string           | Specifies an alternative transcription vendor to use if the primary vendor fails. Ensures reliability in critical workflows.                                                                            | No       |
| recognizer.fallbackLanguage        | string           | Language code to use for the fallback vendor. Must match a language supported by the fallback provider.                                                                                                 | No       |
| timers                             | object           | An object containing various timeouts.                                                                                                                                                                  | No       |
| timers.noSpeechTimeoutMs           | number           | The time in milliseconds to wait for speech before returning `amd_no_speech_detected`. The default value is `5000`.                                                                                     | No       |
| timers.decisionTimeoutMs           | number           | The time in milliseconds to wait before returning `amd_decision_timeout`. The default value is `15000`.                                                                                                 | No       |
| timers.toneTimeoutMs               | number           | The time in milliseconds to wait to hear a tone. The default value is `20000`.                                                                                                                          | No       |
| timers.greetingCompletionTimeoutMs | number           | The duration of silence, in milliseconds, to wait for during the greeting before returning `amd_machine_stopped_speaking`. The default value is `2000`.                                                 | No       |
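
As an illustration of how the parameters in the table fit together, the following sketch extends the `dial` example with `thresholdWordCount` and a `timers` object. The values are arbitrary placeholders chosen for the example, not recommendations; the parameter names are taken from the table, and anything omitted falls back to the defaults listed there.

```json
{
  "verb": "dial",
  "actionHook": "dial",
  "callerId": "+49XXXXXXXXXXX",
  "target": [
    {
      "type": "phone",
      "number": "+49XXXXXXXXXXX",
      "trunk": "Twilio"
    }
  ],
  "amd": {
    "actionHook": "amd",
    "thresholdWordCount": 6,
    "recognizer": {
      "vendor": "microsoft",
      "language": "en-US"
    },
    "timers": {
      "noSpeechTimeoutMs": 7000,
      "decisionTimeoutMs": 15000,
      "toneTimeoutMs": 20000,
      "greetingCompletionTimeoutMs": 2000
    }
  }
}
```

In this sketch, a greeting of six or more words would be classified as a machine sooner than with the default of `9`, and the no-speech timeout is lengthened from the default `5000` to `7000` milliseconds; the other timers are shown with their default values.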

## Events

The payload sent to the `actionHook` always contains a `type` property that identifies the event.
Some event types include additional properties.

| Event                           | Description                                        | Additional Properties                                                                                                                                                                                                        |
| ------------------------------- | -------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| amd\_human\_detected            | A human is speaking.                               | `{reason, greeting, language}`, where: <br /> - `reason`: the reason for the decision (a short greeting). <br /> - `greeting`: the recognized greeting. <br /> - `language`: the recognized language. |
| amd\_machine\_detected          | A machine is speaking.                             | `{reason, hint, transcript, language}`, where: <br /> - `reason`: the reason for the decision (a hint or a long greeting). <br /> - `hint`: the recognized hint. <br /> - `transcript`: the recognized greeting. <br /> - `language`: the recognized language. |
| amd\_no\_speech\_detected       | No speech was detected.                            | -                                                                                                                                                                                                                            |
| amd\_decision\_timeout          | No decision could be made within the allotted time. | -                                                                                                                                                                                                                            |
| amd\_machine\_stopped\_speaking | The machine has completed the greeting.            | -                                                                                                                                                                                                                            |
| amd\_tone\_detected             | A beep was detected.                               | -                                                                                                                                                                                                                            |
| amd\_error                      | An error has occurred.                             | An error message.                                                                                                                                                                                                            |
| amd\_stopped                    | Answering Machine Detection was stopped.           | -                                                                                                                                                                                                                            |

Multiple events can occur during a single call. For example, on a call to an answering machine, the sequence could be:

1. `amd_machine_detected`
2. `amd_tone_detected`
3. `amd_machine_stopped_speaking`

## Inbound calls

You can use Answering Machine Detection for incoming calls by adding an `amd` property in a [`config`](/voice-gateway/references/verbs/config) verb. It can be useful in situations where Voice Gateway is located behind a dialer. In these cases, the dialer initiates the outbound call and then links it to Voice Gateway via an `INVITE` request.
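
A minimal sketch of such a `config` verb is shown below. It assumes that the `amd` object accepts the same parameters here as in the `dial` verb; check the `config` verb reference before relying on this shape.

```json
{
  "verb": "config",
  "amd": {
    "actionHook": "amd",
    "recognizer": {
      "vendor": "microsoft",
      "language": "en-US"
    }
  }
}
```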

## More information

* [ANSWERING\_MACHINE\_DETECTION](/voice-gateway/references/events/ANSWERING_MACHINE_DETECTION)
