Learn about answering machine detection and how it works

Answering machine detection (or “AMD”) is technology that detects whether a call has been answered by a human or by an answering machine (or voicemail). Answering machine detection is not perfect: some approaches claim accuracy of up to 97 or 98%, but most are somewhat less accurate than that.

How does answering machine detection work?

Answering machine detection is usually used in conjunction with dialer systems (including predictive dialers or other auto dialers) when many outbound calls must be made as efficiently as possible.

When an agent makes an outbound call and it is answered by voicemail or an answering machine, the time spent on that call reduces the number of contacts the agent can make in a shift.

Answering machine detection algorithms are used in these situations to avoid wasting the agent's time. As shown in the call flow figure below, the dialer dials the outbound leads, and an agent is not presented with a call until the answering machine detection algorithm has determined that the call was picked up by a human rather than an answering machine.

The calls that follow the left-hand side of the call flow were answered by humans and are connected to an agent (thereby giving the agent a very high contact rate). The calls that follow the right-hand side of the call flow were answered by answering machines or voicemail and do not need an agent involved.

Some call centers still connect the right-hand calls to agents (and require that the agent leave a personalized voicemail message).

The flow below also increases agent efficiency because calls that are not answered (e.g., calls that ring multiple times without an answer) are kept away from the agent, as are calls made to disconnected or bad phone numbers.

[Figure: answering machine detection call flow]
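The routing decisions in the call flow can be sketched in a few lines of Python. This is only an illustration of the flow described above, not any particular dialer's API: the `CallOutcome` values, the `route_call` function, its `leave_personal_voicemail` flag, and the action names it returns are all hypothetical.

```python
from enum import Enum


class CallOutcome(Enum):
    """Possible results of a dialed call (hypothetical names)."""
    HUMAN = "human"
    MACHINE = "answering machine"
    NO_ANSWER = "no answer"
    BAD_NUMBER = "bad number"


def route_call(outcome: CallOutcome, leave_personal_voicemail: bool = False) -> str:
    """Decide what the dialer does with a call once its outcome is known."""
    if outcome is CallOutcome.HUMAN:
        # Left-hand side of the flow: a live person goes straight to an agent.
        return "connect_agent"
    if outcome is CallOutcome.MACHINE:
        # Right-hand side: normally no agent is involved, unless the call
        # center wants agents to leave personalized voicemail messages.
        if leave_personal_voicemail:
            return "connect_agent"
        return "tag_answering_machine"
    if outcome is CallOutcome.NO_ANSWER:
        # Unanswered calls never reach an agent.
        return "tag_no_answer"
    # Disconnected or bad numbers never reach an agent either.
    return "tag_bad_number"


if __name__ == "__main__":
    for outcome in CallOutcome:
        print(outcome.value, "->", route_call(outcome))
```

Only the first branch (and, optionally, the second) consumes agent time, which is where the efficiency gain comes from.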

Answering machine detection algorithms can significantly improve a call center's outbound answer and connect rates. For example, many predictive dialer systems will automatically tag a call that reaches an answering machine with a disposition code (such as “answering machine”) so that the call can be redialed at a later time. However, it is important to configure AMD processes properly to get the best results. First, let's take a look at the different ways that call center software systems implement answering machine detection.

Types of answering machine detection

There are several approaches to answering machine detection. Some rely on “energy analysis”, others rely on algorithms based on call progress analysis (CPA), and still others (more recently) use machine learning techniques to predict whether an answering machine or a human has answered. Each is briefly reviewed below.

Energy analysis answering machine detection

“Energy analysis” forms of answering machine detection use a short-time energy function to determine when the called party is speaking and when they have transitioned to silence. From that pattern of speech and silence, the algorithm decides whether the called party is a live person or an answering machine.

These algorithms typically also use a short-time zero crossing rate to determine whether the detected energy is speech or a sinusoidal tone (or a pair of tones, as with DTMF). The short-time zero crossing rate is a general-purpose audio signal processing measurement: it counts how often the signal changes sign within a short window.

The answering machine detection algorithm will reject tones and not mistake them for possible speech.
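The two measurements can be sketched as follows. This is a minimal illustration rather than a production detector: the function names, the frame size, and both thresholds are assumptions, and the tone test used here (a zero crossing rate that stays nearly constant from frame to frame) is one simple way to separate a steady tone from speech, not necessarily the test a given dialer uses.

```python
import math


def short_time_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def split_frames(signal, frame_size):
    """Yield consecutive, non-overlapping frames of frame_size samples."""
    for start in range(0, len(signal) - frame_size + 1, frame_size):
        yield signal[start:start + frame_size]


def classify_audio(signal, frame_size=160, energy_floor=1e-4, zcr_tolerance=0.02):
    """Label audio as 'silence', 'tone', or 'speech' (thresholds are assumed)."""
    active_zcrs = [
        zero_crossing_rate(frame)
        for frame in split_frames(signal, frame_size)
        if short_time_energy(frame) >= energy_floor
    ]
    if not active_zcrs:
        return "silence"
    # A pure tone crosses zero at a steady rate; speech varies frame to frame.
    if max(active_zcrs) - min(active_zcrs) <= zcr_tolerance:
        return "tone"
    return "speech"


if __name__ == "__main__":
    rate = 8000  # samples per second, typical for telephony audio
    tone = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(1600)]
    print(classify_audio(tone))
    print(classify_audio([0.0] * 1600))
```

With a single active frame the spread is zero, so the sketch would report a tone; a real detector would require a minimum duration before rejecting audio as a tone.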

Energy analysis-type AMD is commonly used in predictive dialing applications.

Call Progress Analysis answering machine detection

Call progress analysis answering machine detection uses a few characteristics of answering machine messages to determine when a machine picked up instead of a human.

In particular, answering machines or voicemail messages generally use multiple words to tell the caller what to do. A human who answers typically says just a few words (like “hello, this is John” or simply “hello”).

As shown in the figures below, some AMD algorithms simply measure how long the answer lasts.

[Figure: answering machine detection, detecting a human]

A similar analysis is done to detect an answering machine. Most answering machines use messages that are more than a few words long (you typically wouldn't use just a single word as your voicemail greeting).

In the figure below, we see that the answering machine (or voicemail) messages last longer, and many answering machine detection algorithms use this fact to determine that an answering machine answered rather than a human. These types of algorithms are sometimes referred to as “heuristic” algorithms.

[Figure: answering machine detection, detecting that an answering machine answered]
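The greeting-length heuristic can be sketched as below. The input is assumed to be a list of per-frame voice activity flags (for example, produced by an energy threshold), and the function name `classify_greeting`, the 20 ms frame length, the 1500 ms greeting limit, and the 500 ms end-of-greeting silence are all illustrative assumptions rather than values from any specific product.

```python
def classify_greeting(voiced_frames, frame_ms=20,
                      max_human_greeting_ms=1500, end_silence_ms=500):
    """Classify an answered call from the length of its opening greeting.

    voiced_frames: iterable of booleans, True where a frame contains speech.
    Returns 'human', 'machine', or 'unknown'.
    """
    greeting_ms = 0   # length of the greeting so far, including short pauses
    pause_ms = 0      # length of the current run of silence
    started = False   # becomes True once any speech has been heard

    for voiced in voiced_frames:
        if voiced:
            started = True
            # A pause shorter than end_silence_ms counts as part of the greeting.
            greeting_ms += frame_ms + pause_ms
            pause_ms = 0
            if greeting_ms > max_human_greeting_ms:
                return "machine"  # long greeting: likely a recorded message
        elif started:
            pause_ms += frame_ms
            if pause_ms >= end_silence_ms:
                return "human"    # short greeting followed by silence
    # The audio ended before either rule fired.
    return "unknown"


if __name__ == "__main__":
    hello = [True] * 25 + [False] * 30   # about 0.5 s of speech, then silence
    voicemail = [True] * 150             # about 3 s of continuous speech
    print(classify_greeting(hello))
    print(classify_greeting(voicemail))
```

The trade-off sits in the two thresholds: a shorter greeting limit catches machines sooner but misclassifies talkative humans, while a longer one delays the connection to the agent.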

AI/ML answering machine detection

More recently, machine learning techniques have been used to achieve greater accuracy in answering machine detection. Classification models are typically trained to classify an answer as a human answer or a machine answer. These classification models can use audio analysis or image analysis.

Here's how an image analysis model works. A training data set is obtained that includes both human answers and voicemail answers. The audio is converted to mel-spectrograms (which may be represented as images).

Once the model has been trained, it can be deployed for use in a call center. Snippets of audio are converted (in real time) to mel-spectrograms and processed by the machine learning model, which classifies each call as answered by either a human or a machine.
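To make the conversion step concrete, here is a small, dependency-free sketch that turns audio samples into a log mel-spectrogram. It covers only the feature extraction; the classifier itself is not shown. The function names and the parameters (`n_fft`, `hop`, `n_mels`) are assumptions chosen for illustration, and a real system would more likely rely on an optimized audio library and much larger frames.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power of each DFT bin (0..n/2) of a Hann-windowed frame (naive DFT)."""
    n = len(frame)
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed))) ** 2
            for k in range(n // 2 + 1)]


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    max_mel = hz_to_mel(sample_rate / 2)
    edges = [mel_to_hz(max_mel * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, center, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for f in bin_hz:
            if lo < f <= center:
                row.append((f - lo) / (center - lo))   # rising slope
            elif center < f < hi:
                row.append((hi - f) / (hi - center))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mel_spectrogram(signal, sample_rate=8000, n_fft=64, hop=32, n_mels=8):
    """Return a list of frames, each a list of n_mels log mel energies."""
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        power = power_spectrum(signal[start:start + n_fft])
        frames.append([math.log(sum(w * p for w, p in zip(row, power)) + 1e-10)
                       for row in bank])
    return frames


if __name__ == "__main__":
    tone = [math.sin(2 * math.pi * 1000 * n / 8000) for n in range(256)]
    spec = mel_spectrogram(tone)
    print(len(spec), "frames x", len(spec[0]), "mel bands")
```

The resulting grid of frames by mel bands is what would be rendered as an image and passed to the classification model.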

As these machine learning models improve, these types of answering machine detection approaches will likely be the most accurate.

When should answering machine detection be used?

Answering machine detection is a good candidate for any large outbound calling campaign that requires agents to be very efficient. For example, outbound scheduling campaigns are a good candidate for AMD.

In an outbound scheduling campaign, the revenue made by the call center per call is typically low, so agents need to be very efficient. Also, when an answering machine is reached, it is possible to leave a standard voicemail message instructing the customer to call back to schedule the appointment (or, even better, instructing the customer to visit a scheduling web page to self-schedule an appointment).

How will you use answering machine detection in your call center?