Applies to: Exchange Server 2007 SP3, Exchange Server 2007 SP2, Exchange Server 2007 SP1, Exchange Server 2007
Topic Last Modified: 2007-07-21

When you install the Unified Messaging (UM) server role on a computer that is running Microsoft Exchange Server 2007, several UM-specific components and services are installed. The Unified Messaging services and components that are installed by Setup enable a Unified Messaging server to answer and process incoming voice and fax calls and enable users to interact with the Unified Messaging system by using Outlook Voice Access or by hearing a UM auto attendant when they call in to the UM system. This topic discusses the interaction between these UM components and services and how the services and components provide the features that are offered by Unified Messaging.

Overview of Unified Messaging Services

The features and components of Unified Messaging rely on the functionality of two Exchange 2007 services: the Microsoft Exchange Unified Messaging service (UMservice.exe) and the Microsoft Exchange Speech Engine service (SpeechService.exe). The Service Control Manager controls and monitors both of these services and their related processes.

The Microsoft Exchange Unified Messaging service enables voice and fax messages to be stored in an Exchange 2007 mailbox and gives users telephone access to e-mail, voice mail, calendar, and contacts. If you stop this service, Unified Messaging features will not be available for users in your organization. For the Microsoft Exchange Unified Messaging service to work, the Microsoft Exchange Speech Engine service must already be started and functioning correctly.

The Microsoft Exchange Speech Engine service controls the following:

  • The dual tone multi-frequency (DTMF), also known as touchtone, interface

  • Automatic Speech Recognition (ASR) that is used with the Voice User Interface (VUI) in Outlook Voice Access

  • The Text-to-Speech (TTS) engine that reads e-mail, voice mail, and calendar items and plays the menu prompts for callers

When the Microsoft Exchange Unified Messaging service and Microsoft Exchange Speech Engine service are starting, they each create their own worker processes: the UM worker process (UMWorkerProcess.exe) and the Speech Engine service worker process (SESWorker.exe). Each UM worker process enables the Microsoft Exchange Unified Messaging service and the Microsoft Exchange Speech Engine service to interact to provide Outlook Voice Access and call answering. The Speech Engine service worker process provides the TTS engine features, enables callers to use both Outlook Voice Access interfaces, and plays the system prompts for callers. For more information about Outlook Voice Access, see Understanding Unified Messaging Subscriber Access. For more information about Unified Messaging system prompts, see Understanding Unified Messaging Audio Prompts.

The following figure illustrates the relationships between Unified Messaging components.


Unified Messaging Architecture

Service Ports

The Microsoft Exchange Unified Messaging service and the UM worker process use multiple Transmission Control Protocol (TCP) and User Datagram Protocol (UDP) service ports to communicate with IP gateways and with the Speech Engine service worker process that is created by the Microsoft Exchange Speech Engine service at startup. The Microsoft Exchange Unified Messaging service and the UM worker process use Session Initiation Protocol (SIP) over TCP. By default, the Microsoft Exchange Unified Messaging service listens on TCP port 5060 in unsecured mode and on TCP port 5061 when Mutual Transport Layer Security (MTLS) is used. Each UM worker process that is created listens on TCP ports 5065 and 5066. However, when an IP gateway or IP PBX sends Real-time Transport Protocol (RTP) traffic to the Speech Engine service worker process, the IP gateway or IP PBX uses a valid UDP port in the range 1024 through 65535.

A TCP control port is also used on a Unified Messaging server. When a UM worker process is created, the Microsoft Exchange Unified Messaging service passes the appropriate configuration options to the UM worker process. These options include the number of the TCP control port that is used for communication between the Microsoft Exchange Unified Messaging service and the UM worker process. The TCP control port that is chosen falls between TCP ports 16,000 and 17,000.

New in Service Pack 1 (SP1)

  • The Microsoft Exchange Unified Messaging service will listen on TCP ports 5060 and 5061 at the same time.

  • Each UM worker process that is created listens on ports 5065 and 5067 (unsecured) and on ports 5066 and 5068 (secured).
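
The SP1 port assignments and the port ranges described in this section can be summarized as a small lookup. The following Python sketch is illustrative only: the names SIP_PORTS_SP1, CONTROL_PORTS, RTP_PORTS, and classify_port are hypothetical and are not part of Exchange, and treating both ends of each range as inclusive is an assumption. The port numbers themselves are taken from this topic.

```python
# Illustrative lookup of the default Unified Messaging ports (SP1 behavior).
# All names here are hypothetical; only the port numbers come from this topic.

SIP_PORTS_SP1 = {
    5060: "UM service SIP (unsecured)",
    5061: "UM service SIP (secured)",
    5065: "UM worker SIP (unsecured)",
    5067: "UM worker SIP (unsecured)",
    5066: "UM worker SIP (secured)",
    5068: "UM worker SIP (secured)",
}

# Assumption: both range endpoints are inclusive.
CONTROL_PORTS = range(16000, 17001)  # TCP control port, 16,000 to 17,000
RTP_PORTS = range(1024, 65536)       # UDP ports 1024 through 65535


def classify_port(protocol, port):
    """Describe what a port is used for, according to this topic."""
    protocol = protocol.lower()
    if protocol == "tcp":
        if port in SIP_PORTS_SP1:
            return SIP_PORTS_SP1[port]
        if port in CONTROL_PORTS:
            return "UM control port range"
    elif protocol == "udp" and port in RTP_PORTS:
        return "RTP media port range"
    return "not described in this topic"
```

A lookup like this can help when you review firewall rules between an IP gateway and a Unified Messaging server, for example by calling classify_port("tcp", 5067).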

Unified Messaging Services

The Microsoft Exchange Unified Messaging service is one of the two services that provide Unified Messaging services for your network. The Microsoft Exchange Unified Messaging service performs the following functions:

  • Retrieves the dial plan configuration from the Active Directory directory service

  • Loads the configuration information for monitoring Unified Messaging worker processes from the UmRecyclerConfig.xml file

  • Initializes the UM Worker Process Manager and the startup of a UM worker process

  • Registers SIP endpoints

The Microsoft Exchange Unified Messaging service first accepts all incoming connections, and then reroutes each request to a UM worker process that handles it. In addition, the Microsoft Exchange Unified Messaging service monitors every UM worker process that is created and ensures that it is functioning correctly. If a UM worker process becomes unresponsive, the Microsoft Exchange Unified Messaging service stops the UM worker process, and then creates a new UM worker process to replace it.

Note:
By default, each UM worker process is recycled every seven days (604,800 seconds). The setting can be found in the \bin\umrecyclerconfig.xml file.
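
The replacement and recycling behavior described above can be expressed as a single decision. The following Python sketch is a minimal illustration, not the Exchange implementation: WorkerStatus and needs_replacement are hypothetical names, and the way responsiveness is measured is left out entirely. Only the seven-day default comes from this topic.

```python
from dataclasses import dataclass

# Seven days expressed in seconds: 7 * 24 * 60 * 60 = 604,800.
RECYCLE_INTERVAL_SECONDS = 7 * 24 * 60 * 60


@dataclass
class WorkerStatus:
    """Hypothetical snapshot of one UM worker process."""
    started_at: float   # start time, in seconds
    responsive: bool    # result of some health check (not modeled here)


def needs_replacement(worker, now, recycle_after=RECYCLE_INTERVAL_SECONDS):
    """Return True if the worker is unresponsive or has reached the recycle age."""
    if not worker.responsive:
        return True
    return now - worker.started_at >= recycle_after
```

In this sketch, a responsive worker that started at time 0 is replaced once now reaches 604,800 seconds, and an unresponsive worker is replaced regardless of its age.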

The Microsoft Exchange Unified Messaging service works with the Microsoft Exchange Speech Engine service to implement all the telephony features that are offered by Exchange 2007 Unified Messaging. The Microsoft Exchange Unified Messaging service handles call control and interacts with the Microsoft Exchange Speech Engine service to handle the incoming media streams that are negotiated in the SIP signaling information between the Microsoft Exchange Unified Messaging service and a SIP-enabled telephony device such as an IP gateway or IP PBX. The following events occur when the Microsoft Exchange Unified Messaging service handles an incoming call:

  1. A call session is initiated by the Microsoft Exchange Unified Messaging service.

  2. The Microsoft Exchange Unified Messaging service redirects the call to a UM worker process.

  3. The UM worker process requests that a media session be established with the Microsoft Exchange Speech Engine service, and then the UM worker process relays the media information back to the caller.

  4. The Speech Engine Service worker process that is created by the Microsoft Exchange Speech Engine service provides a UDP port for the RTP stream.

  5. The UM worker process uses the SIP signaling information to inform the Speech Engine service worker process to end the call session when the RTP media stream is no longer needed.
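
The five steps above can be traced with a toy model. The following Python sketch is purely illustrative: the function name trace_incoming_call, the event strings, and the call identifier are hypothetical, and nothing here reflects how Exchange actually implements call handling. The only values taken from this topic are the order of the steps and the UDP port range for RTP.

```python
def trace_incoming_call(call_id, rtp_port):
    """Return the ordered events for one call, following the steps in this topic."""
    # RTP traffic uses a UDP port from 1024 through 65535.
    if not 1024 <= rtp_port <= 65535:
        raise ValueError("RTP uses a UDP port from 1024 through 65535")
    return [
        f"UM service: call session {call_id} initiated",
        f"UM service: call {call_id} redirected to a UM worker process",
        "UM worker: media session requested from the Speech Engine service",
        f"Speech Engine worker: UDP port {rtp_port} provided for the RTP stream",
        f"UM worker: Speech Engine worker told to end call session {call_id}",
    ]
```

For example, trace_incoming_call("call-1", 20000) returns five event strings in the order in which the steps occur.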

Unified Messaging Worker Process

A Unified Messaging worker process is a process that is created during the startup of the Microsoft Exchange Unified Messaging service. UM worker processes interact with all incoming and outgoing requests that have been received by the Microsoft Exchange Unified Messaging service.

The Unified Messaging Worker Process Manager is also a component of the Microsoft Exchange Unified Messaging service. The UM Worker Process Manager handles the creation and monitoring of all the UM worker processes that are created. The UM Worker Process Manager creates new instances of a UM worker process based on the configuration settings that are located in the UmRecyclerConfig.xml file and also monitors the health of these processes. As a new incoming call arrives, the UM Worker Process Manager determines the appropriate instance of a UM worker process to which to redirect the call. The UM worker process then interacts with the Microsoft Exchange Speech Engine service components to correctly process incoming and outgoing requests. The UM worker process is responsible for the following startup tasks:

  • Allocation of the runtime management objects

  • Loading of the UM configuration from UMConfig.xml

  • Initialization of the fax job listener

  • Registration of the process with the Microsoft Exchange Speech Engine service

  • Initialization of Simple Mail Transfer Protocol (SMTP) message submission

For more information about Voice over IP (VoIP) security in Unified Messaging, see Understanding Unified Messaging VoIP Security.

The Unified Messaging worker process also contains a fax provider that lets users receive fax messages in their Exchange 2007 mailbox. The fax provider that is included in a UM worker process uses the T.38 protocol over UDP Transport Layer (UDPTL). This UM worker process transfers the fax message and then creates and processes the compressed Tagged Image File Format (TIFF) of the fax message that is received. For more information about faxing in Unified Messaging, see Understanding Faxing in Unified Messaging.

Microsoft Exchange Speech Services

The Microsoft Exchange Speech Engine service is an embedded speech engine that is installed when you install the Unified Messaging server role. The Microsoft Exchange Speech Engine service is an Interactive Voice Response (IVR) platform that provides the speech recognition capability that is used to recognize user input and that also provides Text-to-Speech (TTS) capabilities.

The applications in an IVR platform communicate with end users through a telephony or VoIP network. The Microsoft Exchange Speech Engine service supports SIP and RTP for telephony connectivity, and it also supports TLS. For Unified Messaging, when an incoming call is received, the Microsoft Exchange Speech Engine service processes the RTP stream that is associated with the call, and then passes the information and events to the UM worker process that is managing the SIP connection. The Microsoft Exchange Speech Engine service supports the following features in Unified Messaging:

  • Automatic Speech Recognition (ASR) input recognition

  • DTMF, or touchtone, input recognition

  • The TTS conversion process

  • Recording e-mail and voice mail messages

  • Playing e-mail and voice mail messages to the user

For more information about Automatic Speech Recognition, see Understanding Automatic Speech Recognition Directory Lookups. For more information about the TTS engine, see Understanding Unified Messaging Audio Prompts.

When the Microsoft Exchange Speech Engine service is starting, it creates the Speech Engine Service worker process. During call flow, the Speech Engine Service worker process is responsible for recognizing touchtone or voice input from the user. For example, if a caller uses ASR or voice inputs to navigate the main menu, the following steps occur: 

  1. An Outlook Voice Access user calls a subscriber access number and logs on to their mailbox, or an outside caller dials a number that is configured with a UM auto attendant. The caller then uses ASR, or voice inputs, to navigate the main menu.

  2. When a call is received by a Unified Messaging server, the Unified Messaging server determines whether the menu is speech-enabled. If the menu is speech-enabled, the Unified Messaging server uses specific prompts and grammars.

  3. The UM worker process notifies the Speech Engine service worker process to begin recognition based on the grammar file that is needed. For this example, the main menu is needed. Therefore, the Speech Engine service worker process loads the mainmenu.grxml file. The Microsoft Exchange Speech Engine service plays the main menu prompts over the telephone to the Outlook Voice Access user.

  4. For example, the user may respond by saying “e-mail”. The voice traffic that is created is sent over an RTP stream and is received by the Speech Engine Service worker process. The Speech Engine Service worker process, which has already loaded the mainmenu.grxml file, compares the voice recognition results to the contents in the file. The result is sent to the UM worker process.

  5. The UM worker process determines what transition to make based on the results from the Speech Engine Service worker process. For this example, the next transition state is to play the menu of e-mail options to the user.

  6. The correct activity manager is loaded into memory for playing the e-mail menu. The corresponding grammar file for the e-mail menu, which is email.grxml, is then loaded by the Speech Engine Service worker process.

  7. The UM worker process sends a request to the Microsoft Exchange Speech Engine service to play the corresponding prompts for the e-mail menu.

For more information about the grammar files that are used in Unified Messaging, see Understanding Automatic Speech Recognition Directory Lookups.
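
The menu navigation in the preceding steps behaves like a small state machine: each menu has a grammar file, and a recognized phrase selects the next menu. The following Python sketch models only the single transition used in the example. The names GRAMMAR_FILES, TRANSITIONS, and next_menu are hypothetical, and staying on the current menu when a phrase is not recognized is an assumption made for illustration. The grammar file names come from this topic.

```python
# Menu name -> grammar file that the Speech Engine service worker process loads.
GRAMMAR_FILES = {
    "main": "mainmenu.grxml",
    "email": "email.grxml",
}

# (current menu, recognized phrase) -> next menu.
# Only the transition from the example in this topic is modeled.
TRANSITIONS = {
    ("main", "e-mail"): "email",
}


def next_menu(current, recognized):
    """Return the menu to move to; stay on the current menu if nothing matches."""
    return TRANSITIONS.get((current, recognized), current)


def grammar_for(menu):
    """Return the grammar file that corresponds to a menu."""
    return GRAMMAR_FILES[menu]
```

With this model, grammar_for(next_menu("main", "e-mail")) yields email.grxml, which matches step 6 in the example.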

A similar series of events occurs when a caller is using DTMF, or touchtone, inputs to navigate the menus. Handling of DTMF input resembles handling voice inputs, except that the Speech Engine Service worker process notifies the UM worker process when DTMF events are detected in the RTP stream. The data that is passed by this event corresponds to the number pressed by the caller. For more information about the DTMF interface, see Understanding the DTMF Interface.
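
This topic does not state how DTMF events are encoded in the RTP stream. One common encoding is the telephone-event payload that is defined in RFC 2833 and its successor, RFC 4733. The following Python sketch assumes that encoding and shows how the key that the caller pressed can be recovered from a four-byte event payload. The function name parse_telephone_event and the returned dictionary layout are hypothetical.

```python
import struct

# Event codes 0-15 map to these keys in the RFC 2833 / RFC 4733 encoding.
DTMF_KEYS = "0123456789*#ABCD"


def parse_telephone_event(payload):
    """Decode one telephone-event payload (assumed RFC 2833 / RFC 4733 layout)."""
    if len(payload) < 4:
        raise ValueError("telephone-event payload must be at least 4 bytes")
    # Layout: event (8 bits), E and R flags plus volume (8 bits), duration (16 bits).
    event, flags_volume, duration = struct.unpack("!BBH", payload[:4])
    if event >= len(DTMF_KEYS):
        raise ValueError("event code is not a DTMF key")
    return {
        "key": DTMF_KEYS[event],
        "end": bool(flags_volume & 0x80),  # E bit: marks the end of the event
        "volume": flags_volume & 0x3F,     # power level of the tone
        "duration": duration,              # in RTP timestamp units
    }
```

In this sketch, the "key" value corresponds to the number that the caller pressed, which is the data that the Speech Engine service worker process passes to the UM worker process.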

For More Information

For an overview of Unified Messaging, see Unified Messaging.

For more information about telephony concepts and components, see Overview of Telephony Concepts and Components.


