Understanding Load Balancing in Exchange 2010

Applies to: Exchange Server 2010 SP3, Exchange Server 2010 SP2

Topic Last Modified: 2012-02-29

Load balancing is a way to manage which of your servers receive traffic. Load balancing provides failover redundancy to ensure your users continue to receive Exchange service in case of computer failure. It also enables your deployment to handle more traffic than one server can process while offering a single host name for your clients.

In addition to load balancing, Microsoft Exchange Server 2010 provides several solutions for switchover and failover redundancy. These solutions include the following:

High availability and site resilience: You can deploy two Active Directory sites in separate geographic locations, keep the mailbox data synchronized between the two, and have one of the sites take on the entire load if the other fails. Exchange 2010 uses database availability groups (DAGs) to keep multiple copies of your mailboxes on different servers synchronized.
Online mailbox moves: In an online mailbox move, end users can access their e-mail accounts during the move. Users are only locked out of their accounts for a brief time at the end of the process, when the final synchronization occurs. Online mailbox moves are supported between Exchange 2010 databases and between Exchange Server 2007 Service Pack 3 (SP3) or a later version of Exchange 2007 and Exchange 2010 databases. You can perform online mailbox moves across forests or in the same forest.
Shadow redundancy: Shadow redundancy protects the availability and recoverability of messages while they're in transit. With shadow redundancy, the deletion of a message from the transport databases is delayed until the transport server verifies that all the next hops for that message have completed. If any of the next hops fail before reporting successful delivery, the message is resubmitted for delivery to the hop that didn't complete.

Contents

Overview of Load Balancing

Understanding Exchange 2010 Traffic Loads

Understanding Load Balancing Options

Load Balancing Recommendations

Affinity Options

Overview of Load Balancing

Load balancing serves two primary purposes. First, it reduces the impact of a single Client Access server failure within one of your Active Directory sites. Second, it ensures that the load on your Client Access servers and Hub Transport servers is evenly distributed.

Architectural Changes in Exchange 2010 Load Balancing

Several changes in Exchange 2010 make load balancing important for your organization. The Exchange RPC Client Access service and the Exchange Address Book service on the Client Access server role improve the user's experience during Mailbox failovers by moving the connection endpoints for mailbox access from Outlook and other MAPI clients to the Client Access server role instead of to the Mailbox server role. In earlier versions of Exchange, Outlook connected directly to the Mailbox server hosting the user's mailbox, and directory connections were either proxied through the Mailbox server role or referred directly to a particular Active Directory global catalog server. Now that these connections are handled by the Client Access server role, both external and internal Outlook connections must be load balanced across the array of Client Access servers in a deployment to achieve fault tolerance.

A load-balanced array of Client Access servers is recommended for each Active Directory site and for each version of Exchange. It isn't possible to share one load-balanced array of Client Access servers for multiple Active Directory sites or to mix different versions of Exchange or service pack versions of Exchange within the same array.

When you install Exchange 2010 within your existing organization and configure a legacy namespace for coexistence with previous versions of Exchange, your clients will automatically connect to the Exchange 2010 Client Access server or server array. The Exchange 2010 Client Access server or Client Access server array will then proxy or redirect client requests for mailboxes on older Exchange versions to either Exchange 2003 front-end servers or Exchange 2007 Client Access servers that match the mailbox version. For more information, see Understanding Upgrade to Exchange 2010.

Note:
You can apply Quick Fix Engineering (QFE) updates and update rollups to all or only some of the computers in an array. However, we recommend that you apply them to all computers within an array.

Your load balancing configuration will have a direct effect on the host names that your clients use to connect and the Secure Sockets Layer (SSL) certificates that you use. For more information about Exchange 2010 certificates, see Understanding Digital Certificates and SSL.

Configuring the Client Access Server Array

You can configure one Client Access server array per Active Directory site. As soon as the Client Access server array has been configured, you can configure the Mailbox database to use the Client Access server array as the MAPI endpoint instead of a specific Client Access server.

For more information about the Client Access server array and how to configure the Mailbox database to use the Client Access server array for the specific Active Directory site, see Understanding RPC Client Access.

Understanding Exchange 2010 Traffic Loads

Before you configure load balancing, you should understand the loads that are placed on an Exchange 2010 Client Access server. An Exchange 2010 Client Access server receives the following three types of traffic:

Traffic from external clients
Traffic from internal clients
Proxy traffic from other Client Access servers

Proxy traffic from other Client Access servers is traffic that is originally sent by an external or internal client to one Client Access server but is then proxied to another Client Access server. This can happen for several reasons, but generally it happens because the originating client can't connect directly to the destination Client Access server. This can occur when a user is trying to access a mailbox from the Internet, but the mailbox is located in a non-Internet facing Active Directory site. For more information about proxying, see Understanding Proxying and Redirection.

Each of the types of traffic received by Client Access servers includes requests from a list of protocols and comes from client devices and computers with different characteristics. These differences affect which load balancing strategies can be used.

Understanding Load Balancing Options

Load balancing solutions differ in several key respects:

Performance: How many requests per second can the solution handle?
Manageability: How simple is it to configure and deploy the load balancing solution?
Failover automation and detection: How smart is the load balancer about detecting when a Client Access server or service has failed?
Affinity: Which types of client to Client Access server affinity does the load balancing solution support?

Understanding Affinity

When a load balancing solution provides client-to-Client Access server affinity, it means that there is a long-standing association between a particular client and a particular Client Access server. The client can be Outlook running on a laptop, Microsoft Exchange ActiveSync running on a mobile device, Microsoft Office Outlook Web App, Exchange Web Services, or another client application.

This long-standing association, or affinity, ensures that all requests sent from the client go to the same Client Access server. Some Exchange 2010 protocols require affinity and other Exchange protocols do not.

Windows Network Load Balancing

Windows Network Load Balancing (WNLB) is the most common software load balancer used for Exchange servers. There are several limitations associated with deploying WNLB with Microsoft Exchange.

WNLB can't be used on Exchange servers where mailbox DAGs are also being used because WNLB is incompatible with Windows failover clustering. If you're using an Exchange 2010 DAG and you want to use WNLB, you need to have the Client Access server role and the Mailbox server role running on separate servers.
Due to performance issues, we don't recommend putting more than eight Client Access servers in an array that's load balanced by WNLB.
WNLB doesn't detect service outages. WNLB only detects server outages by IP address. This means if a particular Web service, such as Outlook Web App, fails, but the server is still functioning, WNLB won’t detect the failure and will still route requests to that Client Access server. Manual intervention is required to remove the Client Access server experiencing the outage from the load balancing pool.
WNLB configuration can result in port flooding, which can overwhelm networks.
Because WNLB only performs client affinity using the source IP address, it's not an effective solution when the source IP pool is small. This can occur when the source IP pool is from a remote network subnet or when your organization is using network address translation.
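
The difference between detecting a server outage and detecting a service outage can be illustrated with a short Python sketch. This is only a model under stated assumptions: the `ServerStatus` type, the server names, and the service names `owa` and `eas` are hypothetical, and the health data is hard-coded rather than gathered by real network probes.

```python
from dataclasses import dataclass, field


@dataclass
class ServerStatus:
    """Hypothetical health snapshot of one Client Access server."""
    name: str
    host_reachable: bool  # what an IP-level check can see
    services_up: dict = field(default_factory=dict)  # service name -> bool


def server_level_pool(servers):
    """Servers kept in rotation when only host reachability is checked."""
    return [s.name for s in servers if s.host_reachable]


def service_level_pool(servers, service):
    """Servers kept in rotation when the named service is also checked."""
    return [
        s.name
        for s in servers
        if s.host_reachable and s.services_up.get(service, False)
    ]


servers = [
    ServerStatus("cas1", True, {"owa": True, "eas": True}),
    ServerStatus("cas2", True, {"owa": False, "eas": True}),  # service down, host up
    ServerStatus("cas3", False, {}),                          # host down
]

print(server_level_pool(servers))          # cas2 stays in the pool
print(service_level_pool(servers, "owa"))  # cas2 is removed for this service only
```

In this model, a reachability-only check keeps `cas2` in rotation even though one of its services has failed, which is the situation that requires manual intervention with WNLB.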

Load Balancing Recommendations

There are several load balancing options available. The option you use depends on the size and configuration of your network.

Windows Network Load Balancing with Source IP Affinity

The first load balancing option is WNLB with source IP affinity. This solution is suitable if you have more than one Client Access server per Active Directory site but no more than eight. This solution is built into Windows and doesn't require additional computers.

There are two scenarios in which you will not want to use WNLB.

Your organization has a reverse proxy server that communicates directly with the Client Access server and not through the WNLB virtual IP address. The reverse proxy server hides the client IP addresses from the Client Access server array. Therefore, source IP affinity won't work as expected. However, you may still want to use WNLB to load balance internal traffic.
Your organization has many clients accessing your Client Access servers through a very small set of IP addresses. WNLB tends to affinitize an entire class C subnet to one Client Access server.

Hardware Load Balancing

If you have more than eight Client Access servers in a single Active Directory site, your organization will need a more robust load balancing solution. Although there are robust software load balancing solutions available, a hardware load balancing solution provides the most capacity. For more information about Exchange 2010 server load balancing solutions, see Microsoft Unified Communications Hardware Load Balancer Deployment.

Hardware load balancers support very high traffic throughput and can be configured to load balance in many ways. Most hardware load balancer vendors have detailed documentation about how their product works with Exchange 2010. The simplest way to configure hardware load balancers is to create a fallback list of the affinity methods that will be applied by the load balancer. For example, the load balancer will try cookie-based affinity first, then SSL session ID, and then source IP affinity.
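
The fallback list described above can be sketched in Python as follows. This is a simplified illustration, not a vendor configuration: the request is modeled as a plain dictionary, and the key names `cookie`, `ssl_session_id`, and `source_ip` are assumptions made for this example.

```python
def choose_affinity_key(request, methods=("cookie", "ssl_session_id", "source_ip")):
    """Return (method, value) for the first affinity method that has a value.

    `request` is a hypothetical dict describing what the load balancer can
    see for one incoming connection. Methods are tried in the order given.
    """
    for method in methods:
        value = request.get(method)
        if value:
            return method, value
    return None, None


# A client that sends no usable cookie falls back to its SSL session ID.
request = {"cookie": None, "ssl_session_id": "a1b2c3", "source_ip": "192.0.2.15"}
print(choose_affinity_key(request))  # ('ssl_session_id', 'a1b2c3')
```

Whichever value is returned would then be used to look up, or assign, the Client Access server for that client.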

Reverse Proxy Solutions

If you have a reverse proxy solution that can perform load balancing for the servers it publishes to the Internet, such as Microsoft Forefront Threat Management Gateway (TMG) or Forefront Unified Access Gateway (UAG), we recommend that you use it.

As traffic passes through the reverse proxy server to reach your Client Access servers, the client's original IP address is replaced by the IP address of the reverse proxy server. This breaks source IP affinity. There are ways to resolve this problem, including configuring the reverse proxy server to be the default gateway for the subnet it is proxying to.

However, most current reverse proxy servers can do load balancing for the services they publish to the Internet. These reverse proxy servers support load balancer-created cookie load balancing for the Exchange services that support this. This solution is more reliable than source IP load balancing. For this to work, the reverse proxy server must be able to read and modify the HTTP data stream. If you're using SSL, this means that the reverse proxy server must decrypt the traffic to read the contents and create the cookie within the stream. This decryption isn't possible in some circumstances, such as when you're using client certificate authentication, where the client connects to the Client Access server.

Affinity Options

Different load balancing solutions offer different methods for associating clients with a specific Client Access server. There are several common types of affinity available in different load balancing products, both hardware and software. Not all types of affinity will be available in every load balancing option, as described in the following examples:

WNLB only supports source IP affinity or no affinity.
A software load balancer in a separate server array can use load balancer-created cookies for the protocols that support those cookies and source IP affinity for the remaining protocols.
Hardware load balancers with SSL offloading let you configure more complex behavior. For example, you can configure a set of existing cookies that will take effect for protocols that support those cookies, as well as a load balancer-created cookie, SSL session ID, and source IP.

In addition to the options that are supported by the different load balancing solutions, you can also configure some of these affinity methods to be applied only for certain Exchange protocols and services. Because each protocol behaves differently, this can help optimize performance.

Existing Cookies or HTTP Headers

Using existing cookies or HTTP headers is the most reliable way to identify a client and associate it with a specific Client Access server. These cookies and headers are created by the client or server as part of the communications protocol. This option also doesn't require the load balancer to modify the traffic, which helps performance.

When you use this affinity option, be aware of the following:

Your load balancer must support this type of affinity. Currently only hardware load balancers support this affinity.
This affinity only works for protocols that pass traffic on HTTP.
There must be an existing cookie or header that remains constant during the client session and is unique to each specific client, or small set of clients, in the protocol.
The load balancer solution must be able to read and interpret the HTTP data stream. If you're using SSL, this means that the load balancer must decrypt the traffic to read the contents. Sometimes this results in an increased load on the load balancer. Also, this decryption isn't possible in some circumstances, such as when you use client certificate authentication with the SSL session where the client connects to the Client Access server.

The existing cookies and HTTP headers suitable for load balancing that are available in Exchange 2010 protocols are the following:

HTTP Basic authentication authorization header: This header works when HTTP Basic authentication is used. Basic authentication is the default and most commonly used type of authentication for Exchange ActiveSync. This header is uncommon for other protocols and authentication methods. The Basic authentication authorization header sends all traffic that uses Basic authentication and that is from a specific user to the same Client Access server. This header is also used when Outlook traffic is transmitted completely via HTTP and clients are behind a reverse proxy server.

HTTP OWA UserContext cookie: This cookie works for Outlook Web App, which is the only client that uses it. When you use forms-based authentication (FBA) with Outlook Web App, which is the default configuration, a small set of requests are made at the start of an Outlook Web App session before the UserContext cookie is created. To ensure that those requests use affinity to connect the client to the same Client Access server, which is required for forms-based authentication to work, there has to be a fallback affinity option when you use the UserContext cookie. We recommend that you use the SSL session ID or source IP affinity as a fallback to provide affinity for those initial requests, before the load balancer gets the UserContext cookie to use.

Note:
Outlook Web App requests that use explicit logon to access a specific mailbox result in the use of a UserContext cookie with a different name and ID. The cookie starts with UserContext, but a string that identifies the individual mailbox is appended. This complicates load balancing with the UserContext cookie because the load balancer must first find a cookie starting with UserContext. This can result in decreased performance.

HTTP Exchange Control Panel msExchEcpCanary cookie: This cookie only works for the Exchange Control Panel.

HTTP Outlook 2010 OutlookSession cookie: Hardware load balancers support the OutlookSession cookie and other generic cookies. The following table describes the OutlookSession client cookie support requirements for Outlook RPC/HTTP:

	Windows XP	Windows Vista	Windows 7
Outlook 2003	Not supported	Not supported	Not supported
Outlook 2007	Not supported	Not supported	Not supported
Outlook 2007 Hosting Pack (KB2544404)	Not supported	Supported	Supported
Outlook 2010	Not supported	Supported	Supported

Note:
Microsoft Outlook running on Windows XP does not support the OutlookSession cookie for load balancing. In this scenario, we recommend that you use IP load balancing.

HTTP Remote PowerShell MS-WSMAN cookie: This method works only for Remote PowerShell.
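
To illustrate how a load balancer might pick an affinity key from the existing cookies and headers listed above, the following Python sketch inspects a set of request headers. It is a minimal model, not an actual load balancer rule: the order in which the names are checked, the helper function names, and the sample values (including the suffix appended to `UserContext`) are assumptions made for this example. It shows why an explicit-logon UserContext cookie needs a prefix search instead of an exact name match.

```python
# Cookie names taken from the list above; the lookup order is an assumption.
EXACT_COOKIE_NAMES = ("msExchEcpCanary", "OutlookSession", "MS-WSMAN")
USER_CONTEXT_PREFIX = "UserContext"


def parse_cookie_header(header):
    """Split a Cookie request header into a name -> value dict."""
    cookies = {}
    for part in header.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep and name:
            cookies[name] = value
    return cookies


def existing_affinity_key(headers):
    """Return (name, value) of the first usable existing cookie or header, or None.

    `headers` is a dict of request header names to values; header names are
    assumed to be normalized already (for example, "Cookie", "Authorization").
    """
    cookies = parse_cookie_header(headers.get("Cookie", ""))
    for name in EXACT_COOKIE_NAMES:
        if name in cookies:
            return name, cookies[name]
    # Explicit logon appends a mailbox-specific string to the cookie name,
    # so every cookie has to be scanned for the prefix.
    for name, value in cookies.items():
        if name.startswith(USER_CONTEXT_PREFIX):
            return name, value
    authorization = headers.get("Authorization", "")
    if authorization.startswith("Basic "):
        return "Authorization", authorization
    return None


# Hypothetical explicit-logon request: the cookie name carries a suffix.
print(existing_affinity_key({"Cookie": "lang=en; UserContext_abc123=f00d"}))
```

When no cookie or header matches, the function returns `None`, which corresponds to the point where a fallback affinity option such as SSL session ID or source IP would take over.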

Load Balancer Created Cookie

The second most reliable way to associate a client session with a Client Access server is by using a load balancer-created cookie. The load balancer adds an HTTP cookie to the client/server protocol conversation and then uses that cookie to determine which Client Access server should handle an incoming request. The Exchange 2010 applications that support this method are Outlook Web App, Exchange Control Panel, and Remote PowerShell. This type of cookie has several limitations.

The load balancer must support this type of affinity. Currently only hardware load balancers and software load balancers that run on a separate server tier support this affinity.
This method only works for protocols that pass traffic on HTTP. You can't use this method for the RPC Client Access service, Exchange Address Book service, POP3, or IMAP4.
The load balancer solution must be able to read and interpret the HTTP data stream. If you're using SSL, this means that the load balancer must decrypt the traffic to read the contents. Sometimes this results in a bigger load on the load balancer. In other cases, it isn't possible for the load balancer to interpret the HTTP data stream, such as when you use client certificate authentication on the Client Access server.
The client must be able to receive arbitrary cookies from the server and must then include those cookies in all future requests sent from the client to the server. Exchange ActiveSync clients, Outlook Anywhere clients, and some Exchange Web Service clients such as Microsoft Office Communications Server 2007 devices don't support this.
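
The following Python sketch models load balancer-created cookie affinity and the last limitation in the list above. It is an illustration under stated assumptions: the class name, the cookie name `LBServer`, and the round-robin choice for new clients are hypothetical and don't describe any specific product.

```python
import itertools


class CookieAffinityBalancer:
    """Toy model of affinity based on a cookie that the load balancer creates."""

    COOKIE_NAME = "LBServer"  # hypothetical cookie name

    def __init__(self, servers):
        self.servers = list(servers)
        self._round_robin = itertools.cycle(self.servers)

    def route(self, request_cookies):
        """Return (server, cookies_to_set) for one request.

        If the client returned a valid affinity cookie, reuse that server.
        Otherwise pick the next server and ask the client to store the cookie.
        """
        server = request_cookies.get(self.COOKIE_NAME)
        if server in self.servers:
            return server, {}
        server = next(self._round_robin)
        return server, {self.COOKIE_NAME: server}


balancer = CookieAffinityBalancer(["cas1", "cas2"])

# A client that stores and returns the cookie stays on one server.
first_server, cookies = balancer.route({})
assert balancer.route(cookies) == (first_server, {})

# A client that never returns cookies is treated as new on every request.
print(balancer.route({})[0], balancer.route({})[0])
```

The second case shows why clients that can't return arbitrary cookies don't get affinity from this method and need a different affinity option.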

SSL Session ID

Load balancing based on the SSL session ID provides more detail than source IP affinity and lets you split up traffic from different clients even if those clients are coming in from the same IP address. SSL session ID load balancing also has the advantage of letting you load balance without decrypting the SSL traffic. This is required when you use client certificate authentication and when you end the SSL connection at the Client Access server.

SSL session ID affinity isn't recommended in the following two situations:

Some clients, such as Internet Explorer 8, re-create their SSL session for each browser process that runs on the client computer. This results in a new SSL session for each Outlook Web App window. Because this breaks client affinity for Outlook Web App, deploying load balancing in this manner is not supported for Exchange 2010. Some mobile devices, such as the Apple iPhone, also create new SSL sessions for some parts of their Exchange ActiveSync communication with Exchange.

Note:
When you use client certificate authentication, browsers will use the same SSL session for all traffic to a given host name. As long as client certificate authentication is enabled, SSL session ID is a valid affinity option for Outlook Web App and Exchange Control Panel.

In the case of Outlook Anywhere, the Client Access servers will use the Windows RPC Proxy component to pair up the RPC_DATA_IN and the RPC_DATA_OUT connections. This can adversely affect performance.

Source IP

The most common way to provide affinity between clients and Client Access servers is by using source IP affinity. The load balancer examines a client's IP address and sends all traffic from a specific source IP to a specific Client Access server. This is the only type of affinity supported by WNLB. There are two important aspects to consider when you use source IP affinity.

Affinity breaks when the client changes IP address. This can occur when a laptop is moved from a wired LAN to Wi-Fi or roams between different Wi-Fi networks. There is a user impact when the client changes IP address. For example, when they're using Outlook Web App, users will have to authenticate every time their computer obtains a new IP address.
If many of your clients access your load balancing solution from the same IP address, the load distribution will become uneven. The impact of this depends on how many clients are masked behind a given IP address. For example, if you have four Client Access servers and 50 percent of your clients access your load balancer from the same IP address, at least 50 percent of your traffic will go to one Client Access server and the other three Client Access servers will handle the rest of the traffic. There are two main reasons why most clients will access your Exchange organization through a single IP address.
- Network address translators (NATs) or outgoing proxy servers, such as Microsoft Forefront Threat Management Gateway (TMG). When there is a NAT or outgoing proxy server between your clients and your Client Access servers, the original client IP addresses are masked by the NAT or outgoing proxy server IP address.
- Client Access server to Client Access server proxy traffic. In some scenarios, one Client Access server proxies traffic to another Client Access server. Typically, this happens between Active Directory sites because a client must access the Client Access server within the same Active Directory site as their mailbox. For more information about proxying, see Understanding Proxying and Redirection.
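
The uneven distribution described above can be reproduced with a small Python sketch. It assumes a generic hash of the source IP address as the server selection rule, which is a stand-in for illustration and not the algorithm that WNLB or any particular load balancer uses. The IP addresses and server names are example values.

```python
import hashlib
from collections import Counter


def pick_server(source_ip, servers):
    """Map a source IP address to one server with a stable hash (illustrative only)."""
    digest = hashlib.sha256(source_ip.encode("ascii")).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]


def distribution(client_ips, servers):
    """Count how many client connections land on each server."""
    return Counter(pick_server(ip, servers) for ip in client_ips)


servers = ["cas1", "cas2", "cas3", "cas4"]

# 50 clients hidden behind one NAT address, plus 50 clients with their own addresses.
nat_clients = ["203.0.113.10"] * 50
direct_clients = ["198.51.100.%d" % i for i in range(1, 51)]

counts = distribution(nat_clients + direct_clients, servers)
print(counts)

# Every client behind the NAT address lands on the same server,
# so that server receives at least half of all connections.
assert max(counts.values()) >= 50
```

Because the load balancer sees only one address for all clients behind the NAT device, no choice of hash function can spread those clients across servers.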

No Affinity

The last type of affinity is no affinity. When you don't use affinity, each request from a client is assigned to a random Client Access server. We don't recommend this option for protocols that require affinity or those that experience performance benefits from affinity.

When SSL offloading is configured, we recommend that you don't use affinity for protocols that don't need it.

Summary of Load Balancing Options

The following table provides a summary of the load balancing options that are available.

Solution	Client to Client Access server affinity	Failover method	Capacity	Cost
Hardware load balancer	Depending on the protocol and the client, fall back between the following: existing cookie, load balancer-created cookie, SSL session ID, source IP.	Automatic failover with minimal client downtime. Hardware load balancers also are able to provide failover for a specific protocol.	++++	$$$
Software load balancer in a separate server layer (Note: TMG and UAG are the only workable solutions for external traffic.)	Either load balancer-created cookie or source IP, depending on the protocol and client.	Automatic failover with minimal client downtime.	++	$$
Software load balancer in the same server layer as the Client Access server (WNLB)	Source IP.	Automatic failover with minimal client downtime.	+	$
DNS round robin	Each client gets a random Client Access server IP address.	Manual steps to detect issues and recover. Browser and operating system DNS caching behavior may inhibit client connections even after recovery has been performed by an administrator. This solution breaks affinity for many protocols, including RPC Client Access, Outlook Web App, Exchange Web Services, and Exchange Control Panel.	+++	$
No load balancer	Separate host names are manually assigned for each Client Access server.	Manual steps to detect issues and fail over. Client DNS caches cause slow failover.	+	N/A

There are several advantages and disadvantages to each of these options.

Hardware load balancers usually include performance and security functionality such as SSL offloading and traffic inspection.
Software load balancers in a separate server layer are usually included as parts of larger software packages, with reverse proxy capabilities like pre-authentication, SSL offloading, and extensive traffic inspection. When software load balancers pre-authenticate users, those users don't need to re-authenticate if the Client Access server they are affinitized to fails. However, some software load balancers require an affinity between the client and the reverse proxy server. In this case, you need an additional load balancing layer in front of the reverse proxy servers before those reverse proxy servers can perform load balancing duties for your Client Access servers.