Applies to: Exchange Server 2007 SP3, Exchange Server 2007 SP2, Exchange Server 2007 SP1, Exchange Server 2007
Topic Last Modified: 2009-04-01
This High Availability content area includes topics that you can use to design, build, and operate a highly available messaging system based on the release to manufacturing (RTM) version of Microsoft Exchange Server 2007 and Exchange 2007 Service Pack 1 (SP1). The documentation in this area includes:
- Single Copy
Clustered Mailbox Server Setup
Clustered Mailbox Servers to Exchange 2007 SP1 or SP2
Exchange 2007 Update Rollups to Clustered Mailbox Servers
Clustered Mailbox Servers
- Troubleshooting High
We recommend that you review the applicable documentation prior to designing or deploying a highly available messaging solution based on Exchange 2007 SP1.
The documentation in this area has been updated to include the latest recommendations and best practices for deploying Exchange 2007 SP1 on Windows Server 2008 and Windows Server 2003 Service Pack 2 (SP2).
High Availability for Exchange Server 2007
While minimum uptime requirements vary among organizations, every organization would like to achieve a high level of uptime. Organizations for which messaging is business-critical often choose to design a highly available messaging system to provide this uptime.
Exchange 2007 RTM and Exchange 2007 SP1 include the following built-in features that can provide quick recovery, high availability, and site resilience for Exchange 2007 Mailbox servers:
- Local continuous replication (LCR): LCR
is a single-server solution that uses built-in asynchronous log
shipping technology to create and maintain a copy of a storage
group on a second set of disks that are connected to the same
server as the production storage group. LCR provides log shipping,
log replay, and a quick manual switch to a secondary copy of the
- Cluster continuous replication
(CCR): CCR, which is a non-shared storage
failover cluster solution, is one of two types of clustered mailbox
server (CMS) deployments available in Exchange 2007. CCR is a
clustered solution (referred to as a CCR environment) that
uses built-in asynchronous log shipping technology to create and
maintain a copy of each storage group on a second server in a
failover cluster. CCR is designed to be either a one- or
two-datacenter solution, providing both high availability and
site resilience. CCR is very different from clustering in previous
versions of Exchange Server. For details about some of the
differences, see Cluster Continuous
Replication Resource Model and Cluster Continuous
Replication Recovery Behavior.
- Standby continuous replication
(SCR): SCR is a new feature introduced in
Exchange 2007 SP1. As its name implies, SCR is designed for
scenarios that use or enable the use of standby recovery servers.
SCR extends the existing continuous replication features and
enables new data availability scenarios for Exchange 2007
Mailbox servers. SCR uses the same log shipping and replay
technology used by LCR and CCR to provide added deployment options
and configurations by providing the administrator with the ability
to create additional storage group copies. SCR can be used to
replicate data from stand-alone Mailbox servers and from clustered
- Single copy clusters (SCC): SCC, which
is a shared storage failover cluster solution, is the other of two
types of clustered mailbox server deployments available in
Exchange 2007. SCC is a clustered solution that uses a single
copy of a storage group on storage that is shared between the nodes
in the cluster. SCC is somewhat similar to clustering in previous
versions of Exchange Server; however, along with numerous
improvements, there are also some significant changes. For details
about some of those changes, see Single Copy Cluster
Resource Model and Single Copy Cluster
For details about other high availability features and functionality introduced in SP1, see New High Availability Features in Exchange 2007 SP1.
High Availability for Mailbox Servers
High availability for Mailbox servers comes in two forms: service availability and data availability. Service availability is provided through the use of a Windows Server failover cluster. Data availability is provided through a built-in feature called continuous replication.
Clustered Mailbox Servers
Both CCR and SCC are solutions that are deployed in a Windows Server failover cluster. Only the Mailbox server role can be installed in a failover cluster. No other roles can be installed in a failover cluster. A Mailbox server that is deployed in a failover cluster is referred to as a clustered mailbox server. Clustered mailbox servers running in a CCR environment are very different from clustered mailbox servers running in an SCC environment. Furthermore, clustered mailbox servers in Exchange 2007 RTM and Exchange 2007 SP1 are very different from clustered mailbox servers in previous versions of Microsoft Exchange.
You can use Get-MailboxServer <CMSName> | fl Name, ClusteredStorageType in the Exchange Management Shell to determine whether a clustered mailbox server is hosted in a CCR environment or in an SCC. A value of NonShared indicates that the clustered mailbox server is in a CCR environment, and a value of Shared indicates that the clustered mailbox server is in an SCC. A value of Disabled indicates that the Mailbox server is a stand-alone server.
You can also check Active Directory to determine if a clustered mailbox server is hosted in a CCR environment or in an SCC by examining the value for the msExchClusterStorageType attribute of the Mailbox server object. A value of 1 for the msExchClusterStorageType attribute indicates that the clustered mailbox server is hosted in a CCR environment, and a value of 2 indicates that the clustered mailbox server is in an SCC. A value of <Not Set> indicates that the Mailbox server is a stand-alone server.
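The two lookups described above can be summarized as a small mapping. The following is a minimal Python sketch for illustration only; it does not query Exchange or Active Directory, and the function names `deployment_from_shell` and `deployment_from_ad` are hypothetical. Only the values themselves (NonShared, Shared, Disabled, 1, 2, and <Not Set>) come from the text.

```python
from typing import Optional

# ClusteredStorageType values shown by Get-MailboxServer, as described in the text.
SHELL_VALUES = {
    "NonShared": "CCR",
    "Shared": "SCC",
    "Disabled": "stand-alone",
}

# msExchClusterStorageType attribute values; None stands in for <Not Set>.
AD_VALUES = {
    1: "CCR",
    2: "SCC",
    None: "stand-alone",
}


def deployment_from_shell(clustered_storage_type: str) -> str:
    """Translate a ClusteredStorageType value into a deployment type (hypothetical helper)."""
    if clustered_storage_type not in SHELL_VALUES:
        raise ValueError(f"unexpected ClusteredStorageType: {clustered_storage_type!r}")
    return SHELL_VALUES[clustered_storage_type]


def deployment_from_ad(attribute_value: Optional[int]) -> str:
    """Translate an msExchClusterStorageType value into a deployment type (hypothetical helper)."""
    if attribute_value not in AD_VALUES:
        raise ValueError(f"unexpected msExchClusterStorageType: {attribute_value!r}")
    return AD_VALUES[attribute_value]


if __name__ == "__main__":
    print(deployment_from_shell("NonShared"))
    print(deployment_from_ad(2))
    print(deployment_from_ad(None))
```

Both helpers raise an error for any value the text does not list, rather than guessing at a deployment type.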
Exchange 2007 RTM and Exchange 2007 SP1 support a maximum of two nodes that have the Mailbox server role installed (one active and one passive) in a CCR environment. A three-node failover cluster that uses a voter node and a traditional Majority Node Set quorum is also supported, but it is not the preferred cluster model. We recommend that most customers deploy CCR environments that use only two nodes, and either a Node and File Share Majority quorum (Windows Server 2008) or a Majority Node Set with File Share Witness quorum (Windows Server 2003). Thus, the documentation about CCR is oriented toward two-node failover clusters that use one of these quorum models.
Note: A single-node failover cluster deployed in a CCR environment is also supported, but it is not considered to be a high availability solution because no redundancy exists in the cluster. When using a single-node failover cluster deployed in a CCR environment, you should use a Majority Node Set quorum (traditional, without a file share witness).
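The quorum models mentioned above all rest on majority voting. The following is a minimal Python sketch of that arithmetic, assuming that each node and the file share witness contribute one vote each; the function names are hypothetical, and the sketch does not reproduce how the Windows cluster service actually evaluates quorum.

```python
def votes_needed(total_votes: int) -> int:
    """Smallest number of votes that forms a strict majority."""
    if total_votes < 1:
        raise ValueError("a cluster needs at least one vote")
    return total_votes // 2 + 1


def tolerated_vote_losses(total_votes: int) -> int:
    """How many votes can be lost while a majority still remains."""
    return total_votes - votes_needed(total_votes)


if __name__ == "__main__":
    # Assumed vote counts for the configurations named in the text.
    configurations = {
        "two nodes plus file share witness": 3,
        "three nodes (two Mailbox nodes plus a voter node)": 3,
        "single node, traditional Majority Node Set": 1,
    }
    for name, votes in configurations.items():
        print(f"{name}: {tolerated_vote_losses(votes)} vote(s) can be lost")
```

Under these assumptions, the single-node case tolerates no lost votes, which matches the statement that it is not a high availability solution.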
Single Copy Clusters
Exchange 2007 RTM and Exchange 2007 SP1 support a maximum of eight nodes in an SCC. Valid combinations of Exchange 2007 SP1 SCCs on Windows Server failover clusters include:
- 7 Active / 1 Passive
- 6 Active / 1 or 2 Passive
- 5 Active / 1, 2, or 3 Passive
- 4 Active / 1, 2, 3, or 4 Passive
- 3 Active / 1, 2, 3, 4, or 5 Passive
- 2 Active / 1, 2, 3, 4, 5, or 6 Passive
- 1 Active / 0, 1, 2, 3, 4, 5, 6, or 7 Passive
Note: The 64-bit version of Windows Server 2008 supports up to 16 nodes in a single failover cluster; however, Exchange 2007 supports a maximum of 8 nodes in the cluster. The failover cluster can still contain up to 16 nodes, but Exchange 2007 should be installed on no more than 8 nodes in the failover cluster.
Typically, there is no need for more than one passive node in the cluster for each active node in the cluster. As a result, a configuration of one active node and one passive node is preferred over configurations with one active node and multiple passive nodes. When using a single node SCC, you can use either a shared storage quorum, or a Majority Node Set quorum (traditional, without a file share witness). Although single-node SCCs are supported, they are not considered to be a high availability solution because no redundancy exists within the cluster.
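The list of valid SCC combinations above follows a simple pattern, which can be sketched in Python. This is an illustration inferred from the listed combinations, not a rule taken from the product itself, and the name `is_valid_scc` is hypothetical.

```python
MAX_EXCHANGE_NODES = 8  # maximum number of nodes Exchange 2007 supports in an SCC


def is_valid_scc(active: int, passive: int) -> bool:
    """Return True if the active/passive combination appears in the list above."""
    if active < 1 or passive < 0:
        return False
    if active + passive > MAX_EXCHANGE_NODES:
        return False
    # The list shows zero passive nodes only for the single-active-node case.
    if passive == 0:
        return active == 1
    return True


def valid_combinations():
    """Enumerate every (active, passive) pair accepted by is_valid_scc."""
    return [
        (active, passive)
        for active in range(MAX_EXCHANGE_NODES, 0, -1)
        for passive in range(0, MAX_EXCHANGE_NODES)
        if is_valid_scc(active, passive)
    ]


if __name__ == "__main__":
    for active, passive in valid_combinations():
        print(f"{active} Active / {passive} Passive")
```

The check treats a cluster with no passive nodes as valid only when there is a single active node, because that is the only such case the list includes.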
A stretch cluster, also known as a geographically dispersed cluster, is a failover cluster that spans more than one physical datacenter. Stretch clusters can be used as part of a site resilience design for your Exchange organization. Because CCR does not use shared storage, it can be easily deployed in a geographically dispersed failover cluster, including a multi-subnet stretch cluster on Windows Server 2008. SCC is also supported in a stretch cluster; however, stretching SCC requires third-party synchronous replication technology. For more information about stretch clusters, see Site Resilience Configurations.
Another type of cluster that is supported by Exchange 2007 and Exchange 2007 SP1 is called a standby cluster. A standby cluster is a Windows Server failover cluster that does not contain a clustered mailbox server, but can be quickly provisioned with a replacement clustered mailbox server in the event of a disaster, another failure of the production failover cluster, or some other recovery scenario.
Continuous replication, also known as log shipping, is the process of automating the replication of closed transaction log files from a production storage group to a copy of that storage group that is located on a second set of disks on the local computer or on another server altogether. After being copied to the second location, the log files are then replayed into the copy of the database, thereby keeping the storage groups synchronized with a slight time lag.
Continuous replication is available in two forms in Exchange 2007 RTM (LCR and CCR) and three forms in Exchange 2007 SP1 (LCR, CCR, and SCR).
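The copy-then-replay cycle described above can be illustrated with a toy model. The following Python sketch is a simplification under stated assumptions: records are plain strings, a "log" is a list of records, and the class names `ActiveStorageGroup` and `PassiveCopy` are hypothetical. It does not reflect the Exchange database or log file formats.

```python
class ActiveStorageGroup:
    """Toy production storage group: writes go to the database and to an open log."""

    def __init__(self):
        self.database = []
        self.closed_logs = {}      # log generation number -> list of records
        self._open_log = []
        self._next_generation = 1

    def write(self, record):
        self.database.append(record)
        self._open_log.append(record)

    def close_log(self):
        """Close the current log so that it becomes eligible for shipping."""
        self.closed_logs[self._next_generation] = list(self._open_log)
        self._open_log = []
        self._next_generation += 1


class PassiveCopy:
    """Toy copy of the storage group that receives and replays closed logs."""

    def __init__(self):
        self.database = []
        self.copied_logs = {}
        self.last_replayed = 0

    def copy_from(self, source):
        """Copy any closed logs that have not been copied yet."""
        for generation, records in source.closed_logs.items():
            if generation not in self.copied_logs:
                self.copied_logs[generation] = list(records)

    def replay(self):
        """Replay copied logs in generation order into the copy of the database."""
        while self.last_replayed + 1 in self.copied_logs:
            self.last_replayed += 1
            self.database.extend(self.copied_logs[self.last_replayed])


if __name__ == "__main__":
    active, passive = ActiveStorageGroup(), PassiveCopy()
    active.write("message 1")
    active.write("message 2")
    active.close_log()
    active.write("message 3")      # still in the open log, so not shipped yet
    passive.copy_from(active)
    passive.replay()
    print(active.database)
    print(passive.database)
```

In this model the copy trails the production database by whatever is still in the open log, which corresponds to the slight time lag mentioned in the text.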
High Availability for Other Server Roles
High availability for the Hub Transport, Edge Transport, Client Access, and Unified Messaging server roles is achieved through a combination of server redundancy, Network Load Balancing (NLB), hardware load balancing, Domain Name System (DNS) round robin, as well as proactive server, service, and infrastructure management. In general, you can achieve high availability for the Client Access, Hub Transport, Edge Transport, and Unified Messaging server roles by using the following strategies and technologies:
- Edge Transport: You can deploy multiple
Edge Transport servers and use multiple DNS Mail Exchanger (MX)
records to load balance activity across those servers. You can also
use NLB to provide load balancing and high availability for Edge
- Client Access: You can use NLB or a
third-party hardware-based network load-balancing device for Client
Access server high availability. For more information about NLB,
see Windows Server TechCenter.
- Hub Transport: You can deploy multiple
Hub Transport servers for internal transport high availability.
Resiliency has been designed into the Hub Transport server role in
the following ways:
- Hub Transport server to Hub Transport server
(intra-org): Hub Transport server to Hub
Transport server communication inside an organization automatically
load balances between available Hub Transport servers in the target
Active Directory directory service site.
- Mailbox server to Hub Transport server (intra-Active
Directory site): The Microsoft Exchange
Mail Submission service on Mailbox servers automatically load
balances between all available Hub Transport servers in the same
Active Directory site.
- Unified Messaging server to Hub Transport
server: The Unified Messaging server
automatically load balances connections between all available Hub
Transport servers in the same Active Directory site.
- Edge Transport server to Hub Transport
server: The Edge Transport server automatically
load balances inbound Simple Mail Transfer Protocol (SMTP) traffic
to all Hub Transport servers in the Active Directory site
to which the Edge Transport server is subscribed.
- Unified Messaging: Unified Messaging
deployments can be made more resilient by deploying multiple
Unified Messaging servers where two or more are in a single dial
plan. The Voice over IP (VoIP) gateways supported by Unified
Messaging can be configured to route calls to Unified Messaging
servers in a round-robin fashion. In addition, these gateways can
retrieve the list of servers for a dial plan from DNS. In either
case, the VoIP gateways will present a call to a Unified Messaging
server and if the call is not accepted, the call will be presented
to another server, providing redundancy at the time the call is
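The round-robin call routing described for Unified Messaging, where a call that is not accepted is presented to another server, can be sketched as follows. This is a minimal Python illustration, not the behavior of any particular VoIP gateway; the `Gateway` class, the `route_call` method, and the server names are hypothetical.

```python
class Gateway:
    """Toy VoIP gateway that presents calls to servers in round-robin order."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._next = 0  # index of the server that receives the next call first

    def route_call(self, accepts):
        """Present a call to each server in turn until one accepts.

        `accepts` is a callable that takes a server name and returns True
        if that server accepts the call. Returns the accepting server,
        or None if no server accepts.
        """
        count = len(self.servers)
        for offset in range(count):
            server = self.servers[(self._next + offset) % count]
            if accepts(server):
                # The next call starts with the server after the one that accepted.
                self._next = (self._next + offset + 1) % count
                return server
        return None


if __name__ == "__main__":
    gateway = Gateway(["um-server-1", "um-server-2", "um-server-3"])
    unavailable = {"um-server-2"}
    for _ in range(4):
        print(gateway.route_call(lambda server: server not in unavailable))
```

With one server unavailable, calls in this sketch skip it and continue to alternate between the remaining servers.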
Achieving High Availability with Data and Service Redundancy
The basic premise of the Exchange 2007 high availability architecture is to introduce redundancy into the deployment. A failure is recovered using the remaining computing resources to support the Exchange services. As the failures are repaired, computing resources are again available to Exchange and its clients. In this context, the computing resources may be computers or storage for mailbox or other Exchange data.
Redundancy can be introduced within a single datacenter. This approach is typically done to protect against individual server failure. For example, introducing a second Hub Transport server into your organization's primary datacenter enables mail flow to continue if one of the two servers fails.
Alternatively, or in addition, redundancy can be introduced into a secondary datacenter. Two-datacenter configurations enable service continuity after a datacenter failure. If an additional Hub Transport server is introduced into a secondary datacenter, the second Hub Transport server can handle mail flow when the primary Hub Transport server experiences a failure, or when the production datacenter is unavailable. If three Hub Transport servers are deployed, two of them can be in the production datacenter and the third can be in the secondary datacenter.
The key deployment point is that redundancy can prevent outages that, without redundancy, result from a variety of failures. How the redundant computers and services are deployed determines the failures that can occur without affecting data or service availability. Organizations must understand their requirements and then look at the operational issues to understand what solution is best for them. For example, one organization may want to activate a backup datacenter only after a 20-minute failure of the production datacenter. In this case, the organization must have the necessary processes in place to regularly validate backup datacenter activation and operation. A different organization may decide that ongoing validation of the backup datacenter is critical for its success, thereby leading to a different deployment configuration for that organization.