High availability refers to a system’s ability to minimize downtime and continue normal operation despite disruptions caused by hardware, software, or service requirements. Fault tolerance refers specifically to reducing the risk of service disruption in the event of system or component failure. Designing fault tolerance into your messaging infrastructure is essential for ensuring high availability for your Office Communications Server 2007 R2 deployment.

Planning for high availability is critical to deploying Office Communications Server 2007 R2 Enterprise Edition. This section discusses Office Communications Server 2007 R2 features that support high availability and the various options and strategies that need to be considered before the first server is installed.

If your organization requires that your Office Communications Server 2007 R2 topology offer high availability, you will want to deploy one or more Enterprise pools in your internal topology. If high availability is not a consideration and simplicity and economy are more important, Standard Edition may be an appropriate choice. You can also support high availability in your perimeter network if required.

Standard Edition

Standard Edition provides all IM, presence, and conferencing components, including data storage, on a single computer. This is an efficient, economical solution for organizations with a relatively small number of users who are based at a single location and whose IM and online conferencing requirements are not mission critical. A Standard Edition server monitors its own state and, in the event of failure, restarts automatically without loss of files, meeting content, or meeting schedules. Meetings and conversations in progress, however, are interrupted, and the interruption may persist for a prolonged period, depending on the reason for the failure.

Because a Standard Edition server represents a single point of failure, we do not recommend it for mission-critical deployments where high availability is essential. For such deployments, Enterprise Edition is the necessary choice.

Enterprise Edition

The architecture of Office Communications Server 2007 R2 Enterprise Edition reduces single points of failure through the use of multiple Enterprise Edition servers and a dedicated Back-End Database server. For greater redundancy, the database can be clustered in a multiple-node active-passive configuration. Office Communications Server 2007 R2 also provides mechanisms for automatically reconnecting clients. Momentary interruptions and terminated sessions can occasionally occur, but the system is largely immune to total outages.

Important:
The back-end database must be installed on a separate physical computer from any Enterprise Edition server. For Enterprise Edition, collocating the back-end database with any Office Communications Server role is not supported. Additionally, Office Communications Server requires a separate SQL Server instance that is not shared with any other server application. In a multiple-node cluster, the Office Communications Server SQL instance must be able to fail over to a passive node that, for performance reasons, should not be shared by any other SQL instance.

The multiple Front End Servers that make up an Enterprise pool provide high availability: if a single Front End Server fails, clients detect the failure and automatically reconnect to one of the other available Front End Servers. Meeting state is preserved because a meeting is hosted by the pool, not by any single server. Multiple Front End Servers also make it possible to take any given server offline for hardware or software updates with minimal service interruption. When a server goes down due to hardware or network failure, the clients using that server for IM, presence, and conferencing experience an interruption until they reconnect to another Front End Server and resume service.

Locating the pool’s SQL Server databases on a back-end server, or a cluster of back-end servers, separate from the Front End Servers not only insulates the databases from possible Front End failure but also improves overall throughput and Front End performance.

Perimeter Network

If you plan to enable external access in a highly available topology, you will want to deploy multiple consolidated Edge Servers connected to a hardware load balancer (referred to as an array of consolidated Edge Servers) in your perimeter network. Conversely, if your organization does not require high availability in the perimeter network, you can deploy a single consolidated Edge Server. For details, see Planning for External User Access.

Group Chat

If you plan to deploy Group Chat, you can deploy a topology that offers high availability. For details, see Planning for External User Access.

For increased scalability and higher availability of your Group Chat installation, you may deploy up to a total of five Group Chat Servers. Each server supports 2,000 users when Capacity Planning recommendations are followed, for a total of 10,000 users.
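The sizing arithmetic above can be sketched as a small calculation. This is an illustrative planning aid only, not part of any Office Communications Server tooling: the function name `group_chat_servers_needed` and the optional `spare` parameter are assumptions introduced here, while the figures of 2,000 users per server and a maximum of five servers come from the text.

```python
import math

USERS_PER_SERVER = 2000  # per-server capacity stated in the text
MAX_SERVERS = 5          # maximum number of Group Chat Servers stated in the text


def group_chat_servers_needed(users: int, spare: int = 0) -> int:
    """Return how many Group Chat Servers a user count calls for.

    `spare` is a hypothetical planning knob: extra servers added so that
    the remaining servers can still carry the full load if one fails.
    """
    if users < 0 or spare < 0:
        raise ValueError("users and spare must be non-negative")
    # At least one server is needed even for a very small deployment.
    needed = max(1, math.ceil(users / USERS_PER_SERVER)) + spare
    if needed > MAX_SERVERS:
        raise ValueError(
            f"{needed} servers required, but at most {MAX_SERVERS} are supported"
        )
    return needed
```

For example, 4,500 users call for three servers, and adding one spare for failure headroom brings the plan to four. A request for 10,000 users with a spare exceeds the five-server limit and raises an error, which reflects the trade-off between maximum capacity and headroom.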

The Group Chat Servers will handle new user connections using a simple round-robin load balancing algorithm. If any server in the installation should fail, Group Chat clients will automatically reconnect, and they will be redirected to one of the remaining servers.

Archiving and Compliance

If your organization must meet compliance requirements to archive IM messages, you can deploy the Archiving Server with a topology that offers high availability. For details, see Archiving Support.