To choose and perform the backup and restoration procedures for Office Communications Server 2007 R2 that best suit your organization, you first need to define a backup and restoration strategy. This includes the following:
- Establishing business priorities
- Establishing backup and restoration requirements
- Establishing a disaster recovery plan
- Understanding and applying best practices for backup and
restoration
- Understanding and applying best practices for minimizing the
impact of a disaster
Business Priorities
Evaluate the business priorities of your organization. Typically, the primary business priorities that affect your backup and restoration strategy are the following:
- Business continuity requirements
- Data completeness
- Data criticality
- Portability requirements
- Cost constraints
Backup and Restoration Requirements
Your business priorities should drive the specification of backup and restoration requirements for your organization. In general, the backup and restoration requirements you establish for your organization should address the following:
- Frequency of backups. Back up frequently. As a best practice,
take regular, periodic snapshots throughout the day. Generally, you
should perform full backups every 24 hours. If you use the simple
recovery model, databases can be recovered only to the point of the
last backup. Because you cannot restore the database to the point
of failure or to a specific point in time, you must manually
re-create any data entered since the most recent backup.
- Backup and restoration tools. This guide covers the use of
Office Communications Server 2007 R2 tools, including the
following:
- Using LCSCmd.exe to back up most settings, except Group Chat
and Communicator Web Access settings and configurations.
- Copying of Group Chat computer settings file, which cannot be
backed up by using LCSCmd.exe.
- Using Communicator Web Access to export and import virtual
server configurations.
- Using SQL Server Management Studio Express in SQL Server 2005
Express Edition with SP2 to back up data for a Standard Edition
server.
- Using SQL Server Management Studio in SQL Server 2008 and SQL
Server 2005 with SP2 to back up databases on all servers except the
database on each Standard Edition server.
- Using file system backup mechanisms (for files such as those
that contain meeting content, meeting compliance logs, client
update files and device update files used by Device Update Service,
and the file that contains Group Chat computer settings).
- Using the Office Communications Server 2007 R2 snap-in to
validate services after restoration and to perform other tasks
related to restoration of services.
- Administrative computer. You can install and run Office
Communications Server 2007 R2 administrative tools on any
appropriate computer. Your backup and restoration strategy should
specify the computers to be used for backup and restoration
procedures.
- Backup location. The backup location can be local or remote,
based on security and availability requirements. In the most
extreme cases, loss of a complete site, due to issues such as a
total loss of power or a natural disaster, can delay or prevent
restoration of service at the original site, so use of a separate,
secondary site may be necessary to meet the requirements of your
organization. Use of a separate, secondary site can facilitate
recovery in case of site failure. Your backup and restoration
strategy should identify whether a secondary site is required, the
services to be supported at the site, and the location of the site.
- Hardware and software requirements. Specific hardware and
software requirements are determined by the backup and restoration
requirements of your organization, including those specific to each
restoration scenario. This includes not only the hardware to be
used for backup storage and restoration of specific components, but
also any software and network connectivity required to support
backup and restoration.
- Restoration scenarios. The potential scenarios that can require
restoration of one or more servers or components are determined by
which servers or other components are involved in the loss of
service:
- Loss of any of the RTC, RTCConfig, or RTCab databases, or loss
of the database server (Standard Edition server or, in an
Enterprise pool, back-end server). At a minimum, this requires
restoring the database, but it can also require rebuilding the
server on which the RTC database resides.
- Loss of a Standard Edition server. At a minimum, this requires
restoring pool-level and computer-level settings, but it can
require rebuilding the servers, restoring the database, restoring
domain information, and reassigning users.
- Loss of one or more servers in an Enterprise pool. At a
minimum, this requires restoring computer-level settings for the
server, but it can also require rebuilding individual servers or
the entire pool, restoring the back-end database, restoring domain
information, and reassigning users.
- Loss of Active Directory Domain Services (AD DS), along with
the loss of a Standard Edition server or all Front End Servers. At
a minimum, this requires restoring global, pool, and
computer-level settings, but it can also require rebuilding the
servers and domain information.
- Loss of an Archiving Server or LCSlog database. At a minimum,
this requires restoring the database and computer-level settings,
but it can also require rebuilding one or more servers.
- Loss of a Monitoring Server, LcsCDR database, or QoEMetrics
database. At a minimum, this requires restoring the database and
computer-level settings, but it can also require rebuilding of one
or more servers.
- Loss of a Group Chat Server or Group Chat database. At a
minimum, this requires restoring the database and computer-level
settings, but it can also require rebuilding of one or more
servers.
- Loss of a Group Chat Compliance Server or database. At a
minimum, this requires restoring the database and computer-level
settings, but it can also require rebuilding of one or more
servers.
- Loss of a Mediation Server, forward proxy server, or Edge
Server. At a minimum, this requires restoring computer-level
settings, but it can also require rebuilding one or more servers.
- Loss of a site, including all servers and Active Directory
Domain Services, such as might be the result of a natural disaster.
This can require switching service to a separate, secondary site
(if supported) or rebuilding of all servers and components.
Your backup and restoration plan should specify any criteria that you want personnel to use to determine which option is most appropriate, including when to rebuild (on existing or new servers) versus when to just restore service on existing servers. These decisions will be based on a combination of factors, including the degree of loss, business continuity requirements, hardware and software cost, availability of service personnel, and how well the original deployment meets current and projected requirements.
- Restoration methods. The methods covered in your plan should be
specific to all potential requirements of each scenario (from the
tools required for recovering settings to the tools required for
restoring databases or for rebuilding servers).
- Restoration sequence. In the event of loss of multiple servers
or services, you need to specify criteria for determining the
sequence for restoring services. For instance, you need to decide
whether you want to restore instant messaging (IM) functionality or
soft phone functionality first.
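The restoration sequence described above can be captured as data so that the order is decided before a failure rather than during one. The following is a minimal sketch, not part of the Office Communications Server tools. The `LostService` type, the service names, and the priority values are hypothetical and stand in for whatever criteria your organization defines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LostService:
    """A service that is currently unavailable (hypothetical record type)."""
    name: str
    priority: int  # 1 = restore first; values are assigned by your organization


def restoration_order(lost):
    """Return the lost services sorted by priority, then by name for ties."""
    return sorted(lost, key=lambda service: (service.priority, service.name))


# Hypothetical example: this organization restores IM before soft phone.
lost_services = [
    LostService("soft phone", 2),
    LostService("archiving", 3),
    LostService("instant messaging", 1),
]

ordered = [service.name for service in restoration_order(lost_services)]
print(ordered)
```

Keeping the priorities in one reviewed table makes it easier to revisit the sequence when business needs change.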
Disaster Recovery Plan
Before you deploy Office Communications Server 2007 R2 in a production environment, it is important to have well-defined and well-rehearsed disaster recovery plans in place as part of your backup and restoration strategy. These plans enable you to recover quickly from any loss of service to your users. You should have a specific strategy for each type of disaster that can occur.
If damage is minor, you may be able to repair your Windows installation or your Office Communications Server 2007 R2 installation to fix the problem. In more severe cases you will probably need to rebuild an entire pool.
Office Communications Server 2007 R2 Enterprise Edition includes enhanced recovery capabilities that use the standby recovery server model. Office Communications Server 2007 R2 does not support log shipping or other methods of active or hot standby. In the standby recovery server model, spare computers are reserved for use as recovery servers in the event of disaster. Using standby recovery servers is a common practice in server environments that include rack-mounted hardware. In such environments support technicians routinely replace modular components or complete servers as they become damaged. This method works well with data storage technologies that offer continuous availability such as Storage Area Networks (SANs).
Many organizations use a model of just-in-time inventories for their IT organizations. Organizations contract with hardware vendors and suppliers, and the contract specifies a service level agreement (SLA) of a few hours for delivery of certain pieces of hardware in the event of a catastrophe. The advantage of this method is that it eliminates the need to keep multiple spare servers sitting unused.
Using standby servers at a secondary site for recovery enables fast recovery in the event of failure of services. The secondary site can provide full recovery support, or it can provide recovery support for only specific functionality, based on business needs. As the first step in preparing for site recovery, you need to determine the level of support that is to be provided by the secondary site, which will determine which servers are to be deployed at the secondary site and how they are to be configured. Ideally, the secondary site would provide all of the Office Communications Server functionality available at the primary site, but your organization might need to limit the support provided by the secondary site. Your backup and restoration strategy should specify what is deployed in the secondary site and, if it does not provide full functionality, why recovery support for specific functionality is not implemented. To help determine what is required at the secondary site, you can use the following factors:
- Office Communications Server functionality that is available at
the primary site.
- Business criticality of specific functionality. At a minimum,
setting up a secondary site requires support of core services,
which are provided by the Standard Edition server or, for an
Enterprise pool, by the Front End Server and back-end database.
Other functionality, such as A/V conferencing, might be deployed in
the primary site but might not have the same level of criticality
as the core services, or it might not be fully implemented, if it
is early in a deployment. The secondary site should reflect the
business needs, not simply mirror the primary site. Business
criticality can change as the topology and usage change, so your
backup and restoration strategy should include periodic reviews of
the secondary site capabilities and whether they match current
business needs.
- Cost of the hardware, software, and maintenance for the
secondary site. The equipment and software you deploy in the
secondary site should be capable of supporting the capacity
requirements for your organization, which can mean a significant
initial investment in hardware and software. You should also
consider the cost of deploying and maintaining it. Based on
business criticality decisions, you might determine that the cost
of specific functionality is not justifiable.
- Service availability requirements. Bringing a secondary site
online takes time, during which functionality is not available to
users in your organization. Bringing core services online can
require an hour or more in a large enterprise. Restoring additional
functionality increases the downtime. If your organization requires
immediate recovery, you can limit the functionality that is
restored in order to shorten the time required to bring services
back online. Or you can plan for a staged recovery, in which
critical functionality is brought online at the secondary site
first, and other functionality is introduced on a delayed schedule
(such as during off-peak hours).
When you use a secondary site for service restoration, all backed up data and settings must be available at the secondary site. Testing should include restoration of the data and settings from the secondary site.
The deployment plan for the secondary site should match the deployment plan for the primary site. The secondary site should be in the same domain and have the same network configuration, except for the following:
- The deployment plan should document only the components required
to support the functionality that you determine is required at the
secondary site.
- The secondary site should have a pool name that is separate
from the pool name used for the primary site.
- The _sipinternaltls and _sip_tcp DNS records should be
modified as appropriate for the secondary site.
The backup and restoration strategy should include a schedule and criteria for switching to the secondary site, schedules for performing ongoing maintenance at the secondary site, and assigned responsibilities for performing site restoration procedures at both the primary site and the secondary site.
Best Practices for Backup and Restoration
Use the following guidelines as best practices for establishing your backup and restoration requirements:
- Perform regular backups at appropriate intervals. The simplest
and most commonly used backup type and rotation schedule is a full,
nightly backup of the entire SQL Server database. With this
schedule, restoration requires only one backup tape, and no more
than a day’s data should be lost.
- Schedule backups when normal Office Communications
Server 2007 R2 usage is low. Running backups when the server is
not under peak load protects server performance and the user
experience.
- Plan for and schedule periodic testing of the restoration
processes supported by your organization.
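As an illustration of the full, nightly backup described above, the following sketch builds one Transact-SQL BACKUP DATABASE statement per database. It only generates the statements and does not connect to SQL Server. The database names are the ones listed in the restoration scenarios earlier in this topic. The backup directory, the file naming pattern, and the choice of the INIT and CHECKSUM options are assumptions made for the example and should be adapted to your environment.

```python
from datetime import date


def full_backup_statement(database, backup_dir, day):
    """Build a T-SQL full-backup statement for one database."""
    # Bracket-quote the database name, escaping any closing bracket.
    quoted_db = "[" + database.replace("]", "]]") + "]"
    # Hypothetical naming pattern: <directory>\<database>_<yyyymmdd>.bak
    target = f"{backup_dir}\\{database}_{day:%Y%m%d}.bak"
    # Escape single quotes so the path is a valid T-SQL string literal.
    target = target.replace("'", "''")
    return (
        f"BACKUP DATABASE {quoted_db} "
        f"TO DISK = N'{target}' WITH INIT, CHECKSUM;"
    )


# Database names from this topic; directory and date are placeholders.
databases = ["RTC", "RTCConfig", "RTCab"]
statements = [
    full_backup_statement(name, r"D:\Backups", date(2009, 1, 15))
    for name in databases
]

for statement in statements:
    print(statement)
```

The generated statements could be reviewed and then run from SQL Server Management Studio or a scheduled job during a low-usage window.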
Best Practices for Minimizing the Impact of a Disaster
The best strategy for dealing with disastrous service interruptions (caused by unmanageable events such as power outages or sudden hardware failures) is to assume they will happen and plan accordingly. The disaster management plans you develop as part of your backup and restoration strategy should include the following:
- Keeping your software media and your software and firmware
updates readily available.
- Maintaining hardware and software records.
- Monitoring servers proactively.
- Backing up your data regularly and ensuring the integrity of
your backups.
- Training your staff in disaster recovery, documenting
procedures, and implementing disaster recovery simulation drills.
- Keeping spare hardware available or contracting with hardware
vendors and suppliers, under an SLA, for prompt replacements.
- Setting up a separate, secondary site that includes standby
servers that can be brought online quickly for optimal site
recovery. To help ensure availability of the secondary site in the
event of a catastrophic loss such as a natural disaster, the
standby servers should be located in a different geographical area
from the primary site.
- Separating the location of your transaction log files (.ldf
files) and database files (.mdf files).
- Ensuring your insurance policy is adequate.
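One way to support the guideline about ensuring the integrity of your backups is to record a checksum for each backup file when it is created and to compare it again before relying on the file. The following is a minimal sketch under that assumption. The function names and the sample file are hypothetical. A matching checksum shows only that the file has not changed since the checksum was recorded; it does not replace periodic test restorations.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(manifest):
    """Return names of files whose current digest differs from the recorded one."""
    return [
        path.name
        for path, expected in manifest.items()
        if sha256_of(path) != expected
    ]


# Demonstration with a temporary stand-in for a backup file.
with tempfile.TemporaryDirectory() as tmp:
    backup = Path(tmp) / "example.bak"
    backup.write_bytes(b"backup contents")
    manifest = {backup: sha256_of(backup)}  # recorded at backup time

    intact = changed_files(manifest)  # file unchanged so far

    backup.write_bytes(b"corrupted contents")  # simulate damage
    damaged = changed_files(manifest)

print(intact, damaged)
```

In practice the manifest would be stored separately from the backup files, for example at the secondary site.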