Topic Last Modified: 2011-03-02
To ensure the quality of your failover solution, we recommend that you monitor the following performance statistics:
- On Front End Servers, monitor the
“LC:USrv – 00 – DBStore\Usrv – 002 – Queue Latency (msec)”
counter. This counter represents the time that
a request spends in the queue to the Back-End Database Server. If
the topology is healthy, this counter averages less than 100 ms.
Occasional spikes are acceptable. The value is higher on Front
End Servers that are located at the opposite site from
the Back-End Database Servers. This counter can increase if the
Back-End Database Server is having performance problems or if
network latency is too high. If this counter is high, check both
network latency and the health of the Back-End Database Server.
- On Front End, Archiving, and Monitoring Servers, monitor the
“MSMQ Service\Total Messages in all Queues”
counter. The size of the queue varies
depending on load. Verify that the queue is not growing without
bound. Establish a baseline for the counter, and monitor the
counter to ensure that it does not exceed that baseline.
- On Group Chat Channel and Compliance Servers, monitor the
“MSMQ Service\Total Messages in all Queues”
counter. The size of the queue varies
depending on load. Verify that the queue is not growing without
bound. Establish a baseline for the counter, and monitor the
counter to ensure that it does not exceed that baseline.
- On the Directors, Edge Servers, and Front End Servers, monitor the
“LC:SIP – 04 – Responses\SIP – 051 – Local 503 Responses/sec”
counter. This counter indicates whether a server is returning
503 (Service Unavailable) errors, which mean that the server is
unavailable. At steady state, this counter should be approximately
0. Occasional spikes are acceptable.
- On all servers, monitor the
“LC:SIP – 04 – Responses\SIP – 053 – Local 504 Responses/sec”
counter. This counter can indicate connection delays or failures
with other servers. At steady state, this counter should be
approximately 0. Occasional spikes are acceptable. If you see 504
error messages, check the
“LC:SIP – 01 – Peers\SIP – 017 – Sends Outstanding”
counter. This counter records the number of requests and responses
in the outbound queue, so a high value helps identify which servers
are having problems.
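
The checks described in the list above can be sketched in code. The following Python example is a minimal illustration only, not part of the product. It assumes that counter samples have already been collected into plain lists of numbers (for example, exported from Performance Monitor). The function names, the five-sample window, and the tolerance values are hypothetical choices made for this sketch. Only the 100 ms average comes from the guidance above.

```python
from statistics import mean

# From the guidance above: a healthy topology averages under 100 ms.
QUEUE_LATENCY_AVG_LIMIT_MS = 100.0


def latency_is_healthy(samples_ms, limit_ms=QUEUE_LATENCY_AVG_LIMIT_MS):
    """Return True when the average queue latency is below the limit.

    Using the average means an occasional spike does not fail the check.
    """
    if not samples_ms:
        raise ValueError("at least one sample is required")
    return mean(samples_ms) < limit_ms


def exceeds_baseline(samples, baseline):
    """Return True when the most recent queue size is above the baseline."""
    if not samples:
        raise ValueError("at least one sample is required")
    return samples[-1] > baseline


def queue_is_growing(samples, window=5):
    """Return True when the last `window` samples rise strictly.

    Assumption: a steady rise across the window is treated as a sign of
    unbounded growth. The window size is illustrative.
    """
    if len(samples) < window:
        return False
    recent = samples[-window:]
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))


def near_zero_at_steady_state(samples, tolerance=0.5, max_spike_fraction=0.1):
    """Return True when 503 or 504 responses/sec stay close to 0.

    Assumption: "approximately 0" means at most `tolerance` per second,
    and "occasional" means at most `max_spike_fraction` of the samples.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    spikes = sum(1 for value in samples if value > tolerance)
    return spikes / len(samples) <= max_spike_fraction


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    print(latency_is_healthy([20, 35, 250, 40, 30]))
    print(queue_is_growing([10, 12, 15, 19, 25, 40]))
    print(exceeds_baseline([10, 12, 60], baseline=50))
    print(near_zero_at_steady_state([0, 0, 0, 0, 0, 0, 0, 0, 0, 3]))
```

Averaging the latency samples and allowing a small fraction of nonzero 503 or 504 samples reflects the statement that occasional spikes are acceptable. Tune the window and tolerances to the baseline that you establish for your own deployment.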