Just a quick question about a problem I'm having at the moment that I can't seem to get my head around.
I've just started at an organisation which has a very basic setup: a clustered SQL 2000 SP4 server with approximately 90 databases on it. Only a few of these carry much traffic, and performance on the whole seems OK.
The server is connected to a SAN with two LUNs presented for the data and log drives. Performance in general is OK, with disk latency for reads and writes on both LUNs around 10-20 ms. There is a constant stream of reads of approximately 1.5-2 MB per second from the data drive. On the odd occasion disk reads go up to 20-30 MB per second and the disk queue may rise to 10 for a very short period, but on the whole the disks seem to handle it, and latency during these peaks may go up to 200 ms.
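(For anyone wondering where those figures come from: the latency numbers are from perfmon, but a rough per-file equivalent can be pulled from inside SQL 2000 itself with ::fn_virtualfilestats. This is just a sketch; the database name is a placeholder, and IoStallMS is cumulative since startup, so you need to diff two samples to get a rate.)

-- Per-file I/O stats for one database's primary data file (file id 1)
-- IoStallMS = total ms that all I/O on this file has spent waiting
SELECT  DbId,
        FileId,
        NumberReads,
        NumberWrites,
        IoStallMS
FROM    ::fn_virtualfilestats(DB_ID('MyBusyDb'), 1)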
The problem occurs when checkpointing happens. I have set trace flags so I can see which database is checkpointing at any given time, and when the highest-throughput database checkpoints, latency can go to 20-30 seconds for 30-60 seconds, which effectively halts processing for that time. Errors start appearing in the SQL log about I/O requests taking longer than 15 seconds to complete.
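(For completeness, the trace flags I mean are the usual pair for checkpoint tracing on 2000, assuming I've remembered the numbers right: 3502 logs a message when each checkpoint starts and ends, and 3605 routes that trace output to the error log.)

-- 3502: write checkpoint start/end per database to the trace output
-- 3605: send trace output to the SQL error log; -1 applies server-wide
DBCC TRACEON (3502, 3605, -1)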
What's throwing me is that the throughput on the disks doesn't seem very high at these points. As an example, checkpoint pages ...