We recently had a one-hour outage on our production SQL cluster, and I have been trawling through the event logs to try to find an explanation. I know that a network failure triggered the outage: during a switch upgrade no network was available, so the cluster shut down:
"This cluster node has no network connectivity. It cannot participate in the cluster until connectivity is restored."
There are lots of entries telling me that the cluster service is stopping, and one saying that DTC is terminating unexpectedly, which is fine, but then I see this log entry:
"The system failed to flush data to the transaction log. Corruption may occur"
This is worrying - how can I be sure that my data is not corrupted?
I can see the SQL Server and SQL Server Agent services "terminate unexpectedly". Shortly after that, I can see the cluster service entering the running state, followed by various attempts to bring clustered applications online. These attempts give the following error:
"The Cluster service failed to bring clustered service or application 'SQL Server (INSTANCENAME)' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered service or application."
Does this mean that the cluster service does no