I have set up several identical clusters over the last few months, but this one in particular has been having a lot of problems lately.
OS: Windows Server 2008 R2 latest patches
SQL: SQL Server 2008 SP1 CU5 (version: 10.0.2746)
Disks: VMAX SAN
When I first set up the cluster, cluster validation kept failing whenever I tried to add a node. It turned out that we had to unjoin the servers from the domain and rejoin them. After that, cluster validation succeeded, the second node joined the cluster, and the quorum was changed to "Node and Disk Majority". SQL Server was installed and set up as active/active (one SQL instance on each node). There were no further issues until SQL Server was actually in use.
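For anyone checking the quorum side of this: as I understand "Node and Disk Majority", each node and the disk witness carries one vote, and the cluster stays up only while more than half of the votes are present. The sketch below is only that arithmetic for a two-node cluster like this one. The function name and parameters are my own illustration, not anything from the cluster tooling, and it ignores which node actually owns the witness disk.

```python
def has_quorum(nodes_up: int, total_nodes: int, witness_online: bool) -> bool:
    """Simplified Node and Disk Majority check (illustrative assumption).

    Each node and the disk witness is counted as one vote; quorum needs
    a strict majority of all votes.
    """
    total_votes = total_nodes + 1  # all nodes plus the disk witness
    votes_present = nodes_up + (1 if witness_online else 0)
    return votes_present > total_votes // 2


# Two nodes plus a disk witness: 3 votes in total, 2 needed.
print(has_quorum(nodes_up=2, total_nodes=2, witness_online=False))
print(has_quorum(nodes_up=1, total_nodes=2, witness_online=True))
print(has_quorum(nodes_up=1, total_nodes=2, witness_online=False))
```

Under this model a single surviving node keeps the cluster running only while it can still reach the witness disk.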
The SQL Servers run heavy ETL processing and are also subscribers in SQL replication. The following errors first appeared on 4/24, when I set up the replication:
EventID: 1127, Source: FailoverCluster, Task Category: Network Manager
Cluster network interface 'man1fscl01a - Hartbeat to man1fscl91b' for cluster node 'man1fscl01a' on network 'ClusterHeartBeat' failed. Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware
or software errors related to the network adapter. Also check for failures in any other network comp
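To see whether it is always the same interface that fails, the event text can be tallied once it has been exported from the event log. This is a rough sketch that assumes every 1127 message starts with the same wording as the one quoted above. The regular expression and the helper names are my own, and how the messages get exported is left out.

```python
import re
from collections import Counter

# Matches the opening sentence of the Event ID 1127 message quoted above.
PATTERN = re.compile(
    r"Cluster network interface '(?P<interface>[^']+)' "
    r"for cluster node '(?P<node>[^']+)' "
    r"on network '(?P<network>[^']+)' failed"
)


def parse_event_1127(message):
    """Return interface, node and network names, or None if no match."""
    match = PATTERN.search(message)
    return match.groupdict() if match else None


def count_by_interface(messages):
    """Count failures per cluster network interface."""
    parsed = (parse_event_1127(m) for m in messages)
    return Counter(p["interface"] for p in parsed if p)


sample = (
    "Cluster network interface 'man1fscl01a - Hartbeat to man1fscl91b' "
    "for cluster node 'man1fscl01a' on network 'ClusterHeartBeat' failed."
)
print(parse_event_1127(sample))
print(count_by_interface([sample, sample, "unrelated event text"]))
```

If the counts pile up on one interface or one node, that would point at a specific adapter or cable rather than the network as a whole.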