Channel: High Availability (Clustering) forum

Troubleshooting Failover Errors on new 2012 4-node Cluster


Server setup: four 465c G7 blades running Windows Server 2012 with Failover Clustering

NICs: Public & Private

Quorum: assigned accordingly (q:)

Roles:

  • Single DTC Role
  • SharePoint 2010 role created with SQL Server and SQL Server Agent resources.

The cluster was built last week and was working fine, with SQL installed, until today. When I try to fail over between nodes, the SharePoint resources fail to come back online. The following errors are thrown in the cluster event logs:

Event ID: 1205
The Cluster service failed to bring clustered service or application 'CMOSS' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered service or application.

Event ID: 1069
Cluster resource 'SQL Server (MOSS1)' of type 'SQL Server' in clustered role 'CMOSS' failed.

Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it.  Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet

Event ID: 1205
The Cluster service failed to bring clustered service or application 'CMOSS' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered service or application.

Event ID: 1228
Cluster network name resource 'SQL Network Name (CMOSS)' encountered an error enabling the network name on this node. The reason for the failure was:
 'Unable to obtain a logon token'.

 The error code was '1331'.

 You may take the network name resource offline and online again to retry.
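
For reference, error code 1331 in event 1228 is the Win32 system error ERROR_ACCOUNT_DISABLED. That suggests an account used by the network name resource may be disabled in Active Directory, though this is only my reading of the code and not something I have confirmed. Below is a small Python lookup of the neighboring logon-related codes; the table name and the `describe_error` helper are made up for illustration and cover only this subset.

```python
# Subset of Win32 logon-related system error codes (values from winerror.h).
# WIN32_LOGON_ERRORS and describe_error are illustrative names, not a real API.
WIN32_LOGON_ERRORS = {
    1326: "ERROR_LOGON_FAILURE",
    1327: "ERROR_ACCOUNT_RESTRICTION",
    1328: "ERROR_INVALID_LOGON_HOURS",
    1329: "ERROR_INVALID_WORKSTATION",
    1330: "ERROR_PASSWORD_EXPIRED",
    1331: "ERROR_ACCOUNT_DISABLED",
}


def describe_error(code: int) -> str:
    """Return the symbolic name for a known code, or a placeholder otherwise."""
    return WIN32_LOGON_ERRORS.get(code, "unknown (not in this subset)")


# The code reported alongside 'Unable to obtain a logon token' in event 1228.
print(describe_error(1331))
```

On a Windows machine, `net helpmsg 1331` should print the matching message text.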

Event ID: 1254
Clustered role 'CMOSS' has exceeded its failover threshold.  It has exhausted the configured number of failover attempts within the failover period of time allotted to it and will be left in a failed state.  No additional attempts will be made to bring the role online or fail it over to another node in the cluster.  Please check the events associated with the failure.  After the issues causing the failure are resolved the role can be brought online manually or the cluster may attempt to bring it online again after the restart delay period.

As of right now the SharePoint resource is down. I'm not 100% sure the issue is with the MS cluster, but I wanted to start here for some possible direction.

Any responses appreciated.

