Channel: High Availability (Clustering) forum
Viewing all 2783 articles
Browse latest View live

cluster name vs file server role

I've just started working on a 2012 server. It has been a while since I last worked with clustering; I think it was 2003! In those days you might set up a cluster (simple active-passive) and create a cluster name with its own IP address, separate from the nodes. As I recall, you could then create a clustered file share, and users would access it using the cluster name as the server name in the share path. Nice and simple (or maybe I'm misremembering). Now it seems we create a cluster name, as before, but we also add a file server role, and it has its own virtual name and IP address, separate from both the physical nodes and the cluster name/IP.

I'm not really seeing the point in this. Would it not make sense to add a file server role but hook it to the cluster name? What, in simple terms, are we getting out of creating another virtual name and IP just for the role?


Event 1146 - Call ISALIVE timed out for resource 'SQL'

Hello guys!

In a Windows Server 2008 R2 environment with SQL Server 2005, the Cluster service fails after a health check timeout. There is no latency or path loss on the 'SQL' resource (LUN). All logs have been checked (Windows/Cluster/SAN/Sw/HBA), and no driver or update has been installed:

00000dfc.000021b8::2015/05/26-03:40:59.799 INFO  [VSS] OnIdentify Returned true
00000dfc.000023f8::2015/05/26-05:35:07.403 INFO  [NM] Received request from client address SAO-PAULO-01.
00000dfc.00001c64::2015/05/26-10:16:24.296 INFO  [NM] Received request from client address SAO-PAULO-01.
00000dfc.00001368::2015/05/26-10:56:22.015 INFO  [NM] Received request from client address SAO-PAULO-01.
00000e3c.00000d54::2015/05/26-12:17:06.000 ERR   [RHS] RhsCall::DeadlockMonitor: Call ISALIVE timed out for resource 'SQL'.
00000e3c.00000d54::2015/05/26-12:17:06.000 INFO  [RHS] Enabling RHS termination watchdog with timeout 1200000 and recovery action 3.
00000e3c.00000d54::2015/05/26-12:17:06.000 ERR   [RHS] Resource SQL handling deadlock. Cleaning current operation and terminating RHS process.
00000e3c.00000d54::2015/05/26-12:17:06.000 ERR   [RHS] About to send WER report.
00000dfc.00002220::2015/05/26-12:17:06.000 WARN  [RCM] HandleMonitorReply: FAILURENOTIFICATION for 'SQL', gen(0) result 4.
00000dfc.00002220::2015/05/26-12:17:06.000 INFO  [RCM] rcm::RcmResource::HandleMonitorReply: Resource 'SQL' consecutive failure count 1.
00000e3c.00000d54::2015/05/26-12:17:08.121 ERR   [RHS] WER report is submitted. Result : WerReportQueued.
00000dfc.00002220::2015/05/26-12:18:29.545 ERR   [RCM] rcm::RcmMonitor::RhsRpcResourceControl Error 1726 communicating with RHS process 3644.
00000dfc.00001808::2015/05/26-12:18:29.550 DBG   [NETFTAPI] received NsiDeleteInstance  for 192.168.177.167
00000dfc.00001808::2015/05/26-12:18:29.550 WARN  [NETFTAPI] Failed to query parameters for 192.168.177.167 (status 80070490)
00000dfc.00001808::2015/05/26-12:18:29.550 DBG   [NETFTAPI] Signaled NetftLocalRemove  event for 192.168.177.167
00000dfc.00001808::2015/05/26-12:18:29.551 DBG   [NETFTAPI] received NsiDeleteInstance  for 192.168.177.168
00000dfc.00001808::2015/05/26-12:18:29.551 WARN  [NETFTAPI] Failed to query parameters for 192.168.177.168 (status 80070490)
00000dfc.00001808::2015/05/26-12:18:29.551 DBG   [NETFTAPI] Signaled NetftLocalRemove  event for 192.168.177.168
00000dfc.000009a8::2015/05/26-12:18:29.552 DBG   [NETFTAPI] received NsiDeleteInstance  for 10.10.2.3
00000dfc.000009a8::2015/05/26-12:18:29.552 WARN  [NETFTAPI] Failed to query parameters for 10.10.2.3 (status 80070490)
00000dfc.000009a8::2015/05/26-12:18:29.552 DBG   [NETFTAPI] Signaled NetftLocalRemove  event for 10.10.2.3
00000dfc.00002540::2015/05/26-12:18:29.553 ERR   [RCM] rcm::RcmMonitor::RecoverProcess: Recovering monitor process 3644 / 0xe3c
00000dfc.00002540::2015/05/26-12:18:29.555 INFO  [RCM] Created monitor process 5616 / 0x15f0
00000dfc.00002220::2015/05/26-12:18:29.555 ERR   [RCM] rcm::RcmResource::Control: RPC_S_CALL_FAILED(1726)' because of 'result'
00000dfc.00002220::2015/05/26-12:18:29.555 ERR   [RCM] rcm::RcmResControl::DoResourceControl: RPC_S_CALL_FAILED(1726)' because of 'ResourceControl( GET_PRIVATE_PROPERTIES ) failed for resource 'SPCLUDtc'.'
00000dfc.00002220::2015/05/26-12:18:29.555 WARN  [RCM] ResourceControl(GET_PRIVATE_PROPERTIES) to SPCLUDtc returned 1726.
000015f0.000015a4::2015/05/26-12:18:29.656 INFO  [RHS] Initializing.

Should we suspect that the DLL could be corrupted? (This has happened 5 times, at intervals of 4 or 5 days.)

Is there any other information or troubleshooting advice for this?
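
The relationship between the log lines can be checked mechanically. The following is a minimal sketch, assuming every line follows the layout seen in the excerpt above (hex process ID, hex thread ID, timestamp, level, bracketed component, message). The names `LINE_RE` and `parse_line` are made up for this illustration and are not part of any cluster tooling. It shows that the hex prefix 00000e3c on the RHS lines is the same process as "RHS process 3644" in the later RCM error, and how long after the ISALIVE timeout the RPC error was logged.

```python
import re
from datetime import datetime

# Assumed layout, taken from the excerpt:
# <pid hex>.<tid hex>::<yyyy/mm/dd-hh:mm:ss.mmm> <LEVEL> [<COMPONENT>] <message>
LINE_RE = re.compile(
    r"^(?P<pid>[0-9a-f]{8})\.(?P<tid>[0-9a-f]{8})::"
    r"(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+\[(?P<component>[^\]]+)\]\s+(?P<message>.*)$"
)


def parse_line(line):
    """Return a dict for one log line, or None if it does not match the layout."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "pid": int(m.group("pid"), 16),  # the log prints process IDs in hex
        "tid": int(m.group("tid"), 16),
        "time": datetime.strptime(m.group("ts"), "%Y/%m/%d-%H:%M:%S.%f"),
        "level": m.group("level"),
        "component": m.group("component"),
        "message": m.group("message"),
    }


# Three lines copied from the excerpt above.
SAMPLE = [
    "00000e3c.00000d54::2015/05/26-12:17:06.000 ERR   [RHS] RhsCall::DeadlockMonitor: Call ISALIVE timed out for resource 'SQL'.",
    "00000dfc.00002220::2015/05/26-12:18:29.545 ERR   [RCM] rcm::RcmMonitor::RhsRpcResourceControl Error 1726 communicating with RHS process 3644.",
    "00000dfc.00002540::2015/05/26-12:18:29.555 INFO  [RCM] Created monitor process 5616 / 0x15f0",
]

entries = [parse_line(line) for line in SAMPLE]
timeout, rpc_error, new_monitor = entries

# Seconds between the ISALIVE timeout and the RPC failure reported by RCM.
gap_seconds = (rpc_error["time"] - timeout["time"]).total_seconds()

print(timeout["pid"], timeout["component"])
print(gap_seconds)
```

With these three sample lines, the process ID of the timeout entry decodes to 3644 (0xe3c), matching the RHS process named in the RCM error, and the RPC error follows 83.545 seconds after the timeout.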

Thanks!


Luiz Mercante | MCITP SQL 2008 | MCTS SQL 2008 | MTA Database Fundamentals | MCTS Windows Apps | MCTS Windows Network | MCP 2003 | sqldicas@outlook.com | http://sqldicas.com.br --> If the answer was helpful in any way, mark it as the answer or vote it as helpful.

Cluster Manager test fails on Storage Section

I am running the validation tests for cluster failover and am not sure whether I can proceed to create the cluster with the error I am getting. I understand that there is no failover for the local machine's C: drive (boot); all other tests pass. Is there a way to remove the local drive from the test or from MPIO?

Storage
Name                                            Result     Description
List All Disks                                  Success
List Potential Cluster Disks                    Success
Validate Disk Access Latency                    Success
Validate Disk Arbitration                       Success
Validate Disk Failover                          Failed
Validate File System                            Canceled
Validate Microsoft MPIO-based disks             Success
Validate Multiple Arbitration                   Success
Validate SCSI device Vital Product Data (VPD)   Success
Validate SCSI-3 Persistent Reservation          Success
Validate Simultaneous Failover                  Canceled

Here is the detailed section:

List All Disks

    List all disks visible to one or more nodes (including non-cluster disks).
    Prepare storage for testing
    Preparing storage for testing on node tluxhs1.tluxvm.orl.com
    Preparing storage for testing on node tluxhs2.tluxvm.orl.com

    tluxhs1.tluxvm.orl.com

    Getting information on PhysicalDrive 0 from node tluxhs1.tluxvm.orl.com
    Getting information on PhysicalDrive 1 from node tluxhs1.tluxvm.orl.com
    Getting information on PhysicalDrive 2 from node tluxhs1.tluxvm.orl.com
    Getting information on PhysicalDrive 3 from node tluxhs1.tluxvm.orl.com
    Getting information on PhysicalDrive 4 from node tluxhs1.tluxvm.orl.com
    Getting information on PhysicalDrive 5 from node tluxhs1.tluxvm.orl.com
    Getting information on PhysicalDrive 6 from node tluxhs1.tluxvm.orl.com
    Disk Number | Disk Identifier | Disk Bus Type | Disk Stack Type | Disk Address (PORT:PATH:TID:LUN) | Adapter Description | Eligible for Validation | Disk Characteristics
    PhysicalDrive0 | fceb8fb7 | Bus Type SAS | Stor Port | 2:0:0:0 | Dell SAS 6/iR Integrated Controller | False | Disk is a boot volume. Disk is a system volume. Disk is used for paging files. Disk is used for memory dump files. Disk is on the system bus. Disk partition style is MBR. Disk partition type is BASIC.
    PhysicalDrive1 | 871348a8 | Bus Type iSCSI | Stor Port | 3:0:0:0 | Microsoft Multi-Path Bus Driver | True | Disk partition style is MBR. Disk partition type is BASIC. Disk is Microsoft MPIO based disk
    PhysicalDrive2 | 871348bc | Bus Type iSCSI | Stor Port | 3:0:3:1 | Microsoft Multi-Path Bus Driver | True | Disk partition style is MBR. Disk partition type is BASIC. Disk is Microsoft MPIO based disk
    PhysicalDrive3 | 871348a4 | Bus Type iSCSI | Stor Port | 3:0:3:2 | Microsoft Multi-Path Bus Driver | True | Disk partition style is MBR. Disk partition type is BASIC. Disk is Microsoft MPIO based disk
    PhysicalDrive4 | a9429ada | Bus Type iSCSI | Stor Port | 3:0:3:3 | Microsoft Multi-Path Bus Driver | True | Disk partition style is MBR. Disk partition type is BASIC. Disk is Microsoft MPIO based disk
    PhysicalDrive5 | 87134890 | Bus Type iSCSI | Stor Port | 3:0:0:4 | Microsoft Multi-Path Bus Driver | True | Disk partition style is MBR. Disk partition type is BASIC. Disk is Microsoft MPIO based disk
    PhysicalDrive6 | <Unknown> | Bus Type iSCSI | Stor Port | 3:0:2:31 | Microsoft Multi-Path Bus Driver | False | Disk partition style is RAW. Disk is Microsoft MPIO based disk

 

Any help would be greatly appreciated.

Robert Ramos

SQL Server 2014 Failing to install in a cluster

Dear All,

When we attempt to install SQL Server 2014 on a one-node cluster, we get the following error. We know it is storage related; we just don't know what and why :P

Problem signature:
  Problem Event Name: APPCRASH
  Application Name: CPrepSrv.exe
  Application Version: 6.3.9600.17093
  Application Timestamp: 53477555
  Fault Module Name: StackHash_6162
  Fault Module Version: 6.3.9600.17736
  Fault Module Timestamp: 550f4336
  Exception Code: c0000374
  Exception Offset: PCH_B8_FROM_ntdll+0x000000000009177A
  OS Version: 6.3.9600.2.0.0.272.7
  Locale ID: 2057
  Additional Information 1: 6162
  Additional Information 2: 61622186c57016e5ac2d3038f45ee3f2
  Additional Information 3: 1139
  Additional Information 4: 11398d6c53eb5dafdaf0ef26509b53e6

Anyone out there with some insights ?
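
For reading the signature: the Exception Code field is an NTSTATUS value, and c0000374 is STATUS_HEAP_CORRUPTION. Below is a minimal sketch of splitting such a value into its bit fields. `decode_ntstatus` and the one-entry `KNOWN` table are hypothetical helpers written for this note, not an existing API, and the sketch says nothing about which component corrupted the heap.

```python
# NTSTATUS layout: bits 31-30 severity, bit 29 customer flag,
# bits 27-16 facility, bits 15-0 code.
SEVERITY_NAMES = {0: "success", 1: "informational", 2: "warning", 3: "error"}

# Deliberately tiny lookup: only the value from the signature above.
KNOWN = {0xC0000374: "STATUS_HEAP_CORRUPTION"}


def decode_ntstatus(value):
    """Split a 32-bit NTSTATUS value into its bit fields."""
    return {
        "severity": SEVERITY_NAMES[(value >> 30) & 0x3],
        "customer": bool((value >> 29) & 0x1),
        "facility": (value >> 16) & 0xFFF,
        "code": value & 0xFFFF,
        "name": KNOWN.get(value, "unknown to this sketch"),
    }


# Field copied from the problem signature; the value is hex without a prefix.
signature_line = "Exception Code: c0000374"
raw = int(signature_line.split(":", 1)[1].strip(), 16)
fields = decode_ntstatus(raw)
print(fields)
```

For this signature the severity decodes to "error", the facility to 0, and the code to 0x374.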

J


Cookies vrs Cake - the eternal struggle


Renaming Cluster Shared Volumes using Powershell

Hi,

I am wondering if it is possible to rename a Cluster Shared Volume using PowerShell.

When I add a disk to the cluster using "Add-ClusterDisk" and then add the cluster disk as a cluster shared volume using "Add-ClusterSharedVolume", the new cluster shared volume gets a generic name in Failover Cluster Manager, such as "Cluster Disk 3", and a generic name in the "C:\ClusterStorage" directory, such as "Volume1".

My goal is for the name in the failover cluster manager to match the name in the "C:\ClusterStorage" directory.  So I would see a cluster shared volume in the failover cluster manager GUI such as "MyNewCSV" and a path such as "C:\ClusterStorage\MyNewCSV" in the filesystem.

Is there a PowerShell command that I can use to rename "Cluster Disk 3" in Failover Cluster Manager to "MyNewCSVName"? I can rename the folder in the filesystem with Rename-Item -Path "C:\ClusterStorage\Volume1" -NewName "MyNewCSVName" to get the directory name I want, but the CSV in Failover Cluster Manager still uses "Cluster Disk 3" as the name.

Does anyone know if this is possible in PowerShell?  I can do it easily through the failover cluster manager GUI but really want an automated solution.

Thanks,

Paul

Unable to validate a cluster configuration: "The operation has failed. The action 'Validate a Configuration' did not complete."

There is an error in XML document (5, 73).

Attempt by method Microsoft.Xml.Serialzation.GeneratedAssembly.XmlSerialzationReaderClusterPrep.Config.Read4_As...Bolean) to access method MS.Internal.ServerClusters.Validation.TestAssemblyCollection.Add(MS.Internal.ServerClusters.V....Failed

Live migrations fail but Quick Migrations successful

I am currently running a two-node cluster for virtual hosts that uses an SMB 3.0 share hosted on a two-node cluster with the SoFS role. I have had no issues running CAU or live migrations since the servers were first set up. About a month ago I noticed CAU was failing because it could not drain the roles. I attempted a live migration and it failed. I went through the error log and noticed a general access denied error (see screenshot for error).

No permissions have been modified, nor have any changes been made to the domain in this time, so I am not sure why I am suddenly receiving these errors. I verified permissions on the SMB share. I also ensured delegation is set up in ADUC: the option "Trust this computer for delegation to any service (Kerberos only)" is currently selected. I set that on both cluster names, and I even added it to the nodes for testing purposes.

I am currently thinking I may need to use quick migrations to drain the host and then run Windows updates, in case I am missing something. Any ideas would be greatly appreciated. I searched the forums but did not find anything with this same error.


2012 R2 CSV "Online (No Access)" after node joins cluster

Okay, this has been going on for months, ever since we upgraded our 2008 R2 clusters. We upgraded our development cluster from 2008 R2 to 2012 (SP1) and had no issues, saw great performance increases, and decided to do our production clusters. At the time, 2012 R2 was becoming prominent and we decided to just hop over 2012, thinking the changes in this version weren't that drastic. We were wrong.

The cluster works perfectly as long as all nodes stay up and online. Live migration works great, roles (including disks) flip between machines based on load just fine, etc. When a node reboots or the cluster service restarts, and the node goes from "Down" to "Joining" and then "Online", the CSV(s) switch from Online to Online (No Access) and the mount point disappears. If you move the CSV(s) to the node that just rejoined the cluster, the mount point returns and the state goes back to Online.

Cluster validation checks out with flying colors and Microsoft has been able to provide 0 help whatsoever.  We have two types of FC storage, one that is being retired and one that we are switching all production machines to.  It does this with both storage units, one SUN and one Hitachi.  Since we are moving to Hitachi, we verified that the firmware was up-to-date (it is), our drivers are current (they are) and that the unit is fully functional (everything checks out).  This has not happened before 2012 R2 and we have proven it by reverting to 2012 on our development cluster.  We have started using features that come with 2012 R2 on our other clusters so we would like to figure this problem out to continue using this platform.

Cluster logs show absolutely no diagnostic information that's of any help.  The normal error message is:

Cluster Shared Volume 'Volume3' ('VM Data') is no longer accessible from this cluster node because of error '(1460)'. Please troubleshoot this node's connectivity to the storage device and network connectivity.

Per Microsoft, our Hitachi system with 2012 R2 and MPIO (we have two paths) is certified for use. This is happening on all three of our clusters (two production and one development). They mostly have the same setup, but we are not sure what could be causing this at this point.


Windows 2008 R2 Cluster drive letter changed unexpectedly

On my Windows 2008 R2 cluster, a drive letter changed unexpectedly during failover. From what I understand, this can happen if the drive letter is already in use on the node at the time of failover, but I did not find any error related to the drive letter in the cluster log. I have seen this happen before on other Windows 2008 R2 clusters too. I want to understand what causes this unexpected drive letter change and what I can do to prevent it.



ARASKAS

No available shared disk

Hi all,

I am trying to install SQL failover cluster nodes for our infrastructure. The idea is to provide two NLB SharePoint servers with two failover-clustered SQL servers. I am currently in the test environment, where I am using software called StarWind to emulate the iSCSI disk.

I have finished adding the failover cluster service, and it is running smoothly. However, when I try to install the SQL Server cluster node, it gives an error:

[[[

Rule "Cluster shared disk available check" Failed.
The cluster on this computer does not have a shared disk available. To continue, at least one shared disk must be available.

]]]

If anyone has an idea, please assist.

Many thanks to all of you. :)

Failover Cluster: CSV Network Migration

Our main VM CSV is connected to our iSCSI on the wrong network.  We have three Cluster networks:

5.x - iscsi

40.x - cluster network / live migration

50.x - client vm network

For some reason the iSCSI is connected via the 50.x network at the moment on both of our CSVs, which means it's passing all that traffic through our router. The second CSV is just backups, so it was easy for me to shut down and reconnect to the iSCSI volume. Doing the same for the other CSV, which holds dozens of VMs, is not so easy and would require a scheduled outage on a weekend.

Is there a graceful way to make it roll over to the correct network without bringing everything offline? I found this article and set the metric for the right network, but how do I make it reconnect without dropping connections?

https://technet.microsoft.com/en-us/library/ff182335%28WS.10%29.aspx

Failover Clustering - EventID 2049

Hello, I've got a four-node Hyper-V 2012 R2 cluster using a CIFS share to store all of the virtual machines. It's working all right, but I do see thousands of similar log entries in the Microsoft-Windows-FailoverClustering/Diagnostic log:

Event ID: 2049 -

[RCM [RES] SCVMM OPS69 embedded failure notification, code=0 _isEmbeddedFailure=false _embeddedFailureAction=2

[RCM [RES] SCVMM RDS22 - SessionHost embedded failure notification, code=0 _isEmbeddedFailure=false _embeddedFailureAction=2

Does anybody know what this is referring to? Is there a fix for this?

Thanks in advance for any feedback!

Problem with csv

I am trying to migrate storage for Hyper-V VMs to a new CSV. When I try to move any VM's storage to the new target, I get the following error:

An error occurred while moving the virtual machine storage: 0x80071008 invalid parameter.

I have already unmounted the ISO on the VM.
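
The 0x80071008 value is an HRESULT. As a hedged sketch (the helper name `decode_hresult` is invented here), the value can be split into its failure bit, facility, and code. Facility 7 is FACILITY_WIN32, which means the low 16 bits carry a Win32 error number. The sketch does not attempt to name code 0x1008.

```python
FACILITY_WIN32 = 7  # HRESULTs of the form 0x8007xxxx wrap a Win32 error code


def decode_hresult(value):
    """Split a 32-bit HRESULT into failure bit, facility, and code."""
    return {
        "failure": bool((value >> 31) & 0x1),  # top bit set means failure
        "facility": (value >> 16) & 0x7FF,     # 11-bit facility field
        "code": value & 0xFFFF,                # low 16 bits
    }


result = decode_hresult(0x80071008)
print(result)
print(result["facility"] == FACILITY_WIN32)
```

Here the facility decodes to 7 and the code to 4104 (0x1008), so the number to look up is a Win32 error code rather than a Hyper-V-specific one.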

Creating a SMB Application share

I have two test servers (both Server 2012 R2 with the File Server Resource Manager role) that I'm trying to use to test Hyper-V clustering, and a FreeNAS box with a CIFS/SMB share configured. I can map a network drive from the servers and see the share contents, but when I try to create a share in File and Storage Services, the drive letter isn't listed. How can I get Windows to see the shared storage?

Failover Cluster Virtual Adapter - This device is not working, Error Code 31

Hi there,

I have two identical servers with 2012 R2 installed and am having an issue with the Failover Cluster Virtual Adapter on one of them. I must have Hyper-V installed for some other software I am running, but we are not running any VMs. I installed the Hyper-V role first, set up NIC teaming, and then installed the MS Failover Cluster feature. The problem is that one of the systems is showing an error with the Failover Cluster Virtual Adapter.

Opening Device Manager, the "Hyper-V Virtual Ethernet Adapter" is shown under the Network Adapter section with a yellow triangle, and the following message:

This device is not working properly because Windows cannot load the drivers required for this device. (Code 31)

I have since reinstalled MS FO Clustering, but that did not fix it. I found an article about disabling IP offloading on the physical NICs that are part of the team, but that did not resolve it either after an MS FO Clustering reinstall. I have just uninstalled the FO Cluster and Hyper-V roles and am trying a reinstall of both to see whether the problem is somewhere in Hyper-V.

The issue is similar to this thread: https://social.technet.microsoft.com/Forums/en-US/1cc9cc76-6e5f-4198-9936-c6ffcb67328c/hyperv-virtual-ethernet-adapter-not-working-probperly-code-31-on-host?forum=virtualmachingmgrhyperv

Thanks,


Two NLB clusters on two nodes?

Hi all,

I have some questions about NLB clustering.

I have 2 VMs, both running on Windows 2008 R2.

I would like to have BOTH virtual servers added to TWO NLB clusters.

Please have a look at the picture for more specific configuration info.

1st: Is it possible AT ALL to add 1 virtual server to 2 NLB clusters?

2nd: If this is supported, which configuration should be preferred (B?)

Thanks for your help

JBrew




Jelle

Clustered Tiered Storage Space - no optimization report

I have a clustered SOFS server and I am currently using CSVs for Hyper-V. I have enabled the optimization task and added the report per https://technet.microsoft.com/en-us/library/dn789160.aspx. The report does not generate. I have also tried to run the defrag command manually, per the TechNet article. The issue seems to be that, when using CSVs in a cluster, defrag has no drives to run against. Anything constructive you can suggest would be helpful.

How do you monitor the tiers in Storage Spaces on a cluster with CSVs?

Thanks,

Gary

 

Freezal

SQL Failover installation dilemma

Hi folks,

I'm in a dilemma. I installed a load-balancing cluster in a given domain and then installed a failover cluster on the same servers. Installing load balancing automatically adds the same IP address to the network adapters of both nodes. When you then try to install SQL Server, setup does not pass, because the failover cluster validation report is not 100% valid: a duplicate IP is detected during validation of the failover cluster. How can one work around this situation?


The complexity resides in the simplicity Follow me at: http://smartssolutions.blogspot.com

Temporarily mix Intel and AMD hosts in Hyper-V cluster for migration purposes

Hi, we currently have a Hyper-V cluster (2012 R2) running on old AMD-based hosts. We are going to replace these with new Intel-based servers.

Is the following scenario possible (though likely not supported)? 

  • join new (Intel-based) hosts to existing cluster
  • shut down all virtual machines
  • migrate all virtual machines to new hosts
  • remove old (AMD-based) hosts from cluster

This would be the easiest option. Otherwise I need to create a new cluster and either create new storage or connect the LUNs to the new cluster and import all VMs, which in any case is much more work and more risk.

ID 2051 FailoverClustering Errors

Good morning, 

I just took a quick look at Event Viewer on my Windows Server 2012 Standard server hosting Exchange 2013 and see the following errors, every 5 minutes for days now. I'm not aware of any updates being installed, so I'm not sure what could be causing this or, quite honestly, where to start troubleshooting.

The server is running on VMware Virtual Platform.  

Log Name:      Microsoft-Windows-FailoverClustering/Diagnostic
Source:        Microsoft-Windows-FailoverClustering
Date:          6/8/2015 10:55:33 AM
Event ID:      2051
Task Category: None
Level:         Error
Keywords:      
User:          SYSTEM
Computer:      XCH1Server.Domain.local
Description:
[RCM] [GIM] ResType Virtual Machine has no resources, not collecting local utilization info

Now that I look at all 6 of my Exchange servers running Windows Server 2012 Standard, I see the same in their event logs as well.

Any suggestions folks? 
