Channel: High Availability (Clustering) forum
Viewing all 2783 articles

New-SRPartnership without drive letters?

I'm struggling to find a workaround for my shortage of drive letters when I try to get cluster/cluster storage replication ready:

I have my 2 clusters, each with 16x DATA LUNs and 16x LOG LUNs ready to be configured for 16 SR partnerships, so there is no way to assign a drive letter to every volume. Each cluster has the DATA LUNs configured as CSVs and mounted at the default location C:\ClusterStorage\..., and I can successfully use "-SourceVolumeName C:\ClusterStorage\DATA01". So far so good :-)

I followed the same logic and mounted all my LOG LUNs at C:\SAN\... . When I try to create the SRPartnership, I use "C:\SAN\LOGx" for the source/destination LogVolumeName (that works for Test-SRTopology, after all). But halfway through the New-SRPartnership cmdlet, I see the LOG disk being converted into a CSV too, and I assume its path changes to C:\ClusterStorage\LOGx? In any case, I then get "New-SRPartnership : Unable to create replication group RG-DATA01, detailed reason: Unable to resolve log path C:\SAN\LOG01\ on this computer".

I assume I could start assigning drive letters to the 16 LOG volumes I have? But with 16 of the ~20 available drive letters already used on day 1, that seems a bit limiting for our future here. Manually mounting the LOG volumes under C:\ClusterStorage before running New-SRPartnership feels a bit risky and like a supportability issue waiting to happen. Any better ideas? What's the preferred/supported way to handle this?

Thanks a lot for tips, tricks, and solutions of any kind.
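One avenue worth testing: Storage Replica can reference volumes by GUID path rather than by drive letter or mount point, which sidesteps the letter shortage entirely. A minimal sketch, assuming the LOG volumes carry file-system labels like LOG01 (all server names, group names, and labels here are placeholders for your environment, not values from the post):

```powershell
# Resolve each log volume to its \\?\Volume{GUID}\ path - no drive letter needed.
$srcLog = (Get-Volume -FileSystemLabel "LOG01").Path
$dstLog = Invoke-Command -ComputerName CLUB-NODE1 -ScriptBlock {
    (Get-Volume -FileSystemLabel "LOG01").Path
}

# Pass the GUID paths as the log volume names.
New-SRPartnership -SourceComputerName CLUA-NODE1 -SourceRGName RG-DATA01 `
    -SourceVolumeName "C:\ClusterStorage\DATA01" -SourceLogVolumeName $srcLog `
    -DestinationComputerName CLUB-NODE1 -DestinationRGName RG-DATA01-DR `
    -DestinationVolumeName "C:\ClusterStorage\DATA01" -DestinationLogVolumeName $dstLog
```

This keeps the LOG volumes letterless and avoids hand-mounting anything under C:\ClusterStorage yourself.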

WSFC Virtual IPs and GARP

Where I work we are setting up a new system and I am experiencing a problem I am hoping others have experienced and can offer some recommendations on.

I have set up a new Windows Server 2012 R2 2-node WSFC that is hosting a SQL Server AlwaysOn Availability Group. I have set up AGs before and have never really had problems with them; however, now I am starting to see an issue. Our WSFC nodes are on one VLAN and subnet, and the clients that connect to SQL (SharePoint servers in this case) are in another VLAN and subnet. Whenever a failover of the AG occurs, the SharePoint servers lose connectivity to the AG for approximately 13-15 minutes before connectivity is restored. It is not a routing issue, as continuous pings to the WSFC nodes and the WSFC IP never fail; only pings and other connection attempts to the AG listener do. It is also not a problem with SQL, as I can verify connectivity to a specific AG node.

After looking at this problem with our networking guys, it appears to be related to gratuitous ARP (GARP). When failover of the AG occurs, a GARP request is supposed to update devices on the local network with the new IP/MAC mapping so that requests to that IP are sent to the right network interface. However, I work in a federal government environment, and DISA STIGs mandate that infrastructure routers disable GARP (V-5618 for those that are interested: https://www.stigviewer.com/stig/infrastructure_router__cisco/2013-10-08/finding/V-5618). This seems to have the effect of traffic from outside the local network timing out after a failover until the entry in the VLAN's ARP table expires and gets updated. Disabling that setting for testing showed AG failovers being near instantaneous.

Has anyone encountered this or have any recommendations? I am trying to make the case for multiple NICs on the SharePoint servers so that they can communicate with the SQL servers on the same VLAN, but I am getting pushback and am trying to see if there is an alternative solution that can be researched.

Joie Andrew "Since 1982"

Can I configure fail-over cluster between 2012 R2 and 2016?


My old server runs Windows Server 2012 R2 and is a Hyper-V host.

I bought a new server for Hyper-V high availability, along with a new Windows Server license: a Windows Server 2016 license.

So, can I install Windows Server 2016 on the new server and configure a cluster with the old server?
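For context: mixing 2012 R2 and 2016 nodes is supported only as a transitional state under Cluster OS Rolling Upgrade — the usual pattern is to create (or keep) the cluster on 2012 R2, join the 2016 node, and raise the functional level once every node runs 2016. A sketch of the level check and commit, with the cluster name as a placeholder:

```powershell
# Check the current cluster functional level (8 = 2012 R2 mixed mode, 9 = 2016).
(Get-Cluster -Name MyCluster).ClusterFunctionalLevel

# Once ALL nodes run Windows Server 2016, commit the upgrade.
# Note: this step is irreversible - no going back to mixed mode afterwards.
Update-ClusterFunctionalLevel -Cluster MyCluster
```

Mixed mode is meant as a bridge during the upgrade, not a permanent configuration.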

Storage spaces direct - STATUS_IO_TIMEOUT


Hi Everyone,

I configured a cluster of 4 Supermicro X10DRI servers to test Storage Spaces Direct. Supermicro server configuration:

- 2 x Intel Xeon E5-2640v3 CPU
- 10 x 32GB 2133MHz DDR4 RAM
- 2 x SSD 400GB SATA Intel S3610
- 2 x HDD 2TB SATA Seagate ST2000NX0253
- 6 x SSD 800GB SATA Intel S3610
- Intel X520-DA2 E10G42BTDA Network adapter

All was fine; the cluster and S2D were working without any problems. But after a month, I replaced the Intel X520-DA2 E10G42BTDA network adapter with a Mellanox ConnectX-3 Pro Ethernet adapter. After that, I started receiving warnings in the Cluster Events log:

"Cluster Shared Volume 'VD1' ('Cluster Virtual Disk (VD1)') has entered a paused state because of 'STATUS_IO_TIMEOUT(c00000b5)'. All I/O will temporarily be queued until a path to the volume is reestablished."

I checked the health of the physical disks - they are fine. I also checked the network and found about 500K received discarded packets per server. It looks pretty bad, but I'm not sure. Is it a problem? Or should I continue my investigation?
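Half a million discarded receives is definitely worth chasing. With Mellanox ConnectX adapters under S2D, one common culprit (assumed here, not confirmed by the post) is RDMA over Converged Ethernet running without Priority Flow Control configured end-to-end. A hedged sketch of where to look first:

```powershell
# Per-adapter discard counters - confirm which adapter is dropping traffic.
Get-NetAdapterStatistics |
    Format-Table Name, ReceivedDiscardedPackets, OutboundDiscardedPackets

# Is RDMA active on the Mellanox ports?
Get-NetAdapterRdma | Format-Table Name, Enabled

# If RoCE is in use, PFC must be enabled for the SMB traffic class on both the
# hosts and the physical switches - mismatches show up exactly as discards.
Get-NetQosFlowControl
Get-NetQosPolicy
```

If the discards line up with the adapter swap, comparing the DCB/PFC state against the switch configuration is a reasonable next step before blaming the disks.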

Cannot validate cluster "You do not have administrative privileges on the server "


Hi


 Let me describe the problem:

I have to build a two-node failover cluster. I have the two nodes in a domain whose domain controller is installed on a third computer. I am logged on with a domain admin account. When running the validation tests, I get a message that I do not have administrative privileges on the other node.

Please show me a way to fix this.

Regards.


a validation message
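Validation runs under the logged-on account and needs local Administrators membership on every node, so it is worth confirming that the domain admin account actually lands in each node's local group (Group Policy restricted groups can strip it) and that the console is elevated. A sketch, with node names as placeholders:

```powershell
# Verify the account is a local administrator on each node.
Invoke-Command -ComputerName NODE1, NODE2 -ScriptBlock {
    net localgroup Administrators
}

# Then re-run validation from an elevated PowerShell session.
Test-Cluster -Node NODE1, NODE2
```

If the membership looks right, UAC is the next suspect: an un-elevated session produces exactly this "no administrative privileges" message.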

Failover Clustering Fails to Start

Good day all,

Hopefully someone can assist. I've been running around in circles.

We had a 2-node cluster configured; then the cluster was destroyed (not the correct way): the 2 servers were shut down and formatted. Now I'm trying to re-configure a 2-node cluster again and am running into numerous issues.

Both servers have been rebuilt with Server 2008 Enterprise SP2 - identical servers, both connected to a SAN for the disk quorum.

Clustering services were installed successfully on both nodes, and validation comes back 100% on both nodes. However, when trying to start the Cluster service, it fails with Event IDs 7024 and 1090.

Event ID 7024: Service Specific Error 2
Event ID: 1090: Cluster Service cannot be started on this node because a registry operation failed with error '2'

The user account has admin rights on the computer and network domain.
I've tried uninstalling the Failover Clustering feature.
Even when setting the service to Automatic and then trying to create the cluster, it says the computer is already part of a cluster.
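The "already part of a cluster" message suggests stale cluster state survived the rebuild (or was re-detected from the shared disks). On Server 2008 the documented way to purge leftover cluster configuration from a node is the force-cleanup switch; a sketch:

```powershell
# Run on each affected node. On 2008 R2 and later, Clear-ClusterNode -Force
# is the cmdlet equivalent of this cluster.exe call.
cluster.exe node $env:COMPUTERNAME /forcecleanup
```

After the cleanup completes on both nodes, re-running validation and then Create Cluster from scratch is the usual path.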

Cluster Group Failed after Network Lost

hi all,

I have a two node cluster with Windows Server 2008 Enterprise and SQL Server 2008.
The problem is that when the active node loses its LAN connection, the "Cluster Group" in Cluster Core Resources does not automatically transfer to the other node.

Event ID-1205    Resource Control Manager    The Cluster service failed to bring clustered service or application 'DTC01-A' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered service or application.

Event ID-1069    Resource Control Manager    Cluster resource 'IP Address 10.20.51.122' in clustered service or application 'DTC01-A' failed.

Any help and possible solutions are appreciated.

Thanks,
k.rijal






Cluster-Aware Updating Readiness "Errors"


I have a two-node Windows Server 2012 failover cluster.  The Windows firewall is disabled on both nodes.

When I log on to one of the nodes (bcs-vmhyperv2), and run the Cluster-Aware Updating tool to analyze the cluster readiness, I receive this result:

When I log on to the other node and run the tool, I receive the same two errors. The problem computer is always the local computer.

I know that PowerShell remoting and WINRM are enabled!  So, the "resolution" steps don't help.

Here's proof:

If I log on to a different Windows Server 2012 system (not one of the cluster nodes), and run the tool, I receive no errors:

In fact, I set up CAU and used it to apply the latest set of Windows Updates from the third computer.

Why can't I use it from the cluster nodes?  How do I fix it?


-Tony
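This matches a known quirk: the CAU readiness check run on a cluster node tends to flag the local machine because the loopback remoting path behaves differently from a true remote connection, while the same checks pass from a management machine. Since updates already apply cleanly from the third computer, one option is simply to keep driving CAU from there, which can also be scripted (cluster name is a placeholder):

```powershell
# From the working remote management machine: preview which updates each node
# would receive, then run an updating pass.
Invoke-CauScan -ClusterName MyCluster -Verbose
Invoke-CauRun -ClusterName MyCluster -MaxFailedNodes 1 -RequireAllNodesOnline -Force
```

Alternatively, adding the CAU clustered role for self-updating mode removes the need to run the tool interactively from any node at all.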


Unable to configure server 2016 cluster aware updating.


Initially I got a WinRM error when trying to set options in the CAU GUI, so I ran Test-CauSetup, which reported the same error after rule 11 below. It is also reporting several other failures, though, which are bogus. Rule 3 gives an error saying PSRemoting is not enabled, yet it is, and I've even tried "enabling" it using -Force. Rule 2 I'm not positive about, but I know WinRM is running and configured, and the firewall is off. Rule 11 is failing on the cluster name, which is not a server...

To test PSRemoting, I ran this command on the other cluster node and it works, BUT if you run it targeting the same machine you are on, it fails. This happens on both nodes, and both nodes give all the same errors.

Invoke-Command -ComputerName 90vmhost5 -ScriptBlock { Get-ChildItem c:\ } -credential CM

PS C:\Windows\system32> Test-CauSetup


RuleId          : 1
Title           : The failover cluster must be available
State           : Started
Severity        : Info
FailedMachines  : {}
PercentComplete : 1

RuleId          : 1
Title           : The failover cluster must be available
State           : Succeeded
Severity        : Info
FailedMachines  : {}
PercentComplete : 5

RuleId          : 2
Title           : The failover cluster nodes must be enabled for remote management via WMIv2
State           : Started
Severity        : Info
FailedMachines  : {}
PercentComplete : 6

RuleId          : 2
Title           : The failover cluster nodes must be enabled for remote management via WMIv2
State           : Failed
Severity        : Error
FailedMachines  : {90VMHost5}
PercentComplete : 11

RuleId          : 3
Title           : Windows PowerShell remoting should be enabled on each failover cluster node
State           : Started
Severity        : Info
FailedMachines  : {}
PercentComplete : 12

RuleId          : 3
Title           : Windows PowerShell remoting should be enabled on each failover cluster node
State           : Failed
Severity        : Error
FailedMachines  : {90VMHost5}
PercentComplete : 16

RuleId          : 4
Title           : The failover cluster must be running Windows Server 2012
State           : Started
Severity        : Info
FailedMachines  : {}
PercentComplete : 17

RuleId          : 4
Title           : The failover cluster must be running Windows Server 2012
State           : Succeeded
Severity        : Info
FailedMachines  : {}
PercentComplete : 21

RuleId          : 5
Title           : The required versions of .NET Framework and Windows PowerShell must be installed on all
                  failover cluster nodes
State           : Started
Severity        : Info
FailedMachines  : {}
PercentComplete : 22

RuleId          : 5
Title           : The required versions of .NET Framework and Windows PowerShell must be installed on all
                  failover cluster nodes
State           : Succeeded
Severity        : Info
FailedMachines  : {}
PercentComplete : 45

RuleId          : 6
Title           : The Cluster service should be running on all cluster nodes
State           : Started
Severity        : Info
FailedMachines  : {}
PercentComplete : 46

RuleId          : 6
Title           : The Cluster service should be running on all cluster nodes
State           : Succeeded
Severity        : Info
FailedMachines  : {}
PercentComplete : 55

RuleId          : 7
Title           : Automatic Updates must not be configured to automatically install updates on any failover
                  cluster node
State           : Started
Severity        : Info
FailedMachines  : {}
PercentComplete : 56

RuleId          : 7
Title           : Automatic Updates must not be configured to automatically install updates on any failover
                  cluster node
State           : Succeeded
Severity        : Info
FailedMachines  : {}
PercentComplete : 64

RuleId          : 8
Title           : The failover cluster nodes should use the same update source
State           : Started
Severity        : Info
FailedMachines  : {}
PercentComplete : 65

RuleId          : 8
Title           : The failover cluster nodes should use the same update source
State           : Succeeded
Severity        : Info
FailedMachines  : {}
PercentComplete : 73

RuleId          : 9
Title           : A firewall rule that allows remote shutdown should be enabled on each node in the failover
                  cluster
State           : Started
Severity        : Info
FailedMachines  : {}
PercentComplete : 74

RuleId          : 9
Title           : A firewall rule that allows remote shutdown should be enabled on each node in the failover
                  cluster
State           : Succeeded
Severity        : Info
FailedMachines  : {}
PercentComplete : 82

RuleId          : 10
Title           : The machine proxy on each failover cluster node should be set to a local proxy server
State           : Started
Severity        : Info
FailedMachines  : {}
PercentComplete : 83

RuleId          : 10
Title           : The machine proxy on each failover cluster node should be set to a local proxy server
State           : Failed
Severity        : Warning
FailedMachines  : {90VMHost5, 90VMHost6}
PercentComplete : 91

RuleId          : 11
Title           : The CAU clustered role should be installed on the failover cluster to enable self-updating
                  mode
State           : Started
Severity        : Info
FailedMachines  : {}
PercentComplete : 92

RuleId          : 11
Title           : The CAU clustered role should be installed on the failover cluster to enable self-updating
                  mode
State           : Failed
Severity        : Warning
FailedMachines  : {VMCluster}
PercentComplete : 100

WARNING: <f:WSManFault xmlns:f="http://schemas.microsoft.com/wbem/wsman/1/wsmanfault" Code="2150859027" Machine="90VMHost5.pacific.local"><f:Message>The WinRM client sent a request to an HTTP server and got a response saying the requested HTTP URL was not available. This is usually returned by a HTTP server that does not support the WS-Management protocol. </f:Message></f:WSManFault>
WARNING: Connecting to remote server 90VMHost5 failed with the following error message : The WinRM client sent a request to an HTTP server and got a response saying the requested HTTP URL was not available. This is usually returned by a HTTP server that does not support the WS-Management protocol. For more information, see the about_Remote_Troubleshooting Help topic.
WARNING: The WinRM client sent a request to an HTTP server and got a response saying the requested HTTP URL was not available. This is usually returned by a HTTP server that does not support the WS-Management protocol.
ModelId           : Microsoft/Windows/ClusterAwareUpdating
SubModelId        :
Success           : True
ScanTime          : 2/1/2017 4:10:32 PM
ScanTimeUtcOffset : -08:00:00
Detail            : {90VMHOST5, 90VMHOST5}
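The "requested HTTP URL was not available" fault usually means the WinRM listener is missing or something else has claimed WinRM's HTTP URL reservation, which would explain why loopback connections fail while the service itself runs. A sketch of checks worth running on the failing node:

```powershell
# Does the WS-Management stack answer at all, locally and by name?
Test-WSMan
Test-WSMan -ComputerName 90VMHost5

# List the configured WinRM listeners and their ports.
winrm enumerate winrm/config/listener

# Check whether another application holds a URL ACL on WinRM's default
# HTTP port (5985) - a conflicting reservation produces exactly this fault.
netsh http show urlacl | findstr 5985
```

If the listener is missing, `winrm quickconfig` recreates it; if another reservation is squatting on the port, that application is the thing to chase.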

Three VMs are "stuck" (locked) on one node.


I'm trying to do maintenance on one node running WS2012 Datacenter. All but three VMs moved over successfully. I think this is related to a failed DPM backup job, since I hadn't enabled serialized CSV backups. I tried restarting the Volume Shadow Copy service, but it's stuck in the "Stopping" state in services.msc. I see that VSSVC.exe is still running, but I'm not sure how dangerous it is to kill this process. The VMs have a status of Running (Locked) in Failover Cluster Manager. I tried turning off one of the VMs, but I still wasn't able to override its locked status. Hyper-V Manager reports the VM as stopped, but Failover Cluster Manager reports it as still Running (Locked).

[Main Instruction]
Failed to Live migrate virtual machine 'SCVMM sql2 Resources'
[Expanded Information]
The action 'Move' did not complete.
Error Code: 0x80070057
The parameter is incorrect

Not sure on how to solve this. Any suggestions? 

 

Access denied in Failover Cluster manager for non-domain admin with read-only or full control on cluster


Hello all,

I observe the following:

When giving a user explicit permissions to the cluster with Grant-ClusterAccess <domain>\<user>,

The user cannot connect to the cluster in the failover cluster manager. 

Managing the cluster via PowerShell cmdlets does work; no messages about administrative privileges are shown. Only FCM cannot connect for this user. What can I do to fix this?


You know you're an engineer when you have no life and can prove it mathematically
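One likely explanation: Failover Cluster Manager connects to the nodes over WMI/DCOM and effectively needs local Administrators membership there, while Grant-ClusterAccess only governs the cluster API that the PowerShell cmdlets use - hence the asymmetry observed above. A sketch of how to verify and, if acceptable, work around it (domain and user names are placeholders):

```powershell
# Confirm the cluster-level grant actually took effect.
Get-ClusterAccess

# FCM additionally needs local admin rights on the nodes themselves; one
# option is to add the operator to the local Administrators group per node.
Invoke-Command -ComputerName NODE1, NODE2 -ScriptBlock {
    net localgroup Administrators "CONTOSO\clusteroperator" /add
}
```

If granting local admin is not acceptable, keeping the user on the PowerShell cmdlets (which honor Grant-ClusterAccess, including read-only grants) is the lower-privilege path.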

Disabling needless network prevents Live migration.


I am trying to get rid of cluster networks that are no longer needed. I have disabled them on the servers, and they disappear from Failover Cluster Manager. However, live migration fails with "Failed to get the network address for node 'node name': A cluster network is not available for this operation. (0x000013AB)". There is one cluster network between all the nodes.

Can anyone suggest how to fix this?
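That error typically means live migration is still configured to prefer a network the cluster no longer has. A sketch of where the stale reference usually hides (the parameter names below are the private properties of the Virtual Machine resource type, as I understand them):

```powershell
# Networks the cluster still knows about, with their roles
# (1 = cluster only, 3 = cluster and client, 0 = none).
Get-ClusterNetwork | Format-Table Name, State, Role

# Live migration's network preference/exclusion lists - a GUID pointing at a
# removed network here would explain the 0x000013AB failure.
Get-ClusterResourceType -Name "Virtual Machine" |
    Get-ClusterParameter -Name MigrationNetworkOrder, MigrationExcludeNetworks
```

Re-selecting the surviving network under Live Migration Settings in Failover Cluster Manager (Networks pane) rewrites these lists and is often all that is needed.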

Node in cluster - status changes to "paused"


We have seven Windows 2012 R2 nodes in a Hyper-V cluster. They are all identical hardware (HP BladeSystem). For a while, we had only six nodes, and there were no problems.

Recently, we added the seventh node, and its status keeps reverting to "Paused". I can't find any errors that directly point to why this is happening - either in the System or Application log of the server, in the various FailoverClustering logs, or in the Cluster Event logs. I created a cluster.log using the Get-ClusterLog cmdlet, but if it explains why this is happening, I can't figure it out (it's also a very large file - 150 MB - so it's difficult to determine which lines are important).

As far as I can tell, everything on this new node is the same as the previous ones - the software versions, network settings, etc. The Cluster Validation report also doesn't give me anything helpful.

Any ideas on how to go about investigating this? Even before I can solve the problem, I'd like to at least know when and why the status reverts to paused.

Thanks,

David
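Two things that may narrow this down: generating only the minutes of cluster log around a pause event (instead of the whole 150 MB), and checking whether some automation - Cluster-Aware Updating and management agents are common suspects, though that is an assumption here - is issuing the pause. A sketch, with the node name as a placeholder:

```powershell
# Capture only the last 15 minutes of cluster log for the new node,
# generated right after you notice it flip to Paused.
Get-ClusterLog -Node NODE7 -TimeSpan 15 -Destination C:\Temp

# Check the node state and bring it back when it pauses.
Get-ClusterNode -Name NODE7 | Format-Table Name, State
Resume-ClusterNode -Name NODE7
```

In the trimmed log, searching for "pause" shows which component requested the transition, which is the fact needed before the root cause can be assigned.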

Windows NLB - Multicast - When adding second host, both stop communicating with each other


Hey,

I ran into a problem today, when trying to configure NLB for 2 Hyper-V servers.

Both Hyper-V servers have MAC spoofing enabled. They are also on different subnets, stationed in different locations.

When creating the cluster and adding 1 host to it - either the 1st or 2nd server - it works fine; I can ping the host and the cluster.

Then, when adding the second host, they suddenly lose connection to each other and I can't ping one from the other, though they are reachable from any other machine. This does not affect the cluster, as it is still reachable from both hosts.

I even tried creating the cluster for those 2 hosts from a 3rd, unrelated machine, but it all acts crazy: each host shows the same cluster but lists only its own host name in it, while the cluster shows both hosts when managed from the 3rd machine.

I hope this explanation makes sense, and I would really appreciate your support!

Missing Cluster Services DNS Record

On a Windows Server 2008 server, I set up Print and File Server cluster services. It appears the NetBIOS names are published but the DNS names are not. The NetBIOS names only work when you are on the same subnet; therefore, we need the records in DNS. I found the screen to publish the PTR records (properties of the server name, middle panel in Failover Cluster Manager) and enabled "Publish PTR Records". I can't restart the Windows 2008 servers in the cluster or the cluster services at this time, and the records are still missing. Would the records get created when the servers are rebooted? Or do I need to create them manually?
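For background: the A records come from each clustered Network Name resource dynamically registering itself in DNS, so cycling just that resource (not the whole server or cluster service) usually triggers re-registration. A sketch, where "FSNetworkName" stands in for whatever your file server's Network Name resource is called:

```powershell
# Server 2008: briefly cycle the service's Network Name resource.
# This interrupts only that clustered service, not the nodes.
cluster.exe res "FSNetworkName" /offline
cluster.exe res "FSNetworkName" /online

# On Server 2012 and later, re-registration is exposed directly:
Update-ClusterNetworkNameResource -Name "FSNetworkName"
```

If the records still do not appear, checking that the DNS zone allows dynamic updates from the cluster computer objects is the next thing to verify; creating the records manually also works but then they will not be maintained automatically.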

VM Status Failed - Can't live migrate, move storage, etc. Failover Cluster general issues


Hello Technet,

I have a 4-node Server 2016 failover cluster using S2D CSVs. Last week I ran into some issues installing S2D management packs on our Operations Manager instance. This appears to have affected the S2D storage and the failover cluster. The initial problem appeared to be that the volume filled up when VM storage expanded to 100% of the volume. Deleting a test VM cleared up space and resolved that issue; however, some problems persist.

I have one failed VM in the cluster that I can't bring online. I've tried to move the VHDs via the FCM GUI to another S2D volume, but the VM is greyed out. I can't start or stop the role or the VM, or start/stop the VM's OS. I've attached the repeating log entries for the cluster attempting to start the role/VM.

Can someone recommend my next troubleshooting steps?


Disk cluster - local unavailable, iSCSI only?


Windows Server 2016

I have created a two-node cluster for Remote Desktop Services, and I would like to create a Cluster Shared Volume for those services. The servers passed the validation tests with warnings (mostly pertaining to DHCP on some NICs), and I created the cluster without any problems.

The systems involved have hardware RAID6 with plenty of storage space. On the RAID6 array I've created a 3TB VHDX on each machine and mounted it, thinking that this would suffice for creating a cluster disk and a subsequent CSV. However, when I go to add disks to the cluster storage, I get the "No disks suitable for cluster disks were found" error message. All the research I've found calls for a separate iSCSI server hosting the disks, which I don't have in this small environment.

How do I overcome this limitation? If this configuration isn't supported, how do I get the two drives to replicate to each other as a CSV would?
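The error is expected: classic cluster disks must be storage that both nodes can see simultaneously, and a locally mounted VHDX is visible only to its own host. On Server 2016 the built-in alternative for local disks is Storage Spaces Direct, with the significant caveat that S2D wants disks passed through directly (HBA/pass-through mode), so a hardware RAID6 array may not qualify, and Datacenter edition is required. A sketch under those assumptions:

```powershell
# Sketch - assumes 2016 Datacenter and raw pass-through disks on both nodes,
# NOT a hardware RAID6 volume. Run against the existing cluster.
Enable-ClusterStorageSpacesDirect

# Carve a mirrored CSV out of the pool S2D creates (size is illustrative).
New-Volume -FriendlyName "CSV01" -FileSystem CSVFS_ReFS `
    -StoragePoolFriendlyName "S2D*" -Size 1TB
```

If reconfiguring the controllers is not an option, Storage Replica (also 2016 Datacenter) can replicate a volume between the two nodes instead, though that gives an active/passive copy rather than a true CSV.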


Unable to add shared disk to 2012 cluster


Hello, I have three shared disks coming from an IBM V7000.

All three disks are visible and mountable by both cluster nodes, but when adding them to the cluster, I see:

disk 1) added as the quorum device

disk 2) I am able to add it to the cluster

disk 3) I can see it in Server Manager - Storage and I am able to bring it online on both servers, but it is not present in the disk list when I try to add it to the cluster.

Cluster validation is successful (except for some warnings).

The cluster nodes are virtual machines on ESXi 5.5, and I followed best practices for RDM compatibility configuration.

How can I debug the problem and try to understand where the issue is?

Any hint is appreciated.
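A first step is to compare what the cluster itself considers eligible with what Server Manager shows - a disk only appears in the add list if it is online, not already clustered, visible to all nodes, and on a supported bus type. A sketch:

```powershell
# Disks the cluster considers eligible right now - if disk 3 is absent here,
# the cluster has a concrete reason for rejecting it.
Get-ClusterAvailableDisk

# If the third disk does show up, add it directly.
Get-ClusterAvailableDisk | Add-ClusterDisk

# Storage-focused validation usually names the exact disqualifying condition
# (reservation, partition style, bus type, and so on) per disk.
Test-Cluster -Include "Storage"
```

With RDMs on ESXi, the storage section of the validation report is where SCSI-3 persistent reservation problems - a common cause of exactly this symptom - get called out explicitly.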


Replace Quorum Disk on Failover Cluster on different Storage LUN


Hello Team,

We are planning to move the failover cluster quorum disk from one storage array to a LUN on another array. Kindly help us carry out this activity without any failure.

thanks.
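The witness can be swapped online: present the new LUN to all nodes, add it as a cluster disk, repoint the quorum at it, then retire the old disk. A sketch for Server 2012 and later (resource names are placeholders; on 2008 R2 the equivalent is Set-ClusterQuorum -NodeAndDiskMajority):

```powershell
# 1. Add the new LUN as a cluster disk - it appears as e.g. "Cluster Disk 3".
Get-ClusterAvailableDisk | Add-ClusterDisk

# 2. Repoint the quorum witness at the new disk. This is an online operation.
Set-ClusterQuorum -DiskWitness "Cluster Disk 3"

# 3. Once the witness has moved, remove the old quorum disk from the cluster.
Remove-ClusterResource -Name "Cluster Disk 1"
```

Verifying quorum health with Get-ClusterQuorum before unmapping the old LUN from the storage side is a sensible final check.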

Cluster Network for guest clusters


Is there a downside to sharing the cluster network between guest cluster(s) and/or a Hyper-V host cluster?

I'm assuming that the only issue would be bandwidth contention (e.g. during a live migration).


