

Fault-Tolerance in HPCC

Comments and questions related to the Enterprise Control Language

Tue Jul 24, 2012 10:25 am

Hello,
It's urgent.

I am new to HPCC and don't understand how fault tolerance is achieved, especially for a Thor cluster.
How does mirroring take place?
What happens if a node goes down?
Is any manual intervention required?
Does failover happen automatically?

Please help.
Ankita Singla
 

Tue Jul 24, 2012 3:29 pm

Hi,

There are several ways that the HPCC System handles fault tolerance in relation to Thor.

First, regarding mirroring: when a system is configured to run on multiple nodes, the hpcc-mirror directory for node n is typically on node n+1. This is set up automatically when you configure the system through the configuration wizard.
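To make the n to n+1 layout concrete, here is a minimal Python sketch of the mapping. It is only an illustration, not HPCC code: the function names are made up, and the wrap-around from the last node back to node 1 is an assumption that the description above does not spell out.

```python
def mirror_node(n: int, node_count: int) -> int:
    """Return the 1-based node that holds the mirror directory for node n.

    Assumption (not stated in the thread): the last node's mirror
    wraps around to node 1.
    """
    if not 1 <= n <= node_count:
        raise ValueError(f"node {n} is outside 1..{node_count}")
    return n % node_count + 1


def mirror_layout(node_count: int) -> dict:
    """Map every node to the node that holds its mirror."""
    return {n: mirror_node(n, node_count) for n in range(1, node_count + 1)}


if __name__ == "__main__":
    for primary, mirror in mirror_layout(4).items():
        print(f"node {primary} -> mirror on node {mirror}")
```

With this layout, losing a single node still leaves a copy of its data on its neighbor, which is what makes swapping in a spare possible.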

Second, when a node goes down, you have the option to run the swapnode function, which swaps in a spare node to replace the failed one. However, you first have to configure this spare node and attach it to the Thor component with the configmgr tool.

Here are the steps to do so:

1. Add the spare node to the hardware section
2. Navigate to the Thor Component > Topology tab
3. Ensure that Write Access is enabled
4. Right-click on the master node and select add spares.

You can then select the extra nodes as spares. Additionally, you can make the swap node function work automatically by navigating to the SwapNode tab and setting the AutoSwapNode value to true. The spares are swapped in when Thor recycles and the thormaster notices that it cannot communicate with the slaves.
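As a rough mental model of the automatic swap, the Python sketch below simulates what happens at a recycle. It is a toy model under stated assumptions, not the actual thormaster logic: `swap_on_recycle`, the `reachable` callback, and the node names are all hypothetical, and spares are simply taken in list order.

```python
from typing import Callable, List, Tuple


def swap_on_recycle(
    slaves: List[str],
    spares: List[str],
    reachable: Callable[[str], bool],
    auto_swap: bool = True,
) -> Tuple[List[str], List[str], List[Tuple[str, str]]]:
    """Simulate one recycle: replace unreachable slaves with spares.

    Returns (new slave list, remaining spares, list of (failed, spare) swaps).
    Hypothetical model only; `auto_swap` stands in for AutoSwapNode.
    """
    new_slaves = list(slaves)
    remaining = list(spares)
    swaps: List[Tuple[str, str]] = []
    if not auto_swap:
        # Without automatic swapping, nothing changes until an operator acts.
        return new_slaves, remaining, swaps
    for i, node in enumerate(new_slaves):
        if reachable(node):
            continue
        if not remaining:
            # No spare left: the failed node stays in place.
            break
        spare = remaining.pop(0)
        new_slaves[i] = spare
        swaps.append((node, spare))
    return new_slaves, remaining, swaps


if __name__ == "__main__":
    down = {"node2"}
    result = swap_on_recycle(
        ["node1", "node2", "node3"], ["spare1"], lambda n: n not in down
    )
    print(result)
```

The point to notice is that the check runs only when the function is called, which mirrors the description above: the swap is tied to a recycle rather than to the moment the node fails.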

I found a similar post on the forum that hopefully answers any other questions you might have on redundancy.
viewtopic.php?f=16&t=46&hilit=swap+node
clo
 

Thu Aug 01, 2013 5:26 pm

Bringing an old thread back to life is so much fun.

clo wrote:
"You can then select the extra nodes as spares. Additionally, you can make this swap node function work automatically by navigating to the SwapNode tab and setting the AutoSwapNode value to true. The spares are swapped in when the Thor recycles and the thormaster notices that it cannot communicate with the slaves."

Obviously, one instance of "recycling" is restarting Thor itself. Does this happen on a timed basis, though? In other words, will Thor eventually notice that a node is out of communication and swap in the spare without human intervention?

Thanks,

Dan
DSC
Community Advisory Board Member

