Replica stuck in DYING, DEAD or TRANSITION ON State

  • 7003126
  • 27-Apr-2009
  • 27-Apr-2012

Environment

Novell Directory Services
Novell NetWare 5.x
Novell NetWare 4.x

Situation

Tried to delete the replica(s) from the problem server in NDS Manager.
Replica is stuck in a DYING state.
Error: -672 Inconsistent replica ring reported in DSREPAIR | Report Synchronization Status
Master replica does not see the server with dying replica in its replica ring.
Replica stuck in DYING, DEAD or TRANSITION ON State
Replica stuck in NEW state.
Replica stuck in a TRANSITION ON state
Replica not advancing.
Server not communicating properly with the rest of the tree.
Server holding synchronization because of object corruption.
Server getting -672 error because its replica ring information is inconsistent with other servers.
Server getting -761 error.
Object name changed to "1_2" (corrupt replica) after creating or renaming an object.
The only replica on a server is stuck in a DYING state.

Resolution

There are several ways to troubleshoot replicas stuck in a DYING, DEAD, or TRANSITION ON state.

1.  The best solution is to be patient.  Depending on how big the replica is, deleting or adding a replica can take a long time.  Adding a replica to a server across a slow WAN link can also take a long time.

2.  Check to make sure communication is functioning properly in the tree and that all servers are UP in the replica ring in question.  A DOWN server in the replica ring can cause a replica to become stuck in a DYING, DEAD, or TRANSITION ON state.    

3.  Make sure time is synchronized across the network.  A server having a problem with TIMESYNC can cause synchronization problems.

4.  Run DSREPAIR | Report Synchronization Status and check for errors.  These errors may need to be resolved before the replica will turn ON or be deleted.

5.  If the DSREPAIR report is vague, DSTRACE may give more detail.  Force a heartbeat by typing the following at the server console:

SET DSTRACE = NODEBUG
SET DSTRACE = +S
SET DSTRACE = *U
SET DSTRACE = *H

Look for possible errors that may be causing the problem.  -603 and -673 errors are common when adding a replica.  -672 errors are common when deleting a replica.

6.  NEVER use the DESTROY SELECTED REPLICA option in DSREPAIR.  This will leave your replica rings inconsistent and can leave orphaned Subordinate Reference replicas on the server.  Always follow the proper guidelines for removing a replica from a server by using NDS Manager or ConsoleOne.  These are the ONLY utilities that should be used.
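The error codes mentioned in step 5 can be tallied from a saved copy of the trace rather than watched for on screen. The following is a minimal sketch in Python, run on a workstation, assuming the trace output has already been saved to a text file by whatever means is available. The names `count_error_codes` and `summarize` and the sample lines are hypothetical, and the assumption that codes appear as a minus sign followed by three digits may not match every trace format.

```python
import re
from collections import Counter

# Codes named in this document. The notes paraphrase the document;
# they are not taken from an official error table.
CODES_OF_INTEREST = {
    -603: "common when adding a replica",
    -673: "common when adding a replica",
    -672: "common when deleting a replica",
    -761: "listed under Situation",
}

# Assumption: a code appears as a minus sign followed by exactly three
# digits, and is not glued to a preceding word character or hyphen.
ERROR_CODE = re.compile(r"(?<![\w-])-(\d{3})(?!\d)")


def count_error_codes(lines):
    """Count occurrences of the codes of interest in an iterable of lines."""
    counts = Counter()
    for line in lines:
        for match in ERROR_CODE.finditer(line):
            code = -int(match.group(1))
            if code in CODES_OF_INTEREST:
                counts[code] += 1
    return counts


def summarize(counts):
    """Return one readable line per code found, lowest code first."""
    return [
        f"{code}: {n} occurrence(s), {CODES_OF_INTEREST[code]}"
        for code, n in sorted(counts.items())
    ]


# Made-up sample lines for illustration, not real DSTRACE output.
sample = [
    "SYNC: start outbound sync for partition (hypothetical line)",
    "SYNC: failed, error -672",
    "SYNC: failed, error -672",
    "SKULK: replica state check returned -673",
    "unrelated value 10-6030",
]

if __name__ == "__main__":
    for row in summarize(count_error_codes(sample)):
        print(row)
```

To use it on a real capture, pass an open file object instead of `sample`, for example `count_error_codes(open("trace.txt", errors="replace"))`, where `trace.txt` is a placeholder name.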

As a LAST RESORT, if all of the above options fail, it is possible to manually remove ALL replicas from a server, but the procedure is very destructive and can have MAJOR repercussions in the tree if not done correctly.  For more information on this procedure, see TID #7001592 - Manually Removing All Replicas From a Server; DSREPAIR -XK2.

Additional Information

Replica ring is inconsistent.
NDS data corruption is usually caused by a power outage, a critical abend, or a hardware failure.
Formerly known as TID# 10019369
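
An inconsistent replica ring, as described above, means that the servers holding replicas of a partition do not agree on which servers are in the ring. The sketch below, in Python, illustrates the idea only. It assumes the ring membership seen by each server has been written down by hand from the DSREPAIR reports; the function name `ring_discrepancies` and the server names FS1, FS2 and FS3 are hypothetical.

```python
def ring_discrepancies(views):
    """Compare each server's view of a replica ring.

    `views` maps a server name to the set of servers that server lists
    in the ring. Returns a dict mapping each server to the sorted list
    of ring members it fails to list. An empty dict means every view
    agrees.
    """
    # Every server mentioned anywhere is treated as a ring member.
    all_servers = set(views)
    for members in views.values():
        all_servers |= members

    problems = {}
    for viewer, members in sorted(views.items()):
        missing = sorted(all_servers - members)
        if missing:
            problems[viewer] = missing
    return problems


# Hypothetical data: FS1 holds the Master replica and does not list FS3,
# while FS3 still believes it is part of the ring.
views = {
    "FS1": {"FS1", "FS2"},
    "FS2": {"FS1", "FS2", "FS3"},
    "FS3": {"FS1", "FS2", "FS3"},
}

if __name__ == "__main__":
    for server, missing in ring_discrepancies(views).items():
        print(f"{server} does not list: {', '.join(missing)}")
```

In this made-up case the result points at FS1, which matches the symptom where the Master replica does not see the server with the dying replica in its replica ring.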