ndsconfig failed to upgrade and start eDirectory

  • 7002557
  • 05-Feb-2009
  • 27-Apr-2012

Environment

Novell eDirectory 8.8 for Linux
Novell Open Enterprise Server 2 (OES 2)
Novell Open Enterprise Server 2 SP1 (OES 2 SP1)

Situation

Purpose
Upgrade OES2 to OES2 SP1

Symptoms
The upgrade process stops while trying to upgrade eDirectory and displays the following error:

ndsconfig failed to upgrade and start eDirectory
[...]
Checking if server is ready to service requests... unknown error 1 (1 hex).1

The instance at /etc/opt/novell/eDirectory/conf/nds.conf is upgraded successfully.

ERROR: /opt/novell/eDirectory/bin/ndsconfig return value = 56

The upgrade process cannot continue; the only options are to abort it or to retry the eDirectory upgrade.

An strace of the ndsd binary showed that ndsd was unable to obtain the exclusive lock on the file /var/opt/novell/eDirectory/data/dib/nds.lck:

24943 16:58:14 open("/var/opt/novell/eDirectory/data/dib/nds.lck", O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EEXIST (File exists)
24943 16:58:14 open("/var/opt/novell/eDirectory/data/dib/nds.lck", O_RDWR|O_LARGEFILE) = 14
24943 16:58:14 fcntl64(14, F_SETLK64, {type=F_WRLCK, whence=SEEK_SET, start=0, len=1}, 0xb705c718) = -1 EAGAIN (Resource temporarily unavailable)

Because of this, the eDirectory upgrade process was not able to complete.
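
On Linux, the kernel lists currently held file locks in /proc/locks, which may help to identify a process that still holds the lock on nds.lck. The following is a minimal sketch, not part of the original investigation. It assumes GNU stat and awk are available, the function name locks_on_file is hypothetical, and the field layout of /proc/locks is assumed to be the usual one (PID in the fifth field, MAJOR:MINOR:INODE in the sixth).

```shell
# Print the entries of a /proc/locks-style table that refer to the given file.
# $1 = file to check, $2 = lock table (defaults to /proc/locks).
locks_on_file() {
    file=$1
    locks_table=${2:-/proc/locks}
    # The lock table identifies files by inode number, so look it up first.
    inode=$(stat -c %i -- "$file") || return 1
    # Field 6 is assumed to be MAJOR:MINOR:INODE; compare only the inode part.
    awk -v ino="$inode" '{
        n = split($6, id, ":")
        if (n == 3 && id[3] == ino) print
    }' "$locks_table"
}
```

Running 'locks_on_file /var/opt/novell/eDirectory/data/dib/nds.lck' prints nothing when no lock is listed for the file. Otherwise the fifth field of each printed line should be the PID of the process holding the lock.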

Resolution

To proceed with the upgrade, take the following steps:

  1. Open a text terminal using CTRL+ALT+F1;
  2. Stop eDirectory using 'rcndsd stop';
  3. Verify that no ndsd processes are still running using 'ps -ef | grep ndsd';
  4. If any ndsd process is still running, kill it;
  5. Delete the nds.lck file using 'rm /var/opt/novell/eDirectory/data/dib/nds.lck';
  6. Start eDirectory with 'rcndsd start';
  7. Switch back to the graphical terminal using CTRL+ALT+F7;
  8. Select the option to retry the eDirectory upgrade.

This time the upgrade should complete successfully.
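
Steps 2 to 6 above can also be scripted. The following is a minimal sketch, not a Novell-supplied tool. The function names remove_stale_lock and fix_and_restart and the variables LOCK_FILE and PROC_NAME are made up for illustration, and the sketch assumes that pgrep is installed and that it is run as root from the text terminal.

```shell
# Defaults taken from the steps above; both can be overridden before calling.
LOCK_FILE="${LOCK_FILE:-/var/opt/novell/eDirectory/data/dib/nds.lck}"
PROC_NAME="${PROC_NAME:-ndsd}"

# Steps 3 to 5: delete the lock file only if no daemon process is left.
remove_stale_lock() {
    if pgrep -x "$PROC_NAME" >/dev/null 2>&1; then
        echo "$PROC_NAME is still running; stop or kill it first" >&2
        return 1
    fi
    rm -f -- "$LOCK_FILE"
}

# Steps 2 to 6 in order; eDirectory is not restarted if the check fails.
fix_and_restart() {
    rcndsd stop
    remove_stale_lock || return 1
    rcndsd start
}
```

After loading these functions, calling 'fix_and_restart' performs steps 2 to 6. Steps 7 and 8 are still done by hand. The sketch deliberately refuses to delete the lock file while an ndsd process exists instead of killing it automatically.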

Additional Information

Another possible indication of this issue is that 'ndsrepair -R' is unable to run and gives the following error:

****************************************************************************/
Repair utility for Novell eDirectory 8.8 - 8.8 SP4 v20215.03, DS 20217.06.
Repairing Local Database
Start:  Tuesday, February 03, 2009 12:11:17 Local Time

ERROR: Insufficient disk space or missing files, Error: -168

NOTICE: Unable to update repair status.  Error: -663

Repair process aborted

In this case, ndsrepair returned error -168 because it could not obtain the exclusive lock on the file nds.lck.

The investigation did not reveal the root cause of this lock problem. Most likely it was caused by an unexpected crash of the ndsd process during the upgrade procedure, which did not release the lock on the file.

Steps to take the strace of the ndsd process:

  1. Open a text terminal using CTRL+ALT+F1;
  2. Stop eDirectory using 'rcndsd stop';
  3. Start the strace using 'strace -f -o <output log file> -s 256 -t /opt/novell/eDirectory/sbin/ndsd'.
Be aware that running the ndsd daemon this way may never return the console prompt. If that happens, stop the ndsd process manually, either with CTRL+C (in some instances CTRL+Z or CTRL+Q can also work) or by opening a new terminal and running 'rcndsd stop'.
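
Once the trace has been written, the failed lock attempt can be located without reading the whole log. The following is a minimal sketch using grep. The function name failed_lock_calls is hypothetical, and the pattern is only modelled on the fcntl64 line shown in the Symptoms section, so other strace versions may format the call differently.

```shell
# Print the lines of an strace log where a file-lock request was refused.
# $1 = strace output log file.
failed_lock_calls() {
    # Matches F_SETLK, F_SETLKW and their 64-bit variants that returned -1
    # with EAGAIN or EACCES, the two errors fcntl uses for a held lock.
    grep -E -- 'F_SETLKW?(64)?,.*= -1 (EAGAIN|EACCES)' "$1"
}
```

For example, 'failed_lock_calls <output log file>' prints the refused fcntl calls. The file descriptor in the first argument of each call can then be matched against the preceding open() line to confirm that it refers to nds.lck.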