Intermittent -625 errors that get cleared when restarting ndsd

  • 7018286
  • 17-Nov-2016
  • 17-Nov-2016


NetIQ eDirectory 8.8 SP8 running on RHEL 6.6


These are some symptoms of this particular problem:
 - Replica synchronization reports -625 errors for some servers, even though those servers are reachable.
 - In iMonitor -> Agent Activity, you see a thread acquire the Write lock and never release it.
 - Requests to the affected server may work or may get stuck.
 - The affected server becomes unresponsive.
 - After restarting ndsd, the problem is resolved, at least for some time (a few hours or a few days).
 - If you run the gstack utility to list the running threads, the lock is released and the server returns to normal.


Error -625 indicates that a server failed to respond in a timely manner. Other conditions can also cause this error, such as high utilization or a server trying to write a very large number of attributes for a particular object. In those scenarios, though, the issue reappears soon after restarting the ndsd process.

This particular problem is caused by a kernel bug in Linux, which affects mostly Red Hat Enterprise Linux 6.6, 7.0 and 7.1 (running kernel versions 2.6.32-504 up to and including 2.6.32-504.12.2), in particular servers on version 

For more information from Red Hat:

To avoid this issue, make sure that the latest patches are applied on your Red Hat Enterprise Linux server and that the kernel version is later than the ones mentioned above.
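
As a quick check, the kernel release of a server can be compared against the range given above. The following is a minimal sketch in Python, not an official tool. It assumes the release string has the usual Red Hat form (for example 2.6.32-504.12.2.el6.x86_64), and the helper names parse_release and is_affected are made up for this illustration. It only tests the 2.6.32-504 to 2.6.32-504.12.2 range stated in this document, so a "not in range" result does not prove that a kernel from a different series is unaffected.

```python
import platform
import re

# Affected range as stated in this document (inclusive on both ends).
AFFECTED_MIN = (2, 6, 32, 504)
AFFECTED_MAX = (2, 6, 32, 504, 12, 2)


def parse_release(release):
    """Turn '2.6.32-504.12.2.el6.x86_64' into (2, 6, 32, 504, 12, 2)."""
    parts = []
    for token in re.split(r"[.-]", release):
        if not token.isdigit():
            break  # stop at the first non-numeric part, e.g. 'el6'
        parts.append(int(token))
    return tuple(parts)


def is_affected(release):
    """Return True if the release falls inside the range listed above."""
    version = parse_release(release)
    # Tuples compare element by element, so (2, 6, 32, 504, 16, 2)
    # sorts after AFFECTED_MAX and is reported as outside the range.
    return AFFECTED_MIN <= version <= AFFECTED_MAX


if __name__ == "__main__":
    release = platform.release()  # same value as `uname -r`
    if is_affected(release):
        print(release + ": in the affected range, update the kernel")
    else:
        print(release + ": not in the range listed in this document")
```

Run the script on the eDirectory server itself, since platform.release() reports the kernel that is currently running, not the newest kernel package installed. A reboot is needed before a newly installed kernel shows up here.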