Using Valgrind to troubleshoot possible ndsd memory leaks

  • 7005905
  • 05-May-2010
  • 20-Aug-2019

Environment

eDirectory 8.8 for Linux
eDirectory 9 for Linux

Situation

The ndsd process is using more memory than the sum of the amount specified for the dib cache (see the cache= option in _ndsdb.ini), 300-400 MB of application overhead, and the amount specified as the MAX JAVA HEAP SIZE in either the pre_ndsd_start script or the /etc/init.d/ndsd script.

EX: 
If the dib directory is /var/opt/novell/eDirectory/data/dib

#cat /var/opt/novell/eDirectory/data/dib/_ndsdb.ini | grep cache
cache=1024000000

#cat /etc/init.d/ndsd | grep HEAP
DHOST_JVM_MAX_HEAP=512000000

So cache + max heap + approximately 300 MB of overhead would be 1024000000 + 512000000 + 300000000 = 1836000000 bytes (about 1.8 GB).
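
The arithmetic above can be sketched as a small shell helper. The function name expected_ndsd_bytes and its 300000000-byte default overhead are assumptions taken from this example; they are not part of eDirectory.

```shell
# Hypothetical helper: expected ndsd memory footprint in bytes.
#   $1 = cache= value from _ndsdb.ini
#   $2 = DHOST_JVM_MAX_HEAP value
#   $3 = application overhead (optional, defaults to the ~300 MB used above)
expected_ndsd_bytes() {
    cache=$1
    heap=$2
    overhead=${3:-300000000}
    echo $((cache + heap + overhead))
}

# Values from the example above
expected_ndsd_bytes 1024000000 512000000
```

With the example values this prints 1836000000.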

If ndsd is consuming more than 2 GB in this case and its memory usage keeps growing, a memory leak is possible.
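
One way to tell whether usage is still growing is to sample the resident memory of the process at intervals. The following is a minimal sketch, assuming a Linux /proc status file that reports a VmRSS line in kB; vmrss_kb is a hypothetical helper name.

```shell
# Hypothetical helper: print the VmRSS value (in kB) from a
# /proc/<pid>/status style file given as $1.
vmrss_kb() {
    while read -r key value rest; do
        if [ "$key" = "VmRSS:" ]; then
            echo "$value"
            return 0
        fi
    done < "$1"
    # No VmRSS line was found
    return 1
}
```

For example, running vmrss_kb /proc/<pid of ndsd process>/status every few minutes and recording the result shows whether the figure levels off or keeps climbing.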

NOTE: IDM and auditing can cause memory growth.  To isolate memory growth in ndsd itself, reproduce it without IDM and auditing.

Resolution

1. Install Valgrind
Option 1:  Install valgrind from OS repositories
SLES 12
https://www.suse.com/documentation/sles-12/book_sle_tuning/data/sec_tuning_tracing_valgrind.html
https://www.novell.com/documentation/suse/esd/sles/esd_sles12_sdk.html
Option 2:  Download the Valgrind installation files from http://valgrind.org/

2.  Configure ndsd to start with valgrind
initd (SLES 11.X, RH 6.X)

Change the following lines in the /etc/init.d/ndsd script on a non-OES server.

Original lines 187-195
        if [ "$MALLOC_CHECK_" = "" ] && [ -f /usr/$libfldr/libtcmalloc_minimal.so.0 ]; then
                if [ -f /etc/novell-release ]; then
                        LD_PRELOAD="/usr/$libfldr/libtcmalloc_minimal.so.0" $sbindir/ndsd
                else
                        LD_BIND_NOW=1 LD_PRELOAD="/usr/$libfldr/libtcmalloc_minimal.so.0" $sbindir/ndsd
                fi
        else
                $sbindir/ndsd
        fi

Change these lines to:
        if [ "$MALLOC_CHECK_" = "" ] && [ -f /usr/$libfldr/libtcmalloc_minimal.so.0 ]; then
                if [ -f /etc/novell-release ]; then
                        LD_PRELOAD="/usr/$libfldr/libtcmalloc_minimal.so.0" $sbindir/ndsd
                else
                        LD_BIND_NOW=1 LD_PRELOAD="/usr/$libfldr/libtcmalloc_minimal.so.0" $sbindir/ndsd
                fi
        else
                valgrind --tool=memcheck -v --error-limit=no --log-file=/root/valgrindlogs/valgrind%p.log --leak-check=full --leak-resolution=high --num-callers=50 --freelist-vol=100000000 --trace-children=yes --malloc-fill=ac --free-fill=fe $sbindir/ndsd
        fi

systemd (SLES 12.X / RH 7.X)

Make a backup copy of the /opt/novell/eDirectory/sbin/ndsdwrapper file
Modify ndsdwrapper as follows:
        if [ "$MALLOC_CHECK_" = "" ] && [ -f /usr/$libfldr/libtcmalloc_minimal.so.0 ]; then
                if [ -f /etc/novell-release ]; then
                        LD_PRELOAD="/usr/$libfldr/libtcmalloc_minimal.so.0" $sbindir/ndsd
                else
                        LD_BIND_NOW=1 LD_PRELOAD="/usr/$libfldr/libtcmalloc_minimal.so.0" $sbindir/ndsd
                fi
        else
                valgrind --tool=memcheck -v --error-limit=no --log-file=/root/valgrindlogs/valgrind%p.log --leak-check=full --leak-resolution=high --num-callers=50 --freelist-vol=100000000 --trace-children=yes --malloc-fill=ac --free-fill=fe $sbindir/ndsd
        fi

3.  Create the valgrind logging directory
/root/valgrindlogs
(If a different location for the valgrind logs is preferred, modify the --log-file option in the /etc/init.d/ndsd script or in ndsdwrapper and create the directory specified.)
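
Creating the directory can be sketched as follows. make_valgrind_log_dir is a hypothetical helper name; its default matches the --log-file path used in step 2, and any other path passed to it is assumed to match whatever --log-file has been changed to.

```shell
# Hypothetical helper: create the valgrind log directory and confirm
# that the current user can write to it.
#   $1 = directory (optional, defaults to /root/valgrindlogs)
make_valgrind_log_dir() {
    dir=${1:-/root/valgrindlogs}
    mkdir -p "$dir" && [ -w "$dir" ]
}
```

Run as root, make_valgrind_log_dir with no argument creates /root/valgrindlogs. Valgrind replaces %p in the --log-file name with the process ID, so each process writes its own file in this directory.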

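4.  Set MALLOC_CHECK_=2

initd:  add  export MALLOC_CHECK_=2  to the /etc/opt/novell/eDirectory/sbin/pre_ndsd_start script.
systemd:  add  MALLOC_CHECK_=2  to the /etc/opt/novell/eDirectory/conf/env file.

With MALLOC_CHECK_ set, the first test in the block modified in step 2 is false, so the else branch runs and ndsd is started under valgrind instead of with tcmalloc preloaded.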
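
The setting can be added so that repeated runs do not duplicate the line. This is a minimal sketch; add_line_once is a hypothetical helper name, and the file paths are the ones given in this step.

```shell
# Hypothetical helper: append a line to a file only if that exact
# line is not already present.
#   $1 = line to add, $2 = file
add_line_once() {
    line=$1
    file=$2
    # -x matches whole lines only, -F treats the line as a fixed string
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

For systemd this would be called as add_line_once 'MALLOC_CHECK_=2' /etc/opt/novell/eDirectory/conf/env, and for initd with 'export MALLOC_CHECK_=2' and the pre_ndsd_start script.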

5.  Obtain debug libraries for the version of eDirectory being run and place them in the eDirectory program directories.

6. Start ndsd 
initd:  /etc/init.d/ndsd start or rcndsd start
systemd:  use ndsmanage to start the instance

7. Run the normal test case that produces the increased memory consumption.

8. Create a gcore of the ndsd process using:  gcore <pid of ndsd process>.  Make sure the current directory when you run the gcore command has plenty of free space, as the core created will be about the same size as the memory usage of the ndsd process.
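
The space check can be sketched as below, before gcore is run. avail_kb and core_fits are hypothetical helper names; the sketch assumes the POSIX output format of df -Pk, and that the memory usage of ndsd is supplied in kB.

```shell
# Hypothetical helper: free space in kB on the filesystem holding $1.
avail_kb() {
    # With -P, line 2 of df output is the filesystem line; field 4 is "Available"
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Hypothetical helper: succeed if directory $2 has more free space
# than the process memory usage $1 (in kB).
core_fits() {
    [ "$(avail_kb "$2")" -gt "$1" ]
}
```

For example, core_fits <ndsd memory usage in kB> . && gcore <pid of ndsd process> only creates the core when the current directory has room for it.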

9. Stop ndsd using command "kill -INT <pid of ndsd process>"

NOTE:   Keep monitoring the process ID with ps until the process has exited. Do not do anything else to ndsd during this time, and do not send it another kill. It will take some time for the process to exit, because during this shutdown the memory is being inspected and a summary written to the log files.
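
Waiting for the process to go away can be sketched as a simple polling loop. wait_for_exit is a hypothetical helper name; the sketch assumes it is run as root, since kill -0 also fails when the caller lacks permission to signal the process.

```shell
# Hypothetical helper: block until process $1 no longer exists.
#   $2 = seconds between checks (optional, defaults to 5)
wait_for_exit() {
    pid=$1
    interval=${2:-5}
    # kill -0 sends no signal; it only tests whether the process can be signalled
    while kill -0 "$pid" 2>/dev/null; do
        sleep "$interval"
    done
}
```

Calling wait_for_exit <pid of ndsd process> after the kill -INT returns only once ndsd has gone, without sending it anything further.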

10. Collect all the log files under /root/valgrindlogs/ . There are usually many files, for the different child processes.
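
For a quick look at which log files report leaks, the "definitely lost:" lines that Memcheck writes in its leak summary can be listed per file. This is a minimal sketch; leak_summary is a hypothetical helper name, and the file name pattern is assumed to match the --log-file option from step 2.

```shell
# Hypothetical helper: list the "definitely lost:" summary line of every
# valgrind log in directory $1, prefixed with the file name.
leak_summary() {
    # The extra /dev/null argument makes grep print file names
    # even when only one log file matches the pattern.
    grep 'definitely lost:' "$1"/valgrind*.log /dev/null 2>/dev/null
}
```

For example, leak_summary /root/valgrindlogs prints one line per log file that contains a leak summary.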

11. Bundle the gcore with novell-getcore.

Tar all the log files under /root/valgrindlogs together and submit to Novell Support along with the core bundle.
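
Bundling the logs can be sketched as follows. bundle_valgrind_logs is a hypothetical helper name, and the archive path in the usage example is an assumption; any location outside the log directory will do.

```shell
# Hypothetical helper: create a gzip-compressed tar archive of a log directory.
#   $1 = log directory, $2 = archive file to create (outside the log directory)
bundle_valgrind_logs() {
    dir=$1
    out=$2
    # -C switches to the parent directory so the archive holds relative paths
    tar -czf "$out" -C "$(dirname "$dir")" "$(basename "$dir")"
}
```

For example, bundle_valgrind_logs /root/valgrindlogs /root/valgrindlogs.tar.gz produces a single file to submit with the core bundle.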