NetIQ Unix Agent metric issues on AIX due to SPMI corruption.

  • 7771846
  • 23-Sep-2009
  • 14-May-2012

Environment

  • NetIQ UNIX Agent 7.1
  • NetIQ AppManager UNIX Agent 7.0.1
  • NetIQ AppManager UNIX Agent 6.5
  • IBM AIX 5.2
  • IBM AIX 5.3
  • IBM AIX 6.1

Situation


NetIQ AppManager UNIX Agent 6.5
  • Can core dump when a KS utilizing SPMI is run (such as UNIX_CpuLoaded).
NetIQ AppManager UNIX Agent 7.0.1
NetIQ UNIX Agent 7.1
  • Can core dump when the agent is started.
  • Knowledge scripts using SPMI data can return invalid -1, 0, or 1.0 values.

To test a large number of SPMI counters, and to gather SPMI performance counter troubleshooting information for NetIQ Technical Support, use the attached compressed tar package.

NOTE: This package has been tested within Technical Support on several AIX systems but has not passed NetIQ's intense QE testing process. It is a best-effort tool from NetIQ Technical Support for checking that the SPMI counters on the tested AIX system are functioning properly. Although we believe that this package and script cause no harmful effects, they directly access a possibly corrupt SPMI environment/implementation. No warranty is expressed or implied.

Utilizing the SPMI testing script:
  1. Decompress the package on your AIX system
    gunzip NETIQKB71846-aixctrtest-RISC-r4.tar.gz
  2. Untar the testing binaries and scripts.
    tar xvf NETIQKB71846-aixctrtest-RISC-r4.tar
  3. Change to the created directory.
    cd aixctrtest-RISC-r4
  4. Execute the testing script.
    # ./run.sh

See the attached NETIQKB71846-sample-output-r4.txt file for the test results from a working system.
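
The steps above can also be wrapped in a small function that keeps a copy of the test output for Technical Support. This is only a sketch: the function name run_spmi_test and the log file name spmi-test-output.txt are chosen here for illustration and are not part of the package. The original step shows a root prompt, so the function may need to be run as root.

```sh
# Sketch: unpack the SPMI test package, run it, and save the output.
# run_spmi_test and the default log name are illustrative assumptions.
run_spmi_test() {
    archive=${1:-NETIQKB71846-aixctrtest-RISC-r4.tar.gz}
    log=${2:-spmi-test-output.txt}

    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi

    # gunzip -c leaves the original .tar.gz in place.
    gunzip -c "$archive" | tar xvf - || return 1

    # Run in a subshell so the caller's working directory is unchanged;
    # the log is written to the directory the function was called from.
    ( cd aixctrtest-RISC-r4 && ./run.sh ) > "$log" 2>&1
}
```

Calling run_spmi_test with no arguments from the directory that holds the downloaded package writes the results to spmi-test-output.txt, which can then be compared with the sample output file.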

Resolution

IBM notes the following known corruption problems with SPMI in recent versions of AIX:

The solution provided thus far by IBM is to update to the most recent TL/SP (Technology Level / Service Pack) for your AIX release.
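
Before planning an update, it can help to record the level the system is currently running. The sketch below assumes an AIX release where oslevel -s is supported (on older levels, oslevel -r reports the level without the service pack). The helper name parse_oslevel is hypothetical, and the level string in the comment is only a sample of the format.

```sh
# Split an `oslevel -s` string (format like 5300-09-02-0849) into parts.
# parse_oslevel is an illustrative helper, not an AIX command.
parse_oslevel() {
    echo "$1" | awk -F- '{ printf "release=%s TL=%s SP=%s\n", $1, $2, $3 }'
}

# Only query the system when the AIX commands are actually present.
if type oslevel >/dev/null 2>&1; then
    parse_oslevel "$(oslevel -s)"
    # Show the installed level of the fileset named in the Cause section.
    lslpp -l bos.perf.perfstat
fi
```

The reported TL and SP can then be compared against the latest level IBM publishes for the same AIX release.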


Another (normally easier) option is to use the UNIX_CpuUtil KS on the affected systems. It provides comparable data from the sar command and does not use the AIX SPMI counters.
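
To confirm by hand that sar returns sensible CPU figures on a system where the SPMI counters are suspect, something like the following can be used. It is a manual cross-check only and does not describe how UNIX_CpuUtil itself processes sar output. The helper name sar_busy is hypothetical, and the parsing assumes the AIX sar -u layout, in which the header row and the Average row each begin with a single field.

```sh
# Print CPU busy percent (100 - %idle) from `sar -u` text on stdin.
# sar_busy is an illustrative helper; it assumes AIX-style sar output.
sar_busy() {
    awk '
        # Header row: remember which column holds %idle.
        { for (i = 1; i <= NF; i++) if ($i == "%idle") col = i }
        # Average row: report 100 minus the idle percentage.
        $1 == "Average" && col { printf "%.1f\n", 100 - $col }
    '
}
```

For example, sar -u 5 3 | sar_busy takes three 5-second samples and prints the average busy percentage, which can be compared with the values the knowledge scripts report.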

Cause

In 7.x, the NetIQ agent attempts to use IBM's SPMI performance counters at startup. A known issue in some SPMI releases on AIX prevents SPMI memory spaces from being released when a process is killed, thus filling all available SPMI memory space.

IBM has identified and accepted an issue with all recent versions of objrepos:bos.perf.perfstat.  The SPMI problem can be identified on the system by running the attached testing tool.

Additional Information

Formerly known as NETIQKB71846

Additional questions about this issue should be raised with IBM's technical support.

NETIQKB71846-aixctrtest-RISC-r4.tar.gz
NETIQKB71846-sample-output-r4.txt