How do I automate NetIQDiag to collect logs to troubleshoot a failing NetIQ service? (NETIQKB38232)

  • 7738232
  • 02-Feb-2007
  • 28-Aug-2010

Environment

NetIQ AppManager 6.x
NetIQ AppManager 7.0.x

Situation

How do I automate NetIQDiag to collect logs to troubleshoot a failing NetIQ service?

Resolution

NOTE: Use this process only while actively troubleshooting an issue with NetIQ services. Disable it when it is not in active use, as it can consume a large amount of disk space over an extended period of time.

The following process automates the collection of the necessary diagnostic logs at the moment a failure of the AppManager agent occurs.

The process uses the General_EventLog knowledge script to monitor the Windows Application event log for a restart of the NetIQ service. When a restart is detected, an action launches the NETIQDIAG utility automatically and silently, so that the logs are collected and stored in the \netiq\Diagnostics directory on the agent immediately.

In addition to the log collection, you will receive an event in AppManager alerting you that the service has restarted.

Some preparation is required on the agents before this process will work properly.

To fully diagnose an AppManager agent service (NetIQmc) failure, set the following registry values on each agent (the NTAdmin_RegistrySet knowledge script can be used to set these values):

Set HKEY_LOCAL_MACHINE\Software\NetIQ\AppManager\4.0\NetIQmc\Tracing\TraceKS = 1

Set HKEY_LOCAL_MACHINE\Software\NetIQ\AppManager\4.0\NetIQccm\Config\MonitorInterval = 300 (5 minutes)

NOTE: A restart of the agent services is required to enable these settings.
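
As an alternative illustration, the two values above can be rendered as the text of a .reg file for review or manual import. This is a minimal sketch, not a NetIQ-supplied tool: it assumes both values are of type DWORD, which the article does not state, and the names reg_file and AGENT_TRACE_SETTINGS are hypothetical.

```python
def reg_file(entries):
    """Render (key, value_name, dword_value) tuples as .reg file text.

    Assumption: every value is a REG_DWORD. Confirm the expected type
    before importing anything on a production agent.
    """
    lines = ["Windows Registry Editor Version 5.00", ""]
    for key, name, value in entries:
        lines.append(f"[{key}]")
        # .reg files store DWORD values as 8 hexadecimal digits.
        lines.append(f'"{name}"=dword:{value:08x}')
        lines.append("")
    return "\n".join(lines)


BASE = r"HKEY_LOCAL_MACHINE\Software\NetIQ\AppManager\4.0"

# The two settings listed in this article.
AGENT_TRACE_SETTINGS = [
    (BASE + r"\NetIQmc\Tracing", "TraceKS", 1),
    (BASE + r"\NetIQccm\Config", "MonitorInterval", 300),  # 300 seconds = 5 minutes
]

if __name__ == "__main__":
    print(reg_file(AGENT_TRACE_SETTINGS))
```

The printed text can be saved with a .reg extension and inspected before use. The agent services still need to be restarted afterwards, as noted above.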

To set up the monitoring, drop the General_EventLog knowledge script onto the servers that you wish to monitor, and set the parameters as follows:

  • Log: Application
  • Type: Information
  • Source: NetIQmc
  • EventID: 257
  • Events in past n hours: 1
  • Max number of entries per event report: 5
  • Schedule the job to run every 5 minutes.
  • Disable event collapsing.
  • Action: MC Action, Action_DOSCommand: cmd /c c:\progra~1\NetIQ\AppManager\bin\netiqdiag -c

To capture possible failures of the Management Server service (NetIQms), run the same process on that server's agent, with the following Event information:

  • Log: Application
  • Event Type: Information
  • Event Source: NetIQms
  • Event Category: General
  • Event ID: 256
  • Events in past n hours: 1
  • Max number of entries per event report: 5
  • Schedule the job to run every 5 minutes.
  • Disable event collapsing.
  • Action: MC Action, Action_DOSCommand: cmd /c c:\progra~1\NetIQ\AppManager\bin\netiqdiag -c
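
The two parameter sets above amount to an event filter paired with a command. The sketch below models that pairing to make the matching criteria explicit. It is illustrative only and does not reflect how General_EventLog is implemented: the EventFilter class, the dictionary layout used for events, and the commands_to_run function are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Command taken verbatim from the Action parameter in this article.
DIAG_COMMAND = r"cmd /c c:\progra~1\NetIQ\AppManager\bin\netiqdiag -c"


@dataclass(frozen=True)
class EventFilter:
    log: str
    event_type: str
    source: str
    event_id: int
    category: Optional[str] = None  # None means "any category"
    past_hours: int = 1             # "Events in past n hours"

    def matches(self, event: dict, now: datetime) -> bool:
        """Return True if the event meets every criterion of this filter."""
        cutoff = now - timedelta(hours=self.past_hours)
        return (
            event["log"] == self.log
            and event["type"] == self.event_type
            and event["source"] == self.source
            and event["id"] == self.event_id
            and (self.category is None or event.get("category") == self.category)
            and event["time"] >= cutoff
        )


# Filters corresponding to the two parameter lists in this article.
AGENT_RESTART = EventFilter("Application", "Information", "NetIQmc", 257)
MS_RESTART = EventFilter("Application", "Information", "NetIQms", 256, category="General")


def commands_to_run(events, filters, now):
    """Return one diagnostic command per event matched by any filter.

    One command per event mirrors the effect of disabling event collapsing.
    """
    return [DIAG_COMMAND for e in events if any(f.matches(e, now) for f in filters)]
```

Writing the criteria out this way shows that an event must match the log, type, source, and ID together, and must fall inside the one-hour window, before the diagnostic command is triggered.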

Diagnostic logs will be deposited in the following path, on the Agent where the problem has occurred:

<install path>\NetIQ\Temp\NetIQ_Debug\<servername>\*.cab

The diagnostic cab file will have a unique name which includes a timestamp. This allows you to correlate each set of logs with the event notification that triggered its collection.
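
To help with that correlation, and to keep an eye on the disk space mentioned in the note at the top of this article, the collected cab files can be listed in time order. The following is a minimal sketch: it relies on each file's modification time rather than parsing the file name, because the exact timestamp format in the name is not documented here. The function names are hypothetical.

```python
from datetime import datetime
from pathlib import Path


def list_diagnostic_cabs(directory):
    """Return (name, modified time, size in bytes) for each .cab file, oldest first."""
    rows = []
    for path in Path(directory).glob("*.cab"):
        stat = path.stat()
        rows.append((path.name, datetime.fromtimestamp(stat.st_mtime), stat.st_size))
    return sorted(rows, key=lambda row: row[1])


def total_size(rows):
    """Return the combined size in bytes of the listed files."""
    return sum(size for _, _, size in rows)
```

Pass the function the diagnostic directory for the agent in question, then compare the modification times against the times of the restart events raised in AppManager.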

Additional Information

Formerly known as NETIQKB38232