SQL knowledge scripts generate the following event: Login failed for user 'sa' (NETIQKB38449)

  • 7738449
  • 02-Feb-2007
  • 10-Jan-2011

Environment

NetIQ AppManager 6.x
NetIQ AppManager 7.0.x

Situation

Although the login account specified in the knowledge script properties has a valid password entered in AppManager Security Manager for the SQL Server being monitored, you still receive the error message below.

SQL knowledge scripts generate the following event: [Microsoft][ODBC SQL Server Driver][SQL Server]Login failed for user 'sa'. Please use 'NetIQ Security Manager' to set up SQL Knowledge Script users or use 'Save SQL Password' extension to update the KPW user password.

MCTRACE.LOG displays the following error messages:

1079007131 [2504] qkpw-GetValue: failed to find entry
1079007131 [2504] mcextcore-GetContextEx: fail to lookup label

Resolution

If a job errors out or stops and its status is subsequently reset to "start pending", the Management Server picks it up and sends it back to the agent, and the job returns to running. However, the Status field is a bitmask and can hold several status flags at once. The status 0x00000002 means "start pending", while the status 0x00400000 means "use full path". A job that has errored out and was using the full path would have a status of 0x00400080. The SQL statement:

Update Job set Status = 0x2 where JobID = xxx

would set the job to start pending, after which it would go to running active, but the job would lose the "use full path" setting in the process, resulting in the symptoms described in this article. For most jobs there is no side effect, because most do not use this setting. However, other flags are affected as well: the status 0x01000000 means that the job has an action, and this flag would also be lost.
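As a rough illustration of the arithmetic only (plain Python, not AppManager code), the sketch below decodes a status value into the flags named in this article and shows what remains after a plain overwrite with 0x2. The constant and function names are made up for this example; only the numeric values come from the article.

```python
# Flag values quoted in this article; the names are illustrative only.
START_PENDING = 0x00000002
USE_FULL_PATH = 0x00400000
HAS_ACTION = 0x01000000

KNOWN_FLAGS = {
    "start pending": START_PENDING,
    "use full path": USE_FULL_PATH,
    "has action": HAS_ACTION,
}


def decode_status(status):
    """Return the names of the known flags that are set in a status bitmask."""
    return [name for name, bit in KNOWN_FLAGS.items() if status & bit]


# The article's example: an errored job that was using the full path.
errored = 0x00400080
# Effect of "Update Job set Status = 0x2": every other bit is discarded.
overwritten = 0x2

print(decode_status(errored))      # ['use full path']
print(decode_status(overwritten))  # ['start pending']
```

After the overwrite, "use full path" no longer appears in the decoded list, which is the loss described above.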

To preserve the higher bits when you need to set a job to "start pending", use a query like:

UPDATE Job SET Status = (Status & 0xFFFF0000) | 0x2 WHERE JobID = xxx
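The effect of the mask can be checked with the same arithmetic outside the database. The following is a minimal Python sketch, not AppManager code; the function name is hypothetical, and the sample status simply combines values quoted in this article.

```python
HIGH_BITS_MASK = 0xFFFF0000  # keeps the upper 16 bits, clears the lower 16
START_PENDING = 0x00000002


def reset_to_start_pending(status):
    """Mirror the expression (Status & 0xFFFF0000) | 0x2 from the query above."""
    return (status & HIGH_BITS_MASK) | START_PENDING


# An errored job that uses the full path (0x00400000) and has an action (0x01000000).
old_status = 0x01400080
new_status = reset_to_start_pending(old_status)

print(hex(new_status))  # 0x1400002
```

The low-order state bits are replaced with "start pending", while the "use full path" and "has action" bits survive the update.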

Cause

The job received incomplete object information when it was restarted. This can be caused by the job status being incorrectly reset by a user-created automated script.

Additional Information

Formerly known as NETIQKB38449

AppManager includes automated controls that restart jobs that have failed. For ad hoc jobs (dropped directly onto agents), the option to restart jobs can be set via the Operator Console's File/Preferences/Repository/Miscellaneous setting. For jobs that are created as part of a monitoring policy, AppManager always attempts to restart them.