Environment
NetIQ Access Manager 3.2
NetIQ NIDP server 3.2 on Linux
Situation
- NetIQ Access Manager NIDP server stops sending naudit events to the audit server
- running "ps -ef |grep lcache" returns:
novlwww 4738 3770 0 Apr15 ? 00:01:17 [lcache] <defunct>
root 6788 743 0 09:13 pts/0 00:00:00 grep lcache
novlwww 14511 3770 0 Jul02 ? 00:00:00 lcache -dir:/var/opt/novell/naudit/cache -port:1288 -slsport:1289 -int:600 -c
- the nproduct.log file shows the following entries:
[Novell Audit Platform Agent]: All Channels failed for [Novell Access Manager], LastError is being set
[Novell Audit Platform Agent]: This is from EndClientConnection
[Novell Audit Platform Agent]: LCache could not process event for the application Novell Access Manager. Reconnecting LCache Again.
[Novell Audit Platform Agent]: LCache could not process, Going to restart/connect again
[Novell Audit Platform Agent]: This is from EndClientConnection
[Novell Audit Platform Agent]: LCache could not process event for the application Novell Access Manager. Reconnecting LCache Again.
[Novell Audit Platform Agent]: Failed to connect to cache for application Novell Access Manager, DISABLING cache mode.
[Novell Audit Platform Agent]: This is from EndClientConnection
[Novell Audit Platform Agent]: LCache could not process event for the application Novell Access Manager. Reconnecting LCache Again.
[Novell Audit Platform Agent]: All Channels failed for [Novell Access Manager], LastError is being set
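To gauge how often the agent hits this condition, the log entries above can be counted with a small helper. This is only a sketch: the function name count_lcache_failures is hypothetical, and the log path is passed as an argument because the location of nproduct.log is not given here.

```shell
# Count agent messages that indicate lcache trouble in the given log file.
# Usage: count_lcache_failures /path/to/nproduct.log   (path is an assumption)
count_lcache_failures() {
    # grep -c prints the number of matching lines. It exits with status 1
    # when nothing matches, so "|| true" keeps the helper safe under "set -e".
    grep -c -e 'DISABLING cache mode' -e 'All Channels failed' "$1" || true
}
```

A count that keeps growing after lcache has been restarted suggests the process is still failing.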
Resolution
- configure the lcache service to always use the cache file:
edit the logevent file and make sure the following two entries are set
LogForceCaching=Y
LogCacheLimitAction=roll cache
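Whether both entries are present can be verified with a short check. The following is a minimal sketch: check_logevent_settings is a hypothetical helper, and the configuration file path is passed as an argument (on many installations this is /etc/logevent.conf, but treat that location as an assumption).

```shell
# Report whether both required entries are present in the given config file.
# Usage: check_logevent_settings FILE
# Returns 0 if both entries are found, 1 otherwise.
check_logevent_settings() {
    status=0
    for setting in 'LogForceCaching=Y' 'LogCacheLimitAction=roll cache'; do
        # -x matches the whole line, -F treats the setting as a literal string.
        # Lines with extra whitespace or trailing comments are not recognized.
        if grep -qxF -- "$setting" "$1"; then
            echo "ok: $setting"
        else
            echo "missing: $setting"
            status=1
        fi
    done
    return $status
}
```

The exact-line match is deliberately strict, so a "missing" result may only mean the entry is formatted differently and should be inspected by hand.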
This configuration should also improve the performance of the system as a whole. User requests are no longer delayed, because the NIDP server does not try to establish its own connection to push naudit log events to the audit server. Instead, events are cached first and then pushed to the audit server by the lcache process in the background.
- Because of lcache crashes, the process can end up running as a non-root user, which causes it to fail.
- create a script called lcachemonitoring.sh with the content shown between the dashed lines
- place the script into the "/etc/init.d" directory
-------------------------------------------------------------------------
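#!/bin/bash
# Restart lcache whenever it exits; back off if it crashes too often.
NUM_ITER_CHK=30   # number of restarts after which the crash rate is checked
DURATION=600      # time window in seconds for the crash-rate check
count=0
STARTTIME=`date +%s`
while :
do
    # remove any leftover lcache instance, then start a fresh one
    killall -9 lcache 2>/dev/null
    #sleep 1
    /opt/novell/naudit/lcache -int:600 -c &
    sleep 1
    # block until the background lcache process exits
    wait
    echo "`date` : lcache process has crashed." >>/tmp/lcacheprocessstatus.log
    let "count+=1"
    if [ $count -ge ${NUM_ITER_CHK} ]
    then
        NOW=`date +%s`
        DIFF=`expr $NOW - $STARTTIME`
        # NUM_ITER_CHK restarts within DURATION seconds: pause before retrying
        if [ $DIFF -le $DURATION ]
        then
            echo "`date` : lcache process crashing frequently." >>/tmp/lcacheprocessstatus.log
            sleep 300
        fi
        STARTTIME=$NOW
        count=0
    fi
done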
-------------------------------------------------------------------------
- make script executable: "chmod +x lcachemonitoring.sh"
- add this script to runlevel 3: "chkconfig lcachemonitoring.sh 3"
Cause
After a crash, lcache was restarted automatically, and the restarted process no longer ran as user "root".
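A quick way to confirm this condition is to list lcache processes that are not owned by root. The sketch below assumes input lines of the form "user pid command"; non_root_lcache is a hypothetical helper name.

```shell
# Read "user pid command" lines on stdin and print the user and PID of every
# lcache process that is not owned by root.
non_root_lcache() {
    awk '$3 == "lcache" && $1 != "root" { print $1, $2 }'
}

# Example with standard procps output columns:
#   ps -eo user=,pid=,comm= | non_root_lcache
```

Any output from this check means at least one lcache instance is running under a non-root account, as in the "ps -ef" output in the Situation section.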
Additional Information
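The script starts lcache, waits for the process to exit, logs the crash to /tmp/lcacheprocessstatus.log, and starts it again. Because the script is launched from "/etc/init.d", the restarted lcache process runs as user root. If 30 restarts occur within 600 seconds, the script logs that lcache is crashing frequently and waits 300 seconds before the next restart.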