Web Server load balancing not working after upgrading to 3.2 and applying updates

  • 7014203
  • 29-Nov-2013
  • 29-Nov-2013

Environment

NetIQ Access Manager 3.2
NetIQ Access Manager 3.2 Support Pack 2 applied
NetIQ Access Manager 3.2 Access Gateway (AG) protecting multiple back-end applications
One AG reverse proxy front-ending 6 different back-end servers with stickiness enabled

Situation

Access Manager 3.1 was set up and working fine. After upgrading to Access Manager 3.2 SP2, users accessing one specific PeopleSoft application reported slowness and performance problems. The Access Gateway (AG) reverse proxy accelerating this PeopleSoft application was configured to load balance between 6 PeopleSoft web servers, with stickiness enabled.

Further investigation showed that the AG was sending requests to only 3 of the 6 PeopleSoft web servers. The AG server-status page shows the load balancing statistics for the Balancer module, and these clearly showed that 3 of the servers had received no requests, e.g.:

Reverse Proxy Peoplesoft

SSes               Timeout  Method
ZNPCQ003-33333200  0        byrequests

Sch   Host          Stat  Route     Redir  F  Set  Acc  Wr    Rd
http  23.14.93.156  Ok    39ef0f95         1  0    628  2.4M  1.7M
http  23.14.93.157  Ok    ab1e37e9         1  0    0    0     0
http  23.14.93.158  Ok    0d26e398         1  0    585  2.1M  6.5M
http  23.14.93.159  Ok    37c6bba9         1  0    0    0     0
http  23.14.93.212  Ok    6faf9886         1  0    676  2.1M  1.5M
http  23.14.93.213  Ok    9cf93466         1  0    0    0     0
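
To spot this condition quickly, the worker rows from the server-status output can be checked for members whose Acc counter is zero. The following is a minimal Python sketch, not part of the product: the function name idle_workers and the SAMPLE text are illustrative, and it assumes each worker row is whitespace-separated with an empty Redir column, so that Acc is the third field from the end (Acc, Wr, Rd).

```python
# Worker rows copied from the server-status output (one row per line).
SAMPLE = """\
http 23.14.93.156 Ok 39ef0f95 1 0 628 2.4M 1.7M
http 23.14.93.157 Ok ab1e37e9 1 0 0 0 0
http 23.14.93.158 Ok 0d26e398 1 0 585 2.1M 6.5M
http 23.14.93.159 Ok 37c6bba9 1 0 0 0 0
http 23.14.93.212 Ok 6faf9886 1 0 676 2.1M 1.5M
http 23.14.93.213 Ok 9cf93466 1 0 0 0 0
"""


def idle_workers(status_text):
    """Return the hosts whose Acc (requests handled) counter is zero."""
    idle = []
    for line in status_text.splitlines():
        fields = line.split()
        # Skip blank lines, headers and anything that is not a worker row.
        if len(fields) < 5 or not fields[0].startswith("http"):
            continue
        host = fields[1]
        accesses = fields[-3]  # assumed layout: ... Acc Wr Rd
        if accesses == "0":
            idle.append(host)
    return idle


print(idle_workers(SAMPLE))
```

With the rows shown above, the three members that received no requests are 23.14.93.157, 23.14.93.159 and 23.14.93.213.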

The issue does not reappear after the server is rebooted.

Resolution

Perform the following changes on the AG:
 
a) Open /opt/novell/nam/mag/webapps/agm/WEB-INF/agm.properties
b) Search for the line 'linux.apache.command.gracefulrestart = graceful'
c) Replace it with 'linux.apache.command.gracefulrestart = restart'
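
If the change has to be made on several Access Gateways, steps a) to c) could be scripted. The following is a minimal Python sketch under stated assumptions: the helper names force_full_restart and patch_file are hypothetical, and keeping a .bak copy of the file is an added precaution that the steps above do not require. Only the file path and the property line come from this document.

```python
import re
import shutil

# Path taken from step a).
AGM_PROPERTIES = "/opt/novell/nam/mag/webapps/agm/WEB-INF/agm.properties"

# Matches the property line from step b), tolerating extra spaces or tabs.
_GRACEFUL_LINE = re.compile(
    r"^([ \t]*linux\.apache\.command\.gracefulrestart[ \t]*=[ \t]*)graceful[ \t]*$",
    re.MULTILINE,
)


def force_full_restart(properties_text):
    """Return the text with 'graceful' replaced by 'restart' on that line."""
    return _GRACEFUL_LINE.sub(r"\g<1>restart", properties_text)


def patch_file(path=AGM_PROPERTIES):
    """Rewrite the properties file in place, keeping a .bak copy first."""
    shutil.copy2(path, path + ".bak")  # backup is an assumption, not a documented step
    with open(path, "r", encoding="utf-8") as handle:
        original = handle.read()
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(force_full_restart(original))
```

Calling patch_file() on the AG would apply the edit; running it a second time leaves the file unchanged, because the pattern no longer matches once the value is 'restart'.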

The issue is triggered every time a change is applied, whether the change is to the proxy having the issue (PeopleSoft) or to another proxy on the system. Forcing Apache to restart fully rather than gracefully works around the Apache defect tracked at https://issues.apache.org/bugzilla/show_bug.cgi?id=55152