Apache fails to start on RHEL 7.3 after upgrade to NAM 4.4

  • 7021224
  • 23-Aug-2017
  • 06-Sep-2017

Environment

Access Manager 4.3

Situation

Access Manager Admin Console, Identity Server and Access Gateway Service running version 4.3 are installed on RHEL 7.3. Everything works well, in that users can access AG protected resources and SAML connectors after authenticating at the Identity Server. The number of AG protected applications is quite high, i.e. over 100 proxy services.
 
After upgrading the setup from NAM 4.3.2 to NAM 4.4, Apache fails to start correctly and the following errors are reported on all the AG nodes.
---------------------------------------------------------------------
[root@RHEL-MAG4 novell-access-gateway-4.4]# /etc/init.d/novell-apache2 restart
Syntax OK
Restarting Novell Gateway Service..
[root@RHEL-MAG4 novell-access-gateway-4.4]#
Broadcast message from systemd-journald@RHEL-MAG4 (Fri 2017-08-18 16:21:07 IST):

httpd[16883]: [core:emerg] [pid 16883:tid 139877523965824] (28)No space left on device: AH00023: Couldn't create the proxy-balancer-shm mutex


Broadcast message from systemd-journald@RHEL-MAG4 (Fri 2017-08-18 16:21:07 IST):

httpd[16883]: [proxy_balancer:emerg] [pid 16883:tid 139877523965824] (28)No space left on device: AH01180: mutex creation of proxy-balancer-shm : p7e85ef8b_bal_path_8853_ws failed

Resolution

Increase the number of semaphore arrays in the RHEL 7.3 kernel settings. The following steps show how to increase the number of semaphore arrays.

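1) Open /etc/sysctl.conf and add the following line. The four values are, in order: max semaphores per array, max semaphores system wide, max ops per semop call, max number of arrays.

kernel.sem = 250 256000 32 1024

With this setting, ipcs -ls reports the following limits:

------ Semaphore Limits --------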
max number of arrays = 1024
max semaphores per array = 250
max semaphores system wide = 256000
max ops per semop call = 32

Set "max number of arrays" to at least [no. of proxy services * 2]. The following condition should also be maintained:

 Max semaphores system wide = max no. of arrays * max semaphores per array

2) Run sysctl -p on the command line for the changes to take effect.

3) Now start Apache. (If it still fails to start, raise the limits further based on the number of balancers.)
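
As a rough sketch of the sizing rule in step 1, the helper below prints the smallest kernel.sem line that satisfies it for a given number of proxy services. The function name kernel_sem_line is hypothetical and not part of NAM, and the per-array and per-call values (250 and 32) are simply carried over from the example line in step 1. Treat the result as a lower bound, since other processes on the host also use semaphore arrays.

```shell
#!/bin/sh
# Hypothetical helper, not part of NAM: print the minimum kernel.sem line
# for a given number of proxy services.
# kernel.sem field order: SEMMSL SEMMNS SEMOPM SEMMNI
kernel_sem_line() {
    proxies=$1
    semmsl=250                    # max semaphores per array (from the example in step 1)
    semopm=32                     # max ops per semop call (from the example in step 1)
    semmni=$((proxies * 2))       # max number of arrays: two balancers per proxy service
    semmns=$((semmni * semmsl))   # max semaphores system wide = arrays * semaphores per array
    echo "kernel.sem = $semmsl $semmns $semopm $semmni"
}

kernel_sem_line 512
```

For 512 proxy services this prints the same line that step 1 adds to /etc/sysctl.conf (1024 arrays, 256000 semaphores system wide).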

Cause

The issue is not seen on systems with a lighter load: if the number of proxy services is around 60 or fewer, Apache starts normally. This is because a semaphore array is reserved for each balancer configured. With WebSocket support in NAM 4.4, two balancers are now created for every proxy service (one for HTTP and another for WebSocket), so with more than about 60 proxy services the default limit of 128 semaphore arrays can be exceeded and the limit needs to be raised.
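
The two-balancers-per-proxy-service rule can be turned into a quick check against the running kernel. This is a minimal sketch: the function name sem_arrays_sufficient is hypothetical, and it assumes the limits are read from /proc/sys/kernel/sem, whose fourth field is the maximum number of semaphore arrays.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the array limit (4th kernel.sem field, SEMMNI)
# covers two balancers per proxy service.
sem_arrays_sufficient() {
    sem_values=$1      # e.g. the contents of /proc/sys/kernel/sem
    proxies=$2
    # Split the four fields into positional parameters; $4 is SEMMNI.
    set -- $sem_values
    [ "$4" -ge $((proxies * 2)) ]
}

# Example: check the running kernel for 104 proxy services.
if [ -r /proc/sys/kernel/sem ]; then
    if sem_arrays_sufficient "$(cat /proc/sys/kernel/sem)" 104; then
        echo "semaphore array limit is sufficient"
    else
        echo "semaphore array limit is too low"
    fi
fi
```

The check ignores semaphore arrays used by other processes, so a result that only just passes may still leave too little headroom.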

In the reliability setup, there were 104 proxy services (208 balancers), which is greater than the default number of semaphore arrays (128). Hence the issue was seen there.
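
To compare actual usage against the limit on a node where Apache is running, the semaphore arrays owned by the Apache account can be counted from ipcs -s output. This is a sketch under assumptions: the function name count_sem_arrays_for is hypothetical, and APACHE_USER is a placeholder that must be set to whatever account Apache runs as on the AG node.

```shell
#!/bin/sh
# Hypothetical helper: count semaphore arrays owned by a given user,
# reading `ipcs -s` output (columns: key semid owner perms nsems) on stdin.
count_sem_arrays_for() {
    awk -v owner="$1" '$1 ~ /^0x/ && $3 == owner { n++ } END { print n + 0 }'
}

# APACHE_USER is an assumption; set it to the account your Apache runs as.
APACHE_USER=${APACHE_USER:-apache}
if command -v ipcs >/dev/null 2>&1; then
    ipcs -s | count_sem_arrays_for "$APACHE_USER"
fi
```

The printed count can then be compared with the "max number of arrays" value reported by ipcs -ls.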

Investigation showed an implementation change between Apache 2.2 and Apache 2.4: in Apache 2.2, a semaphore array was not reserved for each balancer. This is why the issue is now seen.

For now, the issue can be resolved by tuning the number of semaphore arrays. On SLES the issue is unlikely to occur, because it has 1024 semaphore arrays.