Environment
Novell Open Enterprise Server 1 (OES 1) Linux
Novell Open Enterprise Server 2 (OES 2) Linux
Situation
When you try to access Novell Remote Manager (NRM) at https://IP_Address_or_DNSname:8009, the page is never displayed.
Further, when checking the status of NRM -- /etc/init.d/novell-httpstkd status -- the service is shown as dead.
Resolution
The least disruptive way to resolve this issue is to:
1. Identify the leftover httpstkd process by running "ps aux | grep httpstkd | grep -v grep"
The first part, "ps aux | grep httpstkd", lists the running processes whose entries contain "httpstkd".
The second part, after the second pipe (|), excludes the grep command itself from the results.
Therefore, only httpstkd processes should be returned, and the output will look like this:
# ps aux | grep httpstkd | grep -v grep
root 2447 0.0 0.3 197416 7852 ? Sl 10:56 0:00 /opt/novell/httpstkd/sbin/httpstkd
The number after "root" is the actual process ID, or PID.
2. Kill the remaining process by running "kill <pid>", replacing <pid> with the number from the ps output.
In this case, the command would be "kill 2447". NOTE: It may take a few seconds for the process to exit. To verify that the process is gone, rerun step 1 until the PID is no longer returned.
3. Restart the NRM service by running:
rcnovell-httpstkd start
or
/etc/init.d/novell-httpstkd start
This will only restart the novell-httpstkd service (aka NRM) and not impact any other services.
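The three steps above can be sketched as a small shell script. This is a minimal sketch rather than a supported tool: the function names (httpstkd_pids, wait_for_exit, recover_nrm) are hypothetical, the 10-second wait is an assumed default, and the script is assumed to run as root on the affected server.

```shell
#!/bin/sh
# Hypothetical helpers; the function names are illustrative only.

# Step 1: print the PID column (field 2) of each "ps aux" line that
# mentions httpstkd, skipping any grep command line. Reads ps output on stdin.
httpstkd_pids() {
    awk '/httpstkd/ && !/grep/ { print $2 }'
}

# Return success once PID $1 is gone, polling once per second for up to
# $2 seconds (assumed default: 10). "kill -0" only tests whether the
# process exists; it does not send a signal.
wait_for_exit() {
    _pid=$1
    _tries=${2:-10}
    while [ "$_tries" -gt 0 ] && kill -0 "$_pid" 2>/dev/null; do
        sleep 1
        _tries=$((_tries - 1))
    done
    ! kill -0 "$_pid" 2>/dev/null
}

# Steps 1 to 3 in order. Defined here but not called automatically.
recover_nrm() {
    for pid in $(ps aux | httpstkd_pids); do
        kill "$pid"
        wait_for_exit "$pid" || { echo "PID $pid is still running" >&2; return 1; }
    done
    /etc/init.d/novell-httpstkd start
}
```

To use it, source the file and run recover_nrm. The function stops before the restart if a leftover process has not exited within the wait period, so the service is never started while the old process still exists.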
Additional Information
The following messages in /var/log/messages are a further indication of this issue:
After trying to launch NRM (while the process is dead), you may see:
Jan 1 01:00:16 MyOESServer httpstkd[3617]: PAM_NAM: User admin.org unknown to the authentication module
Jan 1 01:00:17 MyOESServer httpstkd[3617]: SSL_accept() ERROR: 0: error:00000000:lib(0):func(0):reason(0)
If you try to restart novell-httpstkd, or stop and then start the service, you may see:
Jan 1 01:01:01 MyOESServer httpstkd[5494]: Error initializing sockets, ccode = 0x62
The root cause is that the previous instance became unresponsive while its process kept running. Restarting novell-httpstkd alone therefore does not help: the old process still holds the listening TCP ports (8008 and 8009), so the new instance fails with the error initializing sockets. The ccode value 0x62 is decimal 98, which matches the Linux errno EADDRINUSE ("Address already in use").
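To check quickly whether a server has logged the socket error shown above, the log can be searched for it. The following is a minimal sketch: the function name nrm_socket_errors is hypothetical, and the pattern assumes the message format matches the sample line above.

```shell
#!/bin/sh
# Hypothetical helper: count the httpstkd "Error initializing sockets"
# lines in a log file. Defaults to /var/log/messages when no file is given.
nrm_socket_errors() {
    # "|| true" keeps the exit status at 0 when grep finds no matching lines.
    grep -c 'httpstkd\[[0-9]*]: Error initializing sockets' "${1:-/var/log/messages}" || true
}
```

A non-zero count suggests that at least one start attempt found the ports already in use.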
If the above resolution does not work, there may be another application using the same ports (8008 & 8009). If that is the case, killing the PIDs for httpstkd will not resolve the issue as that will not free the port(s). An alternate resolution would be to:
- Verify the processes using ports 8008 and 8009 with:
lsof -i | grep 8008
lsof -i | grep 8009
which would return something to the effect of:
httpstkd 3525 root 4u IPv4 10435 TCP *:8009 (LISTEN)
httpstkd 4984 root 4u IPv4 10435 TCP *:8009 (LISTEN)
- If processes other than httpstkd are listed, then either:
  - investigate reconfiguring them to use a different port
  - shut down the other process and restart httpstkd
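The check for other processes on these ports can be sketched as a filter over lsof output. This is an illustration under assumptions: the function name foreign_listeners is hypothetical, and the filter assumes lsof output in the column layout shown above, with the command in the first field and the PID in the second.

```shell
#!/bin/sh
# Hypothetical helper: read lsof output on stdin and print "COMMAND PID" for
# every process other than httpstkd that is listening on port 8008 or 8009.
foreign_listeners() {
    awk '/:(8008|8009) / && /\(LISTEN\)/ && $1 != "httpstkd" { print $1, $2 }'
}

# Example usage (run as root so that all processes are visible):
#   lsof -nP -i TCP:8008 -i TCP:8009 | foreign_listeners
```

The -n and -P options stop lsof from translating addresses and port numbers into names, so the ports appear as 8008 and 8009. Empty output means that only httpstkd, or nothing at all, is listening on those ports.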
Finally, the solutions above are the more expedient option because the system stays up the entire time; only services are restarted. Alternatively, you could restart the server (shutdown -r now), but the whole server would be unavailable for a brief time, and any services that conflict over ports 8008 and/or 8009 (httpstkd and others) would still need to be reconfigured so that they no longer conflict.