POA, MTA, or Gateway attempts to start continuously

  • 7003503
  • 10-Jun-2009
  • 27-Apr-2012

Environment

Novell GroupWise 7
Novell Open Enterprise Server 2 (OES 2) Linux
Novell Cluster Services 1.8.4
Novell GroupWise Monitor
Novell GroupWise High Availability

Situation

When checking the POA, MTA, or Gateway log files, hundreds or thousands of new, small log files are found. A new log file is created approximately every two minutes (or at the refresh interval set up in GroupWise Monitor). During these times GroupWise Monitor (GWMon) shows the agent as being up (Normal).

The beginning of each of these small log files looks like this:
16:17:49 888 ************************* Initializing ************************
16:17:49 120 Checking guardian database
16:17:49 736 Starting GWPOA-GWEvent Reader 1
16:17:49 888 Error Listen Port is already in use. [0000]
16:17:49 888 Error verifying that the port is usable. [8555]
16:17:49 888 Shutdown of Threads
16:17:49 888 Error Starting TCP/IP Agent: [8555]
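
To gauge how many of the small log files come from these failed start attempts, you can search a log directory for the port error shown above. The following is a minimal sketch only: the function name count_failed_start_logs is hypothetical, and the log directory is passed as an argument because its location depends on how the agent was configured.

```shell
#!/bin/sh
# Hypothetical helper: count the files in a log directory that contain
# the "Listen Port is already in use" error from the excerpt above.
# $1 = log directory of the agent (site-specific, so passed in).
count_failed_start_logs() {
    # grep -l prints each matching file name once; wc -l counts them.
    # tr strips the padding that some wc implementations add.
    grep -l 'Listen Port is already in use' "$1"/* 2>/dev/null | wc -l | tr -d ' '
}
```

Call it as count_failed_start_logs followed by the path to the agent's log directory. A count that grows by one at each GWMon refresh interval matches the behavior described in this document.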

Running "rcgrpwise status" on the Server or Node hosting the GroupWise Agent or Resource shows "unused" for the POA, MTA, and/or Gateway known to be running on this Server or Node. To verify that the agent is running, run "ps -ef|grep -i group". The output lists the agent processes that are currently running, together with their PIDs.

Verify whether a text file exists in the /var/run/novell/groupwise directory. This file will be named "PO.DOM.pid" (POA), "DOM.pid" (MTA), or "GW.pid" (Gateway). Its contents will be a single line containing the PID of the agent as listed by the "ps" command above. If this file is missing, see the Resolution below.
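
The check described above can also be scripted. This is a sketch under stated assumptions: the function name check_pid_file and the status words it prints are hypothetical, and "kill -0" is used only as a generic way to test whether a PID belongs to a live process (run it as root so that the test is not defeated by permissions).

```shell
#!/bin/sh
# Hypothetical helper: report the state of a PID file.
# Prints "missing", "invalid", "stale", or "ok".
# $1 = full path to the PID file.
check_pid_file() {
    pid_file=$1
    if [ ! -f "$pid_file" ]; then
        echo "missing"
        return 1
    fi
    # The file is expected to hold a single line containing the PID.
    pid=$(head -n 1 "$pid_file" | tr -d '[:space:]')
    case $pid in
        ''|*[!0-9]*) echo "invalid"; return 2 ;;
    esac
    # kill -0 sends no signal; it only tests that the process exists.
    if kill -0 "$pid" 2>/dev/null; then
        echo "ok"
    else
        echo "stale"
        return 3
    fi
}
```

A result of "missing" for an agent that "ps" shows as running corresponds to the situation this document addresses.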

Resolution

If the PID file for the GroupWise Agent wasn't created when the agent was started, GWMon will direct GWHA to start the agent. Manually creating the .pid text file resolves this issue until the next restart of the agent.

On the Server or Node hosting the agent:
1. "cd /var/run/novell/groupwise"
2. "vi <PO>.<DOM>.pid" (DOM.pid for the MTA, or GATEWAY.DOM.pid for a Gateway)
         (PO, Domain, and Gateway names should be all caps. The extension ".pid" is all lower case.)
3. Type "i" to switch to Insert mode.
4. Enter the PID reported by the "ps" command in the Situation section above.
5. Hit Escape.
6. Hold down the Shift key and hit "Z" twice. This will save and close the file.

At this point GWMon and GWHA will no longer attempt to start the agent.
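
As an alternative to the vi steps above, the file can be written non-interactively. The following is a minimal sketch, not a Novell-supplied tool: the function name write_pid_file is hypothetical, as are the names PO1 and DOM1 and the PID 12345 in the commented example. It upper-cases the name and keeps the ".pid" extension in lower case, as the steps above require.

```shell
#!/bin/sh
# Hypothetical helper: create a PID file for an agent that is already running.
# $1 = directory (normally /var/run/novell/groupwise)
# $2 = file name without extension, for example PO.DOM for a POA
# $3 = PID of the running agent, as reported by ps
write_pid_file() {
    dir=$1
    # PO, Domain, and Gateway names must be all caps.
    name=$(printf '%s' "$2" | tr '[:lower:]' '[:upper:]')
    pid=$3
    case $pid in
        ''|*[!0-9]*) echo "not a numeric PID: $pid" >&2; return 1 ;;
    esac
    # Single line containing only the PID.
    printf '%s\n' "$pid" > "$dir/$name.pid"
}

# Example with hypothetical names and PID (run as root on the hosting node):
# write_pid_file /var/run/novell/groupwise PO1.DOM1 12345
```

Afterwards, "rcgrpwise status" should report the agent correctly, as described under Additional Information.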

Additional Information

This solution may be different for versions of GroupWise 7 prior to SP3. GroupWise Monitor not only monitors each agent (POA, MTA, Gateway) but also communicates with GWHA. If the PID file is missing, GWHA notifies GWMon, and GWMon then directs GWHA to start the agent. Because the agent is already running, its Listen port is already in use, so the new agent process shuts down, leaving a new log file behind.

Creating the new PID file in /var/run/novell/groupwise accomplishes two things:
1. "rcgrpwise status" works and shows the correct status of the running agent(s).
2. GWHA no longer notifies GWMon that there's no PID file for the agent, so GWMon doesn't direct GWHA to start a new agent.

The GroupWise agents (gwpoa, gwmta, gwia, gwinter) control the creation of the PID file. The PID file name and location are defined by the /etc/init.d/grpwise script and are passed to the GroupWise Agent by startproc using the "-p" flag. If you start a GroupWise agent manually without using the /etc/init.d/grpwise script, no PID file is created.

Furthermore, if you start GroupWise (/etc/init.d/grpwise start) without specifying any agents or startup files, the /etc/init.d/grpwise script queries the gwha.conf file and attempts to start all defined agents. If the gwha.conf file is corrupt or incorrectly formatted, the PID file will either not be created or be created with an incorrect name and not be recognized. (See TID 3740936 for an improperly formatted gwha.conf file.)