General GMS Performance Troubleshooting

Document ID: 7017626
Creation Date: 20-May-2016
Modified Date: 20-May-2016
- Micro Focus Products:
  GroupWise
  GroupWise Mobility Service

Environment

Novell GroupWise Mobility Service
Novell Data Synchronizer Mobility Pack
Novell GroupWise 2014

Situation

Email, appointments, and contacts take a long time to reach mobile devices.
Sync of mail or appointments is slow or delayed.
Slow performance of Mobility server.
GroupWise, Mobility/Device connector stops randomly

Resolution

Troubleshooting performance issues can be complex because many components must be considered and ruled out as the culprit. Not every issue has been captured, but a list of common slow sync issues can be found in TID 7013038 - Slow sync of Mobility - Master TID.

The purpose of this TID is to provide a technical guide for general GMS performance troubleshooting (tips and tricks).

GroupWise Sync Agent (GWSA):

Is the queue of GroupWise events high (in-memory cache of events to fetch from POA)?
tailf groupwise-agent.log | grep -i queue

Are these events decreasing or increasing? Here are some potential causes of increasing events found in the queue:

A large number of events may simply be arriving at the same time and may be processed without any problem. Monitor the queue to confirm that it drains.
A potential GroupWise problem: spam, recurring rule being executed, etc. Repeated messages in the GW system could trickle down to a performance problem in GMS.
Events could be taking a long time to download from GroupWise to GMS. Perhaps a POA has a hung SOAP thread or unusually high latency: See TID 7013102.
The GMS WebAdmin GroupWise Poll POA for Events configuration is set too low: See TID 7014487.
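
To judge whether the queue is growing or draining, a series of queue sizes can be compared over time. The following is a minimal sketch, not part of GMS or dsapp: queue_trend is a hypothetical helper that reads one integer per line and compares the last sample with the first. How the numbers are extracted from groupwise-agent.log depends on the actual log line format, which is not assumed here.

```shell
# queue_trend (hypothetical helper): read one integer queue size per line
# on stdin and report whether the last sample is higher than, lower than,
# or equal to the first sample.
queue_trend() {
    awk '
        NR == 1 { first = $1 + 0 }   # remember the first sample as a number
        { last = $1 + 0 }            # keep overwriting with the latest sample
        END {
            if (NR < 2)            print "not enough samples"
            else if (last > first) print "increasing"
            else if (last < first) print "decreasing"
            else                   print "steady"
        }'
}
```

For example, printf '120\n95\n40\n' | queue_trend reports a draining queue. Only the first and last samples are compared, so a short spike in the middle is ignored.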

What is the state of the events pending SyncEngine accept/reject?
Note: Once the event has been downloaded, GWSA passes it to the SyncEngine and stores it in the database.
psql -U datasync_user datasync -c 'select state,count(*) from consumerevents group by state;'
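
A single run of this query shows only a snapshot. To see whether the counts per state are rising or falling, the same command can be repeated at an interval. The sketch below assumes a POSIX shell; sample_every is a hypothetical helper name, not a GMS tool.

```shell
# sample_every INTERVAL COUNT COMMAND [ARGS...]   (hypothetical helper)
# Run COMMAND COUNT times, INTERVAL seconds apart, printing a timestamp
# header before each run so successive outputs can be compared.
sample_every() {
    interval=$1
    count=$2
    shift 2
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
        "$@"
        i=$((i + 1))
        # Sleep between samples, but not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

Usage, taking ten samples one minute apart:
sample_every 60 10 psql -U datasync_user datasync -c 'select state,count(*) from consumerevents group by state;'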

Are these events primarily for a specific user?
Please see TID 7016454 - Slow sync of events (Mobility server overloaded with consumerevents for some users).

Mobility Sync Agent (MSA):

Is the mobility-agent conversion queue high (events pending conversion to ActiveSync)?
tailf mobility-agent.log | grep -i queue
What is the state of the events pending conversion?
psql -U datasync_user datasync -c 'select state,count(*) from consumerevents group by state;'
Note: Enter the datasync_user password when prompted. If it is unavailable, run dsapp -s to get similar queue output. Here are some known event states that may be reported:
STATE_PENDING = '1'
STATE_RETRY = '2'
STATE_DEPENDENT = '3'
STATE_PENDING_DEPENDENT = '4'
STATE_RETRY_DEPENDENT = '5'
STATE_ERROR_0 = '1000'
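
The numeric states in the query output can be translated to the names above. The following is a sketch only: label_states is a hypothetical helper, it knows only the six codes listed in this document, and it assumes the query is run with the psql options -A (unaligned output) and -t (rows only), so that each row arrives as state|count.

```shell
# label_states (hypothetical helper): read "state|count" lines on stdin
# and print "NAME count", using only the state codes listed above.
label_states() {
    awk -F'|' '
        BEGIN {
            name["1"]    = "STATE_PENDING"
            name["2"]    = "STATE_RETRY"
            name["3"]    = "STATE_DEPENDENT"
            name["4"]    = "STATE_PENDING_DEPENDENT"
            name["5"]    = "STATE_RETRY_DEPENDENT"
            name["1000"] = "STATE_ERROR_0"
        }
        NF >= 2 {
            # Codes not in the table are passed through, marked as unknown.
            label = ($1 in name) ? name[$1] : ("UNKNOWN(" $1 ")")
            print label, $2
        }'
}
```

Usage:
psql -U datasync_user datasync -A -t -c 'select state,count(*) from consumerevents group by state;' | label_states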
Are device threads available for incoming devices?
Note: Perhaps device(s) are overwhelming the server with requests.
tailf mobility-agent.log | grep -i threads

Which devices have the most incoming requests (by deviceId):
grep -Po "(?<=DeviceId=)[^<]*(?=&)" mobility-agent.log | sort | uniq -c | sort -nr
Which devices have the most incoming requests (by connection string info):
grep -Po "(?<='QUERY_STRING':)[^<]*(?=', 'CONTENT)" mobility-agent.log | sort | uniq -c | sort -nr
Note: This command will have more information about the device's request, including the 'CMD'.
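
When the log is large, it can help to list only the busiest devices. The sketch below assumes, as the commands above do, that each request line contains DeviceId=<id> inside the query string. top_devices is a hypothetical helper name. The values are sorted before counting because uniq -c counts only adjacent duplicates.

```shell
# top_devices LOGFILE [N]   (hypothetical helper)
# Print the N (default 10) most frequent DeviceId values in LOGFILE as
# "count deviceId", busiest first.
top_devices() {
    grep -o "DeviceId=[^&' ]*" "$1" |   # pull out each DeviceId=<id> token
        cut -d= -f2 |                   # keep only the id
        sort |                          # group identical ids together
        uniq -c |                       # count each group
        sort -nr |                      # highest count first
        head -n "${2:-10}" |
        awk '{ print $1, $2 }'          # normalize the uniq -c padding
}
```

Usage: top_devices mobility-agent.log 5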
What devices are connecting right now (show current incoming device requests)?
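tailf mobility-agent.log | grep --line-buffered -Po "(?<='QUERY_STRING':)[^<]*(?=', 'CONTENT)" | while read -r line; do echo "$(date) $line"; done
Note: The date is evaluated for each line as it arrives, and --line-buffered makes grep pass each match on immediately.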
Does the Operating System (OS) report a SYN Flood?
Note: A SYN flood is a form of denial-of-service attack in which an attacker sends a succession of SYN requests to a target's system in an attempt to consume enough server resources to make the system unresponsive to legitimate traffic.
grep -i 'syn flooding' /var/log/messages
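
To see whether such messages cluster on particular days, the matches can be counted per day. This is a sketch under the assumption that /var/log/messages uses the traditional syslog timestamp, where the first two fields are the month and the day (for example "May 20"). syn_flood_by_day is a hypothetical helper name.

```shell
# syn_flood_by_day LOGFILE   (hypothetical helper)
# Count "SYN flooding" messages per day, assuming each line starts with
# a "Mon DD" syslog timestamp. Output is "Mon DD count", one day per line.
syn_flood_by_day() {
    grep -i 'syn flooding' "$1" |
        awk '{ count[$1 " " $2]++ }
             END { for (day in count) print day, count[day] }' |
        sort
}
```

Usage: syn_flood_by_day /var/log/messages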

Server / Hardware Performance:

Check the server's available CPU and memory during slow performance:

There are many tools that can be used for this purpose. Here are a few tools or files: top, vmstat, free, /proc/meminfo, /var/log/messages
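
As one illustration of reading /proc/meminfo, the sketch below estimates how much memory is free or readily reclaimable. It is an approximation that sums MemFree, Buffers, and Cached and divides by MemTotal. mem_free_pct is a hypothetical helper name, and the optional file argument exists only so that a saved copy of meminfo can be analyzed.

```shell
# mem_free_pct [MEMINFO_FILE]   (hypothetical helper)
# Print the percentage of memory that is free or reclaimable
# (MemFree + Buffers + Cached) relative to MemTotal, as a whole number.
mem_free_pct() {
    awk '
        $1 == "MemTotal:" { total  = $2 }
        $1 == "MemFree:"  { avail += $2 }
        $1 == "Buffers:"  { avail += $2 }
        $1 == "Cached:"   { avail += $2 }
        END {
            if (total > 0) printf "%d\n", avail * 100 / total
            else           print "unknown"
        }' "${1:-/proc/meminfo}"
}
```

A low value during the slow period, together with swap activity in vmstat, suggests memory pressure on the server.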

Check the server's disk performance during non-peak hours, when GMS can be shut down temporarily:

Please refer to TID 7009812 - Slow Performance of Mobility during peak hours.

PostgreSQL (PSQL) queries or commands may be slow. Possible causes:

Slow Disk I/O can also impact PSQL performance (see above).
Manual PSQL database maintenance has not been run recently (recommend every 6 months).

To help determine when manual maintenance was last run, please see TID 7015569 - How to run dsapp general health check on a Mobility server.
To run manual maintenance, please see TID 7009453.

Nightly maintenance is not running or not completing regularly: see TID 7013192 - How to check Nightly Maintenance is completing on Mobility server.