General GMS Performance Troubleshooting

  • 7017626
  • 20-May-2016
  • 20-May-2016

Environment

Novell GroupWise Mobility Service
Novell Data Synchronizer Mobility Pack
Novell GroupWise 2014

Situation

Emails, appointments, and contacts take a long time to reach mobile devices.
Sync of mail or appointments is slow or delayed.
The Mobility server performs slowly.
The GroupWise or Mobility/Device connector stops randomly.

Resolution

Troubleshooting performance issues can be complex because many components must be considered and ruled out as the cause. Not every issue has been documented, but a list of common slow sync issues can be found in TID 7013038 - Slow sync of Mobility - Master TID.

The purpose of this TID is to provide a technical guide for general GMS performance troubleshooting (tips and tricks).

GroupWise Sync Agent (GWSA):

  • Is the queue of GroupWise events high (in-memory cache of events to fetch from POA)?
    tailf groupwise-agent.log | grep -i queue
    • Are these events decreasing or increasing? Here are some potential causes of increasing events found in the queue:
      • A large burst of events is arriving and may be processed without any problem. Monitor the queue to confirm that it drains.
      • A potential GroupWise problem: spam, recurring rule being executed, etc. Repeated messages in the GW system could trickle down to a performance problem in GMS.
      • Events could be taking a long time to download from GroupWise to GMS. Perhaps a POA has a hung SOAP thread or unusually high latency: See TID 7013102.
      • The GMS WebAdmin GroupWise Poll POA for Events configuration is set too low: See TID 7014487.
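
Whether the queue is growing or draining can be judged by comparing samples taken over time. The following is a minimal sketch, not part of GMS or dsapp: queue_trend is a hypothetical helper that expects one queue-depth integer per line (oldest first). How to extract that number from groupwise-agent.log depends on the log line format, which is not shown here, so that step is left to the reader.

```shell
# Hypothetical helper: classify a series of queue-depth samples.
# Input: one integer per line, oldest sample first. Other lines are ignored.
queue_trend() {
  awk '
    /^[0-9]+$/ {
      if (n == 0) first = $1 + 0   # remember the oldest sample
      last = $1 + 0                # keep overwriting with the newest sample
      n++
    }
    END {
      if (n < 2)             print "not enough samples"
      else if (last > first) print "increasing (" first " -> " last ")"
      else if (last < first) print "decreasing (" first " -> " last ")"
      else                   print "steady (" first ")"
    }'
}
```

For example, feeding it the values 10, 40, and 250 reports an increasing queue, which would point toward one of the causes listed above.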

  • What is the state of the events pending SyncEngine accept/reject?
    Note: Once the event has been downloaded, GWSA passes it to the SyncEngine and stores it in the database.
    psql -U datasync_user datasync -c 'select state,count(*) from consumerevents group by state;'
    • Are these events primarily for one specific user?
      Please see TID 7016454 - Slow sync of events (Mobility server overloaded with consumerevents for some users).

Mobility Sync Agent (MSA):

  • Is the mobility-agent conversion queue high (events pending conversion to ActiveSync)?
    tailf mobility-agent.log | grep -i queue

  • What is the state of the events pending conversion?
    psql -U datasync_user datasync -c 'select state,count(*) from consumerevents group by state;'
    Note: Enter the datasync_user password when prompted. If it is unavailable, run dsapp -s to get a similar summary of the queues. Here are some known event states that may be reported:
    STATE_PENDING = '1'
    STATE_RETRY = '2'
    STATE_DEPENDENT = '3'
    STATE_PENDING_DEPENDENT = '4'
    STATE_RETRY_DEPENDENT = '5'
    STATE_ERROR_0 = '1000'
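
The state codes above can be attached to the query output to make it easier to read. This is a minimal sketch, assuming psql is run with the -A and -t options (unaligned, tuples-only output) so that each row is printed as state|count. The function name label_states is hypothetical and is not part of GMS or dsapp.

```shell
# Hypothetical helper: label consumerevents state codes with the names
# listed in this TID. Input: lines of "state|count" (psql -A -t output).
label_states() {
  awk -F'|' '
    BEGIN {
      name["1"]    = "STATE_PENDING"
      name["2"]    = "STATE_RETRY"
      name["3"]    = "STATE_DEPENDENT"
      name["4"]    = "STATE_PENDING_DEPENDENT"
      name["5"]    = "STATE_RETRY_DEPENDENT"
      name["1000"] = "STATE_ERROR_0"
    }
    NF == 2 {
      # Codes not listed in this TID are reported as UNKNOWN.
      label = ($1 in name) ? name[$1] : "UNKNOWN"
      printf "%s (%s): %s\n", label, $1, $2
    }'
}

# Usage (database and user names as given in this TID):
#   psql -U datasync_user datasync -A -t \
#     -c 'select state,count(*) from consumerevents group by state;' | label_states
```

A row such as 1|120 would then be printed as STATE_PENDING (1): 120.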

  • Are device threads available for incoming devices?
    Note: One or more devices may be overwhelming the server with requests.
    tailf mobility-agent.log | grep -i threads
    • Which devices have the most incoming requests (by deviceId):
      grep -Po "(?<=DeviceId=)[^<]*(?=&)" mobility-agent.log | sort | uniq -c | sort -nr
    • Which devices have the most incoming requests (by connection string info):
      grep -Po "(?<='QUERY_STRING':)[^<]*(?=', 'CONTENT)" mobility-agent.log | sort | uniq -c | sort -nr
      Note: This command will have more information about the device's request, including the 'CMD'.
    • What devices are connecting right now (show current incoming device requests)?
      tailf mobility-agent.log | grep --line-buffered -Po "(?<='QUERY_STRING':)[^<]*(?=', 'CONTENT)" | while IFS= read -r line; do echo "$(date) $line"; done
    • Does the Operating System (OS) report a SYN Flood?
      Note: A SYN flood is a form of denial-of-service attack in which an attacker sends a succession of SYN requests to a target's system in an attempt to consume enough server resources to make the system unresponsive to legitimate traffic.
      grep -i 'syn flooding' /var/log/messages

Server / Hardware Performance:

  • Check the server's available CPU and memory during slow performance:
    • There are many tools that can be used for this purpose. Here are a few tools or files: top, vmstat, free, /proc/meminfo, /var/log/messages
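
As one way to capture a memory snapshot while the slowdown is happening, the fields in /proc/meminfo can be summarized on a single line. This is a minimal sketch: mem_summary is a hypothetical helper, and treating MemFree + Buffers + Cached as "available" memory is a simplifying assumption rather than an exact figure.

```shell
# Hypothetical helper: one-line memory summary from a meminfo-style file.
# Argument: path to the file (defaults to /proc/meminfo). Values are in kB.
mem_summary() {
  awk '
    $1 == "MemTotal:" { total   = $2 }
    $1 == "MemFree:"  { free    = $2 }
    $1 == "Buffers:"  { buffers = $2 }
    $1 == "Cached:"   { cached  = $2 }
    END {
      # Assumption: free + buffers + page cache approximates usable memory.
      usable = free + buffers + cached
      pct = (total > 0) ? 100 * usable / total : 0
      printf "total_mb=%d free_plus_cache_mb=%d (%d%%)\n", total / 1024, usable / 1024, pct
    }' "${1:-/proc/meminfo}"
}

# Usage: log a timestamped sample, for example once a minute from cron:
#   echo "$(date) $(mem_summary)" >> /tmp/mem_samples.log
```

Comparing samples taken during slow and normal periods shows whether memory pressure coincides with the slow sync.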

  • Check the server's disk performance during non-peak hours, when GMS can be shut down temporarily:
    • Please refer to TID 7009812 - Slow Performance of Mobility during peak hours.

  • PostgreSQL (PSQL) queries or commands may be slow. Possible causes:
    • Slow disk I/O can also impact PSQL performance (see above).
    • Manual PSQL database maintenance has not been run recently (recommended every 6 months).
      • To help determine when manual maintenance was last run, please see TID 7015569 - How to run dsapp general health check on a Mobility server.
      • To run manual maintenance, please see TID 7009453.
    • Nightly Maintenance is not running or is not completing regularly: See TID 7013192 - How to check Nightly Maintenance is completing on Mobility server.