GroupWise & Database Corruption

  • 7001372
  • 16-Sep-2008
  • 18-May-2016

Environment

Novell GroupWise 2014
Novell GroupWise 2012
Novell GroupWise 8
Novell GroupWise 7
Novell GroupWise 6.5

Situation

GroupWise & Database corruption.
This document provides background on GroupWise and GWCheck.
Although it specifically targets the D107 Record Not Found error, it also provides some detail on parent/child message relationships.
 
This is a high-level overview, not an exhaustive treatise on the subject.

Resolution

Let’s start by defining a message.
There are two basic types of messages: simple messages and complex messages.
A simple message does not contain any other messages.
A complex message contains other messages, for example a forwarded message or a reply that includes the original message.

A simple message and all of its parts are stored in a single message database.

A complex message consists of a "parent" (or owning) message plus one or more attached messages. Each attached message can be considered a sub-message or child message, and each child message may be stored in a different message database.

A simple message consists of several parts (To, From, Subject, message body, and so on) and may have many more when it carries multiple file attachments.
 
For a message to be complete, all of its parts must be present.

A D107 error is reported when one or more of these message parts is missing and is no longer contained in the database.
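
The idea that a message is only complete when every part can be found can be pictured with a small model. This is a minimal sketch and not GroupWise's actual storage format: the dictionary layout, the record IDs, the `open_message` function, and the `RecordNotFound` exception are all hypothetical names invented for illustration.

```python
class RecordNotFound(Exception):
    """Hypothetical stand-in for a D107 Record Not Found error."""


def open_message(database, message_id):
    """Return every part of a message, or raise if any part is missing."""
    message = database.get(message_id)
    if message is None:
        raise RecordNotFound(f"message {message_id} not in database")
    parts = {}
    for name, record_id in message["parts"].items():
        # Each part is its own record; a missing record breaks the message.
        if record_id not in database:
            raise RecordNotFound(f"part '{name}' (record {record_id}) is missing")
        parts[name] = database[record_id]
    return parts


# A toy message database (assumed layout): record ID -> record contents.
msg_db = {
    1: {"parts": {"from": 2, "to": 3, "subject": 4, "body": 5}},
    2: "alice",
    3: "bob",
    4: "Status report",
    5: "See attached.",
}

complete = open_message(msg_db, 1)  # all parts present, so this succeeds

del msg_db[5]  # simulate the loss of the record holding the message body
try:
    open_message(msg_db, 1)
    outcome = "opened"
except RecordNotFound as err:
    outcome = str(err)
```

In this model the message header still exists after the body record is lost, which is why the problem only surfaces when something tries to read the whole message.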

Each GroupWise user has a unique User database.
In GroupWise 6.5 and earlier, all users on a post office share only 24 message databases.
In GroupWise 7.x and later, there are 255 message databases.
 
If a GWCheck structural rebuild detects one or more corrupt blocks in a message database, those blocks are removed during the rebuild and the records they contained are lost.
 
Database corruption can have many causes. Some of the most common known causes are (in no particular order):
  • Server abends
  • Server power failures
  • Server hardware problems (NICs, hard drives, memory, etc.)
  • Backup software attempting to back up open databases
  • Virus scanners running against open databases
  • User workstations accessing mail via direct access (a mapped drive) rather than client/server
Any interruption of a database transaction can cause corruption.

If a message in a user database references an attached message that has been lost or deleted, a D107 error results whenever the user attempts to open that attached message.
Often these missing records go unnoticed until the user attempts to open the missing message, the user is moved to another post office, the user runs Hit the Road, or the Auto-Archive process runs.
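
The following sketch illustrates why a lost child message can stay hidden for a long time. It is a conceptual model under stated assumptions, not GroupWise internals: the database names `msgA` and `msgB`, the record layout, and the functions `read_subject` and `follow_attachment` are hypothetical.

```python
class RecordNotFound(Exception):
    """Hypothetical stand-in for a D107 Record Not Found error."""


# Two toy message databases (assumed layout): name -> {record ID -> record}.
# The parent in msgA refers to a child message stored in msgB.
databases = {
    "msgA": {10: {"subject": "FW: Budget", "child": ("msgB", 20)}},
    "msgB": {20: {"subject": "Budget", "child": None}},
}


def read_subject(db_name, record_id):
    """Reading the parent touches only the parent's own record."""
    return databases[db_name][record_id]["subject"]


def follow_attachment(db_name, record_id):
    """Follow the parent's reference to its child message."""
    child_ref = databases[db_name][record_id]["child"]
    if child_ref is None:
        return None
    child_db, child_id = child_ref
    try:
        return databases[child_db][child_id]
    except KeyError:
        raise RecordNotFound(f"record {child_id} not found in {child_db}") from None


# Simulate a rebuild discarding the block that held the child record.
del databases["msgB"][20]

subject = read_subject("msgA", 10)  # the parent still reads normally
try:
    follow_attachment("msgA", 10)
    result = "opened"
except RecordNotFound as err:
    result = str(err)
```

Nothing fails until the reference is actually followed, which matches the way these errors tend to appear only when the attached message is opened or the mailbox is processed in full.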

A GWCheck contents check on the user database will clean up many potential D107 errors.
However, GWCheck will only follow/check the parent level of messages.
 
Attached messages are not followed/checked for several reasons, most importantly time and resources.
If GWCheck were to follow each attached message, which in turn may have one or more attached messages nested several layers deep, it would become so slow as to be unusable. By the time it had followed every attached message and sub-attached message, more data would have been written to the GroupWise datastore, and the results would be out of date before GWCheck could finish.
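
To give a rough sense of the cost difference, the sketch below compares a parent-level pass with a pass that follows every nested attachment. It does not reflect how GWCheck is implemented; the mailbox size, nesting depth, and number of attachments per message are invented figures, and `build_message`, `parent_level_check`, and `deep_check` are hypothetical names.

```python
def build_message(depth, fanout):
    """Build a toy message with `fanout` attached messages per level,
    nested `depth` levels deep."""
    if depth == 0:
        return {"children": []}
    return {"children": [build_message(depth - 1, fanout) for _ in range(fanout)]}


def parent_level_check(messages):
    """Visit only the top-level (parent) messages."""
    return len(messages)


def deep_check(messages):
    """Visit every parent and every nested attached message."""
    visited = 0
    stack = list(messages)
    while stack:
        message = stack.pop()
        visited += 1
        stack.extend(message["children"])
    return visited


# Assumed workload: 100 parent messages, each with 3 attached messages
# per level, nested 4 levels deep.
mailbox = [build_message(depth=4, fanout=3) for _ in range(100)]

shallow = parent_level_check(mailbox)  # 100 records visited
deep = deep_check(mailbox)             # 12100 records visited
```

Even with these modest assumed numbers, following every attachment visits 121 times as many records as the parent-level pass, and the gap grows quickly with deeper nesting.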