Files appear to be locked if the Novell Client workstation crashes

  • 7007226
  • 17-Nov-2010
  • 26-Apr-2012

Environment

Novell Client for Vista
Novell Client for Windows 2000/XP/2003

Situation

When a workstation running the Novell Client for Windows shuts down unexpectedly (for example, because of a power outage or a Windows crash), the user cannot access certain files after restarting the workstation. These are usually database files, which are flagged as already in use. The files can be closed manually by using NCPCON. This behavior is seen regardless of the "File Caching" setting on the Client.

Is there any way to prevent this from happening, or to correct the problem without manually releasing the "lock" on the file?

Resolution

No action is required. The files will become free again without any intervention once the NCP server clears the abandoned connection, as described under Additional Information.

Additional Information

If an NCP client suffers an uncontrolled loss of connectivity to the NCP server, it cannot gracefully clear its NCP connection and associated resources (possibly including outstanding NCP file handles). In that case, the mechanism that clears the abandoned NCP connection is the TCP keep-alive interval in use on the NCP server's TCP connections. The comparable feature on UDP or IPX is "watchdog packets."
 
For example, if the workstation blue screens, or has directly or indirectly lost network connectivity with the NCP server, the NCP server's TCP stack will send out a keep-alive packet which will go unanswered. The NCP server's TCP stack will then escalate the TCP keep-alive verification until the workstation either responds or the TCP stack simply resets the connection.
 
At the point the NCP connection is reset, the NCP server will receive notification of this from the TCP stack and will clean up any outstanding resources associated with that NCP connection.  At that point, any outstanding handles open against files, locks against byte ranges within files, etc., will finally become cleared.
 
However, between the moment the workstation loses connectivity and the moment the NCP server's TCP stack declares that the workstation is no longer responding to keep-alives, the TCP and NCP connections are still valid, and the files are still open and potentially locked by the station which owned that NCP connection.
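The sequence described above can be illustrated with a small toy model. This is only a sketch for reasoning about the behavior, not the NCP server's actual implementation: the class `ToyNcpServer`, its methods, and the `max_probes` limit of 3 are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyConnection:
    """Hypothetical stand-in for one NCP connection and its open files."""
    station: str
    open_files: set = field(default_factory=set)
    unanswered_probes: int = 0


class ToyNcpServer:
    """Toy model: files stay locked until keep-alive probing gives up."""

    def __init__(self, max_probes=3):
        # max_probes is an illustrative value, not a documented default.
        self.max_probes = max_probes
        self.connections = {}

    def open_file(self, station, path):
        conn = self.connections.setdefault(station, ToyConnection(station))
        conn.open_files.add(path)

    def is_locked(self, path):
        return any(path in c.open_files for c in self.connections.values())

    def probe(self, station, client_responds):
        """Send one keep-alive probe; return True if the connection survives."""
        conn = self.connections.get(station)
        if conn is None:
            return False  # connection was already reset
        if client_responds:
            conn.unanswered_probes = 0
            return True
        conn.unanswered_probes += 1
        if conn.unanswered_probes >= self.max_probes:
            # Connection reset: every handle and lock it owned is released.
            del self.connections[station]
            return False
        return True


if __name__ == "__main__":
    server = ToyNcpServer(max_probes=3)
    server.open_file("WS1", "orders.db")
    for attempt in range(1, 4):
        server.probe("WS1", client_responds=False)
        print(attempt, server.is_locked("orders.db"))
```

In this model the file stays locked through the first two unanswered probes and is released only when the third probe goes unanswered and the connection is reset.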
 
Setting a lower TCP keep-alive interval for the NCP server will shorten the period of time for which the files are locked.  Note that shortening this time period means that any interruption in communication that lasts longer than the TCP keep-alive period will result in the connection getting reset.
 
This can be very undesirable when it happens to workstations which are actually still running, but have temporarily lost connection to the NCP server, because upon the reset of the connection, they will have lost all their open file handles and outstanding byte range locks.  The NCP connection itself will be automatically re-established, but the applications will then be in an undefined state: from the workstation's perspective, they have essentially just experienced a server reboot, since all their outstanding locks and sharing permissions were discarded.
 
Changing the NCP server's TCP keep-alive to something like 300 seconds could be reasonable; the files will still be locked but not for an extended period of time.  Shorter than that is probably ill-advised unless you understand and accept the risk it creates for non-frozen/rebooted workstations, and have a specific reason to take on that risk.
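To make the trade-off concrete, the following sketch estimates how long an abandoned connection can linger. It assumes a Linux-style TCP stack with three parameters (idle time before the first probe, interval between probes, and number of unanswered probes before reset), and it assumes that the 300 seconds suggested above corresponds to the idle time. The function names are hypothetical, and the values 7200, 75, and 9 are the stock Linux defaults, which may differ from what a given NCP server uses.

```python
def seconds_until_reset(idle, interval, probes):
    """Approximate upper bound, in seconds, from the last traffic on a
    connection until the TCP stack gives up and resets it."""
    return idle + interval * probes


def survives_outage(outage_seconds, idle, interval, probes):
    """True if an outage that starts at the last traffic ends before the
    stack would reset the connection (simplified model)."""
    return outage_seconds < seconds_until_reset(idle, interval, probes)


# Stock Linux defaults (assumption: the server has not been tuned).
LINUX_DEFAULTS = dict(idle=7200, interval=75, probes=9)
# Assumption: only the idle time is lowered to 300 seconds.
LOWERED = dict(idle=300, interval=75, probes=9)

if __name__ == "__main__":
    print(seconds_until_reset(**LINUX_DEFAULTS))
    print(seconds_until_reset(**LOWERED))
    print(survives_outage(20 * 60, **LOWERED))
```

Under these assumptions, the defaults leave files locked for up to 7875 seconds (a little over two hours), while the lowered setting brings that down to 975 seconds (about 16 minutes). The cost is that a running workstation which is cut off for 20 minutes would keep its connection under the defaults but lose it under the lowered setting.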
 
More information on setting the TCP keep-alive interval can be found in TID 3138614, "eDirectory connection not clearing on a Linux server after abnormal workstation shutdown".