Usage of the FIRST_WATCHDOG_PACKET NCP parameter on OES Linux

  • 7004848
  • 11-Nov-2009
  • 29-Apr-2013

Environment

Novell Open Enterprise Server 2 (OES 2) Linux
Novell Cluster Services
Novell Client

Situation

On Novell Open Enterprise Server 2 (OES 2) Linux, the FIRST_WATCHDOG_PACKET parameter does not have full feature parity with its counterpart in the Open Enterprise Server NetWare kernel.

By default, the OES NetWare equivalent of the parameter (Delay Before First Watchdog Packet) waits for 5 minutes of idle time before asking the workstation whether it is still attached to the file server, and then repeats the check at 1-minute intervals. On OES Linux, the setting is currently disabled by default.

The implication is that 'stale' NCP connections on the server are never cleaned up automatically. Stale connections can exist on a server for several reasons, for example when NCP clients do not log off properly or become disconnected from the server for other reasons.

When the FIRST_WATCHDOG_PACKET setting is not enabled, the NCP server does not clean up such connections, and the connection table continues to grow. Customers have reported servers holding 70,000 to 80,000 NCP connections that should not have been there.

As long as the server keeps running and the NCP connection table keeps growing with 'stale' connections, the server may eventually become sluggish or even completely unresponsive, and then has to be rebooted to recover from the problem.

Resolution

The FIRST_WATCHDOG_PACKET parameter needs to be enabled and configured to run at a specified interval.
The value is specified in minutes.

With the FIRST_WATCHDOG_PACKET parameter enabled, a separate thread runs in which the NCP server checks each connection at the defined interval and verifies whether the client is still active. When a connection has been inactive for the configured number of minutes, the server sends a UDP connection ping request to the client. If the client does not respond, the connection is terminated.
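The decision described above can be illustrated with a small conceptual sketch. This is not the NCP server's code; the function name watchdog_action and its three arguments (idle minutes, configured threshold in minutes, and whether the client answered the ping) are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Conceptual model of the watchdog decision for one connection.
# Hypothetical helper, not part of any Novell tool.
#   $1 = minutes the connection has been idle
#   $2 = configured FIRST_WATCHDOG_PACKET value in minutes
#   $3 = "yes" if the client answered the ping, anything else otherwise
watchdog_action() {
    idle_minutes="$1"
    threshold="$2"
    client_replied="$3"

    if [ "$idle_minutes" -lt "$threshold" ]; then
        # Not idle long enough: no ping is sent, connection stays.
        echo keep
    elif [ "$client_replied" = yes ]; then
        # Ping was sent and answered: connection stays.
        echo keep
    else
        # Ping was sent and not answered: connection is terminated.
        echo terminate
    fi
}
```

For example, watchdog_action 6 5 no models a connection that has been idle for 6 minutes with a 5-minute setting and a client that no longer answers.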

A reasonable baseline setting for the FIRST_WATCHDOG_PACKET parameter on OES Linux is 5 minutes. From there, determine whether this suits your needs or whether the value needs to be adjusted for your environment.

The parameter can be set either from within the NCP Console utility (NCPCON) or from the command line.
For example:
  1. Type ncpcon, then at the ncpcon prompt type: set FIRST_WATCHDOG_PACKET=<value>
  2. In a terminal window, type: ncpcon set FIRST_WATCHDOG_PACKET=<value>
The current value can be viewed in the same two ways.
For example:
  1. Type ncpcon, then at the ncpcon prompt type: set FIRST_WATCHDOG_PACKET
  2. In a terminal window, type: ncpcon set FIRST_WATCHDOG_PACKET
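The two command-line forms above can be combined into a small wrapper that sets the value and then reads it back. This is only a sketch: the function name set_first_watchdog_packet and the NCPCON override variable are assumptions made for illustration, and the check for a whole number of minutes is a precaution of this sketch, not a documented ncpcon rule.

```shell
#!/bin/sh
# Hypothetical wrapper around the two ncpcon commands shown above.
# NCPCON is an assumption of this sketch so the binary can be overridden.
NCPCON="${NCPCON:-ncpcon}"

set_first_watchdog_packet() {
    minutes="$1"

    # Sketch-level sanity check: accept only a whole number of minutes.
    case "$minutes" in
        ''|*[!0-9]*)
            echo "usage: set_first_watchdog_packet <minutes>" >&2
            return 1
            ;;
    esac

    # Set the parameter, then display the value that is now in effect.
    "$NCPCON" set "FIRST_WATCHDOG_PACKET=$minutes" || return 1
    "$NCPCON" set FIRST_WATCHDOG_PACKET
}
```

Calling set_first_watchdog_packet 5 as root would apply the 5-minute baseline and print what ncpcon reports afterwards.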

Once successfully set, the value of the parameter is saved in the NCP server configuration file:
/etc/opt/novell/ncpserv.conf
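To confirm that the value was written, the configuration file can be searched for the parameter name. The sketch below assumes only that the parameter name appears at the start of a line; the helper name show_saved_watchdog and the NCPSERV_CONF variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the FIRST_WATCHDOG_PACKET line(s) saved in
# the NCP server configuration file. The path comes from this document;
# it can be overridden through NCPSERV_CONF or the first argument.
NCPSERV_CONF="${NCPSERV_CONF:-/etc/opt/novell/ncpserv.conf}"

show_saved_watchdog() {
    conf="${1:-$NCPSERV_CONF}"

    if [ ! -r "$conf" ]; then
        echo "cannot read $conf" >&2
        return 1
    fi

    # Assumes the parameter name starts the line (leading blanks allowed).
    # grep exits non-zero when the parameter has never been saved.
    grep '^[[:space:]]*FIRST_WATCHDOG_PACKET' "$conf"
}
```

A non-zero exit status with no output suggests that the parameter has not yet been saved to the file.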



Note for OES11 SP1 (and later):
In addition to FIRST_WATCHDOG_PACKET, Novell has introduced another SET parameter called NCP_TCP_KEEPALIVE_INTERVAL.

This new NCP_TCP_KEEPALIVE_INTERVAL parameter is the NCP equivalent of the Linux 'tcp_keepalive_intvl' parameter as defined in '/usr/src/linux/include/net/tcp.h'.

Over time, Novell Technical Support has published a number of TIDs describing situations where NCP connections were not being closed properly. The general suggestion in those TIDs is to configure the 'tcp_keepalive_intvl' parameter in '/etc/sysctl.conf' with a value that overcomes this 'stale' connection problem.

TID examples suggesting possible modifications to '/etc/sysctl.conf':
    TID 3138614 - eDirectory connection not clearing on a Linux server after abnormal workstation shutdown
    TID 7007226 - Files appear to be locked if the Novell Client workstation crashes
    TID 7009860 - Novell Client drive mappings disappear after a period of idleness


The problem with making changes to '/etc/sysctl.conf' is that the new value is a system-wide setting that affects all other server-side TCP connections to this server.
Therefore, for the sole purpose of properly maintaining eDirectory / NCP connections, this may be considered too invasive a change. The NCP_TCP_KEEPALIVE_INTERVAL parameter was implemented to address that issue.
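To see which system-wide keep-alive values currently apply to every TCP connection on the server, the standard Linux keep-alive tunables can be read from /proc. The following is a minimal sketch; the function name show_tcp_keepalive and its optional directory argument are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the system-wide TCP keep-alive tunables.
# On Linux these live under /proc/sys/net/ipv4 and correspond to the
# net.ipv4.tcp_keepalive_* keys that '/etc/sysctl.conf' would change.
show_tcp_keepalive() {
    dir="${1:-/proc/sys/net/ipv4}"

    for name in tcp_keepalive_time tcp_keepalive_intvl tcp_keepalive_probes; do
        if [ -r "$dir/$name" ]; then
            printf '%s = %s\n' "$name" "$(cat "$dir/$name")"
        else
            printf '%s = (unavailable)\n' "$name"
        fi
    done
}
```

Because these values are shared by all TCP services on the host, reading them first helps to judge how far-reaching a change in '/etc/sysctl.conf' would be.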


Explanation:
FIRST_WATCHDOG_PACKET: The server sends an NCP ping packet to the client if it detects no client activity for a specified amount of time. By doing this, the server tries to keep the connection alive. Configure this parameter if any mechanism between the server and the client would otherwise break idle connections.

NCP_TCP_KEEPALIVE_INTERVAL: If the client is inactive for a configured amount of time, the server sends a TCP packet to the client to check whether the client is still connected to the server or not. If the server does not get an acknowledgment from the client, then the server identifies that the client is not available and clears all the information related to the specific client connection.


The FIRST_WATCHDOG_PACKET parameter can still be used in environments that block TCP keep-alive packets. This can be the case, for example, when the NCP server is accessed over a WAN and the NCP connections still need to be maintained.

Additional Information

  1. NCP development is currently discussing whether this parameter should be enabled by default again in the future.
  2. When you modify the configuration file manually, the appropriate services (in this case NDSD and NCP2NSS) must also be restarted manually.
  3. The following two watchdog parameters exist on Novell NetWare but do not exist in the current NCP implementation on OES 2 Linux:
     "Number of Watchdog Packets" and "Delay Between Watchdog Packets".