Environment
SUSE Linux Enterprise Server 10
Novell Client
Situation
- Novell clients disconnecting from OES2
- First login with the Novell Client does not have a problem connecting to NCP volumes, but after a while they disconnect.
- Novell Client mapped network drives get disconnected after a period of inactivity, such as during lunch or overnight.
- Clicking several times on the drive mapping will eventually reconnect.
Third-party network devices, for security purposes, destroy idle connections based on their own settings. The idle timeout could be set anywhere from minutes to hours depending on the device.
For example, suppose such a device sits between the client and the server on which you have a drive mapped, and it is set to destroy any connection that has been idle for 15 minutes. If you go to lunch, your drive mapping will be lost by the time you return.
Resolution
Solution 1
The solution that most directly resolves the root cause of the
problem is to adjust the idle-timeout setting on the third-party
device that is destroying the idle connection without notifying
either end of the connection.
Solution 2
You can adjust the TCP keep-alive configuration parameters on the
server.
This will affect ALL TCP connections for this server.
The following are the configuration
parameters and their default values for TCP keep-alives:
# cat /proc/sys/net/ipv4/tcp_keepalive_time
7200
# cat /proc/sys/net/ipv4/tcp_keepalive_intvl
75
# cat /proc/sys/net/ipv4/tcp_keepalive_probes
9
The tcp_keepalive_time and tcp_keepalive_intvl values are expressed in
seconds.
The tcp_keepalive_probes value is expressed as a number of probes.
Using the defaults means that
the TCP keep alive routines will wait for two hours (7200 secs)
before sending the first keep alive probe. If there is no
reply to that probe then another probe will be sent after 75
seconds. If there is still no reply another probe will be
sent after another 75 seconds. This pattern continues until
either a reply is received or the probe limit is reached.
In this case a total of 9 probes will be sent, including the initial
one at 2 hours. If all nine are sent without a reply being
received, then the server will mark the connection as broken.
By default, 7200 seconds + 8 x 75 seconds = 7800 seconds, or 2 hours
and 10 minutes.
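The arithmetic above can be reproduced for any combination of values. The following is a minimal sketch only: the function name keepalive_detect_secs is hypothetical, and it uses the formula given in this document (first probe at tcp_keepalive_time, then one interval for each remaining probe). The kernel documentation describes the retry window as tcp_keepalive_intvl multiplied by tcp_keepalive_probes, so the actual teardown may come roughly one interval later than this figure.

```shell
# Hypothetical helper: idle seconds until the last keep-alive probe is sent,
# using this document's formula: time + (probes - 1) * intvl.
# Any argument left out falls back to the live value under /proc.
keepalive_detect_secs() {
    ka_dir=/proc/sys/net/ipv4
    ka_time=${1:-$(cat $ka_dir/tcp_keepalive_time)}
    ka_intvl=${2:-$(cat $ka_dir/tcp_keepalive_intvl)}
    ka_probes=${3:-$(cat $ka_dir/tcp_keepalive_probes)}
    echo $(( ka_time + (ka_probes - 1) * ka_intvl ))
}

# With the default values listed above:
keepalive_detect_secs 7200 75 9    # prints 7800
```

Calling keepalive_detect_secs with no arguments reads the three current values from the server instead.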
If you set tcp_keepalive_time to a value lower than the idle timeout on the third-party device, a keep-alive probe will cross the device before its idle timer expires, so the connection will never be seen as idle by that device.
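As an illustration, the value could be changed as sketched below. The figure of 600 seconds is an assumption chosen only because it is below the 15-minute device timeout used in the example earlier; choose a value that suits the actual device. The sketch also assumes that /etc/sysctl.conf is applied at boot on the server, which should be verified for your installation.

```shell
# Run as root. 600 seconds (10 minutes) is an illustrative value only,
# chosen to sit below a 15-minute idle timeout on the network device.
sysctl -w net.ipv4.tcp_keepalive_time=600

# Keep the setting across reboots (assumes /etc/sysctl.conf is read at boot).
echo "net.ipv4.tcp_keepalive_time = 600" >> /etc/sysctl.conf

# Confirm the running value.
cat /proc/sys/net/ipv4/tcp_keepalive_time
```

Because this setting applies to the whole server, a very low value increases keep-alive traffic on every TCP connection that uses keep-alives.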
Solution 3
Enable and configure the FIRST_WATCHDOG_PACKET parameter on the OES server. See TID 7004848, "Usage of the FIRST_WATCHDOG_PACKET NCP parameter on OES2 Linux".