The cluster node hosting iPrint was plugged into switch A, which had route A to the printers. The server was sending TCP Keep-Alive packets to the printer (see the Notes below for an explanation of why), but those packets were being dropped before reaching the printers. The customer plugged the cluster node into switch B, which had route B (a different route to the printers than route A), and the TCP Keep-Alive packets were no longer dropped. Communications then flowed as expected and the printing problems cleared up. At the time of this writing, it is unclear whether Tipping Point Intrusion Prevention Systems was present on route B to the printers. Whatever routing hardware was on route A to the printers dropped the TCP Keep-Alive packets, causing major print disruption on the network. Route B to the printers did not have this problem.
The LAN traces also revealed that only the one connection carrying TCP Keep-Alive packets was being dropped. Other kinds of IP traffic were flowing just fine. New connections to send data were also fine.
In this case, the server and the printer were two hops apart, meaning there were two routers between the server and the printer. The LAN traces show that when TCP Keep-Alive packets started to hit the wire, communication on the connection between the server and the printer ceased. For example, if the server's source port was 800 and the destination port on the printer was 515 (LPR), any communication between those ports would cease after TCP Keep-Alive packets hit the wire. The LAN traces showed the packets leaving the server and/or the printer, but the packets were dropped and never made it to their destination. The customer was running Tipping Point Intrusion Prevention Systems (www.tippingpoint.com). Tipping Point is a system that monitors network traffic, detects malicious activity, and blocks malicious traffic in real time. A TCP Keep-Alive packet is sent with a TCP sequence number one less than the next sequence number the sender would normally use, so traffic analyzers can interpret TCP Keep-Alive packets as malicious and may drop them.
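The sequence-number behavior described above is what lets a trace analyzer pick TCP Keep-Alive packets out of a LAN trace. The following Python sketch is not Tipping Point's actual logic, which is not documented in this TID. It applies a heuristic similar to the one protocol analyzers commonly use: a segment carrying zero or one byte of data, with a sequence number one less than the next expected sequence number, and with no SYN, FIN, or RST flag set. The function name and its parameters are hypothetical and chosen only for illustration.

```python
SEQ_MODULUS = 2 ** 32  # TCP sequence numbers wrap around at 32 bits


def looks_like_keepalive(seq, payload_len, next_expected_seq,
                         syn=False, fin=False, rst=False):
    """Return True if a segment matches the Keep-Alive heuristic.

    seq               -- sequence number of the segment under test
    payload_len       -- number of data bytes carried by the segment
    next_expected_seq -- next sequence number the sender would normally use
    """
    # Connection setup/teardown segments are never treated as Keep-Alives.
    if syn or fin or rst:
        return False
    # A Keep-Alive carries no data, or a single byte of filler data.
    if payload_len not in (0, 1):
        return False
    # The sequence number sits one behind the next expected value.
    return seq == (next_expected_seq - 1) % SEQ_MODULUS
```

For example, a zero-length segment with sequence number 4999 on a connection whose next expected sequence number is 5000 matches the heuristic, while an ordinary 100-byte data segment starting at 5000 does not.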
TCP/IP COMMUNICATIONS BETWEEN A SERVER AND A PRINTER
The Novell gateway (NDPSGW.NLM or NDPDS.NLM/PH.NLM) generally uses TCP/IP to communicate between the server and the printer. The most common method of communication is to use the LPR protocol (please see RFC 1179 for details on LPR), but NDPSGW.NLM in NW65SP2 and later can use TCP port 9100 to talk to JetDirect type devices.
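To illustrate the difference between the two methods, the sketch below builds the first command an LPR sender transmits after connecting to port 515. RFC 1179 defines this "receive a printer job" command as the octet 0x02, followed by the queue name and a line feed. A raw port 9100 connection generally has no such command exchange, and the print data is simply written to the socket. This is an illustration only, not the gateway's implementation; the function name and the queue name used in the example are hypothetical.

```python
LPR_PORT = 515    # LPR/LPD port, per RFC 1179
RAW_PORT = 9100   # raw TCP printing used by JetDirect-type devices


def lpr_receive_job_command(queue):
    """Build the RFC 1179 'receive a printer job' command for a queue.

    The command is the octet 0x02, the queue name, then a line feed.
    """
    return b"\x02" + queue.encode("ascii") + b"\n"
```

For a hypothetical queue named "lp", the function returns the bytes 0x02, "lp", and a line feed.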
The server and the printer negotiate a connection, and the server starts sending data to the printer. The printer ACKs the packets that it receives. However, a printer can only receive a certain amount of data before its buffer fills up. When the printer's buffer fills up, the TCP Window goes to zero bytes in size. At this point, the server starts sending TCP Keep-Alive packets to check whether the printer's TCP Window has increased from zero. Once the printer has printed some of the print job it has received, it sends back a packet with a non-zero TCP Window. The server then continues to feed data to the printer until the TCP Window goes to zero again or the print job completes. It is not uncommon to see the TCP Window size go to zero many times during the course of a print job, especially if the print job is large. Having the TCP Window go to zero, using TCP Keep-Alive packets to check on the printer's TCP Window size, and having the TCP Window increase to a non-zero value is normal operation for printing. Therefore, anything that interprets TCP Keep-Alive packets as malicious network activity and blocks them will cause network printing problems. Routers and intrusion detection systems should be configured to allow TCP Keep-Alive packets to flow freely on the network.
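The cycle described above (send data, window closes, probe, window reopens, send more) can be sketched as a small simulation. This is a simplified model, not a TCP implementation: it assumes the printer frees a fixed number of bytes each time the server probes, and all names and sizes are hypothetical. It shows only that probes appear in the middle of a healthy print job.

```python
def simulate_print_job(job_size, printer_buffer, drain_per_probe,
                       segment_size=1460):
    """Simulate a server feeding a printer whose receive window can close.

    Returns a list of (event, value) tuples in the order they occur.
    All sizes are in bytes; the numbers are illustrative only.
    """
    if printer_buffer <= 0 or drain_per_probe <= 0:
        raise ValueError("buffer and drain sizes must be positive")
    events = []
    sent = 0       # bytes of the job handed to the printer so far
    buffered = 0   # bytes held in the printer's buffer, not yet printed
    while sent < job_size:
        window = printer_buffer - buffered  # window the printer advertises
        if window == 0:
            # Window closed: the server probes instead of sending data.
            events.append(("keepalive_probe", sent))
            # Meanwhile the printer prints buffered data, freeing space.
            buffered -= min(drain_per_probe, buffered)
            continue
        chunk = min(segment_size, window, job_size - sent)
        sent += chunk
        buffered += chunk
        events.append(("data", chunk))
    return events
```

With a 3000-byte job, a 2000-byte printer buffer, and 1000-byte segments, the simulation sends two data segments, issues one probe while the window is zero, and then sends the final segment. If that probe or its reply were dropped on the network, the server would never learn that the window had reopened and the job would stall.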
Formerly known as TID# 10094620
TID feedback from a Novell customer indicates that Tipping Point filter #7120 can cause this problem. Novell has not validated the claim that filter 7120 is actually the problem filter. However, the customer-provided information is included in this TID as a courtesy to those who want to use Tipping Point on their network without the problems described in this TID. You may consider disabling filter number 7120 on your network to work around the issue documented. Please consult with Tipping Point if you have any questions about Tipping Point filters, what they do, and the ramifications of enabling or disabling certain filters.