How to Troubleshoot IPX Connectivity

  • 10024131
  • 1.0.46993041.2479259
  • 02-Jan-2000
  • 07-Oct-2003

Archived Content: This information is no longer maintained and is provided 'as is' for your convenience.

Goal

How to Troubleshoot IPX Connectivity

Fact

Novell NetWare 6

Novell NetWare 5.1

Novell NetWare 5.0

Novell NetWare 4.2

Novell NetWare 4.11

Symptom

Cannot see servers across the WAN

Cannot see server in Network Neighborhood

Error:  "-625 0xFFFFFD8F = ERR_TRANSPORT_FAILURE"

Error: "-626"
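The hexadecimal value shown with the -625 error is the 32-bit two's-complement form of the decimal code. As a minimal sketch of that arithmetic (the function name `ds_error_to_hex` is hypothetical and is not a Novell utility), the conversion can be written in Python as:

```python
def ds_error_to_hex(code: int) -> str:
    """Return the 32-bit two's-complement hex form of a signed error code."""
    # Masking with 0xFFFFFFFF wraps a negative number into the unsigned range.
    return f"0x{code & 0xFFFFFFFF:08X}"


print(ds_error_to_hex(-625))  # 0xFFFFFD8F
print(ds_error_to_hex(-626))  # 0xFFFFFD8E
```

This can help when one tool reports the decimal code and another reports the hexadecimal form.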

Cause

Most common causes of connectivity problems:
1.  Hardware:  bad NIC, cable, hub, switch port, switch, router
2.  Configuration on the network:  router filters/configuration, bad routing tables
3.  Configuration on the server: duplicate Internal IPX Address; NIC drivers (.LAN files and the supporting modules, e.g. ETHERTSM, NBI, MSM) that are not the latest from the manufacturer of the NIC; server NICs bound to the wrong external network number.

Fix

To troubleshoot IPX connectivity issues, you must identify who can communicate and who cannot.  This document is divided into three primary sections:
1.  Tools to Identify Communication Problems
2.  LAN Communication Problems
3.  WAN Communication Problems


Section 1: Tools to Identify Communication Problems:
A.  DSREPAIR
Use "DSREPAIR/Report Synchronization Status" to identify which servers in the replica ring can and cannot communicate.  The usual errors returned here are -625 or -626, referring to non-communicating servers.  DSREPAIR will only attempt to communicate with servers holding replicas of the same partitions held on this source server.

B.  IPXPING
Use IPXPING to determine whether you can talk to your servers that have IPX bound.  For a server to return an IPXPING packet, it must have IPXRTR loaded.  (Type "LOAD IPXRTR" at the server console.  If IPXRTR is already loaded, it will say so; if not, it will load.)  To use IPXPING, you need to know the Internal IPX Network Number of the server being tested.  To find it, type "CONFIG" on the target server and look for the Internal IPX Network Number (sometimes called SERVER ID) at the top of the screen.  When you ping this number, leave the node and the packet size as they are.  Anything above 90% indicates good communication.  If 0% is returned, that server cannot communicate.  If a low percentage is returned, increase the packet size on the next ping (to 1400 or so) to see whether an abnormal number of larger packets are being dropped.
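The IPXPING thresholds can be summarized as a small decision function. This is only a sketch in Python: the function name `interpret_ipxping` is hypothetical, and treating every result between 0% and 90% as "low" is an assumption, since the text does not say where "low" begins.

```python
def interpret_ipxping(percent_returned: float) -> str:
    """Classify an IPXPING return percentage using the thresholds in the text."""
    if not 0 <= percent_returned <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_returned == 0:
        return "no communication"
    if percent_returned > 90:
        return "good communication"
    # Assumption: anything above 0% and up to 90% counts as a low result.
    return "low: retest with a larger packet size (1400 or so)"


print(interpret_ipxping(97))  # good communication
print(interpret_ipxping(0))   # no communication
print(interpret_ipxping(40))  # low: retest with a larger packet size (1400 or so)
```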

C.  DISPLAY SERVERS
Type "DISPLAY SERVERS" at the console prompt to view what SAP advertisements this server is seeing.   This server is not getting a SAP from any server it cannot see here.  IMPORTANT NOTE:  A server can have a communication problem even if it can be seen on the source server.  "DISPLAY SERVERS" only displays the SAPs that are seen on the wire.

D.  IPXCON
Type "LOAD IPXCON" at the console prompt.  Go to "Services/Display Selected Entries".  Change "Type ALL" to "Type 0004" and then choose "Proceed".  This will list all the SAP type 0004s (NetWare File Servers) that this server can see.  If it cannot see a server, it won't be able to communicate with that server.  Each server should at least be able to see its own 0004.  If it cannot, Directory Services may be the culprit.  Make sure the database is loaded and open.
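The check that IPXCON supports here (list the type 0004 entries, then confirm the server sees its own) can be sketched in Python as follows. The list of (name, SAP type) pairs is hypothetical sample data typed in by hand; IPXCON itself does not export this structure, and the function names are illustrative only.

```python
FILE_SERVER_SAP_TYPE = 0x0004  # SAP type for NetWare File Servers, per the text


def visible_file_servers(sap_entries):
    """Return the sorted names of all entries advertising SAP type 0004."""
    return sorted({name for name, sap_type in sap_entries
                   if sap_type == FILE_SERVER_SAP_TYPE})


def sees_own_0004(own_name, sap_entries):
    """True if the server sees its own type 0004 advertisement."""
    return own_name in visible_file_servers(sap_entries)


# Hypothetical entries as read from the IPXCON Services screen on server FS1.
entries = [("FS1", 0x0004), ("FS2", 0x0004), ("FS1", 0x0278), ("PRN1", 0x0007)]
print(visible_file_servers(entries))  # ['FS1', 'FS2']
print(sees_own_0004("FS1", entries))  # True
print(sees_own_0004("FS3", entries))  # False
```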

E.  Network Neighborhood
Use Network Neighborhood to view/browse the server in question.  If you can see the server under Entire Network/NetWare Servers, try to browse it (double-click it and read directories under a volume).  If you can see the server but cannot browse it, see Section 2A.

With these tools, you should be able to determine what can and cannot communicate.  You will also want to know if this is a local or a WAN issue.  Questions to consider:
1.) Can local workstations login to/see/browse this server?
2.) Can local servers see and communicate with each other?
If either answer is NO, troubleshoot the problem as a LAN issue; see Section 2 below.
If both answers are YES, chances are the problem is a WAN issue; see Section 3 below.
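The two questions above amount to a simple rule, sketched here in Python. The function name `classify_problem` and its parameter names are hypothetical.

```python
def classify_problem(workstations_ok: bool, servers_ok: bool) -> str:
    """Decide which section applies from the answers to questions 1 and 2."""
    if workstations_ok and servers_ok:
        # Everything local works, so the fault is probably beyond the site.
        return "WAN issue (Section 3)"
    return "LAN issue (Section 2)"


print(classify_problem(True, True))   # WAN issue (Section 3)
print(classify_problem(True, False))  # LAN issue (Section 2)
```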


Section 2: Troubleshooting LAN Connectivity Problems:
A.  Always use the manufacturer's latest NIC drivers and supporting modules.  If problems appeared shortly after applying a Support Pack, chances are high that the Support Pack overwrote the LAN drivers/modules.  The LAN drivers/modules should be reapplied.  Being able to see the server in Network Neighborhood but not browse it may also point to the LAN drivers as the source of the problem.

B.  If you want to test the switch on a single server site, you have two options:  1) Take the server and a workstation and plug them both into a dumb hub, then attempt to log in.  Or, 2) If you don't have a dumb hub, use a cross-over cable directly from the workstation to the server and attempt to log in.  If the login attempt still fails, the problem is not on the wire but on the server.

C.  If a multi-server site has a questionable server in the same room with reliable servers, you can take the cable out of the back of a reliable server.  Plug it into the questionable server.  This tests both the cable and switch port at the same time.  It also ensures testing with a reliable cable and port.  

D.  If multiple NICs are in the same server, try disabling all of them except for one and test them one at a time.  You may find your problem resides just with one card.  

E.  If this server is new to the environment, you may have a duplicate Internal IPX Network Address.  You can test this by simply changing the address.  Type LOAD EDIT AUTOEXEC.NCF at the server console prompt.  From here you can change the Internal IPX network number (SERVER ID in NetWare 5).  For testing, you could make up a number such as 99887766.  If the server starts communicating after a reboot, your IPX number was a duplicate.  Keep in mind that the device broadcasting that IPX number doesn't have to be a NetWare server; many devices have been known to conflict here.
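If you keep a list of the internal IPX numbers in use at your site, duplicates can be spotted before a new server goes live. The sketch below is in Python and makes several assumptions: the inventory dictionary and device names are hypothetical, the numbers are hexadecimal values of up to eight digits (as in the 99887766 example), and no check for reserved values is attempted.

```python
from collections import defaultdict


def normalize_ipx_number(number):
    """Parse a hex network number and return it as 8 uppercase hex digits."""
    value = int(number, 16)  # raises ValueError on non-hex input
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError(f"not a 32-bit IPX network number: {number!r}")
    return f"{value:08X}"


def find_duplicate_numbers(inventory):
    """Map each internal IPX number used more than once to its devices."""
    owners = defaultdict(list)
    for device, number in inventory.items():
        owners[normalize_ipx_number(number)].append(device)
    return {num: sorted(devs) for num, devs in owners.items() if len(devs) > 1}


# Hypothetical site inventory: device name -> internal IPX number.
inventory = {"FS1": "1A2B3C4D", "FS2": "99887766", "PRINTBOX": "1a2b3c4d"}
print(find_duplicate_numbers(inventory))  # {'1A2B3C4D': ['FS1', 'PRINTBOX']}
```

Normalizing case and leading zeros first avoids missing a duplicate that was simply typed differently.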

F.  Also, if this server is new to the environment, you'll want to make sure that the NIC is bound to the same external network number as the segment it's on.  You can do this by looking at a CONFIG on the server console and comparing the network number underneath each frame type to make sure it is the same as another server on the same segment.  
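The comparison described in step F can be sketched in Python. The dictionaries below stand for values copied by hand from the CONFIG output of two servers on the same segment; the function name, the data layout, and the sample numbers are all hypothetical.

```python
def mismatched_bindings(reference, candidate):
    """Compare external network numbers per frame type.

    Both arguments map a frame type to the external network number (hex
    string) shown under it in CONFIG.  Returns the frame types whose numbers
    differ, as {frame_type: (candidate_number, reference_number)}.
    """
    mismatches = {}
    for frame_type, number in candidate.items():
        expected = reference.get(frame_type)
        # Only frame types bound on both servers can be compared.
        if expected is not None and int(expected, 16) != int(number, 16):
            mismatches[frame_type] = (number.upper(), expected.upper())
    return mismatches


known_good = {"ETHERNET_802.2": "0000A001", "ETHERNET_II": "0000A002"}
new_server = {"ETHERNET_802.2": "0000A001", "ETHERNET_II": "0000B002"}
print(mismatched_bindings(known_good, new_server))
# {'ETHERNET_II': ('0000B002', '0000A002')}
```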

G.  If you've narrowed the problem down to the NIC, it's a good troubleshooting step to replace the NIC in the server.  When you do this, you may want to test many things at once by using a different brand of NIC in a different slot on the motherboard and trying a different interrupt.

H.  If your server is not advertising SAP type 0004 (see the IPXCON tool in Section 1D), you won't be able to communicate with it.  This is usually an indication of a DS problem on the server, and you'll want to start your troubleshooting there.


Section 3: Troubleshooting WAN Connectivity Problems:
Usually, if you have a WAN issue, you'll want to involve the other vendors concerned: router vendors, WAN service providers, etc.  However, here are a few suggestions:

A.  If you can see all your local servers, but cannot see any servers outside your local site, then the problem is usually a WAN issue.  Use IPXPING to determine this.  Identify which sites you can and cannot see to help pinpoint where the problem is.  For example, if you are in Chicago and you cannot see New York, can you see Dallas or L.A.?  If you cannot see anybody at all, look at the local router.  If you can see all of the sites except for one, look at that site's router.  You might also use RCONSOLE to check what the other sites can see.
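The site-by-site reasoning in step A can be sketched as a small Python function. The function name and the reachability dictionary are hypothetical, and the case where several (but not all) sites are unreachable is an extrapolation that the text does not cover.

```python
def suggest_router(reachability):
    """Suggest where to look, given {remote_site: reachable?} from IPXPING."""
    unreachable = sorted(site for site, ok in reachability.items() if not ok)
    if not unreachable:
        return "all remote sites reachable"
    if len(unreachable) == len(reachability):
        return "check the local router"
    if len(unreachable) == 1:
        return f"check the router at {unreachable[0]}"
    # Assumption: with several sites down, examine each of their routers.
    return "check the routers at: " + ", ".join(unreachable)


# Viewed from Chicago, using the sites named in the example above.
print(suggest_router({"New York": False, "Dallas": True, "L.A.": True}))
# check the router at New York
print(suggest_router({"New York": False, "Dallas": False, "L.A.": False}))
# check the local router
```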

B.  Remove all the filters on your router for troubleshooting.  For example, suppose a router has a filter that allows only a listed set of networks out.  If you brought up a new server whose network wasn't listed in that filter, the server would be able to see everybody locally, but nobody past the router.
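The effect of an allow-list filter like the one in this example can be modeled in a few lines of Python. This is only an illustration of the logic, not the configuration syntax of any router; the function name and the network numbers are hypothetical.

```python
def passes_outbound_filter(source_network, allowed_networks):
    """True if an allow-list filter would let this network past the router."""
    allowed = {int(number, 16) for number in allowed_networks}
    return int(source_network, 16) in allowed


# Hypothetical filter configured before the new server was installed.
allowed_out = ["1A2B3C4D", "0000BEEF"]
print(passes_outbound_filter("1A2B3C4D", allowed_out))  # True
print(passes_outbound_filter("99887766", allowed_out))  # False: new network not listed
```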

C.  Use IPXCON to see if you have corrupted routing tables.  Section 1D describes how to list all the type 0004s that your server can see.  In that same screen, press Enter on any server and then press Enter again on Destination Information to see the route to that server.  If the hop count is 16 or above, or if this field is empty, the destination is unreachable.  If your routing tables are corrupted, start by typing RESET ROUTER at the server console prompt.  If that doesn't change the situation, you may want to clear the tables on the router.
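The hop-count rule in step C can be expressed as a one-line check. The Python sketch below uses a hypothetical function name and represents an empty Destination Information field as `None`.

```python
from typing import Optional

UNREACHABLE_HOPS = 16  # a hop count of 16 or above means unreachable, per the text


def destination_unreachable(hop_count: Optional[int]) -> bool:
    """True if the route is missing (None) or its hop count is 16 or above."""
    return hop_count is None or hop_count >= UNREACHABLE_HOPS


print(destination_unreachable(3))     # False
print(destination_unreachable(16))    # True
print(destination_unreachable(None))  # True
```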


Usually connectivity problems can be solved with the above troubleshooting steps.  However, if not, or if your connectivity problems are random, other suggestions would include getting a trace of the problem for Novell Tech Support.  A good packet trace would include the following steps (use TID 10011012 and/or SolutionLink if you don't have a packet tracing solution already):  

A.  Make sure the tracing machine and the problem machine are in the same collision domain (if not, all the packets seen will be broadcasts and Novell may ask for a retake).  This means the machines must either be on the same dumb hub, or you must span (mirror) the ports on the switch.

B.  The only filter should be the MAC address of the problem machine.  Don't filter on any protocols or even on traffic between two machines.  Usually the best trace will be everything to and from the problem machine.
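The filter described in step B keeps every frame to or from one MAC address and nothing else. The Python sketch below shows that logic on hypothetical frame records (dictionaries with `src` and `dst` keys); it is not the filter syntax of any particular packet analyzer.

```python
def normalize_mac(mac):
    """Strip common separators and return 12 uppercase hex digits."""
    digits = mac.replace(":", "").replace("-", "").replace(".", "").upper()
    if len(digits) != 12 or any(c not in "0123456789ABCDEF" for c in digits):
        raise ValueError(f"not a MAC address: {mac!r}")
    return digits


def frames_for_machine(frames, problem_mac):
    """Keep frames sent to or from the problem machine, whatever the protocol."""
    target = normalize_mac(problem_mac)
    return [frame for frame in frames
            if target in (normalize_mac(frame["src"]), normalize_mac(frame["dst"]))]


# Hypothetical captured frames.
frames = [
    {"src": "00:A0:C9:11:22:33", "dst": "00:A0:C9:44:55:66", "proto": "IPX"},
    {"src": "00:A0:C9:77:88:99", "dst": "00:A0:C9:11:22:33", "proto": "IP"},
    {"src": "00:A0:C9:77:88:99", "dst": "00:A0:C9:44:55:66", "proto": "IPX"},
]
kept = frames_for_machine(frames, "00-a0-c9-11-22-33")
print(len(kept))  # 2
```

Note that the protocol field is deliberately ignored, in line with the advice not to filter on protocols.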

C.  If possible, get a good trace (a working scenario) and a bad trace (a non-working scenario).

D.  Some situations might require a simultaneous trace.  In this situation, you would want a trace on the problem machine and a trace on the destination server at the same time.  This will help us identify whether the packets are reaching their final destination.