Troubleshooting tips for NetWare SSH and SFTP

  • 7004900
  • 20-Nov-2009
  • 26-Apr-2012

Environment

Novell NetWare 6.5
Novell Open Enterprise Server (OES) NetWare

Situation

Troubleshooting tips for NetWare SSH and SFTP

Resolution

This document attempts to give a starting point for troubleshooting problems connecting to or authenticating to NetWare SSH (including SFTP and SCP). It may be expanded from time to time.

 
 
Enable the Highest Debug Logging

If users cannot connect or authenticate through the OpenSSH suite (offered through SSHD.NLM), edit the /etc/ssh/sshd_config file. Find the LogLevel parameter and set it to:

LogLevel DEBUG3

To put this change in place, SSHD.NLM must be unloaded and reloaded.


 
Obtain an sshd.log of the problem occurring.

After reproducting the problem, obtain the sys:etc/ssh/logs/sshd.log file. (Older events may have been archived in sshd1.log, sshd2.log, etc.) Common areas of concern in the sshd.log:

 
Identifying your session:

One of the easiest ways to find your session in a log is to search on the user name and/or IP address in use by the client. This will yield such results as:

17 Nov - 07:04:37[0001947264] (889383e0)SSHD Session <1> : Connection from 137.65.120.5 port 2401

or

17 Nov - 02:55:27[0001675073] (aa59f060)SSHD Shell <1> : sftpu1@137.65.120.5:1581 - SFTP connection(1) accepted.

The number in <brackets> after "Session" or "Shell" is the session number, also known as the thread number. Activities that are general and don't necessarily relate to one session will show session 0, for example:


17 Nov - 02:51:42[0001670975] (aa6aa520)SSHD Session <0>

Individual sessions will get numbered 1 and higher, and the system will increment the session number for each session that comes in. More than one session can be going on at once, so be careful to pay attention to the one you are attempting to follow.

 
Searching for the user in eDirectory:

A normal, successful ldap search for a user will take the following form:

17 Nov - 02:51:42[0001670975] (aa6aa520)SSHD Session <1> debug3: edir_find_user_context(sftpu1, uniqueid) sfilter = (&(objectclass=*)(uniqueid=sftpu1))

17 Nov - 02:51:42[0001670975] (aa6aa520)SSHD Session <1> debug3: edir_find_user_context(sftpu1, uniqueid) searchBase = o=partridge, scope = Subtree

17 Nov - 02:51:42[0001670975] (aa6aa520)SSHD Session <1> debug3: edir_find_user_context(sftpu1, uniqueid) attribute = uniqueid

17 Nov - 02:51:42[0001670975] (aa6aa520)SSHD Session <1> debug3: edir_find_user_context(sftpu1, uniqueid) value = sftpu1

17 Nov - 02:51:42[0001670975] (aa6aa520)SSHD Session <1> debug3: edir_find_user_context(sftpu1, uniqueid) ldap_get_dn() returned cn=sftpu1,o=partridge

Points to note in this example are:

- The first line shows that the user name being searched on in the above log information is "sftpu1" and that SSHD is searching for a object whose "uniqueid" attribute matches that name. It is possible that some user objects may not have this attribute, depending upon the tool which created the user. Older tools like NWADMN32 did not create that attribute.

- The second line shows the searchBase (the location in the tree where the search is starting) as well as the scope, which will be either "one level" for a simple context search, or "subtree" to include all sub-containers. These parameters are controlled by the "eDirNameContext" settings in sys:\etc\ssh\sshd_config. If more than one setting is present, multiple debug lines will occur.

- The third and fourth lines show that the search had a successful result. "attribute = uniqueid" was found where "value = sftpu1". It these lines to not appear, then the search was not successful. No "error" message will be shown. The absence of these "attribute" and "value" lines would be the main clue that nothing was found.

- If the search fails, then it is possible that a new, almost identical search will be done where SSHD searches on "cn" rather than "uniqueid". This can help if the user is missing the "uniqueid" attribute. However, in a default tree and default SSHD configuration, it is common that the ldap anonymous bind does not have enough rights to browse cn, so this would typically fail also.

- Shortly after a user is found, additional lines will show whether the password was correctly validated, and what home directory was read out of eDir. For example:

18 Nov - 00:22:58[0003081538] (aa6aa520)SSHD Session <4> : auth_edir_password() ldap_simple_bind_s() for cn=sftpu1,o=novell succeeded

18 Nov - 00:22:58[0003081538] (aa6aa520)SSHD Session <4> debug3: Home directory entry cn=SERVER1_SYS,o=novell#0#home\sftpu1 for user sftpu1

 
Searching for servers in eDirectory:

After a user object is found and the password validated, SSHD will typically do 2 server searches. One search will look for the local server's object in eDirectory, and will look like this:

17 Nov - 02:55:27[0001675070] (aa5a4240)SSHD Session <1> debug3: edir_get_server_dn(SERVER1) sfilter = (&(objectclass=NCP Server)(cn=SERVER1))

17 Nov - 02:55:27[0001675070] (aa5a4240)SSHD Session <1> debug3: edir_get_server_dn(SERVER1) searchBase = o=novell, scope = Subtree

Similarly to user searches, this shows the name being searches, the object type being filtered against, and the searchBase and scope of the search. (See the above "Searching for users" section for more details.)

When successful, it won't give any particular message saying "success". This search is also controlled by the "eDirNameContext" settings mentioned above.

Then a broader server search will be done, to identify all NCP Servers in the eDirNameContext search areas. It will look like:

17 Nov - 02:51:42[0001670978] (889383e0)SSHD Shell <1> debug3: edir_build_server_list() sfilter = (&(objectclass=NCP Server))

17 Nov - 02:51:42[0001670978] (889383e0)SSHD Shell <1> debug3: edir_build_server_list() searchBase = o=novell, scope = Subtree

17 Nov - 02:51:42[0001670978] (889383e0)SSHD Shell <1> debug3: edir_build_server_list() ldapsearchbase = o=novell

17 Nov - 02:51:42[0001670978] (889383e0)SSHD Shell <1> debug3: edir_build_server_list() Search for Server in context o=novell succeeded

17 Nov - 02:51:42[0001670978] (889383e0)SSHD Shell <1> debug3: edir_build_server_list() entry = 91e4baa0

17 Nov - 02:51:42[0001670978] (889383e0)SSHD Shell <1> debug3: edir_build_server_list() ldap_get_dn() returned cn=SERVER1,o=novell

17 Nov - 02:51:42[0001670978] (889383e0)SSHD Shell <1> debug3: edir_build_server_list() AddServer(SERVER1) ldapDN: cn=SERVER1

17 Nov - 02:51:42[0001670978] (889383e0)SSHD Shell <1> debug3: Server name- SERVER1 is not existing

Multiple servers may be found, so portions of this pattern may repeat. The messagine "SERVER1 is not existing" simply means that when the system went to add the server name to it's list, the name was not already in the list. This is normal.

Note: If a certain server is not found in this search, users will not be able to navigate to that server, or be placed in a home directory on that server.

 
Searching for volumes in eDirectory

After the server searches are done, a volume search will be done, and will look like this:

17 Nov - 02:55:27[0001675073] (aa59f060)SSHD Shell <1> debug3: edir_build_volume_list() sfilter = (&(objectclass=Volume))

17 Nov - 02:55:27[0001675073] (aa59f060)SSHD Shell <1> debug3: edir_build_volume_list() searchBase = o=novell, scope = Subtree

17 Nov - 02:55:27[0001675073] (aa59f060)SSHD Shell <1> debug3: edir_build_valume_list() ldapsearchbase = o=novell

17 Nov - 02:55:27[0001675073] (aa59f060)SSHD Shell <1> debug3: edir_build_volume_list() AddVolumeToServerByName(SERVER1, DOSFAT_C)

17 Nov - 02:55:27[0001675073] (aa59f060)SSHD Shell <1> debug3: edir_build_volume_list() AddVolumeToServerByName(SERVER1, test5)

17 Nov - 02:55:27[0001675073] (aa59f060)SSHD Shell <1> debug3: edir_build_volume_list() AddVolumeToServerByName(SERVER1, test3)

17 Nov - 02:55:27[0001675073] (aa59f060)SSHD Shell <1> debug3: edir_build_volume_list() AddVolumeToServerByName(SERVER1, SYS)

Note: If a certain volume is not found, users will not be able to navigate to that volume, or be placed in a home directory on that volume.

 
Reaching a user's home server or home directory:

As noted above, for a user to reach their home directory, the home server and home volume must be found in the ldap searches described above. If a user is not reaching a their home directory or a location to which they are trying to 'cd', review the debug search output.

In addition to *finding* the server and volume in question, SSHD must be able to create an identity token and path context to the server and volume in question. These can be found by seaching for mention of calls to "create_identity" and "NXCreatePathContext". These functions are provided by LIBC.NLM and can be a source of errors. If an error occured in one of these APIs, part of the output will list 4 error codes, in this format:

rc: 117, errno 7, h_errno: 10110, clienterrno: 0

The first code (rc) is usually the return code from LIBC itself. LIBC codes can be looked up here. Occasionally, the rc: is an eDirectory error code. eDir error codes can be referenced in TID 10010902 . That document was original written for older versions of DS but the information still applies.

The second code (errno) is not generally useful. It typically will report "7" both in cases of success and failure.

The third code (h_errno) may or may not be useful. A code in the 10000 range is typcially a Winsock error. For example, if LIBC received an error from Winsock (communication, host name resolution, etc) it will be shown here. Sometimes a value of 997 is shown here. The source of that value has not yet been determined by the author of this document, but occurs in enough different conditions that it is considered to be too generic to be helpful.

The fourth code (clienterrno) will often contain an eDirectory error code. eDir error codes can be referenced in TID 10010902 . That document was original written for older versions of DS but the information still applies.

The create_identity call will typically contain the fully distinquished user name. The NXCreatePathContext call will typically contain the server name and volume name where the user's "home" has been determined. These are good values to double check.

Between these 2 calls, as well as after the second call, messages containing home directory information will typically be found. This can help you determine whether the system is, for some reason, deciding that it needs to alter ("fix up") the target destination. Those lines may give clues, if the user is having a hard time reaching the right directory.

Additional Information

Formerly known as TID# 10100305