Environment
Novell NetWare 6.5 Support Pack 6
Novell NetWare 6.5 Support Pack 5
Situation
The user would see the error THIS USER DOES NOT HAVE THE CORRECT
CREDENTIALS TO AUTHENTICATE TO THE CIMOM CLIENT when accessing the
storage management icon in iManager. The error occurred once the
desired server was selected in order to view its disk channel
information. This error is usually the result of LDAP failing
to find information on the user who is trying to authenticate. The
first troubleshooting step is to get the owcimom debug log:
Edit the sys:system\cimom\etc\openwbem\openwbem.conf file and set
log.main.level = DEBUG
Then unload and reload owcimomd.nlm.
Then reproduce the problem.
The log is in: SYS:\SYSTEM\CIMOM\VAR\OWCIMOMD.LOG
In this case the owcimomd.log showed this:
[1]HTTPServer: New thread started
[3]HTTPServer: A thread got some work to do
[3]HTTPServer::authenticate: processing Basic
[3]NetWareAuthenticator: Didn't get cache entry for user admin. Doing LDAP authentication
[3]NetWareAuthenticator: Failed to start client session. Error: Can't contact LDAP server
[3]NetWareAuthenticator: Failed to authenticate user admin
[3]HTTPServer::authenticate: failed:
[3]HTTPServer: A thread got some work to do
[3]HTTPServer::authenticate: processing Basic
[3]NetWareAuthenticator: Didn't get cache entry for user admin. Doing LDAP authentication
[3]NetWareAuthenticator: Failed to start client session. Error: Can't contact LDAP server
[3]NetWareAuthenticator: Failed to authenticate user admin
[3]HTTPServer::authenticate: failed:
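When the debug log is large, the authentication failures can be pulled out with a short script. The following is a minimal Python sketch, not a Novell tool. The function name summarize_auth_failures is hypothetical, and the patterns assume the message wording shown in the excerpt above.

```python
import re
from collections import Counter

# Patterns assume the NetWareAuthenticator wording seen in the excerpt.
SESSION_ERROR = re.compile(
    r"NetWareAuthenticator: Failed to start client session\. Error: (?P<error>.+)$"
)
AUTH_FAILED = re.compile(
    r"NetWareAuthenticator: Failed to authenticate user (?P<user>\S+)"
)


def summarize_auth_failures(lines):
    """Count session errors and failed user names in owcimomd.log lines."""
    errors = Counter()
    users = Counter()
    for line in lines:
        line = line.strip()
        match = SESSION_ERROR.search(line)
        if match:
            errors[match.group("error")] += 1
            continue
        match = AUTH_FAILED.search(line)
        if match:
            users[match.group("user")] += 1
    return errors, users


# Two lines copied from the log excerpt, used only as sample input.
SAMPLE = [
    "[3]NetWareAuthenticator: Failed to start client session. "
    "Error: Can't contact LDAP server",
    "[3]NetWareAuthenticator: Failed to authenticate user admin",
]

sample_errors, sample_users = summarize_auth_failures(SAMPLE)
print(dict(sample_errors), dict(sample_users))
```

To use it on a real log, copy OWCIMOMD.LOG to a workstation and pass the opened file object to summarize_auth_failures in place of SAMPLE.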
This information points to LDAP having a problem. The next
troubleshooting step is to get a dstrace with +ldap and +time
at the server console:
load dstrace
dstrace screen on
dstrace -all
dstrace +ldap +time
dstrace file on
reproduce the problem
dstrace file off
Send in the sys:system\dstrace.log.
In this case the dstrace.log showed this:
LDAP: [2007/12/04 14:42:39] Checking for configuration changes
LDAP: [2007/12/04 14:44:19] Work info status: Total:1 Peak:0 Busy:0
LDAP: [2007/12/04 14:44:19] Thread pool status: Total:4 Peak:3 Busy:3
LDAP: [2007/12/04 14:49:27] Work info status: Total:1 Peak:0 Busy:0
LDAP: [2007/12/04 14:49:27] Thread pool status: Total:4 Peak:3 Busy:3
LDAP: [2007/12/04 14:51:38] New TLS connection 0x96338540 from 127.0.0.1:1130, monitor = 0x1e0, index = 3
LDAP: [2007/12/04 14:51:38] Monitor 0x1e0 initiating TLS handshake on connection 0x96338540
LDAP: [2007/12/04 14:51:38] DoTLSHandshake on connection 0x96338540
LDAP: [2007/12/04 14:51:38] TLS accept failure 1 on connection 0x96338540, setting err = -5875. Error stack:
error:14094412:SSL routines:SSL3_READ_BYTES:sslv3 alert bad certificate - SSL alert number 42
LDAP: [2007/12/04 14:51:38] TLS handshake failed on connection 0x96338540, err = -5875
LDAP: [2007/12/04 14:51:38] BIO ctrl called with unknown cmd 7
LDAP: [2007/12/04 14:51:38] Server closing connection 0x96338540, socket error = -5875
LDAP: [2007/12/04 14:51:38] Connection 0x96338540 closed
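The TLS failures in a long dstrace.log can be located with a small filter. This is a hedged Python sketch that assumes the line layout shown in the trace above. The name find_tls_failures is hypothetical and is not part of dstrace or any Novell utility.

```python
import re

# Line layouts are taken from the trace excerpt; other builds may differ.
NEW_CONN = re.compile(
    r"\[(?P<stamp>[^\]]+)\] New TLS connection (?P<conn>0x[0-9a-fA-F]+) "
    r"from (?P<peer>[\d.]+):(?P<port>\d+)"
)
TLS_FAILED = re.compile(
    r"\[(?P<stamp>[^\]]+)\] TLS handshake failed on connection "
    r"(?P<conn>0x[0-9a-fA-F]+), err = (?P<err>-?\d+)"
)


def find_tls_failures(lines):
    """Return one record per failed TLS handshake, with the peer address."""
    peers = {}      # connection id -> client address seen when it opened
    failures = []
    for line in lines:
        match = NEW_CONN.search(line)
        if match:
            peers[match.group("conn")] = match.group("peer")
            continue
        match = TLS_FAILED.search(line)
        if match:
            conn = match.group("conn")
            failures.append({
                "time": match.group("stamp"),
                "connection": conn,
                "peer": peers.get(conn),
                "error": int(match.group("err")),
            })
    return failures
```

A peer of 127.0.0.1 in the result indicates that the failing client is running on the server itself, which in this case was the cimom client.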
The trace shows a bad certificate error. Several certificates are
used in the cimom authentication process:
The browser, in this case IE, contacts the iManager component on the
NetWare server with a certificate that is given to the browser when
iManager is installed. The user then logs in to iManager. When the
user clicks the storage management icon, iManager contacts the
storage management plugin. The plugin, storagemgmt.npm, then
contacts a file called wbem.jar. Wbem.jar makes the connection to
the owcimom client. This is a secure https connection using the
sys:system\cimom\etc\openwbem\hostkey+cert.pem certificate. Once
this connection is made, the cimom client contacts LDAP to
search for the credentials of the user who is trying to
authenticate. This connection between owcimom and LDAP is also a
secure SSL connection that, by default, uses the
sys:public\rootcert.der certificate. This can be seen in the
sys:system\cimom\etc\openwbem\openwbem.conf file, which indicates
what certificate will be used for this connection. The cimom client
passes the certificate specified in this file to LDAP when making
the connection on port 636. It was this connection over port 636
that was failing with the BAD CERTIFICATE error in the LDAP trace.
The bad certificate was the rootcert.der.
When the connection works, LDAP contacts eDirectory to get the
user's credentials. These are then passed back to the cimom client
and the user is authenticated. In this case, LDAP did not trust the
rootcert.der on this server. This problem could occur if the
rootcert.der were corrupt, or if a different rootcert.der, signed by
a different CA, had been copied to this server from another tree.
It could also happen if the CA had been deleted and recreated and a
new rootcert.der generated.
See TIDs 3937454, 10098437, 10066259
A manual ldapsearch was done over port 636 to verify that the port
was working. We exported the certificateDNS to do this test. It
worked, which is why this certificate was used in the solution.
ldapsearch -D cn=admin,cn=users,ou=provo,o=novell -w novell -h 151.155.247.22 -p 636 -e c:\SSLCert.der -b cn=users,ou=provo,o=novell cn=admin
You must substitute your own information in this command. You can
give the newly exported certificateDNS any filename.der you want.
We exported the certificateDNS and copied it to the c: drive on the
local workstation. We then created a test directory on the server,
sys:cert, and copied the new certificate there. We then unloaded
and reloaded owcimomd.nlm and nldap.nlm. When we went back into
iManager the error persisted; however, the information in the LDAP
trace had changed:
[2007/12/13 14:11:31] New TLS connection 0x8cc6f620 from 127.0.0.1:1180, monitor = 0x19d, index = 1
[2007/12/13 14:11:31] Monitor 0x19d initiating TLS handshake on connection 0x8cc6f620
[2007/12/13 14:11:31] DoTLSHandshake on connection 0x8cc6f620
[2007/12/13 14:11:32] BIO ctrl called with unknown cmd 7
[2007/12/13 14:11:32] Completed TLS handshake on connection 0x8cc6f620
[2007/12/13 14:11:32] DoBind on connection 0x8cc6f620
[2007/12/13 14:11:32] Treating simple bind with empty DN and no password as anonymous
[2007/12/13 14:11:32] Bind name:NULL, version:3, authentication:simple
[2007/12/13 14:11:32] Sending operation result 0:"":"" to connection 0x8cc6f620
[2007/12/13 14:11:32] Operation 0x1:0x60 on connection 0x8cc6f620 completed in 0 seconds
[2007/12/13 14:11:32] DoSearch on connection 0x8cc6f620
[2007/12/13 14:11:32] Search request:
base: ""
scope:2 dereference:0 sizelimit:0 timelimit:10 attrsonly:0
filter: "(&(objectclass=inetorgperson)(uid=admin))"
no attributes
[2007/12/13 14:11:32] Empty attribute list implies all user attributes
[2007/12/13 14:11:32] Sending search result entry "cn=Admin,o=wii" to connection 0x8cc6f620
[2007/12/13 14:11:32] Cannot resolve NDS name 'O=wii_lib' in ResolveAndAuthNDSName, err = all referrals failed (-626)
[2007/12/13 14:11:32] LDAPSearchToCB: Cannot Resolve and Auth base DN, err = all referrals failed (-626)
[2007/12/13 14:11:32] LDAPSearchToCB failed, err = all referrals failed (-626)
[2007/12/13 14:11:32] Sending operation result 80:"":"NDS error: all referrals failed (-626)" to connection 0x8cc6f620
[2007/12/13 14:11:32] Operation 0x2:0x63 on connection 0x8cc6f620 completed in 0 seconds
[2007/12/13 14:11:32] DoUnbind on connection 0x8cc6f620
[2007/12/13 14:11:32] Connection 0x8cc6f620 closed
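Once the handshake succeeds, the lines worth checking are the operation results. The sketch below, in Python, lists every LDAP operation in a trace that returned a non-zero result and extracts the NDS error number when one is quoted in the message. The name failed_operations is hypothetical, and the pattern assumes the "Sending operation result" layout seen in this trace.

```python
import re

# Layout assumed from the trace: result code, matched DN, message, connection.
RESULT = re.compile(
    r'\[(?P<stamp>[^\]]+)\] Sending operation result '
    r'(?P<code>\d+):"(?P<matched>[^"]*)":"(?P<message>[^"]*)" '
    r'to connection (?P<conn>0x[0-9a-fA-F]+)'
)
# NDS errors appear in parentheses at the end of the message, e.g. (-626).
NDS_ERROR = re.compile(r"\((?P<nds>-\d+)\)")


def failed_operations(lines):
    """Return a record for each operation whose LDAP result code is not 0."""
    failures = []
    for line in lines:
        match = RESULT.search(line)
        if not match or match.group("code") == "0":
            continue  # not a result line, or the operation succeeded
        nds = NDS_ERROR.search(match.group("message"))
        failures.append({
            "time": match.group("stamp"),
            "connection": match.group("conn"),
            "ldap_result": int(match.group("code")),
            "message": match.group("message"),
            "nds_error": int(nds.group("nds")) if nds else None,
        })
    return failures
```

In this trace the anonymous bind returns result 0 and is skipped, while the search returns LDAP result 80 ("other") carrying NDS error -626.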
Clearly the SSL connection had been made, so the new certificate had
worked. However, eDirectory had a problem. There was a partition,
O=wii_lib, that appeared to be corrupt. It could not be accessed by
any utility. The following error would occur:
"Error -626 This object could not be found. It is possible that this
object exists but the server could not communicate with a server
holding a copy of the object."
-626 means all referrals failed. eDirectory did not know what this
partition was. LDAP has to search the entire tree to find any
instances of the user in question (the search in the trace uses an
empty base with subtree scope). It was failing on this partition for
the same reason as the other utilities.
Resolution
1. Exported a new certificateDNS in ConsoleOne to the c: drive
on the workstation. This certificate was then copied to the
sys:cert directory. The openwbem.conf was modified to use this
new certificate:
################################################################################
# The authentication module to be used by owcimomd. This should be a
# an absolute path to the shared library containing the authentication module.
owcimomd.authentication_module = /system/cimom/lib/openwbem/authentication/libnetwareauthentication.nlm
ldap_auth.ldap_host = 127.0.0.1
ldap_auth.cert_file = /cert/
#ldap_auth.searchbase = o=novell
################################################################################
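After editing the file, the active settings can be double-checked from a workstation copy. This is a minimal Python sketch under the assumption that openwbem.conf uses plain "key = value" lines with "#" comments, as in the excerpt above. The function name read_conf is hypothetical. It also reports keys that appear more than once, which makes accidental duplicates easy to spot.

```python
def read_conf(text):
    """Parse 'key = value' lines, skipping blank lines and '#' comments.

    Returns (settings, duplicates): the last value seen for each key, and
    the keys that were defined more than once, in order of detection.
    """
    settings = {}
    duplicates = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key = value line
        key = key.strip()
        if key in settings and key not in duplicates:
            duplicates.append(key)
        settings[key] = value.strip()
    return settings, duplicates


# Sample input copied from the excerpt; the cert path is left as shown there.
SAMPLE = """\
ldap_auth.ldap_host = 127.0.0.1
ldap_auth.cert_file = /cert/
#ldap_auth.searchbase = o=novell
"""

sample_settings, sample_duplicates = read_conf(SAMPLE)
print(sample_settings.get("ldap_auth.cert_file"), sample_duplicates)
```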
The cimom client is hard coded to look on the sys: volume, so the
volume name is not needed in the path. The owcimomd.nlm and
nldap.nlm were then unloaded and reloaded. In cases of heavy LDAP
usage, it might be better to reboot the server after hours, if
possible, rather than bouncing the NLMs.
2. Fixed the corruption in eDirectory. When we checked the replica
type we found that the replica was a subordinate reference (sub-ref)
without any master. To get rid of the O=wii_lib partition, we
designated that server as the Master replica holding server and
successfully merged that partition into Root. We checked the
partition and it was empty. The container was then deleted as it
served no purpose in the customer's tree.