NSS Errors 20407 and/or 20051

  • Document ID: 7008324
  • Creation Date: 08-Apr-2011
  • Modified Date: 07-Jun-2013

Environment

Novell Open Enterprise Server 2 (OES 2) Linux
Novell Storage Services (NSS)

Situation

The following errors appear repeatedly in /var/log/messages:
Feb 10 18:28:25 oes_srv1 kernel: NSSLOG ==> [Error] comnPool.c[2554]
Feb 10 18:28:25 oes_srv1 kernel:      Feb 10, 2011   6:28:25 pm NSS<COMN>-4.12a-xxxx:
Feb 10 18:28:25 oes_srv1 kernel:      Pool CHIFS14: System data error 20407(nameTree.c[499]).   Block 0(file block 0)(ZID 0)
Feb 10 18:28:25 oes_srv1 kernel: err=20801 comnVol.c[894]
Feb 10 18:28:25 oes_srv1 kernel: err=20801 comnVol.c[894]
Feb 10 18:28:25 oes_srv1 kernel: err=20801 comnVol.c[894]
Feb 10 18:28:25 oes_srv1 kernel: err=20801 comnVol.c[894]
Feb 10 18:28:28 oes_srv1 kernel: NSSLOG ==> [Error] zlssMSAP.c[538]
Feb 10 18:28:28 oes_srv1 kernel:      Feb 10, 2011   6:28:28 pm NSS<ZLSS>-4.12a-xxxx:
Feb 10 18:28:28 oes_srv1 kernel:      MSAP: Pool "CHIFS14" read error 20206.
Feb 10 18:28:28 oes_srv1 kernel: 
Feb 10 18:28:28 oes_srv1 kernel: NSSLOG ==> [MSAP] comnLog.c[201]
Feb 10 18:28:28 oes_srv1 kernel:      Pool "CHIFS14" - Read error(20206).
Feb 10 18:28:38 oes_srv1 afptcpd[17746]: [error] Failed to get file info from ZID for volume - USERS02. Error - 20051 <0x4e53>
Feb 10 18:28:38 oes_srv1 afptcpd[17746]: [error] zAFPOpenRoot: Failed to open by ZID. Error - 20051 <0x4e53>
Feb 10 18:28:38 oes_srv1 afptcpd[17746]: [error] FPGetFileDirParms: Failed to get file object info. Error - 20051 <0x4e53>
Feb 10 18:28:38 oes_srv1 afptcpd[17746]: [error] zAFPOpenRoot: Failed to open by ZID. Error - 20051 <0x4e53>
Feb 10 18:28:38 oes_srv1 afptcpd[17746]: [error] FPGetFileDirParms: Failed to get file object info. Error - 20051 <0x4e53>
Feb 10 18:28:40 oes_srv1 afptcpd[17746]: [error] zAFPOpenRoot: Failed to open by ZID. Error - 20051 <0x4e53>
Feb 10 18:28:40 oes_srv1 afptcpd[17746]: [error] FPGetFileDirParms: Failed to get file object info. Error - 20051 <0x4e53>
Feb 10 18:28:40 oes_srv1 afptcpd[17746]: [error] zAFPOpenRoot: Failed to open by ZID. Error - 20051 <0x4e53>
Feb 10 18:28:40 oes_srv1 afptcpd[17746]: [error] FPGetFileDirParms: Failed to get file object info. Error - 20051 <0x4e53>
...etc

The pool then deactivates.  If the pool is reactivated, it deactivates again shortly afterward, usually within a few seconds to a couple of minutes.
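
To confirm that a server is logging this error pattern, the log can be scanned for the two error codes. The following is a minimal sketch in Python, not a Novell tool. The regular expressions and the count_errors helper are illustrative assumptions modelled only on the sample lines above, and other NSS or afptcpd versions may format these messages differently.

```python
import re
from collections import Counter

# Patterns are modelled only on the sample lines in this article;
# they are assumptions, not a documented NSS log format.
POOL_ERROR = re.compile(r'Pool (?P<name>\S+): System data error (?P<code>\d+)')
AFP_ERROR = re.compile(r'afptcpd\[\d+\]: \[error\] .*Error - (?P<code>\d+)')


def count_errors(lines, codes=("20407", "20051")):
    """Count how often each of the given error codes appears in syslog lines."""
    counts = Counter()
    for line in lines:
        for pattern in (POOL_ERROR, AFP_ERROR):
            match = pattern.search(line)
            if match and match.group("code") in codes:
                counts[match.group("code")] += 1
    return counts


# Lines copied from the excerpt above; the last one (error 20206) is ignored.
sample = [
    'Feb 10 18:28:25 oes_srv1 kernel:      Pool CHIFS14: System data error 20407(nameTree.c[499]).   Block 0(file block 0)(ZID 0)',
    'Feb 10 18:28:38 oes_srv1 afptcpd[17746]: [error] zAFPOpenRoot: Failed to open by ZID. Error - 20051 <0x4e53>',
    'Feb 10 18:28:38 oes_srv1 afptcpd[17746]: [error] FPGetFileDirParms: Failed to get file object info. Error - 20051 <0x4e53>',
    'Feb 10 18:28:28 oes_srv1 kernel:      MSAP: Pool "CHIFS14" read error 20206.',
]
print(dict(count_errors(sample)))
```

To scan a live server, pass an open file instead of the sample list, for example count_errors(open("/var/log/messages", errors="replace")).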

Resolution

This issue is fixed in OES 2 SP3 and OES 11.

Workaround:  Rebuild the pool with ravsui, purging the deleted files as part of the rebuild; e.g.

ravsui --purge-deleted-files rebuild <poolname>

Additional Information

This particular instance of the errors is caused by an attempt to purge an NSS pool that has a specific type of NSS metadata corruption.

When troubleshooting this problem, first determine whether a purge is running.  A purge can be started in several ways:
  • By an administrator, via ncpcon
  • By a user, via Novell Client
  • By the server's auto-purge mechanism:  When an NSS Pool becomes full, an automatic purge will start in order to free up space
    • Use nssmu or iManager to check whether a pool is getting full
    • As a short-term measure, the following NSS parameters can be set in nsscon to lower the free-space level at which the auto-purge starts:
      • nss /PoolHighWaterMark=poolname:Percent/MB/GB
      • nss /PoolLowWaterMark=poolname:Percent/MB/GB
    • For example, the following commands configure auto-purge to start when the p_users pool has only 5% free space and to keep running until 10% of the pool is free:
      • nss /PoolHighWaterMark=p_users:10%
      • nss /PoolLowWaterMark=p_users:5%
    • For further details, refer to the Novell Documentation at https://www.novell.com/documentation/oes2/
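
The interaction of the two watermarks can be illustrated with a short Python sketch. It is a model of the behaviour described above, not NSS code: the function names are hypothetical, the free-space readings are made up, and the exact boundary handling (whether the purge starts at exactly 5% or just below it) is an assumption.

```python
def watermark_commands(pool, low_percent, high_percent):
    """Build the two nsscon commands shown in this article for one pool."""
    # Assumption: the low watermark must sit below the high watermark.
    if not 0 < low_percent < high_percent < 100:
        raise ValueError("expected 0 < low_percent < high_percent < 100")
    return [
        f"nss /PoolHighWaterMark={pool}:{high_percent}%",
        f"nss /PoolLowWaterMark={pool}:{low_percent}%",
    ]


def purge_active(free_percent, low_percent, high_percent, was_active):
    """Model the start/stop behaviour described above as simple hysteresis."""
    if free_percent <= low_percent:    # start at or below the low watermark (assumed inclusive)
        return True
    if free_percent >= high_percent:   # stop once the high watermark is reached (assumed inclusive)
        return False
    return was_active                  # between the marks: keep the previous state


commands = watermark_commands("p_users", 5, 10)
for command in commands:
    print(command)

# Walk through a made-up series of free-space readings (percent).
active = False
history = []
for free in [12, 6, 5, 7, 9, 10, 8]:
    active = purge_active(free, 5, 10, active)
    history.append(active)
    print(f"{free:>3}% free -> auto-purge {'running' if active else 'idle'}")
```

In this model the purge stays idle at 6% free because the low watermark has not yet been reached, starts at 5%, and keeps running through 7% and 9% until free space is back at 10%.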