AFP Cluster Resource Goes Comatose During Fail Over If File Is Open From Client

  • 7002566
  • 05-Feb-2009
  • 27-Apr-2012

Environment

Novell Open Enterprise Server 2 (OES 2) SP 1
Novell Open Enterprise Server (Linux based)
Novell Cluster Services (NCS)
AFP

Situation

During a manual or automatic failover of an AFP cluster resource on OES 2 SP 1 from one node to another, the resource fails to come up on the destination server and goes comatose.  The only way to recover is to restart the servers in the cluster.  On closer inspection, the resource goes comatose because it remains mounted on the first server: it is never completely or correctly dismounted and deactivated there.
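One way to confirm this symptom on the source node is to look for the volume in the kernel mount table.  The following is a minimal sketch, not an official diagnostic: it assumes that NSS volumes appear in /proc/mounts with the filesystem type nssvol, and the volume name AFPVOL and mount point /media/nss/AFPVOL are hypothetical placeholders.

```python
def mounts_of_type(mounts_text, fs_type="nssvol"):
    """Return (device, mount_point) pairs for entries of the given type.

    mounts_text is the content of /proc/mounts. The "nssvol" type name
    is an assumption about how NSS volumes are listed there.
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts columns: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and fields[2] == fs_type:
            found.append((fields[0], fields[1]))
    return found


def is_still_mounted(mounts_text, mount_point):
    """True if mount_point is listed among the matching mounts."""
    return any(mp == mount_point for _, mp in mounts_of_type(mounts_text))


# Hypothetical sample of a mount table, for illustration only.
sample = """\
/dev/sda2 / ext3 rw 0 0
AFPVOL /media/nss/AFPVOL nssvol rw,name=AFPVOL 0 0
"""
print(is_still_mounted(sample, "/media/nss/AFPVOL"))
```

On a live node, pass the contents of /proc/mounts instead of the sample text.  If the volume is still listed on the first server after the migration was attempted, the resource was not fully dismounted there.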

Resolution

There are three parts to the resolution:
 
1.  Make sure that the AFP proxy user created during the AFP installation is assigned to the default password policy you selected during the installation and configuration of AFP.  If more than one proxy user is in use, each proxy user must also be assigned to the password policy.
 
2.  Make sure that the password policy (as viewed in iManager) also allows the proxy user(s) to retrieve passwords.  To check and edit this setting:
  • Open iManager
  • Select PASSWORDS from the left-hand menu
  • Select PASSWORD POLICIES from the sub-menu under passwords
  • Select the correct password policy.  By default it will be the AFP DEFAULT POLICY
  • In the pop-up window for the password policy, select the UNIVERSAL PASSWORD tab and then the CONFIGURATION OPTIONS sub-tab
  • Scroll down until you see the section titled ALLOW THE FOLLOWING TO RETRIEVE PASSWORDS
  • If a proxy user is not listed, add that user.

3.  The above two steps may resolve some situations; however, a defect has also been filed for this issue.  The shipping OES 2 SP 1 code for AFP does not properly close file handles for open files during the migration of a cluster resource.  The defect has been identified and resolved, but the official patch is still being worked on at the time of the writing of this TID (Feb. 5, 2009).  The defect number is listed below.
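Until the patch is available, it can help to know whether any process still holds files open on the volume before a migration is attempted.  The sketch below is an illustration under stated assumptions rather than a supported tool: it relies on the standard Linux /proc/<pid>/fd symbolic links, the helper name open_files_under and the mount point /media/nss/AFPVOL are hypothetical, and it usually needs root privileges to read other processes' descriptors.

```python
import os


def open_files_under(mount_point, proc_root="/proc"):
    """Return (pid, path) pairs for open files located under mount_point.

    Walks <proc_root>/<pid>/fd and resolves each descriptor's symlink.
    Processes or descriptors that vanish or cannot be read are skipped.
    """
    prefix = mount_point.rstrip("/") + "/"
    hits = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are process IDs
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited or permission denied
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor closed while scanning
            if target.startswith(prefix):
                hits.append((int(entry), target))
    return hits
```

Calling open_files_under("/media/nss/AFPVOL") on the node that currently hosts the resource returns an empty list when no handles are open beneath that path.  A non-empty result before a migration suggests that the open-file condition described in this TID applies.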

Bug Number

470773