Edir 8.7.0 Error: "Cache memory allocator out of available memory"

  • 7021222
  • 22-Aug-2017
  • 22-Aug-2017

Environment

Novell eDirectory 8.7 for All Platforms
Novell eDirectory 8.6 for All Platforms
Novell Directory Services 7

Situation

Servers have between 6 and 12 IP addresses bound to them, as well as multiple IPX network numbers.

The core routers had been reconfigured to tunnel IPX through IP. One server had been upgraded to eDirectory 8.7.

Error: "Cache memory allocator out of available memory"

Error: "Short term memory allocator is out of memory" and "Attempts to get more memory failed" seen on the console.

On a server with real copies of objects (the sender), DSTRACE.NLM with the +DSA switch shows constant "Insufficient buffers" errors.

On a server attempting to re-backlink an external reference partition root (a receiver), DSTRACE.NLM with the +RN and +DRL switches shows "Resetting replica ring AVA list due to insufficient buffer size of xxxxxxxx".

Resolution

The fix was to increase the referral buffer to the maximum outbound size permitted by eDirectory, 64K. This fix has been put into the following DS versions, as well as all later ones:

- DS 7.62c
- DS 8.85b
- eDir 8.6.2 10350.24
- eDir 8.7.0 10411.10
- eDir 8.7.1 FP1


Alternatively, either of the following workarounds will reduce the size of the referral value for that server:

- Unload httpstk, portal and other services in the referral list, then unload and reload ds. This removes these services from the replica referral list so that the server no longer advertises that they are loaded.
- Unbind IP from some of the IP addresses for this server.

One other workaround is to find the server value in the referral list that has a length of over FA0 (hexadecimal, 4000 decimal) and remove that server from the ring.
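The FA0 comparison in the last workaround can be sketched as follows. This is only an illustration: the server names and lengths are made up, and the helper oversized_referrals is hypothetical, not part of any Novell tool. It assumes the referral lengths have already been read by hand and are given as hexadecimal strings.

```python
# Hypothetical illustration: server names and lengths are invented,
# not real trace output. Lengths are hexadecimal strings.
THRESHOLD = 0xFA0  # 4000 decimal, the length named in the workaround


def oversized_referrals(lengths_hex):
    """Return the names whose referral length exceeds THRESHOLD."""
    return [
        name
        for name, hex_len in lengths_hex.items()
        if int(hex_len, 16) > THRESHOLD
    ]


sample = {"SRV-A": "3C0", "SRV-B": "1104", "SRV-C": "FA0"}
print(oversized_referrals(sample))  # prints ['SRV-B']
```

A length of exactly FA0 is not flagged here, since the workaround speaks of values over FA0.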

Additional Information

A server holding real copies of the tree's partitions was upgraded to eDirectory 8.7. This server had eight IP addresses bound and all services were loaded. The referral list for this server became greater than 4K in size. An xref server (a server attempting to backlink an xref partition root object, i.e., one that did not hold that partition) contacted this server to update its backlink since it was flagged 10000.

The insufficient buffer message comes from the real copy server. This server sees a 4K limitation in the code, knows the xref server cannot get the complete referral list within this buffer, and sends back the insufficient buffer message. The xref server flushes its buffers, returns them to the OS (this is the Reset AVA message), and requests from the OS a contiguous block of memory plus an additional 4K. The OS provides this, and the xref server tries again to backlink. The loop continues because the real copy server always returns the error, and another 4K is added on each pass until all cache buffers have been consumed.

Since the OS re-allocates the previous buffer memory to VM, it remains fragmented. Normally the OS will defragment this memory within 2 minutes. Because the xref server is constantly releasing and re-requesting memory buffers, the OS does not have time to defragment, and the server ends up with extremely fragmented memory.
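The retry loop described above can be illustrated with a toy model. This is a sketch under stated assumptions, not eDirectory code: the function simulate_backlink, its parameters, and the 1 MB cache size are all hypothetical. It assumes the sender refuses whenever the referral list exceeds the sender's own fixed limit, regardless of how large the receiver's buffer has grown, and that the receiver responds to each refusal by requesting a buffer 4K larger.

```python
PAGE = 4 * 1024  # 4K growth step described in the text


def simulate_backlink(referral_size, sender_limit, cache_bytes):
    """Toy model of the backlink retry loop (hypothetical, for illustration).

    The sender refuses whenever the referral list exceeds its own fixed
    limit, no matter how large the receiver's buffer is. The receiver
    reacts to each refusal by asking for a buffer 4K larger.
    Returns (succeeded, attempts, final_buffer_size).
    """
    buffer_size = PAGE
    attempts = 0
    while buffer_size <= cache_bytes:
        attempts += 1
        if referral_size <= sender_limit and referral_size <= buffer_size:
            return True, attempts, buffer_size
        buffer_size += PAGE  # release the old buffer, request 4K more
    return False, attempts, buffer_size  # cache exhausted


CACHE = 1024 * 1024  # assumed 1 MB of cache, chosen only for the demo

# 4K sender limit: a 5000-byte referral list can never be delivered.
print(simulate_backlink(5000, 4 * 1024, CACHE))

# 64K sender limit (the fix): the same list is delivered on a retry.
print(simulate_backlink(5000, 64 * 1024, CACHE))
```

In this model the first call fails only after the buffer has grown past the whole cache, while the second succeeds on the second attempt. The model ignores fragmentation, which in the real case makes the situation worse.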

Formerly known as TID# 10084982
Formerly known as TID# NOVL90916