Insufficient memory errors when rebuilding NSS volumes under OES Linux

  • 3070623
  • 12-Jun-2007
  • 27-Apr-2012

Environment

Novell Open Enterprise Server (OES)
Novell Open Enterprise Server (Linux based)

Situation

Attempting to verify a 5.8 TB NSS volume on OES Linux with the ravsui utility results in the following memory errors:

nsscon message:
NSS error: Verify aborted prior to producing detailed information
Status: 20000
Name: zERR_NO_MEMORY
Source: repairZVP.c[1864]

/var/log/messages:
kernel: NSSLOG ==> [Ownership] /usr/src/packages/.../zPool.c[1194]
kernel: Pool "DATA" has been released by repair.c[238]
kernel: NSSLOG ==> [Ownership] /usr/src/packages/.../zPool.c[1142]
kernel: Pool "DATA" is owned by repair.c[1945]
kernel: NSSLOG ==> [Error] /usr/src/packages/.../zalloc.c[82]
kernel: March 28, 2007 6:17:46 pm NSS-4.07a-217:
kernel: Error allocating 191288320 bytes of memory.
kernel: You may not have enough memory. Either close some other applicati
kernel: NSSLOG ==> [Ownership] /usr/src/packages/.../zPool.c[1194]
kernel: Pool "DATA" has been released by repair.c[238]

ravsui messages:
Mar 28, 2007 6:17:46 pm NSS-4.07a-217: /usr/src/.../zalloc.c[82]
Error allocating 191288320 bytes of memory.
You may not have enough memory. Either close some other applications or add more memory.

Resolution

This problem can occur when verifying or rebuilding large NSS volumes and is due to the way NSS uses the available cache buffers. NSS manages cache buffers on Linux using methods similar to those used in other Linux file systems such as ReiserFS, Polyserve, and XFS, with the exception of EXT.

For file data, NSS uses the Linux cache page manager to gain access to available memory in the system. There are some limits in place so that when copying large files, NSS does not starve other user applications for memory. This is similar to the cache handling used in NetWare®.

For metadata, NSS uses kernel memory. NSS can use only a percentage of this space because other applications share this space. By default, NSS reserves a minimum buffer cache size of 30,000 4KB buffers, which is about 120 MB of the kernel memory space. You can adjust the minimum number of buffers to be used by NSS with the MinBufferCacheSize parameter.
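
As a rough check of the figures above, the following Python sketch converts a MinBufferCacheSize value into the kernel memory it represents. The function name is hypothetical; only the 4 KB buffer size and the default of 30,000 buffers come from this document.

```python
BUFFER_SIZE_KB = 4  # NSS cache buffers are 4 KB each (from this document)

def buffers_to_kb(buffer_count):
    """Return the memory, in KB, covered by the given number of 4 KB buffers."""
    return buffer_count * BUFFER_SIZE_KB

default_kb = buffers_to_kb(30000)   # default MinBufferCacheSize
print(default_kb, "KB")             # 120000 KB
print(default_kb / 1000, "MB")      # about 120 MB, as stated above
```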

For a 32-bit machine, the kernel cache memory limit is 1 GB. Depending on what else is running, you might need to modify how much space you allocate for NSS.

For example, when running ravsui(8) for a pool verify or a pool rebuild, the utility needs contiguous space in kernel memory separate from the space allocated to the core NSS process. The larger the pool, the larger the space that is needed. On a 32-bit machine with a 1 GB limit, you might need to stop other processes temporarily to free up space so that the verify or rebuild can run. You can optionally reduce the amount of space used by the core NSS process by lowering the setting for MinBufferCacheSize to as few as 10000 4KB buffers. When the verify or rebuild is done, you can restore the setting to its normal value.
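
To put these numbers side by side, this Python sketch computes how much of the reserved minimum is released by lowering MinBufferCacheSize from 30000 to 10000, and compares it with the failed allocation reported in the log excerpt in the Situation section. It is arithmetic on figures quoted in this document only; the names are illustrative, and it does not predict whether a given verify or rebuild will succeed.

```python
BUFFER_SIZE_BYTES = 4 * 1024  # assumes a 4 KB buffer is 4096 bytes

def reserve_released_bytes(old_min, new_min):
    """Bytes no longer reserved when the minimum buffer count is lowered."""
    return (old_min - new_min) * BUFFER_SIZE_BYTES

released = reserve_released_bytes(30000, 10000)
failed_allocation = 191288320  # bytes, from the zalloc.c error in the log

print(released)                       # 81920000
print(released >= failed_allocation)  # False
```

In this example the released reserve is smaller than the single allocation that failed, so lowering the minimum may need to be combined with stopping other processes, as described above.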

1) Open a terminal console as the root user.

2) Start nsscon(8). At the console prompt, enter

nsscon

3) Set the minimum number of cache buffers used by NSS on Linux. In nsscon, enter

nss /MinBufferCacheSize=value

where value is the number of 4KB buffers to assign for NSS. The default value is 30000. The maximum setting is the amount of memory in KB divided by 4 KB. For a 32-bit machine, the maximum setting is 250000 buffers.
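
The bounds described above can be sketched in Python as follows. The helper names are hypothetical, the lower bound of 10000 is taken from the guidance earlier in this document, and the 1,000,000 KB figure is an assumption chosen because it reproduces the 250000-buffer maximum quoted for 32-bit machines.

```python
def max_buffer_setting(memory_kb):
    """Maximum MinBufferCacheSize: memory in KB divided by 4 KB per buffer."""
    return memory_kb // 4

def nsscon_command(value, memory_kb, lowest=10000):
    """Build the nsscon command line, rejecting values outside the bounds."""
    highest = max_buffer_setting(memory_kb)
    if not lowest <= value <= highest:
        raise ValueError(f"value must be between {lowest} and {highest}")
    return f"nss /MinBufferCacheSize={value}"

print(max_buffer_setting(1_000_000))         # 250000
print(nsscon_command(10000, 1_000_000))      # nss /MinBufferCacheSize=10000
```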

After making these changes, perform the pool verify or rebuild. When the verify or rebuild completes, repeat these steps to reset MinBufferCacheSize to its default value of 30000.

Note - Rebuilding extremely large NSS pools may require more memory than is physically available on the 32-bit platform. This limitation will be addressed with the 64-bit version of Open Enterprise Server 2 for Linux.