Slow backup restore performance tuning parameters for NSS and TSAFS.NLM

  • 3341011
  • 11-Jan-2007
  • 26-Apr-2012

Environment

Novell NetWare 6.0 Support Pack 8
Novell NetWare 6.5 Support Pack 4
Novell NetWare 6.5 Support Pack 5
Novell Netware 6.5 Support Pack 6
Novell Netware 6.5 Support Pack 7

Situation

Slow backup restore performance tuning parameters for NSS and TSAFS.NLM

Resolution

There are many issues that can affect backup/restore performance. There is tuning that can be done on the server and NSS volumes. These are only ballpark figures. The server must be benchmarked to find the optimum settings.

These two parms must be set in c:\nwserver\nssstart.cfg. Make sure there are no typos or NSS won't load. Nssstart.cfg is not created by default.
/AuthCacheSize=20000
/NumWorkToDos=100

These parms can be set in AUTOEXEC.NCF. Note: If these are placed in this file they must start with NSS. For example - nss /ClosedFileCacheSize=2000. They can also be placed in the C:\NWSERVER\NSSSTART.CFG and there they would be used without the NSS in the beginning.

/ClosedFileCacheSize=200000
/MinBufferCacheSize=20000
/MinOsBufferCacheSize=20000
/CacheBalanceMaxBuffersPerSession=20000
/CacheUserMaxPercent=70
/AllocAheadBlks=63
/NameCacheSize=200000
/NoCopyBuffersOnXlatch
/ReadAheadBlks=:64 -- on NetWare 6.5 boxes. A line must be added for each volume. This sets a count for the number of 4k blocks to read with each request. In this case, 256k at a time.
These settings are ballpark figures. They may need to be adjusted depending on how much ram the server has.
Setting these too high can cause excessive memory usage and can affect other apps as well as performance. The "closed file cache size and the name cache size, if set too high, can cause NSS.NLM to take excessive amounts of memory. These can help performance but experience shows that there are usually several problems that add up to one big problem. Setting these two parms too high can actually degrade performance. If the server has about 2 gig or less, then the default of 100000 should be used.
  1. Make sure you have the latest updates for the tape software.
  2. Faster hardware can make a big difference.
  3. The type of data can make a huge difference. Lots of small files will slow down performance, especially if they're all in one directory. The backup will spend more time opening,scanning and closing files rather than reading data. If there are more large files mixed in with the smaller ones, then performance can increase because more time is spent reading data rather than opening files, which is what increases throughput.
  4. Background processes like compression, virus scans and large data copies will slow performance down.
  5. Virus scanners also can be an issue. They usually hook into the OS file system to intercept file opens so they can scan the files prior to backup. The virus scanner can be configured to run at some other time than the backup. This can also compound the problem if the files being scanned are compressed. The virus scanner can decompress them before scanning for viruses, which will slow things down even more. A good way to see if this is happening is to enable the NSS /COMPSCREEN at the server console during the backup to see if files are being decompressed.
  6. Lots of open files will slow down performance. These are usually seen with the error FFFDFFF5. This means the file is open by some other application. If the tape software can be configured to skip open files until the end of the job rather than retrying to open them immediately, then performance can be increased as some tape software solutions, by default, will retry to open the locked file multiple times before moving on.
  7. Backing up over the wire is slower than backups local to the server especially if most of the files are small files, 64k or less. If there is any LAN latency performance can take a significant hit. The wire is much slower at transferring data than reading the data directly from the disk. One thing that may help is to

    set tcp nagle algorithm=off
    set tcp delayed acknowledgement=off
    set tcp sack option=off

    on both host and target servers.

    tsatest can be used to determine if the lan is a bottleneck. There is more information about tsatest below.

- Make sure you have the latest disk drivers and firmware updates for your HBAs. There have been issues where performance was increase greatly because of later firmware/drivers.
- Use the tsatest.nlm utilitiy on different lan segments to see if there is a problem. This tool now ships with tsa5up19.exe.exe. Tsatest can be used to test the throughput on the wire and on the server itself to see if the lan could be a bottleneck. Tsatest is also useful because it does not require a tape drive, so the tape drive can be eliminated as a possible problem as well.
-Make sure you have the latest tsa files.

-Raid5 systems with a small stripe size can also be a problem. Check the configuration of the disk storage or san. If using a raid system, a larger stripe size can help performance.

-Creating one large LUN on the raid rather than several smaller ones can result in significant performance loss. It's faster to have multiple luns with the volumes/data spread out over them.

-Make sure you have the latest bios/firmware updates for your server.

-There have been issues where full backups are fast and incremental/differential backups are slow. This can happen because of the tape software doing its own filtering on inc/diff backups rather than letting the tsafs.nlm do it. There is a parm in tsafs.nlm that can help this:

LOAD TSAFS /NOCACHINGMODE

This will disable the read ahead cache for tsafs.nlm so that files are not cached unnecessarily during inc/diff backups. You can re-enable this cache when doing full backups:

LOAD TSAFS /CACHINGMODE

This is a load time parameter so you could create a script that would load/unload tsafs accordingly.

Tsafs can also be tuned as well. Once tsafs is loaded, typing tsafs again at the server console will show what most of the parameters are set for. If most of the data consists of small files, then make a best estimate as to what the mean file size is. That will help in determining what the best size of the read buffers should be. Tsafs could then be tuned to favor smaller files with the:

tsafs /ReadBufferSize=16384

That would set the read buffers for tsafs to 16k. If the mean file size is 16k or less, that would enable the tsafs to read the files with less read requests. Setting the nss cache balance to a lower percent would give tsafs more memory for caching files. If the mean file size is 64k or thereabouts, set the tsafs /readbuffersize=65536. The read buffers in the tape software could also be set to similar values.



tsafs /cachememorythreshold=5

may help as well. There have been problems with memory setting this value too high. 10 would be a good place to start. The recommended setting is 1 for servers that have memory fragmentation problems. If the server has more memory, then even a setting of 1 would give tsafs more memory to cache file data.

- On servers that have 4 or 2 processors, the tsafs /readthreadsperjob=x can be set to 2 or 4. On machines with only one processor, set the /readthreadsperjob=1. Setting the /readthreadsperjob too high will result in performance loss.

-Tsatest is a good tool for finding out where potential bottlenecks are. This is an nlm that can be loaded on the target server for a local backup, or from another NetWare server over the wire. It's a backup simulator that requires no special hardware, tape drives, databases, etc. By loading tsatest on the target server, the wire and tape software can be eliminated as potential bottlenecks. Throughput can be gauged and then a backup can be done over the wire to see if the lan could be slowing things down. For a complete listing of tsatest load line parameters, type tsatest /?. Usually it's loaded like this:

load tsatest /s= /u= /p= /v=
individual paths can be specified as well. By default, tsatest will do full backups. An incremental backup can be specified by adding the /c=2 parameter to the load line. The sys:\etc\tsatest.log file can be created with the /log parameter. This file can be sent to Novell for analysis.
Backup/restore performance can be reduced when backing up over the lan. Sometimes up to 1 half of the performance can be lost due to lan latency alone. Tsatest is a good way to determine if that's happening. Tests can be run on the target server itself and then the target server can be backed up over the wire from another NetWare server. The results can be compared.
For a good document on tsatest read:

https://developer.novell.com/ndk/doc/samplecode/smscomp_sample/tsatest/tsatest.html




.

Additional Information


Formerly known as TID# 10093351