Understanding Disk I/O in Relation to Retain Performance

  • 7018891
  • 08-May-2017
  • 23-May-2017

Environment

Retain 3.x, 4.x

Situation

I'm planning and designing a new Retain system, or I have an existing one that is slow.  Do storage design and disk I/O speed have anything to do with Retain performance?

Resolution

Storage design and disk I/O have everything to do with Retain performance, because archive jobs are I/O intensive.  The following processes write to disk simultaneously:
  • The indexer to the [storage path]/index
  • The database (if on the Retain server)
  • The Retain Server to [storage path]/archive
  • The Retain Server to the logs directory
    • Linux:  /var/log/retain-tomcat7
    • Windows:  [drive]:\Program Files\Beginfinite\Retain\Tomcat7\logs

With all of that disk activity, if a single spindle (drive) has to handle all of it, disk I/O becomes the performance bottleneck.  However, many disk systems these days use multiple disks in a RAID configuration (e.g., RAID 5 or RAID 10) that writes the data across multiple disks.  The more disks involved, the more the load is spread and, typically, the faster the disk performance.  Drive type (SATA, SAS, or SSD) also makes a difference, as does whether the disks are local to the server or in a SAN/NAS.
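
As a rough first check of whether these directories share storage, the following Python sketch groups paths by the device ID that the operating system reports for them. The paths shown are hypothetical placeholders, not Retain defaults, and a shared device ID only indicates a shared filesystem or volume; it says nothing about how many physical disks sit underneath it.

```python
import os
from collections import defaultdict


def group_paths_by_device(paths):
    """Group existing paths by the device ID of the filesystem holding them."""
    groups = defaultdict(list)
    for path in paths:
        # st_dev identifies the filesystem (volume), not the physical spindle.
        groups[os.stat(path).st_dev].append(path)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical locations: substitute your own storage path and log directory.
    candidates = [
        "/path/to/storage/index",
        "/path/to/storage/archive",
        "/path/to/retain/logs",
    ]
    existing = [p for p in candidates if os.path.exists(p)]
    for device, paths in group_paths_by_device(existing).items():
        print(f"device {device}: {', '.join(paths)}")
```

If every path lands in the same group, the index, archive, and logs are competing for the same volume, which is the situation described above.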

RAID Considerations

Let's say your server employs RAID 5 with 4 disks, which provides more usable capacity than, say, RAID 10 on the same disks.  RAID 5 stores parity that consumes one disk's worth of capacity (the parity is distributed across all of the drives), which leaves the equivalent of 3 drives for data.  If one drive becomes unavailable, the array runs in a degraded state on the remaining 3 and has to reconstruct the missing data from parity.  Striping across so few drives doesn't lend itself to great speed, especially if the disks are lower-end SATA drives.
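
The capacity arithmetic behind that example can be sketched in a few lines of Python. This is only an illustration of the usual rules of thumb (RAID 5 gives up one disk's worth of capacity to parity, RAID 10 gives up half to mirroring); the function name is hypothetical and real controllers may differ.

```python
def data_disk_equivalents(level, disks):
    """Return how many disks' worth of capacity hold data rather than redundancy."""
    if level == "raid0":
        return disks  # pure striping, no redundancy
    if level == "raid5":
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return disks - 1  # one disk's worth of distributed parity
    if level == "raid10":
        if disks < 4 or disks % 2:
            raise ValueError("RAID 10 needs an even number of disks, at least 4")
        return disks // 2  # every disk is mirrored
    raise ValueError(f"unsupported level: {level}")


print(data_disk_equivalents("raid5", 4))   # 3
print(data_disk_equivalents("raid10", 4))  # 2
```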

SAN / NAS Considerations

If the storage is on a SAN/NAS, you also need to look at the network link speed.  You could have very fast drives, but if your link speed is 1 Gb/s, your bottleneck is going to be your link.

A 1 Gb/s network link is slower than a SATA 2 or SATA 3 connection (AKA SATA 3 Gb/s and SATA 6 Gb/s).  A SATA 2 connection, which is now a pretty old standard, is 3x faster than a 1 Gb/s (1000 Mb/s) network link.  With sequential reads/writes, a single fast HDD can saturate a 1 Gb/s connection but not quite a 3 Gb/s connection (SATA 2.0, or SATA 3 Gb/s).  7,200 RPM platter drives usually top out around 160-170 MB/s (or 1.28-1.36 Gb/s).
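
The unit conversion behind those figures (megabytes per second to gigabits per second) is easy to get wrong, so here is a small Python sketch of it. It uses decimal units and raw line rates only, ignoring protocol and encoding overhead, and the function names are illustrative.

```python
def mb_per_s_to_gbit_per_s(mb_per_s):
    """Convert megabytes per second to gigabits per second (decimal units)."""
    return mb_per_s * 8 / 1000


def bottleneck(drive_mb_per_s, link_gbit_per_s):
    """Return 'link' if the drive can outrun the link, otherwise 'drive'."""
    drive_gbit_per_s = mb_per_s_to_gbit_per_s(drive_mb_per_s)
    return "link" if drive_gbit_per_s > link_gbit_per_s else "drive"


print(f"{mb_per_s_to_gbit_per_s(160):.2f} Gb/s")  # 1.28 Gb/s
print(f"{mb_per_s_to_gbit_per_s(170):.2f} Gb/s")  # 1.36 Gb/s
print(bottleneck(170, 1.0))  # link
print(bottleneck(170, 3.0))  # drive
```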

Measuring Disk Performance

Disk performance is largely a matter of IOPS (I/O operations per second).  Here is a very simple IOPS calculator: http://www.thecloudcalculator.com/calculators/disk-raid-and-iops.html (or you can find one of your own).
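
For a sense of what such a calculator does, the Python sketch below applies one common rule-of-thumb formula: the raw IOPS of all disks divided by a workload-weighted RAID write penalty. The penalty values (RAID 0 = 1, RAID 10 = 2, RAID 5 = 4, RAID 6 = 6) are the usual textbook figures, and the per-disk figure of 75 IOPS is an assumed placeholder for a 7,200 RPM drive, not a measured value. It is not necessarily the formula used by the calculator linked above.

```python
# Back-end writes generated per front-end write (textbook rule of thumb).
WRITE_PENALTY = {"raid0": 1, "raid10": 2, "raid5": 4, "raid6": 6}


def estimated_iops(level, disks, iops_per_disk, write_fraction):
    """Estimate front-end IOPS for a RAID set under a given read/write mix."""
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    raw_iops = disks * iops_per_disk
    read_fraction = 1.0 - write_fraction
    # Each read costs one back-end I/O; each write costs `penalty` back-end I/Os.
    return raw_iops / (read_fraction + write_fraction * WRITE_PENALTY[level])


# Assumed example: 4 disks at 75 IOPS each, workload that is 50% writes.
print(f"RAID 5:  {estimated_iops('raid5', 4, 75, 0.5):.0f}")   # RAID 5:  120
print(f"RAID 10: {estimated_iops('raid10', 4, 75, 0.5):.0f}")  # RAID 10: 200
```

Treat the result as a planning estimate only; controller cache, queue depth, and block size all shift real-world numbers.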

Ultimately, you need to understand your underlying disk storage.  This article just gives food for thought.  If you are running Retain on a VM guest server, as most customers do, then you also need to understand your VM host and VM infrastructure.  Is the Retain storage viewed by the server OS running on the VM guest as "local" storage?  If so, what type of disk system is holding your VM's datastore?  If it is not local storage and the server is connecting to external storage, then you need to take a look at the external system's configuration.

Bottom line:  Disk I/O performance is key to Retain's performance and there are several areas to investigate where the bottlenecks could be.