HP OpenView Performance Agent for SUN
Dictionary of Operating System Performance Metrics

Print Date 09/2005
OVPA for SUN Release C.04.50
© Copyright 2005 Hewlett-Packard Development Company, L.P. All rights reserved.

***************************************************************

Introduction
============

This dictionary contains definitions of the SUN operating system performance metrics for HP OpenView Performance Agent. This document is divided into the following sections:

* "Metric Names by Data Class," which lists the metrics alphabetically by data class. Use these metric names when exporting data with the extract utility. You can also use them when defining alarm conditions in your alarmdef file.

* "Metric Definitions," which describes each metric in alphabetical order.

Note that the metric help text is written in a generic format and includes references to the other platforms that also support each metric.

Metric Names by Data Class
==========================

SunOS Global Metrics
--------------------

BLANK
DATE
DATE_SECONDS
DAY
INTERVAL
RECORD_TYPE
TIME
YEAR
GBL_ACTIVE_CPU
GBL_ACTIVE_PROC
GBL_ALIVE_PROC
GBL_COMPLETED_PROC
GBL_CPU_IDLE_TIME
GBL_CPU_IDLE_UTIL
GBL_CPU_NICE_TIME
GBL_CPU_NICE_UTIL
GBL_CPU_SYS_MODE_TIME
GBL_CPU_SYS_MODE_UTIL
GBL_CPU_TOTAL_TIME
GBL_CPU_TOTAL_UTIL
GBL_CPU_USER_MODE_TIME
GBL_CPU_USER_MODE_UTIL
GBL_CSWITCH_RATE
GBL_DISK_PHYS_BYTE
GBL_DISK_PHYS_BYTE_RATE
GBL_DISK_PHYS_IO
GBL_DISK_PHYS_IO_RATE
GBL_DISK_PHYS_READ
GBL_DISK_PHYS_READ_BYTE_RATE
GBL_DISK_PHYS_READ_RATE
GBL_DISK_PHYS_WRITE
GBL_DISK_PHYS_WRITE_BYTE_RATE
GBL_DISK_PHYS_WRITE_RATE
GBL_DISK_TIME_PEAK
GBL_DISK_UTIL_PEAK
GBL_FS_SPACE_UTIL_PEAK
GBL_INTERRUPT
GBL_INTERRUPT_RATE
GBL_INTERVAL
GBL_LOST_MI_TRACE_BUFFERS
GBL_MEM_FREE_UTIL
GBL_MEM_PAGEOUT
GBL_MEM_PAGEOUT_BYTE
GBL_MEM_PAGEOUT_BYTE_RATE
GBL_MEM_PAGEOUT_RATE
GBL_MEM_PAGE_REQUEST
GBL_MEM_PAGE_REQUEST_RATE
GBL_MEM_SWAPIN_BYTE_RATE
GBL_MEM_SWAPOUT_BYTE_RATE
GBL_MEM_SYS_UTIL
GBL_MEM_USER_UTIL
GBL_MEM_UTIL
GBL_NET_COLLISION_1_MIN_RATE
GBL_NET_COLLISION_PCT
GBL_NET_COLLISION_RATE
GBL_NET_ERROR_1_MIN_RATE
GBL_NET_ERROR_RATE
GBL_NET_IN_ERROR_PCT
GBL_NET_IN_ERROR_RATE
GBL_NET_IN_PACKET
GBL_NET_IN_PACKET_RATE
GBL_NET_OUT_ERROR_PCT
GBL_NET_OUT_ERROR_RATE
GBL_NET_OUT_PACKET
GBL_NET_OUT_PACKET_RATE
GBL_NET_PACKET_RATE
GBL_NFS_CALL
GBL_NFS_CALL_RATE
GBL_NUM_DISK
GBL_NUM_NETWORK
GBL_NUM_USER
GBL_PROC_SAMPLE
GBL_RUN_QUEUE
GBL_STARTED_PROC
GBL_STARTED_PROC_RATE
GBL_STATTIME
GBL_SWAP_SPACE_UTIL
GBL_SYSTEM_UPTIME_HOURS
GBL_SYSTEM_UPTIME_SECONDS
GBL_TT_OVERFLOW_COUNT
TBL_FILE_LOCK_USED
TBL_FILE_TABLE_UTIL
TBL_INODE_CACHE_USED
TBL_MSG_TABLE_USED
TBL_MSG_TABLE_UTIL
TBL_SEM_TABLE_USED
TBL_SEM_TABLE_UTIL
TBL_SHMEM_ACTIVE
TBL_SHMEM_TABLE_USED
TBL_SHMEM_TABLE_UTIL
TBL_SHMEM_USED

SunOS Application Metrics
--------------------

BLANK
DATE
DATE_SECONDS
DAY
INTERVAL
RECORD_TYPE
TIME
YEAR
APP_ACTIVE_PROC
APP_ALIVE_PROC
APP_COMPLETED_PROC
APP_CPU_SYS_MODE_TIME
APP_CPU_SYS_MODE_UTIL
APP_CPU_TOTAL_TIME
APP_CPU_TOTAL_UTIL
APP_CPU_USER_MODE_TIME
APP_CPU_USER_MODE_UTIL
APP_MAJOR_FAULT_RATE
APP_MEM_UTIL
APP_MEM_VIRT
APP_MINOR_FAULT_RATE
APP_NAME
APP_NUM
APP_PRI
APP_PROC_RUN_TIME
APP_SAMPLE

SunOS Process Metrics
--------------------

BLANK
DATE
DATE_SECONDS
DAY
INTERVAL
RECORD_TYPE
TIME
YEAR
PROC_APP_ID
PROC_CPU_SYS_MODE_TIME
PROC_CPU_SYS_MODE_UTIL
PROC_CPU_TOTAL_TIME
PROC_CPU_TOTAL_TIME_CUM
PROC_CPU_TOTAL_UTIL
PROC_CPU_TOTAL_UTIL_CUM
PROC_CPU_USER_MODE_TIME
PROC_CPU_USER_MODE_UTIL
PROC_GROUP_ID
PROC_INTEREST
PROC_INTERVAL_ALIVE
PROC_MAJOR_FAULT
PROC_MEM_RES
PROC_MEM_VIRT
PROC_MINOR_FAULT
PROC_PAGEFAULT
PROC_PAGEFAULT_RATE
PROC_PARENT_PROC_ID
PROC_PRI
PROC_PROC_ARGV1
PROC_PROC_ID
PROC_PROC_NAME
PROC_RUN_TIME
PROC_STOP_REASON
PROC_THREAD_COUNT
PROC_TTY
PROC_USER_NAME

SunOS Transaction Metrics
--------------------

BLANK
DATE
DATE_SECONDS
DAY
INTERVAL
RECORD_TYPE
TIME
YEAR
TTBIN_TRANS_COUNT_1
TTBIN_TRANS_COUNT_10
TTBIN_TRANS_COUNT_2
TTBIN_TRANS_COUNT_3
TTBIN_TRANS_COUNT_4
TTBIN_TRANS_COUNT_5
TTBIN_TRANS_COUNT_6
TTBIN_TRANS_COUNT_7
TTBIN_TRANS_COUNT_8
TTBIN_TRANS_COUNT_9
TTBIN_UPPER_RANGE_1
TTBIN_UPPER_RANGE_10
TTBIN_UPPER_RANGE_2
TTBIN_UPPER_RANGE_3
TTBIN_UPPER_RANGE_4
TTBIN_UPPER_RANGE_5
TTBIN_UPPER_RANGE_6
TTBIN_UPPER_RANGE_7
TTBIN_UPPER_RANGE_8
TTBIN_UPPER_RANGE_9
TT_ABORT
TT_ABORT_WALL_TIME_PER_TRAN
TT_APP_NAME
TT_APP_TRAN_NAME
TT_CLIENT_ADDRESS
TT_CLIENT_ADDRESS_FORMAT
TT_CLIENT_TRAN_ID
TT_COUNT
TT_FAILED
TT_INFO
TT_NAME
TT_NUM_BINS
TT_SLO_COUNT
TT_SLO_PERCENT
TT_SLO_THRESHOLD
TT_TRAN_1_MIN_RATE
TT_TRAN_ID
TT_UNAME
TT_USER_MEASUREMENT_AVG
TT_USER_MEASUREMENT_AVG_2
TT_USER_MEASUREMENT_AVG_3
TT_USER_MEASUREMENT_AVG_4
TT_USER_MEASUREMENT_AVG_5
TT_USER_MEASUREMENT_AVG_6
TT_USER_MEASUREMENT_MAX
TT_USER_MEASUREMENT_MAX_2
TT_USER_MEASUREMENT_MAX_3
TT_USER_MEASUREMENT_MAX_4
TT_USER_MEASUREMENT_MAX_5
TT_USER_MEASUREMENT_MAX_6
TT_USER_MEASUREMENT_MIN
TT_USER_MEASUREMENT_MIN_2
TT_USER_MEASUREMENT_MIN_3
TT_USER_MEASUREMENT_MIN_4
TT_USER_MEASUREMENT_MIN_5
TT_USER_MEASUREMENT_MIN_6
TT_USER_MEASUREMENT_NAME
TT_USER_MEASUREMENT_NAME_2
TT_USER_MEASUREMENT_NAME_3
TT_USER_MEASUREMENT_NAME_4
TT_USER_MEASUREMENT_NAME_5
TT_USER_MEASUREMENT_NAME_6
TT_WALL_TIME_PER_TRAN

SunOS Disk Metrics
--------------------

BLANK
DATE
DATE_SECONDS
DAY
INTERVAL
RECORD_TYPE
TIME
YEAR
BYDSK_AVG_SERVICE_TIME
BYDSK_DEVNAME
BYDSK_PHYS_BYTE
BYDSK_PHYS_BYTE_RATE
BYDSK_PHYS_IO
BYDSK_PHYS_IO_RATE
BYDSK_PHYS_READ
BYDSK_PHYS_READ_BYTE
BYDSK_PHYS_READ_BYTE_RATE
BYDSK_PHYS_READ_RATE
BYDSK_PHYS_WRITE
BYDSK_PHYS_WRITE_BYTE
BYDSK_PHYS_WRITE_BYTE_RATE
BYDSK_PHYS_WRITE_RATE
BYDSK_REQUEST_QUEUE
BYDSK_UTIL

SunOS Network Interface Metrics
--------------------

BLANK
DATE
DATE_SECONDS
DAY
INTERVAL
RECORD_TYPE
TIME
YEAR
BYNETIF_COLLISION
BYNETIF_COLLISION_RATE
BYNETIF_ERROR
BYNETIF_ERROR_RATE
BYNETIF_ID
BYNETIF_IN_BYTE_RATE
BYNETIF_IN_PACKET
BYNETIF_IN_PACKET_RATE
BYNETIF_NAME
BYNETIF_OUT_BYTE_RATE
BYNETIF_OUT_PACKET
BYNETIF_OUT_PACKET_RATE

SunOS CPU Metrics
--------------------

BLANK
DATE
DATE_SECONDS
DAY
INTERVAL
RECORD_TYPE
TIME
YEAR
BYCPU_CPU_CLOCK
BYCPU_CPU_SYS_MODE_TIME
BYCPU_CPU_SYS_MODE_UTIL
BYCPU_CPU_TOTAL_TIME
BYCPU_CPU_TOTAL_UTIL
BYCPU_CPU_USER_MODE_TIME
BYCPU_CPU_USER_MODE_UTIL
BYCPU_ID
BYCPU_INTERRUPT
BYCPU_INTERRUPT_RATE
BYCPU_STATE

SunOS Filesystem Metrics
--------------------

BLANK
DATE
DATE_SECONDS
DAY
INTERVAL
RECORD_TYPE
TIME
YEAR
FS_BLOCK_SIZE
FS_DEVNAME
FS_DIRNAME
FS_FRAG_SIZE
FS_INODE_UTIL
FS_MAX_INODES
FS_MAX_SIZE
FS_SPACE_RESERVED
FS_SPACE_USED
FS_SPACE_UTIL
FS_TYPE

SunOS Configuration Metrics
--------------------

BLANK
DATE
DATE_SECONDS
DAY
INTERVAL
RECORD_TYPE
TIME
YEAR
GBL_BOOT_TIME
GBL_COLLECTOR
GBL_CPU_CLOCK
GBL_GMTOFFSET
GBL_JAVAARG
GBL_LOGFILE_VERSION
GBL_LOGGING_TYPES
GBL_MACHINE
GBL_MACHINE_MODEL
GBL_MEM_AVAIL
GBL_MEM_PHYS
GBL_NUM_CPU
GBL_OSNAME
GBL_OSRELEASE
GBL_OSVERSION
GBL_SUBPROCSAMPLEINTERVAL
GBL_SWAP_SPACE_AVAIL
GBL_SWAP_SPACE_AVAIL_KB
GBL_SYSTEM_ID
GBL_THRESHOLD_CPU
GBL_THRESHOLD_DISK
GBL_THRESHOLD_NOKILLED
GBL_THRESHOLD_NONEW
GBL_THRESHOLD_PROCMEM
TBL_FILE_TABLE_AVAIL
TBL_INODE_CACHE_AVAIL
TBL_MSG_TABLE_AVAIL
TBL_SEM_TABLE_AVAIL
TBL_SHMEM_TABLE_AVAIL

METRIC DEFINITIONS
--------------------

APP_ACTIVE_PROC
--------------------

An active process is one that exists and consumes some CPU time. APP_ACTIVE_PROC is the sum of the alive-process-time/interval-time ratios of every process belonging to an application that is active (uses any CPU time) during an interval.

The following diagram of a four-second interval showing two processes, A and B, for an application illustrates this definition. Note the difference between active processes, which consume CPU time, and alive processes, which merely exist on the system.

          ----------- Seconds -----------
Proc         1         2         3      4
----      ----      ----      ----   ----
A         live      live      live   live
B         live/CPU  live/CPU  live   dead

Process A is alive for the entire four-second interval, but consumes no CPU. A's contribution to APP_ALIVE_PROC is 4*1/4. A contributes 0*1/4 to APP_ACTIVE_PROC.
B's contribution to APP_ALIVE_PROC is 3*1/4. B contributes 2*1/4 to APP_ACTIVE_PROC.

Thus, for this interval, APP_ACTIVE_PROC equals 0.5 and APP_ALIVE_PROC equals 1.75.

Because a process may be alive but not active, APP_ACTIVE_PROC will always be less than or equal to APP_ALIVE_PROC.

This metric indicates the number of processes in an application group that are competing for the CPU. It is useful, along with other metrics, for comparing the loads placed on the system by different groups of processes.

On non-HP-UX systems, this metric is derived from sampled process data. Since the data for a process is not available after the process has died on this operating system, a process whose life is shorter than the sampling interval may not be seen when the samples are taken. Thus this metric may be slightly less than the actual value. Increasing the sampling frequency captures a more accurate count, but the overhead of collection may also rise.

APP_ALIVE_PROC
--------------------

An alive process is one that exists on the system. APP_ALIVE_PROC is the sum of the alive-process-time/interval-time ratios for every process belonging to a given application.

The following diagram of a four-second interval showing two processes, A and B, for an application illustrates this definition. Note the difference between active processes, which consume CPU time, and alive processes, which merely exist on the system.

          ----------- Seconds -----------
Proc         1         2         3      4
----      ----      ----      ----   ----
A         live      live      live   live
B         live/CPU  live/CPU  live   dead

Process A is alive for the entire four-second interval but consumes no CPU. A's contribution to APP_ALIVE_PROC is 4*1/4. A contributes 0*1/4 to APP_ACTIVE_PROC.

B's contribution to APP_ALIVE_PROC is 3*1/4. B contributes 2*1/4 to APP_ACTIVE_PROC.

Thus, for this interval, APP_ACTIVE_PROC equals 0.5 and APP_ALIVE_PROC equals 1.75.
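The worked example above can be checked with a short sketch (illustrative only, not OVPA source code); the per-process alive and CPU seconds are taken from the diagram:

```python
# Sketch: APP_ALIVE_PROC and APP_ACTIVE_PROC contributions for the
# four-second example interval. Each process is described by the seconds
# it was alive and the seconds in which it consumed CPU.
INTERVAL = 4  # seconds

processes = {
    "A": {"alive_seconds": 4, "cpu_seconds": 0},
    "B": {"alive_seconds": 3, "cpu_seconds": 2},
}

# APP_ALIVE_PROC sums alive-time/interval-time over every process.
app_alive_proc = sum(p["alive_seconds"] / INTERVAL for p in processes.values())

# APP_ACTIVE_PROC counts only the portion of the interval in which a
# process actually consumed CPU (B is live/CPU for 2 of the 4 seconds).
app_active_proc = sum(p["cpu_seconds"] / INTERVAL
                      for p in processes.values() if p["cpu_seconds"] > 0)

print(app_alive_proc)   # 1.75
print(app_active_proc)  # 0.5
```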
Because a process may be alive but not active, APP_ACTIVE_PROC will always be less than or equal to APP_ALIVE_PROC.

On non-HP-UX systems, this metric is derived from sampled process data. Since the data for a process is not available after the process has died on this operating system, a process whose life is shorter than the sampling interval may not be seen when the samples are taken. Thus this metric may be slightly less than the actual value. Increasing the sampling frequency captures a more accurate count, but the overhead of collection may also rise.

APP_COMPLETED_PROC
--------------------

The number of processes in this group that completed during the interval.

On non-HP-UX systems, this metric is derived from sampled process data. Since the data for a process is not available after the process has died on this operating system, a process whose life is shorter than the sampling interval may not be seen when the samples are taken. Thus this metric may be slightly less than the actual value. Increasing the sampling frequency captures a more accurate count, but the overhead of collection may also rise.

APP_CPU_SYS_MODE_TIME
--------------------

The time, in seconds, during the interval that the CPU was in system mode for processes in this group.

A process operates in either system mode (also called kernel mode on Unix or privileged mode on Windows) or user mode. When a process requests services from the operating system with a system call, it switches into the machine's privileged protection mode and runs in system mode.

On a system with multiple CPUs, this metric is normalized. That is, the CPU used over all processors is divided by the number of processors online. This represents the usage of the total processing capacity available.

APP_CPU_SYS_MODE_UTIL
--------------------

The percentage of time during the interval that the CPU was used in system mode for processes in this group.
A process operates in either system mode (also called kernel mode on Unix or privileged mode on Windows) or user mode. When a process requests services from the operating system with a system call, it switches into the machine's privileged protection mode and runs in system mode.

On a system with multiple CPUs, this metric is normalized. That is, the CPU used over all processors is divided by the number of processors online. This represents the usage of the total processing capacity available.

High system CPU utilizations are normal for IO-intensive groups. Abnormally high system CPU utilization can indicate that a hardware problem is causing a high interrupt rate. It can also indicate programs that are not making efficient system calls.

APP_CPU_TOTAL_TIME
--------------------

The total CPU time, in seconds, devoted to processes in this group during the interval.

On a system with multiple CPUs, this metric is normalized. That is, the CPU used over all processors is divided by the number of processors online. This represents the usage of the total processing capacity available.

APP_CPU_TOTAL_UTIL
--------------------

The percentage of the total CPU time devoted to processes in this group during the interval. This indicates the relative CPU load placed on the system by processes in this group.

On a system with multiple CPUs, this metric is normalized. That is, the CPU used over all processors is divided by the number of processors online. This represents the usage of the total processing capacity available.

Large values for this metric may indicate that this group is causing a CPU bottleneck. This would be normal in a computation-bound workload, but might mean that processes are using excessive CPU time and perhaps looping. If the “other” application shows significant amounts of CPU, you may want to consider tuning your parm file so that process activity is accounted for in known applications.
APP_CPU_TOTAL_UTIL = APP_CPU_SYS_MODE_UTIL + APP_CPU_USER_MODE_UTIL

NOTE: On Windows, the sum of the APP_CPU_TOTAL_UTIL metrics may not equal GBL_CPU_TOTAL_UTIL. Microsoft states that “this is expected behavior” because the GBL_CPU_TOTAL_UTIL metric is taken from the NT performance library Processor objects, while the APP_CPU_TOTAL_UTIL metrics are taken from the Process objects. Microsoft states that there can be CPU time accounted for in the Processor system objects that may not be seen in the Process objects.

APP_CPU_USER_MODE_TIME
--------------------

The time, in seconds, that processes in this group were in user mode during the interval.

User CPU is the time spent in user mode at a normal priority, at real-time priority (on HP-UX, AIX, and Windows systems), and at a nice priority.

On a system with multiple CPUs, this metric is normalized. That is, the CPU used over all processors is divided by the number of processors online. This represents the usage of the total processing capacity available.

APP_CPU_USER_MODE_UTIL
--------------------

The percentage of time that processes in this group were using the CPU in user mode during the interval.

User CPU is the time spent in user mode at a normal priority, at real-time priority (on HP-UX, AIX, and Windows systems), and at a nice priority.

High user mode CPU percentages are normal for computation-intensive groups. Low values of user CPU utilization compared to relatively high values for APP_CPU_SYS_MODE_UTIL can indicate a hardware problem or improperly tuned programs in this group.

On a system with multiple CPUs, this metric is normalized. That is, the CPU used over all processors is divided by the number of processors online. This represents the usage of the total processing capacity available.

APP_MAJOR_FAULT_RATE
--------------------

The number of major page faults per second that required a disk IO for processes in this group during the interval.
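The multiprocessor normalization and the APP_CPU_TOTAL_UTIL identity described above can be illustrated with a small sketch. This is assumed arithmetic, not OVPA source code; the interval length, processor count, and CPU-second figures are invented:

```python
# Sketch (assumed arithmetic): converting per-application CPU seconds into
# normalized utilization percentages on a multiprocessor. As described
# above, CPU time summed over all processors is divided by the number of
# processors online, so 100% represents the total processing capacity.
INTERVAL = 60.0  # seconds in the measurement interval
NUM_CPU = 4      # processors online

sys_mode_seconds = 30.0   # system-mode CPU summed over all processors
user_mode_seconds = 90.0  # user-mode CPU summed over all processors

app_cpu_sys_mode_util = 100.0 * sys_mode_seconds / (INTERVAL * NUM_CPU)
app_cpu_user_mode_util = 100.0 * user_mode_seconds / (INTERVAL * NUM_CPU)

# APP_CPU_TOTAL_UTIL = APP_CPU_SYS_MODE_UTIL + APP_CPU_USER_MODE_UTIL
app_cpu_total_util = app_cpu_sys_mode_util + app_cpu_user_mode_util

print(app_cpu_sys_mode_util)   # 12.5
print(app_cpu_user_mode_util)  # 37.5
print(app_cpu_total_util)      # 50.0
```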
APP_MEM_UTIL
--------------------

On Unix systems, this is the approximate percentage of the system's physical memory used as resident memory by processes in this group that were alive at the end of the interval. This metric summarizes process private and shared memory in each application.

On Windows, this is an estimate of the percentage of the system's physical memory allocated for working set memory by processes in this group during the interval.

On HP-UX, this consists of text, data, and stack, as well as the process's portion of shared memory regions (such as shared libraries, text segments, and shared data). The sum of the shared region pages is typically divided by the number of references.

On Unix systems, each application's total resident memory is summed. This value is then divided by the summed total of all applications' resident memory and multiplied by the ratio of available user memory to total physical memory to arrive at a calculated percentage of the total physical memory. It must be remembered, however, that this is a calculated metric that shows the approximate percentage of the physical memory used as resident memory by the processes in this application during the interval.

On Windows, the sum of the working set sizes for each process in this group is kept as APP_MEM_RES. This value is divided by the sum of APP_MEM_RES for all applications defined on the system to come up with a ratio of this application's working set size to the total. This value is then multiplied by the ratio of available user memory to total physical memory to arrive at a calculated percent of total physical memory.

APP_MEM_VIRT
--------------------

On Unix systems, this is the sum (in KB) of virtual memory for processes in this group that were alive at the end of the interval. This consists of text, data, stack, and shared memory regions.
On HP-UX, since PROC_MEM_VIRT typically takes shared region references into account, this approximates the total virtual memory consumed by all processes in this group.

On all other Unix systems, this is the sum of the virtual memory region sizes for all processes in this group. When the virtual memory size for processes includes shared regions, such as shared memory and library text and data, the shared regions are counted multiple times in this sum. For example, if the application contains four processes that are attached to a 500MB shared memory region, then 2000MB is reported in this metric. As such, this metric can overestimate the virtual memory being used by processes in this group when they share memory regions.

On Windows, this is the sum (in KB) of paging file space used for all processes in this group during the interval. Groups of processes may have working set sizes (APP_MEM_RES) larger than the size of their pagefile space.

APP_MINOR_FAULT_RATE
--------------------

The number of minor page faults per second satisfied in memory (pages were reclaimed from one of the free lists) for processes in this group during the interval.

APP_NAME
--------------------

The name of the application (up to 20 characters). This comes from the parm file where the applications are defined.

The application called “other” captures all processes not aggregated into applications specifically defined in the parm file. In other words, if no applications are defined in the parm file, then all process data would be reflected in the “other” application.

APP_NUM
--------------------

The sequentially assigned number of this application.

APP_PRI
--------------------

On Unix systems, this is the average priority of the processes in this group during the interval.

On Windows, this is the average base priority of the processes in this group during the interval.

APP_PROC_RUN_TIME
--------------------

The average run time for processes in this group that completed during the interval.
On non-HP-UX systems, this metric is derived from sampled process data. Since the data for a process is not available after the process has died on this operating system, a process whose life is shorter than the sampling interval may not be seen when the samples are taken. Thus this metric may be slightly less than the actual value. Increasing the sampling frequency captures a more accurate count, but the overhead of collection may also rise.

APP_SAMPLE
--------------------

The number of samples of process data that have been averaged or accumulated during this sample.

BLANK
--------------------

An empty field used for spacing reports. For example, this field can be used to create a blank column in a spreadsheet that may be used to sum several items.

BYCPU_CPU_CLOCK
--------------------

The clock speed, in MHz, of the CPU in the current slot.

BYCPU_CPU_SYS_MODE_TIME
--------------------

The time, in seconds, that this CPU was in system mode during the interval.

A process operates in either system mode (also called kernel mode on Unix or privileged mode on Windows) or user mode. When a process requests services from the operating system with a system call, it switches into the machine's privileged protection mode and runs in system mode.

BYCPU_CPU_SYS_MODE_UTIL
--------------------

The percentage of time that this CPU was in system mode during the interval.

A process operates in either system mode (also called kernel mode on Unix or privileged mode on Windows) or user mode. When a process requests services from the operating system with a system call, it switches into the machine's privileged protection mode and runs in system mode.

BYCPU_CPU_TOTAL_TIME
--------------------

The total time, in seconds, that this CPU was not idle during the interval.

BYCPU_CPU_TOTAL_UTIL
--------------------

The percentage of time that this CPU was not idle during the interval.
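The per-CPU time and utilization metrics above can be related with a small sketch. This is assumed arithmetic for illustration, not OVPA source code; for simplicity it treats busy time as system-mode plus user-mode time only, and all figures are invented. Note that, unlike the APP_CPU_* metrics, each CPU here is reported on its own rather than normalized across processors:

```python
# Sketch (assumed arithmetic): deriving BYCPU_CPU_TOTAL_TIME and
# BYCPU_CPU_TOTAL_UTIL for a single CPU from its busy seconds in the
# interval. Busy time is simplified to system-mode + user-mode seconds.
INTERVAL = 60.0  # seconds in the measurement interval

sys_mode_time = 6.0    # BYCPU_CPU_SYS_MODE_TIME, seconds
user_mode_time = 24.0  # BYCPU_CPU_USER_MODE_TIME, seconds

bycpu_cpu_total_time = sys_mode_time + user_mode_time        # not-idle time
bycpu_cpu_total_util = 100.0 * bycpu_cpu_total_time / INTERVAL

print(bycpu_cpu_total_time)  # 30.0
print(bycpu_cpu_total_util)  # 50.0
```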
BYCPU_CPU_USER_MODE_TIME
--------------------

The time, in seconds, during the interval that this CPU was in user mode.

User CPU is the time spent in user mode at a normal priority, at real-time priority (on HP-UX, AIX, and Windows systems), and at a nice priority.

BYCPU_CPU_USER_MODE_UTIL
--------------------

The percentage of time that this CPU was in user mode during the interval.

User CPU is the time spent in user mode at a normal priority, at real-time priority (on HP-UX, AIX, and Windows systems), and at a nice priority.

BYCPU_ID
--------------------

The ID number of this CPU. On some Unix systems, such as SunOS, CPUs are not sequentially numbered.

BYCPU_INTERRUPT
--------------------

The number of device interrupts for this CPU during the interval.

BYCPU_INTERRUPT_RATE
--------------------

The average number of device interrupts per second for this CPU during the interval.

On HP-UX, a value of “na” is displayed on a system with multiple CPUs.

BYCPU_STATE
--------------------

A text string indicating the current state of a processor.

On HP-UX, this is either “Enabled”, “Disabled” or “Unknown”.

On AIX, this is either “Idle/Offline” or “Online”.

On all other systems, this is either “Offline”, “Online” or “Unknown”.

BYDSK_AVG_SERVICE_TIME
--------------------

The average time, in milliseconds, that this disk device spent processing each disk request during the interval. For example, a value of 5.14 would indicate that disk requests during the last interval took on average slightly longer than five one-thousandths of a second to complete for this device.

Some Linux kernels, typically 2.2 and older kernels, do not support the instrumentation needed to provide values for this metric. This metric will be “na” on the affected kernels. The “sar -d” command will also not be present on these systems. Distributions and OS releases that are known to be affected include: TurboLinux 7, SuSE 7.2, and Debian 3.0.
This is a measure of the speed of the disk, because slower disk devices typically show a larger average service time. Average service time is also dependent on factors such as the distribution of IO requests over the interval and their locality. It can also be influenced by disk driver and controller features such as IO merging and command queueing. Note that this service time is measured from the perspective of the kernel, not the disk device itself. For example, if a disk device can find the requested data in its cache, the average service time could be quicker than the speed of the physical disk hardware.

This metric can be used to help determine which disk devices are taking more time than usual to process requests.

BYDSK_DEVNAME
--------------------

The name of this disk device.

On HP-UX, the name identifying the specific disk spindle is the hardware path, which specifies the address of the hardware components leading to the disk device.

On SunOS, these names are the same disk names displayed by “iostat”.

On AIX, this is the path name string of this disk device. This is the fsname parameter in the mount(1M) command. If more than one file system is contained on a device (that is, the device is partitioned), this is indicated by an asterisk (“*”) at the end of the path name.

On OSF1, this is the path name string of this disk device. This is the file-system parameter in the mount(1M) command.

On Windows, this is the unit number of this disk device.

BYDSK_PHYS_BYTE
--------------------

The number of KBs of physical IO transferred to or from this disk device during the interval.

On Unix systems, all types of physical disk IOs are counted, including file system, virtual memory, and raw IO.

On SunOS systems, this metric is only available on SunOS 5.X or later.

BYDSK_PHYS_BYTE_RATE
--------------------

The average KBs per second transferred to or from this disk device during the interval.
On Unix systems, all types of physical disk IOs are counted, including file system, virtual memory, and raw IO.

BYDSK_PHYS_IO
--------------------

The number of physical IOs for this disk device during the interval.

On Unix systems, all types of physical disk IOs are counted, including file system, virtual memory, and raw reads.

BYDSK_PHYS_IO_RATE
--------------------

The average number of physical IO requests per second for this disk device during the interval.

On Unix systems, all types of physical disk IOs are counted, including file system IO, virtual memory, and raw IO.

BYDSK_PHYS_READ
--------------------

The number of physical reads for this disk device during the interval.

On Unix systems, all types of physical disk reads are counted, including file system, virtual memory, and raw reads.

On AIX, this is an estimated value based on the ratio of read bytes to total bytes transferred. The actual number of reads is not tracked by the kernel. This is calculated as

BYDSK_PHYS_READ = BYDSK_PHYS_IO * (BYDSK_PHYS_READ_BYTE / BYDSK_PHYS_IO_BYTE)

BYDSK_PHYS_READ_BYTE
--------------------

The KBs transferred from this disk device during the interval.

On Unix systems, all types of physical disk reads are counted, including file system, virtual memory, and raw IO.

BYDSK_PHYS_READ_BYTE_RATE
--------------------

The average KBs per second transferred from this disk device during the interval.

On Unix systems, all types of physical disk reads are counted, including file system, virtual memory, and raw IO.

BYDSK_PHYS_READ_RATE
--------------------

The average number of physical reads per second for this disk device during the interval.

On Unix systems, all types of physical disk reads are counted, including file system, virtual memory, and raw reads.

On AIX, this is an estimated value based on the ratio of read bytes to total bytes transferred. The actual number of reads is not tracked by the kernel.
This is calculated as

BYDSK_PHYS_READ_RATE = BYDSK_PHYS_IO_RATE * (BYDSK_PHYS_READ_BYTE / BYDSK_PHYS_IO_BYTE)

BYDSK_PHYS_WRITE
--------------------

The number of physical writes for this disk device during the interval.

On Unix systems, all types of physical disk writes are counted, including file system IO, virtual memory IO, and raw writes.

On AIX, this is an estimated value based on the ratio of write bytes to total bytes transferred because the actual number of writes is not tracked by the kernel. This is calculated as

BYDSK_PHYS_WRITE = BYDSK_PHYS_IO * (BYDSK_PHYS_WRITE_BYTE / BYDSK_PHYS_IO_BYTE)

BYDSK_PHYS_WRITE_BYTE
--------------------

The KBs transferred to this disk device during the interval.

On Unix systems, all types of physical disk writes are counted, including file system, virtual memory, and raw IO.

BYDSK_PHYS_WRITE_BYTE_RATE
--------------------

The average KBs per second transferred to this disk device during the interval.

On Unix systems, all types of physical disk writes are counted, including file system, virtual memory, and raw IO.

BYDSK_PHYS_WRITE_RATE
--------------------

The average number of physical writes per second for this disk device during the interval.

On Unix systems, all types of physical disk writes are counted, including file system IO, virtual memory IO, and raw writes.

On AIX, this is an estimated value based on the ratio of write bytes to total bytes transferred. The actual number of writes is not tracked by the kernel. This is calculated as

BYDSK_PHYS_WRITE_RATE = BYDSK_PHYS_IO_RATE * (BYDSK_PHYS_WRITE_BYTE / BYDSK_PHYS_IO_BYTE)

BYDSK_REQUEST_QUEUE
--------------------

The average number of IO requests that were in the wait queue for this disk device during the interval. These requests are the physical requests (as opposed to logical IO requests).

Some Linux kernels, typically 2.2 and older kernels, do not support the instrumentation needed to provide values for this metric. This metric will be “na” on the affected kernels.
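The AIX estimation formulas for BYDSK_PHYS_READ and BYDSK_PHYS_WRITE quoted above apportion the total IO count by the byte ratio. A sketch with invented figures (illustrative only, not OVPA source code):

```python
# Sketch of the AIX estimate: the kernel tracks total physical IOs and
# read/write byte counts, but not separate read and write counts, so the
# counts are estimated from the byte ratio. All values are invented.
bydsk_phys_io = 1000          # total physical IOs in the interval
bydsk_phys_read_byte = 3000   # KB read in the interval
bydsk_phys_write_byte = 1000  # KB written in the interval
bydsk_phys_io_byte = bydsk_phys_read_byte + bydsk_phys_write_byte

bydsk_phys_read = bydsk_phys_io * (bydsk_phys_read_byte / bydsk_phys_io_byte)
bydsk_phys_write = bydsk_phys_io * (bydsk_phys_write_byte / bydsk_phys_io_byte)

print(bydsk_phys_read)   # 750.0
print(bydsk_phys_write)  # 250.0
```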
The “sar -d” command will also not be present on these systems. Distributions and OS releases that are known to be affected include: TurboLinux 7, SuSE 7.2, and Debian 3.0.

BYDSK_UTIL
--------------------

On HP-UX, this is the percentage of the time during the interval that the disk device had IO in progress from the point of view of the Operating System. In other words, the utilization or percentage of time busy servicing requests for this device.

On non-HP-UX systems, this is the percentage of the time that this disk device was busy transferring data during the interval.

Some Linux kernels, typically 2.2 and older kernels, do not support the instrumentation needed to provide values for this metric. This metric will be “na” on the affected kernels. The “sar -d” command will also not be present on these systems. Distributions and OS releases that are known to be affected include: TurboLinux 7, SuSE 7.2, and Debian 3.0.

This is a measure of the ability of the IO path to meet the transfer demands being placed on it. Slower disk devices may show a higher utilization with lower IO rates than faster disk devices such as disk arrays. A value of greater than 50% utilization over time may indicate that this device or its IO path is a bottleneck, and the access pattern of the workload, database, or files may need reorganizing for better balance of disk IO load.

BYNETIF_COLLISION
--------------------

The number of physical collisions that occurred on the network interface during the interval.

A rising rate of collisions versus outbound packets is an indication that the network is becoming increasingly congested. This metric does not currently include deferred packets.

This data is not collected for non-broadcasting devices, such as loopback (lo), and is always zero.
For HP-UX, this will be the same as the sum of the “Single Collision Frames”, “Multiple Collision Frames”, “Late Collisions”, and “Excessive Collisions” values from the output of the “lanadmin” utility for the network interface. Remember that “lanadmin” reports cumulative counts. As of the HP-UX 11.0 release and beyond, “netstat -i” shows network activity on the logical level (IP) only.

For most other Unix systems, this is the same as the sum of the “Coll” column from the “netstat -i” command (“collisions” from the “netstat -i -e” command on Linux) for a network device. See also netstat(1).

AIX does not support the collision count for the ethernet interface. The collision count is supported for the token ring (tr) and loopback (lo) interfaces. For more information, please refer to the netstat(1) man page.

Physical statistics are packets recorded by the network drivers. These numbers most likely will not be the same as the logical statistics. The values returned for the loopback interface will show “na” for the physical statistics since there is no network driver activity.

Logical statistics are packets seen only by the Internet Protocol (IP) layer of the networking subsystem. Not all packets seen by IP will go out and come in through a network driver. An example is the loopback interface (127.0.0.1). Pings or other network-generating commands (ftp, rlogin, and so forth) to 127.0.0.1 will not change physical driver statistics. Pings to IP addresses on remote systems will change physical driver statistics.

This metric is updated at the sampling interval, regardless of the number of IP addresses on the system.

BYNETIF_COLLISION_RATE
--------------------

The number of physical collisions per second on the network interface during the interval.

A rising rate of collisions versus outbound packets is an indication that the network is becoming increasingly congested. This metric does not currently include deferred packets.
This data is not collected for non-broadcasting devices, such as loopback (lo), and is always zero.

Physical statistics are packets recorded by the network drivers. These numbers most likely will not be the same as the logical statistics. The values returned for the loopback interface will show "na" for the physical statistics since there is no network driver activity.

Logical statistics are packets seen only by the Internet Protocol (IP) layer of the networking subsystem. Not all packets seen by IP will go out and come in through a network driver. An example is the loopback interface (127.0.0.1). Pings or other network-generating commands (ftp, rlogin, and so forth) to 127.0.0.1 will not change physical driver statistics. Pings to IP addresses on remote systems will change physical driver statistics.

This metric is updated at the sampling interval, regardless of the number of IP addresses on the system.

BYNETIF_ERROR
--------------------
The number of physical errors that occurred on the network interface during the interval. An increasing number of errors may indicate a hardware problem in the network. On Unix systems, this data is not available for loopback (lo) devices and is always zero.

For HP-UX, this will be the same as the sum of the "Inbound Errors" and "Outbound Errors" values from the output of the "lanadmin" utility for the network interface. Remember that "lanadmin" reports cumulative counts. As of the HP-UX 11.0 release and beyond, "netstat -i" shows network activity on the logical level (IP) only.

For all other Unix systems, this is the same as the sum of "Ierrs" (RX-ERR on Linux) and "Oerrs" (TX-ERR on Linux) from the "netstat -i" command for a network device. See also netstat(1).

Physical statistics are packets recorded by the network drivers. These numbers most likely will not be the same as the logical statistics.
The values returned for the loopback interface will show "na" for the physical statistics since there is no network driver activity.

Logical statistics are packets seen only by the Internet Protocol (IP) layer of the networking subsystem. Not all packets seen by IP will go out and come in through a network driver. An example is the loopback interface (127.0.0.1). Pings or other network-generating commands (ftp, rlogin, and so forth) to 127.0.0.1 will not change physical driver statistics. Pings to IP addresses on remote systems will change physical driver statistics.

This metric is updated at the sampling interval, regardless of the number of IP addresses on the system.

BYNETIF_ERROR_RATE
--------------------
The number of physical errors per second on the network interface during the interval. On Unix systems, this data is not available for loopback (lo) devices and is always zero.

Physical statistics are packets recorded by the network drivers. These numbers most likely will not be the same as the logical statistics. The values returned for the loopback interface will show "na" for the physical statistics since there is no network driver activity.

Logical statistics are packets seen only by the Internet Protocol (IP) layer of the networking subsystem. Not all packets seen by IP will go out and come in through a network driver. An example is the loopback interface (127.0.0.1). Pings or other network-generating commands (ftp, rlogin, and so forth) to 127.0.0.1 will not change physical driver statistics. Pings to IP addresses on remote systems will change physical driver statistics.

This metric is updated at the sampling interval, regardless of the number of IP addresses on the system.

BYNETIF_ID
--------------------
The ID number of the network interface.

BYNETIF_IN_BYTE_RATE
--------------------
The number of KBs per second received from the network via this interface during the interval. Only the bytes in packets that carry data are included in this rate.
Physical statistics are packets recorded by the network drivers. These numbers most likely will not be the same as the logical statistics. The values returned for the loopback interface will show "na" for the physical statistics since there is no network driver activity.

Logical statistics are packets seen only by the Internet Protocol (IP) layer of the networking subsystem. Not all packets seen by IP will go out and come in through a network driver. An example is the loopback interface (127.0.0.1). Pings or other network-generating commands (ftp, rlogin, and so forth) to 127.0.0.1 will not change physical driver statistics. Pings to IP addresses on remote systems will change physical driver statistics.

This metric is updated at the sampling interval, regardless of the number of IP addresses on the system.

BYNETIF_IN_PACKET
--------------------
The number of successful physical packets received through the network interface during the interval. Successful packets are those that have been processed without errors or collisions.

For HP-UX, this will be the same as the sum of the "Inbound Unicast Packets" and "Inbound Non-Unicast Packets" values from the output of the "lanadmin" utility for the network interface. Remember that "lanadmin" reports cumulative counts. As of the HP-UX 11.0 release and beyond, "netstat -i" shows network activity on the logical level (IP) only.

For all other Unix systems, this is the same as the sum of the "Ipkts" column (RX-OK on Linux) from the "netstat -i" command for a network device. See also netstat(1).

Physical statistics are packets recorded by the network drivers. These numbers most likely will not be the same as the logical statistics. The values returned for the loopback interface will show "na" for the physical statistics since there is no network driver activity.

Logical statistics are packets seen only by the Internet Protocol (IP) layer of the networking subsystem.
Not all packets seen by IP will go out and come in through a network driver. An example is the loopback interface (127.0.0.1). Pings or other network-generating commands (ftp, rlogin, and so forth) to 127.0.0.1 will not change physical driver statistics. Pings to IP addresses on remote systems will change physical driver statistics.

This metric is updated at the sampling interval, regardless of the number of IP addresses on the system.

BYNETIF_IN_PACKET_RATE
--------------------
The number of successful physical packets per second received through the network interface during the interval. Successful packets are those that have been processed without errors or collisions.

Physical statistics are packets recorded by the network drivers. These numbers most likely will not be the same as the logical statistics. The values returned for the loopback interface will show "na" for the physical statistics since there is no network driver activity.

Logical statistics are packets seen only by the Internet Protocol (IP) layer of the networking subsystem. Not all packets seen by IP will go out and come in through a network driver. An example is the loopback interface (127.0.0.1). Pings or other network-generating commands (ftp, rlogin, and so forth) to 127.0.0.1 will not change physical driver statistics. Pings to IP addresses on remote systems will change physical driver statistics.

This metric is updated at the sampling interval, regardless of the number of IP addresses on the system.

BYNETIF_NAME
--------------------
The name of the network interface.

For HP-UX 11.0 and beyond, these are the same names that appear in the "Description" field of the "lanadmin" command output. On all other Unix systems, these are the same names that appear in the "Name" column of the "netstat -i" command.
Some examples of device names are:

  lo - loopback driver
  ln - Standard Ethernet driver
  en - Standard Ethernet driver
  le - Lance Ethernet driver
  ie - Intel Ethernet driver
  tr - Token-Ring driver
  et - Ether Twist driver
  bf - fiber optic driver

All of the device names will have the unit number appended to the name. For example, a loopback device in unit 0 will be "lo0".

BYNETIF_OUT_BYTE_RATE
--------------------
The number of KBs per second sent to the network via this interface during the interval. Only the bytes in packets that carry data are included in this rate.

Physical statistics are packets recorded by the network drivers. These numbers most likely will not be the same as the logical statistics. The values returned for the loopback interface will show "na" for the physical statistics since there is no network driver activity.

Logical statistics are packets seen only by the Internet Protocol (IP) layer of the networking subsystem. Not all packets seen by IP will go out and come in through a network driver. An example is the loopback interface (127.0.0.1). Pings or other network-generating commands (ftp, rlogin, and so forth) to 127.0.0.1 will not change physical driver statistics. Pings to IP addresses on remote systems will change physical driver statistics.

This metric is updated at the sampling interval, regardless of the number of IP addresses on the system.

BYNETIF_OUT_PACKET
--------------------
The number of successful physical packets sent through the network interface during the interval. Successful packets are those that have been processed without errors or collisions.

For HP-UX, this will be the same as the sum of the "Outbound Unicast Packets" and "Outbound Non-Unicast Packets" values from the output of the "lanadmin" utility for the network interface. Remember that "lanadmin" reports cumulative counts. As of the HP-UX 11.0 release and beyond, "netstat -i" shows network activity on the logical level (IP) only.
For all other Unix systems, this is the same as the sum of the "Opkts" column (TX-OK on Linux) from the "netstat -i" command for a network device. See also netstat(1).

Physical statistics are packets recorded by the network drivers. These numbers most likely will not be the same as the logical statistics. The values returned for the loopback interface will show "na" for the physical statistics since there is no network driver activity.

Logical statistics are packets seen only by the Internet Protocol (IP) layer of the networking subsystem. Not all packets seen by IP will go out and come in through a network driver. An example is the loopback interface (127.0.0.1). Pings or other network-generating commands (ftp, rlogin, and so forth) to 127.0.0.1 will not change physical driver statistics. Pings to IP addresses on remote systems will change physical driver statistics.

This metric is updated at the sampling interval, regardless of the number of IP addresses on the system.

BYNETIF_OUT_PACKET_RATE
--------------------
The number of successful physical packets per second sent through the network interface during the interval. Successful packets are those that have been processed without errors or collisions.

Physical statistics are packets recorded by the network drivers. These numbers most likely will not be the same as the logical statistics. The values returned for the loopback interface will show "na" for the physical statistics since there is no network driver activity.

Logical statistics are packets seen only by the Internet Protocol (IP) layer of the networking subsystem. Not all packets seen by IP will go out and come in through a network driver. An example is the loopback interface (127.0.0.1). Pings or other network-generating commands (ftp, rlogin, and so forth) to 127.0.0.1 will not change physical driver statistics. Pings to IP addresses on remote systems will change physical driver statistics.
This metric is updated at the sampling interval, regardless of the number of IP addresses on the system.

DATE
--------------------
The date the information in this record was captured, based on local time. The date is an ASCII field in mm/dd/yyyy format unless localized. If localized, the separators may be different and the subfields may be in a different sequence. In ASCII files this field will always contain 10 characters. Each subfield (mm, dd, yyyy) will contain a leading zero if the value is less than 10.

This metric is extracted from GBL_STATTIME, which is obtained using the time() system call at the time of data collection.

This field responds to language localization. For example, in Italy the field would appear as dd/mm/yyyy and in Japan it would be yyyy/mm/dd.

In binary files this field is in MPE CALENDAR format in the least significant 16 bits of the field. The most significant 16 bits should all be zero. Dividing the field by 512 will isolate the year (that is, 94 for 1994). This field MOD 512 will isolate the day of the year.

DATE_SECONDS
--------------------
The time that the data in this record was captured, expressed in seconds since January 1, 1970, based on local time. This is related to the standard time-stamp returned by the Unix system call time(), but has had the local time zone correction applied.

DAY
--------------------
The Julian day of the year that the data in this record was captured. This metric is extracted from GBL_STATTIME.

FS_BLOCK_SIZE
--------------------
The maximum block size of this file system, in bytes.

A value of "na" may be displayed if the file system is not mounted. If the product is restarted, these unmounted file systems are not displayed until remounted.

FS_DEVNAME
--------------------
On Unix systems, this is the path name string of the current device. On Windows, this is the disk drive string of the current device.

On HP-UX, this is the "fsname" parameter in the mount(1M) command.
For NFS devices, this includes the name of the node exporting the file system. It is possible that a process may mount a device using the mount(2) system call. This call does not update "/etc/mnttab", so the device name is blank. This situation is rare, and should be corrected by syncer(1M). Note that once a device is mounted, its entry is displayed, even after the device is unmounted, until the midaemon process terminates.

On SUN, this is the path name string of the current device, or "tmpfs" for memory-based file systems. See tmpfs(7).

FS_DIRNAME
--------------------
On Unix systems, this is the path name of the mount point of the file system. On Windows, this is the drive letter associated with the selected disk partition.

On HP-UX, this is the path name of the mount point of the file system if the logical volume has a mounted file system. This is the directory parameter of the mount(1M) command for most entries. Exceptions are:

* For lvm swap areas, this field contains "lvm swap device".
* For logical volumes with no mounted file systems, this field contains "Raw Logical Volume" (relevant only to OVPA).

On HP-UX, the file names are in the same order as shown in the "/usr/sbin/mount -p" command. File systems are not displayed until they exhibit IO activity once the midaemon has been started. Also, once a device is displayed, it continues to be displayed (even after the device is unmounted) until the midaemon process terminates.

On SUN, only "UFS", "HSFS" and "TMPFS" file systems are listed. See mount(1M) and mnttab(4). "TMPFS" file systems are memory-based file systems and are listed here for convenience. See tmpfs(7).

On AIX, see mount(1M) and filesystems(4). On OSF1, see mount(2).

FS_FRAG_SIZE
--------------------
The fundamental file system block size, in bytes.

A value of "na" may be displayed if the file system is not mounted. If the product is restarted, these unmounted file systems are not displayed until remounted.
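On Unix systems, the values behind FS_BLOCK_SIZE and FS_FRAG_SIZE roughly correspond to fields of the POSIX statvfs structure. A minimal Python sketch (the choice of "/tmp" as a mounted file system is an assumption for illustration):

```python
import os

# statvfs() reports f_bsize, the preferred IO block size (compare
# FS_BLOCK_SIZE), and f_frsize, the fundamental fragment size
# (compare FS_FRAG_SIZE). "/tmp" is assumed to be mounted here.
vfs = os.statvfs("/tmp")
print("block size:", vfs.f_bsize, "bytes")
print("fragment size:", vfs.f_frsize, "bytes")
```

On an unmounted file system no statvfs data is available, which is why the metrics above report "na" in that case.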
FS_INODE_UTIL
--------------------
Percentage of this file system's inodes in use during the interval.

A value of "na" may be displayed if the file system is not mounted. If the product is restarted, these unmounted file systems are not displayed until remounted.

FS_MAX_INODES
--------------------
Number of configured file system inodes.

A value of "na" may be displayed if the file system is not mounted. If the product is restarted, these unmounted file systems are not displayed until remounted.

FS_MAX_SIZE
--------------------
The maximum size, in MB, that this file system could reach if full. Note that this is the user space capacity - it is the file system space accessible to non-root users. On most Unix systems, the df command shows the total file system capacity, which includes the extra file system space accessible to root users only. The equivalent fields to look at are "used" and "avail". For the target file system, to calculate the maximum size in MB, use

  FS Max Size = (used + avail) / 1024

A value of "na" may be displayed if the file system is not mounted. If the product is restarted, these unmounted file systems are not displayed until remounted.

On HP-UX, this metric is updated at 4-minute intervals to minimize collection overhead.

FS_SPACE_RESERVED
--------------------
The amount of file system space in MBs reserved for superuser allocation. On AIX, this metric is typically zero because by default AIX does not reserve any file system space for the superuser.

FS_SPACE_USED
--------------------
The amount of file system space in MBs that is being used.

FS_SPACE_UTIL
--------------------
Percentage of the file system space in use during the interval. Note that this is the user space capacity - it is the file system space accessible to non-root users. On most Unix systems, the df command shows the total file system capacity, which includes the extra file system space accessible to root users only.
A value of "na" may be displayed if the file system is not mounted. If the product is restarted, these unmounted file systems are not displayed until remounted.

On HP-UX, this metric is updated at 4-minute intervals to minimize collection overhead.

FS_TYPE
--------------------
A string indicating the file system type. On Unix systems, some of the possible types are:

  hfs  - user file system
  ufs  - user file system
  ext2 - user file system
  cdfs - CD-ROM file system
  vxfs - Veritas (vxfs) file system
  nfs  - network file system
  nfs3 - network file system Version 3

On Windows, some of the possible types are:

  NTFS  - New Technology File System
  FAT   - 16-bit File Allocation Table
  FAT32 - 32-bit File Allocation Table

FAT uses a 16-bit file allocation table entry (2^16 clusters). FAT32 uses a 32-bit file allocation table entry. However, Windows 2000 reserves the first 4 bits of a FAT32 file allocation table entry, which means FAT32 has a theoretical maximum of 2^28 clusters. NTFS is the native file system of Windows NT and beyond.

GBL_ACTIVE_CPU
--------------------
The number of CPUs online on the system.

For HP-UX and certain versions of Linux, the sar(1M) command allows you to check the status of the system CPUs. For SUN and DEC, the commands psrinfo(1M) and psradm(1M) allow you to check or change the status of the system CPUs. For AIX, the pstat(1) command allows you to check the status of the system CPUs.

GBL_ACTIVE_PROC
--------------------
An active process is one that exists and consumes some CPU time. GBL_ACTIVE_PROC is the sum of the alive-process-time/interval-time ratios of every process that is active (uses any CPU time) during an interval.

The following diagram of a four-second interval during which two processes exist on the system should be used to understand the above definition. Note the difference between active processes, which consume CPU time, and alive processes which merely exist on the system.
         ----------- Seconds -----------
           1        2        3        4
Proc
----     ----     ----     ----     ----
 A       live     live     live     live
 B       live/CPU live/CPU live     dead

Process A is alive for the entire four-second interval but consumes no CPU. A's contribution to GBL_ALIVE_PROC is 4*1/4. A contributes 0*1/4 to GBL_ACTIVE_PROC. B's contribution to GBL_ALIVE_PROC is 3*1/4. B contributes 2*1/4 to GBL_ACTIVE_PROC. Thus, for this interval, GBL_ACTIVE_PROC equals 0.5 and GBL_ALIVE_PROC equals 1.75.

Because a process may be alive but not active, GBL_ACTIVE_PROC will always be less than or equal to GBL_ALIVE_PROC.

This metric is a good overall indicator of the workload of the system. An unusually large number of active processes could indicate a CPU bottleneck. To determine if the CPU is a bottleneck, compare this metric with GBL_CPU_TOTAL_UTIL and GBL_RUN_QUEUE. If GBL_CPU_TOTAL_UTIL is near 100 percent and GBL_RUN_QUEUE is greater than one, there is a bottleneck.

On non-HP-UX systems, this metric is derived from sampled process data. Since the data for a process is not available after the process has died on this operating system, a process whose life is shorter than the sampling interval may not be seen when the samples are taken. Thus this metric may be slightly less than the actual value. Increasing the sampling frequency captures a more accurate count, but the overhead of collection may also rise.

GBL_ALIVE_PROC
--------------------
An alive process is one that exists on the system. GBL_ALIVE_PROC is the sum of the alive-process-time/interval-time ratios for every process.

The following diagram of a four-second interval during which two processes exist on the system should be used to understand the above definition. Note the difference between active processes, which consume CPU time, and alive processes which merely exist on the system.
         ----------- Seconds -----------
           1        2        3        4
Proc
----     ----     ----     ----     ----
 A       live     live     live     live
 B       live/CPU live/CPU live     dead

Process A is alive for the entire four-second interval but consumes no CPU. A's contribution to GBL_ALIVE_PROC is 4*1/4. A contributes 0*1/4 to GBL_ACTIVE_PROC. B's contribution to GBL_ALIVE_PROC is 3*1/4. B contributes 2*1/4 to GBL_ACTIVE_PROC. Thus, for this interval, GBL_ACTIVE_PROC equals 0.5 and GBL_ALIVE_PROC equals 1.75.

Because a process may be alive but not active, GBL_ACTIVE_PROC will always be less than or equal to GBL_ALIVE_PROC.

On non-HP-UX systems, this metric is derived from sampled process data. Since the data for a process is not available after the process has died on this operating system, a process whose life is shorter than the sampling interval may not be seen when the samples are taken. Thus this metric may be slightly less than the actual value. Increasing the sampling frequency captures a more accurate count, but the overhead of collection may also rise.

GBL_BOOT_TIME
--------------------
The date and time when the system was last booted.

GBL_COLLECTOR
--------------------
ASCII field containing collector name and version. The collector name will appear as either "SCOPE/xx V.UU.FF.LF" or "Coda RV.UU.FF.LF". xx identifies the platform; V = version, UU = update level, FF = fix level, and LF = lab fix id. For example, SCOPE/UX C.04.00.00; or Coda A.07.10.04.

GBL_COMPLETED_PROC
--------------------
The number of processes that terminated during the interval.

On non-HP-UX systems, this metric is derived from sampled process data. Since the data for a process is not available after the process has died on this operating system, a process whose life is shorter than the sampling interval may not be seen when the samples are taken. Thus this metric may be slightly less than the actual value. Increasing the sampling frequency captures a more accurate count, but the overhead of collection may also rise.
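The arithmetic in the four-second GBL_ACTIVE_PROC/GBL_ALIVE_PROC example above can be verified with a short sketch; the per-process lifetimes are the hypothetical values from the diagram, not collector data:

```python
interval = 4.0  # length of the example interval, in seconds

# (seconds alive, seconds consuming CPU) for each process in the
# diagram: A is alive the whole interval but never active; B is
# alive for 3 seconds and active during 2 of them.
procs = {"A": (4.0, 0.0), "B": (3.0, 2.0)}

gbl_alive_proc = sum(alive / interval for alive, _ in procs.values())
gbl_active_proc = sum(active / interval for _, active in procs.values())

print(gbl_alive_proc)   # 1.75
print(gbl_active_proc)  # 0.5
```

As the definitions require, the active count can never exceed the alive count, since a process's active seconds are a subset of its alive seconds.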
GBL_CPU_CLOCK
--------------------
The clock speed of the CPUs in MHz if all of the processors have the same clock speed. Otherwise, "na" is shown if the processors have different clock speeds.

GBL_CPU_IDLE_TIME
--------------------
The time, in seconds, that the CPU was idle during the interval. This is the total idle time, including waiting for I/O.

On a system with multiple CPUs, this metric is normalized. That is, the CPU used over all processors is divided by the number of processors online.

GBL_CPU_IDLE_UTIL
--------------------
The percentage of time that the CPU was idle during the interval. This is the total idle time, including waiting for I/O. On Unix systems, this is the same as the sum of the "%idle" and "%wio" fields reported by the "sar -u" command.

On a system with multiple CPUs, this metric is normalized. That is, the CPU used over all processors is divided by the number of processors online.

GBL_CPU_NICE_TIME
--------------------
The time, in seconds, that the CPU was in user mode at a nice priority during the interval.

On HP-UX, the NICE metrics include positive nice value CPU time only. Negative nice value CPU is broken out into NNICE (negative nice) metrics. Positive nice values range from 20 to 39. Negative nice values range from 0 to 19.

On a system with multiple CPUs, this metric is normalized. That is, the CPU used over all processors is divided by the number of processors online. This represents the usage of the total processing capacity available.

On SUN systems, this metric is only available on SunOS 4.1.X.

GBL_CPU_NICE_UTIL
--------------------
The percentage of time that the CPU was in user mode at a nice priority during the interval.

On HP-UX, the NICE metrics include positive nice value CPU time only. Negative nice value CPU is broken out into NNICE (negative nice) metrics. Positive nice values range from 20 to 39. Negative nice values range from 0 to 19.

On a system with multiple CPUs, this metric is normalized.
That is, the CPU used over all processors is divided by the number of processors online. This represents the usage of the total processing capacity available.

On SUN systems, this metric is only available on SunOS 4.1.X.

GBL_CPU_SYS_MODE_TIME
--------------------
The time, in seconds, that the CPU was in system mode during the interval.

A process operates in either system mode (also called kernel mode on Unix or privileged mode on Windows) or user mode. When a process requests services from the operating system with a system call, it switches into the machine's privileged protection mode and runs in system mode.

On a system with multiple CPUs, this metric is normalized. That is, the CPU used over all processors is divided by the number of processors online. This represents the usage of the total processing capacity available.

GBL_CPU_SYS_MODE_UTIL
--------------------
Percentage of time the CPU was in system mode during the interval.

A process operates in either system mode (also called kernel mode on Unix or privileged mode on Windows) or user mode. When a process requests services from the operating system with a system call, it switches into the machine's privileged protection mode and runs in system mode.

This metric is a subset of the GBL_CPU_TOTAL_UTIL percentage. This is NOT a measure of the amount of time used by system daemon processes, since most system daemons spend part of their time in user mode and part in system calls, like any other process.

On a system with multiple CPUs, this metric is normalized. That is, the CPU used over all processors is divided by the number of processors online. This represents the usage of the total processing capacity available.

High system mode CPU percentages are normal for IO intensive applications. Abnormally high system mode CPU percentages can indicate that a hardware problem is causing a high interrupt rate. It can also indicate programs that are not calling system calls efficiently.
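The normalization described above, dividing the CPU time consumed across all processors by the number of processors online, can be sketched as follows (the per-CPU busy times are invented for illustration):

```python
interval = 60.0  # measurement interval, in seconds

# Hypothetical busy seconds for each of four online CPUs
# during the interval.
busy_seconds = [30.0, 45.0, 15.0, 30.0]

# Normalized utilization: total busy time divided by total
# capacity (number of online CPUs times the interval length).
util_pct = 100.0 * sum(busy_seconds) / (len(busy_seconds) * interval)
print(util_pct)  # 50.0
```

Normalizing this way keeps the metric on a 0-100% scale regardless of processor count, which is why it "represents the usage of the total processing capacity available."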
GBL_CPU_TOTAL_TIME
--------------------
The total time, in seconds, that the CPU was not idle in the interval. This is calculated as

  GBL_CPU_TOTAL_TIME = GBL_CPU_USER_MODE_TIME + GBL_CPU_SYS_MODE_TIME

On a system with multiple CPUs, this metric is normalized. That is, the CPU used over all processors is divided by the number of processors online. This represents the usage of the total processing capacity available.

GBL_CPU_TOTAL_UTIL
--------------------
Percentage of time the CPU was not idle during the interval. This is calculated as

  GBL_CPU_TOTAL_UTIL = GBL_CPU_USER_MODE_UTIL + GBL_CPU_SYS_MODE_UTIL

On a system with multiple CPUs, this metric is normalized. That is, the CPU used over all processors is divided by the number of processors online. This represents the usage of the total processing capacity available.

  GBL_CPU_TOTAL_UTIL + GBL_CPU_IDLE_UTIL = 100%

This metric varies widely on most systems, depending on the workload. A consistently high CPU utilization can indicate a CPU bottleneck, especially when other indicators such as GBL_RUN_QUEUE and GBL_ACTIVE_PROC are also high. High CPU utilization can also occur on systems that are bottlenecked on memory, because the CPU spends more time paging and swapping.

NOTE: On Windows, this metric may not equal the sum of the APP_CPU_TOTAL_UTIL metrics. Microsoft states that "this is expected behavior" because this GBL_CPU_TOTAL_UTIL metric is taken from the performance library Processor objects while the APP_CPU_TOTAL_UTIL metrics are taken from the Process objects. Microsoft states that there can be CPU time accounted for in the Processor system objects that may not be seen in the Process objects.

GBL_CPU_USER_MODE_TIME
--------------------
The time, in seconds, that the CPU was in user mode during the interval.

User CPU is the time spent in user mode at a normal priority, at real-time priority (on HP-UX, AIX, and Windows systems), and at a nice priority.

On a system with multiple CPUs, this metric is normalized.
That is, the CPU used over all processors is divided by the number of processors online. This represents the usage of the total processing capacity available.

GBL_CPU_USER_MODE_UTIL
--------------------
The percentage of time the CPU was in user mode during the interval.

User CPU is the time spent in user mode at a normal priority, at real-time priority (on HP-UX, AIX, and Windows systems), and at a nice priority. This metric is a subset of the GBL_CPU_TOTAL_UTIL percentage.

On a system with multiple CPUs, this metric is normalized. That is, the CPU used over all processors is divided by the number of processors online. This represents the usage of the total processing capacity available.

High user mode CPU percentages are normal for computation-intensive applications. Low values of user CPU utilization compared to relatively high values for GBL_CPU_SYS_MODE_UTIL can indicate an application or hardware problem.

GBL_CSWITCH_RATE
--------------------
The average number of context switches per second during the interval.

On HP-UX, this includes context switches that result in the execution of a different process and those caused by a process stopping, then resuming, with no other process running in the meantime.

On Windows, this includes switches from one thread to another either inside a single process or across processes. A thread switch can be caused either by one thread asking another for information or by a thread being preempted by another higher priority thread becoming ready to run.

GBL_DISK_PHYS_BYTE
--------------------
The number of KBs transferred to and from disks during the interval. The bytes for all types of physical IOs are counted. Only local disks are counted in this measurement. NFS devices are excluded. It is not directly related to the number of IOs, since IO requests can be of differing lengths.

On Unix systems, this includes file system IO, virtual memory IO, and raw IO.

On Windows, all types of physical IOs are counted.
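The relationship between the interval KB count above and its companion rate metric is simple division by the interval length; a sketch with hypothetical counter samples:

```python
# Cumulative KB-transferred counter sampled at the start and end
# of the measurement interval (hypothetical values).
kb_at_start = 10240.0
kb_at_end = 40960.0
interval = 60.0  # seconds

gbl_disk_phys_byte = kb_at_end - kb_at_start             # KB this interval
gbl_disk_phys_byte_rate = gbl_disk_phys_byte / interval  # KB per second

print(gbl_disk_phys_byte)       # 30720.0
print(gbl_disk_phys_byte_rate)  # 512.0
```

Because IO requests differ in length, a given byte rate says nothing by itself about the IO count; compare it with the IO-rate metrics to judge transfer sizes.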
GBL_DISK_PHYS_BYTE_RATE
--------------------
The average number of KBs per second at which data was transferred to and from disks during the interval. The bytes for all types of physical IOs are counted. Only local disks are counted in this measurement. NFS devices are excluded.

This is a measure of the physical data transfer rate. It is not directly related to the number of IOs, since IO requests can be of differing lengths.

This is an indicator of how much data is being transferred to and from disk devices. Large spikes in this metric can indicate a disk bottleneck.

On Unix systems, all types of physical disk IOs are counted, including file system, virtual memory, and raw reads.

GBL_DISK_PHYS_IO
--------------------
The number of physical IOs during the interval. Only local disks are counted in this measurement. NFS devices are excluded.

On Unix systems, all types of physical disk IOs are counted, including file system IO, virtual memory IO and raw IO.

On HP-UX, this is calculated as

  GBL_DISK_PHYS_IO = GBL_DISK_FS_IO + GBL_DISK_VM_IO + GBL_DISK_SYSTEM_IO + GBL_DISK_RAW_IO

GBL_DISK_PHYS_IO_RATE
--------------------
The number of physical IOs per second during the interval. Only local disks are counted in this measurement. NFS devices are excluded.

On Unix systems, all types of physical disk IOs are counted, including file system IO, virtual memory IO and raw IO.

On HP-UX, this is calculated as

  GBL_DISK_PHYS_IO_RATE = GBL_DISK_FS_IO_RATE + GBL_DISK_VM_IO_RATE + GBL_DISK_SYSTEM_IO_RATE + GBL_DISK_RAW_IO_RATE

GBL_DISK_PHYS_READ
--------------------
The number of physical reads during the interval. Only local disks are counted in this measurement. NFS devices are excluded.

On Unix systems, all types of physical disk reads are counted, including file system, virtual memory, and raw reads.

On HP-UX, there are many reasons why there is not a direct correlation between the number of logical IOs and physical IOs.
For example, small sequential logical reads may be satisfied from the buffer cache, resulting in fewer physical IOs than logical IOs. Conversely, large logical IOs or small random IOs may result in more physical than logical IOs. Logical volume mappings, logical disk mirroring, and disk striping also tend to remove any correlation.

On HP-UX, this is calculated as

  GBL_DISK_PHYS_READ =
    GBL_DISK_FS_READ + GBL_DISK_VM_READ +
    GBL_DISK_SYSTEM_READ + GBL_DISK_RAW_READ

GBL_DISK_PHYS_READ_BYTE_RATE
--------------------
The average number of KBs transferred from the disk per second during the interval. Only local disks are counted in this measurement. NFS devices are excluded.

GBL_DISK_PHYS_READ_RATE
--------------------
The number of physical reads per second during the interval. Only local disks are counted in this measurement. NFS devices are excluded.

On Unix systems, all types of physical disk reads are counted, including file system, virtual memory, and raw reads.

On HP-UX, this is calculated as

  GBL_DISK_PHYS_READ_RATE =
    GBL_DISK_FS_READ_RATE + GBL_DISK_VM_READ_RATE +
    GBL_DISK_SYSTEM_READ_RATE + GBL_DISK_RAW_READ_RATE

GBL_DISK_PHYS_WRITE
--------------------
The number of physical writes during the interval. Only local disks are counted in this measurement. NFS devices are excluded.

On Unix systems, all types of physical disk writes are counted, including file system IO, virtual memory IO, and raw writes.

On HP-UX, since this value is reported by the drivers, multiple physical requests that have been collapsed to a single physical operation (due to driver IO merging) are only counted once.

On HP-UX, there are many reasons why there is not a direct correlation between logical IOs and physical IOs. For example, small logical writes may end up entirely in the buffer cache, and later generate fewer physical IOs when written to disk due to the larger IO size.
Or conversely, small logical writes may require physical prefetching of the corresponding disk blocks before the data is merged and posted to disk. Logical volume mappings, logical disk mirroring, and disk striping also tend to remove any correlation.

On HP-UX, this is calculated as

  GBL_DISK_PHYS_WRITE =
    GBL_DISK_FS_WRITE + GBL_DISK_VM_WRITE +
    GBL_DISK_SYSTEM_WRITE + GBL_DISK_RAW_WRITE

GBL_DISK_PHYS_WRITE_BYTE_RATE
--------------------
The average number of KBs transferred to the disk per second during the interval. Only local disks are counted in this measurement. NFS devices are excluded.

On Unix systems, all types of physical disk writes are counted, including file system IO, virtual memory IO, and raw writes.

GBL_DISK_PHYS_WRITE_RATE
--------------------
The number of physical writes per second during the interval. Only local disks are counted in this measurement. NFS devices are excluded.

On Unix systems, all types of physical disk writes are counted, including file system IO, virtual memory IO, and raw writes.

On HP-UX, since this value is reported by the drivers, multiple physical requests that have been collapsed to a single physical operation (due to driver IO merging) are only counted once.

On HP-UX, this is calculated as

  GBL_DISK_PHYS_WRITE_RATE =
    GBL_DISK_FS_WRITE_RATE + GBL_DISK_VM_WRITE_RATE +
    GBL_DISK_SYSTEM_WRITE_RATE + GBL_DISK_RAW_WRITE_RATE

GBL_DISK_TIME_PEAK
--------------------
The time, in seconds, during the interval that the busiest disk was performing IO transfers. This is for the busiest disk only, not all disk devices. This counter is based on an end-to-end measurement for each IO transfer updated at queue entry and exit points. Only local disks are counted in this measurement. NFS devices are excluded.

GBL_DISK_UTIL_PEAK
--------------------
The utilization of the busiest disk during the interval.
On HP-UX, this is the percentage of time during the interval that the busiest disk device had IO in progress from the point of view of the Operating System. On all other systems, this is the percentage of time during the interval that the busiest disk was performing IO transfers. It is not an average utilization over all the disk devices. Only local disks are counted in this measurement. NFS devices are excluded.

Some Linux kernels, typically 2.2 and older kernels, do not support the instrumentation needed to provide values for this metric. This metric will be “na” on the affected kernels. The “sar -d” command will also not be present on these systems. Distributions and OS releases that are known to be affected include: TurboLinux 7, SuSE 7.2, and Debian 3.0.

A peak disk utilization of more than 50 percent often indicates a disk IO subsystem bottleneck situation. A bottleneck may not be in the physical disk drive itself, but elsewhere in the IO path.

GBL_FS_SPACE_UTIL_PEAK
--------------------
The percentage of occupied disk space to total disk space for the fullest file system found during the interval. Only locally mounted file systems are counted in this metric.

This metric can be used as an indicator that at least one file system on the system is running out of disk space.

On Unix systems, CDROM and PC file systems are also excluded.

This metric can exceed 100 percent. This is because a portion of the file system space is reserved as a buffer and can only be used by root. If the root user has made the file system grow beyond the reserved buffer, the utilization will be greater than 100 percent. This is a dangerous situation since if the root user totally fills the file system, the system may crash.

On Windows, CDROM file systems are also excluded.

GBL_GMTOFFSET
--------------------
The difference, in minutes, between local time and GMT (Greenwich Mean Time).

GBL_INTERRUPT
--------------------
The number of IO interrupts during the interval.
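The over-100-percent behavior described for GBL_FS_SPACE_UTIL_PEAK can be sketched numerically as follows. This is a hypothetical helper with made-up block counts, not OVPA code; utilization is taken against the capacity that remains after the root-only reserve is excluded:

```python
def fs_space_util_pct(total_blocks, free_blocks, reserved_blocks):
    """Percentage of occupied space relative to the capacity left once
    the root-only reserve is excluded.  When root grows the file system
    into that reserve, the used blocks exceed the non-reserved capacity
    and the result passes 100 percent."""
    used = total_blocks - free_blocks
    return 100.0 * used / (total_blocks - reserved_blocks)

# 1000-block file system with a 50-block reserve usable only by root:
print(round(fs_space_util_pct(1000, 100, 50), 1))  # 94.7: nearly full
print(round(fs_space_util_pct(1000, 20, 50), 1))   # 103.2: root has eaten into the reserve
```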
GBL_INTERRUPT_RATE
--------------------
The average number of IO interrupts per second during the interval.

On HP-UX and Sun systems, this value includes clock interrupts. To get non-clock device interrupts, subtract clock interrupts from the value.

GBL_INTERVAL
--------------------
The amount of time in the interval. This measured interval is slightly larger than the desired or configured interval if the collection program is delayed by a higher priority process and cannot sample the data immediately.

GBL_JAVAARG
--------------------
This boolean value indicates whether the java class overloading mechanism is enabled. This metric is set when the javaarg flag in the parm file is set. The metric affected by this setting is PROC_PROC_ARGV1. This setting is useful for constructing parm file java application definitions using the argv1= keyword.

GBL_LOGFILE_VERSION
--------------------
Three byte ASCII field containing the log file version number. The log file version is assigned by scopeux and is incremented when changes to the log file cause the layout to be different from previous versions. The current version is “D”. Every effort is made to protect the information investment maintained in historical log files by providing forward compatibility and/or conversion utilities when log files change.

GBL_LOGGING_TYPES
--------------------
A 13-byte field indicating the types of data logged by the collector. This is controlled by the LOG statement in the parm file. Each position will contain either a space or the characters as shown below. Note that positions two (all applications) and four (all processes) were implemented for HP internal use only and are not normally used outside of HP. An @ in position two indicates that all applications are logged each five-minute interval even if they had no activity during the interval. An @ in position four indicates that all processes, not just the interesting ones, are logged each one-minute interval.
This can result in very large log files. An @ in position six indicates that all devices (file system device, disk, CPU, LAN, logical volume) are logged.

  Position  Char   Meaning
     1       G     Global data
     2       @     All applications
     3       A     Applications
     4       @     All processes
     5       P     Interesting processes
     6       @     All devices
     7       F     File system device
     8       D     Disk
     9       C     CPU
    10       L     LAN
    11       V     Logical volume
    12       T     Transaction data
    13      space  Not used

GBL_LOST_MI_TRACE_BUFFERS
--------------------
The number of trace buffers lost by the measurement processing daemon.

On HP-UX systems, if this value is > 0, the measurement subsystem is not keeping up with the system events that generate traces. For other Unix systems, if this value is > 0, the measurement subsystem is not keeping up with the ARM API calls that generate traces.

Note: The value reported for this metric will roll over to 0 once it crosses INTMAX.

GBL_MACHINE
--------------------
On most Unix systems, this is a text string representing the type of computer. This is similar to what is returned by the command “uname -m”.

On AIX, this is a text string representing the model number of the computer. This is similar to what is returned by the command “uname -M”. For example, “7043-150”.

On Windows, this is a text string representing the type of the computer. For example, “80686”.

GBL_MACHINE_MODEL
--------------------
The CPU model. This is similar to the information returned by the GBL_MACHINE metric and the uname command. However, this metric returns more information on some processors. On HP-UX, this is the same information returned by the model command.

GBL_MEM_AVAIL
--------------------
The amount of available physical memory in the system (in MBs unless otherwise specified).

Beginning with the OVPA 4.0 release, this metric is now reported in MBytes to better report the significant increases in system memory capacities. WARNING: This change in scale applies to this metric when logged by OVPA or displayed with GlancePlus for this release and beyond.
However, the presentation of this metric recorded in legacy data (data logged with OVPA C.03 and previous releases) will remain in units of KBytes when viewed with extract or OVPM.

On Windows, memory resident operating system code and data is not included as available memory.

GBL_MEM_FREE_UTIL
--------------------
The percentage of physical memory that was free at the end of the interval.

GBL_MEM_PAGEOUT
--------------------
The total number of page outs to the disk during the interval.

On HP-UX, Solaris, and AIX, this reflects paging activity between memory and paging space. It does not include activity between memory and file systems. On Linux and Windows, this includes paging activity for both file systems and paging space.

On HP-UX, this is the same as the “page outs” value from the “vmstat -s” command. On AIX, this is the same as the “paging space page outs” value. Remember that “vmstat -s” reports cumulative counts.

GBL_MEM_PAGEOUT_BYTE
--------------------
The number of KBs (or MBs if specified) of page outs during the interval.

On HP-UX, Solaris, and AIX, this reflects paging activity between memory and paging space. It does not include activity between memory and file systems. On Linux and Windows, this includes paging activity for both file systems and paging space.

GBL_MEM_PAGEOUT_BYTE_RATE
--------------------
The number of KBs (or MBs if specified) per second of page outs during the interval.

On HP-UX, Solaris, and AIX, this reflects paging activity between memory and paging space. It does not include activity between memory and file systems. On Linux and Windows, this includes paging activity for both file systems and paging space.

GBL_MEM_PAGEOUT_RATE
--------------------
The total number of page outs to the disk per second during the interval.

On HP-UX, Solaris, and AIX, this reflects paging activity between memory and paging space. It does not include activity between memory and file systems.
On Linux and Windows, this includes paging activity for both file systems and paging space.

On HP-UX and AIX, this is the same as the “po” value from the vmstat command. On Solaris, this is the same as the sum of the “epo” and “apo” values from the “vmstat -p” command, divided by the page size in KB.

On Windows, this counter also includes paging traffic on behalf of the system cache to access file data for applications and so may be high when there is no memory pressure.

GBL_MEM_PAGE_REQUEST
--------------------
The number of page requests to or from the disk during the interval.

On HP-UX, Solaris, and AIX, this includes pages paged to or from the paging space and not to the file system. On Linux and Windows, this includes pages paged to or from both paging space and the file system.

On HP-UX, this is the same as the sum of the “page ins” and “page outs” values from the “vmstat -s” command. On AIX, this is the same as the sum of the “paging space page ins” and “paging space page outs” values. Remember that “vmstat -s” reports cumulative counts.

On Windows, this counter also includes paging traffic on behalf of the system cache to access file data for applications and so may be high when there is no memory pressure.

GBL_MEM_PAGE_REQUEST_RATE
--------------------
The number of page requests to or from the disk per second during the interval.

On HP-UX, Solaris, and AIX, this includes pages paged to or from the paging space and not to or from the file system. On Linux and Windows, this includes pages paged to or from both paging space and the file system.

On HP-UX and AIX, this is the same as the sum of the “pi” and “po” values from the vmstat command. On Solaris, this is the same as the sum of the “epi”, “epo”, “api”, and “apo” values from the “vmstat -p” command, divided by the page size in KB.

Higher than normal rates can indicate either a memory or a disk bottleneck. Compare GBL_DISK_UTIL_PEAK and GBL_MEM_UTIL to determine which resource is more constrained.
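The Solaris derivation above for GBL_MEM_PAGE_REQUEST_RATE can be sketched as follows. The helper name and the sample values are hypothetical; the epi/epo/api/apo inputs stand for the columns reported by “vmstat -p”:

```python
def mem_page_request_rate(epi, epo, api, apo, page_size_kb):
    """Sum of the executable and anonymous page-in/page-out values
    from 'vmstat -p', divided by the page size in KB, as described
    above for Solaris."""
    return (epi + epo + api + apo) / page_size_kb

# Hypothetical sample on a system with an 8 KB page size:
print(mem_page_request_rate(16.0, 0.0, 64.0, 48.0, 8.0))  # 16.0
```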
High rates may also indicate memory thrashing caused by a particular application or set of applications. Look for processes with high major fault rates to identify the culprits.

GBL_MEM_PHYS
--------------------
The amount of physical memory in the system (in MBs unless otherwise specified).

Beginning with the OVPA 4.0 release, this metric is now reported in MBytes to better report the significant increases in system memory capacities. WARNING: This change in scale applies to this metric when logged by OVPA or displayed with GlancePlus for this release and beyond. However, the presentation of this metric recorded in legacy data (data logged with OVPA C.03 and previous releases) will remain in units of KBytes when viewed with extract or OVPM.

On HP-UX, banks with bad memory are not counted. Note that on some machines, the Processor Dependent Code (PDC) code uses the upper 1MB of memory and thus reports less than the actual physical memory of the system. Thus, on a system with 256MB of physical memory, this metric and dmesg(1M) might only report 267,386,880 bytes (255MB). This is all the physical memory that software on the machine can access.

On Windows, this is the total memory available, which may be slightly less than the total amount of physical memory present in the system. This value is also reported in the Control Panel's About Windows NT help topic.

GBL_MEM_SWAPIN_BYTE_RATE
--------------------
The number of KBs per second transferred from disk due to swap ins (or reactivations on HP-UX) during the interval.

On AIX, swap metrics are equal to the corresponding page metrics.

On HP-UX, process swapping was replaced by a combination of paging and deactivation. Process deactivation occurs when the system is thrashing or when the amount of free memory falls below a critical level. The swapper then marks certain processes for deactivation and removes them from the run queue.
Pages within the associated memory regions are reused or paged out by the memory management vhand process in favor of pages belonging to processes that are not deactivated. Unlike traditional process swapping, deactivated memory pages may or may not be written out to the swap area, because a process could be reactivated before the paging occurs.

To summarize, a process swap-out on HP-UX is a process deactivation. A swap-in is a reactivation of a deactivated process. Swap metrics that report swap-out bytes now represent bytes paged out to swap areas from deactivated regions. Because these pages are pushed out over time based on memory demands, these counts are much smaller than HP-UX 9.x counts where the entire process was written to the swap area when it was swapped-out. Likewise, swap-in bytes now represent bytes paged in as a result of reactivating a deactivated process and reading in any pages that were actually paged out to the swap area while the process was deactivated.

GBL_MEM_SWAPOUT_BYTE_RATE
--------------------
The number of KBs (or MBs if specified) per second transferred out to disk due to swap outs (or deactivations on HP-UX) during the interval.

On AIX, swap metrics are equal to the corresponding page metrics.

On HP-UX, process swapping was replaced by a combination of paging and deactivation. Process deactivation occurs when the system is thrashing or when the amount of free memory falls below a critical level. The swapper then marks certain processes for deactivation and removes them from the run queue. Pages within the associated memory regions are reused or paged out by the memory management vhand process in favor of pages belonging to processes that are not deactivated. Unlike traditional process swapping, deactivated memory pages may or may not be written out to the swap area, because a process could be reactivated before the paging occurs.

To summarize, a process swap-out on HP-UX is a process deactivation.
A swap-in is a reactivation of a deactivated process. Swap metrics that report swap-out bytes now represent bytes paged out to swap areas from deactivated regions. Because these pages are pushed out over time based on memory demands, these counts are much smaller than HP-UX 9.x counts where the entire process was written to the swap area when it was swapped-out. Likewise, swap-in bytes now represent bytes paged in as a result of reactivating a deactivated process and reading in any pages that were actually paged out to the swap area while the process was deactivated.

GBL_MEM_SYS_UTIL
--------------------
The percentage of physical memory used by the system during the interval. System memory does not include the buffer cache.

On HP-UX 11.0, this metric does not include some kinds of dynamically allocated kernel memory. This has always been reported in the GBL_MEM_USER* metrics.

On HP-UX 11.11 and beyond, this metric includes some kinds of dynamically allocated kernel memory.

GBL_MEM_USER_UTIL
--------------------
The percent of physical memory allocated to user code and data at the end of the interval. This metric shows the percent of memory owned by user memory regions such as user code, heap, stack and other data areas including shared memory. This does not include memory for buffer cache.

On HP-UX 11.0, this metric includes some kinds of dynamically allocated kernel memory.

On HP-UX 11.11 and beyond, this metric does not include some kinds of dynamically allocated kernel memory. This is now reported in the GBL_MEM_SYS* metrics.

Large fluctuations in this metric can be caused by programs which allocate large amounts of memory and then either release the memory or terminate. A slow continual increase in this metric may indicate a program with a memory leak.

GBL_MEM_UTIL
--------------------
The percentage of physical memory in use during the interval. This includes system memory (occupied by the kernel), buffer cache and user memory.
On HP-UX, this calculation is done using the byte values for physical memory and used memory, and is therefore more accurate than comparing the reported kilobyte values for physical memory and used memory.

On Sun systems, high values for this metric may not indicate a true memory shortage. This metric can be influenced by the VMM (Virtual Memory Management) system.

GBL_NET_COLLISION_1_MIN_RATE
--------------------
The number of collisions per minute on all network interfaces during the interval. This metric does not include deferred packets.

Collisions occur on any busy network, but abnormal collision rates could indicate a hardware or software problem.

AIX does not support the collision count for the ethernet interface. The collision count is supported for the token ring (tr) and loopback (lo) interfaces. For more information, please refer to the netstat(1) man page.

This metric is updated at the sampling interval, regardless of the number of IP addresses on the system.

GBL_NET_COLLISION_PCT
--------------------
The percentage of collisions to total outbound packet attempts during the interval. Outbound packet attempts include both successful packets and collisions.

A rising rate of collisions versus outbound packets is an indication that the network is becoming increasingly congested. This metric does not currently include deferred packets.

AIX does not support the collision count for the ethernet interface. The collision count is supported for the token ring (tr) and loopback (lo) interfaces. For more information, please refer to the netstat(1) man page.

This metric is updated at the sampling interval, regardless of the number of IP addresses on the system.

GBL_NET_COLLISION_RATE
--------------------
The number of collisions per second on all network interfaces during the interval. This metric does not include deferred packets.

A rising rate of collisions versus outbound packets is an indication that the network is becoming increasingly congested.
AIX does not support the collision count for the ethernet interface. The collision count is supported for the token ring (tr) and loopback (lo) interfaces. For more information, please refer to the netstat(1) man page.

This metric is updated at the sampling interval, regardless of the number of IP addresses on the system.

GBL_NET_ERROR_1_MIN_RATE
--------------------
The number of errors per minute on all network interfaces during the interval. This rate should normally be zero or very small. A large error rate can indicate a hardware or software problem.

This metric is updated at the sampling interval, regardless of the number of IP addresses on the system.

GBL_NET_ERROR_RATE
--------------------
The number of errors per second on all network interfaces during the interval.

This metric is updated at the sampling interval, regardless of the number of IP addresses on the system.

GBL_NET_IN_ERROR_PCT
--------------------
The percentage of inbound network errors to total inbound packet attempts during the interval. Inbound packet attempts include both packets successfully received and those that encountered errors.

A large number of errors may indicate a hardware problem on the network. The percentage of inbound errors to total packets attempted should remain low.

This metric is updated at the sampling interval, regardless of the number of IP addresses on the system.

GBL_NET_IN_ERROR_RATE
--------------------
The number of inbound errors per second on all network interfaces during the interval.

A large number of errors may indicate a hardware problem on the network. The percentage of inbound errors to total packets attempted should remain low.

This metric is updated at the sampling interval, regardless of the number of IP addresses on the system.

GBL_NET_IN_PACKET
--------------------
The number of successful packets received through all network interfaces during the interval. Successful packets are those that have been processed without errors or collisions.
For HP-UX, this will be the same as the sum of the “Inbound Unicast Packets” and “Inbound Non-Unicast Packets” values from the output of the “lanadmin” utility for the network interface. Remember that “lanadmin” reports cumulative counts. As of the HP-UX 11.0 release and beyond, “netstat -i” shows network activity on the logical level (IP) only.

For all other Unix systems, this is the same as the sum of the “Ipkts” column (RX-OK on Linux) from the “netstat -i” command for a network device. See also netstat(1).

This metric is updated at the sampling interval, regardless of the number of IP addresses on the system.

On Windows systems, the packet size for NBT connections is defined as 1 Kbyte.

GBL_NET_IN_PACKET_RATE
--------------------
The number of successful packets per second received through all network interfaces during the interval. Successful packets are those that have been processed without errors or collisions.

This metric is updated at the sampling interval, regardless of the number of IP addresses on the system.

On Windows systems, the packet size for NBT connections is defined as 1 Kbyte.

GBL_NET_OUT_ERROR_PCT
--------------------
The percentage of outbound network errors to total outbound packet attempts during the interval. Outbound packet attempts include both packets successfully sent and those that encountered errors.

The percentage of outbound errors to total packets attempted to be transmitted should remain low.

This metric is updated at the sampling interval, regardless of the number of IP addresses on the system.

GBL_NET_OUT_ERROR_RATE
--------------------
The number of outbound errors per second on all network interfaces during the interval.

This metric is updated at the sampling interval, regardless of the number of IP addresses on the system.

GBL_NET_OUT_PACKET
--------------------
The number of successful packets sent through all network interfaces during the last interval.
Successful packets are those that have been processed without errors or collisions.

For HP-UX, this will be the same as the sum of the “Outbound Unicast Packets” and “Outbound Non-Unicast Packets” values from the output of the “lanadmin” utility for the network interface. Remember that “lanadmin” reports cumulative counts. As of the HP-UX 11.0 release and beyond, “netstat -i” shows network activity on the logical level (IP) only.

For all other Unix systems, this is the same as the sum of the “Opkts” column (TX-OK on Linux) from the “netstat -i” command for a network device. See also netstat(1).

This metric is updated at the sampling interval, regardless of the number of IP addresses on the system.

On Windows systems, the packet size for NBT connections is defined as 1 Kbyte.

GBL_NET_OUT_PACKET_RATE
--------------------
The number of successful packets per second sent through the network interfaces during the interval. Successful packets are those that have been processed without errors or collisions.

This metric is updated at the sampling interval, regardless of the number of IP addresses on the system.

On Windows systems, the packet size for NBT connections is defined as 1 Kbyte.

GBL_NET_PACKET_RATE
--------------------
The number of successful packets per second (both inbound and outbound) for all network interfaces during the interval. Successful packets are those that have been processed without errors or collisions.

This metric is updated at the sampling interval, regardless of the number of IP addresses on the system.

On Windows systems, the packet size for NBT connections is defined as 1 Kbyte.

GBL_NFS_CALL
--------------------
The number of NFS calls the local system has made as either an NFS client or server during the interval.

This includes both successful and unsuccessful calls. Unsuccessful calls are those that cannot be completed due to resource limitations or LAN packet errors.
NFS calls include create, remove, rename, link, symlink, mkdir, rmdir, statfs, getattr, setattr, lookup, read, readdir, readlink, write, writecache, null and root operations.

GBL_NFS_CALL_RATE
--------------------
The number of NFS calls per second the system made as either an NFS client or NFS server during the interval. Each computer can operate as both an NFS server and an NFS client.

This metric includes both successful and unsuccessful calls. Unsuccessful calls are those that cannot be completed due to resource limitations or LAN packet errors.

NFS calls include create, remove, rename, link, symlink, mkdir, rmdir, statfs, getattr, setattr, lookup, read, readdir, readlink, write, writecache, null and root operations.

GBL_NUM_CPU
--------------------
The number of CPUs physically on the system. This includes all CPUs, either online or offline.

For HP-UX and certain versions of Linux, the sar(1M) command allows you to check the status of the system CPUs.

For Sun and DEC systems, the commands psrinfo(1M) and psradm(1M) allow you to check or change the status of the system CPUs.

GBL_NUM_DISK
--------------------
The number of disks on the system. Only local disk devices are counted in this metric.

On HP-UX, this is a count of the number of disks on the system that have ever had activity over the cumulative collection time.

GBL_NUM_NETWORK
--------------------

GBL_NUM_USER
--------------------
The number of users logged in at the time of the interval sample. This is the same as the command “who | wc -l”.

For Unix systems, the information for this metric comes from the utmp file which is updated by the login command. For more information, read the man page for utmp. Some applications may create users on the system without using login and updating the utmp file. These users are not reflected in this count.

This metric can be a general indicator of system usage. In a networked environment, however, users may maintain inactive logins on several systems.
On Windows, the information for this metric comes from the Server Sessions counter in the Performance Libraries Server object. It is a count of the number of users using this machine as a file server.

GBL_OSNAME
--------------------
A string representing the name of the operating system. On Unix systems, this is the same as the output from the “uname -s” command.

GBL_OSRELEASE
--------------------
The current release of the operating system. On most Unix systems, this is the same as the output from the “uname -r” command.

On AIX, this is the actual patch level of the operating system. This is similar to what is returned by the command “lslpp -l bos.rte” as the most recent level of the COMMITTED Base OS Runtime. For example, “5.2.0”.

GBL_OSVERSION
--------------------
A string representing the version of the operating system. This is the same as the output from the “uname -v” command. This string is limited to 20 characters, and as a result, the complete version name might be truncated.

On Windows, this is a string representing the service pack installed on the operating system.

GBL_PROC_SAMPLE
--------------------
The number of process data samples that have been averaged into global metrics (such as GBL_ACTIVE_PROC) that are based on process samples.

GBL_RUN_QUEUE
--------------------
On Unix systems, the value shown is the 1-minute load average for all processors. On HP-UX, the load average is the average number of processes waiting for CPU per processor, whereas on other Unix systems, the load average is the total number of runnable and running threads summed over all processors during the interval. In other words, for non HP-UX systems, this metric correlates to the number of threads executing on and waiting for any processor.

On Windows, this is approximately the average Processor Queue Length during the interval.

On Unix systems, GBL_RUN_QUEUE will typically be a small number. Larger than normal values for this metric indicate CPU contention among processes.
This CPU bottleneck is also normally indicated by 100 percent GBL_CPU_TOTAL_UTIL. It may be OK to have GBL_CPU_TOTAL_UTIL be 100 percent if no other processes are waiting for the CPU. However, if GBL_CPU_TOTAL_UTIL is 100 percent and GBL_RUN_QUEUE is greater than the number of processors, it indicates a CPU bottleneck.

On Windows, the Processor Queue reflects a count of process threads which are ready to execute. A thread is ready to execute (in the Ready state) when the only resource it is waiting on is the processor. The Windows operating system itself has many system threads which intermittently use small amounts of processor time. Several low priority threads intermittently wake up and execute for very short intervals. Depending on when the collection process samples this queue, there may be none or several of these low-priority threads trying to execute. Therefore, even on an otherwise quiescent system, the Processor Queue Length can be high. High values for this metric during intervals where the overall CPU utilization (gbl_cpu_total_util) is low do not indicate a performance bottleneck. Relatively high values for this metric during intervals where the overall CPU utilization is near 100% can indicate a CPU performance bottleneck.

HP-UX RUN/PRI/CPU Queue differences for multi-cpu systems:

For example, let's assume we're using a system with eight processors. We start eight CPU intensive processes that consume almost all of the CPU resources. The approximate values shown for the CPU related queue metrics would be:

  GBL_RUN_QUEUE = 1.0
  GBL_PRI_QUEUE = 0.1
  GBL_CPU_QUEUE = 1.0

Assume we start an additional eight CPU intensive processes. The approximate values now shown are:

  GBL_RUN_QUEUE = 2.0
  GBL_PRI_QUEUE = 8.0
  GBL_CPU_QUEUE = 16.0

At this point, we have sixteen CPU intensive processes running on the eight processors.
Keeping the definitions of the three queue metrics in mind, the run queue is 2 (that is, 16 / 8); the pri queue is 8 (only half of the processes can be active at any given time); and the cpu queue is 16 (half of the processes waiting in the cpu queue that are ready to run, plus one for each active process).

This illustrates that the run queue is the average of the 1-minute load averages for all processors; the pri queue is the number of processes or kernel threads that are blocked on “PRI” (priority); and the cpu queue is the number of processes or kernel threads in the cpu queue that are ready to run, including the processes or kernel threads using the CPU.

GBL_STARTED_PROC
--------------------
The number of processes that started during the interval.

GBL_STARTED_PROC_RATE
--------------------
The number of processes that started per second during the interval.

GBL_STATTIME
--------------------
An ASCII string representing the time at the end of the interval, based on local time.

GBL_SUBPROCSAMPLEINTERVAL
--------------------
The SubProcSampleInterval parameter sets the internal sampling interval of process data. This option only changes the frequency with which the operating system process table is scanned to accumulate process statistics during a log interval; it does not change the logging interval for process data logging. If, for example, the CPU utilization is higher than expected (possibly due to a large operating system process table), you can decrease the utilization by increasing the sampling interval.

Note: Increasing the SUBPROC sample interval (SUBPROC can be used interchangeably with SUBPROCSAMPLEINTERVAL) parameter may decrease the accuracy of application data and process data since short-lived processes (those completing within a sample interval) cannot be captured and hence logged by scopeux.
To set process subintervals to 5 (default), 10, 15, 20, 30, or 60 seconds (these are the only values allowed), enter the SUBPROC or SUBPROCSAMPLEINTERVAL sample interval parameter in your parm file. You cannot input a value lower than 5. For example, to set the interval to 15 seconds, add one of the following lines in your parm file:

  SUBPROC=15
or
  SUBPROCSAMPLEINTERVAL=15

Changes made to the parm file are logged every time the Performance Agent is restarted. To check changes made to the SUBPROC sample interval parameter in your parm file, you can use the following command:

  # utility -xs -D | grep -i sub
  04/23/99 13:04 Process Collection Sample SubInterval 5 seconds -> 5 seconds
  04/23/99 14:31 Process Collection Sample SubInterval 5 seconds -> 15 seconds
  04/23/99 14:43 Process Collection Sample SubInterval 15 seconds -> 30 seconds

Specify the full pathname of the performance tool bin directory as needed. You can also export the GBL_SUBPROCSAMPLEINTERVAL metric from the Configuration data.

GBL_SWAP_SPACE_AVAIL
--------------------
The total amount of potential swap space, in MB.

On HP-UX, this is the sum of the device swap areas enabled by the swapon command, the allocated size of any file system swap areas, and the allocated size of pseudo swap in memory if enabled. Note that this is potential swap space. This is the same as (AVAIL: total) as reported by the “swapinfo -mt” command.

On SunOS, this is the total amount of swap space available from the physical backing store devices (disks) plus the amount currently available from main memory. This is the same as (used + available) / 1024, reported by the “swap -s” command.

On Linux, this is the same as (Swap: total) as reported by the “free -m” command.

On Unix systems, this metric is updated every 30 seconds or the sampling interval, whichever is greater.

GBL_SWAP_SPACE_AVAIL_KB
--------------------
The total amount of potential swap space, in KB.
On HP-UX, this is the sum of the device swap areas enabled by the swapon command, the allocated size of any file system swap areas, and the allocated size of pseudo swap in memory if enabled. Note that this is potential swap space. Since swap is allocated in fixed (SWCHUNK) sizes, not all of this space may actually be usable. For example, on a 61 MB disk using 2 MB swap size allocations, 1 MB remains unusable and is considered wasted space. On HP-UX, this is the same as (AVAIL: total) as reported by the “swapinfo -t” command.

On SunOS, this is the total amount of swap space available from the physical backing store devices (disks) plus the amount currently available from main memory. This is the same as (used + available) / 1024, reported by the “swap -s” command.

On Unix systems, this metric is updated every 30 seconds or the sampling interval, whichever is greater.

GBL_SWAP_SPACE_UTIL
--------------------
The percent of available swap space that was being used by running processes in the interval.

On Windows, this is the percentage of virtual memory, which is available to user processes, that is in use at the end of the interval. It is not an average over the entire interval. It reflects the ratio of committed memory to the current commit limit. The limit may be increased by the operating system if the paging file is extended. This is the same as (Committed Bytes / Commit Limit) * 100 when comparing the results to Performance Monitor.

On HP-UX, swap space must be reserved (but not allocated) before virtual memory can be created. If all of the available swap is reserved, then no new processes or virtual memory can be created. Swap space locations are actually assigned (used) when a page is actually written to disk or locked in memory (pseudo swap in memory). This is the same as (PCT USED: total) as reported by the “swapinfo -mt” command.

On Unix systems, this metric is a measure of capacity rather than performance.
As this metric nears 100 percent, processes are not able to allocate any more memory and new processes may not be able to run. Very low swap utilization values may indicate that too much area has been allocated to swap, and better use of disk space could be made by reallocating some swap partitions to be user filesystems.

On Unix systems, this metric is updated every 30 seconds or the sampling interval, whichever is greater.

GBL_SYSTEM_ID
--------------------
The network node hostname of the system. This is the same as the output from the “uname -n” command. On Windows, this is the name obtained from GetComputerName.

GBL_SYSTEM_UPTIME_HOURS
--------------------
The time, in hours, since the last system reboot.

GBL_SYSTEM_UPTIME_SECONDS
--------------------
The time, in seconds, since the last system reboot.

GBL_THRESHOLD_CPU
--------------------
The percent of CPU that a process must use to become interesting during an interval. The default for this threshold is “5.0”, which means a process must have a value of at least 5.0% for PROC_CPU_TOTAL_UTIL to exceed this threshold.

All threshold values are supplied by the parm file. A process must exceed at least one threshold value in any given interval before it will be considered interesting and be logged.

GBL_THRESHOLD_DISK
--------------------
On HP-UX, this is the rate (IOs/sec) of physical disk IOs that a process must generate to become interesting during an interval.

On Linux, this is the KB rate of physical disk IOs that the system must generate to become interesting during an interval.

On the other Unix systems, this is the rate of either block disk IOs or major faults that a process must generate to become interesting during an interval.

The default values and corresponding metric for this threshold are noted below. In order to exceed this threshold, the metric noted must match or exceed the value shown.
  HP-UX  5.0   for PROC_DISK_PHYS_IO_RATE for the given process
  SunOS  5.0   for PROC_DISK_BLOCK_IO_RATE for the given process
  AIX    5.0   for PROC_DISK_BLOCK_IO_RATE for the given process
  OSF1   2.0   for PROC_IO_BYTE_RATE for the given process
  Linux  15.0  for GBL_DISK_PHYS_BYTE_RATE

All threshold values are supplied by the parm file. A process must exceed at least one threshold value in any given interval before it will be considered interesting and be logged.

GBL_THRESHOLD_NOKILLED
--------------------
This is a flag specifying that terminating processes are not interesting. The flag is set by the THRESHOLD NOKILLED statement in the parm file. If this flag is set, then the process will be logged only if it exceeds at least one of the thresholds. The default (blank) is for the flag to be turned off, which means a terminating process will be logged in the interval it exits even if it did not exceed any thresholds during that interval. This is so that the death of a process is recorded even if it does not exceed any of the thresholds.

On HP-UX, an exception to this is short-lived processes that are alive for less than one second. By default, short-lived processes are not considered interesting. However, there is a flag (THRESHOLD_SHORTLIVED) to turn on the logging of short-lived processes.

GBL_THRESHOLD_NONEW
--------------------
This is a flag specifying that newly created processes are not interesting. The flag is set by the THRESHOLD NONEW statement in the parm file. If this flag is set, then the process will be logged only if it exceeds at least one of the thresholds. The default (blank) is for the flag to be turned off, which means a new process will be logged in the interval it was created even if it did not exceed any thresholds during that interval. This is so that the existence of a process is recorded even if it does not exceed any of the thresholds.

On HP-UX, an exception to this is short-lived processes that are alive for less than one second.
By default, short-lived processes are not considered interesting. However, there is a flag (THRESHOLD_SHORTLIVED) to turn on the logging of short-lived processes.

GBL_THRESHOLD_PROCMEM
--------------------
The virtual memory in MB that a process must use to become interesting during an interval. The default for this threshold is 500 MB and is compared with the value of the PROC_MEM_VIRT metric.

All threshold values are supplied by the parm file. A process must exceed at least one threshold value in any given interval before it will be considered interesting and be logged.

GBL_TT_OVERFLOW_COUNT
--------------------
The number of new transactions that could not be measured because the Measurement Processing Daemon's (midaemon) Measurement Performance Database is full. If this happens, the default Measurement Performance Database size is not large enough to hold all of the registered transactions on this system. This can be remedied by stopping and restarting the midaemon process using the -smdvss option to specify a larger Measurement Performance Database size. The current Measurement Performance Database size can be checked using the midaemon -sizes option.

INTERVAL
--------------------
The number of seconds in the measurement interval. For the process data class, this is the number of seconds the process was alive during the interval.

PROC_APP_ID
--------------------
The ID number of the application to which the process (or kernel thread, if HP-UX) belonged during the interval. Application “other” always has an ID of 1. There can be up to 128 user-defined applications, which are defined in the parm file.

PROC_CPU_SYS_MODE_TIME
--------------------
The CPU time in system mode in the context of the process (or kernel thread, if HP-UX) during the interval.

A process operates in either system mode (also called kernel mode on Unix or privileged mode on Windows) or user mode.
When a process requests services from the operating system with a system call, it switches into the machine's privileged protection mode and runs in system mode.

On a threaded operating system, such as HP-UX 11.0 and beyond, process usage of a resource is calculated by summing the usage of that resource by its kernel threads. If this metric is reported for a kernel thread, the value is the resource usage by that single kernel thread. If this metric is reported for a process, the value is the sum of the resource usage by all of its kernel threads. Alive kernel threads and kernel threads that have died during the interval are included in the summation.

PROC_CPU_SYS_MODE_UTIL
--------------------
The percentage of time that the CPU was in system mode in the context of the process (or kernel thread, if HP-UX) during the interval.

A process operates in either system mode (also called kernel mode on Unix or privileged mode on Windows) or user mode. When a process requests services from the operating system with a system call, it switches into the machine's privileged protection mode and runs in system mode.

Unlike the global and application CPU metrics, process CPU is not averaged over the number of processors on systems with multiple CPUs. Single-threaded processes can use only one CPU at a time and never exceed 100% CPU utilization.

High system mode CPU utilizations are normal for IO intensive programs. Abnormally high system CPU utilization can indicate that a hardware problem is causing a high interrupt rate. It can also indicate programs that are not using system calls efficiently. A classic “hung shell” shows up with very high system mode CPU because it gets stuck in a loop doing terminal reads (a system call) to a device that never responds.

On a threaded operating system, such as HP-UX 11.0 and beyond, process usage of a resource is calculated by summing the usage of that resource by its kernel threads.
If this metric is reported for a kernel thread, the value is the resource usage by that single kernel thread. If this metric is reported for a process, the value is the sum of the resource usage by all of its kernel threads. Alive kernel threads and kernel threads that have died during the interval are included in the summation.

On multi-processor HP-UX systems, processes which have component kernel threads executing simultaneously on different processors could have resource utilization sums over 100%. The maximum percentage is 100% times the number of CPUs online.

PROC_CPU_TOTAL_TIME
--------------------
The total CPU time, in seconds, consumed by a process (or kernel thread, if HP-UX) during the interval.

Unlike the global and application CPU metrics, process CPU is not averaged over the number of processors on systems with multiple CPUs. Single-threaded processes can use only one CPU at a time and never exceed 100% CPU utilization.

On HP-UX, the total CPU time is the sum of the CPU time components for a process or kernel thread, including system, user, context switch, interrupts processing, realtime, and nice utilization values.

On a threaded operating system, such as HP-UX 11.0 and beyond, process usage of a resource is calculated by summing the usage of that resource by its kernel threads. If this metric is reported for a kernel thread, the value is the resource usage by that single kernel thread. If this metric is reported for a process, the value is the sum of the resource usage by all of its kernel threads. Alive kernel threads and kernel threads that have died during the interval are included in the summation.

On multi-processor HP-UX systems, processes which have component kernel threads executing simultaneously on different processors could have resource utilization sums over 100%. The maximum percentage is 100% times the number of CPUs online.
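The relationship between PROC_CPU_TOTAL_TIME and PROC_CPU_TOTAL_UTIL can be checked with simple arithmetic. The following Python fragment is a sketch only; the sample values are hypothetical and OVPA derives these metrics internally:

```python
# Hypothetical sample values, for illustration only.
proc_cpu_total_time = 3.0   # seconds of CPU consumed during the interval
interval = 60.0             # length of the measurement interval, in seconds

# Process CPU is not averaged over the number of processors, so a
# single-threaded process can never exceed 100 percent.
proc_cpu_total_util = 100.0 * proc_cpu_total_time / interval
print(proc_cpu_total_util)  # 5.0

# On multi-processor HP-UX, kernel threads of one process running
# simultaneously on different CPUs can push the sum above 100 percent,
# bounded by 100 percent times the number of CPUs online.
num_cpus_online = 8
assert proc_cpu_total_util <= 100.0 * num_cpus_online
```

With these sample numbers the process would just meet the default GBL_THRESHOLD_CPU value of 5.0 and be logged as interesting.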
PROC_CPU_TOTAL_TIME_CUM
--------------------
The total CPU time consumed by a process (or kernel thread, if HP-UX) over the cumulative collection time. CPU time is in seconds unless otherwise specified.

The cumulative collection time is defined from the point in time when either: a) the process (or kernel thread, if HP-UX) was first started, or b) the performance tool was first started, or c) the cumulative counters were reset (relevant only to GlancePlus, if available for the given platform), whichever occurred last.

This is calculated as:

  PROC_CPU_TOTAL_TIME_CUM = PROC_CPU_SYS_MODE_TIME_CUM + PROC_CPU_USER_MODE_TIME_CUM

On a threaded operating system, such as HP-UX 11.0 and beyond, process usage of a resource is calculated by summing the usage of that resource by its kernel threads. If this metric is reported for a kernel thread, the value is the resource usage by that single kernel thread. If this metric is reported for a process, the value is the sum of the resource usage by all of its kernel threads. Alive kernel threads and kernel threads that have died during the interval are included in the summation.

PROC_CPU_TOTAL_UTIL
--------------------
The total CPU time consumed by a process (or kernel thread, if HP-UX) as a percentage of the total CPU time available during the interval.

Unlike the global and application CPU metrics, process CPU is not averaged over the number of processors on systems with multiple CPUs. Single-threaded processes can use only one CPU at a time and never exceed 100% CPU utilization.

On HP-UX, the total CPU utilization is the sum of the CPU utilization components for a process or kernel thread, including system, user, context switch, interrupts processing, realtime, and nice utilization values.

On a threaded operating system, such as HP-UX 11.0 and beyond, process usage of a resource is calculated by summing the usage of that resource by its kernel threads.
If this metric is reported for a kernel thread, the value is the resource usage by that single kernel thread. If this metric is reported for a process, the value is the sum of the resource usage by all of its kernel threads. Alive kernel threads and kernel threads that have died during the interval are included in the summation.

On multi-processor HP-UX systems, processes which have component kernel threads executing simultaneously on different processors could have resource utilization sums over 100%. The maximum percentage is 100% times the number of CPUs online.

PROC_CPU_TOTAL_UTIL_CUM
--------------------
The total CPU time consumed by a process (or kernel thread, if HP-UX) as a percentage of the total CPU time available over the cumulative collection time.

The cumulative collection time is defined from the point in time when either: a) the process (or kernel thread, if HP-UX) was first started, or b) the performance tool was first started, or c) the cumulative counters were reset (relevant only to GlancePlus, if available for the given platform), whichever occurred last.

Unlike the global and application CPU metrics, process CPU is not averaged over the number of processors on systems with multiple CPUs. Single-threaded processes can use only one CPU at a time and never exceed 100% CPU utilization.

On HP-UX, the total CPU utilization is the sum of the CPU utilization components for a process or kernel thread, including system, user, context switch, interrupts processing, realtime, and nice utilization values.

On a threaded operating system, such as HP-UX 11.0 and beyond, process usage of a resource is calculated by summing the usage of that resource by its kernel threads. If this metric is reported for a kernel thread, the value is the resource usage by that single kernel thread. If this metric is reported for a process, the value is the sum of the resource usage by all of its kernel threads.
Alive kernel threads and kernel threads that have died during the interval are included in the summation.

On multi-processor HP-UX systems, processes which have component kernel threads executing simultaneously on different processors could have resource utilization sums over 100%. The maximum percentage is 100% times the number of CPUs online.

PROC_CPU_USER_MODE_TIME
--------------------
The time, in seconds, the process (or kernel thread, if HP-UX) was using the CPU in user mode during the interval. User CPU is the time spent in user mode at a normal priority, at real-time priority (on HP-UX, AIX, and Windows systems), and at a nice priority.

On a threaded operating system, such as HP-UX 11.0 and beyond, process usage of a resource is calculated by summing the usage of that resource by its kernel threads. If this metric is reported for a kernel thread, the value is the resource usage by that single kernel thread. If this metric is reported for a process, the value is the sum of the resource usage by all of its kernel threads. Alive kernel threads and kernel threads that have died during the interval are included in the summation.

PROC_CPU_USER_MODE_UTIL
--------------------
The percentage of time the process (or kernel thread, if HP-UX) was using the CPU in user mode during the interval. User CPU is the time spent in user mode at a normal priority, at real-time priority (on HP-UX, AIX, and Windows systems), and at a nice priority.

Unlike the global and application CPU metrics, process CPU is not averaged over the number of processors on systems with multiple CPUs. Single-threaded processes can use only one CPU at a time and never exceed 100% CPU utilization.

On a threaded operating system, such as HP-UX 11.0 and beyond, process usage of a resource is calculated by summing the usage of that resource by its kernel threads. If this metric is reported for a kernel thread, the value is the resource usage by that single kernel thread.
If this metric is reported for a process, the value is the sum of the resource usage by all of its kernel threads. Alive kernel threads and kernel threads that have died during the interval are included in the summation.

On multi-processor HP-UX systems, processes which have component kernel threads executing simultaneously on different processors could have resource utilization sums over 100%. The maximum percentage is 100% times the number of CPUs online.

PROC_GROUP_ID
--------------------
On most systems, this is the real group ID number of the process. On AIX, this is the effective group ID number of the process. On HP-UX, this is the effective group ID number of the process if not in setgid mode.

On HP-UX, this metric is specific to a process. If this metric is reported for a kernel thread, the value for its associated process is given.

PROC_INTEREST
--------------------
A field of flags indicating why the process was considered interesting enough to be logged. Scope determines the interest reason by comparing the activity of the process to the threshold criteria set in the parm file. New and Killed processes are treated differently: no matter what the NONEW and NOKILLED options are set to, you may see an N or K flag if the process was interesting for another reason.

This field consists of 12 independent columns. Each column contains a blank or a character representing a process INTEREST code as shown below.

  Position  Char  Meaning
  1         N     New Process
  2         K     Killed (terminated) process
  3         C     CPU percentage used exceeded threshold

PROC_INTERVAL_ALIVE
--------------------
The number of seconds that the process (or kernel thread, if HP-UX) was alive during the interval. This may be less than the time of the interval if the process (or kernel thread, if HP-UX) was new or died during the interval.

PROC_MAJOR_FAULT
--------------------
Number of major page faults for this process (or kernel thread, if HP-UX) during the interval.
On HP-UX, major page faults and minor page faults are a subset of vfaults (virtual faults). Stack and heap accesses can cause vfaults, but do not result in a disk page having to be loaded into memory.

PROC_MEM_RES
--------------------
The size (in KB) of resident memory allocated for the process.

On HP-UX, the calculation of this metric differs depending on whether this process has used any CPU time since the midaemon process was started. This metric is less accurate and does not include shared memory regions in its calculation when the process has been idle since the midaemon was started.

On HP-UX, for processes that use CPU time subsequent to midaemon startup, the resident memory is calculated as:

  RSS = sum of private region pages + (sum of shared region pages / number of references)

The number of references is a count of the number of attachments to the memory region. Attachments, for shared regions, may come from several processes sharing the same memory, a single process with multiple attachments, or combinations of these.

This value is only updated when a process uses CPU. Thus, under memory pressure, this value may be higher than the actual amount of resident memory for processes which are idle because their memory pages may no longer be resident or the reference count for shared segments may have changed.

On HP-UX, this metric is specific to a process. If this metric is reported for a kernel thread, the value for its associated process is given.

A value of “na” is displayed when this information is unobtainable. This information may not be obtainable for some system (kernel) processes. It may also not be available for processes.

On AIX, this is the same as the RSS value shown by “ps v”.

On Windows, this is the number of KBs in the working set of this process. The working set includes the memory pages touched recently by the threads of the process.
If free memory in the system is above a threshold, then pages are left in the working set even if they are not in use. When free memory falls below a threshold, pages are trimmed from the working set, but not necessarily paged out to disk from memory. If those pages are subsequently referenced, they will be page faulted back into the working set. Therefore, the working set is a general indicator of the memory resident set size of this process, but it will vary depending on the overall status of memory on the system. Note that the size of the working set is often larger than the amount of pagefile space consumed (PROC_MEM_VIRT).

PROC_MEM_VIRT
--------------------
The size (in KB) of virtual memory allocated for the process.

On HP-UX, this consists of the sum of the virtual set size of all private memory regions used by this process, plus this process' share of memory regions which are shared by multiple processes. For processes that use CPU time, the value is divided by the reference count for those regions which are shared.

On HP-UX, this metric is less accurate and does not reflect the reference count for shared regions for processes that were started prior to the midaemon process and have not used any CPU time since the midaemon was started.

On HP-UX, this metric is specific to a process. If this metric is reported for a kernel thread, the value for its associated process is given.

On all other Unix systems, this consists of private text, private data, private stack and shared memory. The reference count for shared memory is not taken into account, so the value of this metric represents the total virtual size of all regions regardless of the number of processes sharing access. Note also that lazy swap algorithms, sparse address space malloc calls, and memory-mapped file access can result in large VSS values.
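On Linux, one of the Unix systems covered above, the gap between a process's virtual size and its resident set can be observed directly in /proc. The following sketch is illustrative only and is not how OVPA collects PROC_MEM_VIRT or PROC_MEM_RES:

```python
# Illustrative only: compare this process's virtual size (VmSize,
# roughly what PROC_MEM_VIRT reports) with its resident set size
# (VmRSS, roughly PROC_MEM_RES), as exposed by Linux in /proc.
def status_kb(field):
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith(field + ":"):
                return int(line.split()[1])  # values are reported in kB
    return None

vss = status_kb("VmSize")
rss = status_kb("VmRSS")
print("virtual:", vss, "kB  resident:", rss, "kB")

# Only part of the address space is resident at any moment, so the
# virtual size is at least as large as the resident set.
assert vss >= rss
```

The difference between the two numbers is exactly the effect described above: memory-mapped files, lazily allocated heap, and shared regions inflate the virtual size without occupying physical memory.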
On systems that provide Glance memory regions detail reports, the drilldown detail per memory region is useful to understand the nature of memory allocations for the process.

A value of “na” is displayed when this information is unobtainable. This information may not be obtainable for some system (kernel) processes. It may also not be available for processes.

On Windows, this is the number of KBs the process has used in the paging file(s). Paging files are used to store pages of memory used by the process, such as local data, that are not contained in other files. Examples of memory pages which are contained in other files include pages storing a program's .EXE and .DLL files. These would not be kept in pagefile space. Thus, often programs will have a memory working set size (PROC_MEM_RES) larger than the size of its pagefile space.

PROC_MINOR_FAULT
--------------------
Number of minor page faults for this process (or kernel thread, if HP-UX) during the interval. On HP-UX, major page faults and minor page faults are a subset of vfaults (virtual faults). Stack and heap accesses can cause vfaults, but do not result in a disk page having to be loaded into memory.

PROC_PAGEFAULT
--------------------
The number of page faults that occurred during the interval for the process.

PROC_PAGEFAULT_RATE
--------------------
The number of page faults per second that occurred during the interval for the process.

PROC_PARENT_PROC_ID
--------------------
The parent process' PID number. On HP-UX, this metric is specific to a process. If this metric is reported for a kernel thread, the value for its associated process is given.

PROC_PRI
--------------------
On Unix systems, this is the dispatch priority of a process (or kernel thread, if HP-UX) at the end of the interval. The lower the value, the more likely the process is to be dispatched. On Windows, this is the current base priority of this process.
On HP-UX, whenever the priority is changed for the selected process or kernel thread, the new value will not be reflected until the process or kernel thread is reactivated if it is currently idle (for example, SLEEPing).

On HP-UX, the lower the value, the more likely the process or kernel thread is to be dispatched. Values between zero and 127 are considered to be “real-time” priorities, which the kernel does not adjust. Values above 127 are normal priorities and are modified by the kernel for load balancing. Some special priorities are used in the HP-UX kernel and subsystems for different activities. These values are described in /usr/include/sys/param.h. Priorities less than PZERO (153) are not signalable.

Note that on HP-UX, many network-related programs such as inetd, biod, and rlogind run at priority 154, which is PPIPE. Just because they run at this priority does not mean they are using pipes. By examining the open files, you can determine if a process or kernel thread is using pipes.

For HP-UX 10.0 and later releases, priorities between -32 and -1 can be seen for processes or kernel threads using the Posix Real-time Schedulers. When specifying a Posix priority, the value entered must be in the range from 0 through 31, which the system then remaps to a negative number in the range of -1 through -32. Refer to the rtsched man pages for more information.

On a threaded operating system, such as HP-UX 11.0 and beyond, this metric represents a kernel thread characteristic. If this metric is reported for a process, the value for its last executing kernel thread is given. For example, if a process has multiple kernel threads and kernel thread one is the last to execute during the interval, the metric value for kernel thread one is assigned to the process.

On AIX, values for priority range from 0 to 127. Processes running at priorities less than PZERO (40) are not signalable.

On Windows, the higher the value, the more likely the process or thread is to be dispatched.
Values for priority range from 0 to 31. Values of 16 and above are considered to be “realtime” priorities. Threads within a process can raise and lower their own base priorities relative to the process's base priority.

On Sun systems, this metric is only available on 4.1.X.

PROC_PROC_ARGV1
--------------------
The first argument (argv[1]) of the process argument list, or the second word of the command line, if present. The OV Performance Agent logs the first 32 characters of this metric. For releases that support the parm file javaarg flag, this metric may not be the first argument. When javaarg=true, the value of this metric is replaced (for java processes only) by the java class or jar name. This can then be useful to construct parm file java application definitions using the argv1= keyword.

PROC_PROC_ID
--------------------
The process ID number (or PID) that is used by the kernel to uniquely identify this process. Process numbers are reused, so they only identify a process for its lifetime.

On HP-UX, this metric is specific to a process. If this metric is reported for a kernel thread, the value for its associated process is given.

PROC_PROC_NAME
--------------------
The process program name. It is limited to 16 characters. On Unix systems, this is derived from the first parameter to the exec(2) system call.

On HP-UX, this metric is specific to a process. If this metric is reported for a kernel thread, the value for its associated process is given.

On Windows, the “System Idle Process” is not reported by OVPA, since Idle is a process that runs to occupy the processors when they are not executing other threads. Idle has one thread per processor.

PROC_RUN_TIME
--------------------
The elapsed time since a process (or kernel thread, if HP-UX) started, in seconds. This metric is less than the interval time if the process (or kernel thread, if HP-UX) was not alive during the entire first or last interval.
On a threaded operating system such as HP-UX 11.0 and beyond, this metric is available for a process or kernel thread. PROC_STOP_REASON -------------------- A text string describing what caused the process (or kernel thread, if HP-UX) to stop executing. For example, if the process is waiting for a CPU while higher priority processes are executing, then its block reason is PRI. A complete list of block reasons follows:

String   Reason for Process Block
------------------------------------
died     Process terminated during the interval.
new      Process was created (via the exec() system call) during the interval.
NONE     Process is ready to run. It is not apparent that the process is blocked.
OTHER    Waiting for a reason not decipherable by the measurement software.
PRI      Process is on the run queue.
SLEEP    Waiting for an event to complete.
TRACE    Received a signal to stop because parent is tracing this process.
ZOMB     Process has terminated and the parent is not waiting.

PROC_THREAD_COUNT -------------------- The total number of kernel threads for the current process. On Linux systems, every thread has its own process ID, so this metric will always be 1. On Solaris systems, this metric reflects the total number of Light Weight Processes (LWPs) associated with the process. PROC_TTY -------------------- The controlling terminal for a process. This field is blank if there is no controlling terminal. On HP-UX, Linux, and AIX, this is the same as the “TTY” field of the ps command. On all other Unix systems, the controlling terminal name is found by searching the directories provided in the /etc/ttysrch file. See man page ttysrch(4) for details. The matching criteria field (“M”, “F” or “I” values) of the ttysrch file is ignored. If a terminal is not found in one of the ttysrch file directories, the following directories are searched in the order listed here: “/dev”, “/dev/pts”, “/dev/term” and “/dev/xt”. 
When a match is found in one of the “/dev” subdirectories, “/dev/” is not displayed as part of the terminal name. If no match is found in the directory searches, the major and minor numbers of the controlling terminal are displayed. In most cases, this value is the same as the “TTY” field of the ps command. On HP-UX, this metric is specific to a process. If this metric is reported for a kernel thread, the value for its associated process is given. PROC_USER_NAME -------------------- On Unix systems, this is the login account of a process (from /etc/passwd). If more than one account is listed in /etc/passwd with the same user ID (uid) field, the first one is used. If an account cannot be found that matches the uid field, then the uid number is returned. This would occur if the account was removed after a process was started. On Windows, this is the process owner account name, without the name of the domain in which the account resides. On HP-UX, this metric is specific to a process. If this metric is reported for a kernel thread, the value for its associated process is given. RECORD_TYPE -------------------- ASCII string that identifies the record. Possibilities include:

GLOB  for global 5 minute detail
GSUM  for global hourly summary
APPL  for application 5 minute detail
ASUM  for application hourly summary
CONF  for configuration
TRAN  for transaction tracker detail
TSUM  for transaction tracker summary

Except for Windows Desktop, this also includes:

PROC  for process 1 minute detail
DISK  for disk device 5 minute detail
DSUM  for disk device summary

On HP-UX, this also includes:

VOLS  for logical volume disk detail
VSUM  for logical volume disk summary

TBL_FILE_LOCK_USED -------------------- The number of file or record locks currently in use. One file can have multiple locks. Files and/or records are locked by calls to lockf(2). On Unix systems, this metric is updated every 30 seconds or the sampling interval, whichever is greater. 
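The PROC_USER_NAME lookup described above (use the first /etc/passwd entry matching the uid; fall back to the numeric uid when no entry exists) can be sketched with the standard getpwuid(3C) call. This is an illustrative sketch only, not OVPA's actual implementation; the helper name is hypothetical.

```c
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>

/* Resolve a uid to a login name as PROC_USER_NAME is described:
 * first matching /etc/passwd entry, or the numeric uid when the
 * account cannot be found (for example, removed after the process
 * started). Hypothetical helper, not part of OVPA. */
const char *user_name_for_uid(uid_t uid, char *buf, size_t buflen)
{
    struct passwd *pw = getpwuid(uid);   /* first matching entry */
    if (pw != NULL)
        snprintf(buf, buflen, "%s", pw->pw_name);
    else
        snprintf(buf, buflen, "%lu", (unsigned long)uid);
    return buf;
}
```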
TBL_FILE_TABLE_AVAIL -------------------- The number of entries in the file table. On HP-UX and AIX, this is the configured maximum number of file table entries used by the kernel to manage open file descriptors. On HP-UX, this is the sum of the “nfile” and “file_pad” values used in kernel generation. On AAN, this is the number of entries in the file cache. This is a size; not all entries are always in use. The cache size is dynamic. Entries in this cache are used to manage open file descriptors. They are reused as files are closed and new ones are opened. The size of the cache will go up or down in chunks as more or less space is required in the cache. On AIX, file table entries are dynamically allocated by the kernel if there is no entry available. These entries are allocated in chunks. TBL_FILE_TABLE_UTIL -------------------- The percentage of file table entries currently used by file descriptors. On Unix systems, this metric is updated every 30 seconds or the sampling interval, whichever is greater. TBL_INODE_CACHE_AVAIL -------------------- On HP-UX, this is the configured total number of entries for the incore inode tables on the system. For HP-UX releases prior to 11.2x, this value reflects only the HFS inode table. For subsequent HP-UX releases, this value is the sum of inode tables for both HFS and VxFS file systems (ninode plus vxfs_ninode). On HP-UX, file system directory activity is done through inodes that are stored on disk. The kernel keeps a memory cache of active and recently accessed inodes to reduce disk IOs. When a file is opened through a pathname, the kernel converts the pathname to an inode number and attempts to obtain the inode information from the cache based on the filesystem type. If the inode entry is not in the cache, the inode is read from disk into the inode cache. On HP-UX, the number of used entries in the inode caches is usually at or near capacity. 
This does not necessarily indicate that the configured sizes are too small, because the tables may contain recently used inodes and inodes referenced by entries in the directory name lookup cache. When a new inode cache entry is required and a free entry does not exist, inactive entries referenced by the directory name cache are used. If freeing the inode entries referenced only by the directory name cache does not create enough free space, the message “inode: table is full” may appear on the console. If this occurs, increase the size of the kernel parameter, ninode. Low directory name cache hit ratios may also indicate an underconfigured inode cache. On HP-UX, the default formula for the ninode size is: ninode = ((nproc+16+maxusers)+32+ (2*npty)+(4*num_clients)) On all other Unix systems, this is the number of entries in the inode cache. This is a size; not all entries are always in use. The cache size is dynamic. Entries in this cache are reused as files are closed and new ones are opened. The size of the cache will go up or down in chunks as more or less space is required in the cache. Inodes are used to store information about files within the file system. Every file has at least two inodes associated with it (one for the directory and one for the file itself). The information stored in an inode includes the owners, timestamps, size, and an array of indices used to translate logical block numbers to physical sector numbers. There is a separate inode maintained for every view of a file, so if two processes have the same file open, they both use the same directory inode, but separate inodes for the file. TBL_INODE_CACHE_USED -------------------- The number of inode cache entries currently in use. On HP-UX, this is the number of “non-free” inodes currently used. Since the inode table contains recently closed inodes as well as open inodes, the table often appears to be fully utilized. 
When a new entry is needed, one can usually be found by reusing one of the recently closed inode entries. On HP-UX, file system directory activity is done through inodes that are stored on disk. The kernel keeps a memory cache of active and recently accessed inodes to reduce disk IOs. When a file is opened through a pathname, the kernel converts the pathname to an inode number and attempts to obtain the inode information from the cache based on the filesystem type. If the inode entry is not in the cache, the inode is read from disk into the inode cache. On HP-UX, the number of used entries in the inode caches is usually at or near capacity. This does not necessarily indicate that the configured sizes are too small, because the tables may contain recently used inodes and inodes referenced by entries in the directory name lookup cache. When a new inode cache entry is required and a free entry does not exist, inactive entries referenced by the directory name cache are used. If freeing the inode entries referenced only by the directory name cache does not create enough free space, the message “inode: table is full” may appear on the console. If this occurs, increase the size of the kernel parameter, ninode. Low directory name cache hit ratios may also indicate an underconfigured inode cache. On HP-UX, the default formula for the ninode size is: ninode = ((nproc+16+maxusers)+32+ (2*npty)+(4*num_clients)) On Unix systems, this metric is updated every 30 seconds or the sampling interval, whichever is greater. TBL_MSG_TABLE_AVAIL -------------------- The configured maximum number of message queues that can be allocated on the system. A message queue is allocated by a program using the msgget(2) call. Refer to the ipcs(1) man page for more information. TBL_MSG_TABLE_USED -------------------- On HP-UX, this is the number of message queues currently in use. On all other Unix systems, this is the number of message queues that have been built. 
A message queue is allocated by a program using the msgget(2) call. See ipcs(1) to list the message queues. On Unix systems, this metric is updated every 30 seconds or the sampling interval, whichever is greater. TBL_MSG_TABLE_UTIL -------------------- The percentage of configured message queues currently in use. On Unix systems, this metric is updated every 30 seconds or the sampling interval, whichever is greater. TBL_SEM_TABLE_AVAIL -------------------- The configured number of semaphore identifiers (sets) that can be allocated on the system. TBL_SEM_TABLE_USED -------------------- On HP-UX, this is the number of semaphore identifiers currently in use. On all other Unix systems, this is the number of semaphore identifiers that have been built. A semaphore identifier is allocated by a program using the semget(2) call. See ipcs(1) to list semaphores. On Unix systems, this metric is updated every 30 seconds or the sampling interval, whichever is greater. TBL_SEM_TABLE_UTIL -------------------- The percentage of configured semaphore identifiers currently in use. On Unix systems, this metric is updated every 30 seconds or the sampling interval, whichever is greater. TBL_SHMEM_ACTIVE -------------------- The size (in KBs unless otherwise specified) of the shared memory segments that have running processes attached to them. This may be less than the amount of shared memory used on the system because a shared memory segment may exist and not have any process attached to it. On Unix systems, this metric is updated every 30 seconds or the sampling interval, whichever is greater. TBL_SHMEM_TABLE_AVAIL -------------------- The configured number of shared memory segments that can be allocated on the system. TBL_SHMEM_TABLE_USED -------------------- On HP-UX, this is the number of shared memory segments currently in use. On all other Unix systems, this is the number of shared memory segments that have been built. 
This includes shared memory segments with no processes attached to them. A shared memory segment is allocated by a program using the shmget(2) call. Also refer to ipcs(1). On Unix systems, this metric is updated every 30 seconds or the sampling interval, whichever is greater. TBL_SHMEM_TABLE_UTIL -------------------- The percentage of configured shared memory segments currently in use. On Unix systems, this metric is updated every 30 seconds or the sampling interval, whichever is greater. TBL_SHMEM_USED -------------------- The size (in KBs unless otherwise specified) of the shared memory segments. This includes memory segments to which no processes are attached. If a shared memory segment has zero attachments, the space may not always be allocated in memory. See ipcs(1) to list shared memory segments. On Unix systems, this metric is updated every 30 seconds or the sampling interval, whichever is greater. TIME -------------------- The local time of day for the start of the interval. The time is an ASCII field in hh:mm 24-hour format. This field will always contain 5 characters in ASCII files. The two subfields (hh, mm) will contain a leading zero if the value is less than 10. This metric is extracted from GBL_STATTIME, which is obtained using the time() system call at the start of the interval. This field responds to language localization. In binary files, this field contains four byte-sized subfields. The most significant byte contains the hour, the next most significant byte contains the minute, then the seconds, and finally the tenths of a second. The left two bytes can be isolated by dividing by 65536. HHMM = TIME/65536. Then HOUR = HHMM/256 and MINUTE = HHMM mod 256. TTBIN_TRANS_COUNT_1 -------------------- The number of completed transactions in this range during the last interval. On AAN systems, this metric is only available on 5.X or later. 
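The binary TIME subfield layout described above can be decoded with exactly the arithmetic given (HHMM = TIME/65536, HOUR = HHMM/256, MINUTE = HHMM mod 256). A minimal sketch, with a hypothetical helper name:

```c
/* Decode the 4-byte binary TIME field: most significant byte is
 * the hour, then minute, second, and tenths of a second. */
typedef struct { unsigned hour, minute, second, tenth; } hp_time;

hp_time decode_time(unsigned long t)
{
    hp_time r;
    unsigned long hhmm = t / 65536;        /* isolate the two high bytes */
    r.hour   = (unsigned)(hhmm / 256);
    r.minute = (unsigned)(hhmm % 256);
    r.second = (unsigned)((t / 256) % 256);
    r.tenth  = (unsigned)(t % 256);
    return r;
}
```

For example, a packed value for 13:45:30.5 decodes back to hour 13, minute 45, second 30, tenth 5.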
TTBIN_TRANS_COUNT_10 -------------------- The number of completed transactions in this range during the last interval. On AAN systems, this metric is only available on 5.X or later. TTBIN_TRANS_COUNT_2 -------------------- The number of completed transactions in this range during the last interval. On AAN systems, this metric is only available on 5.X or later. TTBIN_TRANS_COUNT_3 -------------------- The number of completed transactions in this range during the last interval. On AAN systems, this metric is only available on 5.X or later. TTBIN_TRANS_COUNT_4 -------------------- The number of completed transactions in this range during the last interval. On AAN systems, this metric is only available on 5.X or later. TTBIN_TRANS_COUNT_5 -------------------- The number of completed transactions in this range during the last interval. On AAN systems, this metric is only available on 5.X or later. TTBIN_TRANS_COUNT_6 -------------------- The number of completed transactions in this range during the last interval. On AAN systems, this metric is only available on 5.X or later. TTBIN_TRANS_COUNT_7 -------------------- The number of completed transactions in this range during the last interval. On AAN systems, this metric is only available on 5.X or later. TTBIN_TRANS_COUNT_8 -------------------- The number of completed transactions in this range during the last interval. On AAN systems, this metric is only available on 5.X or later. TTBIN_TRANS_COUNT_9 -------------------- The number of completed transactions in this range during the last interval. On AAN systems, this metric is only available on 5.X or later. TTBIN_UPPER_RANGE_1 -------------------- The upper range (transaction time) for this bin. On AAN systems, this metric is only available on 5.X or later. TTBIN_UPPER_RANGE_10 -------------------- The upper range (transaction time) for this bin. On AAN systems, this metric is only available on 5.X or later. 
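The TTBIN_UPPER_RANGE_n and TTBIN_TRANS_COUNT_n metrics above work together: each completed transaction is counted in the bin whose upper range (transaction time) it does not exceed. The sketch below illustrates that relationship with made-up example ranges; it is not OVPA's actual binning code, and the handling of times beyond the last range is an assumption.

```c
#include <stddef.h>

#define NUM_BINS 10   /* matches TTBIN_*_1 through TTBIN_*_10 */

/* Count one completed transaction into the first bin whose upper
 * range is not exceeded; times beyond the last range are assumed
 * to fall into the last bin. Illustrative only. */
void count_into_bin(double tran_secs,
                    const double upper_range[NUM_BINS],
                    unsigned long trans_count[NUM_BINS])
{
    size_t i;
    for (i = 0; i < NUM_BINS; i++) {
        if (tran_secs <= upper_range[i]) {
            trans_count[i]++;            /* e.g. TTBIN_TRANS_COUNT_3 */
            return;
        }
    }
    trans_count[NUM_BINS - 1]++;         /* beyond the last range */
}
```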
TTBIN_UPPER_RANGE_2 -------------------- The upper range (transaction time) for this bin. On AAN systems, this metric is only available on 5.X or later. TTBIN_UPPER_RANGE_3 -------------------- The upper range (transaction time) for this bin. On AAN systems, this metric is only available on 5.X or later. TTBIN_UPPER_RANGE_4 -------------------- The upper range (transaction time) for this bin. On AAN systems, this metric is only available on 5.X or later. TTBIN_UPPER_RANGE_5 -------------------- The upper range (transaction time) for this bin. On AAN systems, this metric is only available on 5.X or later. TTBIN_UPPER_RANGE_6 -------------------- The upper range (transaction time) for this bin. On AAN systems, this metric is only available on 5.X or later. TTBIN_UPPER_RANGE_7 -------------------- The upper range (transaction time) for this bin. On AAN systems, this metric is only available on 5.X or later. TTBIN_UPPER_RANGE_8 -------------------- The upper range (transaction time) for this bin. On AAN systems, this metric is only available on 5.X or later. TTBIN_UPPER_RANGE_9 -------------------- The upper range (transaction time) for this bin. On AAN systems, this metric is only available on 5.X or later. TT_ABORT -------------------- The number of aborted transactions during the last interval for this transaction. TT_ABORT_WALL_TIME_PER_TRAN -------------------- The average time, in seconds, per aborted transaction during the last interval. On AAN systems, this metric is only available on 5.X or later. TT_APP_NAME -------------------- The registered ARM Application name. TT_APP_TRAN_NAME -------------------- A concatenation of TT_APP_NAME and TT_NAME. This provides a way to uniquely identify a specific transaction. The field is limited to 60 characters. TT_CLIENT_ADDRESS -------------------- The correlator address. This is the address where the child transaction originated. TT_CLIENT_ADDRESS_FORMAT -------------------- The correlator address format. 
This shows the protocol family for the client network address. Refer to the ARM API Guide for the list and description of supported address formats. TT_CLIENT_TRAN_ID -------------------- A numerical ID that uniquely identifies the transaction class in this correlator. TT_COUNT -------------------- The number of completed transactions during the last interval for this transaction. TT_FAILED -------------------- The number of failed transactions during the last interval for this transaction name. TT_INFO -------------------- The registered ARM Transaction Information for this transaction. TT_NAME -------------------- The registered transaction name for this transaction. TT_NUM_BINS -------------------- The number of distribution ranges. On AAN systems, this metric is only available on 5.X or later. TT_SLO_COUNT -------------------- The number of completed transactions that violated the defined Service Level Objective (SLO) by exceeding the SLO threshold time during the interval. TT_SLO_PERCENT -------------------- The percentage of transactions which violate service level objectives. On AAN systems, this metric is only available on 5.X or later. TT_SLO_THRESHOLD -------------------- The upper range (transaction time) of the Service Level Objective (SLO) threshold value. This value is used to count the number of transactions that exceed this user-supplied transaction time value. TT_TRAN_1_MIN_RATE -------------------- For this transaction name, the number of completed transactions calculated to a 1 minute rate. For example, if you completed five of these transactions in a 5 minute window, the rate is one transaction per minute. 
On AAN systems, this metric is only available on 5.X or later. TT_TRAN_ID -------------------- The registered ARM Transaction ID for this transaction class, as returned by arm_getid(). A unique transaction ID is returned for a unique application ID (returned by arm_init), transaction name, and metadata buffer contents. TT_UNAME -------------------- The registered ARM Transaction User Name for this transaction. If the arm_init function has NULL for the appl_user_id field, then the user name is blank. Otherwise, if “*” was specified, then the user name is displayed. For example, to show the user name for the armsample1 program, use: appl_id = arm_init(“armsample1”,“*”,0,0,0); To ignore the user name for the armsample1 program, use: appl_id = arm_init(“armsample1”,NULL,0,0,0); TT_USER_MEASUREMENT_AVG -------------------- If the measurement type is a numeric or a string, this metric returns “na”. If the measurement type is a counter, this metric returns the average counter differences of the transaction or transaction instance during the last interval. The counter value is the difference observed from a counter between the start and the stop (or last update) of a transaction. If the measurement type is a gauge, this returns the average of the values passed on any ARM call for the transaction or transaction instance during the last interval. TT_USER_MEASUREMENT_AVG_2 -------------------- If the measurement type is a numeric or a string, this metric returns “na”. If the measurement type is a counter, this metric returns the average counter differences of the transaction or transaction instance during the last interval. The counter value is the difference observed from a counter between the start and the stop (or last update) of a transaction. If the measurement type is a gauge, this returns the average of the values passed on any ARM call for the transaction or transaction instance during the last interval. TT_USER_MEASUREMENT_AVG_3 -------------------- If the measurement type is a numeric or a string, this metric returns “na”. If the measurement type is a counter, this metric returns the average counter differences of the transaction or transaction instance during the last interval. The counter value is the difference observed from a counter between the start and the stop (or last update) of a transaction. 
If the measurement type is a gauge, this returns the average of the values passed on any ARM call for the transaction or transaction instance during the last interval. TT_USER_MEASUREMENT_AVG_4 -------------------- If the measurement type is a numeric or a string, this metric returns “na”. If the measurement type is a counter, this metric returns the average counter differences of the transaction or transaction instance during the last interval. The counter value is the difference observed from a counter between the start and the stop (or last update) of a transaction. If the measurement type is a gauge, this returns the average of the values passed on any ARM call for the transaction or transaction instance during the last interval. TT_USER_MEASUREMENT_AVG_5 -------------------- If the measurement type is a numeric or a string, this metric returns “na”. If the measurement type is a counter, this metric returns the average counter differences of the transaction or transaction instance during the last interval. The counter value is the difference observed from a counter between the start and the stop (or last update) of a transaction. If the measurement type is a gauge, this returns the average of the values passed on any ARM call for the transaction or transaction instance during the last interval. TT_USER_MEASUREMENT_AVG_6 -------------------- If the measurement type is a numeric or a string, this metric returns “na”. If the measurement type is a counter, this metric returns the average counter differences of the transaction or transaction instance during the last interval. The counter value is the difference observed from a counter between the start and the stop (or last update) of a transaction. If the measurement type is a gauge, this returns the average of the values passed on any ARM call for the transaction or transaction instance during the last interval. 
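The TT_APP_NAME, TT_NAME, TT_TRAN_ID, and TT_UNAME entries above all derive from the ARM 2.0 registration calls. The sketch below shows that registration flow; the two stub functions stand in for the real libarm routines so the example is self-contained, and their return values are made up for illustration, not real transaction IDs.

```c
/* Stubs standing in for libarm; a real instrumented program
 * would link against libarm instead of defining these. */
static long arm_init(const char *app_name, const char *appl_user_id,
                     long flags, char *data, long data_size)
{
    (void)app_name; (void)appl_user_id;
    (void)flags; (void)data; (void)data_size;
    return 1;                         /* stub: pretend application ID */
}

static long arm_getid(long appl_id, const char *tran_name,
                      const char *tran_detail, long flags,
                      char *data, long data_size)
{
    (void)tran_name; (void)tran_detail;
    (void)flags; (void)data; (void)data_size;
    return appl_id > 0 ? 42 : -1;     /* stub: pretend transaction ID */
}

long register_sample_tran(void)
{
    /* "*" asks ARM to record the login name, filling TT_UNAME;
     * NULL here would leave TT_UNAME blank. */
    long appl_id = arm_init("armsample1", "*", 0, 0, 0);
    /* The returned ID is what TT_TRAN_ID reports for this class. */
    return arm_getid(appl_id, "sample_tran", "", 0, 0, 0);
}
```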
TT_USER_MEASUREMENT_MAX -------------------- If the measurement type is a numeric or a string, this metric returns “na”. If the measurement type is a counter, this metric returns the highest measured counter value over the life of the transaction or transaction instance. The counter value is the difference observed from a counter between the start and the stop (or last update) of a transaction. If the measurement type is a gauge, this metric returns the highest value passed on any ARM call over the life of the transaction or transaction instance. TT_USER_MEASUREMENT_MAX_2 -------------------- If the measurement type is a numeric or a string, this metric returns “na”. If the measurement type is a counter, this metric returns the highest measured counter value over the life of the transaction or transaction instance. The counter value is the difference observed from a counter between the start and the stop (or last update) of a transaction. If the measurement type is a gauge, this metric returns the highest value passed on any ARM call over the life of the transaction or transaction instance. TT_USER_MEASUREMENT_MAX_3 -------------------- If the measurement type is a numeric or a string, this metric returns “na”. If the measurement type is a counter, this metric returns the highest measured counter value over the life of the transaction or transaction instance. The counter value is the difference observed from a counter between the start and the stop (or last update) of a transaction. If the measurement type is a gauge, this metric returns the highest value passed on any ARM call over the life of the transaction or transaction instance. TT_USER_MEASUREMENT_MAX_4 -------------------- If the measurement type is a numeric or a string, this metric returns “na”. If the measurement type is a counter, this metric returns the highest measured counter value over the life of the transaction or transaction instance. 
The counter value is the difference observed from a counter between the start and the stop (or last update) of a transaction. If the measurement type is a gauge, this metric returns the highest value passed on any ARM call over the life of the transaction or transaction instance. TT_USER_MEASUREMENT_MAX_5 -------------------- If the measurement type is a numeric or a string, this metric returns “na”. If the measurement type is a counter, this metric returns the highest measured counter value over the life of the transaction or transaction instance. The counter value is the difference observed from a counter between the start and the stop (or last update) of a transaction. If the measurement type is a gauge, this metric returns the highest value passed on any ARM call over the life of the transaction or transaction instance. TT_USER_MEASUREMENT_MAX_6 -------------------- If the measurement type is a numeric or a string, this metric returns “na”. If the measurement type is a counter, this metric returns the highest measured counter value over the life of the transaction or transaction instance. The counter value is the difference observed from a counter between the start and the stop (or last update) of a transaction. If the measurement type is a gauge, this metric returns the highest value passed on any ARM call over the life of the transaction or transaction instance. TT_USER_MEASUREMENT_MIN -------------------- If the measurement type is a numeric or a string, this metric returns “na”. If the measurement type is a counter, this metric returns the lowest measured counter value over the life of the transaction or transaction instance. The counter value is the difference observed from a counter between the start and the stop (or last update) of a transaction. If the measurement type is a gauge, this metric returns the lowest value passed on any ARM call over the life of the transaction or transaction instance. 
TT_USER_MEASUREMENT_MIN_2 -------------------- If the measurement type is a numeric or a string, this metric returns “na”. If the measurement type is a counter, this metric returns the lowest measured counter value over the life of the transaction or transaction instance. The counter value is the difference observed from a counter between the start and the stop (or last update) of a transaction. If the measurement type is a gauge, this metric returns the lowest value passed on any ARM call over the life of the transaction or transaction instance. TT_USER_MEASUREMENT_MIN_3 -------------------- If the measurement type is a numeric or a string, this metric returns “na”. If the measurement type is a counter, this metric returns the lowest measured counter value over the life of the transaction or transaction instance. The counter value is the difference observed from a counter between the start and the stop (or last update) of a transaction. If the measurement type is a gauge, this metric returns the lowest value passed on any ARM call over the life of the transaction or transaction instance. TT_USER_MEASUREMENT_MIN_4 -------------------- If the measurement type is a numeric or a string, this metric returns “na”. If the measurement type is a counter, this metric returns the lowest measured counter value over the life of the transaction or transaction instance. The counter value is the difference observed from a counter between the start and the stop (or last update) of a transaction. If the measurement type is a gauge, this metric returns the lowest value passed on any ARM call over the life of the transaction or transaction instance. TT_USER_MEASUREMENT_MIN_5 -------------------- If the measurement type is a numeric or a string, this metric returns “na”. If the measurement type is a counter, this metric returns the lowest measured counter value over the life of the transaction or transaction instance. 
The counter value is the difference observed from a counter between the start and the stop (or last update) of a transaction. If the measurement type is a gauge, this metric returns the lowest value passed on any ARM call over the life of the transaction or transaction instance. TT_USER_MEASUREMENT_MIN_6 -------------------- If the measurement type is a numeric or a string, this metric returns “na”. If the measurement type is a counter, this metric returns the lowest measured counter value over the life of the transaction or transaction instance. The counter value is the difference observed from a counter between the start and the stop (or last update) of a transaction. If the measurement type is a gauge, this metric returns the lowest value passed on any ARM call over the life of the transaction or transaction instance. TT_USER_MEASUREMENT_NAME -------------------- The name of the user defined transactional measurement. The length of the string complies with the ARM 2.0 standard, which is 44 characters long (there are 43 usable characters since this is a NULL terminated character string). TT_USER_MEASUREMENT_NAME_2 -------------------- The name of the user defined transactional measurement. The length of the string complies with the ARM 2.0 standard, which is 44 characters long (there are 43 usable characters since this is a NULL terminated character string). TT_USER_MEASUREMENT_NAME_3 -------------------- The name of the user defined transactional measurement. The length of the string complies with the ARM 2.0 standard, which is 44 characters long (there are 43 usable characters since this is a NULL terminated character string). TT_USER_MEASUREMENT_NAME_4 -------------------- The name of the user defined transactional measurement. The length of the string complies with the ARM 2.0 standard, which is 44 characters long (there are 43 usable characters since this is a NULL terminated character string). 
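The counter and gauge semantics repeated in the TT_USER_MEASUREMENT_* entries above can be summarized in a small sketch: a counter sample is the difference between the values observed at transaction start and stop (or last update), and the minimum, maximum, and average are taken over those samples. All names here are hypothetical, not OVPA internals.

```c
/* Running summary for one user-defined measurement. */
typedef struct { double min, max, sum; unsigned long n; } meas_summary;

/* Record one counter-type transaction: the sample is the delta
 * between the counter at start and at stop (or last update). */
void record_counter_tran(meas_summary *s, double at_start, double at_stop)
{
    double diff = at_stop - at_start;
    if (s->n == 0 || diff < s->min) s->min = diff;  /* TT_..._MIN */
    if (s->n == 0 || diff > s->max) s->max = diff;  /* TT_..._MAX */
    s->sum += diff;
    s->n++;
}

double meas_avg(const meas_summary *s)              /* TT_..._AVG */
{
    return s->n ? s->sum / s->n : 0.0;  /* "na" when nothing sampled */
}
```

A gauge-type measurement would use each value passed on an ARM call directly instead of the start/stop delta.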
TT_USER_MEASUREMENT_NAME_5 -------------------- The name of the user defined transactional measurement. The length of the string complies with the ARM 2.0 standard, which is 44 characters long (there are 43 usable characters since this is a NULL terminated character string). TT_USER_MEASUREMENT_NAME_6 -------------------- The name of the user defined transactional measurement. The length of the string complies with the ARM 2.0 standard, which is 44 characters long (there are 43 usable characters since this is a NULL terminated character string). TT_WALL_TIME_PER_TRAN -------------------- The average transaction time, in seconds, during the last interval for this transaction. YEAR -------------------- The year, including the century, in which the data in this record was captured. This metric will contain 4 digits, such as 2002.