Linux Performance Metrics

The Uptime Infrastructure Monitor Linux agent collects the following performance metrics from the systems on which it is installed.

Each set of performance metrics is averaged over an interval of one second.

CPU

The Uptime Infrastructure Monitor agent uses the sar -urWqR 1 command to compare the system counters during a one-second interval. The statistics returned by the agent are averaged for all CPUs on the system.

Metric	Explanation	Source
% USR	The percentage of time that the processor spends in user mode (a processing mode for applications and subsystems).	/proc/cpuinfo
% SYS	The percentage of time that the kernel spends processing system calls.	/proc/cpuinfo
% WIO	The percentage of time that a runnable process must wait for a device to perform an I/O operation.	/proc/cpuinfo
% Total	The sum of User %, System %, and Wait I/O %.	/proc/cpuinfo
Run Queue Length	The percentage of time that one or more services or processes are waiting to be served by the CPU.	/proc/cpuinfo
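
The counter comparison described above can be illustrated with a minimal Python sketch. It is not the agent's implementation (the agent relies on sar). It assumes the counters are taken from the aggregate cpu line of /proc/stat, where Linux exposes cumulative per-mode CPU time, and the two snapshots are made-up values standing in for readings taken one second apart. The function names are hypothetical.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into named counters."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    names = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]
    values = [int(v) for v in fields[1:1 + len(names)]]
    return dict(zip(names, values))


def cpu_percentages(before, after):
    """Return % USR, % SYS, % WIO and % Total between two snapshots."""
    deltas = {name: after[name] - before[name] for name in before}
    elapsed = sum(deltas.values())
    if elapsed <= 0:
        # No time passed between the snapshots, so nothing can be computed.
        return {"usr": 0.0, "sys": 0.0, "wio": 0.0, "total": 0.0}
    usr = 100.0 * deltas["user"] / elapsed
    sys_pct = 100.0 * deltas["system"] / elapsed
    wio = 100.0 * deltas["iowait"] / elapsed
    return {"usr": usr, "sys": sys_pct, "wio": wio, "total": usr + sys_pct + wio}


# Hypothetical snapshots taken one second apart (values are made up).
first = parse_cpu_line("cpu 1000 0 500 8000 500 0 0 0")
second = parse_cpu_line("cpu 1020 0 510 8060 510 0 0 0")
result = cpu_percentages(first, second)
print(result)
```

Because the counters are cumulative, only the difference between two readings is meaningful, which is why a single sample is not enough to produce a percentage.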

Multi-CPU

The Uptime Infrastructure Monitor agent uses the sar and mpstat utilities on a Linux system to collect the metrics in the table below from Linux systems with multiple CPUs. The agent averages the statistics from each CPU using the sar -x SELF -I SUM -P ALL -wu 1 command, which compares the system counters during a one-second interval. The statistics that the agent returns are for the entire system, per CPU.

Metric	Explanation
User %	The percentage of CPU time that is used by user processes.
System %	The percentage of CPU time that is used by kernel processes.
Wait I/O %	The percentage of time that a process which can be run must wait for a device to perform an I/O operation.
SMTX	The number of read or write locks that a thread was not able to acquire on the first attempt, as reported by the mpstat command.
XCAL	The number of interprocess cross-calls. In a multi-processor environment, one processor sends cross-calls to another processor to get that processor to do work. Cross-calls can also be used to ensure consistency in virtual memory. Heavy file system activity (such as NFS) can result in a high number of cross-calls.
Interrupts	The number of CPU interrupts.
Total %	The sum of User %, System %, and Wait I/O %.

Memory
The Uptime Infrastructure Monitor agent uses the free command to collect the Free Memory metric from a Linux system. The rest of the memory-related metrics are gathered by the

Metric	Explanation	Source
Free Memory	The amount of physical memory available to the operating system, system library files, and applications.	/proc/meminfo
Cache Hit Rate	How often the system accesses the CPU cache.	/proc/meminfo
PageOut per Second	The rate at which pages were written to disk.	/proc/meminfo
PageIn per Second	The rate at which pages were read from disk.	/proc/meminfo
PageFree per Second	The number of pages that are freed from memory each second.	/proc/meminfo
PageScan per Second	The average number of pages that are scanned each second.	/proc/meminfo
Free Swap	The amount of free swap space, as a percentage of total swap space.	/proc/meminfo
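
As an illustration of how the Free Memory and Free Swap values relate to the listed source, the following Python sketch parses text in the /proc/meminfo format and derives free swap as a percentage of total swap. It is an assumption-laden example, not the agent's code: the function names are hypothetical and the sample values are invented.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integers (mostly kB)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        parts = rest.split()
        if parts:
            info[key.strip()] = int(parts[0])
    return info


def free_swap_percent(info):
    """Return free swap as a percentage of total swap (0.0 if no swap)."""
    total = info.get("SwapTotal", 0)
    if total == 0:
        return 0.0
    return 100.0 * info["SwapFree"] / total


# Invented sample in the same layout as /proc/meminfo.
SAMPLE = """MemTotal:        8000000 kB
MemFree:         2000000 kB
SwapTotal:       4000000 kB
SwapFree:        3000000 kB
"""
info = parse_meminfo(SAMPLE)
print(info["MemFree"], free_swap_percent(info))
```

On a live system the same functions could be fed the contents of /proc/meminfo, for example via open("/proc/meminfo").read().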


Disk
The Uptime Infrastructure Monitor agent gathers file system statistics for each file system using the df -lk command. Disk statistics (e.g. %busy, reads per second and writes per second) are output per disk and compared between polling intervals using the iostat -d -x 1 2 command.

Metric	Explanation	Source
Disk (Spindle) Name	The name of each disk on the system.	/proc/diskstats
Usage (% Busy)	The percentage of time during which the disk drive is handling read or write requests.	/proc/diskstats
Throughput (Blk/s)	The average amount of data, in blocks per second, that is transferred to or from the disk during write or read operations.	/proc/diskstats
Read/Writes/s	The number of read and write operations on the disk that occur each second.	/proc/diskstats
Average Queue Length	The average number of requests that are waiting to be serviced by the disk.	/proc/diskstats
Average Service Time	The average amount of time, in milliseconds, that is required for a request to be carried out.	/proc/diskstats
Average Wait Time	The average time, in milliseconds, that a transaction is waiting in a queue. The wait time is directly proportional to the length of the queue.	/proc/diskstats
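
To show how per-disk figures such as % Busy and reads or writes per second can be derived by comparing counters between two polls, here is a minimal Python sketch. It assumes the standard /proc/diskstats layout (device name in the third column, followed by the cumulative I/O counters) and is not taken from the agent, which uses iostat. The function names and the sample lines are hypothetical.

```python
def parse_diskstats(text):
    """Map each device name to the counters needed for rate calculations."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or truncated lines
        stats[fields[2]] = {
            "reads": int(fields[3]),    # reads completed
            "writes": int(fields[7]),   # writes completed
            "io_ms": int(fields[12]),   # milliseconds spent doing I/O
        }
    return stats


def disk_rates(before, after, interval_s):
    """Compute per-disk rates from two snapshots taken interval_s apart."""
    rates = {}
    for name, new in after.items():
        old = before.get(name)
        if old is None:
            continue  # disk appeared between the two polls
        rates[name] = {
            "reads_per_s": (new["reads"] - old["reads"]) / interval_s,
            "writes_per_s": (new["writes"] - old["writes"]) / interval_s,
            "busy_pct": 100.0 * (new["io_ms"] - old["io_ms"]) / (interval_s * 1000.0),
        }
    return rates


# Invented snapshots one second apart for a single disk.
before = parse_diskstats("8 0 sda 1000 0 20000 500 2000 0 40000 800 0 900 1300")
after = parse_diskstats("8 0 sda 1050 0 21000 520 2030 0 40600 830 0 1150 1400")
rates = disk_rates(before, after, 1)
print(rates["sda"])
```

In this sample the disk spent 250 of the 1000 elapsed milliseconds handling I/O, so it is reported as 25% busy.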




Network
The Uptime Infrastructure Monitor agent uses the netstat -s command to retrieve a combined total of TCP Retransmits for all network interfaces. Other network statistics (e.g. kbps, errors and collisions) are averaged, per interface, using the sar -n DEV -n EDEV 1 command, which compares the system counters during a one-second interval.

Metric	Explanation	Source
In Kbps	The rate, in kilobytes per second, at which data is received over a specific network adapter.	/proc/net
Out Kbps	The rate, in kilobytes per second, at which data is sent over a specific network adapter.	/proc/net
In Errors	The number of inbound packets that contained errors, which prevented those packets from being delivered to a higher-layer protocol.	/proc/net
Out Errors	The number of outbound packets that could not be transmitted because of errors.	/proc/net
Collisions	The number of signals from two separate nodes on the network that have collided.	/proc/net
TCP Retransmits	The number of packets that have been re-sent over a network interface. The agent returns a combined total for all interfaces.	/proc/net
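
The per-interface rates can be pictured with the following Python sketch, which compares two snapshots in the /proc/net/dev format. This is only an illustration under stated assumptions: the agent itself uses sar, the function names are hypothetical, the sample counters are invented, and one kilobyte is taken to be 1024 bytes.

```python
def parse_net_dev(text):
    """Parse /proc/net/dev-style text into per-interface counters."""
    counters = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # header lines contain no colon
        fields = [int(v) for v in rest.split()]
        if len(fields) < 16:
            continue
        counters[name.strip()] = {
            "rx_bytes": fields[0], "rx_errs": fields[2],
            "tx_bytes": fields[8], "tx_errs": fields[10],
            "colls": fields[13],
        }
    return counters


def interface_rates(before, after, interval_s):
    """Return kB/s and error deltas for interfaces present in both snapshots."""
    rates = {}
    for name, new in after.items():
        old = before.get(name)
        if old is None:
            continue
        rates[name] = {
            "in_kb_per_s": (new["rx_bytes"] - old["rx_bytes"]) / 1024.0 / interval_s,
            "out_kb_per_s": (new["tx_bytes"] - old["tx_bytes"]) / 1024.0 / interval_s,
            "in_errors": new["rx_errs"] - old["rx_errs"],
            "out_errors": new["tx_errs"] - old["tx_errs"],
            "collisions": new["colls"] - old["colls"],
        }
    return rates


# Invented counters for one interface, sampled one second apart.
before = parse_net_dev("  eth0: 1024000 100 1 0 0 0 0 0 2048000 200 2 0 0 0 0 0")
after = parse_net_dev("  eth0: 1126400 150 1 0 0 0 0 0 2252800 260 3 0 0 0 0 0")
rates = interface_rates(before, after, 1)
print(rates["eth0"])
```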




Process
The Uptime Infrastructure Monitor agent uses the ps -eo command to collect the process information listed in the table below from a Linux system. By default, the agent gathers the top 20 processes and sorts them by the highest CPU usage.

Metric	Explanation	Source
PID	The unique identifier of a specific process.	/proc/stat
PPID	The identifier of the parent of the process that is currently running.	/proc/stat
UID	A value that identifies the current user.	/proc/stat
GID	A value that identifies a group of users.	/proc/stat
Memory Consumed	The amount of memory that is being used by a process.	/proc/stat
RSS	The amount of physical memory that is being used by a process.	/proc/stat
CPU % Utilization by Process	The percentage of CPU time that is being used by individual processes.	/proc/stat
Memory % Utilization by Process	The percentage of physical memory that is being used by individual processes.	/proc/stat
Process Start Time	The time at which the process started.	/proc/stat
Process Run Time	The length of time for which the process has been running.	/proc/stat
Number of Processes Running	The total number of processes that are currently running on the system.	/proc/stat
Number of Blocked Processes	The total number of processes that are blocking resources.	/proc/stat
Number of Waiting Processes	The total number of processes that are waiting to be executed by the CPU.	/proc/stat
Execs per Second	The total number of system calls that are executed each second.	/proc/stat
Process Creation Rate	The total number of processes that are being spawned over a specified time period.	/proc/stat
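
The "top 20 by CPU usage" selection can be sketched in Python as follows. The column list pid,ppid,pcpu,rss,comm is an assumption made for this example (the exact format string passed to ps -eo is not given here), the function names are hypothetical, and the sample output is invented.

```python
def parse_ps(text):
    """Parse output shaped like 'ps -eo pid,ppid,pcpu,rss,comm'."""
    rows = []
    lines = text.strip().splitlines()
    for line in lines[1:]:  # the first line is the column header
        pid, ppid, pcpu, rss, comm = line.split(None, 4)
        rows.append({
            "pid": int(pid),
            "ppid": int(ppid),
            "pcpu": float(pcpu),
            "rss_kb": int(rss),
            "comm": comm.strip(),
        })
    return rows


def top_by_cpu(rows, limit=20):
    """Return up to 'limit' processes, highest CPU usage first."""
    return sorted(rows, key=lambda row: row["pcpu"], reverse=True)[:limit]


# Invented sample output.
SAMPLE = """
  PID  PPID %CPU   RSS COMMAND
    1     0  0.1  9000 systemd
  812     1 12.5 52000 java
  944     1  3.0 18000 sshd
 1203   944 40.2  7000 gzip
"""
rows = parse_ps(SAMPLE)
for row in top_by_cpu(rows, limit=2):
    print(row["pid"], row["comm"], row["pcpu"])
```

On a live system the sample text could be replaced with the output of subprocess.run(["ps", "-eo", "pid,ppid,pcpu,rss,comm"], capture_output=True, text=True).stdout.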




Workload
The Uptime Infrastructure Monitor agent uses the ps utility to collect workload information from a Linux system. Workload statistics (based on the same 20 processes that were gathered from the Process method) are sorted within Uptime Infrastructure Monitor's core. The workload processes that the agent gathers include the user/group/process name and their individual statistics, which can be sorted based on the user's desired graph presentation (e.g. user, group or process name).




Metric	Explanation	Source
Workload by Process	The demand that network and local services are putting on a system, based on the processes that are running.	/proc/load
Workload by User	The demand that network and local services are putting on the system, based on the IDs of the users who are logged into a system.	/proc/load
Workload by Group	The demand that network and local services are putting on the system, based on the IDs of the user groups that are logged into a system.	/proc/load
Workload Top 10 by Process	The 10 processes that are consuming the most CPU resources.	/proc/load
Workload Top 10 by User	The 10 processes that are consuming the most CPU resources, based on user ID.	/proc/load
Workload Top 10 by Group	The 10 processes that are consuming the most CPU resources, based on group ID.	/proc/load
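
Grouping the gathered processes by user, group or process name can be pictured with the short Python sketch below. The sorting actually happens inside Uptime Infrastructure Monitor's core, so this is only an assumed approximation: the function name, the row layout and the sample values are all hypothetical.

```python
from collections import defaultdict


def workload_by(rows, key, limit=10):
    """Sum CPU usage per 'key' (user, group or comm) and rank the totals."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row["pcpu"]
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


# Invented per-process rows, as might be derived from ps output.
ROWS = [
    {"user": "alice", "group": "dev", "comm": "java", "pcpu": 12.5},
    {"user": "alice", "group": "dev", "comm": "gzip", "pcpu": 40.0},
    {"user": "bob", "group": "ops", "comm": "java", "pcpu": 20.0},
    {"user": "root", "group": "root", "comm": "sshd", "pcpu": 3.0},
]

print(workload_by(ROWS, "user"))
print(workload_by(ROWS, "group"))
print(workload_by(ROWS, "comm"))
```

Using one function with a key parameter mirrors the way the same set of processes is presented three ways (by user, by group, by process name) rather than being collected three times.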




User
The Uptime Infrastructure Monitor agent uses the following commands to collect user statistics from a system:

ps -eo
last | head -n 10 (login history for the last 10 users on the system)
who (lists who is currently logged into the system)





Metric	Explanation
Login History	The number of times, or the frequency with which, a user has logged into a system during any 30-minute time interval.
Sessions	The number of sessions, or the number of distinct users, logged into a system during any 30-minute time interval.
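
As a small illustration of the Sessions metric, the Python sketch below counts sessions and distinct users from text shaped like the output of the who command. It assumes the user name is the first whitespace-separated column; the function name is hypothetical and the sample lines are invented.

```python
from collections import Counter


def sessions_per_user(who_output):
    """Count login sessions per user from 'who'-style output."""
    counts = Counter()
    for line in who_output.splitlines():
        fields = line.split()
        if fields:
            counts[fields[0]] += 1  # first column is the user name
    return counts


# Invented sample output.
SAMPLE = """alice    pts/0        2024-05-01 09:12 (10.0.0.5)
alice    pts/1        2024-05-01 09:40 (10.0.0.5)
bob      tty1         2024-05-01 08:03
"""
counts = sessions_per_user(SAMPLE)
print("sessions:", sum(counts.values()))
print("distinct users:", len(counts))
```

Sessions and distinct users differ whenever one user is logged in more than once, as alice is in this sample.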




    

