Computer Performance Monitoring

This is my concise reference on analyzing and tuning Microsoft Windows and Unix/Linux machines, using counters and other tools for performance profiling, testing, and tuning.



System Analysis Utilities downloads
directory.google.com/ Top/ Computers/ Performance_and_Capacity

Types of Monitoring

Internal Probes/Agents run within the server (as a process/daemon) and send counter values to an application like the Windows Task Manager or out to a Diagnostics Mediator/Server.

External Monitors (such as the Unix rstatd daemon) respond to operator commands sent using the SSH (Secure Shell) protocol through the network to the server being monitored, which then responds with another transmission over the network.

External Monitors

The amount of resources used by each server in a J2EE or .NET application can be monitored externally by sending requests to Windows performance monitor or commands as a Unix Secure Shell (SSH) session user.

Since all monitoring requests originate from outside the server being monitored, some call this a "black box" approach to testing.

This is the approach used by Mercury SiteScope, LoadRunner and Business Process Monitor.

Just because a monitor is "agent-less" doesn't mean that it is "non-intrusive": it still imposes overhead on the system being tested.

Probes/Agents Within Servers

This approach is classified as "white box" testing because probes are installed inside each app server under test (SUT).

ReliAgent

embedded "narks"

After a probe is "instrumented" to recognize methods running on its app server, it makes an announcement whenever it detects invocations of servlets, JSPs, EJBs, JNDI, JDBC, JMS, and Struts.

Probes report on JVM memory heap usage and the memory consumption of Java collections.

Microsoft Windows System Monitors

Task Manager lists major user-selected objects in real-time.
System Monitor MMC snap-in to the Performance console — which Windows 2000 renamed from the Windows NT4 Perfmon — graphs, in real-time, objects on remote computers as well as the local computer.
Network Monitor does the same exclusively for network activity on each NIC.
Performance Logs and Alerts are configured to create logs which can later be analyzed together using MS-Excel, SAS, or another statistical analysis application.
Servers configured with a Simple Network Management Protocol (SNMP) agent service send SNMP messages to a central SNMP sink machine which holds the messages for analysis by some 3rd party SNMP management system.
Enterprise infrastructure management software catches business transaction instrumentation messages issued from C and Java programs using the ARM (Application Response Measurement) API to measure application availability, application performance, application usage, and end-to-end transaction response time.

Measuring .NET Application Performance - Chapter 15 of Microsoft's Improving .NET Application Performance and Scalability series. May 2004

"Flapping" occurs when a monitored resource quickly alternates between states.

MS Windows Task Manager

Press Ctrl-Shift-Esc keys at the same time.
Press Ctrl-Alt-Del, then select Task Manager.
Press the Windows key and R at the same time, or
press the Windows key, then R for Run, and press Enter,
then type taskmgr and click OK.

Perfmon screen

This sample screen shows both green User mode and red kernel mode CPU Usage because View, Show Kernel Times has been selected:

Perfmon menu

By default, the update speed is set to “Normal”, which means once per second.

I prefer the "Low" setting when I see what is hogging up CPU cycles (by clicking the "CPU" heading):

Perfmon menu

Metrics shown in the Processes tab can be selected from View, Select Columns:

The above are defaults. Session ID and User Name are new since Windows XP.

Perfmon at Microsoft.com

Windows Performance Secrets by Van Name, Mark L.; Catchings, Bill.; Butner, Richard. Que, 1998.

Save as Webpage

The graphic displayed in the resulting page is dynamically generated by an ActiveX control (the OBJECT CLASSID refers to sysmon.ocx in %WINDIR%\system32\). On Windows 200X machines this control is System Monitor, which was previously named Performance Monitor (Perfmon.exe) in Windows NT 4.0.

tiny res meter from PEsoft is a stand-alone exe (not installed).

MS Windows System Monitor


System Monitor is an ActiveX control, so it can be displayed in applications that support this object type. For example, System Monitor can be displayed in Internet Explorer or in a Microsoft Office application, such as Microsoft Word. The simplest way to export the OLE Custom eXtension (OCX) is to select a Counter log and, on the Action menu, click Save Settings As. The file is then saved as a standard .html file and viewed in Internet Explorer. System Monitor is fully functional whether it is running as part of the Performance console or in another application.

View data in System Monitor in one of three views: chart, histogram, or report. The chart and histogram views are typically used for analyzing real-time data. The report view is ideal for viewing a summary of data collected in counter logs. You can use all views to see real-time and logged data.

typeperf.exe (a command-line variant of Perfmon), which comes with Win2k3, dumps perf data to CSV, TSV, or a database.
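Once counters are in a CSV file they can be summarized with a few lines of script. The following Python sketch averages each counter column. It assumes the usual layout of such a file (a header row, a timestamp in the first column, and quoted numeric values, with a blank value when a sample is missing); the sample data, host name, and function name are made up for illustration.

```python
import csv
import io
from statistics import mean

# Hypothetical sample in the assumed CSV layout (not captured from a real machine).
SAMPLE = r'''"(PDH-CSV 4.0)","\\HOST\Processor(_Total)\% Processor Time"
"05/01/2004 10:00:00.000","10.0"
"05/01/2004 10:00:01.000","30.0"
"05/01/2004 10:00:02.000"," "
'''

def average_counters(csv_text):
    """Return {counter name: mean value}, skipping blank or non-numeric samples."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, samples = rows[0], rows[1:]
    averages = {}
    for col in range(1, len(header)):  # column 0 holds the timestamp
        values = []
        for row in samples:
            try:
                values.append(float(row[col]))
            except (ValueError, IndexError):
                continue  # missing sample or short row
        if values:
            averages[header[col]] = mean(values)
    return averages

for name, value in average_counters(SAMPLE).items():
    print(name, value)
```

The same function works on a file's contents read with `open(path).read()`; for very large logs a streaming approach would be preferable.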

Ganglia's monitoring architecture takes fewer resources.

Configuring Remote Windows Machine Monitoring by User Accounts

According to Q158438: If you are using an account without administrator privileges on the NT/Win 2000 machine being monitored, you must first grant read permission to certain files and registry entries. The required steps are:

Using Explorer or File Manager, give the user READ access to:
%windir%\system32\PERFCxxx.DAT
%windir%\system32\PERFHxxx.DAT
where xxx is the basic language ID for the system (for example, 009 for English). If these files are missing or corrupt, expand them from the installation CD.
Using REGEDT32, give the user READ access to: HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\Perflib and all sub keys of that key.
Using REGEDT32, give the user at least READ access to: HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\SecurePipeServers\winreg
Give the user at least READ access to the following key and allow Read permission to propagate down to the Services subkeys: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services
With Windows 2000, in addition to the access described above, the user must also have access granted by the following Group Policies: Computer Configuration\Windows Settings\Security Settings\Local Policies\User Rights Assignment
- Profile Single Process
If the user is neither a power user nor an administrator, additional permissions might be needed to access SysMonLog services. To grant full access to SysMonLog services, run the subinacl /service sysmonlog /grant=tester=f command, where tester is the user account.

A system is described as "quiescent" (dormant; in a state of tranquil repose; at rest; resting; still; inactive; quiet;) when its CPU is running no active user tasks.

A system is described as "pegged" when its CPU utilization remains at or near 100% -- the maximum.

Unix Systems Monitoring

This section discusses performance monitoring tools for variants of the UNIX operating system, which consists of the kernel, the shell, and the file system.

Nagios is a popular open-source monitoring tool.
Pro Nagios 2.0 (Apress, 2006, 424 pages) by James Turnbull

Site Scope

Mercury SiteScope monitors came from Colorado-based Freshwater Software, which Mercury acquired.

So SiteScope monitors can be viewed among other LoadRunner Controller System Resource Graphs:

CPU Utilization Monitor
DNS Monitor
Directory Monitor
Disk Space Monitor
Log File Monitor
Memory Monitor
Network Monitor
Ping Monitor
Port Monitor
Script Monitor
Service Monitor
URL Monitor
URL List Monitor
URL Sequence Monitor
Web Server Monitor
WebLogic Application Server Monitor

SiteScope is called an "agent-less" technology because, rather than installing a probe on each server, it sends native UNIX commands (defined in \SiteScope\templates.os) to the monitored server.

Sitescope exerts about a 10% overhead on servers responding to remote queries. However, it duplicates the same requests made by LoadRunner's UNIX monitors (hitting servers with twice as much monitoring traffic).

Perhaps for this reason the SiteScope monitor has a default measurement update rate of once every 10 minutes. Unless you change this default to 15 seconds (the most frequent allowed), you won't see measurements in Controller graphs.

Invoke SiteScope.
Click on the name of the monitor.
Click on "Edit" next to the Counter name.
In the "Update Every" entry, change the value to 15 seconds.
Click on the "Update" button to register the change.
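The effect of the update rate on a short load test is simple arithmetic, sketched below in Python. The function name and the 5-minute test duration are assumptions chosen for illustration; only the 10-minute default and the 15-second minimum come from the text above.

```python
def samples_collected(test_seconds, update_every_seconds):
    """Number of complete monitor updates that fit into a test run."""
    return test_seconds // update_every_seconds

# A hypothetical 5-minute test run:
print(samples_collected(5 * 60, 10 * 60))  # default rate of once every 10 minutes
print(samples_collected(5 * 60, 15))       # fastest rate of once every 15 seconds
```

With the default rate a short run can finish before a single measurement arrives, which is consistent with the empty Controller graphs described above.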

Linux Status Commands

uptime provides an instantaneous summary such as
11:42pm up 18 days, 8:45, 5 users, load average: 0.01, 0.03, 0.07
This shows the current time, the number of days up since last boot, the number of users currently logged in, and the load average for the last 1, 5, and 15 minute intervals.
The load average (LA) is the average number of processes that are either running or ready to run but waiting for access to a busy CPU (the number of jobs currently running plus the run queue length). Averages from 0 to 1.0 are acceptable for a single CPU. As a general rule of thumb, a machine is being overworked if load averages consistently exceed three times the number of CPUs.
top provides load average with auto refresh and additional data (sorted by %CPU):
68 processes: 67 sleeping, 1 running, 0 zombie, 0 stopped
CPU states: 12.2% user, 1.6% system, 0.0% nice, 86.1% idle
Solaris comes with the prstat command to provide this info. Graphical versions of this include gtop within Gnome and the KDE Process Monitor.
Linux Load Average Not Your Average Average by Neil J. Gunther
For memory usage, press M.
For CPU info, press P.
To stop display, enter q.
The 'SIZE' field is the total virtual memory size of each process, including all code, data, stack, mapped files, libraries etc.
procinfo -fn30 is used to gather system data from the /proc directory every 30 seconds.
- Last Boot time
- Load Average
  - average number of jobs running
  - number of runnable processes
  - total number of processes
  - PID of the last process run (idem)
- Swap info
- Memory resources
- Number of disks
- IRQ info
- Installed modules (with the -a or -m option)
- File Systems (with the -a or -m option)
xos provides a constantly updated colorful summary view of various components.
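The load-average rule of thumb given for uptime can be checked by a script. This Python sketch parses an uptime line like the sample shown earlier and applies the "three times the number of CPUs" test. The function names are hypothetical, and the regular expression assumes the "load average: a, b, c" wording shown in that sample; other systems may format the line differently.

```python
import re

def parse_load_averages(uptime_line):
    """Extract the 1, 5 and 15 minute load averages from an uptime line."""
    match = re.search(
        r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)", uptime_line
    )
    if match is None:
        raise ValueError("no load average found in: " + uptime_line)
    return tuple(float(value) for value in match.groups())

def is_overworked(load_average, cpu_count):
    """Rule of thumb from the text: load above three times the CPU count."""
    return load_average > 3 * cpu_count

line = "11:42pm up 18 days, 8:45, 5 users, load average: 0.01, 0.03, 0.07"
one, five, fifteen = parse_load_averages(line)
print(one, five, fifteen, is_overworked(fifteen, cpu_count=1))
```

Because the rule speaks of loads that "consistently" exceed the limit, the 15 minute figure (or repeated readings) is a better input than a single 1 minute value.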

Monitoring processes on UNIX by Alexander Podelko

The US$6,375 SarCheck for Sun Solaris, HP-UX, AIX, and SCO systems produces a report and productizes concepts from the book Sun Performance and Tuning: Java and the Internet (2nd Edition) by Adrian Cockcroft and Richard Pettit (Sun Microsystems Press, April 7, 1998; ISBN 0130952494; US$58/$22; now outdated).

IBM Linux Technology Center

IBM developerWorks Linux Zone

Adrian Cockcroft's SymbEL (SE) toolkit, based on the C language, provides direct access to many of the data sources in the kernel. See his FAQ.

$42 Capacity Planning for Internet Services by Adrian Cockcroft & Bill Walker (Prentice Hall: January 15, 2001) focuses on Sun Microsystems servers and utilities, but its advice is generic on strategies for documenting user demand, mathematical models for making predictions, and processes for planning capacity.

$55 Resource Management by Richard McDougall, Adrian Cockcroft, Evert Hoogendoorn, Enrique Vargas, Tom Bialaski (Prentice Hall; Sep. 1999) ties the resource management control loop to service level management.

$36 Configuration and Capacity Planning for Solaris Servers by Brian L. Wong (Prentice Hall; Jan. 1997)

$36 Solaris Performance Administration : Performance Measurement, Fine Tuning, and Capacity Planning for Releases 2.5.1 and 2.6 (McGraw-Hill; 1998) by Cervone, H. Frank

$28 System Performance Tuning, 2nd Edition (O'Reilly System Administration) by Gian-Paolo D. Musumeci, Mike Loukides (O'Reilly; February 2002)

$31/10 Web Performance Tuning, 2nd Edition (O'Reilly Internet) (O'Reilly; 2nd Edition: March 2002) by Patrick Killelea, who maintains patrick.net

$45 Linux Performance Tuning and Capacity Planning by Jason R Fink, Matthew D. Sherer (SAMS; Aug. 2001)

Process stats

ps (with no options) lists by ID the processes spawned by the current user from the current shell.
Option -f shows child processes. For each Unique Process ID (PID):
In solaris, option -l (for long) shows these additional columns:
ps -Af displays a full list of all processes on the system, including additional details.
ps -a lists the most frequently requested processes.
Other flags include:
To filter only processes of the current user in Linux:
To scroll up and down in Linux:

DOWNLOAD: Microsoft's Process Monitor (ProcMon.exe) (v2.8 released by Mark Russinovich and Bryce Cogswell Nov. 2009). It captures in real time, and combines in one GUI, all file system, Registry, and process/thread activity, including that of low-level programs (such as lsass, svchost, etc.) and background apps such as anti-virus. Unlike the legacy Sysinternals utilities Filemon and Regmon it replaces, it provides rich and non-destructive filtering, simultaneous logging to a file, and comprehensive event properties such as session IDs and user names.

It highlights the issues with system operations, such as "BUFFER OVERFLOW", "BUFFER TOO SMALL", "FAST IO DISABLED", "NAME NOT FOUND", "FILE LOCKED WITH ONLY READERS".

Click on an activity for its full thread stacks, with integrated symbol support for each operation.

Shell script commands, pipes and other commands.

The Most Executed Code in Solaris ... the CPU Idle Loop by Bill Holler

ps [-a] [-A] [-c] [-d] [-e] [-f] [-j] [-l] [-L] [-P] [-y] [ -g grplist ] [ -n namelist ] [-o format ] [ -p proclist ] [ -s sidlist ] [ -t term] [ -u uidlist ] [ -U uidlist ] [ -G gidlist ]

-a	List information about all processes most frequently requested: all those except process group leaders and processes not associated with a terminal.
-A	List information for all processes. Identical to -e, below.
-c	Print information in a format that reflects scheduler properties as described in priocntl. The -c option affects the output of the -f and -l options, as described below.
-d	List information about all processes except session leaders.
-e	List information about every process now running.
-f	Generate a full listing.
-j	Print session ID and process group ID.
-l	Generate a long listing.
-L	Print information about each light weight process (lwp) in each selected process.
-P	Print the number of the processor to which the process or lwp is bound, if any, under an additional column header, PSR.
-y	Under a long listing (-l), omit the obsolete F and ADDR columns and include an RSS column to report the resident set size of the process. Under the -y option, both RSS and SZ will be reported in units of kilobytes instead of pages.
-g grplist	List only process data whose group leader's ID number(s) appears in grplist. (A group leader is a process whose process ID number is identical to its process group ID number.)
-n namelist	Specify the name of an alternative system namelist file in place of the default. This option is accepted for compatibility, but is ignored.
-o format	Print information according to the format specification given in format. This is fully described in DISPLAY FORMATS. Multiple -o options can be specified; the format specification will be interpreted as the space-character-separated concatenation of all the format option-arguments.
-p proclist	List only process data whose process ID numbers are given in proclist.
-s sidlist	List information on all session leaders whose IDs appear in sidlist.
-t term	List only process data associated with term. Terminal identifiers are specified as a device file name, and an identifier. For example, term/a, or pts/0.
-u uidlist	List only process data whose effective user ID number or login name is given in uidlist. In the listing, the numerical user ID will be printed unless you give the -f option, which prints the login name.
-U uidlist	List information for processes whose real user ID numbers or login names are given in uidlist. The uidlist must be a single argument in the form of a blank- or comma-separated list.
-G gidlist	List information for processes whose real group ID numbers are given in gidlist. The gidlist must be a single argument in the form of a blank- or comma-separated list.

rpc.rstatd stats

rpc.rstatd

/etc/inet/inetd.conf

Get it from SourceForge, then install it using Joel Griffth's instructions:

Build and install rstatd:

$ tar xvzf rstatd.tar.gz
$ cd rpc.rstatd
$ ./configure --prefix=/usr
$ make
$ sudo su
# make install

Add a line to the hosts.allow file within /etc/ to specify the subnet(s) allowed to make rstatd requests. For example:
```
rpc.rstatd: 10.0.95.0/255.255.255.0 10.0.8.0/255.255.255.0
```
Alternately, if you want to live dangerously:
```
rpc.rstatd: ALL
```
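To reason about which clients an address/netmask entry like the one above would cover, the subnet arithmetic can be reproduced with Python's standard ipaddress module. This is only a sketch of the matching logic, not of how the access-control library itself evaluates hosts.allow; the function name and the test addresses are made up for illustration.

```python
import ipaddress

# The two subnets from the hosts.allow example, in address/netmask form.
ALLOWED_SUBNETS = ["10.0.95.0/255.255.255.0", "10.0.8.0/255.255.255.0"]

def is_allowed(client_ip, allowed_subnets=ALLOWED_SUBNETS):
    """Return True if client_ip falls inside any of the listed subnets."""
    address = ipaddress.ip_address(client_ip)
    return any(
        address in ipaddress.ip_network(subnet) for subnet in allowed_subnets
    )

print(is_allowed("10.0.95.17"))  # inside 10.0.95.0/24
print(is_allowed("10.0.9.1"))    # outside both subnets
```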

Add rstatd entry in /etc/xinetd.d/rstatd:

# default: off
# description: rstatd serves kernel performance statistics to remote clients.

service rstatd
{
    type            = RPC
    rpc_version     = 2-4
    socket_type     = dgram
    protocol        = udp
    wait            = yes
    user            = root
    only_from       = 10.0.95.0/24
    log_on_success  += USERID
    log_on_failure  += USERID
    server          = /usr/sbin/rpc.rstatd
    disable         = no
}

Restart xinetd:
To start rpc.rstatd under Red Hat Linux, run as root

Rstatd vs. SAR

rstatd SAR

Collisions rate - Collisions per second detected on the Ethernet network wire.
Incoming packets errors rate - Errors per second while receiving Ethernet packets.
Incoming packets rate - Incoming Ethernet packets per second.
Outgoing packets rate - Outgoing Ethernet packets per second.
Paging rate - Number of pages read to physical memory or written to pagefile(s), per second.
Page-in rate - Number of pages read to physical memory, per second.
Page-out rate - Number of pages written to pagefile(s) and removed from physical memory, per second.
Average load - Average number of processes simultaneously in 'Ready' state during the last minute.
Context switches rate - Number of switches between processes or threads, per second.
Swap-in rate - Number of processes being swapped into memory, per second.
Swap-out rate - Number of processes being swapped out from memory, per second.
CPU utilization - Percent of time that the CPU is utilized.
System mode CPU utilization - Percent of time that the CPU is utilized in system mode.
User mode CPU utilization - Percent of time that the CPU is utilized in user mode.
Disk rate - Rate of disk transfers, per second.
Interrupts rate - Number of device interrupts per second.

These statistics are queried and displayed using the perfmeter OpenWindows XView utility.

SAR (System Activity Reporter)

To interactively collect all counters (-A) into an output file (-o outfile), sampling every 5 seconds for 2000 samples:

sar -A -o outfile 5 2000

If the "5 2000" frequency is not provided, the default is to write one record.
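Before launching a long collection it helps to know how long the interval and count will keep sar running. A minimal Python sketch, with a hypothetical function name, multiplies the two; the result is approximate because it ignores the time sar itself spends writing each record.

```python
from datetime import timedelta

def sar_run_duration(interval_seconds, sample_count):
    """Approximate wall-clock time covered by 'sar ... interval count'."""
    return timedelta(seconds=interval_seconds * sample_count)

print(sar_run_duration(5, 2000))  # 2:46:40
```

So the "5 2000" example above covers a little under three hours.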

The background Report Package can be enabled by uncommenting the appropriate lines in the sys crontab or activated with: svcadm enable sar.

su sys -c "/usr/lib/sa/sadc /var/adm/sa/sa`date +%d`"

This uses the sadc (data collector) in the /usr/lib/sa folder to write to dated files in the /var/adm/sa folder. On Solaris 10 the service is "svc:/system/sar:default", which reads the same kstat data that iostat uses.
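Because the file name embeds only the day of the month, a script that post-processes these files has to rebuild the name for the day it wants. A small Python sketch follows; the function name is hypothetical, and the directory default is the /var/adm/sa folder named above.

```python
from datetime import date

def sadc_daily_file(day, directory="/var/adm/sa"):
    """Path sadc writes for the given date: sa followed by the 2-digit day."""
    return f"{directory}/sa{day:%d}"

print(sadc_daily_file(date(2009, 11, 5)))
```

Since only the day number is used, files are reused a month later, so anything that needs to be kept longer should be copied elsewhere.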

Other Monitors

Quest SQL Server Dashboard

Quest Spotlight on SQL Server Enterprise offers this dashboard, which provides a visual approach to organizing metrics. Throughput rate metrics are shown with arrows. Critical issues are in orange.

vxstat utility in HP-UX

MeasureWare Agent in Hewlett-Packard's OpenView network management suite

Perform Agent, previously named BEST/1 Agent, for BMC Patrol Perform/Predict Performance assurance

UNIX rstat.d daemon counters

Counters.chm file from the Windows 2000 Resource Kit.
Counters.hlp file from the Windows NT Workstation 4.0 Resource Kit.

Object	Potential Issue	Summary Counter	Subset	Total	Threshold for Action	Potential Remedies
Network (Ethernet) Interface W: Netmon	Collisions	T: Collisions rate (of packets) per second detected on the wire.			>1%	Reduce # of machines on subnet or use higher bandwidth network
	Utilization Rate in Bytes/sec	W: Current Bandwidth (theoretical bits per second) /8 bits/byte	W: Bytes Received/sec	W: Bytes Total/sec (including framing characters)	-
	Utilization Rate in Bytes/sec		W: Bytes Sent/sec	W: Bytes Total/sec (including framing characters)	-
	Utilization & Error Rate in Packets/sec	T: Incoming packets rate per sec. W: Packets Received/sec sar -y canch/s = Input characters processed by canon (canonical queue)/sec sar -y rawch/s = Input characters (raw queue)/sec	T: Incoming packets errors rate per sec.	W: Packets Received Unicast/sec + W: Packets Received Non-Unicast/sec	-
			Good Packets in/sec		-
		T: Outgoing packets rate per second. W: Packets Sent/sec sar -y outch/s = Output characters (output queue)/sec	-	W: Packets sent Unicast/sec + W: Packets Sent Non-Unicast/sec	-
			Good Packets out/sec		-
	Error Count	W: Packets Received Errors W: Packets Received Discarded W: Packets Received Unknown			> 1	Adjust network buffers
	Error Count	W: Packets Outbound Errors W: Packets Outbound Discarded			> 1	Adjust network buffers
	Interrupts	sar -y [Terminal Activity]: xmtin/s = Transmitter hardware interrupts per second mdmin/s = Modem interrupts per second rcvin/s = Receiver hardware interrupts per second			-
	Delay	W: Output Queue Length			> 2

SAR -y is Terminal Activity

netstat

Picking A Network Monitor

number of error-free packets the system is dropping

total packet throughput

Counters Packets Outbound Errors and Packets Received Errors indicate network card hardware problems.

Don't rely on the Current Bandwidth counter because it shows theoretical rather than actual bandwidth.

A reasonable limit for an Ethernet network is %Network Use less than 30 percent. A higher value means you need to speed up the network or reduce the amount of traffic.
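When a %Network Use counter is not available, a rough equivalent can be computed from Bytes Total/sec and the link's bandwidth in bits per second. The Python sketch below does that arithmetic and applies the 30 percent guideline. The function names are hypothetical, and the bandwidth argument is the nominal link speed, so the result is subject to the same caveat about theoretical bandwidth noted above.

```python
def network_utilization_percent(bytes_total_per_sec, bandwidth_bits_per_sec):
    """Utilization as a percentage of the nominal link bandwidth."""
    if bandwidth_bits_per_sec <= 0:
        raise ValueError("bandwidth must be positive")
    bits_per_sec = bytes_total_per_sec * 8  # 8 bits per byte
    return 100.0 * bits_per_sec / bandwidth_bits_per_sec

def exceeds_ethernet_guideline(percent, limit=30.0):
    """True when utilization is not below the 30 percent guideline."""
    return percent >= limit

# Hypothetical reading: 5,000,000 bytes/sec on a 100 Mbit/s link.
use = network_utilization_percent(5_000_000, 100_000_000)
print(use, exceeds_ethernet_guideline(use))
```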

Use the %Broadcast Frames and %Multicast Frames counters to view the percentages of broadcast and multicast traffic. Network cards pass broadcast and multicast frames to a higher-level software component before they act on or discard them. This extra activity results in additional CPU use.

As the requesting computer connects to find the server computer's network address, it generates broadcast traffic. Frame traffic increases as the server transfers the files.

Similarly, don't use the Output Queue Length counter because it's always zero, since transmission requests are not handled by the network card but by network device interface specification (NDIS) software.

The MRTG (Multi Router Traffic Grapher)

$1099 Denika from Somix plugs into the MRTG to display 500 different performance trend reports after saving SNMP and IpSwitch WhatsUp Gold logs to a MySQL ODBC database (as a Microsoft service).

Logging Data with Perfmon for MRTG

Network Monitor

Microsoft provides two versions of NetMon. Install the "full" promiscuous version of Netmon from Microsoft's Systems Management Server (SMS) 1.2 and 2.0 product to capture packets on all NICs on remote network subnets.

Otherwise, the network card typically rejects network traffic intended for other network cards.

Install Network Monitor from Control Panel > Add/Remove Programs > Add/Remove Windows Components > Management And Monitoring Tools. This puts netmon.exe and its DLLs in the %windows%\system32\netmon folder.
Apply the patch from MS Security Bulletin MS00-083 to fix the buffer overrun vulnerability triggered by malicious malformed data.
Invoke Netmon from a command prompt or
Start > Programs > Administrative Tools
By default capture files are saved with the .cap file suffix in the My Captures folder under the My Documents folder of the current user.

Netmon filter specifications are stored in the NetMon\Captures subdirectory. Netmon allows filtering by protocol, TCP/IP address, and data pattern.

Limit Network Segment monitoring, because this activity drains the resources of the computer you're analyzing. Monitoring Network Segment counters increases CPU use: as these counters process network traffic, they use additional system resources.

network monitoring tools

Open Source Ethereal displays and filters network data files captured by sniffers, even while a capture session is in progress. It resolves DNS.

Show Traffic, rewrite of Linux Trafshow.

$450 Network Probe from ObjectPlanet.

Network Sniffers

Open Source WinPcap is installed as the system protocol driver NPF, at the same level as tcpip.sys, and is visible in the msinfo32.exe System Information panel under Software Environment. It exports primitives compatible with the Unix capture library libpcap.
To unload it:
```
 net stop npf
```
libpcap tcpdump
Microsoft's Network Monitor,
NAI's Sniffer™,
Sniffer™ Pro,
NetXray™,
Sun snoop and atmsnoop,
Shomiti/Finisar Surveyor,
AIX's iptrace,
Novell's LANalyzer,
RADCOM's WAN/LAN Analyzer,
HP-UX nettl,
i4btrace from the ISDN4BSD project,
Cisco Secure IDS iplog,
pppd log (pppdump-format),
AG Group's/WildPacket's EtherPeek/TokenPeek/AiroPeek,
Visual Networks' Visual UpTime
$1,395 Distinct Network Monitor [A] translates complex protocol negotiation codes into natural language, providing an intelligent interpretation pinpointing where errors occurred.
SNMP4NT SNMPUTIL.EXE SNMP Network Management Station (NMS) to Walk SNMP OID trees to get MIB info from network devices MS perf2mib and mibcc
$200 WebWatchBot

Object	Potential Issue	Summary Counter	Subset	Total	Threshold for Action	Potential Remedies
Memory W:	RAM Installed (see Determining Memory)	-	W: Available K/M/Bytes		< 300 MBytes	Add memory
	RAM Installed (see Determining Memory)	-	W: Committed (Virtual Memory) Bytes (for paging)
	Memory Leakage	W: Pool Paged Bytes			Increases over a long period of time	Java?
	Page Fault operations involving physical (hard) disk activity rather than just in (soft) memory	Page fault recovery operations	W: Page Reads (from disk)/Sec sar -p pgin/s requests/sec	W: (Hard and Soft) Page Faults/Sec	> 20
		Page fault recovery operations	W: Page Writes (to disk)/Sec sar -p atch/s (attaches per second) of page faults satisfied by reclaiming a page in memory.		> 20
		W: Write Copies/sec			none
	Pages involved in Page Fault physical (hard) disk activity rather than just in (soft) memory	U: Paging Rate W: Pages/Sec (Incidents of Hard Faults)	U: Page-in rate W: Pages Input (from disk)/Sec sar -p ppgin/s	W: (Hard and Soft) Page Faults/Sec	> 5	Increase paging file size
		U: Paging Rate W: Pages/Sec (Incidents of Hard Faults)	U: Page-out rate W: Pages Output (to disk)/Sec		> 5	Increase paging file size
		W: (Non-disk/soft Recoverable) Transition Faults/sec sar -p pflt/s page faults from protection errors per sec (illegal access to page) or "copy-on-writes". sar -p vflt/s valid page not in memory (address translation) page faults/sec. sar -p slock/s software lock requests requiring physical I/O faults/sec			none

Is there enough swap space?

How much memory is each process really using (is Private)?

To install RMCmem:

# cd /tmp
# zcat RMCmem3.8.2.tar.gz | tar xvf -
# pkgadd -d .

To obtain private memory totals by mapped file:

# /opt/RMCmem/bin/pmem 361

To obtain private memory by PID:

# /opt/RMCmem/bin/memps

To determine the "Private" bytes that are NOT shared with other processes (excluding application binaries), download and install the RMCmem package from ftp://playground.sun.com/pub/memtool. It supplies kernel modules for Solaris 2.5 that provide extra instrumentation.

sar -r [unused memory pages and disk blocks]:

sar -b [Buffer activity]:

sar -h [system heap statistics, not available in SiteScope]:

sar -p commands obtain paging activities stats from UNIX systems.
Subsets of vflt/s = address translation page faults (valid page not in memory) [not in SiteScope]

sar -k [kernel memory allocation (bytes)]

sml_mem = bytes the KMA has available in the small memory request pool (of less than 256 bytes).
alloc = bytes the KMA has allocated from its small memory request pool to small memory requests.
fail = number of requests for small amounts of memory that failed.
lg_mem = bytes the KMA has available in the large memory request pool (from 512 bytes to 4 Kbytes).
alloc = bytes the KMA has allocated from its large memory request pool to large memory requests.
fail = number of failed requests for large amounts of memory.
ovsz_alloc = bytes allocated for oversized requests (larger than 4 Kbytes). These requests are satisfied by the page allocator. Thus, there is no pool.
fail = number of failed requests for oversized amounts of memory.

Server

MS IIS 6 Counters of the WWW Service, its Web Service Cache, FTP Service, Internet Information Services Globals, SNMP, Active Server Pages, ASP.NET.

Server -> Pool Nonpaged Failures shows the number of times allocations from nonpaged pool have failed. A nonzero value indicates that the computer's physical memory is too small.
Server -> Pool Paged Failures indicates that either physical memory or a paging file is near capacity.
Server -> Pool Nonpaged Peak shows the maximum number of bytes in nonpaged pool the server has had in use at any one point. It indicates how much physical memory the computer should have.

	Potential Issue	Summary Counter	Subset	Total	Threshold for Action	Potential Remedies
System Thread Queues W:	Saturation Congestion	W: Processor Queue Length (ready but non-running threads on all CPUs) The Windows Processor Queue Length counter is part of the Windows System object instead of the Processor object because there is only a single queue for processor time (even on multiprocessor computers). Unless the processor is running at very high sustained utilization, this counter is likely to return a result of 0 because it displays only the last observed value, not an average. sar -q [Queue length]: runq-sz (run queue size) = number of kernel threads in memory waiting for a CPU to run.			> 2 - A sustained processor queue of more than two threads is an indication of a processor bottleneck. Consistently higher values mean that the system might be CPU-bound.	More powerful (high Mhz) CPU capacity.
	Occupation	sar -q [Queue length] %runocc (run queue occupied) = processes in memory and runnable.			> 90%	-
	Thrashing	W: Context Switches/sec (all processors)		mpstat csw (context switches)	> 5% of total threads	Less apps per CPU
	Latency	prstat -mL LAT column (time waiting for CPU)		mpstat csw (context switches)	-	-
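Because the queue-length counter reports only the last observed value, a single reading says little; what matters is whether the value stays above the threshold across repeated samples. The Python sketch below flags a sustained queue. The function name and the choice of 5 consecutive samples are assumptions for illustration; only the threshold of 2 comes from the table above.

```python
def is_sustained(samples, threshold=2, min_consecutive=5):
    """True if at least min_consecutive samples in a row exceed threshold."""
    run = 0
    for value in samples:
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False

# Hypothetical queue-length readings taken at a fixed interval.
print(is_sustained([0, 3, 4, 5, 3, 6, 1]))     # five readings in a row above 2
print(is_sustained([3, 3, 0, 3, 3, 0, 3, 3]))  # spikes, but never sustained
```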

Note: swpq-sz & %swpocc (swap) are no longer reported by sar.

	Potential Issue	Summary Counter	Subset	Total	Threshold for Action	Potential Remedies
Processor W:	Utilization percentage (individual CPU)	U: CPU utilization W: % Processor Time	U: System mode CPU utilization W: % Privileged Time (in kernel-mode) sar -u %sys (system mode)	W: Elapsed Time (100%)	> 80%	Scale CPUs
		U: CPU utilization W: % Processor Time	U: User mode CPU utilization W: % User Time (for apps) sar -u %usr (mode)		> 80%	Scale CPUs
		W: % Idle Time sar -u %idle (not waiting for I/O)			-
	Utilization amount	W: Working Set Peak (bytes)		W: Working Set (bytes)	Difference	?

Solaris vmstat presents memory, run-queue, and summarized processor utilization. It uses kstat which maintains CPU utilization for each CPU.

Solaris mpstat presents per-processor stats and utilization.

Solaris

	Potential Issue	Summary Counter	Subset	Total	Threshold for Action	Potential Remedies
Processes W:	Swapping	count of LWP Transfers per sec	sar -w swpin/s	-	> 1	More memory to swap
		count of LWP Transfers per sec	sar -w swpot/s	-	> 1	More memory to swap
		512 byte Blocks	sar -w bswin/s	-	-	-
		512 byte Blocks	sar -w bswot/s	-	-	-
	Switching	sar -w pswch/s = (kernel thread) switches	sar -w pswpout/s = process swapouts/sec [not in SiteScope]	-	-	-
	Size	prstat -s RSS prstat -z (per zone)	-	-	-	-

Solaris ps presents per-process stats.

Solaris prstat presents thread-level microstate accounting (with high-resolution time stamps) and per-project stats for resource management.

sar -v [entries/size for each table, evaluated once at sampling point, not available in SiteScope]:

sar -c [System calls]:

sar -m [Message and semaphore activities (for Interprocess Communication)]:

sar -t [translation lookaside buffer (TLB) activities, not available in SiteScope]:

sar -I [interrupt statistics, not available in SiteScope]:

sar -a [File access system routines]:

Object	Potential Issue	Summary Counter	Subset	Total	Threshold for Action	Potential Remedies
Physical Disk W:
	Bottleneck	W: Current Disk Queue Length	W: Avg. Disk Read Queue Length	W: Avg. Disk Queue Length sar -d avque	>2	More hard drives.
	Bottleneck	W: Current Disk Queue Length	W: Avg. Disk Write Queue Length	W: Avg. Disk Queue Length sar -d avque	>2
	Utilization Percentage	W: % Disk Time sar -d %busy = portion of time device was busy servicing transfer requests.	W: % Disk Read Time	Elapsed Time	>90%
			W: % Disk Write Time		>90%
		W: % Idle Time			n/a
	Utilization Per Incident	W: Avg. Disk Bytes/Transfer	W: Avg. Disk Bytes/Read		< 20K may indicate an app. is accessing too little at a time.
	Utilization Per Incident	W: Avg. Disk Bytes/Transfer	W: Avg. Disk Bytes/Write
	Utilization Incident Rate	U: Disk Rate W: Disk Transfers/sec sar -d blks/s (of 512 bytes) sar -d r+w/s (read AND write I/O requests)	W: Disk Reads/sec sar -d read/s		Ratio of writes keeping up with reads?	?
	Utilization Incident Rate		W: Disk Writes/sec sar -d write/s		Ratio of writes keeping up with reads?	?
	Speed (Latency)	W: Avg. Disk sec/Transfer	W: Avg. Disk sec/Read		> 0.3 seconds may indicate disk controller needs too many retries.	Faster drives.
	Speed (Latency)	W: Avg. Disk sec/Transfer	W: Avg. Disk sec/Write			Faster drives.
	Delay	sar -d avwait = average wait time in milliseconds. avserv = average service time in milliseconds.			-	-
	Fragmentation	W: Split IO/sec			?	Defragment

Disk I/O speeds are about 10-100 times slower than memory. Disk I/O speeds will be very fast when data is stored on filer disk arrays because such devices usually have a large amount of memory to cache data.

A single spindle can generally handle 50 accesses per second.
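That rule of thumb turns into simple capacity arithmetic. A minimal Python sketch, assuming the 50 accesses per second figure above; the function name and the 420 accesses/sec peak are made up for illustration:

```python
import math


def spindles_needed(peak_accesses_per_sec, per_spindle=50):
    """Smallest whole number of spindles whose combined rate covers the peak."""
    if peak_accesses_per_sec <= 0:
        return 0
    return math.ceil(peak_accesses_per_sec / per_spindle)


# Hypothetical peak measured from Disk Transfers/sec or sar -d r+w/s.
print(spindles_needed(420))  # 9
```

This ignores caching and RAID overhead, so treat the result as a floor rather than a sizing recommendation.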

Solaris/UNIX Hard Disk Monitoring

vmstat utility

vmstat 5 5

		
kthr     memory             page              faults        cpu
----- ----------- ------------------------ ------------ -----------
 r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
 0  0 217485   386  0   0   0   4   14   0 202  300 210 14 19 22 45

In the example above, the "wa" (wait) column shows that the CPU spent 45% of its time idle while waiting for I/O (here, database I/O) to complete.
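Reading those columns by eye gets tedious over a long run. A minimal Python sketch that pulls the CPU breakdown out of one vmstat data line, assuming the column layout shown in the example (the four CPU columns us, sy, id, wa come last); the function name is my own:

```python
def cpu_breakdown(data_line):
    """Return the last four vmstat columns (us, sy, id, wa) as percentages.

    Only the trailing columns are used because "sy" appears twice in the
    header (system calls under faults, system time under cpu).
    """
    fields = data_line.split()
    us, sy, idle, wa = (int(f) for f in fields[-4:])
    return {"us": us, "sy": sy, "id": idle, "wa": wa}


sample = "0  0 217485   386  0   0   0   4   14   0 202  300 210 14 19 22 45"
cpu = cpu_breakdown(sample)
print(cpu["wa"])          # 45
print(sum(cpu.values()))  # 100
```

Other vmstat variants order or name the columns differently, so the slice would need adjusting for them.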

To fully understand the nature of I/O, you must first understand Oracle's asynchronous writing mechanism. Then you need to look at the SAP applications and explore how SAP tables are populated and managed from within the SAP application. With this knowledge, you can then make intelligent decisions about the proper file placement.

iostat

Windows Hard Disk Monitoring

Action:	Physical only	Logical only	Both
Enable:	-yd (default)	-yv	-y
Disable:	-nd	-nv	-n

diskperf.exe -y \\computername

This table presents diskperf parameters to specify the hard disk performance counters to start when the machine is restarted.

Windows XP, however, displays this message:

Both Logical and Physical Disk Performance counters on this system are automatically enabled on demand. For legacy applications using IOCTL_DISK_PERFORMANCE to retrieve raw counters, you can use -Y or -N to forcibly enable or disable. No reboot is required.

DHCP Audit Alerts

Enable

Server name

The following command runs the DHCP Server Locator Utility from the Windows 2000 Resource Kit. Every 6000 seconds it detects active DHCP servers on the network and sends, to the email addresses in file A:\Admins.txt, the IP addresses of any DHCP servers not listed in the file auth_dhcp_ip_list.txt.

dhcploc.exe -p -a:"A:\Admins.txt" -i:6000 auth_dhcp_ip_list.txt

Other Measurements

Object	Metric	Threshold for Action
Memory Pool Size	handles
Thread pool	Context Switches/sec	.
Temp space	Page Faults/sec	.

Log File Formats

Performance Logs and Alerts

later viewing

System events, such as processes created or deleted, are traced using Trace logs.
Counter values are sampled continuously at fixed intervals (such as every 15 seconds), whether or not events occur. They are saved into Counter logs.

Counter Logs

Perfmon format
binary (.blg) -- the default format used by the Windows 2000 Performance Monitor. This format provides all of the information contained in the Perfmon format, but in a more space-efficient manner.
comma-separated-value (.CSV) containing fields delimited by double-quotes and commas.
tab-separated-value (.TSV) containing fields delimited by double-quotes and tab characters.

Both the Perfmon and binary log file formats are proprietary formats developed by Microsoft.

Two types of binary log files exist: circular and linear.

Circular log files record data until they reach a user-defined size, then overwrite the oldest data with new data.
Linear log files can be limited to a user-defined size or be set to grow to 1 GB or to consume all remaining disk space, whichever comes first.
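The difference between the two is easiest to see in miniature. A minimal Python sketch, assuming the usual meaning of a circular log (the oldest records are overwritten once the size limit is reached) and assuming a linear log simply stops accepting records at its limit. Both classes are hypothetical and count records rather than bytes:

```python
from collections import deque


class CircularLog:
    """Keeps only the most recent `max_records` records."""

    def __init__(self, max_records):
        self.records = deque(maxlen=max_records)

    def write(self, record):
        self.records.append(record)  # deque drops the oldest record when full
        return True


class LinearLog:
    """Keeps the first `max_records` records, then refuses new ones."""

    def __init__(self, max_records):
        self.max_records = max_records
        self.records = []

    def write(self, record):
        if len(self.records) >= self.max_records:
            return False  # limit reached: logging stops
        self.records.append(record)
        return True


circular, linear = CircularLog(3), LinearLog(3)
for sample in range(1, 6):
    circular.write(sample)
    linear.write(sample)

print(list(circular.records))  # [3, 4, 5]
print(linear.records)          # [1, 2, 3]
```

A circular log therefore always holds the run-up to "now", which suits catching intermittent problems; a linear log holds the start of the run.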

The first line of CSV-format and TSV-format log files serves as the header. It provides information about the format of the file, the version of the PDH (Microsoft Performance Data Helper) interface used to create the log file, and the name and path of each counter.
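Since the header row names every counter, a script can discover what a CSV log contains before reading any data. A minimal Python sketch using the standard csv module; the sample text below is made up for illustration (the computer name XPPRO, the time zone label, and the values are assumptions, not output from a real log):

```python
import csv
import io


def counter_columns(log_text):
    """Return the counter paths named in the header row.

    The first column is skipped: in the header it holds the format and
    time-zone label, and in data rows it holds the sample timestamp.
    """
    header = next(csv.reader(io.StringIO(log_text)))
    return header[1:]


# Illustrative two-line log: a header row and one data row.
sample_log = (
    '"(PDH-CSV 4.0) (Pacific Standard Time)(480)",'
    '"\\\\XPPRO\\Memory\\Available Bytes",'
    '"\\\\XPPRO\\System\\Processor Queue Length"\n'
    '"08/08/2006 10:00:00.000","512000000","0"\n'
)

for path in counter_columns(sample_log):
    print(path)
```

Using the csv module rather than splitting on commas matters here because every field is wrapped in double quotes.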

The PDH library can open a log file in the Perfmon format only for reading.

Included in all versions of Microsoft Windows XP (excluding Windows XP Home Edition)

relog.exe logfile.blg -f csv -o logfile.csv
converts the .blg file to a .csv file. It can also resample a log file, and then create a new log file based on specified counters, a time period, or a sampling interval.
Download it (429 KB) from the Microsoft Windows 2000 Resource Kit Tools for administrative tasks

logman.exe start Sample_Log
starts the Sample_Log data collection. Adding the -s option with a remote computer name starts the collection remotely from a central location.
logman.exe can also:

Configure a data collection on one computer and then
copy that configuration to multiple computers — from a central location.
Query currently-running logs and traces.

typeperf "Memory\Available Bytes" -s XPPRO -si 00:05
outputs the Available Bytes Memory counter from a remote computer named "XPPRO" every 5 seconds. It can also:

Write performance data to either the command window or to a supported log file format.
Display the counters currently available on a particular local or remote computer.

Trace Logs

.etl

Use the TRACEDMP.EXE utility from the Windows 2000 Server and Professional Resource Kits to read .etl files to create DUMPFILE.CSV files for viewing by other applications. The utility also creates a SUMMARY.TXT file.

Log Codes

Code	Type of Information
0	Success
1	Error
2	Warning
4	Information
8	Audit_Success
16	Audit_Failure
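When post-processing exported event logs, the numeric codes above usually need translating back to names. A minimal Python sketch of that lookup; the dictionary and function names are my own, and the fallback text for unlisted codes is an assumption:

```python
# Codes and names exactly as listed in the table above.
EVENT_TYPES = {
    0: "Success",
    1: "Error",
    2: "Warning",
    4: "Information",
    8: "Audit_Success",
    16: "Audit_Failure",
}


def describe(code):
    """Return the type name for a log code, or a marker for unlisted codes."""
    return EVENT_TYPES.get(code, f"Unknown ({code})")


print(describe(2))   # Warning
print(describe(16))  # Audit_Failure
print(describe(3))   # Unknown (3)
```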

Thresholds Trigger Alerts

Performance Logs and Alerts

Alerts node

thresholds

triggers

Statscout Performance Monitor from Enterprise Management Associates

Server Virtualization Tips: Profiling and load distribution for virtual machines Anil Desai, site expert 08.08.2006

Latencies

J2EE Monitoring

So here are the approaches, from the least costly to the most costly:

1. First, find the average end-to-end response time by emulating end-user client exchanges with the web server. Identify the machine and service which consume the most CPU, network bandwidth, and other resources during stress runs which incrementally add users until a server reaches its maximum rate of processing (as measured by the hits/pages processed per second metric).

2. Identify the average response time of key services by emulating calls directly to each service (XML calls to app servers, SQL calls to databases). Watch them during stress runs.

3. Work with developers to add application code which displays key performance information along with user data (like the times that Google displays with each search result). This allows web (HTML) based client scripts to simply obtain the information.

4. Work with developers to add application code which issues transaction-level performance information to a log. Most mature application packages allow administrators to control the verbosity of the application's logs.

Ideally, the alerts are formatted to make it easy for logs to be combined with other logs for analysis after the run is finished. If not, logs may need to be run through a custom parser.

5. Formatting alerts to the ARM (Application Response Measurement) standard allows the alerts to be issued (using the free API) and collected in real-time by business service management systems in production. See http://www.wilsonmar.com/1perfmon.htm#ARMz This is the best approach, IMHO.

6. Code LoadRunner scripts in Java to emulate the client. I describe this complex approach briefly in http://wilsonmar.com/1lrscript.htm#JavaTech It is time-consuming because parts of the client application need to be rewritten in the test scripts (user authentication, file encryption, client session and cookie management, calls to servers, etc.). Such scripts need to precisely identify and specifically format calls to services.

7. Install agents inside J2EE servers which send status to the Mercury Business Availability Center.

8. The new version of WebLogic integrates with LoadRunner to provide performance monitoring that can be turned on or off dynamically on production machines.

10. J2EE monitoring packages include:

SiteScope from Mercury

Symantec I3 (purchased from Veritas)

Compuware's Vantage Analyzer for J2EE (the JView product acquired from DevStream on October 4, 2004)

Dirig

Quest has added a J2EE monitor to its database monitoring product well-known to DBAs.
Borland's Optimizeit ServerTrace provides J2EE performance metrics during testing and deployment, with monitoring and analysis of a distributed environment.
Borland's Optimizeit Enterprise Suite is used during development to provide individual developers with a focused view into performance issues in their code.

http://www.javaperformancetuning.com/tips/j2ee.shtml
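Approach 4 above mentions running logs through a custom parser when they are not already formatted for analysis. A minimal Python sketch of such a parser, assuming a made-up log line format of `txn=<name> ms=<elapsed>`; real application logs will differ, so the regular expression is the part to adapt:

```python
import re
from collections import defaultdict

# Hypothetical format: "... txn=<name> ms=<elapsed milliseconds>"
LINE = re.compile(r"txn=(?P<name>\S+)\s+ms=(?P<ms>\d+)")


def average_ms(lines):
    """Return the mean elapsed milliseconds per transaction name."""
    totals = defaultdict(lambda: [0, 0])  # name -> [sum of ms, count]
    for line in lines:
        match = LINE.search(line)
        if match is None:
            continue  # skip lines that carry no timing information
        entry = totals[match.group("name")]
        entry[0] += int(match.group("ms"))
        entry[1] += 1
    return {name: total / count for name, (total, count) in totals.items()}


log = [
    "2006-08-08 10:00:01 txn=login ms=120",
    "2006-08-08 10:00:02 txn=search ms=300",
    "2006-08-08 10:00:03 txn=login ms=80",
    "server started",
]
print(average_ms(log))  # {'login': 100.0, 'search': 300.0}
```

Averages hide outliers, so a fuller parser would also keep the maximum or a high percentile per transaction.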

ARM Instrumentation

messages

Enterprise Business Service Management Applications

To measure application availability, application performance, application usage, and end-to-end transaction response time in a vendor-neutral way, the ARM (Application Response Measurement) API (first released June 1996) was created by an Open Source Working Group, based on their UMA (Universal Measurement Architecture), in which ARM API calls are received by ARM agents that feed Enterprise Management Applications.

The ARM Technical Standard in July 1998 originally defined a set of six library procedure calls for programmers to call from within their source code to initialize the ARM subsystem and define the beginning and end of Business Transactions being measured:

arm_init	Names an application (with a handle) and initializes the ARM environment for the new application handle.
arm_getid	Names each business transaction monitored within the app and assigns it a unique transaction identifier.
arm_start	Starts the clock for a unique transaction instance.
arm_update	Update statistics for a long running transaction
arm_stop	Stops the clock for (register the end of) a transaction instance.
arm_end	Cleans up the ARM environment prior to shutdown for the app handle returned by a previous arm_init.
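The order in which the six calls are made matters more than any one of them. A minimal Python sketch of that order using a stand-in class; this is not a real ARM binding (the standard defines C procedures), and the class name, the simplified argument lists, and the application and transaction names are all hypothetical:

```python
import time


class SketchArm:
    """Stand-in that mimics the order of the six ARM calls."""

    def __init__(self):
        self._next = 1
        self.apps = {}       # app handle -> application name
        self.tran_ids = {}   # transaction id -> (app handle, name)
        self.running = {}    # start handle -> (transaction id, start time)
        self.completed = []  # (transaction name, status, elapsed seconds)

    def _handle(self):
        handle, self._next = self._next, self._next + 1
        return handle

    def arm_init(self, app_name):
        app = self._handle()
        self.apps[app] = app_name
        return app

    def arm_getid(self, app, tran_name):
        tran_id = self._handle()
        self.tran_ids[tran_id] = (app, tran_name)
        return tran_id

    def arm_start(self, tran_id):
        start = self._handle()
        self.running[start] = (tran_id, time.perf_counter())
        return start

    def arm_update(self, start, note):
        pass  # a real agent would record progress of a long transaction

    def arm_stop(self, start, status=0):
        tran_id, began = self.running.pop(start)
        elapsed = time.perf_counter() - began
        self.completed.append((self.tran_ids[tran_id][1], status, elapsed))
        return elapsed

    def arm_end(self, app):
        del self.apps[app]


arm = SketchArm()
app = arm.arm_init("OrderEntry")             # once per application
tran = arm.arm_getid(app, "SubmitOrder")     # once per transaction type
started = arm.arm_start(tran)                # once per transaction instance
arm.arm_update(started, "halfway")
arm.arm_stop(started)
arm.arm_end(app)                             # once at shutdown
print(arm.completed[0][0])  # SubmitOrder
```

The nesting is the point: init and end bracket the application, getid registers a transaction type once, and start and stop bracket each instance.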

ARM 2.0 SDK dated 11/11/97 is offered in UNIX and Windows flavors, along with sample.c source code for each platform. ARM 2.0 added the ability to correlate parent and child transactions, and to collect other measurements associated with the transactions, such as the number of records processed. This SDK (explained in the User Guide) provides:

a libarm4 dynamic-link library (DLL) for Microsoft Windows (copied to the System32 folder) or a Linux shared library.
the arm4.h header file for C apps. ARM version 4 added the capability to track the amount of time a transaction is blocked waiting for an external event.
for use in testing instrumentation, the logagent.c source code for a logging agent, which makes use of the armagent.h header file used by agents.
stubs for both C and (since ARM 3.0 in 2001) Java apps to use when an ARM agent is not installed.

ARM 2.01 Patched ARM 2.0 with new arm201.h files.
ARM 3.0 SDK added Java bindings.
ARM 4.0 (also confusingly called ARM Version 2, because ARM 4.0 is not backward compatible with ARM 2.0) published header files and bindings for C and Java in October 2003, but no sample source.

Business Service Management Applications

The "Big Four":

IBM's

Business Workload Manager (BWLM) Prototype

HP OpenView Measure Ware glanceplus

Mercury's Diagnostics

BMC Patrol

CA (Computer Associates)

Compuware

ASG (formerly Landmark until February 19, 2002)

Emerging vendors:

Hyperic

ZenOSS

Integration vendors:

ProactiveNet

Martin Haworth at HP OpenView shows an alternative mechanism for capturing response time measures in Service Management Using The Application Response Measurement API Without Application Source Code Modification: interactions are routed through "dumb" Remote Terminal Emulation (RTE) so that the data exchanged can be captured for examination.

Log Analysts

Seagate Crystal Reports 6 Event Log Viewer is a full-featured report writer that provides an easy way to extract, view, save, and publish information from the Windows 2000 system, application, and security event logs in a variety of formats. It integrates new web reporting technology.
CyberSafe Log Analyst is a Windows 2000 Security Event Log analysis tool designed as a snap-in to the Microsoft Management Console (MMC) used with Windows 2000. It organizes and interprets security event logs from Windows 2000, providing more effective, system-wide user activity analysis.

Also:

Foundstone NTLast

Tuning Applications

base priority

process code

To move the paging file to another hard disk on your computer running Windows 2000 Professional, in the System Properties dialog box, click the Advanced tab, and then click the Performance Options button.

Quality of Service

Admission Control Service

Quality of Service (QoS)

restrict or guarantee

$5 Windows 2000 Performance Tuning and Optimization (McGraw-Hill: 2001) by Kenton Gardinier, Chris Amaris

Quiz Questions

What usually has the greatest negative effect on processor performance?
a. paging
b. compression
c. fragmentation
d. network bandwidth
Answer: b
Which command could you use to measure CPU load on a UNIX or Linux system?
a. sar -q
b. strace
c. winstat 5
d. netstat -m
Answer: a
Where should the Linux and Windows swap partition be located to provide the best performance?
a. at the end of the drive
b. in the middle of the drive
c. at the beginning of the drive
d. do not use a swap partition; use a swap file instead
Answer: c

Websites on Perfmon

Runtime profiling The .NET performance counters API
Analyzing Processor Performance
Saving Settings
Troubleshooting Performance Monitor Counter Problems
How to Create a Performance Monitor Log for NT Troubleshooting
Finding Leaks and Bottlenecks with a Windows NT PerfMon COM Object API (16 printed pages)
How to Use Logevent.exe to Log Events From a Batch File
February 2000 Microsoft Systems Journal article: It's Simple to Build PerfMon Support into Your Apps With a Little Help from COM
August 1998 Microsoft Systems Journal article: Custom Performance Monitoring for Your Windows NT Applications
Sample ADO program SmAlert.exe Extends PerfMon's Alert Mechanism described by Rick Anderson's March 1999 Alerts Are Cheap Insurance
Confirmed problem: Windows 2000 Performance counters may become corrupted and result in inaccurate data or a total loss of performance data from that counter.
C Programming Code for Transferring Data from a Perfmon-format Log File to a CSV-format Log File
PerfMon counter values can be retrieved by a VB Script file after the Windows Management Instrumentation SDK program MOFCOMP compiles and loads a .mof file specifying providers, classes, keys, and Property Context values conforming to the industry standard Common Information Model (CIM). Download sample files.
How to Monitor Free Space in a User SQL Database with PerfMon
How to Set Up SQL Performance Monitor Database Alerts

Web Metrics

Installing and Configuring Windows 2000

