Study the System/Application Under Test to specify the Test Architecture and Test Behavior. Define Test Timing Constraints. Prepare Test Data.
Conclusions are published along with recommendations for management inquiries and action.
The approach above was drawn from several capacity management frameworks.

In the electronics industry: after prototyping, and after the product goes through the Design Refinement cycle in which engineers revise and improve the design to meet performance and design requirements and specifications, objective and comprehensive Design Verification Testing (DVT) is performed to verify all product specifications, interface standards, OEM requirements, and diagnostic commands. Process (or Pilot) Verification Testing (PVT) is a subset of Design Verification Tests (DVT) performed on pre-production or production units to verify that the design has been correctly implemented into production.

The Microsoft Operations Framework (MOF) defines this circular process flow of capacity management activities:
Oracle's Expert Services'
Smith's Software Performance Engineering (SPE) approach begins with these detailed steps:
"5S" Kaizen Lean ApproachSort > Stabilize (Set in order) > Shine > Standardize > Sustain
Sort tools used most often vs. what is infrequent
A. Speed Tests
During speed testing, the user response time (latency) of each user action is measured. The script for each action looks for some text on each resulting page to confirm that the intended result appears as designed. Since speed testing is usually the first performance test to be performed, issues from installation and configuration are identified during this step. Because this form of performance testing is performed for a single user (under no other load), it exposes issues with the adequacy of CPU, disk I/O access and data transfer speeds, and database access optimizations.
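To make this concrete, below is a minimal single-user speed-test sketch in Python (standard library only); the URLs, action names, and expected-text strings are hypothetical placeholders rather than output from any particular tool.

```python
# Minimal single-user "speed test": time each user action and verify that
# the expected text appears in the response (hypothetical URLs and strings).
import time
import urllib.request

ACTIONS = [                      # (action name, URL, text expected on the page)
    ("home page",  "http://example.com/",                 "Welcome"),
    ("login page", "http://example.com/login",            "Password"),
    ("search",     "http://example.com/search?q=widget",  "results"),
]

def time_action(name, url, expected_text):
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=30) as resp:
        body = resp.read().decode("utf-8", errors="replace")
    elapsed = time.perf_counter() - start
    ok = expected_text in body          # confirm the intended result appears
    return name, elapsed, ok

if __name__ == "__main__":
    for name, url, text in ACTIONS:
        name, elapsed, ok = time_action(name, url, text)
        status = "PASS" if ok else "FAIL (expected text missing)"
        print(f"{name:12s} {elapsed*1000:8.1f} ms  {status}")
```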
The performance speed profile
B. Contention Tests (for Robustness)
This form of performance test aims to find performance bottlenecks (such as lock-outs, memory leaks, and thrashing) caused by a small number of Vusers contending for the same resources. Each run identifies the minimum, average, median, and maximum times for each action. This is done to make sure that the data and processing of multiple users are appropriately segregated. Such tests identify the largest burst (spike) of transactions and requests that the application can handle without failing. Such loads resemble the arrival rate at web servers more closely than constant loads do.
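As a rough sketch of how such minimum/average/median/maximum figures might be gathered from a handful of contending virtual users, the Python snippet below runs a few threads against a single (hypothetical) URL and summarizes their timings; the target, user count, and iteration count are assumptions for illustration.

```python
# Sketch: a few concurrent "Vusers" hit the same resource and the run
# reports min / average / median / max response times.
import statistics
import threading
import time
import urllib.request

TARGET = "http://example.com/checkout"   # hypothetical contended resource
VUSERS = 5                               # small number of concurrent users
ITERATIONS = 10                          # requests per virtual user

timings = []                             # shared list of elapsed seconds
lock = threading.Lock()

def vuser():
    for _ in range(ITERATIONS):
        start = time.perf_counter()
        try:
            urllib.request.urlopen(TARGET, timeout=30).read()
        except OSError:
            continue                     # count only successful requests
        elapsed = time.perf_counter() - start
        with lock:
            timings.append(elapsed)

threads = [threading.Thread(target=vuser) for _ in range(VUSERS)]
for t in threads: t.start()
for t in threads: t.join()

if timings:
    print(f"min    {min(timings)*1000:8.1f} ms")
    print(f"avg    {statistics.mean(timings)*1000:8.1f} ms")
    print(f"median {statistics.median(timings)*1000:8.1f} ms")
    print(f"max    {max(timings)*1000:8.1f} ms")
```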
C. Volume Tests (for Extendability)
These test runs measure the pattern of response time as more data is added. These tests make sure there is enough disk space and provisions for handling that much data, such as backup and restore. |
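One way to chart response time against data volume is to repeat the same timed query after each batch of inserts. The sketch below uses an in-memory SQLite table purely as a stand-in for the application's real data store; the table, query, and batch sizes are illustrative.

```python
# Sketch: measure how a representative query slows as row count grows.
# SQLite in memory is only a stand-in for the real data store.
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)")

BATCH = 50_000
for step in range(1, 6):                     # add data in successive batches
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        ((i, f"cust{i % 1000}", i * 0.01) for i in range(BATCH)),
    )
    conn.commit()
    start = time.perf_counter()
    conn.execute(
        "SELECT customer, SUM(total) FROM orders GROUP BY customer"
    ).fetchall()                             # representative query
    elapsed = time.perf_counter() - start
    print(f"{step * BATCH:>8} rows: query took {elapsed*1000:7.1f} ms")
```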
D. Stress / Overload
This is done by gradually ramping up the number of Vusers until the system "chokes" at a breakpoint (when the number of connections flattens out, response times degrade or time out, and errors appear). During these tests, the resources used by each server are measured to make sure there is enough transient memory space and that memory management techniques are adequate. This effort makes sure that admission control techniques limiting incoming work perform as intended, including detection of and response to Denial of Service (DoS) attacks.
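The breakpoint rule can be expressed directly: stop the ramp when throughput stops growing, response time degrades past a limit, or errors appear. The sketch below applies that rule to sample measurements; the numbers and thresholds are illustrative assumptions, not measurements from a real system.

```python
# Sketch: find the "choke" point in a ramp-up by watching for flattening
# throughput, degrading response time, or the first errors.
# The sample measurements and thresholds are illustrative only.
samples = [  # (vusers, throughput req/s, avg response ms, error count)
    (  50,  480,  210,  0),
    ( 100,  950,  220,  0),
    ( 200, 1850,  260,  0),
    ( 400, 3400,  390,  0),
    ( 800, 3550, 1800,  0),   # throughput flattens, latency degrades
    (1600, 3300, 6500, 42),   # errors appear
]

MAX_RESPONSE_MS = 1000        # service-level limit (assumed)
MIN_THROUGHPUT_GAIN = 1.10    # <10% gain when doubling load = "flat"

def find_breakpoint(samples):
    for prev, cur in zip(samples, samples[1:]):
        flat   = cur[1] < prev[1] * MIN_THROUGHPUT_GAIN
        slow   = cur[2] > MAX_RESPONSE_MS
        errors = cur[3] > 0
        if flat or slow or errors:
            return cur[0], {"flat": flat, "slow": slow, "errors": errors}
    return None, {}

vusers, reasons = find_breakpoint(samples)
print(f"System chokes at about {vusers} Vusers: "
      + ", ".join(k for k, v in reasons.items() if v))
```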
E. Fail-Over
For example, this form of performance testing ensures that when one computer in a cluster fails or is taken offline, the other machines in the cluster can quickly and reliably take over the work being performed by the downed machine. This means this form of performance testing requires multiple identical servers configured to use virtual IP addresses accessed through a load balancer.
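A minimal fail-over drill can be scripted as a poll of the virtual IP and each cluster node while one node is deliberately taken offline, recording which endpoints keep answering. The hostnames and health-check paths below are placeholders.

```python
# Sketch: during a fail-over drill, keep polling the load-balanced virtual
# IP and each cluster node, logging which endpoints still answer.
# Hostnames and paths are placeholders for a real cluster.
import time
import urllib.request

ENDPOINTS = {
    "virtual IP": "http://app.example.com/health",
    "node 1":     "http://app-node1.example.com/health",
    "node 2":     "http://app-node2.example.com/health",
}

def probe(url):
    try:
        start = time.perf_counter()
        with urllib.request.urlopen(url, timeout=5) as resp:
            ok = resp.status == 200
        return ok, time.perf_counter() - start
    except OSError:
        return False, None

for cycle in range(12):                  # poll for about one minute
    line = []
    for name, url in ENDPOINTS.items():
        ok, elapsed = probe(url)
        status = f"UP {elapsed*1000:.0f} ms" if ok else "DOWN"
        line.append(f"{name}: {status}")
    print(time.strftime("%H:%M:%S"), " | ".join(line))
    time.sleep(5)
```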
F. Spike
Such runs can involve a "rendezvous point" where all users line up to make a specific request at a single moment in time. Such runs enable the analysis of "wave" effects through all aspects of the system. Most importantly, these runs expose the efficacy of load balancing.
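A rendezvous point can be imitated with a thread barrier: every virtual user prepares its request, waits at the barrier, and then all fire at once. The target URL and user count in this sketch are illustrative.

```python
# Sketch: use a barrier as a "rendezvous point" so all virtual users
# issue the same request at (nearly) the same instant.
import threading
import time
import urllib.request

TARGET = "http://example.com/report"     # hypothetical spiked request
VUSERS = 20

barrier = threading.Barrier(VUSERS)      # every thread waits here
results = []
lock = threading.Lock()

def vuser(n):
    barrier.wait()                       # the rendezvous: release together
    start = time.perf_counter()
    try:
        urllib.request.urlopen(TARGET, timeout=30).read()
        ok = True
    except OSError:
        ok = False
    with lock:
        results.append((n, ok, time.perf_counter() - start))

threads = [threading.Thread(target=vuser, args=(i,)) for i in range(VUSERS)]
for t in threads: t.start()
for t in threads: t.join()

passed = [r for r in results if r[1]]
print(f"{len(passed)}/{VUSERS} requests succeeded during the spike")
```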
G. Endurance
Because longer tests usually involve the use of more disk space, these test runs also measure the pattern of build-up in "cruft" (obsolete logs, intermediate data structures, and statistical data that need to be periodically pruned). Longer runs allow for the detection and measurement of the impact of occasional events (such as Java Full GC and log truncations) and anomalies that occur infrequently. These tests verify provisions for managing space, such as log-truncation "cron" jobs that normally sleep but awaken at predetermined intervals (such as in the middle of the night).
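During a long run, this build-up can be made visible simply by sampling disk usage at intervals and reporting the growth rate, as in the sketch below; the monitored path, interval, and sample count are assumptions.

```python
# Sketch: sample disk usage during an endurance run to see how fast
# "cruft" (logs, intermediate data) accumulates. Path/interval are assumed.
import shutil
import time

PATH = "/var/log"              # hypothetical volume holding logs
INTERVAL_S = 600               # sample every 10 minutes
SAMPLES = 6                    # one hour of samples for this sketch

readings = []
for _ in range(SAMPLES):
    usage = shutil.disk_usage(PATH)
    readings.append((time.time(), usage.used))
    print(time.strftime("%H:%M:%S"), f"used: {usage.used / 2**30:.2f} GiB")
    time.sleep(INTERVAL_S)

elapsed_h = (readings[-1][0] - readings[0][0]) / 3600
growth_gib = (readings[-1][1] - readings[0][1]) / 2**30
if elapsed_h > 0:
    print(f"growth rate: {growth_gib / elapsed_h:.3f} GiB/hour")
```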
H. Scalability
The outcome of scalability efforts feeds a spreadsheet to calculate how many servers the application will need based on assumptions about demand. |
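That spreadsheet calculation boils down to dividing assumed peak demand by measured per-server capacity and adding headroom, as in this sketch (all figures are illustrative assumptions):

```python
# Sketch: how many servers the application will need, given demand
# assumptions and measured per-server capacity (all numbers illustrative).
import math

peak_requests_per_sec = 420        # assumed peak demand
per_server_capacity   = 85         # req/s one server sustained in testing
target_utilization    = 0.70       # keep servers below 70% busy at peak
n_plus_one            = 1          # spare server for fail-over

needed = math.ceil(peak_requests_per_sec /
                   (per_server_capacity * target_utilization)) + n_plus_one
print(f"servers required: {needed}")   # -> 9 with these assumptions
```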
I. Availability
These are run on applications in production mode. This provides alerts when thresholds are reached, and trends to gauge the average and variability of response times.
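Such monitoring amounts to a rolling window over measured response times with threshold alerts; the sketch below shows that logic over sample values (the window size, threshold, and data are assumptions).

```python
# Sketch: rolling average and variability of response times, with a
# simple threshold alert. Sample data and threshold are illustrative.
import statistics
from collections import deque

THRESHOLD_MS = 600
WINDOW = 5

samples_ms = [310, 295, 330, 350, 900, 870, 920, 410, 380, 360]
window = deque(maxlen=WINDOW)

for i, ms in enumerate(samples_ms, 1):
    window.append(ms)
    avg = statistics.mean(window)
    spread = statistics.pstdev(window)
    flag = "ALERT" if avg > THRESHOLD_MS else "ok"
    print(f"sample {i:2d}: {ms:4d} ms  avg={avg:6.1f}  stdev={spread:6.1f}  {flag}")
```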
The above makes use of concepts from Six Sigma.
Traditional "Six Sigma" projects aim to improve existing products and processes
using a methodology with an acryonym of DMAIC (Commonly pronounced duh-may-ick,
for Define, Measure, Analyze, Improve, and Control).
Load Testing is a sub-process of the Capacity Management function
within the
Service Management standards.
The capacity plan is the consolidated output (deliverable) from the capacity management process.
This pseudo use case diagram
summarizes the information (artifacts) flowing among people assuming certain
roles
involved in managing the performance of large applications:
WHATs - Requirements Critical to Satisfaction | Average Weight (1-5) | Casual User | Exp. User | Sys Admin |
---|---|---|---|---|
1.1 fast to load | 4.5 | 4 | 4 | 5 |
1.2 quick response after submit | 4.3 | 4 | 3 | 5 |
2.1 accepts batched transactions | 3.3 | 0 | 2 | 5 |
2.2 dependable | 3.1 | 4 | 4 | 5 |
3.1 Does not timeout | 1.9 | 1 | 3 | 2 |
4.1 Quick to Recover | 1.9 | 1 | 3 | 2 |
The column to the right of each requirement contains weight ratings that allow certain customer requirements to be weighted higher in priority than others in the list. The example shown here is the average of weights from different sub-groups. The "(1-5)" range in this example can optionally be replaced with ISO/IEC 14598-1 evaluation scales or with advanced methods such as Thomas Saaty's "Analytic Hierarchy Process," which establishes precise ratio scales:
The customer sub-groups shown in this example are for roles working with a computer application:
QFD graphic programs can add:
The International TechneGroup, Inc. (ITI) approach for Concurrent Product/Manufacturing Process Development breaks the "WHATs" of the "voice of the customer" (VOC) down further into User Wants, Must Haves, Business Wants, and Provider Wants.
The CMM (Capability Maturity Model developed at Carnegie Mellon University) has 7 measures:
Associated with these are the quality characteristics and sub-characteristics defined in ISO/IEC 9126:
Quality Characteristic | Sub-characteristics | Definition: Attributes of software that bear on the ... |
---|---|---|
Functionality | Suitability | presence and appropriateness of a set of functions for specified tasks. |
Accurateness | provision of right or agreed results or effects. | |
Interoperability | Attributes of software that bear on its ability to interact with specified systems. | |
Compliance | Attributes of software that make the software adhere to application related standards or conventions or regulations in laws and similar prescriptions. | |
Security | Attributes of software that bear on its ability to prevent unauthorized access, whether accidental or deliberate, to programs or data. | |
Reliability | Maturity | frequency of failure by faults in the software. |
Fault tolerance | Attributes of software that bear on its ability to maintain a specified level of performance in case of software faults or of infringement of its specified interface. | |
Recoverability | capability to re-establish its level of performance and recover the data directly affected in case of a failure and on the time and effort needed for it. | |
Usability | Understandability | users' effort for recognizing the logical concept and its applicability. |
Learnability | users' effort for learning its application. | |
Operability | users' effort for operation and operation control. | |
Efficiency | Time behaviour | Attributes of software that bear on response and processing times and on throughput rates in performing its function. |
Resource behavior | amount of resource used and the duration of such use in performing its function. | |
Maintainability | Analyzability | effort needed for diagnosis of deficiencies or causes of failures, or for identification of parts to be modified. |
Changeability | effort needed for modification, fault removal or for environmental change. | |
Stability | risk of unexpected effect of modifications. | |
Testability | effort needed for validating the modified software. | |
Portability | Adaptability | opportunity for its adaptation to different specified environments without applying other actions or means than those provided for this purpose for the software considered. |
Installability | effort needed to install the software in a specified environment. | |
Conformance | Attributes of software that make the software adhere to standards or conventions relating to portability. | |
Replaceability | Attributes of software that bear on opportunity and effort using it in the place of specified other software in the environment of that software. |
ISO/IEC 14598 gives methods for the measurement, assessment, and evaluation of software product quality.
SPICE - Software Process Improvement and Capability dEtermination is a major international standard for Software Process Assessment. There is a thriving SPICE user group known as SUGar. The SPICE initiative is supported by both the Software Engineering Institute and the European Software Institute. The SPICE standard is currently in its field trial stage.
Perspective & Core Measures | Sample Metrics | Relevant Metrics |
---|---|---|
Customer: How do our customers see us? | Satisfaction, retention, market, and account share | |
Financial (Results): How do we look to shareholders? | Return on investment and economic value-added | |
Internal (Efficiency): What must we excel at? | Quality, response time, cost, and new product introductions | |
Learning and Growth: How can we continue to improve and create value? | Employee satisfaction and information system availability | |
These Balanced Scorecard metrics imply these business strategies:
Test Type | Timing (When) |
---|---|
A. Speed | Parallel with coding construction, as this provides developers feedback on the impact of their choice of application architecture. |
| |
C. Data Volume | On each release when app components are being integrated. |
D. Stress/Overload | Pre-Production for each new application version or hardware configuration. |
E. Fail-over | Pre-Production for each new application version or hardware configuration. |
H. Scalability | Pre-Production for each new application version or hardware configuration. |
I. Availability | In Production for each new application version or hardware configuration. |
Context | Components | Tuning Options | |
---|---|---|---|
A. | Business | | |
B. | Applications | | |
C. | Operating System | | |
D. | Server Hardware Devices | | |
E. | Telecommunications Infrastructure | | |
F. | Data Center Operations | | |
Environments: Specific machines on the technology "stack"
Machines Specific to the Load Test Environment
Component resources within each server
Potential Obstacle / Risk | Likelihood | Mitigation |
---|---|---|
1. Servers not available early during the project. | High (80%) | a. Use dev. environment to develop single-user scripts. |
2. Difficulty with Controller licensing, capacity, etc. | Medium (50%) | b. Identify issues early by beginning to use the Controller as soon as the first small script (such as login only) is coded. |
3. Developers not available | Medium (50%) | c. Perform thorough system analysis to identify issues before scripting. d. Develop scripts with likely issues early. |
4. Not enough capacity in front-end (portal/login) servers. | Medium (40%) | e. Quantify capacity of front-end servers with login_only scripts. |
5. Change of personnel during the project. | Medium-High (60%) | f. Take notes. Conduct formal peer walk-throughs. g. Make assignments for skill development. |
6. Servers become unavailable late during the project. | Low (20%) | h. Use the production staging environment at night. i. Instead of going through the load balancer, test directly against one server taken off its cluster. |
7. Changes in server hardware. | Low (10%) | j. Conduct benchmark tests on hardware as part of the project. k. Save server configuration files for historical comparisons. |
Requests per Day | Average Requests per Second | Maximum Requests per Second |
---|---|---|
10,000 | 1 | 6 |
50,000 | 3 | 8 |
100,000 | 5 | 14 |
215,000 | 10 | 21 |
300,000 | 14 | 27 |
450,000 | 20 | 35 |
500,000 | 24 | 40 |
648,000 | 30 | 47 |
864,000 | 40 | 59 |
1,000,000 | 47 | 68 |
1,080,000 | 50 | 71 |
1,500,000 | 70 | 94 |
2,000,000 | 93 | 120 |
2,160,000 | 100 | 128 |
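The figures in a table like the one above can be approximated by spreading the daily volume over an assumed busy window and estimating the peak second as a few standard deviations above the mean (a Poisson-style rule of thumb). The 6-hour window and the mean-plus-3-sigma rule in the sketch below are assumptions for illustration and only roughly reproduce the table.

```python
# Sketch: turn a daily request volume into average and rough peak
# requests-per-second figures. The 6-hour "busy window" and the
# mean + 3*sqrt(mean) peak rule (Poisson-style) are assumptions for
# illustration; they only approximate the table above.
import math

BUSY_WINDOW_S = 6 * 3600          # assume most traffic lands in 6 busy hours

def per_second_rates(requests_per_day):
    avg = requests_per_day / BUSY_WINDOW_S
    peak = avg + 3 * math.sqrt(avg)          # ~3 standard deviations above mean
    return math.ceil(avg), math.ceil(peak)

for daily in (10_000, 100_000, 864_000, 2_160_000):
    avg, peak = per_second_rates(daily)
    print(f"{daily:>9,} req/day -> ~{avg} avg req/s, ~{peak} peak req/s")
```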
Each piece of equipment has a limit on how much it can produce.
An assembly can only handle as much as its smallest channel.
For example, a web server has an input buffer, an internal queue, and an output buffer.
Capacity planning should consider not just "average" loads but "maximum" values during various blocks of time.
So the capacity manager must be involved across all categories of the entire CIT architecture supporting the organization's Service Catalog:
The components within each server:
The impact of bottlenecks is reflected in the percentage-utilization metric for each resource.
The CDB (Capacity Database) contains the detailed technical, business, and service-level management data that supports the capacity management process. The resource and service performance data in the database can be used for trend analysis and for forecasting and planning.
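For example, a straight line fitted to utilization samples pulled from the CDB gives a crude forecast of when a resource will cross a planning threshold. The monthly figures and 80% threshold in this sketch are illustrative.

```python
# Sketch: fit a straight line to monthly CPU-utilization samples (as might
# be pulled from a capacity database) and forecast when a planning
# threshold would be crossed. The sample data are illustrative.
def linear_fit(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x     # slope, intercept

months      = [1, 2, 3, 4, 5, 6]
utilization = [52, 55, 57, 61, 63, 67]        # percent CPU busy, by month

slope, intercept = linear_fit(months, utilization)
THRESHOLD = 80                                 # planning threshold (%)
months_to_threshold = (THRESHOLD - intercept) / slope
print(f"trend: +{slope:.1f}% per month; "
      f"threshold of {THRESHOLD}% reached around month {months_to_threshold:.1f}")
```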
Mainframe-based data collection methods and tools include MXG, SMF, SAS, and quantitative analysis techniques.
verifier.exe /flags 2 /driver drivername
Professional Web Site Optimization (Wrox Press, February 1, 1997) by Michael Tracy, Scott Ware, Robert Barker, and Louis Slothouber
Microsoft's Open Wiki Forum for performance and scalability discussions
Computer Systems Performance Evaluation and Prediction (Digital Press; October 20, 2002) by Dartmouth professors Paul Fortier and Howard Michel. This textbook fills the void between engineering practice and the academic domain's treatment of computer systems performance evaluation and assessment by providing a single source on how to perform computer systems engineering tradeoff analysis, which allows managers to realize cost-effective yet optimal computer systems tuned to a specific application.
List of Web Site Test Tools and Site Management Tools maintained by Rick Hower
Examples of user steps/actions:
1. Invoke URL
2. Login/Logon
3. Pick/Start Application
4. Add one
5. Import batch
6. Search
7. Export batch
8. Archive/backup
9. Delete
10. Retrieve/undelete
11. Restore from archive
12. Exit application
13. Logout/Logoff
An important outcome of the design phase is how results will be organized and
presented to various audiences.
Results from the |
Iteration | Conforming Connections | Percent Conforming | Throughput (ops/sec) | Response (msec) | ops/sec per Load Generator | kbit/sec |
---|---|---|---|---|---|---|
1 | 4130 | 100.0% | 11619.9 | 355.4 | 2.81 | 335.4 |
2 | 4130 | 100.0% | 11583.8 | 356.5 | 2.80 | 334.4 |
3 | 4130 | 100.0% | 11610.2 | 355.7 | 2.81 | 335.1 |
This document makes use of the terminology from the UML 2.0 Testing Profile specification v1.0 (July 7, 2005). This enables test definition and test generation based on the structural (static) and behavioral (dynamic) aspects of UML models. The UML 2 Testing Profile was developed from several predecessors: SDL-2000, MSC-2000, and TTCN-3 (Testing and Test Control Notation version 3), also published as ITU-T Recommendation Z.140. Developed during 1999-2002 at ETSI (the European Telecommunications Standards Institute), TTCN-3 is a widely accepted standard in the telecommunication and data communication industry as a protocol test system development specification and implementation language used to define test procedures for black-box testing of distributed systems.
Neil J. Gunther, Xerox PARC & Pyramid (Fujitsu) alumnus, founder of Performance Dynamics, and developer of the PARCbench multiprocessor benchmark
Completion Nickname | Milestone Description | Target Date |
---|---|---|
m.1.allocated | |
m.2.delivered | |
m.3.assembled | |
m.4.installed | |
m.5.configured | |
m.6.available | |
m.7.operational | |
m.8.benchmarked | |
The IBM Rational Unified Process (RUP) for Performance tracks the progress of performance testing
by the maturity of scripting assets.
Test Level | A. Config. | B. # Users | C. Range of Values | D. Prior Data | E. Length of Run | Purpose (Type of Testing) |
---|---|---|---|---|---|---|
1. | Initial | One | Static | None | Short | Initial script creation through recording and playback, plus addition of error checking, transaction definition, etc. |
2. | Initial | Few | Dynamic | None | Short | Initial data parameterization and monitor debugging. |
3. | Initial | Few | Complex | None | Short | Initial scenario debugging and report template formatting. |
4. | Initial | Few | Complex | Much | Short | |
5. | Baseline | Many | Complex | Much | Short | |
6. | Baseline | Many | Complex | Much | Long | |
7. | Altered | Many | Complex | Much | Short | Application regression smoke testing vs. the stress-tested baseline above. |
8. | Altered | Many | Complex | Much | Long | Comparison testing (vs. the current baseline). |
Completion Nickname | Milestone Description | Target Date |
---|---|---|
t.1.defined | |
t.2.drafted | |
t.3.repeatable | |
t.4.validated | |
t.5.results confirmed | |
t.6.analyzed | |
t.7.concluded | |

The scripting milestones above are repeated for each form/type of performance testing (t).

Note: There may be several rounds of testing to determine the impact of each change to run parameters or application code.
Book by Virgilio A.F. Almeida (Professor of Computer Science at the Federal University of Minas Gerais, Brazil) and Daniel A. Menasce. eBook ISBN: 1417507810.
Tuning Java

Among White Papers by Mercury Interactive: