Study the System/Application Under Test to specify the Test Architecture and Test Behavior. Define Test Timing Constraints. Prepare Test Data.
Conclusions are published along with recommendations for management inquiries and action.
The approach above was drawn from several capacity management frameworks.

In the electronics industry: after prototyping, and after the product goes through the Design Refinement cycle in which engineers revise and improve the design to meet performance and design requirements and specifications, objective and comprehensive Design Verification Testing (DVT) is performed to verify all product specifications, interface standards, OEM requirements, and diagnostic commands. Process (or Pilot) Verification Testing (PVT) is a subset of Design Verification Tests (DVT) performed on pre-production or production units to verify that the design has been correctly implemented into production.

The Microsoft Operations Framework (MOF) defines this circular process flow of capacity management activities:
Oracle's Expert Services'
Smith's Software Performance Engineering (SPE) approach begins with these detailed steps:
"5S" Kaizen Lean ApproachSort > Stabilize (Set in order) > Shine > Standardize > Sustain
Sort tools used most often vs. what is infrequent
A. Speed Tests
During speed testing, the user response time (latency) of each user action is measured. The script for each action looks for some text on each resulting page to confirm that the intended result appears as designed. Since speed testing is usually the first performance test to be performed, issues from installation and configuration are identified during this step. Because this form of performance testing is performed for a single user (under no other load), it exposes issues with the adequacy of CPU, disk I/O access and data transfer speeds, and database access optimizations.
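To make this concrete, below is a minimal single-user speed-test sketch in Python (standard library only); the URLs, action names, and expected-text strings are hypothetical placeholders rather than output from any particular tool.

```python
# Minimal single-user "speed test": time each user action and verify that
# the expected text appears in the response (hypothetical URLs and strings).
import time
import urllib.request

ACTIONS = [                      # (action name, URL, text expected on the page)
    ("home page",  "http://example.com/",                 "Welcome"),
    ("login page", "http://example.com/login",            "Password"),
    ("search",     "http://example.com/search?q=widget",  "results"),
]

def time_action(name, url, expected_text):
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=30) as resp:
        body = resp.read().decode("utf-8", errors="replace")
    elapsed = time.perf_counter() - start
    ok = expected_text in body          # confirm the intended result appears
    return name, elapsed, ok

if __name__ == "__main__":
    for name, url, text in ACTIONS:
        name, elapsed, ok = time_action(name, url, text)
        status = "PASS" if ok else "FAIL (expected text missing)"
        print(f"{name:12s} {elapsed*1000:8.1f} ms  {status}")
```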
The performance speed profile
B. Contention Tests (for Robustness)
This form of performance test aims to find performance bottlenecks (such as lock-outs, memory leaks, and thrashing) caused by a small number of Vusers contending for the same resources. Each run identifies the minimum, average, median, and maximum times for each action. This is done to make sure that the data and processing of multiple users are appropriately segregated. Such tests identify the largest burst (spike) of transactions and requests that the application can handle without failing. Such loads resemble the arrival rate at web servers more closely than constant loads do.
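As a rough sketch of how such minimum/average/median/maximum figures might be gathered from a handful of contending virtual users, the Python snippet below runs a few threads against a single (hypothetical) URL and summarizes their timings; the target, user count, and iteration count are assumptions for illustration.

```python
# Sketch: a few concurrent "Vusers" hit the same resource and the run
# reports min / average / median / max response times.
import statistics
import threading
import time
import urllib.request

TARGET = "http://example.com/checkout"   # hypothetical contended resource
VUSERS = 5                               # small number of concurrent users
ITERATIONS = 10                          # requests per virtual user

timings = []                             # shared list of elapsed seconds
lock = threading.Lock()

def vuser():
    for _ in range(ITERATIONS):
        start = time.perf_counter()
        try:
            urllib.request.urlopen(TARGET, timeout=30).read()
        except OSError:
            continue                     # count only successful requests
        elapsed = time.perf_counter() - start
        with lock:
            timings.append(elapsed)

threads = [threading.Thread(target=vuser) for _ in range(VUSERS)]
for t in threads: t.start()
for t in threads: t.join()

if timings:
    print(f"min    {min(timings)*1000:8.1f} ms")
    print(f"avg    {statistics.mean(timings)*1000:8.1f} ms")
    print(f"median {statistics.median(timings)*1000:8.1f} ms")
    print(f"max    {max(timings)*1000:8.1f} ms")
```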
C. Volume Tests (for Extendability)
These test runs measure the pattern of response time as more data is added. These tests make sure there is enough disk space and provisions for handling that much data, such as backup and restore. |
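One way to chart response time against data volume is to repeat the same timed query after each batch of inserts. The sketch below uses an in-memory SQLite table purely as a stand-in for the application's real data store; the table, query, and batch sizes are illustrative.

```python
# Sketch: measure how a representative query slows as row count grows.
# SQLite in memory is only a stand-in for the real data store.
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)")

BATCH = 50_000
for step in range(1, 6):                     # add data in successive batches
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        ((i, f"cust{i % 1000}", i * 0.01) for i in range(BATCH)),
    )
    conn.commit()
    start = time.perf_counter()
    conn.execute(
        "SELECT customer, SUM(total) FROM orders GROUP BY customer"
    ).fetchall()                             # representative query
    elapsed = time.perf_counter() - start
    print(f"{step * BATCH:>8} rows: query took {elapsed*1000:7.1f} ms")
```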
D. Stress / Overload
This is done by gradually ramping up the number of Vusers until the system "chokes" at a breakpoint (when the number of connections flattens out, response times degrade or time out, and errors appear). During these tests, the resources used by each server are measured to make sure there is enough transient memory space and that memory management techniques are adequate. This effort makes sure that admission control techniques limiting incoming work perform as intended, including detection of and response to Denial of Service (DoS) attacks.
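The breakpoint rule can be expressed directly: stop the ramp when throughput stops growing, response time degrades past a limit, or errors appear. The sketch below applies that rule to sample measurements; the numbers and thresholds are illustrative assumptions, not measurements from a real system.

```python
# Sketch: find the "choke" point in a ramp-up by watching for flattening
# throughput, degrading response time, or the first errors.
# The sample measurements and thresholds are illustrative only.
samples = [  # (vusers, throughput req/s, avg response ms, error count)
    (  50,  480,  210,  0),
    ( 100,  950,  220,  0),
    ( 200, 1850,  260,  0),
    ( 400, 3400,  390,  0),
    ( 800, 3550, 1800,  0),   # throughput flattens, latency degrades
    (1600, 3300, 6500, 42),   # errors appear
]

MAX_RESPONSE_MS = 1000        # service-level limit (assumed)
MIN_THROUGHPUT_GAIN = 1.10    # <10% gain when doubling load = "flat"

def find_breakpoint(samples):
    for prev, cur in zip(samples, samples[1:]):
        flat   = cur[1] < prev[1] * MIN_THROUGHPUT_GAIN
        slow   = cur[2] > MAX_RESPONSE_MS
        errors = cur[3] > 0
        if flat or slow or errors:
            return cur[0], {"flat": flat, "slow": slow, "errors": errors}
    return None, {}

vusers, reasons = find_breakpoint(samples)
print(f"System chokes at about {vusers} Vusers: "
      + ", ".join(k for k, v in reasons.items() if v))
```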
E. Fail-Over
For example, this form of performance testing ensures that when one computer in a cluster fails or is taken offline, the other machines in the cluster can quickly and reliably take over the work being performed by the downed machine. This means this form of performance testing requires multiple identical servers configured to use virtual IP addresses accessed through a load balancer.
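A minimal fail-over drill can be scripted as a poll of the virtual IP and each cluster node while one node is deliberately taken offline, recording which endpoints keep answering. The hostnames and health-check paths below are placeholders.

```python
# Sketch: during a fail-over drill, keep polling the load-balanced virtual
# IP and each cluster node, logging which endpoints still answer.
# Hostnames and paths are placeholders for a real cluster.
import time
import urllib.request

ENDPOINTS = {
    "virtual IP": "http://app.example.com/health",
    "node 1":     "http://app-node1.example.com/health",
    "node 2":     "http://app-node2.example.com/health",
}

def probe(url):
    try:
        start = time.perf_counter()
        with urllib.request.urlopen(url, timeout=5) as resp:
            ok = resp.status == 200
        return ok, time.perf_counter() - start
    except OSError:
        return False, None

for cycle in range(12):                  # poll for about one minute
    line = []
    for name, url in ENDPOINTS.items():
        ok, elapsed = probe(url)
        status = f"UP {elapsed*1000:.0f} ms" if ok else "DOWN"
        line.append(f"{name}: {status}")
    print(time.strftime("%H:%M:%S"), " | ".join(line))
    time.sleep(5)
```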
F. Spike
Such runs can involve a "rendezvous point" where all users line up to make a specific request at a single moment in time. Such runs enable the analysis of "wave" effects through all aspects of the system. Most importantly, these runs expose the efficacy of load balancing.
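A rendezvous point can be imitated with a thread barrier: every virtual user prepares its request, waits at the barrier, and then all fire at once. The target URL and user count in this sketch are illustrative.

```python
# Sketch: use a barrier as a "rendezvous point" so all virtual users
# issue the same request at (nearly) the same instant.
import threading
import time
import urllib.request

TARGET = "http://example.com/report"     # hypothetical spiked request
VUSERS = 20

barrier = threading.Barrier(VUSERS)      # every thread waits here
results = []
lock = threading.Lock()

def vuser(n):
    barrier.wait()                       # the rendezvous: release together
    start = time.perf_counter()
    try:
        urllib.request.urlopen(TARGET, timeout=30).read()
        ok = True
    except OSError:
        ok = False
    with lock:
        results.append((n, ok, time.perf_counter() - start))

threads = [threading.Thread(target=vuser, args=(i,)) for i in range(VUSERS)]
for t in threads: t.start()
for t in threads: t.join()

passed = [r for r in results if r[1]]
print(f"{len(passed)}/{VUSERS} requests succeeded during the spike")
```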
G. Endurance
Because longer tests usually involve the use of more disk space, these test runs also measure the pattern of build-up in "cruft" (obsolete logs, intermediate data structures, and statistical data that need to be periodically pruned). Longer runs allow for the detection and measurement of the impact of occasional events (such as Java Full GC and log truncations) and anomalies that occur infrequently. These tests verify provisions for managing space, such as log-truncation "cron" jobs that normally sleep but awaken at predetermined intervals (such as in the middle of the night).
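During a long run, this build-up can be made visible simply by sampling disk usage at intervals and reporting the growth rate, as in the sketch below; the monitored path, interval, and sample count are assumptions.

```python
# Sketch: sample disk usage during an endurance run to see how fast
# "cruft" (logs, intermediate data) accumulates. Path/interval are assumed.
import shutil
import time

PATH = "/var/log"              # hypothetical volume holding logs
INTERVAL_S = 600               # sample every 10 minutes
SAMPLES = 6                    # one hour of samples for this sketch

readings = []
for _ in range(SAMPLES):
    usage = shutil.disk_usage(PATH)
    readings.append((time.time(), usage.used))
    print(time.strftime("%H:%M:%S"), f"used: {usage.used / 2**30:.2f} GiB")
    time.sleep(INTERVAL_S)

elapsed_h = (readings[-1][0] - readings[0][0]) / 3600
growth_gib = (readings[-1][1] - readings[0][1]) / 2**30
if elapsed_h > 0:
    print(f"growth rate: {growth_gib / elapsed_h:.3f} GiB/hour")
```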
H. Scalability
The outcome of scalability efforts feeds a spreadsheet to calculate how many servers the application will need based on assumptions about demand. |
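That spreadsheet calculation boils down to dividing assumed peak demand by measured per-server capacity and adding headroom, as in this sketch (all figures are illustrative assumptions):

```python
# Sketch: how many servers the application will need, given demand
# assumptions and measured per-server capacity (all numbers illustrative).
import math

peak_requests_per_sec = 420        # assumed peak demand
per_server_capacity   = 85         # req/s one server sustained in testing
target_utilization    = 0.70       # keep servers below 70% busy at peak
n_plus_one            = 1          # spare server for fail-over

needed = math.ceil(peak_requests_per_sec /
                   (per_server_capacity * target_utilization)) + n_plus_one
print(f"servers required: {needed}")   # -> 9 with these assumptions
```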
I. Availability
These are run on applications in production mode. This provides alerts when thresholds are reached, and trends to gauge the average and variability of response times.
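Such monitoring amounts to a rolling window over measured response times with threshold alerts; the sketch below shows that logic over sample values (the window size, threshold, and data are assumptions).

```python
# Sketch: rolling average and variability of response times, with a
# simple threshold alert. Sample data and threshold are illustrative.
import statistics
from collections import deque

THRESHOLD_MS = 600
WINDOW = 5

samples_ms = [310, 295, 330, 350, 900, 870, 920, 410, 380, 360]
window = deque(maxlen=WINDOW)

for i, ms in enumerate(samples_ms, 1):
    window.append(ms)
    avg = statistics.mean(window)
    spread = statistics.pstdev(window)
    flag = "ALERT" if avg > THRESHOLD_MS else "ok"
    print(f"sample {i:2d}: {ms:4d} ms  avg={avg:6.1f}  stdev={spread:6.1f}  {flag}")
```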
The above makes use of concepts from Six Sigma.
Traditional "Six Sigma" projects aim to improve existing products and processes
using a methodology with an acryonym of DMAIC (Commonly pronounced duh-may-ick,
for Define, Measure, Analyze, Improve, and Control).
Load Testing is a sub-process of the Capacity Management function
within the
Service Management standards.
The capacity plan is the consolidated output (deliverable) from the capacity management process.
This pseudo use case diagram
summarizes the information (artifacts) flowing among people assuming certain
roles
involved in managing the performance of large applications:
WHATs - Requirements Critical to Satisfaction | Average Weight (1-5) | Casual User | Exp. User | Sys Admin |
---|---|---|---|---|
1.1 fast to load | 4.5 | 4 | 4 | 5 |
1.2 quick response after submit | 4.3 | 4 | 3 | 5 |
2.1 accepts batched transactions | 3.3 | 0 | 2 | 5 |
2.2 dependable | 3.1 | 4 | 4 | 5 |
3.1 Does not timeout | 1.9 | 1 | 3 | 2 |
4.1 Quick to Recover | 1.9 | 1 | 3 | 2 |
The column to the right of each requirement contains weight ratings that allow certain customer requirements to be weighted higher in priority than others in the list. The example shown here is the average of weights from different sub-groups. The "(1-5)" range in this example can optionally be replaced with ISO/IEC 14598-1 evaluation scales or with advanced methods such as Thomas Saaty's "Analytic Hierarchy Process," which establishes precise ratio scales:
The customer sub-groups shown in this example are for roles working with a computer application:
QFD graphic programs can add:
The International TechneGroup, Inc. (ITI) approach for Concurrent Product/Manufacturing Process Development breaks the "WHATs" of the "voice of the customer" (VOC) down further into User Wants, Must Haves, Business Wants, and Provider Wants.
The CMM (Capability Maturity Model developed at Carnegie Mellon University) has 7 measures:
Associated with these are the quality characteristics and sub-characteristics defined in ISO/IEC 9126:
Quality Characteristic | Sub-characteristics | Definition: Attributes of software that bear on the ... |
---|---|---|
Functionality | Suitability | presence and appropriateness of a set of functions for specified tasks. |
Accurateness | provision of right or agreed results or effects. | |
Interoperability | Attributes of software that bear on its ability to interact with specified systems. | |
Compliance | Attributes of software that make the software adhere to application related standards or conventions or regulations in laws and similar prescriptions. | |
Security | Attributes of software that bear on its ability to prevent unauthorized access, whether accidental or deliberate, to programs or data. | |
Reliability | Maturity | frequency of failure by faults in the software. |
Fault tolerance | Attributes of software that bear on its ability to maintain a specified level of performance in case of software faults or of infringement of its specified interface. | |
Recoverability | capability to re-establish its level of performance and recover the data directly affected in case of a failure and on the time and effort needed for it. | |
Usability | Understandability | users' effort for recognizing the logical concept and its applicability. |
Learnability | users' effort for learning its application. | |
Operability | users' effort for operation and operation control. | |
Efficiency | Time behaviour | Attributes of software that bear on response and processing times and on throughput rates in performing its function. |
Resource behavior | amount of resource used and the duration of such use in performing its function. | |
Maintainability | Analyzability | effort needed for diagnosis of deficiencies or causes of failures, or for identification of parts to be modified. |
Changeability | effort needed for modification, fault removal or for environmental change. | |
Stability | risk of unexpected effect of modifications. | |
Testability | effort needed for validating the modified software. | |
Portability | Adaptability | opportunity for its adaptation to different specified environments without applying other actions or means than those provided for this purpose for the software considered. |
Installability | effort needed to install the software in a specified environment. | |
Conformance | Attributes of software that make the software adhere to standards or conventions relating to portability. | |
Replaceability | Attributes of software that bear on opportunity and effort using it in the place of specified other software in the environment of that software. |
ISO/IEC 14598 gives methods for the measurement, assessment, and evaluation of software product quality.
SPICE - Software Process Improvement and Capability dEtermination is a major international standard for Software Process Assessment. There is a thriving SPICE user group known as SUGar. The SPICE initiative is supported by both the Software Engineering Institute and the European Software Institute. The SPICE standard is currently in its field trial stage.
Perspective & Core Measures | Sample Metrics | Relevant Metrics |
---|---|---|
Customer: How do our customers see us? | Satisfaction, retention, market, and account share | |
Financial (Results): How do we look to shareholders? | Return on investment and economic value-added | |
Internal (Efficiency): What must we excel at? | Quality, response time, cost, and new product introductions | |
Learning and Growth: How can we continue to improve and create value? | Employee satisfaction and information system availability | |
These Balanced Scorecard metrics imply these business strategies:
Test Type | Timing (When) |
---|---|
A. Speed | Parallel with coding construction, as this provides developers feedback on the impact of their choice of application architecture. |
| |
C. Data Volume | On each release when app components are being integrated. |
D. Stress/Overload | Pre-Production for each new application version or hardware configuration. |
E. Fail-over | Pre-Production for each new application version or hardware configuration. |
H. Scalability | Pre-Production for each new application version or hardware configuration. |
I. Availability | In Production for each new application version or hardware configuration. |
Context | Components | Tuning Options | |
---|---|---|---|
A. | Business | | |
B. | Applications | | |
C. | Operating System | | |
D. | Server Hardware Devices | | |
E. | Telecommunications Infrastructure | | |
F. | Data Center Operations | | |
Environments: Specific machines on the technology "stack"
Machines Specific to the Load Test Environment
Component resources within each server
Potential Obstacle / Risk | Likelihood | Mitigation |
---|---|---|
1. Servers not available early during the project. | High (80%) | a. Use dev. environment to develop single-user scripts. |
2. Difficulty with Controller licensing, capacity, etc. | Medium (50%) | b. Identify issues early by beginning to use the Controller as soon as the first small script (such as login only) is coded. |
3. Developers not available | Medium (50%) | c. Perform thorough system analysis to identify issues before scripting. d. Develop scripts with likely issues early. |
4. Not enough capacity in front-end (portal/login) servers. | Medium (40%) | e. Quantify capacity of front-end servers with login_only scripts. |
5. Change of personnel during the project. | Medium-High (60%) | f. Take notes. Conduct formal peer walk-throughs. g. Make assignments for skill development. |
6. Servers become unavailable late during the project. | Low (20%) | h. Use the production staging environment at night. i. Instead of going through the load balancer, test directly against one server taken off its cluster. |
7. Changes in server hardware. | Low (10%) | j. Conduct benchmark tests on hardware as part of the project. k. Save server configuration files for historical comparisons. |
Requests per Day | Average Requests per Second | Maximum Requests per Second |
---|---|---|
10,000 | 1 | 6 |
50,000 | 3 | 8 |
100,000 | 5 | 14 |
215,000 | 10 | 21 |
300,000 | 14 | 27 |
450,000 | 20 | 35 |
500,000 | 24 | 40 |
648,000 | 30 | 47 |
864,000 | 40 | 59 |
1,000,000 | 47 | 68 |
1,080,000 | 50 | 71 |
1,500,000 | 70 | 94 |
2,000,000 | 93 | 120 |
2,160,000 | 100 | 128 |
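The figures in a table like the one above can be approximated by spreading the daily volume over an assumed busy window and estimating the peak second as a few standard deviations above the mean (a Poisson-style rule of thumb). The 6-hour window and the mean-plus-3-sigma rule in the sketch below are assumptions for illustration and only roughly reproduce the table.

```python
# Sketch: turn a daily request volume into average and rough peak
# requests-per-second figures. The 6-hour "busy window" and the
# mean + 3*sqrt(mean) peak rule (Poisson-style) are assumptions for
# illustration; they only approximate the table above.
import math

BUSY_WINDOW_S = 6 * 3600          # assume most traffic lands in 6 busy hours

def per_second_rates(requests_per_day):
    avg = requests_per_day / BUSY_WINDOW_S
    peak = avg + 3 * math.sqrt(avg)          # ~3 standard deviations above mean
    return math.ceil(avg), math.ceil(peak)

for daily in (10_000, 100_000, 864_000, 2_160_000):
    avg, peak = per_second_rates(daily)
    print(f"{daily:>9,} req/day -> ~{avg} avg req/s, ~{peak} peak req/s")
```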
Each piece of equipment has a limit on how much it can produce.
An assembly can only handle as much as its smallest channel.
For example, a web server has an input buffer, an internal queue, and an output buffer.
Capacity planning should consider not just "average" loads but "maximum" values during various blocks of time.
So the capacity manager must be involved across all categories of the entire CIT architecture supporting the organization's Service Catalog:
The components within each server:
The impact of bottlenecks is reflected in the percentage-utilization metric for each resource.
The CDB (Capacity Database) contains the detailed technical, business, and service-level management data that supports the capacity management process. The resource and service performance data in the database can be used for trend analysis and for forecasting and planning.
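For example, a straight line fitted to utilization samples pulled from the CDB gives a crude forecast of when a resource will cross a planning threshold. The monthly figures and 80% threshold in this sketch are illustrative.

```python
# Sketch: fit a straight line to monthly CPU-utilization samples (as might
# be pulled from a capacity database) and forecast when a planning
# threshold would be crossed. The sample data are illustrative.
def linear_fit(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x     # slope, intercept

months      = [1, 2, 3, 4, 5, 6]
utilization = [52, 55, 57, 61, 63, 67]        # percent CPU busy, by month

slope, intercept = linear_fit(months, utilization)
THRESHOLD = 80                                 # planning threshold (%)
months_to_threshold = (THRESHOLD - intercept) / slope
print(f"trend: +{slope:.1f}% per month; "
      f"threshold of {THRESHOLD}% reached around month {months_to_threshold:.1f}")
```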
Mainframe-based data collection methods and tools include MXG, SMF, SAS, and quantitative analysis techniques.
verifier.exe /flags 2 /driver drivername
Professional Web Site Optimization (Wrox Press, February 1, 1997) by Michael Tracy, Scott Ware, Robert Barker, and Louis Slothouber
Microsoft's Open Wiki Forum for performance and scalability discussions
Computer Systems Performance Evaluation and Prediction (Digital Press; October 20, 2002) by Dartmouth professors Paul Fortier and Howard Michel. This textbook fills the void between engineering practice and the academic domain's treatment of computer systems performance evaluation and assessment by providing a single source on how to perform computer systems engineering tradeoff analysis, which allows managers to realize cost-effective yet optimal computer systems tuned to a specific application.
List of Web Site Test Tools and Site Management Tools maintained by Rick Hower
Examples of user steps/actions:
1. Invoke URL
2. Login/Logon
3. Pick/Start Application
4. Add one
5. Import batch
6. Search
7. Export batch
8. Archive/backup
9. Delete
10. Retrieve/undelete
11. Restore from archive
12. Exit application
13. Logout/Logoff
An important outcome of the design phase is how results will be organized and
presented to various audiences.
Results from the |
Iteration | Conforming Connections | Percent Conforming | Throughput (ops/sec) | Response (msec) | ops/sec per Load Generator | kbit/sec |
---|---|---|---|---|---|---|
1 | 4130 | 100.0% | 11619.9 | 355.4 | 2.81 | 335.4 |
2 | 4130 | 100.0% | 11583.8 | 356.5 | 2.80 | 334.4 |
3 | 4130 | 100.0% | 11610.2 | 355.7 | 2.81 | 335.1 |
This document makes use of the terminology from the UML 2.0 Testing Profile specification v1.0 (July 7, 2005). This enables test definition and test generation based on the structural (static) and behavioral (dynamic) aspects of UML models. The UML 2 Testing Profile was developed from several predecessors: SDL-2000, MSC-2000, and TTCN-3 (Testing and Test Control Notation version 3), also published as ITU-T Recommendation Z.140. Developed during 1999-2002 at ETSI (the European Telecommunications Standards Institute), TTCN-3 is a widely accepted standard in the telecommunication and data communication industry as a protocol test system development specification and implementation language used to define test procedures for black-box testing of distributed systems.
Neil J. Gunther, Xerox PARC & Pyramid (Fujitsu) alumnus, founder of Performance Dynamics, and developer of the PARCbench multiprocessor benchmark
Completion Nickname | Milestone Description | Target Date |
---|---|---|
m.1.allocated | |
m.2.delivered | |
m.3.assembled | |
m.4.installed | |
m.5.configured | |
m.6.available | |
m.7.operational | |
m.8.benchmarked | |
The IBM Rational Unified Process (RUP) for Performance tracks the progress of performance testing
by the maturity of scripting assets.
Test Level | A. Config. | B. # Users | C. Range of Values | D. Prior Data | E. Length of Run | Purpose (Type of Testing) |
---|---|---|---|---|---|---|
1. | Initial | One | Static | None | Short | Initial script creation through recording and playback, plus addition of error checking, transaction definition, etc. |
2. | Initial | Few | Dynamic | None | Short | Initial data parameterization and monitor debugging. |
3. | Initial | Few | Complex | None | Short | Initial scenario debugging and report template formatting. |
4. | Initial | Few | Complex | Much | Short | |
5. | Baseline | Many | Complex | Much | Short | |
6. | Baseline | Many | Complex | Much | Long | |
7. | Altered | Many | Complex | Much | Short | Application regression smoke testing vs. the stress-tested baseline above. |
8. | Altered | Many | Complex | Much | Long | Comparison testing (vs. the current baseline). |
Completion Nickname | Milestone Description | Target Date |
---|---|---|
t.1.defined | |
t.2.drafted | |
t.3.repeatable | |
t.4.validated | |
t.5.results confirmed | |
t.6.analyzed | |
t.7.concluded | |

The scripting milestones above are repeated for each form/type of performance testing (t).

Note: There may be several rounds of testing to determine the impact of each change to run parameters or application code.
Book by Virgilio A.F. Almeida (Professor of Computer Science at the Federal University of Minas Gerais, Brazil) and Daniel A. Menasce. eBook ISBN: 1417507810.
Tuning Java

Among White Papers by Mercury Interactive: