
Software Performance Project Planning

This page presents the phases, deliverables, roles, and tasks for a full performance test project that makes use of several industry best practices and tools for load testing and performance engineering, one of the activities for capacity management of IT Service Management (ITSM). This is a companion to my Sample Load Testing Report.

"If you can't describe what you are doing as a process, you don't know what you're doing." —W. Edwards Deming

 


Aspects of a Performance Improvement Project


Pre-requisites


Phases: Define > Measure > Analyze > Improve > Control


    The approach above was drawn from several capacity management frameworks:

    In the electronics industry:

      Near the end of the Prototyping stage, after engineers create actual working samples of the product they plan to produce, Engineering Verification Testing (EVT) uses prototypes to verify that the design meets pre-determined specifications and design goals. This is done to validate the design as is, or identify areas that need to be modified.

      After prototyping, and after the product goes through the Design Refinement cycle when engineers revise and improve the design to meet performance and design requirements and specifications, objective, comprehensive Design Verification Testing (DVT) is performed to verify all product specifications, interface standards, OEM requirements, and diagnostic commands.

      Process (or Pilot) Verification Test (PVT) is a subset of Design Verification Tests (DVT) performed on pre-production or production units to verify that the design has been correctly implemented into production.

    The Microsoft Operations Framework (MOF) defines this circular process flow of capacity management activities:

    1. Trend Analysis
    2. Modeling
    3. Optimization
    4. Change Initiation
    5. Monitoring

    Oracle's Expert Services' Architecture Performance Capacity Scope & Assessment consulting uses these phases and deliverables:

    Phase 1. Define Assessment Criteria:
      A. Assessment Scope
      B. Business and Functional Needs
    Phase 2. Define High Level Requirements:
      C. High Level Requirements Matrix
    Phase 3. Document System Baseline:
      D. Current Architecture Diagram (Baseline) Report
    Phase 4. Generate Findings, Recommendations, and Conceptual Architecture:
      E. Conceptual Architecture Diagram
      F. Assessment Report
      G. Final Presentation

    Software Performance Engineering (SPE)

    Smith's Software Performance Engineering (SPE) approach begins with these detailed steps:

    1. Assess Performance Risks such as antipatterns (recurring causes of problems). A flood of simultaneous requests is called "The Slashdot Effect" because the many readers of slashdot.org visit a site at the same time after it is mentioned on the online magazine.
    2. Identify critical Use Cases
    3. Select key performance workload scenarios (sequence diagrams detailing key use cases)
    4. Establish performance objectives under specific workload intensities (such as best and worst case response times).
    5. Define performance model(s): Queueing network models (QNM) use Information Processing Graph notation to illustrate the locus of execution among a sequence of queues, while Execution Graphs illustrate the probability and frequency of execution of processing steps relevant to performance analysis. (A minimal single-queue evaluation sketch appears after this list.)
    6. Construct performance model(s)
    7. Add software resource requirements (e.g., messages sent, database accesses, etc.)
    8. Add computer resource requirements (e.g., CPU instruction utilization, disk I/O throughput, network connections, screen draws)
    9. Evaluate performance model(s)
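
    As a minimal illustration of step 9, the sketch below evaluates a single-queue (M/M/1) model. The arrival rates and the 50 ms service time are assumed example values, not figures from any particular project.

        # Minimal sketch: evaluating a single-queue (M/M/1) performance model.
        # The arrival rates and 50 ms service time are assumed example values.

        def mm1_metrics(arrival_rate, service_time):
            """Return utilization, average response time, and average number in
            system for an M/M/1 queue (exponential arrivals and service)."""
            utilization = arrival_rate * service_time             # rho = lambda * S
            if utilization >= 1.0:
                raise ValueError("Queue is unstable: utilization >= 100%")
            response_time = service_time / (1.0 - utilization)    # R = S / (1 - rho)
            number_in_system = utilization / (1.0 - utilization)  # N = rho / (1 - rho)
            return utilization, response_time, number_in_system

        if __name__ == "__main__":
            for tps in (5, 10, 15, 18):   # offered transactions per second
                rho, r, n = mm1_metrics(arrival_rate=tps, service_time=0.05)
                print(f"{tps:>3} tps: utilization {rho:.0%}, "
                      f"response {r * 1000:.0f} ms, in system {n:.1f}")

    Even this toy model shows why response time degrades sharply as utilization approaches 100%, which is the behavior the later stress tests look for.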

    "5S" Kaizen Lean Approach

    Sort > Stabilize (Set in order) > Shine > Standardize > Sustain

    Sort tools used most often vs. what is infrequent
    Stabilize (make tools easy to access)
    Shine (make defects easy to catch, keep tools sharp and appropriate)
    Standardize (teamwork)
    Sustain


Test Policies

    The Test Policy is the document which describes an organization's philosophy towards the testing (or quality assurance) of software. The test policy is complementary to, or a component of, the organization's Quality Policy, which describes the basic views and aims of a company, regarding quality, as pursued by management.

    The Test Handbook is a framework document which describes the test steps to be executed to address risks that should be covered by software testing, and the test activities to be carried out.

      Risk is a combination of the possibility of the occurrence of a problem and the resulting effect of that problem.

    The test concept describes the test steps and test activities to be executed for a particular project.

    The test step plan details the procedure for a test step and describes implementation of the test concept for a particular test step.


Deliverables Flow


Forms/Types of Performance Testing/Engineering

Accomplishments


A. Speed Tests
(for Responsiveness)


    During speed testing, the user response time (latency) of each user action is measured.

    The script for each action will look for some text on each resulting page to confirm that the intended result appears as designed.

    Since speed testing is usually the first performance test to be performed, issues from installation and configuration are identified during this step.

    Because this form of performance testing is performed for a single user (under no other load), it exposes issues with the adequacy of CPU, disk I/O access and data transfer speeds, and database access optimizations.

    The performance speed profile of an application obtained during speed testing includes the time to manually start up and stop the application on its servers.
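
    A minimal sketch of the idea, assuming hypothetical URLs and expected text strings; a production speed test would normally be scripted in a dedicated load testing tool rather than ad hoc code:

        # Minimal single-user speed-test sketch: time each user action and verify
        # that expected text appears in the response. The URLs and expected
        # strings below are hypothetical placeholders.
        import time
        import urllib.request

        ACTIONS = [
            ("Load landing page", "https://example.com/", "Welcome"),
            ("Open search page",  "https://example.com/search", "Search"),
        ]

        for name, url, expected_text in ACTIONS:
            start = time.perf_counter()
            try:
                with urllib.request.urlopen(url, timeout=30) as response:
                    body = response.read().decode("utf-8", errors="replace")
                status = "OK" if expected_text in body else "missing expected text"
            except Exception as exc:          # placeholder URLs may not resolve
                status = f"error: {exc}"
            elapsed = time.perf_counter() - start
            print(f"{name}: {elapsed:.3f} s [{status}]")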

  1. Identified the business processes under test.
  2. Documented each user action to be measured.
  3. Documented production installation configuration instructions and settings.
  4. Quantified the start-up, shut-down, and user GUI transaction response (latency) times when the system is servicing only a single user at a time (under no other load) in order to determine whether they are acceptable.
  5. Ensured CPU, disk access, data transfer speeds, and database access optimizations are adequate.

B. Contention Tests (for Robustness)


    This form of performance test aims to find performance bottlenecks (such as lock-outs, memory leaks, and thrashing) caused by a small number of Vusers contending for the same resources.

    Each run identifies the minimum, average, median, and maximum times for each action. This is done to make sure that data and processing of multiple users are appropriately segregated.

    Such tests identify the largest burst (spike) of transactions and requests that the application can handle without failing. Such bursty loads resemble the real arrival rates at web servers more than constant loads do.
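
    A minimal sketch of that per-action summary, using invented timing samples rather than output from any particular tool:

        # Sketch: summarize min / average / median / max response times per action
        # from raw timing samples, as a contention run would report.
        # The sample timings below are invented for illustration.
        import statistics

        samples = {
            "login":    [0.41, 0.39, 0.55, 1.20, 0.47],
            "checkout": [0.90, 0.95, 2.10, 0.88, 1.05],
        }

        for action, times in samples.items():
            print(f"{action:<10} min={min(times):.2f}s  avg={statistics.mean(times):.2f}s  "
                  f"median={statistics.median(times):.2f}s  max={max(times):.2f}s")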

  1. Identified performance bottlenecks (such as lock-outs, memory leaks, and thrashing) caused by a small number of Vusers contending for the same resources.
  2. Ensured that data and processing of multiple users are appropriately segregated.
  3. Identified the largest burst (spike) of transactions and requests that the application can handle without failing. Such bursty loads resemble the real arrival rates at web servers more than constant loads do.

C. Volume Tests (for Extendability)

    This form of performance testing makes sure that the system can handle the maximum size of data values expected.

    These test runs measure the pattern of response time as more data is added.

    These tests make sure there is enough disk space and provisions for handling that much data, such as backup and restore.

  1. Ensured that the system can handle the maximum size of data values expected.
  2. Measured the pattern of response time as more data is added.
  3. Ensured that there is enough disk space and provisions for handling that much data, such as backup and restore.

D. Stress / Overload
Tests (for Sustainability)

    This form of performance testing determines how well the number of users anticipated can be supported by the hardware budgeted for the application.

    This is done by gradually ramping-up the number of Vusers until the system "chokes" at a breakpoint (when the number of connections flatten out, response time degrades or times out, and errors appear).

    During tests, the resources used by each server are measured to make sure there is enough transient memory space and adequate memory management techniques.

    This effort makes sure that admission control techniques limiting incoming work perform as intended. This includes detection of and response to Denial of Service (DoS) attacks.
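
    One way to flag such a breakpoint from ramp-up measurements is sketched below; the load steps, throughput figures, and the 5% / 50% tolerances are assumed examples, not prescribed thresholds:

        # Sketch: flag a ramp-up "breakpoint" as the first load step where throughput
        # stops growing (within a tolerance) while response time keeps climbing.
        # The measurements below are assumed example values.
        steps = [  # (virtual users, transactions/sec, avg response time in seconds)
            (50,  48,  0.4),
            (100, 95,  0.5),
            (150, 139, 0.7),
            (200, 142, 1.6),   # throughput flattens while response time degrades
            (250, 140, 3.9),
        ]

        for prev, curr in zip(steps, steps[1:]):
            throughput_flat = curr[1] <= prev[1] * 1.05   # less than 5% throughput gain
            latency_worse   = curr[2] >= prev[2] * 1.5    # response time up 50% or more
            if throughput_flat and latency_worse:
                print(f"Breakpoint near {curr[0]} virtual users "
                      f"({curr[1]} tps, {curr[2]:.1f} s response time)")
                break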

  1. Quantified the degradation in response time and resource consumption at various levels of simultaneous users.

  2. Determined how well the number of users anticipated can be supported by the hardware budgeted for the application.

  3. Quantified the "Job flow balance" achieved when application servers can complete transactions at the same rate new requests arrive.

  4. Ensured that there is enough transient memory space and that memory management techniques are adequate.

  5. Made sure that admission control techniques limiting incoming work perform as intended, including the extent of response to Denial of Service (DoS) attacks.


E. Fail-Over
Tests (for Resilience & Recoverability)

    This form of performance testing determines how well (how quickly) the application recovers from overload conditions.

    For example, this form of performance testing ensures that when one computer of a cluster fails or is taken offline, other machines in the cluster are able to quickly and reliably take over the work being performed by the downed machine.

    This means this form of performance testing requires multiple identical servers configured with Virtual IP addresses accessed through a load balancer device.
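
    A minimal sketch of measuring recovery time by polling a health-check URL once per second; the endpoint and the polling window are hypothetical placeholders:

        # Sketch: measure how long a service stays unavailable during a fail-over test
        # by polling a health-check URL once per second. The URL is a placeholder.
        import time
        import urllib.request

        URL = "https://example.com/health"   # hypothetical health-check endpoint

        def is_up(url):
            try:
                with urllib.request.urlopen(url, timeout=2) as resp:
                    return resp.status == 200
            except Exception:
                return False

        outage_started = None
        for _ in range(600):                 # poll for up to ~10 minutes
            now = time.time()
            if is_up(URL):
                if outage_started is not None:
                    print(f"Recovered after {now - outage_started:.0f} s of downtime")
                    outage_started = None
            elif outage_started is None:
                outage_started = now
                print("Outage detected")
            time.sleep(1)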

  1. Determined whether the application can recover after overload failure.

  2. Measured the time the application needs to recover after overload failure.


F. Spike
"Peak-Rest" or
"Daily" Tests

    This form of performance testing involves suddenly adding the maximum sustainable load and then returning to a lower level of load to determine whether the app can obtain memory quickly, and then release that memory when no longer needed.

    Such runs can involve a "rendezvous point" where all users line up to make a specific request at a single moment in time.

    Such runs enable the analysis of "wave" effects through all aspects of the system.

    Most importantly, these runs expose the efficacy of load balancing.
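
    A minimal sketch of a rendezvous point using a thread barrier, so that all emulated users fire at the same instant; the user count is arbitrary and the request itself is stubbed out:

        # Sketch: a "rendezvous point" where all virtual users block until everyone
        # is ready, then fire their requests at the same moment (a spike).
        # A real script would call the system under test instead of sleeping.
        import threading
        import time

        VUSERS = 25
        rendezvous = threading.Barrier(VUSERS)

        def vuser(user_id):
            # ... per-user setup (login, navigation) would happen here ...
            rendezvous.wait()                 # everyone lines up, then releases together
            start = time.perf_counter()
            time.sleep(0.01)                  # stand-in for the spiking request
            print(f"vuser {user_id}: {time.perf_counter() - start:.3f} s")

        threads = [threading.Thread(target=vuser, args=(i,)) for i in range(VUSERS)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()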


  1. Determined — through suddenly adding and then completing transactions — that the app releases memory.


G. Endurance
"Soak"
"Longevity"
Tests (for Reliability)

    This form of performance testing makes sure that the system can sustain, over at least a 24-hour period, a consistent number of concurrent Vusers executing transactions at near peak capacity.

    Because longer tests usually involve use of more disk space, these test runs also measure the pattern of build-up in "cruft" (obsolete logs, intermediate data structures, and statistical data that need to be periodically pruned).

    Longer runs allow for the detection and measurement of the impact of occasional events (such as Java Full GC and log truncations) and anomalies that occur infrequently.

    These tests verify provisions for managing space, such as log truncation "cron" jobs that normally sleep but awake at predetermined intervals (such as in the middle of the night).
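
    A minimal sketch of trending one such measure, fitting a line to heap-size samples taken across a soak run; the samples are invented, and "MB per hour" is just one way to express the growth:

        # Sketch: fit a straight line to memory samples taken during a soak test to
        # estimate growth per hour; a clearly positive slope suggests a leak or
        # unpruned "cruft". The samples below are assumed example values.
        from statistics import linear_regression  # requires Python 3.10+

        hours   = [0, 4, 8, 12, 16, 20, 24]
        heap_mb = [512, 530, 555, 570, 601, 622, 648]   # sampled heap size in MB

        slope, intercept = linear_regression(hours, heap_mb)
        print(f"Heap grows ~{slope:.1f} MB/hour; projected after 7 days: "
              f"{intercept + slope * 24 * 7:.0f} MB")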

  1. Ensured that the system can sustain, over at least a 24-hour period, a consistent number of concurrent Vusers executing transactions at near peak capacity.

  2. Measured the pattern of build-up in cruft (logs, data structures, and statistics that need to be periodically pruned).

  3. Detected the impact of occasional events (such as automatic cache flushes, Java Full GC and log truncations) and anomalies that occur infrequently.

  4. Made sure there is enough disk space and provisions for managing space, such as log truncation jobs that only occur automatically in the middle of the night.


H. Scalability
(Efficiency) or
Reconfiguration
Tests

    This form of performance testing involves repeating tests above on different server/network hardware configurations to determine the most cost-effective option to support targeted load levels (one aspect of Capacity Planning).

    The outcome of scalability efforts feeds a spreadsheet that calculates how many servers the application will need based on assumptions about demand (a sketch of that arithmetic follows).
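
    A minimal sketch of the kind of arithmetic such a spreadsheet performs; the peak rate, per-server capacity, and target utilization are assumed inputs that would come from the tests above:

        # Sketch: estimate server count from measured per-server capacity.
        # All three inputs are assumed example values.
        import math

        peak_tps           = 450      # expected peak transactions per second
        tps_per_server     = 85       # measured capacity at acceptable response time
        target_utilization = 0.70     # headroom so servers are not run flat-out

        servers_needed = math.ceil(peak_tps / (tps_per_server * target_utilization))
        print(f"Servers needed: {servers_needed}")   # -> 8 with these assumptions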

  1. Determined — through repeated tests on different server/network hardware configurations — the most cost-effective option to support targeted load levels (one aspect of Capacity Planning).


I. Availability
(Schedulability)

    This form of performance testing provides a continuous assessment of the availability and speed of key user actions.

    These are run on applications in production mode.

    This provides alerts when thresholds are reached and trends to gauge the average and variability of response times.


The Organizational Context of Load Testing: Capacity Management

    "Wham!" by Roy Lichtenstein
    Load Testing is a relatively new profession, so more often than not it is misunderstood.

    Testing is important because it is:

    • a means to obtain objective quality metrics about systems in their target environment.
    • the central means to relate requirements and specification to the real system.

    But testing is

    • Rarely practiced
    • Unsystematic
    • Error-prone
    • Considered destructive
    • Uncool ("If you are a bad programmer, you might be a tester")

    Some (mistakenly) think that there is no need for capacity planning and performance engineering since hardware has become so cheap that such work takes more time and effort than "simply" over-engineering architectures or "just" adding more hardware when necessary.

    However, most of the bottlenecks today are in the software, which is expensive to diagnose. Additionally, many default configuration options are not appropriate for running large servers.

    The newness of the profession makes its "fit" within organizations subject to individual fears, with personal power and control politics overshadowing organizational needs.

    In some ways, the role is much like that of the people who run wind tunnels within an aircraft manufacturer. It can be politically frustrating: if response time is good, you get no credit; if performance degrades, you aren't doing your "job".

    I believe that because of the cross-functional coordination nature of capacity management, it best belongs within a PMO (Project Management Office) guiding cross-functional initiatives requiring the cooperation of the entire enterprise, such as ERP, Y2K, and Digitization (elimination of paperwork).


Capacity Management

    Capacity is a capability that is required for delivering an agreed-upon performance at the required service level and cost.

    A META Group study published in September 2003 reveals that capacity planning is the top critical issue for large enterprises (those with more than 1,000 people), with 33.8 percent of respondents identifying this as a critical issue. This high priority will continue through 2005/2006, escalating consolidation and efficiency demands.

    Load testing ensures that demand for computing power can be met by the supply of computing power.

    Proactive capacity management balances business requirements with IT resources, so you can consistently deliver quality service at minimum cost while minimizing the risks of higher utilization rates.

    Both ITIL and MOF (Microsoft Operations Framework) recognize that CM consists of three sub-processes:

    1. Business Capacity Management (BCM) is implemented by
      Demand Management activities responsible for ensuring that the future business requirements for IT services are considered, planned, and implemented in a timely manner. The capacity management staff can achieve this by analyzing current resource utilization of the various IT solutions and generating trends and forecasts. These future requirements come from account management, which constantly probes current and future customer needs.

    2. Service Capacity Management (SCM) is implemented by
      Workload Management activities responsible for translating customer demands into workloads required by IT solutions (the various applications used to create the actual solution) so that the required resources can be determined from this analysis. The process translates both current and future demands to workloads.

    3. Resource Capacity Management (RCM) is implemented by
      Performance Management activities

 

The above makes use of concepts from Six Sigma vs. the "Planning to Implement Service Management" function referenced by ITIL-based SIPs (Service Improvement Plans).


    Six Sigma

    Traditional "Six Sigma" projects aim to improve existing products and processes using a methodology with an acryonym of DMAIC (Commonly pronounced duh-may-ick, for Define, Measure, Analyze, Improve, and Control).

    This is one of many Design for Six Sigma (DFSS) methodologies.

    The words Identify, Design, Optimize, and Verify are the basis for the acronym of the IDOV methodology for designing new products and services to meet six sigma standards.

    I also appreciate the "Define, Measure, Explore, Develop and Implement" steps from the PricewaterhouseCoopers methodology because they treat performance project artifacts with the same controls as "real" developers.


    ITIL Service Delivery

    Load Testing is a sub-process of the Capacity Management function within the Service Management standards ITIL (Information Technology Infrastructure Library) and its derivatives, BS 15000 and the MOF (Microsoft Operations Framework).

      Visio 2000 flowchart file

    In this pseudo use-case diagram of Service Delivery:

    1. Business, Customers, and Users
      1. make Queries/Enquiries
      2. and exchange Communications, Updates, and Reports with the
    2. Service Level Management function which
      1. defines Requirements from a Service Catalog quantified
      2. into Targets defined in external SLAs (Service Level Agreements) and internal OLAs (Operating Level Agreements) translated into Operating Level Requirements (OLRs).
      3. tracked for achievements by SLRs (Service Level Reports)

    3. Availability Management creates an Availability Plan based on Design Criteria. Success at realizing them is tracked in availability Targets/Thresholds reports.
    4. Capacity Management creates a Capacity Plan and Schedules from a CDB. Success at realizing them is tracked in reports of capacity Targets/Thresholds.
    5. Financial Management for IT Services creates a Financial Plan based on the Costs & Charges of each type and model summarized into Budgets and Forecasts reports
    6. IT Service Continuity Management creates an IT Continuity Plan based on Risk Analysis

    7. All these organizational functions create
      • Alerts and Exception Changes
      • supported by Management Tools and Infrastructure
      managed by the Service Reporting function and supported/monitored by
    8. Information Security Management


    Capacity Plan

    The capacity plan is the consolidated output (deliverable) from the capacity management process.

    The capacity plan recommends the resource levels and changes necessary to accomplish operating level requirements that support the service level agreement (SLA). The capacity plan includes the cost and benefit of those resources, reports of their compliance to the IT SLA, and the priority and impact of systems and resources on the overall business and the IT infrastructure.

    The Capacity Plan documents:

    • Actual Capacity Utilization (current levels and service performance of resources being used)
    • Desired Capacity Utilization (a forecast of future resource requirements based on business requirements and the IT services needed to support them)
    • Basis for Budget Planning


Organizational Concerns

    The presenting problem with scalability issues is different depending on your position in a large organization.

    • To developers of software, it could mean re-architecting which program modules or services perform some aspect of the application's work. Developers tend to start with a profiling report ranking how often specific modules (classes) are executed.

    • To the DBA (Data Base Administration) organization, improvement could mean optimization of stored procedures (procs), such as adding indexes.
      DBAs tend to start by ranking how often specific database procedures hit the production database.

    • To data center Operations personnel, it could mean adding hard drives, additional servers, fiber, NAS filer devices, etc. to prevent and recover from crashes.
      Such people make use of logs or summaries of logs produced by servers under their control.

    • To quality assurance management which is separate from the above organizations, a scalability improvement project may be a way for executives to assess (audit) the performance of individuals in the other departments.

    The performance engineer's role is often misdefined and misunderstood. He or she can be blamed for not providing information, or scorned for providing unfavorable information.

    To operations personnel who are used to doing what they want, supporting capacity management efforts can be perceived as an additional, unneeded intrusion.

    Naturally, the role of capacity management is to reconcile the needs of all parts of the organization mentioned above in the most cost-effective way overall.

    Organizational design issues can set up performance engineering projects and personnel for either success or failure.

    Network Performance Management systems/platforms (such as Avesta's Trinity, Loran Kinnetics, Manage.com's Frontline, and NextPoint's S3) have these capabilities:

    • SNMP device management software agents gather information from SNMP agents in network devices and systems
    • RMON/RMON II or probe links for traffic monitoring agents track the overall performance of network connections
    • Response time measurement agents gauge how well applications, databases, and transactions are performing over the intranet.
    • Real-time event filtering agents generate warnings and alerts when devices break or traffic conditions deteriorate
    • Historical trend analysis agents store performance data over time to generate periodic graphical representations of network health and status.

    As organizations move toward virtualization of server capacity, the job of capacity management would naturally become more about monitoring and managing costs, balancing the production network, and benchmarking.

   


For want of a nail,
the shoe was lost;
For want of a shoe,
the horse was lost;
For want of a horse,
the rider was lost;
For want of a rider,
the battle was lost;
For want of a battle,
the kingdom was lost;

— Benjamin Franklin (1706-90) in "Poor Richard's Almanack," June 1758, The Complete Poor Richard Almanacks, facsimile ed., vol. 2, pp. 375, 377 (1970)

eBook ISBN 0585376344 Information Technology Evaluation Methods and Management (Hershey, PA: Idea Group Publishing, 2001) by Wim Van Grembergen

eBook ISBN 1580536638 Achieving Software Quality Through Teamwork (Boston: Artech House, 2004) by Isabel Evans

eBook ISBN 0471272795 Essentials of Capacity Management (New York: John Wiley & Sons, 2002) by Reginald Tomas Yu-Lee

eBook Six Sigma Team Dynamics: The Elusive Key to Project Success (New York: John Wiley & Sons, 2003) by George Eckes


“It is much easier to make measurements than to know exactly what you are measuring.” —J. W. N. Sullivan (1928)

 


Operating Level Agreements (OLA)

    An operating level agreement (OLA) formalizes agreements between two or more (usually internal) IT entities into measurable service metric "targets" of specific levels of quality and quantity (cost) of prescribed services.

    It is similar to, but normally not as formal as, SLAs (Service Level Agreements) with customers. The OLA should have its metrics stored in the CDB.

    The META Group anticipates that through 2005, more than half of IT organizations will invest in formalized IT business plans governed by service-level agreements (SLAs). Unfortunately, fewer than 10 percent of IT organizations have a well-defined service-level management process in place today that can accurately and consistently communicate relevant service levels to the business units.


Artifacts and Information Flows among Roles

      Visio 2000 flowchart file

    This pseudo use-case diagram summarizes the information (artifacts) flowing among people assuming certain roles involved in managing the performance of large applications:

      Before any Project:

    1. Management (C.) defines Roles for resources assigned to the project in accordance with Corporate Performance and Capacity Management Strategies (an extension of this document adapted to an organization), which set the process and limits for all performance engineering activities and projects (such as the rationale for tools acquired, equipment scheduling processes, etc.)

      During planning:

    2. Estimated Market (User) Usage Patterns and
    3. Budgeted Costs of provisions
      • for the project and
      • for The application/system under test
      both reconciled in a:
    4. Performance Engineering Plan for each application/project (a part of each application's development plan), which includes Project Workflow Steps appropriate to project objectives, customer requirements, characteristics of the product under test, and the interactions among them. The technological aspects of this plan are drawn from
    5. Performance Tuning Ideas and Techniques used both by developers and by the performance engineer to create

      During construction:

    6. Data Configurations and Application Releases are installed and exercised using
    7. Test Scripts & Run Parameters, which yield
    8. Performance Test Results and Reports, which are the basis of
    9. Simulated Load Patterns captured in a
    10. Capacity Model (built using an MS-Excel spreadsheet or HyPerformix software) to identify
    11. Predicted Performance Profiles for anticipated loads and
    12. Alert Trigger thresholds to planned Upgrade Points

      After deployment:

    13. Actual Load Patterns can be identified in response to
    14. Actual Usage Patterns, which may result in
    15. Off-estimate alerts


Project Mission and Objectives for Customer Satisfaction

    WHATs - Requirements (Critical to Satisfaction)   Average Weight (1-5)   Casual User   Exp. User   Sys Admin
    1.1 fast to load                                  4.5                    4             4           5
    1.2 quick response after submit                   4.3                    4             3           5
    2.1 accepts batched transactions                  3.3                    0             2           5
    2.2 dependable                                    3.1                    4             4           5
    3.1 Does not timeout                              1.9                    1             3           2
    4.1 Quick to Recover                              1.9                    1             3           2
    Organizations implementing Design for Six Sigma (DFSS) are wise to scope projects in terms of Customer Requirements (also called "demanded" wants/delights satisfying needs). Kai Yang[2] calls these Critical-to-satisfaction (CTS) items.

    The column to the right of each requirement contains weight ratings that allow certain customer requirements to be weighted higher in priority than others in the list. The example shown here is the average of weights for different sub-groups. The "(1-5)" range in this example can optionally be replaced with ISO/IEC 14598-1 evaluation scales or with advanced methods such as Thomas Saaty's "Analytic Hierarchy Process" used to establish more precise scales.

    The customer sub-groups shown in this example are for roles working with a computer application:

    • Casual users (who usually prefer sessions to timeout automatically and don't care about batch transactions)
    • Experienced / Power users (who need to batch several transactions with each submit)
    • System/Data Administrators

      Alternately, each item may be rated against competitors and alternatives in a "benchmarking" study, from which gaps (marketplace strengths and weaknesses) can be identified and quantified.

    QFD graphic programs can add:

    • a vertical line graph to illustrate differences in ratings among sub-groups or competitors.

    • a triangle to the right of these items to flag relationships among requirements: Related requirements are noted with "+". Conflicting (mutually exclusive) requirements are noted with "-".

    HOWs - Product Requirements - Technical Engineering Design Characteristics

    At the heart of the Quality Function Deployment (QFD) approach is a matrix of how well each customer requirement is satisfied by measurable product requirements (also called by various authors product design engineering characteristics) that are:

    The International TechneGroup, Inc. (ITI) approach for Concurrent Product/Manufacturing Process Development breaks this "WHATs" of the "voice of the customer" (VOC) down further into User Wants, Must Haves, Business Wants, and Provider Wants.

    • critical-to-quality (CTQ) to users, such as:
      • Average response time to fully load a web landing page from a root URL request.
      • Average response time to accept/update a user registration page (of 10 data fields)
      • Average response time to complete a login.
      • Average response time to obtain a 20 item result page from a typical search (of 3 key values)
      • Average user time to complete a booking (or other specific business process)
    • critical-to-delivery (CTD), such as:
      • Number of seconds from failure (through reboot and startup) to acceptance of first transaction.
    • critical-to-cost (CTC), such as:
      • "Cumulative number of transactions before manual intervention is required (for data purging, etc.)"
      • "Number of peak transactions per hour per application server"

    The CMM (Capability Maturity Model developed at Carnegie Mellon University) defines these categories of measures:

    1. Need Satisfaction measures (Effectiveness, Responsiveness, Correctness, Versatility)
    2. Performance measures (Dependability, Efficiency, Usability, Fidelity)
    3. Maintenance measures (Maintainability, Understandability)
    4. Adaptive measures (Interoperability, Portability, Scalability, Revisablility)
    5. Organizational measures (Cost of ownership, Productivity)


      Software Quality Requirements

      Associated with SEI/CMU's Taxonomy of Quality Measures, the 2000 revision to ISO/IEC FDIS 9126:1991 and SQuaRE define three types of software quality requirements:

      • quality-in-use: the user's view of the quality of the software product when it is used in a specific environment and specific Context-of-use — the extent users can achieve their goals in a particular environment. Its characteristics:
        • Effectiveness
        • Productivity
        • Safety
        • Customer Satisfaction

      • External quality: of the product, and
      • Internal quality: applicable to interim products such as documents, and source code.

      Quality characteristics and their sub-characteristics (attributes of software that bear on the ...):

      Functionality
        • Suitability: presence and appropriateness of a set of functions for specified tasks.
        • Accurateness: provision of right or agreed results or effects.
        • Interoperability: its ability to interact with specified systems.
        • Compliance: adherence to application-related standards, conventions, or regulations in laws and similar prescriptions.
        • Security: its ability to prevent unauthorized access, whether accidental or deliberate, to programs or data.

      Reliability
        • Maturity: frequency of failure by faults in the software.
        • Fault tolerance: its ability to maintain a specified level of performance in case of software faults or of infringement of its specified interface.
        • Recoverability: capability to re-establish its level of performance and recover the data directly affected in case of a failure, and the time and effort needed for it.

      Usability
        • Understandability: users' effort for recognizing the logical concept and its applicability.
        • Learnability: users' effort for learning its application.
        • Operability: users' effort for operation and operation control.

      Efficiency
        • Time behavior: response and processing times and throughput rates in performing its function.
        • Resource behavior: amount of resources used and the duration of such use in performing its function.

      Maintainability
        • Analyzability: effort needed for diagnosis of deficiencies or causes of failures, or for identification of parts to be modified.
        • Changeability: effort needed for modification, fault removal, or environmental change.
        • Stability: risk of unexpected effects of modifications.
        • Testability: effort needed for validating the modified software.

      Portability
        • Adaptability: opportunity for its adaptation to different specified environments without applying actions or means other than those provided for this purpose.
        • Installability: effort needed to install the software in a specified environment.
        • Conformance: adherence to standards or conventions relating to portability.
        • Replaceability: opportunity and effort of using it in place of specified other software in that software's environment.

      ISO/IEC 14598 gives methods for measurements, assessment and evaluation of software product quality.

      SPICE - Software Process Improvement and Capability dEtermination is a major international standard for Software Process Assessment. There is a thriving SPICE user group known as SUGar. The SPICE initiative is supported by both the Software Engineering Institute and the European Software Institute. The SPICE standard is currently in its field trial stage.


Project Numerical Business Goals

    Capacity planning saves money by balancing two conflicting conditions:

    • the expense of unused capacity (in specific components or system-wide) and, on the other extreme,
    • the loss of profits from not having enough capacity to meet demand, especially during busy periods such as Christmas shopping, the Super Bowl, and other peak seasons and events.

   
Balanced Scorecard Diagnostics: Maintaining Maximum Performance (John Wiley & Sons, 2005, 224 pages) by Paul R. Niven presents a step-by-step methodology for analyzing the effectiveness of a company's balanced scorecard, with tools to reevaluate measures for driving maximum organizational performance.


    Set screen "Balanced Scorecard" Metrics

    The "Balanced Scorecard" (BSC) was introduced in 1996 by a popular book written by Robert Kaplan and David Norton (consultants and professors at the Harvard Business School).

    This lists the perspectives of a Balanced Scorecard, and some activity and Key Performance Indicators (KPIs) which are most relevant to capacity and performance management:

    For each perspective below, the core measures are listed first, followed by sample business metrics and by the metrics most relevant to capacity and performance management.

    Customer

    How do our customers see us?
    • Market share
    • Customer acquisition
    • Customer retention
    • Customer profitability
    • Customer satisfaction
    Satisfaction, retention, market, and account share
    • Over 12 months old customer sales
    • Under 12 months old customer sales
    • Reports on contact with larger customers
    • Sales of new/improved products
    • # Customers (Users)
    • # Customer Interactions (linked transactions)
    • % User Wait Times vs. an expected statistical profile
      • Response Time for various user actions
      • Network delay Time from various locations
      • User Think Time between specific user actions
    • Abandonment rate per interaction
    • % Returning Customers conducting interactions
    • % Complaints per million transactions

    Financial (Results)

    How do we look to shareholders?
    • Return-on-investment/
      economic value-added
    • Profitability
    • Revenue growth/mix
    • Cost reduction productivity
    Return on investment and economic value-added:
    • Sales
    • Gross margin after external purchases
    • Gross wages as % of sales
    • End month sales ledger
    • End month purchase ledger
    • Minimum cash available during month
    • % Asset Utilization:
      % of capacity utilized during time frame
    • # Hours worked
    • Revenue, Cost, and Profit Per
      Customer
    • Revenue, Cost, and Profit Per
      Customer Interaction

    Internal (Efficiency)

    What must we excel at?
    Quality, response time, cost, and new product introductions:
    • Sales per Employee
    • "Value added" per Employee
    • % days lost to illness
    • % reserve capacity available
    • # Planned labor hours spent (on maintenance)
    • # Unplanned labor hours spent (fixing breakdowns)
    • Predictability: % Unplanned vs. Planned labor hours
    • Manageability: # Unplanned events (such as server reboots & restores)
    • Reliability: Mean Time Between Failures or other Unplanned Events
    • Maintainability: Mean Time To Repair or other action (such as upgrade a server)
    • Controllability: % of unplanned events expected statistically

    Learning and Growth
    (Agility, Innovation)

    How can we continue to improve and create value?
    • Employee satisfaction
    • Employee retention
    • Employee productivity
    Employee satisfaction and information system availability:
    • No. of current improvement projects
    • No. of benchmarking visits
    • No. of new/improved products introduced
    • % of Employee days on training/learning
    • # innovation hours planned
    • % of total time spent on innovation (20% at Google)
    • % of capabilities (metrics) available as planned.

    These Balanced Scorecard metrics imply these business strategies:

    • To increase product quality and reliability, invest time on innovation rather than mere maintenance.
    • Find a way to maintain a precarious balance between seemingly conflicting ends:
      • To maintain customer satisfaction:
          keep response times within a statistically expected profile by keeping a percentage of reserve capacity available (for surges in business growth) so that the abandonment rate and rate of complaints remains low.
        Yet
      • To decrease project cost (labor hours):
          achieve a higher percentage of asset utilization.
    • To increase profit, increase the percentage of returning customers.
    • To shorten project cycle time, reduce the number of unplanned events.
    • To decrease project risk, increase the percentage of hours that are planned vs. unplanned.

    Management Dashboards

    Dice and slice each metric by comparison, time period, and dimension:

    • This ...
    • Last ...
    • This vs. Last ...
    • Cumulative ...
    • Hour
    • Day
    • Week
    • 4 Weeks
    • Month
    • Quarter
    • Year
    • 2 Year
    • Decade
    • by Customer
    • by Product/Service
    • by Organizational Level
    • by Location
    • by System/Application
    • by Asset
    • by Transaction Type



Performance Project Plan

    Download this MS Project 2003 file containing a sample performance engineering project plan.

    This information supplements (and in some cases contradicts) the body of knowledge for QAI Certified Software Project Managers (CSPMs).


    Performance Within Development Life Cycles

    Parasoft's Automated Error Prevention (AEP) process includes several test-friendly steps:

    • Unit Test between Code and Integrate
    • Integration Test after Code
    • Load Test before Deploy
    • Load Test after Deploy
    • Analyze Performance

    Test Type            Timing (When)
    A. Speed             Parallel with coding construction, as this provides developers feedback on the impact of their choice of application architecture.
    B. Contention
    C. Data Volume       On each release, when app components are being integrated.
    D. Stress/Overload   Pre-production, for each new application version or hardware configuration.
    E. Fail-over         Pre-production, for each new application version or hardware configuration.
    H. Scalability       Pre-production, for each new application version or hardware configuration.
    I. Availability      In production, for each new application version or hardware configuration.


    The Context of Performance and Scope of Performance Tuning

    Performance improvement and management projects can be considered in the context of these architectural layers and components:

      Context Components Tuning Options
    A. Business
    • Products
    • Departments and hierarchy
    • Individual Users & Permissions
    • Number and types of users
    • Business hours
    • Batch (cron) jobs
    • Report schedules
    B. Applications
    • Systems (HR, Accounting, Marketing, Sales, Legal, etc.)
    • Databases (Products, Locations, Customers, Vendors, Employees, etc.)
    • Front-facing Server apps (StoreFront, Login, etc.)
    • Back-end Management apps
    • Client apps (installer, DLLs, JVM)
    • Software architecture
    • Database layout (schemas)
    • Architecture: N-tiered legacy application-oriented architecture (AOA) or service-oriented architecture (SOA), which may include a virtualized deployment.
    • Access and security methods
    C. Operating
    System
    • Paging File
    • Parameters
    • File size & location
    • Kernel tuning
    • OS revisions
    • Disk volume layouts
    D. Server
    Hardware
    Devices
    • See Provisions below
    • number of CPU's vs. App threads
    • Disk (local, NAS/NFS filers)
    • RAM (availability and usage)
    E. Telecommunications
    Infrastructure
    • Telecom Network architecture
    F. Data Center
    Operations
      Facilities: Buildings, Furniture, Electrical, Lighting, Plumbing, HVAC, Wiring for Phones, Data, Speakers
    • NOC (Network Operations Center)
    • Backup/restore strategies
    • Capacity Management
    • Security

    CMG (the Computer Measurement Group)


Provisions for Performance Testing

    One of the key "Success factors" of a performance measurement project is the availability of resources when needed. Waiting for resources (or working around the lack of resources) is one of the major reasons for project delays.

    There are two areas of provision (two sets of budgeted costs):

    • Provisions for the application under test (AUT)
    • Provisions for the project analyzing the AUT


    I. Support Expertise

    Time from each role:
    1. Executive Sponsor
    2. Engagement Manager
    3. Performance Engineer
    4. Database consultant
    5. Development consultant
    6. Operations/Network/Security consultant


    II. Knowledge

    1. Infrastructure Topology describing each server (the versions of software on it, the data it uses and creates, and the protocols and frequency of communications between machines)
    2. Database schemas and how software accesses them.
    3. Installation and configuration notes for each server.
    4. Requirements, design documents, and developer notes on application logic.
    5. Instructions to end users (help screens, User Manuals, trainer notes)
    6. Sequences of user actions (use cases) cross-referenced to associated load testing scripts and software component names.
    7. Results from Functional Testing (defect reports on what doesn't work, and workarounds)


    III. Hardware

    There are several dimensions to the IT infrastructure topology, and thus to the cost of providing IT services:

    1. Environments (clusters of machines), such as
      1. Specific machines used by the application
      2. Specific machines used for load testing
        1. Component Resources within servers

      Environments

    • Live Production Environment
    • Production Staging / Failover Environment
    • Training Environment
    • Load Test Environment
    • Functional QA Environment
    • Development Environment
    • Specific machines on the technology "stack"

      • LDAP and authentication servers.
      • Load Balancer Virtual IP addresses (VIPs)
      • Ancillary servers and services such as spam filters, firewalls, and other appliances
      • Web servers
      • Application servers (such as WebLogic or Websphere servers).
      • Incoming/Outgoing Messaging (email) servers
      • Web service servers (such as media distribution servers)
      • Database servers housing Oracle or SQL servicing database queries
      • External storage (NAS/SAN filers): central file repositories
      • Printers, scanners, and other peripheral devices
      • Accelerator boards (for encryption, etc.), if applicable

      Machines Specific to the Load Test Environment

      • Workstations for testers to develop scripts and scenarios. Tester productivity is greatly improved with multiple monitors.
      • Controller collecting dynamic monitoring data.
      • Web server to hold load test results and analysis reports (so that viewing of results does not disturb tests being run).
      • Load Generator agents (injectors) emulating clients under test.
      • Loadtest servers emulating servers under test.
      • Loadclient machines to access the test network and to stage application files.



    Component resources within each server (a monitoring sketch follows this table):

    Resource               Type       Metric
    Motherboard            Volatile   Peak % CPU (CPU cycles)
    Memory                 Volatile   Peak KB cache
    JVM Heap               Volatile   Peak KB usage
    Local Hard Disk        Static     Peak KB/sec I/O bandwidth
    Shared Archive Space   Static     Peak GB usage
    DB Disk Space          Static     Peak GB usage
    Network                Volatile   Peak KB/sec bandwidth
    OS file handles        Volatile   Peak # file handles
    Shared server          Volatile   Peak # J2EE containers
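
    A minimal sketch of sampling a few of these component resources on one server during a run, using the third-party psutil package (not part of the Python standard library); a real project would feed such samples into the test tool's monitors or the CDB:

        # Sketch: sample CPU, memory, disk I/O, and network counters periodically.
        # Requires: pip install psutil
        import time
        import psutil

        def sample():
            disk = psutil.disk_io_counters()
            net = psutil.net_io_counters()
            return {
                "cpu_pct":   psutil.cpu_percent(interval=1),   # % CPU over a 1 s window
                "mem_pct":   psutil.virtual_memory().percent,  # RAM in use
                "disk_io":   disk.read_bytes + disk.write_bytes,
                "net_bytes": net.bytes_sent + net.bytes_recv,
            }

        if __name__ == "__main__":
            for _ in range(5):          # a real run would sample for the whole test
                print(sample())
                time.sleep(5)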


    IV. Data and permissions

    1. On the LAN firewall, open port 54345 between the LoadRunner controller and its load generators.
    2. Accounts and passwords to test environment machines.
    3. Accounts and passwords for administrators, users, and testers of the application under test.
    4. Test data extracted (and sanitized) from production.
    5. Test data to be generated.
    6. Secure certificates, if any.
    7. Server images (backups)


    V. Software

    Specific tools are described and compared in my Load Testing Products page.

    1. Installation package storage and target locations (folders and file names).
    2. Network performance (instantaneous response time, throughput, & hop count) checker QCheck freeware download (a subset of IxChariot) — installed as "performance endpoints" on servers being tested/used
    3. Access to source libraries (in CVS/SourceSafe).
    4. Special programming associated with the application under test:
      • Program(s) to prepare data needed for testing.
      • SQL calls to clear data destroyed during each run.
      • Driver programs for testing calls to lower-level components.
      • Stub programs to emulate availability of lower-level components.
      • Logic that LoadRunner scripts need to emulate internally (such as data signing).
    5. Special programming associated with the LoadRunner product:
      • Utilities to archive/backup and restore test assets (in and out of ClearCase).
      • Utilities to archive/backup and restore test results.
      • Utilities to transfer data between LoadRunner and external utilities such as graphing
      • Utilities to display test results and analysis.


Risk Contingency Adjustments

    Potential Obstacle / Risk Likelihood Avoidance / Mitigation
    1. Servers not available early during the project. High (80%) a. Use dev. environment to develop single-user scripts.
    2. Difficulty with Controller licensing, capacity, etc. Medium (50%) b. Identify issues early by beginning to use the controller as soon as the first small script (such as login only) is coded.
    3. Developers not available Medium (50%) c. Perform thorough system analysis to identify issues before scripting.
    d. develop scripts with likely issues early.
    4. Not enough capacity in front-end (portal/login) servers. Medium (40%) e. Quantify capacity of front-end servers with login_only scripts.
    5. Change of personnel during the project. Medium-High (60%) f. Take notes. Conduct formal peer walk-throughs.
    g. Make assignments for skill development.
    6. Servers become unavailable late during the project Low (20%) h. Use production staging environment at night.
    i. Instead of going through load balancer, test directly against one server taken off its cluster.
    7. Changes in server hardware Low (10%) j. Conduct benchmark tests on hardware as part of project.
    k. Save server configuration files for historical comparisons.


Usage Patterns Trend Analysis

    The business perspective measures "busy-ness" using metrics such as the number of "transactions per second".

    The estimates are based on Estimated Market (User) Usage Patterns, which feed Budgets & Forecasts of costs and revenues.

    Sudden peaks are common, as illustrated by this graph of search interest about a movie title:

    Google Zeitgeist 2005 on March of the Penguins movie release

    Resource Load Patterns

    Measures of "Load" provide a guage of the amount of work, such as "horsepower" in the physical world and MB of data transferred per second.

    These load patterns feed a service capacity business impact analysis (BIA).

    Capacities

    Use the Poisson distribution to estimate the maximum number of requests to provision for, given an average (a calculation sketch follows the table below):

    Requests per day    Average Requests    Maximum Requests
    10,000              1                   6
    50,000              3                   8
    100,000             5                   14
    215,000             10                  21
    300,000             14                  27
    450,000             20                  35
    500,000             24                  40
    648,000             30                  47
    864,000             40                  59
    1,000,000           47                  68
    1,080,000           50                  71
    1,500,000           70                  94
    2,000,000           93                  120
    2,160,000           100                 128
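
    A minimal sketch of that calculation: given an average number of simultaneous requests, find the level exceeded only about 0.1% of the time. The published table above may rest on a different percentile or service-time assumption, so its exact figures will not necessarily match.

        # Sketch: smallest k such that P(X <= k) >= percentile for X ~ Poisson(mean).
        # Used here to turn an average concurrency into a provisioning "maximum".
        import math

        def poisson_max(mean, percentile=0.999):
            pmf = math.exp(-mean)      # P(X = 0)
            cdf = pmf
            k = 0
            while cdf < percentile:
                k += 1
                pmf *= mean / k        # P(X = k) derived from P(X = k - 1)
                cdf += pmf
            return k

        for mean in (1, 5, 10, 50, 100):
            print(f"average {mean:>3} concurrent requests -> provision for {poisson_max(mean)}")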

    Each piece of equipment has a limit on how much it can produce.

    An assembly can only handle as much as its smallest channel.

    For example, a web server has an input buffer, an internal queue, and an output buffer.

    Predicted Performance Profiles for anticipated loads

    Estimates need to be based on peak rates of transaction "busyness" at various points in time.

    Rather than "average" loads, it's "maximum" values during various blocks of time.

    Bottlenecks

    Just as a chain's strength is limited by the strength of its weakest link,
    the capacity of an entire system processing transactions is limited by the capacity of the slowest component within the slowest server.

    Cash memory is the part of the computer that remembers how much money you spent on your computer. The more you spend on your computer, the faster it will work. That's why the million dollar computers work so fast - they have more cash memory than you do.

    Issues

    Analysis of results may identify the following issues:

    • Contention (data, file, memory, processor)

    • Inappropriate distribution of workload across available resources

    • Inappropriate locking strategy

    • Inefficiencies in the application design

    • Unexpected increase in transaction rate

    • Inefficient use of memory

    So the capacity manager must be involved in a broad scope: all categories of the entire IT architecture supporting the organization's Service Catalog:

    • Operating Systems
    • Client applications (IE, Java Runtimes, utilities, etc.)
    • Facilities (cabling)
    • Network equipment (LANs, WANs, bridges, routers, and so on)
    • Hardware:
      • Web Servers
      • Peripheral devices (SANs, etc.)
      • Middle Tier Servers (middleware, Tivoli, BMC, etc.)
      • Database Servers
    • Egress (services supplied from outside of IT, for example power and water)

    The components within each server:

    • Speed of Network Interface Card
    • Speed of CPU (MHz) and buffers (L1, L2)
    • Disk data transfer speed and cache
    • Speed of application processing
    • Speed of security processing (encryption and decryption)
    • etc.

    The impact of bottlenecks is included in the metric percentage utilization of resources.

    Set screen Capacity Management Database (CDB)

    One of the "best practices" of service management frameworks is that all this information be defined in a capacity plan stored within a capacity management database (CDB). (This is related to but separate from the configuration management database, or CMDB.)

    The CDB contains the detailed technical, business, and service level management data that supports the capacity management process. The resource and service performance data in the database can be used for trend analysis and for forecasting and planning.
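
    As an illustration only, here is a hypothetical minimal CDB table for resource and service performance samples, with a trend query of the kind used for forecasting; the schema and names are invented, not taken from any framework:

        # A hypothetical, minimal CDB table for resource/service performance samples,
        # suitable for simple trend analysis; schema and names are invented for
        # illustration only.
        import sqlite3

        conn = sqlite3.connect("cdb.sqlite")
        conn.execute("""
            CREATE TABLE IF NOT EXISTS perf_sample (
                sampled_at  TEXT NOT NULL,   -- ISO-8601 timestamp
                service     TEXT NOT NULL,   -- entry from the Service Catalog
                server      TEXT NOT NULL,   -- configuration item (host) name
                metric      TEXT NOT NULL,   -- e.g. 'cpu_pct', 'resp_time_ms'
                value       REAL NOT NULL
            )""")
        conn.execute(
            "INSERT INTO perf_sample VALUES (?, ?, ?, ?, ?)",
            ("2011-01-15T10:00:00", "Order Entry", "web01", "cpu_pct", 62.5))
        conn.commit()

        # Trend query: monthly average CPU utilization per server for one service.
        rows = conn.execute("""
            SELECT substr(sampled_at, 1, 7) AS month, server, AVG(value)
            FROM perf_sample
            WHERE service = 'Order Entry' AND metric = 'cpu_pct'
            GROUP BY month, server
            ORDER BY month""").fetchall()
        print(rows)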

    Mainframe-based data collection methods, tools, and techniques include MXG, SMF, SAS, and quantitative analysis techniques.

 

 

Go to Top of this page.
Previous topic this page
Next topic this page

    Set screen UML 2.0 Test Profile

    Information about each Test Context: the collection of test cases and the test configuration on which they are executed.

  • Testing Objectives -> Test Generation Directives
  • Specifications -> Behavioral (Dynamic) Model
  • component-level and system-level tests.

    Time Concepts

    The set of concepts for specifying time constraints, time observations, and/or timers within test behavior specifications, in order to quantify test execution time and/or observe the timed execution of test cases.

    The profile supports specification of tests for structural (static) and behavioral (dynamic) aspects of computational UML models.

    A test context is just a top-level test case. Annotate the model with testing information:

    • Coverage criteria
    • The purpose of specific tests
    • Testing constraints
    • The interface for testing

 

 

Go to Top of this page.
Previous topic this page
Next topic this page

Set screen Off-Estimate Alerts

    Estimates derived from load and stress testing may not be precise or realistic.

    So if, during production operation, loads become higher than expected, those monitoring the data center would issue alerts prompting reactive action and additional analysis.
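
    A minimal sketch of such an alert check, comparing observed production load against the highest load level actually covered by testing; the threshold values are invented for illustration:

        # A minimal sketch of an "off-estimate" check: raise an alert when observed
        # production load approaches or exceeds the load level the system was
        # actually tested at. The threshold values are assumptions for illustration.
        TESTED_PEAK_TPS = 120          # highest transaction rate covered by load tests
        ALERT_MARGIN = 0.90            # alert at 90% of the tested peak

        def check_load(observed_tps: float) -> None:
            if observed_tps >= TESTED_PEAK_TPS * ALERT_MARGIN:
                print(f"ALERT: observed {observed_tps} tps is beyond {ALERT_MARGIN:.0%} "
                      f"of the tested peak ({TESTED_PEAK_TPS} tps); investigate and retest.")

        check_load(observed_tps=131)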

 

    wav sound "Missed it by that much" —Don Adams as the inept "Maxwell Smart, Agent 86" in the "Get Smart" NBC TV comedy series during the 1960s
    "Would you believe ... ?"

  Go to Top of this page.
Previous topic this page
Next topic this page

Set screen Performance Criteria (Supplemental Requirements)

    Testers and business analysts work together to compose a Supplementary Requirements document to define the performance criteria for the application being designed. An example of a measurement goal (as opposed to an SLA) is:

      “When 500 customers access the application at the same time (with normal think times), response time for 95% of transactions must be

      • within 4 seconds for data-intensive requests (such as searches through the database),
      • within 2 seconds for data-intensive requests local to the user client, and
      • within 1 second for all other requests local to the user client.”

    Variations on this include the response time degradation expected for different numbers of users exercising the business and administrative tasks described in the application's Use Cases document.
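
    A minimal sketch of checking measured response times against such 95th-percentile goals; the category names and sample values are invented:

        # A minimal sketch of checking measured response times (in seconds) against
        # 95th-percentile goals like those quoted above; sample data and category
        # names are invented for illustration.
        import statistics

        GOALS_SEC = {"database_search": 4.0, "client_local": 2.0, "other": 1.0}

        def percentile_95(samples: list[float]) -> float:
            # statistics.quantiles with n=20 yields the 5th..95th percentile cut points
            return statistics.quantiles(samples, n=20)[-1]

        measured = {
            "database_search": [2.1, 3.9, 4.4, 2.8, 3.2, 3.7, 2.9, 3.1, 4.1, 2.6],
            "client_local":    [0.9, 1.4, 1.1, 1.8, 1.2, 1.6, 1.3, 1.5, 1.0, 2.3],
        }

        for category, samples in measured.items():
            p95 = percentile_95(samples)
            goal = GOALS_SEC[category]
            verdict = "PASS" if p95 <= goal else "FAIL"
            print(f"{category}: 95th pct = {p95:.2f}s (goal {goal}s) -> {verdict}")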

   

$95/49 The Art of Computer Systems Performance Analysis: Techniques for Experimental Design, Measurement, Simulation, and Modeling by R. K. Jain (John Wiley: 1991) is a seminal classic and a must-read for its clarity.

$45/15 Performance Solutions: A Practical Guide to Creating Responsive, Scalable Software (Addison-Wesley: September 17, 2001) by Dr. Connie Smith and Lloyd Williams of Performance Engineering focuses on object-oriented systems and alignment of Software Performance Engineering (SPE) with RUP. It notes performance patterns and anti-patterns.

$40 Measuring Computer Performance : A Practitioner's Guide (Cambridge University Press: September 2000) by David J. Lilja (Professor at U. of Minnesota and author) is a more gentle introduction than Jain's, which is more quantitative.

  Go to Top of this page.
Previous topic this page
Next topic this page

    Set screen User Steps

      Examples of user steps/actions:

    1.	Invoke URL
    2.	Login/Logon
    3.	Pick/Start Application
    4.	Add one
    5.	Import batch
    6.	Search
    7.	Export batch
    8.	Archive/backup
    9.	Delete
    10.	Retrieve/undelete
    11.	Restore from archive
    12.	Exit application
    13.	Logout/Logoff
    
    Examples:
    • Start-up from machine cold-boot.
    • Application invocation (each mode of operation)
    • For each type of user (user role)
      • Register
      • Login
      • UnRegister
      • ReRegister
      • Password recovery
      • Login
    • Select each menu item
    • For each data structure:
      • Browse
      • Sort display
      • Add
      • Change (Edit)
      • UnDo
      • Delete
      • UnDelete
      • Search
      • Print
      • Import batches
      • Export batches
      • View Properties
    • Logout
    • Stopping

Go to Top of this page.
Previous topic this page
Next topic this page

    Set screen Result Design

    An important outcome of the design phase is how results will be organized and presented to various audiences. Results from the $800 SPECweb99 (v1.0, announced 1999) and SPECweb99_SSL (March 2002) pre-defined workload generators, which benchmark the number of WWW server connections per second, are summarized using this table format:

    Test results for each iteration (median is shaded)
    Iteration | Conforming Connections | Percent Conform | Throughput (ops/sec) | Response (msec) | ops/sec/loadgen | kbit/sec
    1 | 4130 | 100.0% | 11619.9 | 355.4 | 2.81 | 335.4
    2 | 4130 | 100.0% | 11583.8 | 356.5 | 2.80 | 334.4
    3 | 4130 | 100.0% | 11610.2 | 355.7 | 2.81 | 335.1

Go to Top of this page.
Previous topic this page
Next topic this page

Set screen Design the tests

    With the above performance criteria and business functionality in mind, performance testers create a Performance Test Plan which describes what is to be tested (and names the categories and intent of those tests).

    Examples of categories (and actions) include:

    • Installation (Load data)
    • Configuration (scale, languages/locales)
    • Administration (List, Add, Update, Delete user; backup and recovery)
    • Business Transactions (Login, Browse, Search, Logout)
    • Reporting
    • "Security" and "Help" could be a separate category or a part of each category.

    Each possible action during user sessions (the paths through the application) can be graphically depicted with lines using the industry-common User Community Modeling Language (UCML). [UCML sample diagram]

    Parentheses after an action indicate the likelihood of that action occurring. Dotted lines under an action identify additional optional actions.
    UCML encloses in circles the percentage of occurrence, or the number of occurrences per iteration.
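
    A minimal sketch of turning such percentages into a weighted workload mix for virtual users; the action names and weights below are invented examples:

        # A minimal sketch of turning UCML-style percentages into a weighted workload
        # mix for virtual users; action names and weights here are invented examples.
        import random

        # Likelihood of each path through the application, as circled in the diagram.
        ACTION_WEIGHTS = {
            "browse_catalog": 0.50,
            "search":         0.25,
            "add_to_cart":    0.15,
            "checkout":       0.10,
        }

        def next_action(rng: random.Random) -> str:
            actions, weights = zip(*ACTION_WEIGHTS.items())
            return rng.choices(actions, weights=weights, k=1)[0]

        rng = random.Random(42)            # fixed seed so a run is reproducible
        session = [next_action(rng) for _ in range(10)]
        print(session)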

    When Scott Barber first defined UCML in 1999, he also proposed a format for defining additional information, such as:

    The parameters include how many virtual testers will be used and when.

    The factors used during testing define what is varied.

    Variations in configurations and database-access scenarios are defined for each iteration. Mock-ups of the statistics and graphs to be generated after each test (for each build) are created at this time. The physical test environment of servers and networking devices is also assembled at this time, based on the same Installation Procedures used during actual Deployment.

    To avoid delay later, it helps to identify early the techniques needed to handle complexities in the application or environment (such as the use of firewalls, fail-over, load balancing, session identifiers, cookies, XML/XSLT transforms, etc.).

    Web protocol agents do not include processing time for plug-ins such as Macromedia Flash and Real players. For timings of how much time it takes Flash to paint the screen, you need an additional license from Mercury for the Flash protocol emulator.

    Tests of web servers typically include use of HTTP caching.

   

This document makes use of terminology from the UML 2.0 Testing Profile specification v1.0 (July 7, 2005).

This enables test definition and test generation based on structural (static) and behavioral (dynamic) aspects of UML models.

The UML 2 Testing Profile was developed from several predecessors: SDL-2000, MSC-2000, and TTCN-3 (Testing and Test Control Notation version 3), also published as ITU-T Recommendation Z.140. TTCN-3 was developed during 1999-2002 at ETSI (the European Telecommunications Standards Institute) and is a widely accepted standard in the telecommunication and data communication industry as a protocol test system development specification and implementation language for defining test procedures for black-box testing of distributed systems.

ETSI European Standard (ES) 201 873-1 version 2.2.1 (2003-02): The Testing and Test Control Notation version 3 (TTCN-3); Part 1: TTCN-3 Core Language.

J. Grabowski, D. Hogrefe, G. Réthy, I. Schieferdecker, A. Wiles, C. Willcock. An Introduction into the Testing and Test Control Notation (TTCN-3). Computer Networks, Volume 42, Issue 3, Elsevier, June 2003.

Neil J. Gunther

Xerox PARC & Pyramid (Fujitsu) alumnus, founder of Performance Dynamics, and developer of the PARCbench multiprocessor benchmark and the open-source, C-language PDQ queueing model solver.

Errata $90 The Practical Performance Analyst: Performance-By-Design Techniques for Distributed Systems (McGraw Hill: February 1998). This has been obsoleted by

read it online $46 The Practical Performance Analyst, 2nd Edition (Authors Choice Press, October, 2000) &
Browse this on iUniverse

article [Requires Registration] A complete rewrite of the above is now underway as Guerrilla Capacity Planning: Hit and Run Tactics for Sizing UNIX, Windows and Web Applications (Springer-Verlag).

 

Go to Top of this page.
Previous topic this page
Next topic this page

Set screen Provisioning Milestones

    Time to ensure proper (coordinated) installation of hardware and software is chronically underestimated for performance projects.

    For whatever reason, it is often assumed that performance testers do not need the same amount of time as operations staff to install an application. Additionally, information about installation issues often does not get to performance testers.

    Yet, even small mistakes in installation can invalidate test results.

    The installation milestones (below) are repeated for each hardware configuration (m) to be tested:

    m.1.allocated > m.2.delivered > m.3.assembled > m.4.installed > m.5.configured > m.6.available > m.7.operational > m.8.benchmarked

    Completion Nickname | Milestone Description | Target Date
    m.1.allocated    | 1. Provisions (equipment, IP addresses, SSL certificates, etc.) have been allocated. |
    m.2.delivered    | 2. Provisions have been delivered for installation work. |
    m.3.assembled    | 3. Installation packages and instructions assembled for installation and integration work. This ends with inventory check-in of boxes. |
    m.4.installed    | 4. App. installer packages (including anti-virus, utilities such as Adobe Reader, and patches) have been verified as installed on the first server. |
    m.5.configured   | 5. App. has been configured with modified configuration settings (IP, DNS, time server, rstatd, NAS, etc.) as planned. This ends with the creation of a snapshot archive image file. |
    m.6.available    | 6. App. has been proven usable at completing initial transactions processing test data, and is available for use as individual machines on the target network. |
    m.7.operational  | 7. All servers are operational as a completely integrated whole. This is verified by the first performance test run (usually Speed Testing). |
    m.8.benchmarked  | 8. The performance of various configuration settings is benchmarked as a basis for comparison. |
     

Go to Top of this page.
Previous topic this page
Next topic this page

Set screen Construction: Create and validate Speed Test scripts

    Test Analysts create Performance Test Scripts to execute a particular Sequence Diagram describing a part of the application's functionality. Plain performance tests of user response time can be manual, documented using a word processor (such as Microsoft Word).

    This corresponds to SPE steps:

    1. Verify and validate model(s)
    2. Evaluate performance model(s) [until performance is acceptable]
      • if feasible, modify product concept, then modify performance model(s)
      • if not feasible, revise performance objectives

    Repetitive load and stress tests automate the steps defined during preliminary performance testing. Actions automatically captured into a load testing script are modified in several ways:

    1. Insert comments to annotate user actions, to cross-reference documented planned steps, or to identify the source of information at key steps.
    2. Insert lr_ general functions to log the vuser computer name and other information about each test run.
    3. Add or remove cookies on client machines if the app assumes a single user.
    4. Add "correlation statements" to save and retrieve dynamic values.

    Changes to LoadRunner scripts include the following (see the conceptual sketch below):

    1. Define the format of parameters in the Parameter List. The default brace characters that designate a parameter depend on the Vuser type, but can be specified in the General Options screen.
    2. Replace fixed text in captured script code with a <parameter>.
    3. Format internal data generated while a Vuser runs (shown in the Sample Value box): date/time, Vuser Group, Iteration number, Load Generator Name, Random and Unique numbers. Specifying a format code of %05s pads a string value of 3 as "00003"; %3s pads a three-character string with spaces.
    4. Define rendezvous points.
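
    Here is a conceptual sketch in Python (not LoadRunner syntax) of parameterization and correlation: substituting a value from a parameter list on each iteration, and capturing a dynamic token from one response for reuse in the next request. The URL and token pattern are invented for illustration:

        # A conceptual sketch (not LoadRunner syntax) of parameterization and
        # correlation: substitute a data value from a list on each iteration, and
        # capture a dynamic token from one response for reuse in the next request.
        # The URL and the token pattern are invented for illustration.
        import re
        import urllib.request

        USERNAMES = ["user001", "user002", "user003"]      # the "parameter list"

        def run_iteration(iteration: int) -> None:
            username = USERNAMES[iteration % len(USERNAMES)]   # parameter substitution

            # Step 1: request the login page and "correlate" the session token out of it.
            with urllib.request.urlopen("http://test.example.com/login") as resp:
                body = resp.read().decode("utf-8", errors="replace")
            match = re.search(r'name="session_token" value="([^"]+)"', body)
            token = match.group(1) if match else ""

            # Step 2: replay the captured token in the follow-on request.
            data = f"user={username}&session_token={token}".encode()
            urllib.request.urlopen("http://test.example.com/do_login", data=data)

        for i in range(3):
            run_iteration(i)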

    Scripts created for testing the functionality and business logic of the software may (after proper planning or rescripting) be adapted for reuse in performance testing and monitoring. This avoids re-coding complex logic in test scripts and the time spent re-selecting data for use during tests.

 

  Go to Top of this page.
Previous topic this page
Next topic this page

    Set screen Levels of Scripting Capability

    The IBM Rational Unified Process (RUP) for Performance tracks the progress of performance testing by the maturity of scripting assets.

    The purposes of test runs during a load testing project typically follow this sequence of increasing capability levels over time:

    Test Level | A. Config. | B. # Users | C. Range of values | D. Prior data | E. Length of run | Purpose (type of testing)
    1 | Initial  | One  | Static  | None | Short | Initial script creation through recording and playback, plus addition of error checking, transaction definition, etc.
    2 | Initial  | Few  | Dynamic | None | Short | Initial data parameterization and monitor debugging, toward Speed testing.
    3 | Initial  | Few  | Complex | None | Short | Initial scenario debugging and report template formatting, toward Contention testing.
    4 | Initial  | Few  | Complex | Much | Short | Data volume tests to determine whether the amount of stored data impacts performance.
    5 | Baseline | Many | Complex | Much | Short | Stress tests and Test Report presentation previews; Failover tests after an overload occurs.
    6 | Baseline | Many | Complex | Much | Long  | Longevity tests to establish the current baseline.
    7 | Altered  | Many | Complex | Much | Short | Application regression smoke testing vs. the stress-tested baseline above.
    8 | Altered  | Many | Complex | Much | Long  | Comparison testing (vs. the current baseline) for Scalability testing.

Go to Top of this page.
Previous topic this page
Next topic this page

    Set screen Possible impacts to performance

    Here are the major variables to track the capability of load test scripts:

    1. The configuration under test (the combination of application code version and parameters used to configure servers).

      Work with an "initial" configuration is necessary in order to carefully defining a "baseline" configuration because the scripting process usually reveals some changes to the precise configuration planned,

      Configurations usually become "altered" due to conclusions reached during a full set of load tests using the baseline configuration.

    2. The load placed on the application, determined by the number of virtual users doing work.

    3. The range of data values used during the test run. "Complex" describes different groups of users active at various portions of runs.

    4. The volume of prior data stored when tests begin.

    5. The length of time for test runs. This is the last column because capability in this aspect is a consequence of capability in the prior areas to the left. Rushing to produce longer runs would only provide the appearance of progress if a script is fundamentally flawed in other aspects of capability.

    Here are the complexities of a load script:

    • Different types of transactions
      which access different tables to look-up values or store transactional data.

    • Different data values
      (such as different departments, vendors, projects, etc.) stored in different areas of the hard disk.

    • Different numbers of line items,
      which use different logic or data paths.

    • Error recovery,
      which uses different logic or data paths.

Go to Top of this page.
Previous topic this page
Next topic this page

Set screen Process Milestones

    Each project undergoing performance engineering goes through these milestones:

      The scripting milestones (below) are repeated for each form/type of performance testing (t):

      Completion Nickname | Milestone Description | Target Date
      t.1.defined           | 1. User actions and expected responses are defined. |
      t.2.drafted           | 2. Scripts to simulate user actions are drafted, and exhibit no application functional errors. |
      t.3.repeatable        | 3. Run-time settings and parameter data to be used in test runs have been established such that scripts can be repeated successfully for several virtual users. |
      t.4.validated         | 4. Run-time settings and parameter data to be used in test runs have been established such that scripts can be repeated successfully for several virtual users. |
      t.5.results confirmed | 5. Scripts have been run to produce results (run logs and statistics obtained during runs). |
      t.6.analyzed          | 6. Run result data have been analyzed to produce tables of data and visual graphs to support statements of issues and findings. |
      t.7.concluded         | 7. Conclusions about the app's performance profile, and recommendations for performance tuning, are available for review and acceptance by management. |

      Note: There may be several rounds of testing to determine the impact of each change to run parameters or application code.

Go to Top of this page.
Previous topic this page
Next topic this page

Set screen Models of Usage and Capacity

    Academic treatments of performance engineering advocate the use of mathematical models as impartial tools for collecting, analyzing, and presenting findings that rationally reconcile the various informational needs.
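
    As a minimal illustration of the kind of model meant here, the classic M/M/1 queueing formula R = S / (1 - λS) relates service time S and arrival rate λ to response time R. This is a standard textbook result, not a formula from the products described below:

        # A minimal illustration of an analytic model: the classic M/M/1 queue, where
        # response time R = S / (1 - lambda * S) for service time S and arrival rate
        # lambda. This standard textbook formula stands in for the spreadsheet-style
        # models described in the text.
        def mm1_response_time(service_time_s: float, arrival_rate_per_s: float) -> float:
            utilization = arrival_rate_per_s * service_time_s
            if utilization >= 1.0:
                raise ValueError("arrival rate exceeds capacity; queue grows without bound")
            return service_time_s / (1.0 - utilization)

        # Response time climbs steeply as utilization approaches 100%.
        for rate in (10, 40, 70, 90, 95):
            print(f"{rate:>3} req/s -> {mm1_response_time(0.010, rate) * 1000:6.1f} ms")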

    The objective of modeling is to create a mathematical model. For example, create a spreadsheet such as Exch_Calc.xls from the December 2000 Microsoft Capacity Planning and Topology Calculator to predict the scalability of an Exchange 2000 email messaging infrastructure deployment. The spreadsheet calculates the expected number of Windows 2000 Active Directory Global Catalogs, domains, and sites.

      tool The Microsoft Exchange 2000 Resource Kit includes tools ESP and LoadSim to probe the scalability limits of Exchange systems.

      The model would take into account the software clients used to access mail, server transactions with that client, the hardware (number/speed of processors), and the physical deployment itself.

        back-end "user-per-server" numbers are not very useful taken out of the context of the whole deployment.

      As part of Microsoft's Dynamic Systems Initiative (DSI), which supports SOA (Service Oriented Architecture) via the Windows Communication Foundation (WCF, code-named Indigo) on Vista 2007 servers, Microsoft System Center Capacity Planner 2006 simulates deployment sizing with "what-if" analysis.

      This product uses a common, central SDM (Systems Definition Model) used by all System Center software packages, starting with Microsoft Operations Manager (MOM) 2005, built for use with the MS.Net Framework version 2.0

      To diagnose the root causes of performance problems in a Microsoft Windows Server 2003 deployment, Microsoft provides Windows Server 2003 Performance Advisor (6/17/2005, on the .NET 1.1 Framework), a replacement for the 5/24/2004 Server Performance Advisor V1.0. These run on Windows 2003 SP1 (not Windows 2000 or Windows XP).

      It provides several specialized reports, including a System Overview (focusing on CPU usage, Memory usage, busy files, busy TCP clients, top CPU consumers) and reports for server roles such as Active Directory, Internet Information System (IIS), DNS, Terminal Services, SQL, print spooler, and others.

    tool TeamQuest analytic modeling software claims to find the optimal configuration based on business forecasts and to handle spikes in demand by experimenting with what-if analysis in a virtual environment. But I would not recommend them because they don't seem willing to talk to me.

    tool Mercury Capacity Planning (MCP) CDB (Capacity Data Base)

   
$55 Performance by Design: Computer Capacity Planning by Example (Prentice Hall, 05 January, 2004) Hardcover by Virgilio Almeida, Lawrence Dowdy, Daniel Menasce

Virgilio A.F. Almeida

$52 Capacity Planning for Web Services: Metrics, Models, and Methods, 2nd edition by Daniel A. Menasce & Virgilio A.F. Almeida (Prentice Hall; September 11, 2001) download

Daniel A. Menasce

$17 Capacity Planning for Web Performance: Metrics, Models, and Methods (Prentice Hall; June, 1998) by Virgilio A.F. Almeida & Daniel A. Menasce

$46 Scaling for E-Business: Technologies, Models, Performance, and Capacity Planning by Daniel A. Menasce & Virgilio A.F. Almeida (Prentice Hall PTR; May 2000)

eBook ISBN 1417507810: IT Performance Management (Butterworth-Heinemann: Oxford / Burlington, Mass., 2004) by Peter Wiggers, Henk Kok, and Maritha de Boer-de Wit.

Go to Top of this page.
Previous topic this page
Next topic this page

Set screen Format data for Presentation

    After each run, there are several activities. For example, with LoadRunner 8 software:

    • In Analysis, set the filter to not include Think Time.

    • In Analysis, set granularity of charts.

    • Perform additional calculations within an MS-Excel spreadsheet by copying LoadRunner's Summary report (using "Paste Special" to avoid carrying over the formatting), then adding formulas in empty columns.

    • If changes were made, save a new report template.

Go to Top of this page.
Previous topic this page
Next topic this page

Set screen Refine Stress Test Scenario parameters to Identify Bottlenecks

    To determine whether a server can respond to user requests within specified response times under various artificial load levels, simultaneous virtual users (testers) are added to emulate real-world stresses on the application and its environment. Increases to the number of users can be applied randomly, logarithmically, or by a constant (such as 10 more users per second).
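
    A minimal sketch of generating a constant ramp-up schedule of the kind described above; the target, step, and interval values are invented examples:

        # A minimal sketch of a constant ramp-up schedule: add a fixed number of
        # virtual users at each interval until the target load is reached. The target
        # and step values are invented examples.
        def constant_ramp(target_users: int, step: int, interval_s: int) -> list[tuple[int, int]]:
            """Return (elapsed_seconds, active_users) pairs for the ramp-up."""
            schedule, users, elapsed = [], 0, 0
            while users < target_users:
                users = min(users + step, target_users)
                schedule.append((elapsed, users))
                elapsed += interval_s
            return schedule

        for t, u in constant_ramp(target_users=100, step=10, interval_s=60):
            print(f"t={t:>4}s: {u} virtual users active")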

    Early detection of bottlenecks improves the efficiency of developers, so performance testing in parallel with application Construction (rather than waiting until deployment) can be very cost effective.

    Performance testers can make testing scenarios and scripts more realistic by refining scripts to be invoked on a random, sequential, or synchronized basis, emulating progressively more complex (and negative/conflicting) scenarios:

    1. Home page visit only, then no other action.
    2. Home page, user registration (successful).
    3. Home page, successful login, then no other action.
    4. Home page, unsuccessful login (repeated), and password recovery.
    5. Login page, successful user login, input transaction (with and without errors).
    6. Login page, successful administrator login, and update transactions.
    7. Login page, successful administrator login, and reporting.

    These tests may be repeated for each set of installation options (such as different brands/capacities of hardware and software) and different configuration settings (support of different locales or database tuning settings).

    Additional functionality can be tested as new builds add additional functionality or stubs and drivers can be created to simulate actual application functionality.

    These tests quantify the two basic parameters used to predict performance capacity (see the sketch after this list):

    • contention delays such as time spent waiting on a database lock,

    • pairwise coherency mismatches such as time to fetch a cache-miss.
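
    These two parameters correspond to the contention (alpha) and coherency (beta) terms of Gunther's Universal Scalability Law, C(N) = N / (1 + alpha(N-1) + beta N(N-1)). A minimal sketch follows, with invented parameter values that would in practice be fitted to measured throughput data:

        # A minimal sketch of Gunther's Universal Scalability Law, whose two
        # parameters are exactly the quantities above: alpha (contention) and beta
        # (pairwise coherency). The parameter values here are invented examples.
        def usl_capacity(n_users: int, alpha: float, beta: float) -> float:
            """Relative capacity C(N) = N / (1 + alpha*(N-1) + beta*N*(N-1))."""
            return n_users / (1 + alpha * (n_users - 1) + beta * n_users * (n_users - 1))

        ALPHA = 0.03   # contention: serialized fraction, e.g. waiting on a database lock
        BETA = 0.0002  # coherency: cost of keeping per-user state consistent (cache misses)

        for n in (1, 8, 32, 64, 128, 256):
            print(f"N={n:>3}: relative capacity {usl_capacity(n, ALPHA, BETA):6.1f}")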

 

  Go to Top of this page.
Previous topic this page
Next topic this page

Set screen Diagnosing results for Longevity Load Tests

    Longevity load tests determine whether the combined application and environment can handle the simulated load over time. In the sample graph at the right, the blue line on the bottom can represent server CPU utilization (and other monitored indicators) resulting from the incremental addition of users (represented by the stair-step red line).

    The green line in the middle can represent response time. Some performance test tools can break the total average response time down into how much time was spent in each part of the environment (network, application server, database, etc.). Such analysis identifies performance bottlenecks, such as the capacity of a CPU or network device, an application component, or a database tuning parameter. The result of this analysis is summarized and formatted for presentation to developers and management. Examples of recommendations include the tuning of run-time parameters on servers and network devices or the upgrading of hardware to meet expected loads.

Go to Top of this page.
Previous topic this page
Next topic this page

Set screen Testing Various Configurations for Scalability

    This Capacity Planning graph (at right) forecasts capacity consumption over time (32 months across the horizontal axis). The vertical axis shows the application's projected percentage of current capacity (marked here at 100, 200, and 300 percent) at which procurement action is required.

    The top-most curved line represents the highest estimate of resource usage. The lowest estimates of usage are represented by the bottom trend line.

    For example, under the heaviest usage, an additional server should be added before actual usage reaches 100% at month 8 and another should be added before 200% is reached around month 30. However, if the lowest level of usage is actually encountered, no additional server is needed until month 24.

    These curves combine the parameters determined during scalability testing, multiplied by expected product sales growth estimates.
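
    A minimal sketch of projecting when usage crosses such procurement thresholds, assuming simple compound monthly growth; the starting utilization and growth rates are invented, not read from the graph:

        # A minimal sketch of forecasting when usage crosses procurement thresholds
        # under a simple compound-growth assumption; the starting utilization and
        # growth rates are invented examples, not figures from the graph.
        def months_until(threshold_pct: float, start_pct: float, monthly_growth: float) -> int:
            """Return the first month in which projected utilization reaches the threshold."""
            usage, month = start_pct, 0
            while usage < threshold_pct:
                usage *= (1 + monthly_growth)
                month += 1
            return month

        for growth in (0.10, 0.05, 0.03):          # heaviest, middle, lightest estimates
            m100 = months_until(100, start_pct=50, monthly_growth=growth)
            m200 = months_until(200, start_pct=50, monthly_growth=growth)
            print(f"growth {growth:.0%}/mo: add a server by month {m100}, another by month {m200}")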

    This capacity-performance 3D surface [from Neil J. Gunther] predicts the user response based on the number of "m" processors running at various levels of load (load factors).

    Capacity planning saves money by avoiding the expense of too much unused capacity (in specific components or system-wide) and, at the other extreme, avoiding loss of profits from not having enough capacity to meet demand.

    Performance Engineering Laboratory operated by Dr. Liam Murphy at the University College Dublin and Dublin City University

    http://www.ejbperformance.org COMPAS Performance Prediction

Go to Top of this page.
Previous topic this page
Next topic this page

Portions ©Copyright 1996-2011 Wilson Mar. All rights reserved. | Privacy Policy |


Related Topics:
Load Testing Products
Mercury LoadRunner
Mercury LoadRunner Scripting
NT Perfmon / UNIX rstatd Counters
WinRunner
Rational Robot
Free Training!
Tech Support

