Load, Stress, Performance Test Terms, Deliverables, Profiles and Reports
This page presents the formatting and presentation of a sample performance profile. It is a companion to the pages on Performance Testing and on planning each aspect of an application's performance, based on the statistics and graphs created by load testing tools such as LoadRunner executing scripts.
Introduction: Our Flow of Deliverables
The project objective is for the project team to present Recommendations to address (in an actionable manner) the Concerns that various stakeholders have about IT support of specific Business Processes. Stakeholders' concerns are clarified into a set of questions that can be answered based on factual metrics that signal whether pre-determined business goals and technical requirements are being met. Action recommendations are based on statistical Conclusions arrived at through analysis of results generated from several types of performance testing. Our rigorous approach applies procedures guided by a set of technical objectives and budgets.
Definition of Terms
Results from a test run (such as the statistical reports and graphs generated by LoadRunner) are the values obtained from measuring the impact of a specific set of run conditions. Conclusions (such as these) are subjective decisions (a proposition or claim) reached after (hopefully) thoughtful consideration of the facts drawn from the evidence provided. In formal statistics, a conclusion evaluates a prior hypothesis, which is either accepted (confirmed) or rejected based on the outcome of experiments. Conclusions are presented organized by the questions and by the forms/types of performance tests. A finding is a determination about the scope, validity, and reliability of observed facts (data). Example:
This statement limits its scope to:
Findings provide the premises (the "truths" or evidence) that form the basis for making conclusions. All this is the path to a well-reasoned approach to the management of Performance, Scalability, and Reliability (PSR).
Concerns, Questions, Metrics, and Goals
Here are the most common concerns. Each may have different importance depending on the organizational context.
This format is based partly on the Goal/Question/Metric (GQM) method, a practical method for quality improvement of software development described by Rini van Solingen and Egon Berghout (McGraw-Hill, 1999) at www.gqm.nl
Project Technical Objectives to Address Concerns
The project objectives addressing the concerns which prompted this project are: As a practical matter, bugs in Configuration, Installation, Security, and Failover/Recovery (Robustness) often need to be resolved before conducting Performance, Load, and Stress testing:
Project Budget Performance
The baseline developed from project planning efforts provides an "early warning" indicator of the percentage completion progress of the effort.
Issues (Discoveries and Recommendations)
This table presents the concerns which initiated the project together with the subsequent example observations and discoveries found during load profiling and analysis.
To better manage follow-up, action items may be entered into a "defect" tracking system or task/project management system.
Business Processes
# Steps provides, for each business process, a count of its user dialogs (the number of "round trips" to the server after the user clicks a submit button or a link). The link provided with each number goes to a list of the dialogs and the names of the transaction measurements.
Iteration Time is the total amount of time needed to complete all steps of the business process. (This can be obtained from VuGen during load script development.)
TPM/User (Transactions Per Minute per User) is the TPS (Transactions Per Second) multiplied by 60.
Peak # Users is the peak (largest) number of users that may perform that process all at once, such as (in the case of login) each work-day morning, and (in the case of business processes) around each accounting period-end.
Max # Users is the maximum number of users that could possibly use the specific process all at one time.
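As a minimal sketch of how these figures relate arithmetically (the step count, iteration time, and peak user count below are invented numbers, not values from an actual profile), TPS/User, TPM/User, and the peak transaction rate can be derived as follows:

```c
#include <stdio.h>

/* Illustrative numbers only -- substitute figures from your own VuGen runs. */
int main(void)
{
    int    steps            = 12;      /* # Steps: round trips in the business process  */
    double iteration_time_s = 180.0;   /* Iteration Time: seconds to complete all steps */
    int    peak_users       = 50;      /* Peak # Users performing the process at once   */

    /* TPS/User: transactions (steps) completed per second by one user */
    double tps_per_user = steps / iteration_time_s;

    /* TPM/User: transactions per minute per user = TPS multiplied by 60 */
    double tpm_per_user = tps_per_user * 60.0;

    /* Peak transaction arrival rate the servers must sustain */
    double peak_tpm = tpm_per_user * peak_users;

    printf("TPS/User : %.3f\n", tps_per_user);
    printf("TPM/User : %.2f\n", tpm_per_user);
    printf("Peak TPM : %.2f\n", peak_tpm);
    return 0;
}
```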
Observations
Several metrics that affect the performance and capacity of an application can be obtained even before load testing runs are completed. These metrics need to be measured manually, with a stopwatch (or maybe a calendar).
Environment Complexity
We use a test log spiral notebook to record:
Backup Imaging Time
This affects the amount of time for testing.
Image Restore Time
Recovery/Reboot Time
We use a test log spiral notebook to record:
Failover Time
This is determined during Failover testing.
Failback Time
This is determined during Failover testing.
Graphs and Dashboards
Using Microsoft Excel is a two-edged sword. I prefer it because it is the most common package; I don't have to beg my employer to buy it to get my work done. Since some companies may not want to pay for specialized visualization software, I may be stuck using Excel anyway. Because of its power, Excel can be difficult to master, but I can show you how it can be done. After an initial investment of a few hours, you will develop an impressive skill that you'll take with you.
Analytics
While adequate, LoadRunner does not provide the dynamic presentation features found in data visualization software packages. "Visual analytics" apps work by reading data from ordinary Microsoft Excel spreadsheet files into files that PowerPoint, PDF, and Flash-enabled web pages use to enable interactive exploration of graphic data, automatically switching graphic presentations in real-time response to variables specified by moving slider bars, accordion menus, and other "spiffy" user interfaces. Packages from several vendors enable wider sources of data, such as XML from web services and direct connection to Oracle or SQL databases. For consistency:
Time sliders move to the left for more historical data and to the right for more recent data.
Horizontal sliders or circular speedometers (like the iPod wheel) are used to specify various levels of load on the servers.
Pull-down selectors are provided (instead of sliders) to specify non-continuous items such as departments.
Results from Each Measurement Run
Performance results displayed in the "Raw Speed" table below were collected from the start of script development efforts: Our scripts are coded so that statistics are captured for each action run with a single user. These numbers can potentially be used by load scripts to detect anomalies in responses during runs, such as issuing a message if fewer bytes are downloaded than expected for a particular page.
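As an illustration of that kind of check, here is a hedged sketch of a LoadRunner web (HTTP/HTML) Vuser Action that compares the bytes actually downloaded, obtained via web_get_int_property, against an expected minimum. The transaction name, URL, and 20 KB threshold are assumptions made for the example, not values from the actual scripts.

```c
/* Sketch of a VuGen Action section; LoadRunner supplies the lr_ and web_
   function declarations when this runs inside a web (HTTP/HTML) Vuser. */
Action()
{
    int download_size;

    lr_start_transaction("T01_Home_Page");      /* illustrative transaction name */

    web_url("Home",
        "URL=http://yourserver.example.com/",   /* placeholder URL */
        "Resource=0",
        "Snapshot=t1.inf",
        LAST);

    /* Bytes downloaded by the step that just ran */
    download_size = web_get_int_property(HTTP_INFO_DOWNLOAD_SIZE);

    if (download_size < 20000) {                /* 20 KB: assumed minimum for this page */
        lr_error_message("Home page returned only %d bytes -- possible error page.",
                         download_size);
        lr_end_transaction("T01_Home_Page", LR_FAIL);
    }
    else {
        lr_end_transaction("T01_Home_Page", LR_PASS);
    }

    return 0;
}
```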
Raw Speed
To better visualize the statistics, this bar chart ranks transactions. For each item, the graph should be generated from a run at a single pace (the same number of virtual users) throughout the run.
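Below is a minimal sketch of the ranking behind such a bar chart; the transaction names and average response times are invented examples. Transactions are simply sorted by average single-user response time, slowest first.

```c
#include <stdio.h>
#include <stdlib.h>

/* Average single-user response time for one measured transaction. */
typedef struct {
    char   name[32];
    double avg_resp_s;
} Txn;

/* Sort descending so the slowest transaction tops the ranked bar chart. */
static int cmp_slowest_first(const void *a, const void *b)
{
    double d = ((const Txn *)b)->avg_resp_s - ((const Txn *)a)->avg_resp_s;
    return (d > 0) - (d < 0);
}

int main(void)
{
    Txn txns[] = {          /* invented example measurements */
        {"T03_Search",   2.40},
        {"T01_Login",    1.10},
        {"T05_Checkout", 3.75},
        {"T02_Browse",   0.85},
    };
    int i, n = (int)(sizeof(txns) / sizeof(txns[0]));

    qsort(txns, n, sizeof(Txn), cmp_slowest_first);

    for (i = 0; i < n; i++)
        printf("%-14s %5.2fs\n", txns[i].name, txns[i].avg_resp_s);
    return 0;
}
```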
Consistency of Response Time Speed
This line chart presents the results of a run at the same conditions over several hours. Data values for these types of charts need to be presented at the lowest granularity (such as once per second, as shown here); otherwise, individual spikes would be averaged in and thus not appear. The mean time between failure (MTBF) statistic is calculated by dividing the number of spikes observed into the length of the observation period (such as 8 hours). To analyze why, we drilled down to the small time frame specific to when the "blip" occurred on various servers. Contention testing is often necessary to identify occasional spikes in response time.
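For example, the MTBF arithmetic works out as in this small sketch (the 8-hour window and the spike count are assumed numbers):

```c
#include <stdio.h>

int main(void)
{
    double observation_hours = 8.0;   /* length of the observed run            */
    int    spikes_observed   = 4;     /* response-time spikes ("failures") seen */

    /* MTBF = observation period / number of spikes */
    double mtbf_hours = observation_hours / spikes_observed;

    printf("MTBF: one spike roughly every %.1f hours\n", mtbf_hours);
    return 0;
}
```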
Speeds at Various Data Loads
This question is answered with Data Volume Testing, when the maximum amount of data expected is loaded on the system so that its impact can be measured. Volume testing is especially important to measure database performance because different size datasets require different indexing and caching strategies for maximum efficiency. Adding indexes to large datasets is the most common approach to improving performance from databases. On the other hand, indexing a small and frequently referenced dataset can actually slow processing speed.
More on Oracle database architecture and performance
Speeds at Various User Loads
Two approaches to running Stress Tests were used to answer this question:
This more sophisticated (some may say overly complex) visualization is the "High-Avg-Low" chart (formatted using MS-Excel), which provides averages, medians, and variation statistics at each level of load (rather than combined together as with the first type of run). Statistics from the first type of run are less useful because, by default, run averages include the spikes at the end. Data values can be filtered to the specific time period of interest, but "ramp-up" effects are included at every point. Results from stair-step type runs are more realistic to actual patterns of usage. More importantly, the stair-step approach provides information about the variability of response time at each step. Drop-down selections (or forward and backward buttons) are provided to see the impact of varying run conditions (such as different configurations, different versions of software, different installations of hardware, etc.).
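The following sketch illustrates, with invented sample data, the calculation behind such a chart: response-time samples are grouped by the number of virtual users active at each stair step, then the average, median, and standard deviation are computed for each group.

```c
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

/* One response-time sample taken at a given stair-step load level. */
typedef struct {
    int    vusers;          /* virtual users active when the sample was taken */
    double resp_time_s;     /* transaction response time in seconds           */
} Sample;

static int cmp_double(const void *a, const void *b)
{
    double d = *(const double *)a - *(const double *)b;
    return (d > 0) - (d < 0);
}

/* Print average, median, and standard deviation for one load level. */
static void report_step(int vusers, double *times, int n)
{
    double sum = 0.0, sumsq = 0.0, avg, median, stddev;
    int i;

    for (i = 0; i < n; i++) sum += times[i];
    avg = sum / n;

    for (i = 0; i < n; i++) sumsq += (times[i] - avg) * (times[i] - avg);
    stddev = sqrt(sumsq / n);

    qsort(times, n, sizeof(double), cmp_double);
    median = (n % 2) ? times[n / 2] : (times[n / 2 - 1] + times[n / 2]) / 2.0;

    printf("%4d vusers: avg %.2fs  median %.2fs  stddev %.2fs\n",
           vusers, avg, median, stddev);
}

int main(void)
{
    /* Invented samples from three stair steps (25, 50, 75 virtual users). */
    Sample samples[] = {
        {25, 1.1}, {25, 1.3}, {25, 1.2}, {25, 1.4},
        {50, 1.8}, {50, 2.1}, {50, 1.9}, {50, 2.4},
        {75, 3.0}, {75, 3.6}, {75, 2.9}, {75, 4.2},
    };
    int steps[] = {25, 50, 75};
    int s, i, n_samples = (int)(sizeof(samples) / sizeof(samples[0]));

    for (s = 0; s < 3; s++) {
        double times[16];
        int n = 0;
        for (i = 0; i < n_samples; i++)
            if (samples[i].vusers == steps[s])
                times[n++] = samples[i].resp_time_s;
        report_step(steps[s], times, n);
    }
    return 0;
}
```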
Run Longevity
Conducting a longevity run over 22 hours identified the average response times in this graph. If there is time for only one run, this statistic should be obtained from a run at a high (but still sustainable) load level. The Variability statistic is measured using the standard deviation calculation. The curiosity here is whether there is a statistically valid trend of responsiveness improving or degrading over time. To analyze trends, we can use accordion menus to view consolidated and detailed views of specific time frames.
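One way to check for such a trend, sketched below with fabricated hourly averages rather than results from the actual 22-hour run, is to compute the standard deviation (the Variability statistic) and fit a least-squares line through the hourly averages; a slope near zero suggests responsiveness is holding steady, while a clearly positive slope suggests gradual degradation.

```c
#include <math.h>
#include <stdio.h>

int main(void)
{
    /* Fabricated hourly average response times (seconds) over a long run. */
    double resp[] = {1.20, 1.22, 1.19, 1.25, 1.24, 1.27, 1.26, 1.30,
                     1.29, 1.31, 1.33, 1.32, 1.35, 1.34, 1.38, 1.37,
                     1.40, 1.39, 1.42, 1.44, 1.43, 1.46};
    int n = (int)(sizeof(resp) / sizeof(resp[0]));   /* 22 hourly points */
    int i;

    /* Mean and standard deviation (the Variability statistic) */
    double sum = 0.0, mean, sumsq = 0.0, stddev;
    for (i = 0; i < n; i++) sum += resp[i];
    mean = sum / n;
    for (i = 0; i < n; i++) sumsq += (resp[i] - mean) * (resp[i] - mean);
    stddev = sqrt(sumsq / n);

    /* Least-squares slope of response time vs. hour (seconds per hour) */
    double x_mean = (n - 1) / 2.0, sxy = 0.0, sxx = 0.0, slope;
    for (i = 0; i < n; i++) {
        sxy += (i - x_mean) * (resp[i] - mean);
        sxx += (i - x_mean) * (i - x_mean);
    }
    slope = sxy / sxx;

    printf("Mean %.3fs, stddev %.3fs, trend %+.4f s/hour\n", mean, stddev, slope);
    return 0;
}
```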
All trademarks and copyrights on this page are owned by their respective owners. The rest ©Copyright 1996-2011 Wilson Mar. All rights reserved.