CSQE Body of Knowledge areas: Metrics, Measurement, and Analytical Methods
A high maturity organization is expected to use metrics heavily for process and project management. A study was conducted to understand how some high maturity organizations use metrics. This article summarizes the similarities found in the use of metrics, focusing on the metrics infrastructure employed, the use of metrics in project planning, the use of metrics in monitoring and controlling a project, and the use of metrics for the improvement of the overall process.
Key words: defect data, infrastructure, process database, process improvements, project management
Software process improvement (SPI) has emerged as a critical area for an organization involved in software development. Various organizations have reported benefits from software process improvement programs (Arther 1997; Butler 1995; Diaz and Sligo 1997; Dion 1993; Haley 1996; Hollenbach et al. 1997; Humphrey, Snyder, and Willis 1991; Lipke and Butler 1993; Wohlwend and Rosenbaum 1993; Wohlwend and Rosenbaum 1994), and now there is little doubt that process improvement can pay rich dividends.
For SPI, perhaps the most comprehensive and influential framework is the Software Engineering Institute's (SEI) Capability Maturity Model (CMM) for software (Paulk et al. 1995). The CMM classifies the maturity of the organization in five levels, level 1 through level 5, with level 5 being the highest. For the purposes of this article, levels 4 and 5 are considered the high maturity levels. The number of high maturity organizations has been increasing rapidly: from fewer than 20 a few years back, it has grown to more than 129. In high maturity organizations metrics are expected to play a key role in overall process management as well as in managing the process of a project.
As the CMM framework is not prescriptive, it does not specify which metrics should be used or how they should be used. Different organizations may therefore employ different metrics and use them differently. However, as the basic objectives of high quality, high productivity, and short cycle time are the same in all organizations, and because of the common history of metrics and standard practices in other disciplines, there is a good chance there will be similarities in the approaches and metrics used. A general study regarding practices of high maturity organizations found that many similarities do exist in high maturity organizations (Paulk 1999).
The aim of this study is to see the similarities among metrics programs of high maturity organizations and the nature of the similarities. One of the main problems of software process improvement initiatives is not knowing clearly what to do, as was revealed by an SEI survey (Herbsleb and Goldenson 1996; Herbsleb et al. 1997). In high maturity organizations, metrics are expected to play a key role. If the nature of the metrics programs and the similarities among them are determined, this can help other organizations build or improve their own metrics programs and support their quest for high maturity. Providing this input to the community is the primary objective of this study.
Software metrics and measurements have been areas of active interest for a long time. One of the main objectives has been to quantify properties of interest in the process or the products, with the goal that these can be used to evaluate and control the products and processes. Though metrics can be used in many ways, in a software organization, the three main uses of metrics data are: 1) project planning; 2) monitoring and controlling a project; and 3) overall process management and improvement. This study focused on these three uses. To support these uses, some metrics infrastructure is needed. This study also considered the metrics infrastructure in high maturity organizations. These four areas form the core of this study.
For conducting the study, eight high maturity organizations were selected. The author visited these organizations to understand the role and use of metrics. The information collected during these visits was the main source for writing this article. The study found that most of the organizations studied collect similar metrics and have similar metrics infrastructure in place, though the details of the procedures followed in the use of metrics differ somewhat. In the rest of the article the author discusses further details about the similarities in the use of metrics in these organizations.
A questionnaire was prepared that listed the key questions regarding metrics and their use. The questions regarding metrics were grouped in four categories: 1) metrics infrastructure; 2) use of metrics in planning; 3) use of metrics in managing a project's process; and 4) use of metrics in managing and improving the overall process. Besides these, there were some general questions about the organization, its software engineering process group (SEPG), and so on, and some other miscellaneous questions.
The author visited the organizations personally, discussed the questions with some senior members of the SEPG of the organization, and sought clarifications where needed. The questionnaires were then completed by the author. Each organization was given an assurance of complete confidentiality. The completed questionnaires were then sent to respective organizations for validation, and were later used for this article.
To ensure that the outcome of the study is not biased by the interpretation of some assessors, the organizations chosen for the study were those that had different lead assessors. By doing this, the author randomizes the assessor factor.
All of the organizations that were picked are in India, though some of the organizations are development centers of multinationals based in the United States. There are now many high maturity organizations in India (for more information see the SEI Web site at www.sei.cmu.edu). The primary reasons for selecting the organizations from India were the author's contacts and access, and the ease and low cost of conducting the study. Selecting organizations from one country, however, may bring in a cultural bias.
All of the organizations studied were in the software services sector. That is, they executed projects for some customers. Though some of the projects in these organizations were to develop or maintain general-purpose products for the customer, the software development activity was primarily a project driven by customer requirements. The number of software engineers in the organization varied from about 250 to about 3000 (if only a part of the organization was assessed, only the strength of that part is counted). All except one organization were ISO certified, with ISO certification preceding their CMM assessment. That is, all except one seem to have adopted the ISO framework first, and have later moved to the CMM framework. All of the ISO certified organizations continue to maintain their ISO certification.
All organizations had a formally defined unit that was dedicated to, and responsible for, SPI. This unit is frequently referred to as the SEPG. In most organizations the strength of the SEPG was less than 2 percent of the strength of the organization. Only one organization had an SEPG strength of 6 percent, but for this organization the SEPG also did tool and technology development, which no other SEPG did. The main activities of the SEPG generally were quality assurance (verifying that the processes are being followed), training, process analysis and definition, and consultancy. In most of the organizations, for the assurance work, the SEPG played the role of a coordinator, while the actual work of audits was done by people from other parts of the organization. The situation was similar with process definition: the SEPG played the coordinator role, while some task force did the actual definition. Generally, the maximum effort of the SEPG was spent in process deployment related tasks (audits, training, providing help, and so on).
Having an SEPG provides the necessary people support for process-based improvement in the organization. Clearly, having an SEPG is not sufficient. Since a foundation of the process-based approach for software development is that the past performance of a process can be used to predict performance on a project, it follows that data on past performance must be maintained for planning a project. The process database and the process capability baseline form the two key infrastructure elements for making past performance available for use (Jalote 1999).
The author considers a process database in an organization as a database (or a collection of databases) that contains historical information and data about the use of organization processes on (completed) projects. The author views a process database as conceptually separate from databases that may be used for monitoring ongoing projects. The CMM requires that the organization have a process database, which is used for planning, though the CMM does not specify what this process database contains. All the organizations maintained a process database; some had an integrated database while others kept performance data in multiple databases (for example, the review data might be maintained in a separate database). The process database invariably captured information about project size, effort, schedule, defects, and risks.
For size, the data generally kept were estimated size and actual size. Different units for size were used, with lines of code (LOC) being the most common, though function points were also used. Components were also used as a unit of size. Some organizations worked with multiple units. For effort, generally the total effort and the effort spent in different phases were kept. Rework effort was also captured (to help determine the cost of quality). The unit for effort in the database varied from person-hours to person-months, though most organizations captured effort in hours in a project.
For defects, the total number of defects and the distribution of defects with respect to where they were detected were captured. Distribution by severity, category, cause, and so on was also frequently recorded. Most also maintained data about the origin of defects, which helps in computing the defect removal efficiency of various quality control tasks. With origin and detection stages, the defect data for a project become a two-dimensional table (an example can be found in Jalote 1999). For schedule, the main data maintained were the actual and estimated dates so that slippage can be computed. For reviews, effort and defect data were generally recorded separately for self-preparation and the review meeting. This made detailed analysis possible.
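The two-dimensional origin-by-detection table described above can be turned into removal efficiencies with a few lines of code. The following is only a sketch: the stage names and counts are hypothetical, and it assumes one common definition of removal efficiency (defects removed at a stage divided by defects present at that stage), which is not necessarily the definition used by the organizations studied.

```python
# Hypothetical stage names and defect counts, for illustration only.
STAGES = ["requirements", "design", "coding", "unit_test", "system_test"]

# defects[origin][detected] = defects injected in `origin` and found in `detected`.
defects = {
    "requirements": {"requirements": 10, "design": 4, "coding": 2,
                     "unit_test": 1, "system_test": 3},
    "design": {"design": 20, "coding": 8, "unit_test": 5, "system_test": 7},
    "coding": {"coding": 50, "unit_test": 30, "system_test": 20},
}


def removal_efficiency(defects, stages, stage):
    """Defects removed at `stage` divided by defects present at `stage`.

    Assumption: a defect is 'present' at a stage if it was injected at or
    before that stage and was not detected before it.
    """
    idx = stages.index(stage)
    removed = 0
    present = 0
    for origin, row in defects.items():
        if stages.index(origin) > idx:
            continue  # injected after this stage, so not yet present
        for detected, count in row.items():
            d = stages.index(detected)
            if d == idx:
                removed += count
            if d >= idx:
                present += count
    return removed / present if present else 0.0


for s in STAGES:
    print(s, round(removal_efficiency(defects, STAGES, s), 2))
```

Because only defects that were eventually detected appear in the table, efficiencies computed this way tend to be optimistic while a project is still running.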
For risks, generally the risks identified by the project during planning and the risk perception at the end of the project were both captured. Most also maintained information about risk mitigation strategies. Besides these, some organizations kept data such as the number of change requests, number of baselines, number of risks, number of reviews, and so on.
All organizations determined their process capability from past data (it is a goal at level 4 of the CMM). The capability of a process specifies the range of expected outcomes if the process is followed. A capability baseline is essentially a snapshot of the process capability at a time. By computing capability baselines at regular intervals, the trends in process capabilities can be analyzed. Understanding the capability of the process and seeing the trends is one of the main purposes of the capability baseline in most organizations. Some organizations also used it to set overall process improvement goals, while some used it for planning purposes. Most computed the capability of overall process in terms of productivity, quality, distributions, removal efficiencies, and so on, while some also computed percent overrun in cost and schedule, percent of defects found before shipment, and the percentage found after shipment. Capability of component processes was also frequently computed, particularly for the review process.
Organizations specified their capability in various ways. Some used the classical definition of six sigma around the mean, while others specified it as the spread of the past performance data, or simply as the mean and standard deviation. For most characteristics like quality and productivity, the range in the process capability was quite large, which reduced the usefulness of process capability data for planning purposes. To reduce the spread, some organizations defined focused baselines for specific types of projects. For example, an organization defined different capability baselines for development projects for different business domains.
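As an illustration of these ways of stating capability, the sketch below computes a baseline from past project data and then a focused baseline per project type. The field names (`domain`, `productivity`), the sample values, and the reading of "six sigma around the mean" as the mean plus or minus three standard deviations are assumptions made for this example, not details reported by the organizations.

```python
from collections import defaultdict
from statistics import mean, stdev


def capability_baseline(values, k=3.0):
    """Summarize past performance as mean, standard deviation, and a range.

    Assumption: the expected range is taken as mean +/- k standard
    deviations; the observed spread (min, max) is reported alongside it.
    """
    m = mean(values)
    s = stdev(values)  # sample standard deviation
    return {
        "mean": m,
        "stdev": s,
        "lower": m - k * s,
        "upper": m + k * s,
        "min": min(values),
        "max": max(values),
    }


def focused_baselines(projects, group_key, metric, k=3.0):
    """Compute one baseline per project type to narrow the spread."""
    groups = defaultdict(list)
    for p in projects:
        groups[p[group_key]].append(p[metric])
    # A standard deviation needs at least two data points per group.
    return {
        name: capability_baseline(vals, k)
        for name, vals in groups.items()
        if len(vals) >= 2
    }


# Hypothetical closure-analysis records (productivity in size units per person-month).
projects = [
    {"domain": "banking", "productivity": 10},
    {"domain": "banking", "productivity": 12},
    {"domain": "banking", "productivity": 14},
    {"domain": "telecom", "productivity": 20},
    {"domain": "telecom", "productivity": 24},
]

overall = capability_baseline([p["productivity"] for p in projects])
by_domain = focused_baselines(projects, "domain", "productivity")
```

With data like these, the per-domain ranges come out narrower than the overall range, which is the effect the focused baselines are meant to achieve.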
Another organization defined a specific baseline for a customer who was regularly giving development projects involving similar technology, platforms, and business domain. Clearly the process capability baseline can be computed from the data in the process database. And where do the data in the process database come from? As the data in a process database are about completed projects, these data come from a postmortem analysis or a closure analysis of the project. This is the analysis that is done on the project after the delivery has been done. It is an excellent tool for learning and building a knowledge base. The relationship between the process database, the process capability baseline, and closure analysis is shown in Figure 1 (Jalote 1999).
Most organizations, as part of metrics infrastructure, also provide tool support to projects for data entry and analysis. The tools varied from internally developed tools to commercial tools to simple spreadsheets. Data on effort were generally recorded daily and submitted weekly, while data on defects were recorded after each quality control task. These data are used heavily for project monitoring and control. At the end of the project, the data are analyzed and a summary of data is added to the process database.
USE OF METRICS IN PLANNING
The main use of historical data in planning is for effort estimation. Most organizations use the process database directly and use data from similar projects for estimation. A few of the organizations, however, also use productivity data in the capability baseline for estimation (see Figure 1). Organizations generally do not use the computed estimate directly; instead, they use expert judgment or the judgment of the project team to correct the computed estimate and arrive at the final estimate. Almost all use a work breakdown structure (WBS), coupled with a bottom-up approach, to arrive at the final estimate. Few used models like COCOMO (Boehm 1981) or other top-down approaches of determining the overall estimate from the overall size estimate (Basili 1980). Some organizations used multiple methods for estimating the effort for a project, and then used the different estimates to arrive at the final estimate.
For estimating the schedule, most did not use the classical approach of determining the schedule as a function of effort (Boehm 1981; Basili 1980). In all the organizations it was accepted that the schedule might be decided based on the customer's business needs. However, all of the organizations checked the requested schedule based on experience, past data, the WBS, and availability of manpower.
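A minimal sketch of how two such estimates might be placed side by side is given below. The task names, sizes, and effort figures are hypothetical, and the sketch deliberately stops short of producing a final number, since the organizations left that step to expert or team judgment.

```python
# Hypothetical WBS: task name -> estimated effort in person-hours.
wbs = {
    "requirements": 120,
    "design": 200,
    "coding_and_unit_test": 480,
    "system_test": 160,
    "project_management": 90,
}

# Hypothetical records of similar completed projects from a process database.
similar_projects = [
    {"size": 100, "effort": 900},   # size in function points, effort in person-hours
    {"size": 150, "effort": 1500},
]


def bottom_up_estimate(wbs):
    """Sum the task-level estimates of the work breakdown structure."""
    return sum(wbs.values())


def productivity_estimate(size, similar_projects):
    """Estimate effort from size using productivity of similar past projects."""
    total_size = sum(p["size"] for p in similar_projects)
    total_effort = sum(p["effort"] for p in similar_projects)
    productivity = total_size / total_effort  # size units per person-hour
    return size / productivity


estimates = {
    "bottom_up": bottom_up_estimate(wbs),
    "from_past_productivity": productivity_estimate(110, similar_projects),
}
# The two figures are inputs to judgment, not a final estimate.
print(estimates)
```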
Most organizations set some quantitative goals for quality for the project during planning. These quality goals were specified in terms of variation of the actual from the estimated effort and schedule, final quality in terms of defect density delivered, defect injection rate in the project, and so on. To achieve these goals, some intermediate goals were also set. These were frequently in terms of defects to be detected at different stages (Jalote 1999). The estimate of defect levels was for phases, or for a smaller task, like a review or testing. (For the latter, control limits are used and are discussed later.)
The project quality goals frequently were an improvement over the existing performance levels in the organization, and these improvement goals were usually set in the context of overall process improvement goals of the organization. Whenever the project goals were better than the current performance levels, the project plan had some special plans for achieving the goals, as it was recognized that using the standard processes will result in achieving only the existing performance levels. These project-level process improvement plans mostly focused on improvements in quality control activities and defect prevention.
Besides planning for cost, schedule, and quality, risk management was another area where past data were used for developing a plan for a new project. Most used information about the risks and risk mitigation strategies, though some also used the data on probabilities and costs associated with risks.
USE OF METRICS FOR PROJECT MONITORING AND CONTROL
A project must be properly monitored when it is executing. A project is monitored to ensure that it continues to move along a path that will lead to its successful completion. During project execution, how does a project manager know that the project is moving along a desired path? For this, visibility about the true status of the project is required. Since software is invisible, visibility in a software project is obtained through observing its effects (Hsia 1996). Providing proper visibility is the main purpose of project monitoring.
There are two key aspects to project monitoring. The first is to collect information or data about the current state of the project and interpret it to make some judgments about the current state. If the current state is a desired state, implying that the project is moving along the planned path, then monitoring provides this visibility and assurance that things are working fine. But what if the monitoring data reveal that the project is not progressing as planned? Clearly, then it must be followed by some actions to ensure that the course of the project is corrected. That is, some control actions are applied to the project. This is the second aspect of project monitoring: applying proper controls to bring the project back on track. This collecting of data or information to provide the feedback about the current state and then taking corrective actions, if needed, is a basic paradigm of project management and is shown in Figure 2 (Jalote 1999). For project management, this is the main use of software metrics (Brown 1996).
For quantitatively monitoring and controlling a project, two approaches are commonly used, over and above the methods that may be used for general status reporting: 1) analysis at milestones; and 2) ongoing, event-driven analysis.
All of the organizations did milestone analysis. Some organizations did this analysis at some project-defined milestones, while others did this analysis with some periodicity, generally monthly. In this analysis, almost all did an actual vs. estimated analysis of effort and schedule. Some also did a causal analysis of defects found so far. The organizations that predicted defects for phases also did actual vs. estimate analysis for defects. All organizations have set thresholds for acceptable variation in performance from planned. The threshold for effort, for example, varies from 10 percent to 35 percent. The thresholds for schedule are generally lower. In most organizations, these thresholds are set based on experience, comfort level, and past performance data. Besides these, in the milestone analysis risks are reanalyzed, cumulative impact of requirement changes analyzed, and so on.
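The actual vs. estimated check at a milestone amounts to a percentage deviation compared against a threshold. The sketch below assumes hypothetical figures and thresholds chosen within the ranges mentioned above; it is not any organization's actual procedure.

```python
def percent_deviation(actual, estimated):
    """Deviation of actual from estimated, as a percentage of the estimate."""
    return 100.0 * (actual - estimated) / estimated


def milestone_check(actual, estimated, threshold_pct):
    """Return the deviation and whether it exceeds the acceptable threshold."""
    dev = percent_deviation(actual, estimated)
    return dev, abs(dev) > threshold_pct


# Hypothetical milestone data: (parameter, actual, estimated, threshold in percent).
milestone = [
    ("effort_hours", 1200, 1000, 15),
    ("elapsed_days", 63, 60, 10),
]

for name, actual, estimated, threshold in milestone:
    dev, exceeded = milestone_check(actual, estimated, threshold)
    status = "needs analysis" if exceeded else "within threshold"
    print(f"{name}: {dev:+.1f}% ({status})")
```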
Many organizations also applied monitoring and control at the event level to provide better control. The organizations that did ongoing, event-triggered analysis generally used some form of statistical process control (SPC). With SPC, a run chart of some performance parameter is plotted. Based on the past performance data, control limits are established. If a point falls outside the control limits, it is assumed that there is some special reason for it. The case is then analyzed and the special cause found and removed. By doing this the process stabilizes and gives predictable performance (for definitions of SPC, see Montgomery 1996; Wheeler and Chambers 1992; for more discussion on the use of control charts in software, see Florac and Carleton 1999).
A sample control chart for unit testing is shown in Figure 3. Each point represents the performance of a unit test (in terms of defects detected per LOC). Control limits are established based on past performance, such that if the unit testing process is working normally, the point will fall within the control limits. If performance of unit testing of a module falls outside the control limits, it is assumed that there is some special cause for the change in performance. The analysis may consider effort spent, quality of estimation, quality of code, and so on, and may recommend actions like retesting, extra testing later on, revised estimates, and so on.
SPC was used most frequently for code reviews, though it was also used for unit testing and other testing. Use of XMR charts was most common, though simple x-charts were also used (for definition of these charts and other aspects of SPC, see Montgomery 1996; Wheeler and Chambers 1992; and Florac and Carleton 1999). For code reviews, some used organizationwide data to build control charts, while some built a project's control charts from the data of that project's own reviews (the first few reviews were used for this).
The control limits were defined as three sigma in some organizations, while in others they were tighter: one sigma, two sigma, or the spread in data without outliers. Most organizations used some software to help maintain the control chart. The organizations that built global control charts generally built separate charts for different programming languages (for code reviews), and either one control chart for document reviews or different control charts for reviews of different types of documents. The actions taken by most organizations when a point fell outside the control limits included changing the checklist, improving training, and so on to improve the process, and taking actions such as rereview, careful testing in the future, and so on to improve the product.
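For readers unfamiliar with XMR (individuals and moving range) charts, the sketch below shows how the limits are commonly computed. It assumes the standard estimate of sigma from the average moving range (divided by the constant 1.128) and uses hypothetical review data; the `k` parameter stands in for the one, two, or three sigma choices mentioned above.

```python
D2 = 1.128  # standard constant for moving ranges of two consecutive points


def xmr_limits(baseline, k=3.0):
    """Compute individuals-chart limits from baseline observations.

    Sigma is estimated as (average moving range) / 1.128, so k=3 gives the
    familiar limits of center +/- 2.66 * average moving range.
    """
    if len(baseline) < 2:
        raise ValueError("at least two baseline points are needed")
    center = sum(baseline) / len(baseline)
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    sigma_hat = mr_bar / D2
    return {
        "center": center,
        "lcl": center - k * sigma_hat,
        "ucl": center + k * sigma_hat,
        "mr_bar": mr_bar,
    }


def out_of_control(points, limits):
    """Return indices of points that fall outside the control limits."""
    return [
        i for i, x in enumerate(points)
        if x < limits["lcl"] or x > limits["ucl"]
    ]


# Hypothetical defect densities (defects per KLOC) from earlier code reviews.
baseline = [10, 12, 11, 13, 12]
limits = xmr_limits(baseline)            # three-sigma limits
tighter = xmr_limits(baseline, k=2.0)    # a tighter, two-sigma variant

# New reviews are checked against the established limits.
flagged = out_of_control([9, 16, 12], limits)
```

For a quantity such as defect density, a computed lower limit below zero has no physical meaning; whether to clamp it at zero is a choice this sketch leaves open.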
USE OF METRICS FOR PROCESS ANALYSIS AND IMPROVEMENT
With the process capability being computed at regular intervals, trends in process capability over time can easily be analyzed. Most organizations computed the process capability either yearly or half-yearly.
For improving the organization process, most organizations set an organizationwide goal. This goal may be set in terms of improvement in quality and productivity, reducing the variance in performance, reducing the defect injection rate, or improving the review effectiveness. These goals are generally accompanied by some overall strategy. Frequently, each project devises its own plans to meet or beat the organization improvement goals (see previous discussion).
Many organizations were able to quantify improvement. One organization that sets very aggressive goals for its quality improvement has seen a 30 percent reduction per year in its defect injection rate for the last four years. Another organization has seen a reduction of about 50 percent in delivered defect density as well as defect injection rate in a year. One organization reported a 50 percent improvement in productivity in five years, while another saw its effort overruns go down from 30 percent to 20 percent and then further to 10 percent within two years.
Some organizations were using metrics consistently to evaluate technology and process initiatives for improvement. These organizations had a defined procedure for first implementing a change or new technology in one or more pilot projects, setting quantitative improvement goals for this pilot, and then evaluating the pilot against the goals. The evaluation of results from a pilot was used to decide how deployment was to proceed, for example, whether the entire organization should use it, or only some types of projects, and so on.
At high maturity organizations, metrics are expected to play a key role. Though metrics data are collected, and even used, at levels 2 and 3, from level 4 on metrics are expected to play a key role in overall process management as well as in managing the process of a project. The aim of this study is to see the similarities of the metrics programs of high maturity organizations, and the nature of the similarities, with the hope that this information can help other organizations build or improve their own metrics programs and support their quest for high maturity.
In a software organization, metrics data are used for project planning, monitoring and controlling a project, and overall process management and improvement. To support these uses, some metrics infrastructure is needed. For conducting the study, the author selected a few high maturity organizations. He visited these organizations personally to understand the role and use of metrics. The information he collected was the main source for writing this article. One of the criteria for selecting the organizations to be included in the study was that their lead assessors should be different, so that the view of the assessor does not bias the study.
Most of the organizations studied collect similar data that focus on effort, defects, size, and schedule. Most organizations have a process database that maintains metrics data for completed projects. Capability of the process is determined from the past data. For planning, past metrics data are used for effort and schedule estimation, though they have also been used for setting quantitative quality goals. For monitoring a project, all organizations have a regular metrics analysis that focuses on estimated vs. actual values of the parameters that have been estimated in the project plan. Many organizations have enhanced this by applying SPC techniques to smaller tasks like reviews, unit testing, and so on.
For overall process management, most organizations analyze past data to see the trends in different parameters. Some also use it to set organization goals for improvement. Different types of improvements have been observed by these organizations. Improvements were observed in terms of reduction in delivered defect density, improved productivity, reduction in defect injection rates, reduction in effort overruns, and so on.
The author is grateful to the eight companies that participated in the study, and to their representatives who read the initial report and provided valuable feedback.
REFERENCES
Arther, L. J. 1997. Quantum improvements in software system quality. Communications of the ACM 40, no. 6 (June): 47-52.
Basili, V. R. 1980. Tutorial on models and metrics for software management and engineering. Washington, D. C.: IEEE Press.
Boehm, B. 1981. Software Engineering Economics. Englewood Cliffs, N. J.: Prentice Hall.
Butler, K. L. 1995. The economic benefits of software process improvement. Crosstalk (July): 10-19.
Brown, N. 1996. Industrial-strength management strategies. IEEE Software (July): 94-103.
Diaz, M., and J. Sligo. 1997. How software process improvement helped Motorola. IEEE Software 14, no. 5 (September): 75-81.
Dion, R. 1993. Process improvement and the corporate balance sheet. IEEE Software (July): 28-35.
Florac, W. A., and A. D. Carleton. 1999. Measuring the Software Process: Statistical Process Control for Software Process Improvement. Reading, Mass.: Addison Wesley.
Haley, T. J. 1996. Software process improvement at Raytheon. IEEE Software (November): 33-41.
Herbsleb, J. D., and D. R. Goldenson. 1996. A systematic survey of CMM experience and results. 18th International Conference on Software Engineering, Berlin.
Herbsleb, J. et al. 1997. Software quality and the capability maturity model. Communications of the ACM 40, no. 6 (June): 31-40.
Hollenbach, C. et al. 1997. Combining quality and software improvement. Communications of the ACM 40, no. 6 (June): 41-45.
Hsia, P. 1996. Making software development visible. IEEE Software (May): 23-26.
Humphrey, W., T. R. Snyder, and R. R. Willis. 1991. Software process improvement at Hughes Aircraft. IEEE Software (July): 11-23.
Jalote, P. 1999. CMM in Practice: Processes for Executing Software Projects at Infosys. Reading, Mass.: Addison Wesley.
Lipke, W. H., and K. L. Butler. 1993. Software process improvement: a success story. Crosstalk (November): 29-39.
Montgomery, D. C. 1996. Introduction to Statistical Quality Control, third edition. New York: John Wiley & Sons.
Paulk, M., et al. 1995. The Capability Maturity Model: Guidelines for Improving the Software Process. Reading, Mass.: Addison Wesley.
Paulk, M. C. 1999. Practices of high maturity organizations. 1999 SEPG Conference, Atlanta, March.
Wheeler, D. J., and D. S. Chambers. 1992. Understanding statistical process control. Knoxville, Tenn.: SPC Press.
Wohlwend, H., and S. Rosenbaum. 1993. Software improvements in an international company. 15th International Conference on Software Engineering, Baltimore, Md.
Wohlwend, H., and S. Rosenbaum. 1994. Schlumberger's software process improvement program. IEEE Transactions on Software Engineering, 20:11, November.
Pankaj Jalote is a professor and head of the department of computer science and engineering at the Indian Institute of Technology, Kanpur. Previously he was an assistant professor in the department of computer science at the University of Maryland. He has a bachelor's degree from IIT Kanpur and a doctorate in computer science from the University of Illinois at Urbana-Champaign in 1985. From 1996 to 1998 he was vice president of quality at Infosys Technologies Ltd., a large Bangalore-based company providing software solutions worldwide, where he spearheaded Infosys' successful transition from ISO to CMM.
Jalote is the author of CMM in Practice: Processes for Executing Software Projects at Infosys and of the upcoming book Software Project Management in Practice. He previously authored two other books, An Integrated Approach to Software Engineering and Fault Tolerance in Distributed Systems. Jalote is a member of IEEE. He can be reached at email@example.com.