This paper is the third in a sequence defining Evaluated Product Maturity (EPM). EPM is designed to assist in assessing the risk of breaking or diminishing the functionality of a system-of-systems functional thread by modifying, changing, or replacing one or more of the systems that make up that thread. The sequence of papers provides the basis for computing EPM using BTI’s automated systems and software data science capabilities.
The first installment defined the purpose, background, and general approach for computing the EPM score.
The second installment defined the factors that data collected from a wide array of systems indicates are critical in evaluating the risk that changes to a system will break or diminish the functionality of that system.
The third installment, which you are currently reading, discusses how data is automatically captured for processing by the EPM tool and how the calculations are performed.
The fourth and final installment provides an example use case for EPM.
For Sections 1 through 4, see
Of the sixteen Fundamental Metrics identified, only twelve are used in the calculation of the product’s EPM. They are listed below. Eight of the metrics are reported on a per-release basis; the others are reported as the most recently captured value.
Most of these metrics are available directly through existing tools; in those cases, no detailed formula (or discussion) is needed. For the others, the tools may provide only the base measures used to create the composite metric, or additional processing may be needed to obtain the actual metric. For example, DF, and hence MDF, is not a metric that is generally reported, but the base measure used to calculate DF (Time of Release) is. A simple calculation of the difference between the times of the last two releases provides a value for DF. The following sections provide detailed formulas for calculating the metrics from base measures; where no actual calculation is necessary, a detailed description is not provided.
Throughout the discussion of data capture and analysis, a release is defined as a released version of the software: the version in which a project plans to deliver new features, defect repairs, or both, to end users and customers.
Since release names can be non-specific, even within a single project, statistics need to be captured based on the release type that contains the most records, although the raw data for each release group is retained. For example, project A has decided to track new requirements using a version numbering scheme that follows the x.y.z notation, where versions are identified numerically and the numbers are assigned in increasing order. The ‘x’ represents a major version, ‘y’ a minor version, and ‘z’ a patch or bug fix. Releases then follow in time like the following:
1.0.0, 1.1.0, 1.2.0, 1.2.1, 1.2.2, 1.3.0, 2.0.0, …
However, during that release series there were also some priority bug fixes, which were identified using a different nomenclature for the release name. These were interspersed with the normal releases as shown below:
1.0.0, 1.1.0, 1.2.0, 1.2.1, BugFix_1, BugFix_2, 1.2.2, 1.3.0, 2.0.0, …
In this case, the argument is that the x.y.z series (based on the SemVer versioning scheme) is the primary release pattern, and so the TCR is calculated based on that series of data. Data files containing both types of releases are maintained in case offline analysis is desired.
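As an illustration, the grouping can be automated with a few lines of R (the language used by the prototype environment described later in this paper). This is a minimal sketch: the release names and dates are made up, and the SemVer regular expression and the “largest group wins” rule simply restate the logic described above.

```r
# Sketch: group release names by pattern and keep the largest group.
# The data below is illustrative only.
releases <- data.frame(
  name = c("1.0.0", "1.1.0", "1.2.0", "1.2.1", "BugFix_1", "BugFix_2", "1.2.2"),
  releaseDate = as.POSIXct(c("2018-01-05", "2018-02-02", "2018-03-09",
                             "2018-03-23", "2018-03-30", "2018-04-06",
                             "2018-04-20")),
  stringsAsFactors = FALSE
)

is_semver <- grepl("^[0-9]+\\.[0-9]+\\.[0-9]+$", releases$name)
releases$group <- ifelse(is_semver, "semver", "other")

# The group with the most records becomes the primary release series;
# the raw data for every group is retained for offline analysis.
primary_group <- names(which.max(table(releases$group)))
primary <- releases[releases$group == primary_group, ]
write.csv(releases, "all_release_groups.csv", row.names = FALSE)
```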
Although a number of industry-standard tools can be used to capture these metrics, the information provided here is based on the most commonly used tools, namely Atlassian’s Jira, SonarQube™, Git variants (such as Git, GitLab, and GitHub), and Microsoft Project, and outlines a prototype environment for use in initial piloting of the product maturity effort. The data is captured and analyzed using the R programming language together with a Shiny web interface, with Excel used for dashboard prototyping.
The following sections describe the requirements for the last step in the overall process, monitoring the data, and contain the guidance needed to develop a data capture and analysis system that accomplishes the goal of evaluating the EPM of monitored products and threads. The relevant section of the complete process is shown in Figure 7, with the pertinent areas highlighted in green.
Figure 7 – Main Process
The twelve fundamental metrics needed for the calculation of the EPM are listed in Table 21. These metrics are categorized into one of three major Metric Groups: Quality, Performance, and Software Vulnerabilities. The standard tools used to capture each metric are also listed in the table. Each metric is captured and evaluated at every release of a product.
Table 21 – Fundamental Metrics with Weights
Metric Group | Metric | Metric Name | Tool
---|---|---|---
Quality (MSIQ) | TC | Test Coverage | SonarQube™
 | Comp | Code Complexity | SonarQube™
 | DD | Defect Density | SonarQube™
 | CSD | Code Smell Density | SonarQube™
Performance (MSIP) | MDF | Mean Deployment Frequency | Jira
 | MCLT | Mean Change Lead Time | Jira
 | MTTR | Mean Time To Repair | Jira
 | NTR | New Tickets Received | Jira
 | TCR | Ticket Closure Rate | Jira
 | MAUC | Mean Active User Count | ToolMetrics
 | PvAP | Plan vs. Actual | Jira, MS Project
Software Vulnerabilities (MSIV) | VD | Vulnerability Density | SonarQube™
Because of the need to provide a consistent, standardized method of capturing the required metrics, many of the measures need to be adjusted, or normalized. We have found that many products have taken advantage of the customization Jira allows, including the addition of many user-defined fields. Since there is no apparent standard way of standing up an instance of Jira, there is a lack of uniformity in the selection of key fields contained, used, and propagated within a product’s Jira environment. Each section below describes the mechanisms, procedures, and processes for normalizing the captured data so that a consistent representation of the data, and subsequent analysis, can be realized.
The metrics for measuring product quality come completely from SonarQube™, via the Nymbul Treasure instance.
Each of the four Quality Metrics is easily captured via the RESTful interface to SonarQube™. However, they change whenever a product executes a SonarQube™ analysis (as new code is developed, changes are made, etc.). Since data must be captured at product release, there needs to be a determination of when a release occurs and an appropriate trigger to run the data capture for the SonarQube™ metrics.
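The pull itself can be a single REST call. The sketch below, in R, assumes a SonarQube™ server URL, project key, and access token (all placeholders) and uses the standard `api/measures/component` endpoint; the metric keys shown are common SonarQube™ keys for the base measures behind TC, Comp, DD, and CSD, but a given instance may use different or additional keys.

```r
library(httr)
library(jsonlite)

# Sketch: pull quality base measures over the SonarQube Web API at release time.
# The server URL, project key, and token are placeholders.
sonar_url   <- "https://sonarqube.example.org"
project_key <- "my-product"

resp <- GET(
  paste0(sonar_url, "/api/measures/component"),
  query = list(component  = project_key,
               metricKeys = "coverage,file_complexity,bugs,code_smells,ncloc"),
  authenticate("<token>", "")   # token-based auth: token as user, empty password
)
measures <- fromJSON(content(resp, as = "text", encoding = "UTF-8"))$component$measures
print(measures)   # base measures used for TC, Comp, DD, and CSD
```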
Test Coverage (TC) is a singular metric and will be reported as the percentage of code covered by unit tests. This is sometimes known (and reported) as overall coverage.
Code Complexity (Comp) is a singular metric and will be reported as the average cyclomatic complexity per file.
Defect Density (DD) and Code Smell Density (CSD) are composite metrics. Since SonarQube™ provides five levels of severity for Bugs and Code Smells, these metrics will be reported as described in the mapping below:
Performance is centered on the ability of a product team to provide their product to end users and customers with a high degree of quality and in a timely manner. It is not a measure of how well the product performs in terms of reliability and availability, although those measures can be captured in future instantiations of the EPM model. Most of the data needed is readily available in Jira.
Most of the base measures needed to calculate the Performance Metrics are captured via the RESTful interface to Jira. The Mean Active User Count and Plan vs. Actual are available via other mechanisms.
Additional processing is required to combine the base measures obtained via Jira into the metrics reported on the dashboard and used in the calculation of the EPM. A sample size is used to provide normalization across metrics and projects. Typically, a sample size (ss) of 10% of the most recent releases is adequate, providing a rolling average for the final mean calculation. For measures where the number of releases is less than a specified minimum value (default 8), the whole set of data is used to calculate the mean.
The Deployment Frequency (DF) is measured as the rate at which deployments are delivered to the customer: a time value capturing the number of hours between consecutive product releases. The Mean Deployment Frequency (MDF) is the mean of the DF values across all product releases.
The Release Identifier (name) and Release Date (releaseDate) are both captured via the following RESTful call to Jira:
https://location-of-jira-db/jira/rest/api/latest/project/<projectname>/versions
For each new release, r, calculate the Deployment Frequency (DF) and the Mean Deployment Frequency (MDF). The Release Identifier is captured for reference and is required in the following section.
Base Measure | Definition | Data Type |
---|---|---|
r | Release number | Integer |
Release Date | Date of the release | Date/Time |
n | Number of releases | Integer |
ss | Sample Size | Integer |
Hrs/Day | Number of hours in a workday (default = 8) | Float |
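As a minimal sketch of this calculation in R: assuming the versions returned by the call shown earlier have been parsed into a data frame `releases` with `name` and `releaseDate` columns, DF is the elapsed time in hours between consecutive releases and MDF is the mean over the sample described earlier (10% of the most recent releases, or the whole set when fewer than 8 releases exist).

```r
# Sketch: DF and MDF from the Jira versions data.
# Assumes `releases` has a POSIXct `releaseDate` column.
releases <- releases[order(releases$releaseDate), ]
n_rel <- nrow(releases)

# DF for release r = hours since the previous release (undefined for the first).
releases$DF <- c(NA, as.numeric(difftime(releases$releaseDate[-1],
                                          releases$releaseDate[-n_rel],
                                          units = "hours")))

# Sample size: 10% of the most recent releases, or the whole set when the
# number of releases is below the minimum (default 8), per the rule above.
ss <- if (n_rel < 8) n_rel else max(1, ceiling(0.10 * n_rel))

mdf <- mean(tail(releases$DF, ss), na.rm = TRUE)
```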
The Change Lead Time is the time from the start of a development cycle to when the new feature is deployed to the customer base. It is a measure of how long it takes a feature to reach the user base once the feature has been accepted for incorporation into the product. Since there will usually be more than one new feature request in a given deployment, the Average Change Lead Time (ACLT) is calculated. It is the average CLT of all the new features during a given deployment and represents the average time it takes to incorporate a new feature request. In addition, the Mean Change Lead Time (MCLT) is the average of ACLTs across all releases of the product. CLT is sometimes referred to as cycle time. CLT and MCLT are measured in hours.
Jira classifies each issue with an issue type, and only specific issue types are considered new capabilities (Features in this paper) or Defects. As mentioned earlier, the lack of consistency among the various projects within Jira necessitates a mapping of the Issue Types that collectively represent a feature. Section 6.2.1.2.1 provides the rationale for identifying a Created Issue as a Feature or a Defect.
On November 19, 2018, an analysis of Issue Type utilization within a publicly available Jira instance was conducted to evaluate the extent of issue type usage and to provide guidance in determining a mapping algorithm that can be used to identify issue types that represent features. This analysis showed that the Jira instance contains 371 unique Issue Types, but the vast majority are used by only a very small number of projects. The top ten Issue Types for each project type are shown in Table 22. Table 23 shows all Issue Types that at least 10 projects have identified. In each case, the numbers represent the number of projects that have identified the Issue Type for use within their Jira environment.
Table 22 – Top 10 Issue Types by Project Domain
Business | Projects | Service Desk | Projects | Software | Projects
---|---|---|---|---|---
Task | 109 | Sub-task | 46 | Sub-task | 476
Sub-task | 106 | Bug | 41 | Task | 476
Action | 57 | Epic | 34 | Epic | 465
Change | 54 | Task | 34 | Bug | 437
Epic | 16 | New Feature | 31 | Story | 395
Bug | 7 | Help Ticket | 21 | New Feature | 234
New Feature | 7 | FeedBack | 20 | Improvement | 214
Improvement | 5 | Story | 20 | Requirement | 167
Initiative | 5 | Other | 19 | Enhancement | 39
Story | 5 | RFI | 17 | Research | 28
Table 23 – Number of Issue Types used across all projects
Because of the disparate definitions of what represents a bug, new feature, requirement, etc., a mapping is necessary to capture all the various issue types that have a common meaning. Jira comes with a small set of default issue types, as described in Table 24. Table 25 provides the suggested list of issue types along with a mapping of projects’ issue types into two categories: Feature and Defect. Of the non-default issue types, ‘New Feature’, ‘Improvement’, and ‘Requirement’ are heavily used by projects.
Table 24 – Jira Default Issue Types
Group | Issue Type | Description
---|---|---
Jira Core | Task | Task that needs to be done
 | Subtask | Smaller task within a larger piece of work
Jira Software | Story | Functionality request expressed from the perspective of the user
 | Bug | Problem that impairs product (or service) functionality
 | Epic | Large piece of work that encompasses many issues
Jira Service Desk | Incident | System outage or incident
 | Service request | General request from a user for a product or service
 | Change | Rollout of new technologies or solutions
 | Problem | Track underlying causes of incidents
Table 25 – Suggested Issue Types and Mapping
Issue Type | Number of Projects Using (11/16/18) | Percent of Total Projects (679 Total) | Mapping |
---|---|---|---|
Bug | 485 | 71% | Defect |
Change | 64 | 9% | |
Defect | 31 | 5% | Defect |
Enhancement | 46 | 7% | Feature |
Epic | 515 | 76% | Feature |
Improvement | 231 | 34% | Feature |
Incident | 7 | 1% | |
New Feature | 272 | 40% | Feature |
Problem | 19 | 3% | Defect |
Requirement | 178 | 26% | Feature |
Service request | 12 | 2% | |
Story | 420 | 62% | Feature |
Subtask | 628 | 92% | |
Task | 619 | 91% | |
User Story | 20 | 3% |
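The mapping in Table 25 can be expressed as a simple lookup. The sketch below is in R; the helper name is illustrative, and issue types that carry no Feature/Defect mapping return NA.

```r
# Sketch: map raw Jira issue types onto the Feature/Defect categories of Table 25.
issue_type_map <- c(
  "Bug"         = "Defect",
  "Defect"      = "Defect",
  "Problem"     = "Defect",
  "Enhancement" = "Feature",
  "Epic"        = "Feature",
  "Improvement" = "Feature",
  "New Feature" = "Feature",
  "Requirement" = "Feature",
  "Story"       = "Feature"
)

map_issue_type <- function(raw_type) {
  unname(issue_type_map[raw_type])   # NA for issue types with no mapping
}

map_issue_type(c("Bug", "Story", "Task"))   # "Defect" "Feature" NA
```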
The base measures needed for the calculation of MCLT (as well as MTTR and MGCR) all come from the release data captured via the MDF process and the following RESTful call to Jira, which captures the set of issue data that can be linked to each release:
https://location-of-jira-database/jira/rest/api/latest/search?jql=project=<projectname>&maxResults=50
For each new release, r, calculate the Change Lead Time (CLT) and Mean Change Lead Time (MCLT).
Base Measure | Definition | Data Type |
---|---|---|
Release Date | Date of the release that contains the requested features | Date/Time |
Feature Request Date | Date of the feature request (usually the date corresponding to the opening of a ticket to incorporate the feature in an upcoming release). | Date/Time |
n | Number of feature requests that were deployed in the given release | Integer |
r | Number of releases | Integer |
ss | Sample Size | Integer |
Hrs/Day | Number of hours in a workday (default = 8) | Float |
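A sketch of the calculation in R follows. It assumes the issues returned by the search call have been parsed into a data frame `issues` with the issue’s created date, the release it shipped in, and its mapped category (Feature or Defect), and that `releases` holds the release names and dates captured for MDF. Converting elapsed calendar days to workday hours via Hrs/Day is an assumed reading of the base-measure table, and the helper name is illustrative; the same helper is reused for the defect-repair metrics in the next section.

```r
# Sketch: generic lead-time helper used for CLT/ACLT/MCLT (features) and,
# below, for TTR/ATTR/MTTR (defects). `issues` needs columns `created`
# (POSIXct), `release` (release name), and `category`; `releases` needs
# `name` and `releaseDate` (POSIXct).
mean_lead_time <- function(issues, releases, category, ss, hrs_per_day = 8) {
  sel <- issues[issues$category == category, ]
  sel <- merge(sel, releases[, c("name", "releaseDate")],
               by.x = "release", by.y = "name")
  # Lead time in workday hours between ticket creation and the containing release.
  sel$lead_hrs <- as.numeric(difftime(sel$releaseDate, sel$created,
                                      units = "days")) * hrs_per_day
  per_release <- aggregate(lead_hrs ~ release, data = sel, FUN = mean)   # ACLT / ATTR
  per_release <- merge(per_release, releases, by.x = "release", by.y = "name")
  per_release <- per_release[order(per_release$releaseDate), ]
  mean(tail(per_release$lead_hrs, ss))                                   # MCLT / MTTR
}

# Example usage (assumes `issues`, `releases`, and `ss` are defined as above):
mclt <- mean_lead_time(issues, releases, category = "Feature", ss = ss)
```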
The Time To Repair (TTR) metric is the time from a given failure (captured as a ‘defect’ reported based on the mapping described in Table 25) to the release of a product containing the corresponding repair. NOTE: These measures are based on failures of the product as a unit, not of the system as a whole; Time To Recover would apply in those cases. As with CLT, there may be more than one defect repair in a given deployment, so the Average Time To Repair (ATTR) is calculated. It is the average TTR of all the defect repairs during a given deployment and represents the average time it takes to repair a defect. In addition, the Mean Time To Repair (MTTR) is the average of the ATTRs across all releases of the product. TTR and MTTR are measured in hours.
Base Measure | Definition | Data Type |
---|---|---|
Release Date | Date of the release that contains the defect repairs | Date/Time |
Defect Reported Date | Date the defect was reported (usually the date corresponding to the opening of a ticket documenting a defect in the product). | Date/Time |
n | Number of defect repairs incorporated in the given release | Integer |
r | Number of releases | Integer |
ss | Sample Size | Integer |
Hrs/Day | Number of hours in a workday (default = 8) | Float |
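Using the hypothetical `mean_lead_time` helper sketched in the previous section, the defect-repair metrics follow the same pattern, simply selecting issues mapped to Defect.

```r
# ATTR is the per-release average computed inside the helper; its sampled mean is MTTR.
mttr <- mean_lead_time(issues, releases, category = "Defect", ss = ss)
```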
This metric captures the number of new tickets received over time. New tickets are those that have not yet entered a working state; that is, they have just been received and no work assignment has been made. Ideally, this metric should be zero, or at least very close to zero, at any given time.
Tools that track service tickets generally provide a state transition scheme that defines the states tickets move through during the service process. NTR is simply the number of tickets currently in the ticketing system that have not yet been assigned to a service provider.
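For illustration, the count can be pulled with the same Jira search endpoint used earlier. The JQL below is only an assumption: actual status names, status categories, and what counts as “not yet assigned” vary with each project’s workflow, so the query would need to be adapted per project.

```r
library(httr)
library(jsonlite)

# Sketch: count tickets that have not yet entered a working state.
jira_search <- "https://location-of-jira-database/jira/rest/api/latest/search"
resp <- GET(jira_search,
            query = list(
              jql = 'project = <projectname> AND assignee is EMPTY AND statusCategory = "To Do"',
              maxResults = 0))          # only the total count is needed
ntr <- fromJSON(content(resp, as = "text", encoding = "UTF-8"))$total
```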
The Ticket Closure Rate (TCR) is the difference between the rate at which existing tickets are closed and the rate at which new tickets arrive. It is calculated by subtracting the rate at which new tickets enter from the rate at which tickets are closed between two consecutive periods (releases).
Base Measure | Definition | Data Type |
---|---|---|
Closed | Number of closed tickets | Integer |
Total | Total number of tickets | Integer |
r | Release Number | Integer |
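The exact normalization is not spelled out here, so the sketch below assumes one straightforward reading: the Closed and Total base measures are snapshots taken at each release, the number of new tickets in a period is the change in Total, the number closed is the change in Closed, and TCR is their difference (positive when the backlog is shrinking).

```r
# Sketch: Ticket Closure Rate between consecutive releases r-1 and r.
# `closed` and `total` are vectors of the Closed and Total base measures,
# one entry per release, indexed by release number.
ticket_closure_rate <- function(closed, total, r) {
  closed_in_period  <- closed[r] - closed[r - 1]   # tickets closed since the previous release
  arrived_in_period <- total[r]  - total[r - 1]    # new tickets received since the previous release
  closed_in_period - arrived_in_period
}
```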
At this time, the only metric captured related to Software Vulnerabilities is the count of code vulnerabilities from static code analysis, which is captured by SonarQube™.
Vulnerability Density is a composite metric. Since SonarQube™ provides five levels of severity for vulnerabilities, this metric will be reported as described in the mapping below:
By: Mike Mangieri
Senior Principal Process Engineer
Business Transformation Institute, Inc.