Normalizing story points for enterprises
Insightful Analogy
The concept of normalizing estimates is analogous to converting (i.e., normalizing) currencies across different divisions of a multi-national company. This analogy helps us understand the need for story point normalization across the space and time dimensions of a large enterprise.
As shown in Figure 2, a US-based company has divisions in Germany, UK, Australia and Hong Kong, each reporting its 2nd quarter revenue in its local currency (Euro, Pound, Australian dollar, and Hong Kong dollar, respectively). US Headquarters needs to calculate corporate revenue in US dollars. The US dollar plays the role of currency “Normalization Basis.” Figure 2 also indicates the currency conversion factors, for example 1 £ = 1.59 US$. In this case, the currency conversion factor for the UK pound is 1.59. It is meaningless to simply add the revenue numbers without first converting (normalizing) the numbers to a standard currency.
Figure 2: Normalizing currencies in a multi-national company
Each division normalizes its local currency to US dollars by applying its currency conversion factor. After currency conversion (normalization), the total revenue for the company can be expressed in US dollars.
Step-by-Step description of CNM for bottom-up estimation
CNM (bottom-up estimation) consists of four steps described below.
Step 1: Decide the Normalization Basis for an enterprise
All teams, projects, programs and portfolios across the enterprise should agree on the ideal hour equivalent of 1 Normalized Story Point (NSP). The number of hours decided is called the Normalization Basis. Thus 1 NSP = the number of ideal hours decided upon = the Normalization Basis.
The Normalization Basis for an enterprise can be an arbitrary value, such as 8, 10, 16, 20, 25, 32, 40, 100, or even 1 ideal hour. It does not matter what the Normalization Basis is, as long as everyone agrees on a single value throughout the enterprise. The value can be decided simply by fiat or by a random draw. As such, Step 1 is literally a one-minute decision; no discussion is warranted. This number remains constant across the space and time dimensions of the enterprise.
As a running example in this blog series, assume that an enterprise has decided on a Normalization Basis of 40 ideal hours, or 1 week (1 NSP = 40 ideal hours). 40 ideal hours per NSP becomes the common standard for representing estimates throughout the enterprise, covering both the space and time dimensions.
One ideal hour of effort has the same meaning across the enterprise. Similarly, one NSP of effort has the same meaning (40 ideal hours, as per the example above). In one ideal hour of effort, different teams may produce different amounts of output depending on their productivity. One ideal hour represents the same amount of effort, but not the same level of output, across all teams, projects, programs, portfolios, and sprints.
Step 2: Estimate the relative sizes of stories using relative sizing techniques
Relative sizes of stories are commonly estimated in story points by using techniques such as Planning Poker. Because a story point only has meaning in the context of the team that did the estimation, I call it a Team Story Point (TSP). TSPs for one team are not equivalent to TSPs for any other team. A team's TSPs may not even be comparable across different sprints of that same team.
Step 3: Determine the Calibration size for each team
In CNM, each team calibrates the size of one TSP by using a sample of up to 3 stories from its sprint backlog for each sprint. This process determines the average number of hours per TSP, or Calibration Size, for a team.
Calibration Size = (Total estimated hours of effort for up to 3 sample stories) / (Total Team Story Points for same sample stories) = Team Hours per Team Story Point (TSP)
As an example based on Figure 3:
The team’s Calibration Size = (29 + 62 + 98) / (1 + 2 + 3) = 31.5 ideal hours per TSP
Figure 3: Story Point Calibration
Using the Calibration size, the team can predict how many ideal hours it will take to complete a story of a given size. For example, if a story is estimated to be 3 TSP, then it will likely take 3 * 31.5 = 94.5 ideal hours to complete.
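The calibration arithmetic above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of CNM itself: the function names calibration_size and predicted_hours are hypothetical, and the inputs are the three sample stories from Figure 3.

```python
def calibration_size(sample_hours, sample_tsp):
    """Average ideal hours per Team Story Point over the sample stories.

    sample_hours: estimated ideal hours for each sample story
    sample_tsp:   Team Story Points for the same stories, in the same order
    """
    if len(sample_hours) != len(sample_tsp):
        raise ValueError("each sample story needs both hours and TSP")
    total_tsp = sum(sample_tsp)
    if total_tsp <= 0:
        raise ValueError("total TSP of the sample must be positive")
    return sum(sample_hours) / total_tsp


def predicted_hours(story_tsp, calib):
    """Likely ideal hours for a story of the given size in TSP."""
    return story_tsp * calib


# Sample stories from Figure 3: 29, 62 and 98 hours for 1, 2 and 3 TSP.
calib = calibration_size([29, 62, 98], [1, 2, 3])
print(calib)                      # 31.5 ideal hours per TSP
print(predicted_hours(3, calib))  # 94.5 ideal hours
```

Passing a single story (one-element lists) covers the case where a team calibrates on just one story.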
Up to 3 sample stories should be selected from the sprint backlog for determining the Calibration Size. Your sample may contain 2 or 3 small stories, which reduces the risk of using only a single story for calibration, because a single story may be an outlier and skew all estimates. However, if you feel confident in choosing only a single story as your calibration point, you may do so.
If you have any other basis to calculate the Calibration Size, perhaps based on that team’s historical data for the actual effort needed for 3 sample stories, you may do so. You need to be careful that the historical data is truly representative of the team (exactly the same team members working under a very similar “weather pattern”).
Unlike SAFe’s 1NM, CNM does not force you to select a story with 1 IDD effort or any pre-determined quantity of effort. CNM simply lets the “chips fall wherever they may” and calibrates the size of one TSP for each team for every sprint, without making any assumptions.
Step 4: Normalize the story points and enter the NSP for each story in the agile project management tool
I now define an important ratio called the Point Conversion Factor, which is the ratio: Calibration Size / Normalization Basis. The Point Conversion Factor is similar to the currency conversion factor in a multi-national company shown in Figure 2. The Point Conversion Factor allows you to convert team story points into equivalent normalized story points.
In our example, the Point Conversion Factor = 31.5/40 = 0.7875.
To convert the TSP for a story into NSP, multiply the TSP number by the Point Conversion Factor. So if the team has estimated a story to be 3 TSP, then this becomes the equivalent of 0.7875 * 3 = 2.3625 NSP, or about 2.36 NSP.
The Point Conversion Factor also allows you to convert NSP into the equivalent TSP. To convert the NSP for a story into TSP, divide the NSP number by the Point Conversion Factor. Therefore, in our example, 2 NSP = (2/0.7875) TSP = 2.54 TSP.
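The two conversions can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes the running example's Normalization Basis of 40 ideal hours. It keeps the unrounded factor (31.5/40 = 0.7875), so its results may differ in the third decimal place from values computed by hand with a rounded factor.

```python
NORMALIZATION_BASIS = 40.0  # ideal hours per NSP (running example)


def point_conversion_factor(calibration_size, normalization_basis=NORMALIZATION_BASIS):
    """Ratio of Calibration Size to Normalization Basis."""
    return calibration_size / normalization_basis


def tsp_to_nsp(tsp, factor):
    """Convert Team Story Points to Normalized Story Points."""
    return tsp * factor


def nsp_to_tsp(nsp, factor):
    """Convert Normalized Story Points back to Team Story Points."""
    return nsp / factor


factor = point_conversion_factor(31.5)
print(round(factor, 4))                 # 0.7875
print(round(tsp_to_nsp(3, factor), 2))  # 2.36 NSP
print(round(nsp_to_tsp(2, factor), 2))  # 2.54 TSP
```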
In Part 5 of the blog series, I will provide an Excel-based downloadable template for doing all story point normalization math in a very quick and easy way.
Now that story points are normalized, we can enter and use these values in our agile project management tools (such as VersionOne) to ensure that all story point roll-ups, progress bars, math and reports are meaningful and correct across large-scale agile projects with several teams, programs and portfolios. (Hooray!)
I now describe various applications of CNM.
Velocity and calibration
Estimating team velocity
For teams operating under yesterday’s weather model (see Part 2 of this blog series), a reasonable prediction of future velocity can be done by taking an average of the last 3-4 sprints. The velocity for a team could be expressed as TSP/sprint or NSP/sprint. However, especially for large-scale agile projects involving more than one team, it is important that we use the normalized velocity of each team so that the team velocities can be added together or rolled up correctly.
For teams not operating under yesterday’s weather model, it is harder to predict future velocity using past sprint results. In this case, CNM recommends that the teams calculate their Agile Capacity for future sprints in hours. Agile capacity for a team is the total hours available from all team members, not including time allocated to planning, meetings, vacations, emails, etc. Note the following relationships.
Estimated maximum normalized velocity = (Agile Capacity / Normalization Basis) NSP per sprint
Estimated maximum team velocity = (Agile Capacity / Calibration Size) TSP per sprint
For example, if the Agile Capacity of a team is 500 hours/sprint, the enterprise Normalization Basis is 40 ideal hours, and the team's Calibration Size is 20 ideal hours, then the estimated maximum normalized velocity = 500/40 or 12.5 NSP per sprint, and the estimated maximum team velocity = 500/20 or 25 TSP per sprint.
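As a rough sketch, the two relationships can be written in Python as below. The agile_capacity helper and its per-member figures (5 members, each with 120 gross hours and 20 hours of overhead) are assumptions made up for illustration; only the 500-hour total, the Normalization Basis of 40 and the Calibration Size of 20 come from the example above.

```python
def agile_capacity(members):
    """Total hours available for sprint work.

    members: list of (gross_hours, overhead_hours) per team member, where
    overhead covers planning, meetings, vacations, emails, and so on.
    """
    return sum(gross - overhead for gross, overhead in members)


def max_normalized_velocity(capacity_hours, normalization_basis):
    """Estimated maximum velocity in NSP per sprint."""
    return capacity_hours / normalization_basis


def max_team_velocity(capacity_hours, calibration_size):
    """Estimated maximum velocity in TSP per sprint."""
    return capacity_hours / calibration_size


# Hypothetical team: 5 members, 120 gross hours each, 20 hours of overhead each.
capacity = agile_capacity([(120, 20)] * 5)
print(capacity)                               # 500
print(max_normalized_velocity(capacity, 40))  # 12.5 NSP per sprint
print(max_team_velocity(capacity, 20))        # 25.0 TSP per sprint
```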
In Part 5 of the blog series, I will present an Agile Capacity calculation template as a worksheet inside the Excel-based template mentioned above for doing all story point normalization math. The template will make agile capacity and all story point normalization calculations very quick and easy. For any given team, the Calibration Size may change from sprint to sprint, but this has no bearing on the estimated maximum velocity expressed in NSP. Estimated maximum normalized velocity depends only on the Agile Capacity of a team and the Normalization Basis chosen for the enterprise.
Story point normalization deals well with team velocity differences caused by calibration size differences
Let us assume Team A and Team B have the same Agile Capacity of 400 hours. Team A takes a sample of small stories for its calibration, and as a result its Calibration Size is 5 ideal hours, while Team B's sample of larger stories yields a Calibration Size of 20 ideal hours. Team A will have an estimated maximum team velocity = (Agile Capacity / Calibration Size) = 400/5 or 80 TSP, while Team B will have an estimated maximum team velocity = 400/20 or 20 TSP. However, both teams have an estimated maximum normalized velocity of (Agile Capacity / Normalization Basis) = 400/40 or 10 NSP. If Teams A and B are part of a program, rolling up their velocity numbers (whether estimated or measured) in TSPs to calculate program velocity would be meaningless; but it makes perfect sense to roll up their velocities in NSPs to get the program velocity in NSPs.
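The Team A / Team B comparison can be checked with a short Python sketch. The dictionary layout and names are illustrative assumptions; the numbers are those of the example above, with a Normalization Basis of 40 ideal hours.

```python
NORMALIZATION_BASIS = 40  # ideal hours per NSP (running example)

# Agile Capacity and Calibration Size, both in ideal hours.
teams = {
    "Team A": {"capacity": 400, "calibration": 5},
    "Team B": {"capacity": 400, "calibration": 20},
}

program_nsp = 0.0
for name, team in teams.items():
    tsp = team["capacity"] / team["calibration"]       # team velocity in TSP
    factor = team["calibration"] / NORMALIZATION_BASIS  # Point Conversion Factor
    nsp = tsp * factor                                  # normalized velocity in NSP
    program_nsp += nsp
    print(f"{name}: {tsp} TSP -> {nsp} NSP")

# Team A: 80.0 TSP -> 10.0 NSP
# Team B: 20.0 TSP -> 10.0 NSP
print(program_nsp)  # 20.0 NSP for the two-team program
```

Adding 80 TSP and 20 TSP would give 100, a number with no consistent unit; the NSP sum of 20 corresponds to 20 * 40 = 800 ideal hours, which matches the two teams' combined capacity.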
Roll-up of NSPs and velocity metrics
For a portfolio of programs, epic hierarchies, feature group hierarchies and goals, NSP numbers should be rolled up the hierarchy (and not TSP numbers, as is typically done in many agile projects). An example is illustrated in Figure 4. All story point numbers shown in gray ovals in Figure 4 are estimated NSP numbers (i.e., planned normalized velocity numbers), and all story point numbers shown in green ovals are measured normalized velocity numbers. Therefore, their roll-up math (addition while moving up the hierarchy) is meaningful and correct.
Teams 1.1.1, 1.1.2 to 1.1.j making up Program 1.1 have estimated workload of 13.4, 14.1 and 11.9 NSP at the beginning of a sprint. They have demonstrated measured velocity of 12.2, 13.8 and 12.0 NSP at the end of that sprint. Program 1.1 has estimated workload of 61.7 NSP and measured velocity of 52.8 NSP.
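The roll-up itself is plain addition up the hierarchy, which can be sketched recursively in Python. The Node class is a hypothetical structure, not how any particular tool stores its data. The example program below contains only the three team figures quoted above; Program 1.1 in Figure 4 includes further teams, which is why its totals (61.7 and 52.8 NSP) are larger than the sums printed here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """A team, program, portfolio or enterprise in the roll-up hierarchy."""
    name: str
    estimated_nsp: float = 0.0  # only meaningful for leaf (team) nodes
    measured_nsp: float = 0.0   # only meaningful for leaf (team) nodes
    children: List["Node"] = field(default_factory=list)

    def rollup(self) -> Tuple[float, float]:
        """Return (estimated NSP, measured NSP) summed over all descendants."""
        if not self.children:
            return (self.estimated_nsp, self.measured_nsp)
        totals = [child.rollup() for child in self.children]
        return (sum(e for e, _ in totals), sum(m for _, m in totals))


# Partial, illustrative program holding just the three teams quoted in the text.
program = Node("Partial program", children=[
    Node("Team 1.1.1", estimated_nsp=13.4, measured_nsp=12.2),
    Node("Team 1.1.2", estimated_nsp=14.1, measured_nsp=13.8),
    Node("Team 1.1.j", estimated_nsp=11.9, measured_nsp=12.0),
])

estimated, measured = program.rollup()
print(round(estimated, 1), round(measured, 1))  # 39.4 38.0
```

Because the method recurses, the same call works unchanged when programs are nested under portfolios and portfolios under the enterprise.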
Each NSP number can be easily converted to the equivalent number of ideal hours by multiplying it by the Normalization Basis (40 ideal hours in our running example). If the enterprise has historical data for the loaded cost of one ideal hour of work or one NSP of work, it can easily calculate the estimated cost at the team, program, portfolio and enterprise levels. Similarly, NSP numbers can be added up for a release cycle by adding the numbers for all sprints of that release cycle.
Note that the estimated workload and velocity numbers are rolled up from bottom to top by the agile project management tool (such as VersionOne). The tool requires the team-level workload estimates and velocities in NSP for each team and every sprint. This can easily be done if each team plans its sprint before the actual sprint work starts and, during sprint planning, estimates its stories in TSP and converts the TSP numbers into NSP numbers by following Steps 2 through 4 of CNM described above.
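A minimal Python sketch of this conversion follows. The function names are hypothetical, and the loaded cost of 100 per ideal hour is an invented placeholder, not a figure from this series; the 61.7 NSP is the estimated workload of Program 1.1.

```python
NORMALIZATION_BASIS = 40  # ideal hours per NSP (running example)


def nsp_to_ideal_hours(nsp, normalization_basis=NORMALIZATION_BASIS):
    """Convert Normalized Story Points to ideal hours."""
    return nsp * normalization_basis


def estimated_cost(nsp, cost_per_ideal_hour, normalization_basis=NORMALIZATION_BASIS):
    """Estimated cost of a workload, given a loaded cost per ideal hour."""
    return nsp_to_ideal_hours(nsp, normalization_basis) * cost_per_ideal_hour


# Program 1.1's estimated workload of 61.7 NSP.
print(round(nsp_to_ideal_hours(61.7), 1))     # 2468.0 ideal hours

# Hypothetical loaded cost of 100 currency units per ideal hour.
print(round(estimated_cost(61.7, 100.0), 2))  # 246800.0
```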
The approach illustrated in Figure 4 for bottom-up roll-up of estimated workload and measured velocities (all expressed in NSP) can be adapted for use by enterprises with many independent projects. This aggregation will be meaningful if the story points for stories at the bottom-most team levels are entered in NSP. These independent projects are likely to have different sprint cadences, and their sprints are most likely not synchronized. It is still possible to report on a “Velocity by Date” metric or the “Weekly Throughput” of accepted stories expressed in NSP. For example, an enterprise may report a throughput of 500 NSP per week, which is equivalent to (500 x 40) = 20,000 ideal hours of accepted work. Metrics represented in ideal time units (hours, staff-days, staff-weeks, etc.) are much more meaningful to management than metrics represented in story points.
Needless to say, CNM can also be used by small agile projects. Even a single-team project may find story point normalization useful if one or more requirements of yesterday’s weather model (see Part 2 of this blog series) do not hold for any sprint of that team.
Figure 4: Roll-up of normalized story points from team-level to program-level to portfolio-level to enterprise level
Why not just estimate each story in ideal hours instead of story points?
I now answer this legitimate and important question by giving the following reasons:
- Estimating a story in ideal hours requires substantial effort as the story needs to be broken down into its tasks and tests to be estimated in ideal hours. This is a lot more effort compared to relative size estimation of stories.
- Stories are of value, benefits and meaning to customers, while tasks and tests inside stories are not. Relative sizes of stories remain stable while estimates in ideal hours may change over time.
- CNM does not require you to estimate each story in a backlog in ideal hours; only up to 3 stories in a sample are estimated in ideal hours to calibrate the size of one TSP. CNM also does not require you to track the actual effort in hours.
- Relative size estimation is a team effort; the discussion among team members for estimating relative sizes increases the collective and shared understanding of each story among all team members. This benefit of the conversation is often of greater value than the result of arriving at a specific story point number for each story.
- NSP numbers can be immediately converted to ideal hours by multiplying an NSP number by the Normalization Basis. There is no reason to estimate each story in ideal hours by breaking it into its tasks and tests, and then estimating each task and test in ideal hours (this is a lot of effort, as stated above).
- Like SAFe’s 1NM, CNM too is a hybrid method. Both methods establish equivalence between NSPs and ideal hours of work. In SAFe’s 1NM, 1 NSP = 1 IDD = 8 ideal hours. In CNM, 1 NSP = Normalization Basis number of ideal hours.
Acknowledgements: I have greatly benefited from discussions and review comments on this blog series from my colleagues at VersionOne, especially Dave Gunther, Andy Powell and Lee Cunningham.
Credits: Sattish from Collabnet VersionOne