If you want to see process nerds fight, ask each of them to define “cycle time”. One group will look at individual units of work. They will say “cycle time” is the duration from start to delivery of a piece of work. Another group will look at the work in aggregate. They will cite Little's Law and say there is only “work in progress” and “throughput”. Some people will throw up their hands in disgust and say there is no such thing as “cycle time” in software delivery. A more recent wave of DevOps practitioners will introduce a new term, “change lead time”. Unfortunately, they just as quickly disagree on what the new term means. How do we break this cycle of debate? Let's examine what we're trying to measure.
What is Cycle Time
The goal of Cycle Time (CT) is to measure how quickly work moves through a system once it begins. “Work in Progress” (WiP) is the term for this type of work: any unit of work that has started but not finished. The start point is the moment an engineer focuses on a piece of work with the intention of writing code for it. Please consider reading The Phoenix Project for more background on WiP and “lean” methodologies. Before we move on, let's summarize: Cycle Time measures how long work remains in progress.
Calculating Cycle Time
The debate centers on how to calculate CT. There are two popular ways to measure it: direct measurement and Little's Law.
Cycle Time = Work end datetime - Work start datetime
This is the most accurate representation of cycle time. It is also more complicated to calculate, and it is sensitive to outliers: units of work that took an unusually long or short time to complete. An average CT that includes these outliers will be skewed. To calculate CT with direct measurement, subtract the start date from the end date. This gives you the duration that the work was in progress. Let's calculate CT given this single unit of work:
|ID||Start date||End date|
The CT for this work is 1 hour (13:00 - 12:00). The CT for a time period is the average of all the units' CTs over that time period. For example, given the table above, the average CT for one day, October 10th, is 1 hour.
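The direct measurement above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the year in the timestamps is an arbitrary assumption, since the example only gives the day and time.

```python
from datetime import datetime, timedelta


def cycle_time(start, end):
    """Direct measurement for one unit of work: end minus start."""
    return end - start


def average_cycle_time(units):
    """Mean cycle time across a list of completed (start, end) pairs."""
    durations = [cycle_time(start, end) for start, end in units]
    return sum(durations, timedelta()) / len(durations)


# One unit of work: started at 12:00 and finished at 13:00 on October 10th.
# The year (2023) is an assumption made only so the code runs.
units = [(datetime(2023, 10, 10, 12, 0), datetime(2023, 10, 10, 13, 0))]

print(average_cycle_time(units))  # 1:00:00
```

With more than one completed unit in the list, the same function returns the average CT for the period those units cover.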
Cycle Time = Work in Progress / Throughput
This formula is easier to calculate. You don't have to measure and average the durations of each unit of work. Also, this formula ignores the outliers we described in the direct measurement formula. Little's Law does this by putting work into two states: "incomplete" and "complete". In the following diagram we're generating a CT using Little's Law for a given time period. Complete work is green; incomplete work is red:
This diagram shows three units of WiP. WiP is defined as having either:
- A start time within the period's range
- An end time within the period's range
There are two complete units of work in the period. We define "complete" work as having an end time within the period's range. That gives us a CT of 1.5 time periods (3 units of WiP / 2 units completed per period). The diagram has an example time period of one day, so this tells us a unit of work takes 1.5 days, on average, to complete.
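As a rough sketch, the counting rules described here (WiP is any unit with a start or end time inside the period, complete work is any unit with an end time inside the period) could be coded as follows. The function name and the three sample timestamps are hypothetical; they are chosen only to produce three units of WiP and two completions in a one-day period.

```python
from datetime import datetime


def littles_law_cycle_time(units, period_start, period_end):
    """Return CT in periods (WiP / throughput), or None if nothing completed.

    units is a list of (start, end) pairs; end is None for unfinished work.
    """
    def in_period(moment):
        return moment is not None and period_start <= moment < period_end

    # WiP: started or ended inside the period.
    wip = sum(1 for start, end in units if in_period(start) or in_period(end))
    # Throughput: ended inside the period.
    throughput = sum(1 for _, end in units if in_period(end))
    if throughput == 0:
        return None
    return wip / throughput


# Hypothetical data for a one-day period (October 10th, assumed year 2023).
day_start = datetime(2023, 10, 10)
day_end = datetime(2023, 10, 11)
units = [
    (datetime(2023, 10, 10, 9, 0), datetime(2023, 10, 10, 11, 0)),  # complete
    (datetime(2023, 10, 9, 15, 0), datetime(2023, 10, 10, 10, 0)),  # complete
    (datetime(2023, 10, 10, 14, 0), None),                          # incomplete
]

print(littles_law_cycle_time(units, day_start, day_end))  # 1.5
```

The result is expressed in periods, so with a one-day period it reads as 1.5 days per unit of work. Returning None when nothing completed is a design choice for this sketch; the formula itself is undefined when throughput is zero.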
One nice thing about this calculation is that it takes WiP into consideration. The direct measurement formula can only account for completed work, while this calculation also considers work that is currently in progress. This gives you a result that is accurate to the current moment. Another benefit of this formula is that it illustrates the two ways you can improve CT: increasing throughput and decreasing WiP.
These calculations are averages. Depending on when you measure your sample you may get misleading results. Let's apply Little's Law to the same example from above:
|ID||Start date||End date|
Calculating for one day, Oct. 10th, yields a CT of 1 day (1 unit of WiP / 1 unit per day). By contrast, direct measurement for the same day yields a CT of one hour. Now let's measure a two-day period with Little's Law: the 10th and 11th. This gives us a CT of 2 days (1 unit of WiP / 1 unit per two days). This may paint a bad picture of Little's Law, but it shouldn't. This is a limitation of averages for any formula. The direct measurement formula can be similarly affected: you could choose a period with many outliers that would skew the result. It's possible, intentionally or unintentionally, to pick a period that shows CT in a positive or negative light. To combat this limitation, it's worthwhile to measure the same period length at regular intervals. For example, consistently calculate and report CT for one day, one week, or whatever period makes sense for your team.
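To make the effect of the period length concrete, here is a small Python sketch that applies both methods to the single unit of work from the example. The function names are hypothetical and the year is an assumption; the Little's Law result is converted to a duration by multiplying the period length by WiP and dividing by the number of completed units.

```python
from datetime import datetime, timedelta


def direct_ct(units):
    """Average of (end - start) over completed units."""
    durations = [end - start for start, end in units if end is not None]
    return sum(durations, timedelta()) / len(durations)


def littles_law_ct(units, period_start, period_length):
    """CT as a duration: WiP / (completed units per period_length)."""
    period_end = period_start + period_length

    def in_period(moment):
        return moment is not None and period_start <= moment < period_end

    wip = sum(1 for start, end in units if in_period(start) or in_period(end))
    completed = sum(1 for _, end in units if in_period(end))
    if completed == 0:
        return None
    return period_length * wip / completed


# The single unit from the example: 12:00 to 13:00 on October 10th.
units = [(datetime(2023, 10, 10, 12, 0), datetime(2023, 10, 10, 13, 0))]
oct_10 = datetime(2023, 10, 10)

print(direct_ct(units))                                  # 1:00:00
print(littles_law_ct(units, oct_10, timedelta(days=1)))  # 1 day, 0:00:00
print(littles_law_ct(units, oct_10, timedelta(days=2)))  # 2 days, 0:00:00
```

Keeping `period_length` fixed from one report to the next is what makes successive results comparable.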
Cycle time measures the part of a software delivery pipeline that is mostly (if not entirely) controlled by an engineering team. Because of this, it's a great metric to track. Changes to development (e.g., feature flags), peer review, QA, CI/CD, and team structure will all have an impact on CT. Measure, observe, and improve! 👩‍🔬
The largest factor for how you want to handle CT calculations is outliers. If you want to feel the pain of a ticket that was WiP for 3 months, go with direct measurement. There are reasons for and against this. Teams want to identify pain points so that they can correct the issue. However, there are cases where work was in progress, but outside of engineering's control. You may not want that outlier indicating there's a problem if you can't fix it. If you want to have an understanding of CT, but gloss over some of the details, go with Little's Law.
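To get a feel for how much a single long-running ticket can move a directly measured average, consider this small Python sketch. The numbers are invented for illustration: nine tickets that each took two days, plus one that stayed in progress for roughly three months (90 days).

```python
from statistics import mean

# Hypothetical cycle times in days.
typical = [2.0] * 9
with_outlier = typical + [90.0]

print(mean(typical))       # 2.0
print(mean(with_outlier))  # 10.8
```

One outlier moves the average from 2 days to 10.8 days. Whether that jump is a useful signal or just noise depends on whether the team could have done anything about the slow ticket.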