
TD Magazine Article

See Metrics in a Different Light

An easier way to approach talent measurement is to focus on reducing uncertainty rather than getting a perfect metric.

By

Sat Feb 01 2025


Walk into any boardroom, and you'll find discussions filled with data. Executives rely on data to manage their organizations because they don't have the ability to directly observe every aspect of their business. Unfortunately, talent development professionals often struggle to capture their work with meaningful metrics, and as a result, executives can overlook TD's potential to be a strategic partner.

True, talent data is softer than business data. It is easier to measure the length of a customer service call or the cost of heavy machinery than the amount of skill that a leadership training program created. Yet, executives want to know what the business case is for investments in improving organizational culture, developing employees, and reducing key skills gaps. If TD professionals can only offer anecdotal impressions and participant counts, we will quickly lose our seat at the leadership table.

That isn't just a perception problem. We need better TD measurements to successfully support our organizations. Reliable metrics around program effectiveness enable us to prioritize programs in which to invest. Realistic predictions of how much a training program can improve performance can save us from developing initiatives that aren't likely to generate a positive return on investment. And, without informative measurements about our workforce, it will be impossible to sustain culture, develop key talent, and manage learning programs at scale.

Reduce uncertainty

Fortunately, there are simple ways for us to leverage data to gain insight into our work. We can even measure talent "intangibles," such as leadership, culture, and skills, without complicated statistics. The key is to realize that the goal of measurement is not to calculate a number. Rather, the purpose of talent metrics is to improve our decisions.

If we have data that shows one training program is performing worse than another, that improves our ability to decide where to invest in troubleshooting. When we gather data that demonstrates it would take $100,000 of instructional design effort to solve a problem that only costs the organization $50,000 per year, we can modify the project so it is more likely to benefit the company.

Once we realize that the purpose of talent metrics is to make better decisions, we can expand our conception of what talent metrics can look like. In fact, according to Douglas Hubbard, the founder of applied information economics, people can use any observation that reduces their uncertainty as a measurement.

Consider the game Wordle. Players have six chances to guess a five-letter word, and after each guess, they learn which letters in their guess appear in the target word and which of those are in the correct position. At the beginning of a Wordle puzzle, there are 12,972 possible five-letter words. Each guess gives information that reduces a player's uncertainty about which of those words is the target word, and more strategic guesses reduce that uncertainty faster. Wordle players quickly realize that an initial guess of STARE is better than QUERY because STARE almost always reveals more about the target word. In Hubbard's framework, each Wordle guess is a different measurement that improves a player's decisions about their next guess, and "STARE" is a better measurement than "QUERY" because it does a better job of reducing uncertainty about the target letters.
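To make the Wordle comparison concrete, here is a minimal Python sketch that scores a guess by how many candidate words would remain, on average, after seeing its feedback. The eight-word candidate list is a hypothetical toy example, and "average remaining candidates" is only one of several reasonable ways to score a guess; it is not presented as Hubbard's own method.

```python
from collections import Counter


def feedback(guess, target):
    """Wordle-style feedback: G = right letter and spot, Y = right letter, wrong spot, - = absent."""
    result = ["-"] * len(guess)
    unmatched = Counter()  # target letters not already matched in place
    for i, (g, t) in enumerate(zip(guess, target)):
        if g == t:
            result[i] = "G"
        else:
            unmatched[t] += 1
    for i, g in enumerate(guess):
        if result[i] != "G" and unmatched[g] > 0:
            result[i] = "Y"
            unmatched[g] -= 1
    return "".join(result)


def expected_remaining(guess, candidates):
    """Average number of candidates still possible after the guess,
    assuming every candidate is equally likely to be the target."""
    groups = Counter(feedback(guess, target) for target in candidates)
    return sum(size * size for size in groups.values()) / len(candidates)


# Hypothetical toy list; a real puzzle has thousands of candidates.
candidates = ["crane", "slate", "trace", "store", "steam", "tears", "quiet", "mouse"]

for guess in ("stare", "query"):
    print(guess, expected_remaining(guess, candidates))
```

On this toy list, STARE leaves an average of 1.0 candidates and QUERY leaves 1.5, so STARE is the more informative "measurement." The lower the number, the more uncertainty the guess removes.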

So how can TD professionals create talent metrics that similarly reduce our uncertainty and enable us to make better decisions?

Establish operational definitions

One way to create simple but consequential talent metrics is through an operational definition. The father of the quality movement, W. Edwards Deming, described operational definitions as a way to establish a method for measuring a concept.

The Apgar score for neonatal health exemplifies an operational definition. Hundreds of factors go into whether an infant is healthy, but nurses use a five-criteria, 0–10 score to gauge an infant's physical condition both one minute and five minutes after birth. That scoring mechanism is simple enough for nurses to apply in any birthing setting and powerful enough to triage when infants need additional medical intervention.

Remarkably, the creation of the Apgar score did not involve complex statistical regressions or machine learning algorithms. It instead codified the clinical experience of Virginia Apgar, who used her decades of experience as a physician to identify several critical indicators of neonatal health. While it may be possible to improve on the Apgar score with additional criteria and weighted calculations, Apgar's operational definition of newborn vitality is good enough to reduce the uncertainty about when a baby needs additional care so that healthcare workers quickly recognize and act on high-risk situations. Since the creation of the Apgar score in 1952, healthcare professionals have prevented tens of thousands of newborn deaths every year, notes Atul Gawande in his book Better: A Surgeon's Notes on Performance.

As TD professionals, we can craft operational definitions—our own versions of an Apgar score—for what matters to our organizations. Using an operational definition is a three-step process: Align on meaning, agree on measures, and leverage data.

To illustrate the operational definition process, suppose your leaders want to assess the strength of the organization's learning culture.

Align on meaning. Your TD team can ask: When we say we have a learning culture, what does that mean for us? TD can align on characteristics such as "a learning culture means that our employees are engaging in voluntary development experiences throughout the year" and "employees set and get feedback on development goals throughout the year."

Agree on measures. The measures must reduce the TD function's uncertainty about those characteristics. In the example, measure what portion of the employees participated in voluntary learning activities during the past six months and track the median date for the most recent update to staff development goals in the HR information system.

Leverage data. Assess how your learning culture is faring. Do specific employee groups have lower rates of voluntary learning participation? Are certain departments better or worse at updating development goals?
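As a rough illustration of the learning culture example, the Python sketch below computes the two measures (voluntary participation rate and median date of the latest development-goal update) overall and by department. The employee records, department names, and field layout are all hypothetical; in practice the data would come from your learning and HR information systems.

```python
from collections import defaultdict
from datetime import date
from statistics import median_low

# Hypothetical records: (department, took part in voluntary learning in the
# past six months, date of the most recent development-goal update)
employees = [
    ("Sales", True, date(2025, 1, 10)),
    ("Sales", False, date(2024, 6, 3)),
    ("Sales", True, date(2024, 11, 20)),
    ("Engineering", False, date(2024, 3, 15)),
    ("Engineering", False, date(2024, 9, 1)),
    ("Engineering", True, date(2024, 12, 5)),
]


def participation_rate(records):
    """Portion of employees with at least one voluntary learning activity."""
    return sum(1 for _, took_part, _ in records if took_part) / len(records)


def median_update_date(records):
    """Median date of the latest goal update (the earlier of the two middle
    dates when the count is even, so the result is always a real date)."""
    return median_low(updated for _, _, updated in records)


by_department = defaultdict(list)
for record in employees:
    by_department[record[0]].append(record)

summary = {
    dept: (participation_rate(records), median_update_date(records))
    for dept, records in by_department.items()
}

print(f"Overall: {participation_rate(employees):.0%}, {median_update_date(employees)}")
for dept, (rate, updated) in summary.items():
    print(f"{dept}: {rate:.0%} participated, median goal update {updated}")
```

Comparing the per-department rows against the overall figures is one simple way to spot the groups that merit a closer look.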

Use the same process to create metrics around the readiness and diversity of the company's succession pool, the effectiveness of leaders, or the likelihood of employees having high potential. For the succession pool, imagine that executives are worried about being able to quickly fill roles on an emergency basis because competing firms actively recruit your organization's incumbents, so having internal successors who are ready is a high priority. Also suppose that the executives want to see the company's leadership have greater ethnic diversity.

One measure that could reduce uncertainty about the readiness of emergency backups is an assessment of the leadership team that determines which internal successors could fill each role on an emergency basis within a year. A metric that could reduce uncertainty about the likely future diversity of the leadership is the portion of succession plans that include ethnic minorities.

Those answers suggest a two-part operational definition of a healthy succession pool: at least X percent of roles have an internal successor who can fill the position in an emergency situation within a year, and the percentage of roles with minority succession candidates exceeds the percentage of minority incumbents by at least Y percentage points. The executives could collaborate with talent leaders to improve their decisions around succession planning. Do they need to include more ethnic minorities on succession watch lists and give them additional stretch opportunities? Are there specific roles that need accelerated development plans for emergency successors?
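The two-part definition can be sketched as a simple check. In the Python example below, the role records, field names, and the threshold values chosen for X and Y are all hypothetical, and the diversity gap is treated as a difference in percentage points, which is an assumption about how the threshold would be read.

```python
from dataclasses import dataclass


@dataclass
class Role:
    title: str
    incumbent_is_minority: bool
    has_emergency_successor: bool  # someone could step in within a year
    has_minority_candidate: bool   # succession plan includes a minority candidate


def succession_health(roles, min_ready_pct, min_diversity_gap_pts):
    """Apply the two-part operational definition to a list of roles."""
    n = len(roles)
    ready_pct = 100 * sum(r.has_emergency_successor for r in roles) / n
    candidate_pct = 100 * sum(r.has_minority_candidate for r in roles) / n
    incumbent_pct = 100 * sum(r.incumbent_is_minority for r in roles) / n
    gap_pts = candidate_pct - incumbent_pct
    return {
        "ready_pct": ready_pct,
        "diversity_gap_pts": gap_pts,
        "healthy": ready_pct >= min_ready_pct and gap_pts >= min_diversity_gap_pts,
    }


# Hypothetical roles and thresholds (X = 70, Y = 20), for illustration only.
roles = [
    Role("Chief financial officer", False, True, True),
    Role("Head of operations", False, True, False),
    Role("Head of sales", True, False, True),
    Role("Head of engineering", False, True, False),
]

print(succession_health(roles, min_ready_pct=70, min_diversity_gap_pts=20))
```

The exact thresholds matter less than agreeing on them with executives; the function simply makes the agreed definition repeatable from one review to the next.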

There isn't one right way to measure those elements. In fact, the conversations to generate alignment with executives on how you want to operationally define those concepts, select measures, and leverage data can be even more valuable than the metrics themselves because the process forces you to clarify expectations around those talent intangibles.

Estimate plausible ranges

A second technique for reducing uncertainty is Fermi estimation, named after the Nobel Prize-winning physicist Enrico Fermi. He would regularly make estimates by decomposing quantities into component elements that he could estimate more reliably.

As an example, most people would struggle to make a plausible estimate of how many piano tuners work in Chicago, Illinois. But Fermi would break that problem down: How many people live in Chicago? How many people are in a typical household? What portion of households own pianos? What's the typical frequency for a piano tuning?

It is far easier to estimate the responses for each of those questions:

  • Chicago's population is between 1 million and 5 million.

  • The average household has between two and four people.

  • Less than 25 percent of households own pianos.

  • A piano tuning more than once a year is an outlier.

Putting those numbers together, we can estimate that the number of tunings that take place each year in Chicago is in the hundreds of thousands at most. If a full-time tuner can handle roughly a thousand tunings a year, that is enough work to support a few hundred (but not thousands of) piano tuners.

In short, the Fermi estimation technique enables you to quickly calculate a ballpark estimate for a number that is difficult to measure directly. Making a Fermi estimate is a straightforward, three-step process:

  1. Decompose the overall quantity into component parts.

  2. Estimate the high and low ends of the range for each component. If necessary, break down the components into subcomponent estimates.

  3. Multiply and add the component parts as appropriate to estimate the range for the overall quantity.
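The three steps can be sketched in a few lines of Python using the piano tuner example. The population and household ranges and the upper bounds for piano ownership and tuning frequency come from the estimates above; the lower bounds for ownership and frequency, and the figure of 1,000 tunings per tuner per year, are assumptions added only to complete the calculation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """A plausible low-to-high range for a positive quantity."""
    low: float
    high: float

    def __mul__(self, other):
        # For positive quantities, lows multiply with lows and highs with highs.
        return Range(self.low * other.low, self.high * other.high)

    def __truediv__(self, other):
        # Dividing by the largest divisor gives the smallest result.
        return Range(self.low / other.high, self.high / other.low)


# Step 1 and 2: decompose the quantity and give each component a range.
population = Range(1_000_000, 5_000_000)
people_per_household = Range(2, 4)
piano_share = Range(0.05, 0.25)               # 5% floor is an assumption
tunings_per_piano_year = Range(0.2, 1.0)      # 0.2 floor is an assumption
tunings_per_tuner_year = Range(1_000, 1_000)  # assumed workload per tuner

# Step 3: combine the components.
households = population / people_per_household
tunings = households * piano_share * tunings_per_piano_year
tuners = tunings / tunings_per_tuner_year

print(f"Tunings per year: {tunings.low:,.0f} to {tunings.high:,.0f}")
print(f"Piano tuners: {tuners.low:,.0f} to {tuners.high:,.0f}")
```

With these inputs, the upper end works out to 625,000 tunings a year and about 625 tuners. The range is wide, but it is enough to rule out both a handful of tuners and many thousands.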

The method is accurate enough to apply to many decisions. For instance, astute readers may notice that the piano tuner estimation completely neglected commercial piano tunings. But even if the commercial piano tuning market is four times as large as the residential market, the estimate of piano tuners is still within the right order of magnitude. If a more precise number is necessary, gather additional data, for example by surveying households to refine what portion own pianos or by interviewing music schools to clarify how often they have their pianos tuned.

Other areas of business regularly use Fermi estimates. If you've ever read a headline about how many millions of hours workers lose every year to unnecessary meetings or how many jobs artificial intelligence could replace, the author probably used a Fermi estimate to generate the number (for example, by multiplying these elements together: the number of meetings per person per day; the average length of a meeting; the portion of meetings that were ineffective; the number of employees and working days; and the dollar cost of an hour of time).

Use the same process to estimate the value of a training program. Suppose that a TD department must build a business case for leadership development, and executives are worried about poor leadership because they hope to stem turnover caused by managers who don't engage their teams. TD could estimate the cost of poor leader performance by breaking turnover into the number of regrettable departures per year, the cost to replace those roles, and the portion of turnover that the company can attribute to managers.

The HR team can provide the first number and use industry benchmarks to estimate the replacement cost. Estimate manager influence based on exit surveys or engagement data. Even if there is a wide range for the estimate of manager influence, the TD function's business case can clarify whether the problem is costing the organization millions of dollars or only tens of thousands of dollars. That's enough to guide conversations around priorities and budgets.
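A minimal Python sketch of that turnover estimate follows. Every figure in it is a hypothetical placeholder rather than a benchmark; substitute your own HR counts, replacement-cost estimates, and exit survey findings.

```python
from math import prod


def multiply_ranges(*ranges):
    """Multiply (low, high) ranges of positive quantities into one (low, high) range."""
    lows, highs = zip(*ranges)
    return prod(lows), prod(highs)


# Hypothetical inputs, for illustration only.
regrettable_departures = (40, 40)    # per year, known from HR records
replacement_cost = (30_000, 60_000)  # dollars per departure
manager_share = (0.2, 0.6)           # portion of turnover attributed to managers

low, high = multiply_ranges(regrettable_departures, replacement_cost, manager_share)
print(f"Estimated annual cost of poor leadership: ${low:,.0f} to ${high:,.0f}")
```

With these placeholder inputs the estimate runs from roughly $240,000 to $1.44 million a year. Even with the wide range for manager influence, that places the problem well above tens of thousands of dollars.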

As another example, imagine that the TD team wants to estimate the business impact of a learning program. Suppose midlevel leaders must attend a class on delegation skills. What benefits would TD expect to see from that program? Participants should be spending less time directly doing work that their team members could do, and the managers should be able to spend more time working on strategic projects appropriate to their level.

Let's use a Fermi estimate to break down the value of each of those items. The value of the time managers no longer spend on tasks below their level is the difference in hourly pay between the manager and team member roles multiplied by the number of hours the managers saved. HR knows the first number, and TD practitioners can do a before-and-after calendar audit with a focus group of participants to estimate the second.
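The delegation calculation can be sketched the same way. In the Python example below, the hourly pay rates, the range of hours saved per week, the number of working weeks, and the number of participants are hypothetical values chosen only to show the arithmetic.

```python
def delegation_time_value(manager_hourly, team_hourly, hours_saved_per_week,
                          working_weeks, participants):
    """Annual value of manager time shifted away from work a team member could do.

    hours_saved_per_week is a (low, high) range per manager, for example
    from a before-and-after calendar audit.
    """
    hourly_difference = manager_hourly - team_hourly
    low_hours, high_hours = hours_saved_per_week
    value_per_weekly_hour = hourly_difference * working_weeks * participants
    return low_hours * value_per_weekly_hour, high_hours * value_per_weekly_hour


# Hypothetical inputs: $60 and $35 per hour, 2 to 5 hours saved per week,
# 46 working weeks a year, 30 participating managers.
low, high = delegation_time_value(60, 35, (2, 5), 46, 30)
print(f"Estimated annual value of delegated time: ${low:,} to ${high:,}")
```

With these inputs the estimate is $69,000 to $172,500 a year, a range that can be compared directly with the cost of running the class.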

The TD team can likewise estimate the value of more time for strategic projects. Given a range of expected project outcomes for work at the managerial level and a typical estimate of hours to complete a project, plus a calendar audit to reveal the additional time the managers dedicate to the work post-training, the TD function can gauge how many more projects of that caliber managers could complete in a year and the estimated value to the business of that incremental contribution.

Perfection isn't a requirement

Both operational definitions and Fermi estimates enable TD to craft metrics that improve decisions around talent intangibles such as culture, leadership, and skills. The key insight is to realize that metrics don't need to be perfect to significantly improve decisions. By using data to reduce uncertainty, we can step forward as strategic partners that help executive teams prioritize business opportunities and investments.


Facilitating Operational Definition Conversations

Operational definitions enable stakeholders to improve decisions with data. More importantly, the conversation when creating an operational definition helps build clarity and buy-in. These prompts can help facilitate conversations around operational definitions.

The why. Asking why stakeholders are interested in a metric can clarify where to focus. Because talent concepts are multifaceted, creating agreement about the purpose of the operational definition helps the team identify the main issues.

Experts' intuition. What are the characteristics that experts tend to pay attention to? It may be easier to generate data for specific aspects of a talent concept than to find data for the whole concept.

Disqualifiers. Sometimes it is helpful to work backward. Asking stakeholders to identify aspects of the concept that are not important to them can illuminate where the group should focus the conversation.

Exemplars. Prompting stakeholders to give concrete examples of cases that exemplify the idea at hand (or provide examples of what they don't want) can give the group fodder to discuss what they see in those examples.

Facilitating alignment on those questions enables the group to build a shared understanding of how the operational definition will function. Doing that at an early stage can overcome many objections about how available data is imperfect because the group will have already bought into the areas that are most critical for improving its decisions.


Fermi Estimation Tips

There are several common patterns in Fermi estimation breakdowns. When you are stuck, consider the following options.

Is there an equation? Whenever you can mathematically model a quantity, the elements of that model are good choices to estimate. For instance, the calculation for productivity is output divided by input; estimate each of those components to estimate an overall productivity value.

Are there independent components? Find entry points to an estimate by brainstorming elements that add together to produce the higher-level component and identifying the small handful that drive most of the result.

Can you find a rate? Often, Fermi estimates use frequency rates such as "per person" or "per day" to connect quantities with different units. Likewise, percentages (literally per-cent or per one hundred) can normalize figures from different populations.


February 2025 TD Magazine
