Applying big data in the training and development field is a method that will soon make its mark on the industry.

Overall, we see big data as the ability to analyze, compare, and slice enormous streams of data—primarily byproducts of the digital age. Therefore, what makes big data "big" is a vast number of data elements across a volume of incidence. It opens the possibilities of understanding at a deeper level that, in most cases, can't be achieved otherwise. For example, it can give a historical analysis, such as why people voted a certain way at the polls. It also can provide a predictive framework, such as how to get more people to the polls.

In many cases, the businesses we support as learning professionals already are leveraging big data for business intelligence, and they inevitably will draw the connection between learning and customer satisfaction. Whether or not our organizations are pushing for business intelligence, we can use these data points to help them better design learning, better evaluate the impact of learning, better fuel an evidence-based approach to experimentation, and better create personalization.

Big learning data will come down to value: the benefits for the learner, designer, manager, or organization that enable each to do something better, faster, cheaper, more strategically, and more persuasively with that big data.

Three components

The term "big learning data" encompasses three aspects of learning data: volume, velocity, and variety.

Volume. Big learning data enables an organization to access and analyze a volume of data for a richer perspective. Volume can mean information about thousands of learners taking a course or experience. Volume also can mean you are looking at multiple data points, over time, about a single learner.

Volume can provide data on a deeper and richer set of learning activities—even capturing the time a learner paused while answering a specific question. Also, volume might someday bring together learning data from hundreds of organizations to provide a global perspective.

Velocity. Big learning data enables learners and organizations to have rapid access to data—even in real time. Imagine a worker entering a wrong answer into an assessment exam. Velocity instantly would provide him with remedial and enrichment options based on his historical learning patterns and successful strategies from thousands of other learners who also failed that question.

Finally, velocity would allow learning producers to adjust content delivery on a continual basis, based on rapid analysis of user experience.

Variety. Big learning data connects the dots, weaving together a wider variety of information from talent, performance, demographics, and business metrics. You can then see the correlations between learning performance and other behavior and background points. Imagine correlating performance reviews with learning activities and hiring data, either for thousands of employees or drilled down to a single worker.
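As a minimal sketch of what such a correlation might look like in practice, the Python below computes a Pearson correlation between the number of courses each employee completed and that employee's performance rating. The records, field layout, and the `pearson` helper are hypothetical and invented for illustration only; a real analysis would draw on far more variables and far more people.

```python
from math import sqrt

# Hypothetical records: (employee_id, courses_completed, performance_rating).
# The values are invented purely for illustration.
records = [
    ("e1", 2, 3.1),
    ("e2", 5, 3.9),
    ("e3", 1, 2.8),
    ("e4", 7, 4.4),
    ("e5", 4, 3.5),
]


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return cov / sqrt(var_x * var_y)


courses = [rec[1] for rec in records]
ratings = [rec[2] for rec in records]
coefficient = pearson(courses, ratings)
print(f"correlation between courses completed and rating: {coefficient:.2f}")
```

A strong coefficient here would show only that the two measures move together, not that the courses caused the higher ratings; selection and other background factors could account for the same pattern.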

Data sources

The problem with learning data, historically, is that we've always gone for the low-hanging fruit. Learning professionals have collected inexpensive, easily acquired data from people while they are in our domain, usually the classroom or program. In a big learning data world, we will need to rethink our data sources.

For example, we will need to look at learning in the broader context. When we do, we'll begin to see many potentially valuable data sources, including those that already exist. One way to find them is to take a broader human resources perspective.

The relationship between selection, training, and competency is very interesting. For example, we often evaluate the impact of a leadership program with the assumption that we did great things in the program. In reality, we know that much of it has to do with how well we select the program participants from our pool, and how well we select people to join the organization. We also might look at what the participants did before they came into the program and what their managers did after they left the program.

Another way to think about where to get data is in an element of design: the usability of learning content and resources. For example, instructors often will mention books, articles, and TED talks every time they teach based on their opinions about the resources they enjoy. We might look at whether learners accessed or completed these resources, what they thought about them, and so on.

We also will need to consider alternative approaches to collecting data. Some ways our approaches might change include depth of measurement, expense, and types of data.

Depth of measurement. We have looked at whether learners passed an exam, but more valuable data might include the answers themselves, as well as characteristics of how learners answered each question: for example, how long they took to answer and whether their mouse hovered over a wrong answer for a while.

Expense. Some data that we will use in big learning data will be more expensive to get than what we have traditionally used. But what we collected easily tended to be fairly superficial. Collecting data through interviews with managers of learners, for example, costs more but yields much more data.

Types of data. We have looked for how learners have answered a question, but more valuable would be their confidence in answering that question.
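To make these deeper measures concrete, here is a minimal sketch in Python of what a single, richer answer record might hold: the chosen answer, the time taken, the options the pointer lingered on, and a self-reported confidence score. Every name in it (`AnswerEvent`, its fields, `confident_misses`, and the 1-to-5 confidence scale) is a hypothetical chosen for illustration, not a standard or an existing system's schema.

```python
from dataclasses import dataclass


@dataclass
class AnswerEvent:
    """One learner's response to one question (all field names are hypothetical)."""
    learner_id: str
    question_id: str
    chosen_option: str
    correct_option: str
    seconds_to_answer: float
    hovered_options: tuple  # options the pointer lingered on before answering
    confidence: int         # self-reported, assumed scale: 1 (guessing) to 5 (certain)

    @property
    def is_correct(self) -> bool:
        return self.chosen_option == self.correct_option

    @property
    def hesitated_on_wrong_option(self) -> bool:
        # True if the learner lingered on any option other than the correct one.
        return any(opt != self.correct_option for opt in self.hovered_options)


def confident_misses(events, threshold=4):
    """Return wrong answers that were given with high self-reported confidence."""
    return [e for e in events if not e.is_correct and e.confidence >= threshold]


# Invented sample data for illustration only.
events = [
    AnswerEvent("l1", "q1", "b", "b", 12.0, ("a", "b"), 3),
    AnswerEvent("l2", "q1", "a", "b", 4.5, ("a",), 5),
    AnswerEvent("l3", "q1", "c", "b", 30.2, ("b", "c"), 2),
]

for event in confident_misses(events):
    print(event.learner_id, event.question_id, event.seconds_to_answer)
```

A pass/fail score would treat learners l2 and l3 identically. The richer record separates a quick, confident wrong answer from a slow, uncertain one, and the two may call for different remedies.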

Big learning data shifts

As we move toward big learning data approaches, the learning field will need to shift in at least two ways.

First, we need to have an anthropological view of the learning process to understand that there are many factors that may influence learning. We need to realize that these factors may support or destroy the impact of learning, a realization that broadens our view of potentially relevant data.

Second, we need to have an analytical approach that says that if we gather this data, we need to analyze it with integrity and with a more sophisticated multivariable analysis. How do we display that data so it brings meaning to people? If I'm given this data, what do I do with it strategically and how do I handle that?

Learning organizations will confront shifts in other important areas such as readiness, infrastructure, and openness.

Readiness. This refers to the extent to which individuals making decisions are ready to operate with a massively enhanced set of data.


Infrastructure. Organizations will need to upgrade, alter, or replace learning systems that are not ready for big data.

Openness. We need to understand where, how, and in what way it's appropriate to share and use that data because, from a performance or personnel perspective, it could lead to some pretty radical things happening.


When discussing big learning data, we must honestly consider the risks that it raises, including:

  • organizational change—requiring an interactive process of planning, feedback, and disclosures
  • privacy, security, and transparency issues—requiring strategic, legal, and codes-of-conduct elements
  • new skill, competency, and leadership dimension—requiring a learning initiative to build, assess, and develop appropriate skills
  • externally referenced phenomena—requiring an eye toward a greater context of how big data issues are evolving outside the workplace, including governmental, consumer, and judicial elements
  • experimental and innovative process—requiring iterative attempts, evidence, correction, and incubation time and reflection
  • values sensitive—requiring alignment or sensitivity to an individual's personal, political, or even religious beliefs about privacy, openness, and individuality
  • globally sensitive—requiring a context-specific approach that varies globally, based on culture and governmental regulations and expectations.

Since big learning data is just evolving, it is difficult to be prescriptive about those issues. Part of the innovation process is an active and open dialogue, along with collaboration on these risks. However, to add to this discussion, here are a few approaches that you might consider to better align big learning data with these concerns.

Transparency. Learners have the right to know how learning data will be used, shared, stored, or leveraged, so we should develop a clearly stated policy so that there are no surprises. It would be great if the learning industry developed a set of simple icons that indicate the types of transparencies or uses that learning data will "live under" within the corporation.

Privacy. Organizations may want to define areas where the privacy levels are different, or even where the learner gets to indicate the desired degree of privacy. Who gets to see the aggregated data of 1,000 learners? Who gets to see a single learner's data?

Value to the learner. Big learning data can provide great value back to the learner. She may want to know what other learners who have taken the same program found most difficult. What are the types of questions that she, as a learner, most often gets wrong? What remedial actions have been most successful for other learners who failed that question or program?

Silly data be gone. There will be a major temptation by managers and consultants to provide high-definition big data analysis that shouts "silly data." Data must have context, trust, and reliability to be effective.

Tracking data exhaust without context will create huge, interactive, silly data scorecards. We should develop a set of queries that helps us evaluate the meaningfulness of the evidence and conclusions.


If you posit that big learning data will create an environment of rich information, it follows that learners will be better informed, which in turn would allow for greater personalization.

For example, say someone wants to take job B, having done job A for a year. Big data would indicate, first of all, the number of people who did job A and who then got to job B. Of the people who got job B, what preparation did they have? It also would indicate which learning programs were most effective, and what the timing was for when they attempted to change to job B.
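The questions above can be sketched as a small query over career histories. The Python below is a minimal illustration under stated assumptions: the `careers` records, their `jobs` and `programs` fields, and both helper functions are hypothetical, and each person's listed programs are assumed to have been completed before any move. It counts how many people held job A, how many later held job B, and which programs those movers had taken.

```python
from collections import Counter

# Hypothetical career histories, invented for illustration.
# "jobs" is in chronological order; "programs" are assumed completed before any move.
careers = [
    {"jobs": ["A", "B"], "programs": ["coaching", "budgeting"]},
    {"jobs": ["A", "C"], "programs": ["coaching"]},
    {"jobs": ["A", "B"], "programs": ["budgeting"]},
    {"jobs": ["C", "B"], "programs": ["coaching"]},
]


def moved(jobs, source, target):
    """True if `target` was held at some point after the first time `source` was held."""
    return source in jobs and target in jobs[jobs.index(source) + 1:]


def programs_of_movers(careers, source="A", target="B"):
    """Return (people who held source, people who then reached target, program counts)."""
    held_source = [c for c in careers if source in c["jobs"]]
    movers = [c for c in held_source if moved(c["jobs"], source, target)]
    # Count each program once per mover, however many times it appears in a record.
    counts = Counter(p for c in movers for p in set(c["programs"]))
    return len(held_source), len(movers), counts


held, reached, program_counts = programs_of_movers(careers)
print(f"{reached} of {held} people who held job A later held job B")
for program, count in program_counts.most_common():
    print(f"  {program}: taken by {count} of those movers")
```

Counts like these describe what movers had in common. Judging which programs were most effective would also require comparing them with the people who took the same programs and did not move.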

Big learning data also could be informative in terms of feedback and context, because a learner often might fail at a topic without knowing why. It becomes interesting when the learner can look not just at himself, but at other people who have had the same experience. He may gain an insight that either explains the failure, so he is not frustrated, or helps him correct it, so he can succeed.

Also, if big data were implemented in a comprehensive way, learners potentially would become invested in contributing data to the process because they would see the impact of how it works. We would need to be concerned about whether learning can be micro-engineered by data; I am always a combination of a behaviorist and someone who takes a more gestalt approach. That said, ultimately we can program a better mix or selection. However, we would want to be cautious if we thought that with enough data we could program everybody to a predicted success at a predicted rate, at a predicted time, with very few failures in that process.

Going forward

We are in the early stages of considering the possibilities for big learning data in the organizational workplace. While this is exciting, it also means that we have a lot of work to do as learning professionals.

Going forward, let's approach big learning data as a new world that will have great potential and also real risks with new challenges. Ironically, let's apply big data to our approaches to big learning data. Let's evaluate the impact of big learning data from a 360-degree perspective and be learners about this important field.

And finally, let's value and respect the concerns, views, and opposition to big learning data from some of our colleagues. Whether they are fearful, risk-averse, or deeply correct in their concerns, we must be open to and influenced by all perspectives on this evolving approach.

Big Learning Data: A Revolution in the Making?

By Tony Bingham

In today’s digital landscape, one where people walk around with access to the world’s knowledge in their pockets and regularly interact with millions of pieces of information, data takes on new dimensions. We are only beginning to conceptualize what can be done with the mammoth amount of information to which we have access.

I have talked about the power of technology in the learning profession for years. It is revolutionizing the way learning and development practitioners do their work. Leveraging big data is the next logical step in this evolution. The outputs of technology—the data that we gather—provide learning professionals a new vantage point from which to view the work they do.

We have access to volumes of data, but we must understand what it can tell us, what it does tell us, and, as importantly, what it can’t and doesn’t tell us. As learning professionals we take that information and layer it over the organizational goals we are seeking to support, the gaps we are trying to close, and the engagement and retention metrics we are trying to improve. And then we create courses, programs, initiatives, and processes that have sustained business impact.

Big learning data can empower us to develop the knowledge and skills of professionals around the world in ways that we’ve never been able to do before. It’s never been a more exciting time to be in the learning profession.

Tony Bingham is president and CEO of ASTD. Excerpted from Big Learning Data.