Summary
It is hard to measure learning, and even harder to measure its impact.
For both face-to-face and online learning, what aid agencies currently measure doesn’t tell us much. A lot of the confidence that face-to-face learning is superior depends on the gut feeling of the facilitators. Sometimes this is right, sometimes it’s not. You can’t make plans on gut feeling alone.
In online learning that gut feeling is missing. That, plus many expensive failures, makes many people question online learning. The reality is, you could ask many of the same questions about face-to-face learning. It’s just that in-person, you have a sense that a group “got it”. But that sense can be wrong.
What do we actually measure with in-person learning? Typically, it’s attendance and participant satisfaction. Courses may or may not attempt to measure knowledge change with pre- and post-tests. These are very hard to get right, and when non-specialists write them, they’re rarely valid. Everyone wants to show organisational impact, but few put a realistic process in place to do that.
Typical online learning is also limited in what it measures. Most is self-paced (“click-through”) e-learning. It measures completion and includes knowledge checks, which still tell us little about what people actually learned. It rarely collects learner feedback.
When implemented well (especially with structured, facilitated online courses), online learning can give more meaningful measurements. Assignments can be set for individual learners, so we know how well each learner is completing the task. They can be more realistic, so we have a sense of whether the learner can do the actual job task. And they can get a structured review from experts (or peers with expert guidance). This moves us away from a general sense of “did the learners get it?” and towards a specific assessment of the quality of the work they do.
What should you do?
To understand the impacts –
- Don’t go over the top. Be curious, explore the impact of online learning and hold it to a good standard – but don’t demand a level of proof much higher than for face-to-face.
- Focus on the skill/task level. See whether participants can complete realistic job tasks by reviewing assignments.
- If not doing so already, collect information on the top-line goals, in case there are impacts. This assumes that doing so is not prohibitively expensive.
To learn and improve –
- Look at participant reaction surveys to understand things like time commitment, ease of use, and clarity of instructions.
- Use surveys to check on learners’ perceptions of good practices.
- Organise some unstructured calls or conversations with learners to find out how it worked for them.
To communicate to sceptical stakeholders –
- Focus on skill achievement/task competence. A message such as “80% of participants included all the criteria for a high-quality project design” is powerful.
- Consider a survey for stakeholders to ask them about their perceptions of the environmental factors that would hamper transfer. This sends a message that if they are concerned by a lack of transfer, they should look at the environment as the skills are now in place.
Introduction
Moving from face-to-face training to online learning is hard. Trainers trade intimate, person-to-person contact for greater reach and sustainability. That takes away their sense of what is working for the learners. When you sit across from me, and we have a conversation over coffee, I have a sense of whether you get what I’m saying. The questions learners ask, or the comments they make over lunch, give us an impression of whether they’ve grasped the nuances of a topic or are skating across its surface.
Although there are ways to get better interaction between SMEs/facilitators and the learners online, they are inevitably mediated through technology. That creates barriers, raises the level of formality, or removes some subtle cues. With nuanced, sensitive topics, experts rightly worry about what they are missing.
But online learning also creates new, compelling possibilities for facilitators to dive deeper into learning. This paper compares what can be done online and face-to-face. And it makes recommendations for online courses. If you’re concerned about what you’re losing when you move from in-person to online, this shows you ways that you can reduce that worry – and even get more out of online.
Why is it so hard to measure learning impact?
No one thinks it is easy to measure the impact of learning. We can construct tests to see whether people make the right decisions, recall correct definitions, or get the right result of a calculation. But for workplace learning, this is a narrow way to view success. Organisations want to know if this is helping them achieve their goals. Making sure that learners did learn something is part of an evidence chain in showing impact, but it is not the end goal.
But why is it hard? Surely, we can run a training course, see something change for the organisation and then celebrate?
We can’t look inside someone’s head and see it filling up with knowledge. Even if we take a practical approach and have them show their skills on a training course, there’s no guarantee they will do it in real life.
And there are enormous challenges in showing causality in a world as interconnected and often chaotic as ours. Linking a change in organisational metrics to one factor (like learning) is hard. Even well-designed social science or psychological studies struggle to unpick causality. And aid agencies simply don’t have the resources to run randomised controlled trials on their training.
Practically, the ways we work make the approaches that could show impact too complicated, too expensive or at odds with work culture. Our groups are spread around the world. The environments they work in are very different from one another. We can’t feasibly go and observe people as they do their work to see if they use what they learned.
Let’s think through the factors that make it so hard for aid agencies, who work in truly unique circumstances, to measure impact from training.
Practical factors that make showing impact hard for aid agencies
Vague goals.
Organisational goals are often unclear or can’t be measured. There aren’t many equivalents to “increase sales of Product X by 13% in Quarter 4”. We could do better on that – but for now goals remain poorly defined. That means you can’t assess whether there is a meaningful contribution to meeting them.
Highly dispersed learners.
Think about a 20th Century factory – the workers are all in one location, with the same environment, with similar cultural and legal constraints, doing similar tasks. Aid agencies are trying to do something very different. If learners from Afghanistan show results and learners from Guatemala don’t, does that mean the programme worked, or not? The environment is so different that it will be very difficult to judge – even when the learners ostensibly do the same job.
Broken causal chain to business results.
Many factors influence business results, so it is hard to show a causal relationship between improved skills and business impacts. The same is true for factors like staff hiring, investments in processes, strategies or new ways of working, fundraising or marketing spending, and so on. If you start advertising ice-cream and sales go up, maybe it is due to the hot weather? It is a real challenge and does need careful monitoring of each stage – gaining skills; making decisions; applying the skills; and finally impacts, including negative impacts.[1]
Difficulty in measuring transfer to the job.
How do we know if learners are using new skills on the job, or using them correctly? Surveys are an option, but they have validity problems. Individuals can be biased or feel pressured to report certain results. Surveying colleagues or managers has the same problems. Not many people would say that their manager knows more about what skills they apply on the job than they do themselves. Observation is a better measure, but is extremely expensive – generally prohibitively so. And observation has its own effects on behaviour. People act differently when observed, even subtly[2]. They might use better approaches just because you are watching – which tells you little about the effectiveness of the learning intervention.
Tough to construct valid assessments of knowledge.
When we realise that participant surveys aren’t good enough for showing impact, the first place most people turn is pre and post-tests, to show that the group learned something. But this rarely works. Firstly, because organisations don’t really care if the group learned something. They care if the work gets done differently, and if there’s a positive impact from that. Secondly, because it’s so hard to make a good test.
The academic field of psychometrics looks at how to gain insights into mental processes and assess learning. Creating valid tests is hard. Ideally, the test approaches the same bit of knowledge from several different directions, in multiple questions, to cross-check responses. Questions must be well written, so they neither lead nor confuse the respondent. Answer options must be clear, non-overlapping, and aimed at the right piece of knowledge.
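To make the cross-checking idea concrete, here is a minimal sketch, assuming a hypothetical item bank where each concept is probed by several independently worded questions. The item IDs, concept names and “consistency” rule are all illustrative, not a real psychometric method – proper test construction also involves piloting, item analysis and reliability statistics.

```python
# A minimal sketch of cross-checking responses per concept.
# The item bank, concepts and consistency rule are hypothetical.
from collections import defaultdict

# Each concept is probed by several independently worded items.
ITEM_BANK = {
    "q1": {"concept": "needs_assessment", "correct": "b"},
    "q2": {"concept": "needs_assessment", "correct": "d"},
    "q3": {"concept": "needs_assessment", "correct": "a"},
    "q4": {"concept": "logframe_logic", "correct": "c"},
    "q5": {"concept": "logframe_logic", "correct": "a"},
}

def concept_report(responses):
    """Score each concept across its items and flag mixed patterns."""
    results = defaultdict(list)
    for item_id, answer in responses.items():
        item = ITEM_BANK[item_id]
        results[item["concept"]].append(answer == item["correct"])
    report = {}
    for concept, outcomes in results.items():
        score = sum(outcomes) / len(outcomes)
        # A mixed pattern hints at guessing or a flawed item,
        # rather than secure knowledge.
        report[concept] = {"score": round(score, 2),
                           "consistent": score in (0.0, 1.0)}
    return report

print(concept_report({"q1": "b", "q2": "a", "q3": "a", "q4": "c", "q5": "a"}))
# {'needs_assessment': {'score': 0.67, 'consistent': False},
#  'logframe_logic': {'score': 1.0, 'consistent': True}}
```

Even a toy version like this shows why single-question checks are so weak: one lucky answer tells you almost nothing, while an inconsistent pattern across related items is itself useful information.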
It is actually easier to assess skills or task competency. Then you are not trying to get an insight into someone’s head. You are seeing if they can do a given piece of work to an expected standard. Whatever happens in their head is irrelevant, if the work gets done the way it should.
Weak links between content covered and learner job tasks.
This could be avoided, but is the way the sector works. Training courses are driven more by supply (what the expert wants to share, or believes learners should know), rather than by need. When training is not about how someone does their job, it stops being training, and becomes education.
General education is very important. Having a rich understanding of the world helps you in myriad, subtle ways. But that doesn’t make it the most direct or appropriate way to improve how you work. It would be a real stretch to say that somebody getting an economics degree made their job performance improve (unless the job is being an academic economist)!
General learning about the sector, theories, history, and issues may help an individual and eventually organisations. But it will be hard to show impact from it, because it is tangential to the work people do.
A lot of on-the-job training is only a slight improvement over this. It is about frameworks, organisational policies, and interagency agreements. It is information-heavy, and only loosely linked to the work people actually do. This is more relevant than general education, but isn’t practical enough to show impact – plus it is boring. Impact will be much easier to see when training targets the real skills that learners need.
Reliance on gut sense.
Much learning assessment, especially face-to-face, is grounded in the facilitator’s intuition about what’s happened for the learners. It’s not that we’re wrong on this. It’s that we might be!
The facilitator’s sense of learners’ understanding and competence, gained from casual observation and interactions, is not wrong. A lot of the time it will be very accurate about who the top performers are. But it is fallible.
We can be wrong about how competent someone is, thrown off by good presentation skills or an easy manner. We can miss quiet star performers. And we can fail to see how little a struggling learner really gets it – because they’re quiet and keep their head down. This intuition is a useful data point – particularly for learning how to improve the experience. It isn’t enough to make informed choices, though.
Overuse of self-reporting.
Learners struggle to judge how well they have learned and whether there was impact – and so do their managers! What learners report about their experiences and the impact on their work is interesting. Occasionally, for low-cost programmes, it will be enough to make a business decision on whether to continue, expand or change. But these are not fair, unbiased judgements – and they can’t really tell us about impact.
Inappropriate expectations for accuracy and proof.
There is a flipside to the criticisms of validity. Demands for scientific proof can be a way of avoiding action.
No business case is ever proof – it is a convincing picture. These demands push decisions down the road or absolve decision makers of their responsibility to make judgement calls with imperfect information. And all business information is imperfect. An overemphasis on accuracy and scientific validity can reduce the desire to evaluate as best you can, too. If self-reports are not accurate, why not just stop gathering the data?
It can also push you towards disproportionately expensive evaluation. Costs of evaluation should be in proportion to the programme costs and the expected benefits; anything else is just bad management. If a programme costs $10,000 and you spend $9,000 to evaluate it, that’s 90% of the programme cost – wildly out of proportion. Conversely, for an expensive, strategic programme costing $250,000 that will carry on in the future, you should not baulk at spending $10,000 (4%) to evaluate it.
Yes, that’s a lot of challenges! You’d be forgiven for thinking it’s not possible and we shouldn’t try. But with some adjustments to how we work, there’s a lot that we can do. Before we come on to where we want to go, first we need to look at where we are. We’ll start with in-person training, because lots of people are happy enough that it has impact – at least, they’ve accepted that it’s a good thing to do for many years.
What do we actually measure in face-to-face training?
This varies a great deal, of course. There are patterns though. The format of face-to-face training workshops (run over a few days) lends itself to some kinds of measurement and makes others harder.
Most often, training workshops don’t measure much.
At the level of an individual course, we might not think about it, but one of the most important things measured is attendance. Beyond the individual course, this is one of the most widely used metrics. We report on how many people have been trained in one year, the number of contact days, or how many were trained under one project. This is relevant management information. If we are managing a training project, we want to know this.
But it says little about learning.
It would be surprising if someone attended a three-day workshop and learned nothing. But that is a long way from knowing that the people who attended achieved the course learning objectives. And when training is poorly designed, it is possible for the amount of learning to be very low. Certainly, the amount of learning that can be applied back on the job can be very, very low.
Beyond attendance, courses most often measure participant reactions. Participants get a simple survey with closed response questions (and perhaps a few open ones) at the end of the workshop. These might include questions about whether they thought they learned something, whether they will use it on the job and so on.
Occasionally, workshops include pre- and post-tests. In general, these are not valid tests. They get created by someone with the right instincts but limited experience in assessing learning. They include leading or misleading questions. The answer choices are obvious. Often the same questions appear on both the pre- and post-test. Even with these imperfections, this kind of attempt to measure learning is still rare.
What we do get is facilitator observation. There is a sense of whether participants are processing information correctly from the nature of the questions they ask, for example. We might see whether people seem confident or capable in exercises. This is important. It is a valid source of information. But there are drawbacks, as discussed above.
Of course, there are so many different courses that there’s huge variety in what gets measured – but it is limited. This is important to remember when we’re comparing in-person to online training – we don’t have a lot of insight into what the in-person courses are doing. I’ve got no problem with holding online training to a high standard. But we should want in-person training to be held to a similarly high standard.
What does online training normally measure?
A lot of e-learning also measures very little – even less than face-to-face training. It measures the number of completions and completion rates, where “completion” means having clicked through all the screens or watched the videos. Very few courses collect learners’ views and opinions, which many in-person workshops do.
E-learning can try to understand participant behaviour or satisfaction through more quantifiable means. If a high percentage of learners exit the course at a certain point, that indicates that something is wrong around that point.
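As a concrete illustration, here is a minimal sketch of spotting a drop-off point from completion data. The log format and module names are hypothetical – every platform exposes this differently (e.g. via LMS reports or xAPI statements) – so treat this as the shape of the analysis, not a real integration.

```python
# A minimal sketch of a completion funnel. The "furthest module
# reached" log and the module names are hypothetical.
from collections import Counter

MODULE_ORDER = ["intro", "module2", "module3", "final"]

# Hypothetical log: the furthest module each learner reached.
furthest = ["intro", "module2", "module2", "module3", "final",
            "module2", "final", "module3", "module2", "final"]

dropped_at = Counter(furthest)

# Count how many learners got at least as far as each module.
still_active = len(furthest)
for module in MODULE_ORDER:
    print(f"{module}: {still_active}/{len(furthest)} learners reached this point")
    still_active -= dropped_at[module]
# A sharp fall between two modules (here, 9 -> 5 after module2)
# points at where the problem lies.
```

A funnel like this says nothing about learning, of course – but it does tell a course designer exactly where to look first.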
During webinars, some platforms offer the ability to track participant attentiveness, based on whether the webinar window is in the foreground. In principle, mouse movements can track attention too, though this does not seem to be widely used. Knowing whether people are switching off from your webinar could help you design a more engaging one. But it has no value as a measure of learning, because watching something and learning it are not the same thing at all.
The reality is that e-learning developers rarely use the feedback they get to improve courses – at least not the courses in question.
Within the sector, most agencies don’t have their own e-learning production team (nor should they). Click-through e-learning then gets created as a one-time product, developed by an agency for a client. The agency that develops it hands it over and has no further involvement with how it is used. So there is little or no opportunity to see what is going wrong and improve based on that. The product has been delivered and won’t be changed.
Another common measure in e-learning is whether learners pass “knowledge checks”. These are a much more prominent feature of e-learning than of face-to-face training. However, they are just as often poorly constructed, so they’re not actually testing what they claim to. They normally come immediately after the course presents the content. This means that the knowledge check isn’t confirming recall in any meaningful way. The content is still in learners’ short-term memory, so the check shows nothing about actual learning.
The way the humanitarian sector does it, neither in-person nor online learning seems to be measuring much of use! The thing is, there are many opportunities with online learning that we’re not using. Next, we’ll explore those, before we look at how you should think about measuring online courses.
What can we measure online that is hard to do face-to-face?
Both online and face-to-face learning programmes miss chances to measure learning, learner performance and impact. The gut sense that facilitators get from workshops, and the enjoyable workshops that senior leaders themselves take part in, mean that leaders often don’t ask many questions about their effectiveness. E-learning has overpromised and under-delivered at great expense, so there is healthy scepticism about it and its impact.
There are ways in which online learning can do much more to assess learning than in-person workshops – when we move away from the paradigm of click-through e-learning.
Assessment options
Individual assignments.
Online courses make it easy to create tasks that participants complete themselves. This makes it a lot clearer what an individual learner can do.
This could be done face-to-face, but would seem strange and artificial in a workshop. So, in-person, group or paired work dominates. There are advantages to that, but it hides what is happening for individuals. One person could be doing almost everything in a task, with others standing by. Given that, what does it even mean that someone “completed” a course? Just as important, as facilitators and designers we are left wondering whether our course is working for everyone, or just a few.
Realistic assignments.
Assignments can be more realistic online. We can allocate enough time that learners are doing something very like the real work.
In-person workshops usually need quick approximations of work tasks – exercises of 30 minutes or so. Anything longer can lead to a loss of focus, or a drop in group energy. An online assignment could be creating a real, complete logframe over a couple of hours. It is unlikely that we would go into that level of detail in a workshop. But that realism helps learning, and is more convincing evidence that someone can actually do the work.
Review and peer review.
Facilitators can provide structured and individualised feedback – potentially to every single member of the group, if they have the time.
It is not just about raising the main learning points for the group. Each person can hear where they’ve gone right, where they’ve gone wrong and how they can improve. Feedback is tailored and makes points directly about that person’s work – even if the same points have been made to others. A feedback guide (rubric) helps work to be reviewed consistently and comprehensively (covering all the main points). With appropriate rubrics and peer review functions in a platform, fellow learners can do this too. That creates another chance to learn, by seeing the work of others and thinking through how it meets the criteria. As facilitators are able to review realistic work products, we can get a much more reliable sense of the skills learners have.
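Here is a minimal sketch of what rubric-based review looks like once you aggregate it. The criteria, checklist format and “meets all” rule are hypothetical – a real rubric would be written by subject experts for the specific assignment – but it shows how individual reviews roll up into statements about a whole cohort.

```python
# A minimal sketch of rubric-based review. The criteria and the
# "meets all" rule are hypothetical examples.

# Criteria a reviewer checks in, say, a submitted project design.
RUBRIC = [
    "problem clearly stated",
    "outcomes measurable",
    "indicators match outcomes",
    "assumptions identified",
]

def review(checklist):
    """Summarise one reviewer's rubric checklist for one submission."""
    met = sum(1 for criterion in RUBRIC if checklist.get(criterion, False))
    return {"criteria_met": met, "meets_all": met == len(RUBRIC)}

# Aggregating across a cohort supports messages like
# "X% of participants met all the criteria".
reviews = [
    review({c: True for c in RUBRIC}),                # met everything
    review({"problem clearly stated": True,
            "outcomes measurable": True}),            # partial
]
share = sum(r["meets_all"] for r in reviews) / len(reviews)
print(f"{share:.0%} of participants met all the criteria")  # 50%
```

This is exactly the kind of number that feeds the stakeholder message discussed below – “80% of participants included all the criteria”.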
As it actually exists, learning measurement is pretty poor for both in-person and online courses. But there are some ways in which measuring learning and showing its impact is easier for online courses than in-person ones. And given that we know there are benefits of scale and sustainability for online courses, it makes sense to go deeper on that and see if it works for you.
How should you think about measurement of your online courses – especially if you’re moving online from in-person?
To understand the impacts –
- Don’t go over the top. You probably have stakeholders who have accepted the impact of the face-to-face training to date. Certainly, be curious, explore the impact of online and hold it to a good standard – but don’t demand a level of detail and proof that is vastly higher than the expectations from in-person courses.
- Focus on the skill/task level. Can participants complete a realistic job task? If so, they’ve got a good chance of doing it again, somewhere else. Evaluating transfer back to the work environment would be a big task. Showing that learners can do the tasks is probably already an improvement on how you show impact – which means it should be good enough for now. If you’re concerned, have experts review assignments for quality.
- Try to understand your organisation’s or office’s top-line goals, and try to align training work with them. If there are changes in the measurements of those, you want to know. Even without a direct causal chain, it would be compelling if there was a change once the courses were running.
To learn and improve –
- Use participant reaction surveys to understand things like time commitment, ease of use and clarity of instructions. Avoid using these for issues participants are less able to assess objectively (like how much they learned, or how useful it was to them). For the first few iterations of your course, include one or two questions previously used with the face-to-face training. This will reassure you and your team that the online course is getting similar reactions. Use a question like “On a scale of 1 to 5, how likely are you to apply what you learned in the course to your day-to-day work?”.
- Include questions that check on learners’ perceptions of good practices. Surveys should ask about known elements of good learning design and whether they were present. An example might be “did we leave topics for some time and then return to them?”
- Surveys should not be about how much they “enjoyed” the course. Online learning is not necessarily as much “fun” as face-to-face, especially if you’ve got longer, tougher assignments. That doesn’t mean people are learning less. Keep in mind that mastering a skill is inherently rewarding.
- Organise some unstructured calls or conversations with learners to find out how it worked for them.
- Focus your “learning” on the course processes (e.g. onboarding, joining instructions, payment) and, to some extent, the clarity and ease of use of the course. The latter shades into learning design but is not the same thing. You want to see how you can run the course better. The actual design of exercises, supports, etc. should be grounded in good learning practice. These may be easy or hard, pleasant or unpleasant for the learners, depending on the skill they have to practise. Yes, check whether learners recognise those elements as being present, but you need to audit your own designs for them.
To communicate to sceptical stakeholders –
- Skill achievement/task competence should go a long way. It’s very compelling, especially as it is missing from other learning initiatives (e-learning, most face-to-face). A message such as “80% of participants included all the criteria for a high-quality project design in their assignment” is powerful.
- Consider a survey asking stakeholders how they see the environmental factors that would hamper transfer. This sends a message that if they are concerned by a lack of transfer, they should look at the environment. That is likely more within their sphere of influence/control than the course team’s – and the skills are now in place (if your training has done its job). You can use the results of this to lobby for future changes in the work environment. You can also adjust the course to respond to likely environmental challenges.
Do those things, and you’ve got a great chance of success. Happy measuring!
Hey – if you’ve actually read this far, you must be really interested in this. This article is more than 4,300 words, so you’re not a casual reader. If you are as interested as I think you are, book a call with me and let’s talk about what showing impact from online learning would look like for you. No problem to just have a casual chat, or to explore a project.
[1] Thalheimer, W. (2018). The learning-transfer evaluation model: Sending messages to enable learning effectiveness.