Kirkpatrick Model: 4 Levels of Training Evaluation (Examples)

2026/06/26

The Kirkpatrick Model is a four-level framework for measuring whether training actually worked, moving from learner reaction up to business results. The four levels are Reaction (did people find it useful), Learning (did they gain the knowledge or skill), Behavior (are they applying it on the job), and Results (did it move a business metric). Most teams evaluate Level 1 with a feedback survey and Level 2 with a short quiz or test, then track Levels 3 and 4 over the weeks that follow. This guide covers what each level measures, examples for each, the questions to ask, and how to build the Level 2 knowledge check quickly.

What is the Kirkpatrick Model?

The Kirkpatrick Model is a training evaluation framework created by Donald Kirkpatrick in the 1950s, and it remains one of the most widely used models in corporate learning and development. It organizes evaluation into four levels that build on each other, so a team can answer not just whether learners liked a course but whether they learned anything, changed how they work, and produced a measurable return. The model is deliberately simple, which is why L&D teams, compliance trainers, and HR departments across US companies use it as a shared language for proving training value.

What are the 4 levels of the Kirkpatrick Model?

The four levels are Reaction, Learning, Behavior, and Results. Each one answers a different question and uses a different method, from a quick post-session survey at Level 1 to a business-metric analysis at Level 4. You do not have to measure all four for every course, but stronger evidence comes from the higher levels. This table summarizes what each level measures and how teams typically collect it.

Level	Question it answers	How you measure it
1. Reaction	Did learners find the training engaging and relevant?	Post-session feedback survey, rating scales, short comments
2. Learning	Did they gain the intended knowledge or skill?	Quiz or test, pre and post scores, skill demonstration
3. Behavior	Are they applying it back on the job?	Manager observation, on-the-job checklists, 30 to 90 day follow-up
4. Results	Did it improve a business metric?	KPIs like error rate, sales, retention, safety incidents

Level 1: Reaction

Level 1 measures how learners respond to the training right after they finish it. You ask whether the content was relevant to their role, whether the pace and format worked, and whether they would recommend it. A simple five-question rating survey sent at the end of a session captures this. Reaction data alone never proves learning happened, but a course that scores poorly here usually has a design problem worth fixing before you invest in the higher levels.

Example questions: How relevant was this training to your daily work? How confident do you feel using what you learned? What would you change about the session?

Level 2: Learning

Level 2 measures whether learners actually gained the knowledge, skills, or confidence the course set out to teach. This is where a knowledge test does the work: you give a short quiz after the session, and ideally the same quiz before it, so you can compare scores and show the gain. Tie every question to a learning objective, set a passing score in advance, and use a mix of recall and applied items. Most teams build this with multiple choice because it scores instantly and covers a lot of ground; a well-built multiple choice question maker turns your training material into draft items in seconds. You can also generate the test straight from your slides or handbook with an employee training quiz maker instead of writing every question by hand.

For a deeper walkthrough of designing this step, see our guide on how to create a post-training assessment and the basics of a knowledge check.

Level 3: Behavior

Level 3 measures whether learners are applying the training on the job, which is where most evaluation programs stop short. People can pass a quiz and still revert to old habits if the workplace does not support the change. You evaluate this 30 to 90 days later with manager observations, on-the-job checklists, or a follow-up assessment that asks how often the new behavior is happening. The honest finding here is often that the training was fine but the environment, tools, or incentives blocked the change.

Example methods: a manager scorecard rating observed behaviors, a self-report follow-up survey, or a spot-check audit of completed work.

Level 4: Results

Level 4 measures the business outcome the training was meant to influence, such as lower error rates, higher sales, fewer safety incidents, or better retention. You pick the metric before the program starts, capture a baseline, and compare it after enough time has passed for behavior change to show up. This level is the hardest to isolate because many factors move a business metric, so teams use control groups or trend lines to make a reasonable case rather than claim perfect proof.

Training program	Level 2 measure	Level 4 result metric
Compliance / safety	Passing score on a policy quiz	Drop in incidents or violations
Sales onboarding	Product knowledge test score	Ramp time, win rate, quota attainment
Customer support	Process quiz score	First-contact resolution, CSAT
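One common way to make the control-group case described for Level 4 is a difference-in-differences comparison: subtract the change seen in an untrained group from the change seen in the trained group. The sketch below is a minimal illustration, not a full analysis. The function name and the incident figures are hypothetical, and the method assumes both groups would have followed the same trend without the training.

```python
def difference_in_differences(trained_before, trained_after,
                              control_before, control_after):
    """Estimate the metric change attributable to training.

    Subtracts the control group's change from the trained group's change,
    so that trends affecting both groups cancel out.
    """
    trained_change = trained_after - trained_before
    control_change = control_after - control_before
    return trained_change - control_change


# Hypothetical monthly safety incidents per 100 employees.
effect = difference_in_differences(
    trained_before=6.0, trained_after=3.5,
    control_before=6.2, control_after=5.7,
)
print(round(effect, 2))  # negative means fewer incidents than the trend alone explains
```

A result near zero suggests the metric moved for reasons unrelated to the course, which is the situation the baseline and control group are meant to expose.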

Why is the Kirkpatrick Model important?

The Kirkpatrick Model matters because it forces training to prove value beyond attendance and smile sheets. Leadership funds programs that show results, and the four levels give L&D a credible way to connect a course to a business outcome. It also helps you diagnose failure: a low Level 2 score points to course design, while a strong Level 2 paired with a weak Level 3 points to a workplace barrier, not a knowledge gap. That diagnostic value is why the model has lasted more than 60 years.
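The diagnostic reading above can be written down as a simple rule. This is a rough sketch only: the function name, the rate inputs, and the 0.7 cutoff are assumptions chosen for illustration, and a real program would set its own thresholds.

```python
def diagnose(level2_pass_rate, level3_adoption_rate, threshold=0.7):
    """Suggest where to look first, based on Level 2 and Level 3 rates (0 to 1).

    The threshold is a hypothetical cutoff, not part of the Kirkpatrick Model.
    """
    if level2_pass_rate < threshold:
        # Learners did not acquire the knowledge: look at the course itself.
        return "course design"
    if level3_adoption_rate < threshold:
        # Learners passed the test but are not applying it on the job.
        return "workplace barrier"
    return "no issue flagged"


print(diagnose(level2_pass_rate=0.9, level3_adoption_rate=0.4))
```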

How do you measure Level 2 learning?

You measure Level 2 by testing knowledge before and after training and comparing the scores. Build a short quiz of 8 to 15 questions tied directly to your objectives, give it as a pre-test to set a baseline, deliver the training, then give the same or an equivalent post-test. The score gain is your evidence of learning. Set a clear passing threshold ahead of time, mix recall questions with applied scenarios, and keep the test short enough that people finish it without fatigue. Our guide on how to set a passing score walks through choosing that threshold. For regulated programs, a compliance training quiz documents that each employee met the standard.
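The pre-test and post-test comparison can be summarized in a few lines. The sketch below assumes each learner has one pre score and one post score, both in percent correct. The function name, the sample scores, and the passing score of 80 are hypothetical.

```python
from statistics import mean


def score_gain(pre_scores, post_scores, passing_score):
    """Summarize Level 2 learning from paired pre/post quiz scores."""
    if len(pre_scores) != len(post_scores) or not pre_scores:
        raise ValueError("need one pre and one post score per learner")
    # Per-learner gain, so the average reflects individual improvement.
    gains = [post - pre for pre, post in zip(pre_scores, post_scores)]
    passed = sum(1 for post in post_scores if post >= passing_score)
    return {
        "mean_pre": mean(pre_scores),
        "mean_post": mean(post_scores),
        "mean_gain": mean(gains),
        "pass_rate": passed / len(post_scores),
    }


# Hypothetical scores for five learners, in percent correct.
pre = [40, 55, 60, 35, 50]
post = [80, 85, 90, 70, 75]
summary = score_gain(pre, post, passing_score=80)
print(summary)
```

Reporting the mean gain alongside the pass rate keeps the two questions separate: how much people improved, and how many met the standard you set in advance.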

How does the Kirkpatrick Model compare to the ROI model?

The Kirkpatrick Model has four levels; the Phillips ROI Model adds a fifth level that converts Level 4 results into a financial return-on-investment figure. If your leadership wants a dollar value, Phillips extends Kirkpatrick rather than replacing it. For most teams, evaluating cleanly through Levels 1 and 2 on every course and reaching Levels 3 and 4 on high-stakes programs is a realistic and defensible standard.
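The ROI figure at the fifth level is usually expressed as net program benefits divided by program costs, times 100. A minimal sketch of that arithmetic follows. The function name and dollar amounts are hypothetical, and the hard part in practice, converting Level 4 results into a monetary benefit, is assumed to be done already.

```python
def roi_percent(monetary_benefits, program_costs):
    """Return ROI as a percentage: net benefits / costs * 100."""
    if program_costs <= 0:
        raise ValueError("program_costs must be positive")
    net_benefits = monetary_benefits - program_costs
    return net_benefits / program_costs * 100


# Hypothetical program: $150,000 in benefits against $60,000 in costs.
roi = roi_percent(monetary_benefits=150_000, program_costs=60_000)
print(roi)  # 150.0
```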

Can AI build the Level 2 quiz for me?

Yes. Instead of writing test questions by hand, you can upload your training deck, manual, or notes and have the questions generated for you, then edit the ones you want to keep. Upload the source and you get a draft quiz in seconds; you can also turn a PDF into a quiz directly from a policy document or handbook. If your source material is a scanned manual or printed handout, run it through a tool like document OCR software first so the text is machine-readable. Once trained staff sign off on a compliance course, capture the acknowledgment with online document e-signing, and track certificates and renewal dates for regulated programs with compliance tracking software.

Putting the model to work

Start small. Add a Level 1 survey and a Level 2 quiz to your next course, since those two are quick to set up and give you immediate evidence. Pick one program where the business cares about the outcome and follow it through Levels 3 and 4 to build a full case study. Over a few cycles you will have a repeatable evaluation routine that shows which training earns its budget and which needs a redesign, which is exactly what the Kirkpatrick Model was built to reveal.
