How to Do an Item Analysis (Step by Step)

2026/06/17

To do an item analysis, score the test, then for each question calculate the difficulty index (the share of students who answered it correctly) and the discrimination index (how well the question separates high scorers from low scorers). Review the results question by question, flag items that are too easy, too hard, or that strong students miss, and revise or drop them before you reuse the test.

Item analysis is how you find out whether your test questions are actually doing their job. A question that everyone gets right tells you nothing about who learned the material, and a question that your best students miss is usually broken, not hard. This guide walks through the two core statistics, how to calculate them by hand or in a spreadsheet, what good numbers look like, and how to use the results to build a cleaner test next time. It is written for instructors, exam writers, and training teams who reuse their questions and want them to hold up.

What is item analysis?

Item analysis is the process of examining how students responded to each question on a test in order to judge the quality of those questions. It looks at one item at a time and asks two things: how hard was it, and did it tell the strong students apart from the weak ones. The goal is to keep the questions that measure learning well, fix the ones that are confusing, and retire the ones that add noise. It is most valuable when you maintain a question bank and reuse items across terms.

How do you do an item analysis?

You do an item analysis by scoring every test, then calculating two numbers for each question and reviewing what they tell you. The basic steps are the same whether you use a spreadsheet or your LMS report:

1. Score all the tests and record each student's total score.
2. Rank students by total score and split them into an upper group and a lower group. The standard method uses the top 27% and bottom 27%; for small classes, splitting into top and bottom halves works fine.
3. For each question, calculate the difficulty index: the proportion of all students who answered it correctly.
4. For each question, calculate the discrimination index: the share correct in the upper group minus the share correct in the lower group.
5. For multiple choice items, also look at how often each wrong answer (distractor) was chosen.
6. Flag the outliers, decide whether to keep, revise, or drop each one, and note the change for next time.
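
For readers who prefer a script to a spreadsheet, the steps above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function names `item_analysis` and `distractor_counts`, the 0/1 scoring layout, and the sample data are all assumptions made for the example. Students with tied totals are ranked in their input order, which is a simplification.

```python
from collections import Counter


def item_analysis(scored, group_fraction=0.27):
    """scored: one list of 0/1 item scores per student (1 = correct).

    Returns one (difficulty, discrimination) tuple per item.
    """
    n_students = len(scored)
    n_items = len(scored[0])
    # Total each student, rank high to low, then take the upper and lower groups.
    ranked = sorted(scored, key=sum, reverse=True)
    group_size = max(1, round(n_students * group_fraction))
    upper, lower = ranked[:group_size], ranked[-group_size:]

    results = []
    for i in range(n_items):
        # Difficulty: share of all students who answered item i correctly.
        difficulty = sum(row[i] for row in scored) / n_students
        # Discrimination: share correct in the upper group minus the lower group.
        upper_correct = sum(row[i] for row in upper)
        lower_correct = sum(row[i] for row in lower)
        discrimination = (upper_correct - lower_correct) / group_size
        results.append((difficulty, discrimination))
    return results


def distractor_counts(choices, key):
    """Tally how often each wrong option was chosen for one item."""
    return Counter(choice for choice in choices if choice != key)


# Hypothetical class of six students and three items; halves are used
# because the class is small.
scored = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
    [0, 0, 1],
    [0, 0, 0],
]
for number, (p, d) in enumerate(item_analysis(scored, group_fraction=0.5), start=1):
    print(f"Item {number}: difficulty={p:.2f} discrimination={d:.2f}")

# Hypothetical responses to one multiple choice item whose key is "B".
print(distractor_counts(list("ABBCBA"), key="B"))
```

A distractor that nobody chose does not appear in the tally at all, so compare the result against the full list of options to spot it.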

What is the difficulty index and how do you calculate it?

The difficulty index is the proportion of test-takers who answered a question correctly, and it runs from 0.0 to 1.0. You calculate it by dividing the number of students who got the item right by the total number who attempted it. A value of 0.85 means 85% answered correctly, which is an easy item; a value of 0.20 means only 20% did, which is a hard item. Despite the name, a higher number means an easier question.
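
As a small sketch of that calculation, assuming you already have the two counts for an item (the function name `difficulty_index` is made up for this example):

```python
def difficulty_index(num_correct: int, num_attempted: int) -> float:
    """Proportion of test-takers who answered the item correctly (0.0 to 1.0)."""
    if num_attempted <= 0:
        raise ValueError("need at least one student who attempted the item")
    if not 0 <= num_correct <= num_attempted:
        raise ValueError("num_correct must be between 0 and num_attempted")
    return num_correct / num_attempted


print(difficulty_index(17, 20))  # 0.85, an easy item
print(difficulty_index(4, 20))   # 0.2, a hard item
```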

What is the discrimination index and how do you calculate it?

The discrimination index measures how well a question separates students who did well on the whole test from those who did poorly, and it runs from -1.0 to +1.0. You calculate it by finding the proportion correct in your upper group and subtracting the proportion correct in your lower group. If 90% of the top group and 40% of the bottom group answered correctly, the discrimination index is 0.90 - 0.40 = 0.50. A positive value means strong students did better on that item, which is what you want.
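
The same subtraction as a short sketch, assuming you have counted correct answers in each group (the function name `discrimination_index` and the group sizes of 10 are illustrative choices):

```python
def discrimination_index(upper_correct, upper_size, lower_correct, lower_size):
    """Share correct in the upper group minus share correct in the lower group."""
    if upper_size <= 0 or lower_size <= 0:
        raise ValueError("both groups need at least one student")
    return upper_correct / upper_size - lower_correct / lower_size


# 90% of the upper group and 40% of the lower group answered correctly.
print(round(discrimination_index(9, 10, 4, 10), 2))  # 0.5

# A negative result: the lower group outperformed the upper group.
print(round(discrimination_index(3, 10, 6, 10), 2))  # -0.3
```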

What is a good difficulty index for a test?

A good difficulty index for most questions falls between 0.30 and 0.70, with an average around 0.50 to 0.60 across the test. Items in this range carry the most information because they split the class rather than being answered the same way by everyone. Very easy items above 0.90 and very hard items below 0.20 are not automatically wrong (a few easy openers can settle nerves), but if most of your test sits at the extremes it will struggle to tell students apart.
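
One way to apply these cutoffs is a small labeling helper. This is a sketch under assumptions: the function name `difficulty_band`, the item names, and the "outside target range" label for values between the bands (0.20 to 0.30 and 0.70 to 0.90) are choices made for this example, not standard terms.

```python
def difficulty_band(p: float) -> str:
    """Label a difficulty index using the ranges described above."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("difficulty index must be between 0.0 and 1.0")
    if p > 0.90:
        return "very easy"
    if p < 0.20:
        return "very hard"
    if 0.30 <= p <= 0.70:
        return "target range"
    # Assumed label for the gaps between the target range and the extremes.
    return "outside target range"


# Hypothetical difficulty values for four items.
item_difficulties = {"Q1": 0.95, "Q2": 0.55, "Q3": 0.15, "Q4": 0.80}
for name, p in item_difficulties.items():
    print(name, p, difficulty_band(p))

# Average difficulty across the test, to compare against the 0.50 to 0.60 guide.
mean_difficulty = sum(item_difficulties.values()) / len(item_difficulties)
print(f"mean difficulty: {mean_difficulty:.2f}")
```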

What is a good discrimination index?

A good discrimination index is 0.30 or higher, and the closer to 1.0 the better. Items between 0.20 and 0.29 are marginal and worth a second look, while anything below 0.20 is weak. A negative discrimination index is a red flag: it means low scorers got the item right more often than high scorers, which usually points to a confusing question, a miskeyed answer, or a distractor that is accidentally defensible. Fix or drop negative items first.
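
To triage a whole test, the thresholds above can be turned into labels and the items sorted so that negative ones surface first. This is an illustrative sketch: the function name `discrimination_band`, the item names, and the sample values are hypothetical, and values between 0.29 and 0.30 are treated as marginal.

```python
def discrimination_band(d: float) -> str:
    """Label a discrimination index using the thresholds described above."""
    if not -1.0 <= d <= 1.0:
        raise ValueError("discrimination index must be between -1.0 and 1.0")
    if d < 0:
        return "negative"
    if d < 0.20:
        return "weak"
    if d < 0.30:
        return "marginal"
    return "good"


# Hypothetical discrimination values for four items.
item_discrimination = {"Q1": 0.45, "Q2": 0.25, "Q3": 0.10, "Q4": -0.15}

# Review from lowest to highest so negative items are handled first.
for name, d in sorted(item_discrimination.items(), key=lambda pair: pair[1]):
    print(name, d, discrimination_band(d))
```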

Why is item analysis important?

Item analysis is important because it turns a finished test into evidence about your questions, not just your students. It catches miskeyed answers, ambiguous wording, and dead distractors that nobody picks, and it shows which topics the class actually struggled with. For anyone building a reusable question bank or a certification exam, item statistics are the basis for defending that the test is fair and measures what it claims to. Over a few cycles, the questions you keep get sharper and the test gets more reliable, which is the core idea behind test reliability and validity.

Can AI do item analysis?

AI does not run the statistics for you. The difficulty and discrimination indices come from real student responses, so you calculate those in your spreadsheet or LMS report after the test is taken. Where AI helps is on the front end and in the rewrite: a tool like PDFQuiz generates fresh questions and a clear answer key from your source material, and once item analysis flags a weak question, you can regenerate a replacement on the same topic instead of writing one from scratch. The analysis stays with you; the question writing gets faster.

To build the questions you will analyze, upload your material to the AI test generator or the multiple choice quiz maker, which produce items and an answer key you can administer and then score. If you keep a reusable pool of items, the question bank generator and exam generator help you grow and version it. For the design work that feeds good item analysis, see how to write a test blueprint and how to write good test questions.