Determine whether adaptive testing is appropriate for the assessment situation. Adaptive testing is appropriate for tests of achievement, knowledge and aptitude, where each question has a distinct correct answer. Adaptive testing is not appropriate for personality assessment or opinion surveys.
Generate potential test questions by sub-domain. For example, if you are creating a test of high school mathematics, you might create test questions in categories like "algebra," "plane geometry," "solid geometry," "trigonometry," "calculus" and so on. You may need to make finer distinctions, for example, dividing "calculus" into "integrals" and "derivatives." Generate many questions for each sub-domain: at least 20, and preferably more.
Administer your potential test questions to a sample of test-takers, preferably over 500 individuals. The larger the sample of test-takers, and the more similar they are to the population you wish to assess with your adaptive test, the better your final system will be. You should include response modes like "Skip this question" and "I do not understand this question."
Analyze the sample's responses to your potential test questions. For example, for the question, "What is 2 to the 10th power?" the data you collect should allow you to say something like, "40 percent of the sample answered correctly," "55 percent of the sample answered incorrectly," "3 percent did not answer the question" and "2 percent did not understand the question."
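The tally described above can be sketched in a few lines of Python. This is only an illustration: the function name response_percentages and the category labels ("correct," "incorrect," "skipped," "not_understood") are assumptions made for the example, not part of any particular testing package.

```python
from collections import Counter


def response_percentages(responses):
    """Return the percentage of the sample that gave each kind of response to one item."""
    if not responses:
        raise ValueError("no responses recorded for this item")
    counts = Counter(responses)
    total = len(responses)
    return {category: 100.0 * n / total for category, n in counts.items()}


# Hypothetical responses from 100 test-takers to "What is 2 to the 10th power?"
responses = (
    ["correct"] * 40
    + ["incorrect"] * 55
    + ["skipped"] * 3
    + ["not_understood"] * 2
)
percentages = response_percentages(responses)
print(percentages)
# {'correct': 40.0, 'incorrect': 55.0, 'skipped': 3.0, 'not_understood': 2.0}
```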
Delete items that show deviant or extreme response patterns. Delete items that 10 percent or more of the sample did not understand, or that 15 percent or more of the sample did not answer. Delete items that everyone in the sample answered correctly, or that no one answered correctly, because such items cannot distinguish one test-taker from another.
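A minimal Python sketch of this screening rule follows. It assumes each item's results are stored as a dictionary of percentages and treats both cutoffs as inclusive (10 percent or more, 15 percent or more). The names keep_item and item_stats, and the sample figures, are hypothetical.

```python
def keep_item(pct):
    """Return True if an item survives screening.

    pct maps response categories to percentages of the sample.
    Missing categories are treated as 0 percent (an assumption of this sketch).
    """
    if pct.get("not_understood", 0.0) >= 10.0:
        return False  # too many test-takers did not understand the item
    if pct.get("skipped", 0.0) >= 15.0:
        return False  # too many test-takers did not answer the item
    correct = pct.get("correct", 0.0)
    if correct <= 0.0 or correct >= 100.0:
        return False  # nobody or everybody answered correctly
    return True


# Hypothetical statistics for three items
item_stats = {
    "q1": {"correct": 40.0, "incorrect": 55.0, "skipped": 3.0, "not_understood": 2.0},
    "q2": {"correct": 100.0},
    "q3": {"correct": 30.0, "incorrect": 58.0, "skipped": 0.0, "not_understood": 12.0},
}
kept = [item_id for item_id, pct in item_stats.items() if keep_item(pct)]
print(kept)  # ['q1']
```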
Order the remaining items in descending order of difficulty. An item that only 0.5 percent of the sample answered correctly would be near the top of the list, and an item that 99.5 percent of the sample answered correctly would be near the bottom of the list.
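In code, ordering by descending difficulty amounts to sorting by ascending percentage correct. The short Python sketch below assumes each item is a pair of an identifier and its percentage correct; the item names are made up for the example.

```python
# Hypothetical items: (item id, percent of the sample answering correctly)
items = [("easy_item", 99.5), ("medium_item", 40.0), ("hard_item", 0.5)]

# Lowest percent correct (hardest) first, highest percent correct (easiest) last
ordered = sorted(items, key=lambda item: item[1])
ordered_ids = [item_id for item_id, _ in ordered]
print(ordered_ids)  # ['hard_item', 'medium_item', 'easy_item']
```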
Enter your items into your software engine for presentation in adaptive testing mode.
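How an engine chooses the next item depends on the software. As one simple possibility, not a description of any specific product, the Python sketch below starts in the middle of the ordered list and moves to a harder item after a correct answer or to an easier item after an incorrect one, in the style of a binary search. The names run_adaptive_test, answer_fn and max_items are assumptions made for the example.

```python
def run_adaptive_test(ordered_ids, answer_fn, max_items=5):
    """Present items adaptively and return {item_id: True/False}.

    ordered_ids: item ids sorted hardest first.
    answer_fn: hypothetical callback that presents an item and returns
               True if the test-taker answered it correctly.
    """
    results = {}
    low, high = 0, len(ordered_ids) - 1  # index range still worth probing
    while low <= high and len(results) < max_items:
        mid = (low + high) // 2
        item_id = ordered_ids[mid]
        correct = answer_fn(item_id)
        results[item_id] = correct
        if correct:
            high = mid - 1  # passed: try a harder item (lower index)
        else:
            low = mid + 1   # failed: try an easier item (higher index)
    return results


# Hypothetical test-taker who can answer q4 and every easier item
ordered_ids = ["q1", "q2", "q3", "q4", "q5", "q6", "q7"]
results = run_adaptive_test(ordered_ids, lambda item_id: ordered_ids.index(item_id) >= 3)
print(results)  # {'q4': True, 'q2': False, 'q3': False}
```

Only three of the seven items are presented in this example, which is the practical benefit of adaptive presentation.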
To score the test, credit the test-taker with every item up to and including the most difficult item passed, then subtract any of those items that the test-taker actually failed.
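The scoring rule can be sketched in Python under one reading of it: every item at or below the difficulty of the hardest item passed is credited, whether or not it was presented, and items in that range that were presented and failed are subtracted. Failures on harder items do not change the score under this reading. The function name adaptive_score and the sample data are hypothetical.

```python
def adaptive_score(ordered_ids, results):
    """Score one adaptive test.

    ordered_ids: all item ids, hardest first.
    results: {item_id: True/False} for the items actually presented.
    """
    passed_positions = [
        i for i, item_id in enumerate(ordered_ids) if results.get(item_id) is True
    ]
    if not passed_positions:
        return 0  # nothing passed, so nothing is credited
    hardest_passed = min(passed_positions)  # lowest index is the hardest item
    credited = len(ordered_ids) - hardest_passed
    failed_in_range = sum(
        1 for item_id in ordered_ids[hardest_passed:] if results.get(item_id) is False
    )
    return credited - failed_in_range


# Hypothetical five-item bank, hardest first; q1 was never presented
ordered_ids = ["q1", "q2", "q3", "q4", "q5"]
results = {"q2": False, "q3": True, "q4": False, "q5": True}
score = adaptive_score(ordered_ids, results)
print(score)  # 2
```

Here q3 is the hardest item passed, so q3, q4 and q5 are credited (3 items), and the failure on q4 is subtracted, giving a score of 2.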