What Is IRT?

A Brief, Nontechnical Description of IRT

NOTE: The following is excerpted from Ariel Linden's (Linden Consulting Group) review of Tenko Raykov and George Marcoulides’s A Course in Item Response Theory and Modeling with Stata, published in The Stata Journal (2018) 18, Number 2. I am not the author of the review, whom I know by name only, but I have the book and also recommend it.

Item response theory (IRT) is foundational to instrument development and therefore not typically covered in general statistical training. However, IRT is potentially applicable to a variety of measurement problems, making it a valuable methodology for a broader audience.

IRT is a statistical methodology for conducting latent-variable modelling in which the responses to items on an instrument are assumed to be explained by one or more latent (unobserved) variables (also referred to as constructs, traits, abilities, etc.). IRT emphasizes the probability of a response for each item as being a function of the level of the latent variable and item characteristics.

In the case of binary scored items, the response probability is typically expressed using the logistic function (referred to in IRT as the item characteristic curve [ICC]), in which the probability of a “correct” response (with “correct” representing a “yes” in a yes or no response or a “1” in a 1 or 0 response, etc.) is on the y axis and is plotted against levels of the underlying latent variable θ (theta). The investigator is typically interested in assessing the position of the latent variable where the probability of a “correct” response for an item is 0.50.


The further to the left on the horizontal axis of the ICC that this point occurs (that is, the lower the level of θ at which the curve crosses a probability of 0.50), the stronger the inclination to choose the “correct” response over the “incorrect” response, and vice versa. In the jargon of IRT, an item is considered easier if a “correct” response is obtained at a lower range of the latent variable relative to other items and is considered more difficult if a “correct” response is obtained at a higher range of the latent variable relative to the other items. Correspondingly, the model parameter that provides this estimate is called the “item difficulty parameter”.
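To make the idea concrete, here is a minimal Python sketch of an item characteristic curve. It assumes a two-parameter logistic form, which the excerpt does not spell out; the function name icc, the discrimination argument, and the two item difficulties are hypothetical values chosen purely for illustration, not estimates from any data.

```python
import math

def icc(theta, difficulty, discrimination=1.0):
    """Probability of a "correct" response at latent level theta,
    assuming a two-parameter logistic item characteristic curve."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))

# Hypothetical difficulty parameters (illustrative only).
easy_difficulty = -1.0
hard_difficulty = 1.5

# When theta equals the item's difficulty, the probability is 0.50.
p_easy_at_b = icc(easy_difficulty, easy_difficulty)
p_hard_at_b = icc(hard_difficulty, hard_difficulty)

# At the same latent level (theta = 0), the easier item is more likely
# to receive a "correct" response than the harder one.
p_easy = icc(0.0, easy_difficulty)
p_hard = icc(0.0, hard_difficulty)

print(p_easy_at_b, p_hard_at_b)
print(round(p_easy, 3), round(p_hard, 3))
```

Under this assumed form, the difficulty parameter is simply the value of θ at which the curve passes through 0.50, so moving it to the left makes the item easier.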

After fitting an IRT model (presumably after determining that it fits the data better than other available models), the investigator may inspect the item information function to see how much information (and where along the continuum) each item contributes about the latent variable. Moreover, the investigator may inspect the test information function—which summarizes the behavior of all items in a single curve—to ensure that the instrument differentiates best between respondents in the desired range of the latent variable. For example, a medical researcher may be interested in developing an instrument to detect early warning signs of impending dementia. Such an instrument would have its best differentiation capabilities in the lower range of scores of the latent trait.

Conversely, a researcher developing an instrument to measure perceived pain may desire to have its best differentiation potential in the higher range of scores to ensure that a patient experiences a relatively high level of pain before receiving a powerful narcotic. Finally, an investigator will want to assess whether any item exhibits differential item functioning; that is, whether individuals belonging to different subgroups, but with the same level of the latent variable, have a different probability of responding “correctly” to the item.

Although the preceding description of IRT focuses on instruments using binary scored items, IRT can also be used to develop instruments with polytomous items (which may be ordinal or nominal) or instruments with a mix of item types (called hybrid models). Fortunately, the various functions used for developing instruments with binary scored items are readily applied in these more complex models as well.
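As a rough illustration of how the binary case extends to ordinal items, here is a short Python sketch based on the graded response model. That model is one common choice for ordinal items and is an assumption here, since the excerpt does not name a specific model. The thresholds and discrimination are hypothetical values.

```python
import math

def cumulative_prob(theta, threshold, discrimination=1.0):
    """Probability of responding in a given category or higher,
    modeled with the same logistic curve used for binary items."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - threshold)))

def category_probs(theta, thresholds, discrimination=1.0):
    """Category probabilities for an ordinal item under the graded
    response model. K ascending thresholds give K + 1 categories."""
    cumulative = (
        [1.0]
        + [cumulative_prob(theta, t, discrimination) for t in thresholds]
        + [0.0]
    )
    # Each category's probability is the difference between adjacent
    # cumulative probabilities.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]

# Hypothetical four-category item (illustrative values only).
probs = category_probs(theta=0.0, thresholds=[-1.0, 0.0, 1.5], discrimination=1.3)
print([round(p, 3) for p in probs])
```

Each threshold behaves like a difficulty parameter for the step between adjacent categories, which is why the tools described for binary items carry over with little change.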

In summary, I strongly recommend this book both for students of an introductory course in instrument development and for more seasoned researchers interested in conducting IRT analyses in Stata who may not have been exposed to IRT as part of their statistical training.


Raykov, T., and G. A. Marcoulides. 2018. A Course in Item Response Theory and Modeling with Stata. College Station, TX: Stata Press.
