
Drs. Brandi A. Weiss and William R. Dardick will present their research, Detecting Differential Item Functioning with Entropy in Logistic Regression, at the Frontiers in Educational Measurement conference at the University of Oslo, Norway, in September 2018.

In this talk we will discuss the adaptation of four entropy variants to detect differential item functioning (DIF) in logistic regression (LR): entropy (E), entropy misfit (EM), the entropy fit ratio (EFR), and a rescaled entropy fit ratio (Rescaled-EFR). Logistic regression is frequently used to detect DIF because of its flexibility: it accommodates uniform and nonuniform DIF, binary and polytomous outcomes, and grouping variables with two or more categories. We will focus on binary LR models with two groups (reference and focal); however, we will also discuss the use of entropy with polytomous LR models and with models that have two or more focal groups. We will present both a mathematical framework and results from a Monte Carlo simulation.
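To make the model-comparison logic concrete, here is a minimal Python sketch (ours, not the authors' code) of the three nested logistic regressions conventionally used for LR DIF detection. The function name lr_dif_test and the use of a total or rest score as the matching criterion are illustrative assumptions.

    import numpy as np
    import statsmodels.api as sm
    from scipy.stats import chi2

    def lr_dif_test(y, score, group):
        """Illustrative G2 (likelihood-ratio) DIF tests for one binary item.
        Nested models: M1 = matching score only; M2 adds group (uniform DIF);
        M3 adds the score-by-group interaction (nonuniform DIF).
        y = 0/1 item responses, score = matching criterion,
        group = 0/1 reference/focal indicator."""
        X1 = sm.add_constant(np.column_stack([score]))
        X2 = sm.add_constant(np.column_stack([score, group]))
        X3 = sm.add_constant(np.column_stack([score, group, score * group]))
        ll = [sm.Logit(y, X).fit(disp=0).llf for X in (X1, X2, X3)]
        g2_unif = 2 * (ll[1] - ll[0])     # group main effect
        g2_nonunif = 2 * (ll[2] - ll[1])  # interaction effect
        return {"G2_uniform": g2_unif, "p_uniform": chi2.sf(g2_unif, 1),
                "G2_nonuniform": g2_nonunif, "p_nonuniform": chi2.sf(g2_nonunif, 1)}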

A fair test is free of measurement bias and construct-irrelevant variance. When groups are found to differ on an underlying construct, test fairness may be compromised, and DIF analysis can help identify potentially biased items. Traditionally, dichotomous tests of statistical significance (e.g., χ2 and G2) have been used to detect DIF in LR; more recent work has emphasized the importance of simultaneously examining measures of effect size, and model fit statistics can be thought of as a type of effect size. Entropy has previously been used to capture the separation between categories, expressed as a single measure of approximate data-model fit in latent class analysis, data-model fit in binary logistic regression, person-misfit in item response theory (IRT), and item-fit in IRT. Because entropy captures discrimination between categories, it can be thought of as a measure of uncertainty that may be useful in conjunction with other measures of DIF. In this presentation we extend entropy for use as a measure to detect DIF that complements currently used DIF measures.
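The precise definitions of E, EM, EFR, and Rescaled-EFR will be given in the talk; as a generic illustration of the underlying idea only, the sketch below converts a binary model's predicted probabilities into case-level Shannon entropies and a normalized fit ratio on [0, 1]. The function name entropy_fit and the log(2) rescaling are our assumptions, not the authors' formulas.

    import numpy as np

    def entropy_fit(p, eps=1e-12):
        """Generic entropy-based fit for binary predicted probabilities p.
        Case-level Shannon entropy is 0 when the model separates the
        categories perfectly (p near 0 or 1) and log(2) at maximal
        uncertainty (p = 0.5). Returns a ratio rescaled so that
        1 = perfect separation and 0 = no separation."""
        p = np.clip(p, eps, 1 - eps)
        e_i = -(p * np.log(p) + (1 - p) * np.log(1 - p))  # case-level entropy
        misfit = e_i.mean() / np.log(2)                   # average relative uncertainty
        return 1.0 - misfit                               # fit ratio in [0, 1]

Lower average entropy indicates sharper separation between response categories, which is why a ratio of this kind can serve both as an absolute measure of fit and as a relative measure when compared across groups or models.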

Monte Carlo simulation results will be presented to demonstrate the usefulness of entropy-based measures for detecting DIF, with a specific focus on model comparison and changes in the entropy variants. We evaluate the following factors across 1,000 replications per condition: sample size, group size ratio, between-groups impact (i.e., differences in ability distributions), percentage of DIF items in the test, type of DIF (uniform vs. nonuniform), and amount of DIF. Results will compare the entropy variants to measures currently used to detect DIF in LR (e.g., χ2, G2, ΔR2, the difference in probabilities, and the delta log odds ratio). Statistical power and Type I error rates will be discussed.
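As a hedged sketch of how one replication of such a design might be generated under a 2PL model: the helper simulate_item and all parameter values below are placeholders for illustration, not the study's actual conditions.

    import numpy as np

    rng = np.random.default_rng(2018)

    def simulate_item(n_ref=500, n_foc=500, impact=0.0,
                      a_shift=0.0, b_shift=0.5):
        """Generate responses to one 2PL item for reference/focal groups.
        impact shifts the focal ability mean (between-groups impact);
        b_shift induces uniform DIF (difficulty difference) and a_shift
        nonuniform DIF (discrimination difference)."""
        theta = np.concatenate([rng.normal(0, 1, n_ref),
                                rng.normal(-impact, 1, n_foc)])
        group = np.repeat([0, 1], [n_ref, n_foc])
        a = 1.0 + a_shift * group          # discrimination by group
        b = 0.0 + b_shift * group          # difficulty by group
        p = 1 / (1 + np.exp(-a * (theta - b)))
        y = rng.binomial(1, p)
        return y, theta, group

Power for a given condition is then estimated as the proportion of replications in which a DIF statistic for a studied item exceeds its critical value; Type I error rates are computed the same way on DIF-free items.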

Entropy-based measures may be advantageous for detecting DIF because they provide a more thorough examination of between-group differences. More specifically, entropy exists on a continuum and thus represents the degree to which DIF may be present; it does not rely on dichotomous hypothesis testing; it has an intuitive interpretation because values are bounded between 0 and 1; and it can be used simultaneously as an absolute measure of fit and as a relative measure for between-groups comparisons.