Diagnosing Model Performance Under Distribution Shift

Cai, Tiffany Tianhui; Namkoong, Hongseok; Yadlowsky, Steve

arXiv:2303.02011 (stat)

[Submitted on 3 Mar 2023 (v1), last revised 10 Jul 2023 (this version, v4)]

Abstract: Prediction models can perform poorly when deployed to target distributions different from the training distribution. To understand these operational failure modes, we develop a method, called DIstribution Shift DEcomposition (DISDE), to attribute a drop in performance to different types of distribution shifts. Our approach decomposes the performance drop into terms for 1) an increase in harder but frequently seen examples from training, 2) changes in the relationship between features and outcomes, and 3) poor performance on examples infrequent or unseen during training. These terms are defined by fixing a distribution on $X$ while varying the conditional distribution of $Y \mid X$ between training and target, or by fixing the conditional distribution of $Y \mid X$ while varying the distribution on $X$. In order to do this, we define a hypothetical distribution on $X$ consisting of values common in both training and target, over which it is easy to compare $Y \mid X$ and thus predictive performance. We estimate performance on this hypothetical distribution via reweighting methods. Empirically, we show how our method can 1) inform potential modeling improvements across distribution shifts for employment prediction on tabular census data, and 2) help to explain why certain domain adaptation methods fail to improve model performance for satellite image classification.
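
The three-term decomposition and the reweighting step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. The function name, the dictionary keys, and the inputs `pi_train` and `pi_target` are hypothetical and chosen for illustration. The sketch assumes that a domain classifier has already produced, for each example, an estimated probability that its $X$ came from the target distribution, and that the hypothetical shared distribution has density proportional to $p(x)\,q(x) / ((1-\alpha)\,p(x) + \alpha\,q(x))$, where $p$ and $q$ are the training and target densities of $X$ and $\alpha$ is the fraction of target examples seen by the classifier. This is one possible choice of a distribution concentrated on values common to both. Under that assumption the classifier output is proportional to the weight for training examples, and one minus the output is proportional to the weight for target examples.

```python
import numpy as np

def decompose_performance_drop(loss_train, loss_target, pi_train, pi_target):
    """Split mean(loss_target) - mean(loss_train) into three telescoping terms.

    loss_train, loss_target: per-example losses of one fixed model on each sample.
    pi_train, pi_target: hypothetical domain-classifier outputs, i.e. estimated
        probabilities that each example's X was drawn from the target distribution.
    """
    loss_train = np.asarray(loss_train, dtype=float)
    loss_target = np.asarray(loss_target, dtype=float)
    # Assumed weights: proportional to shared density over each source density.
    w_train = np.asarray(pi_train, dtype=float)
    w_target = 1.0 - np.asarray(pi_target, dtype=float)

    perf_train = loss_train.mean()
    perf_target = loss_target.mean()
    # Self-normalized weighted means estimate performance on the shared X
    # distribution, once with training Y|X and once with target Y|X.
    shared_train = np.average(loss_train, weights=w_train)
    shared_target = np.average(loss_target, weights=w_target)

    return {
        # 1) X moves from training to shared, Y|X fixed at training
        "x_shift_train_to_shared": shared_train - perf_train,
        # 2) X fixed at shared, Y|X moves from training to target
        "y_given_x_shift": shared_target - shared_train,
        # 3) X moves from shared to target, Y|X fixed at target
        "x_shift_shared_to_target": perf_target - shared_target,
    }
```

By construction the three terms sum to the total difference in average loss, so the sketch attributes the whole drop without a remainder. How well each term is estimated depends on the quality of the domain classifier, which this sketch takes as given.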

Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:2303.02011 [stat.ML]
	(or arXiv:2303.02011v4 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2303.02011

Submission history

From: Tiffany Tianhui Cai
[v1] Fri, 3 Mar 2023 15:27:16 UTC (559 KB)
[v2] Fri, 10 Mar 2023 02:31:28 UTC (559 KB)
[v3] Mon, 17 Apr 2023 22:17:21 UTC (152 KB)
[v4] Mon, 10 Jul 2023 20:46:14 UTC (576 KB)