Authors
Cho H, Kley T, Li H
Journal
Journal of the Royal Statistical Society, Series B
Citation
J R Stat Soc Series B: Stat Method. 2025. accepted manuscript.
Abstract
For data segmentation in high-dimensional linear regression settings, the regression parameters are often assumed to be sparse segment-wise, which enables many existing methods to estimate the parameters locally via ℓ1-regularised maximum likelihood-type estimation and then contrast them for change point detection. Contrary to this common practice, we show that the exact sparsity of neither regression parameters nor their differences, a.k.a.\ differential parameters, is necessary for consistency in multiple change point detection. In fact, both statistically and computationally, better efficiency is attained by a simple strategy that scans for large discrepancies in local covariance between the regressors and the response. We go a step further and propose a suite of tools for directly inferring about the differential parameters post-segmentation, which are applicable even when the regression parameters themselves are non-sparse. Theoretical investigations are conducted under general conditions permitting non-Gaussianity, temporal dependence and ultra-high dimensionality. Numerical results from simulated and macroeconomic datasets demonstrate the competitiveness and efficacy of the proposed methods. Implementation of all methods is provided in the R package \texttt{inferchange} on GitHub.