Detection of single influential points in OLS regression model building

Date
2001
Abstract
Identifying outliers and high-leverage points is a fundamental step in building a least-squares regression model. Various influence measures, based on different motivational arguments and designed to measure the influence of observations on different aspects of the regression results, are elucidated and critiqued here. Diagnostic plots for indicating influential points are constructed from a statistical analysis of the residuals (classical, normalized, standardized, jackknife, predicted and recursive) and of the diagonal elements of the projection matrix. Regression diagnostics do not require knowledge of an alternative hypothesis for testing, or the fulfillment of the other assumptions of classical statistical tests. In the interactive, PC-assisted diagnosis of data, models and estimation methods, the examination of data quality involves the detection of influential points (outliers and high-leverage points), which cause many problems in regression analysis. This paper provides a basic survey of the influence statistics of single cases, together with an exploratory analysis of all variables. The graphical aids for identifying outliers and high-leverage points are combined with graphs that identify the type of influence on the basis of the likelihood distance. All these graphically oriented techniques are suitable for the rapid estimation of influential points, but are generally incapable of handling masking and swamping. A powerful procedure for computing the characteristics of influential points has been written in Matlab 5.3 and is available from the authors. © 2001 Elsevier Science B.V.
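To make the quantities named in the abstract concrete, the following is a minimal sketch of several of them for the simplest case, a straight line fitted by ordinary least squares. It is not the authors' Matlab procedure. The function name `influence_diagnostics`, the data, and the cut-off 2p/n for leverages are illustrative assumptions, and the residual definitions follow common textbook usage, which may differ in detail from the paper's notation.

```python
import math


def influence_diagnostics(x, y):
    """Single-case diagnostics for the OLS line y = b0 + b1*x.

    Hypothetical helper for illustration only; restricted to one predictor
    so that the projection-matrix diagonal has a closed form.
    """
    n = len(x)
    p = 2  # estimated parameters: intercept and slope
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean

    # Classical residuals: observed minus fitted values.
    classical = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    # Diagonal elements h_ii of the projection (hat) matrix.
    leverage = [1.0 / n + (xi - x_mean) ** 2 / sxx for xi in x]
    # Residual variance estimate s^2 with n - p degrees of freedom.
    s2 = sum(e * e for e in classical) / (n - p)

    # Normalized residuals: e_i / s.
    normalized = [e / math.sqrt(s2) for e in classical]
    # Standardized residuals: e_i / (s * sqrt(1 - h_ii)).
    standardized = [e / math.sqrt(s2 * (1.0 - h))
                    for e, h in zip(classical, leverage)]
    # Jackknife (externally studentized) residuals, computed from the
    # standardized ones without refitting the model n times.
    jackknife = [r * math.sqrt((n - p - 1) / (n - p - r * r))
                 for r in standardized]
    # Predicted residuals: e_i / (1 - h_ii).
    predicted = [e / (1.0 - h) for e, h in zip(classical, leverage)]

    return {
        "classical": classical,
        "leverage": leverage,
        "normalized": normalized,
        "standardized": standardized,
        "jackknife": jackknife,
        "predicted": predicted,
    }


# Made-up data: the point at x = 20 lies far from the other x values
# (high leverage); the response at x = 5 is far off the trend (outlier).
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]
y = [3.1, 4.9, 7.2, 8.8, 20.0, 13.1, 14.9, 17.0, 19.1, 41.0]
d = influence_diagnostics(x, y)

# Rule-of-thumb cut-off 2p/n for leverages (an assumption, not from the source).
cutoff = 2 * 2 / len(x)
high_leverage = [i for i, h in enumerate(d["leverage"]) if h > cutoff]
print("high-leverage indices:", high_leverage)
print("largest |jackknife residual| at index:",
      max(range(len(x)), key=lambda i: abs(d["jackknife"][i])))
```

In this made-up example the two kinds of influential point are distinct: the point with extreme x is flagged by its leverage while its residual stays small, and the point with the aberrant response is flagged by its jackknife residual while its leverage is unremarkable. Recursive residuals and the likelihood distance are not covered by this sketch.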
Subject(s)
Diagnostic plot, High-leverages, Influence measures, Influential observations, Outliers, Regression diagnostics
ISSN
0003-2670