Using clustering and robust estimators to detect outliers in multivariate data
Date
2005
Language
English
Abstract
Introduction
Outlier identification is important in many applications of multivariate analysis, either because
there is some specific interest in finding anomalous observations or as a pre-processing task before
the application of some multivariate method, in order to protect the results from the possibly harmful
effects of those observations. It is also of great interest in discriminant analysis if, when predicting
group membership, one wants to have the possibility of labelling an observation as "does not belong
to any of the available groups". The identification of outliers in multivariate data is usually based
on the Mahalanobis distance. The use of robust estimates of the mean and the covariance matrix
is advised in order to avoid the masking effect (Rousseeuw and van Zomeren, 1990; Rocke and
Woodruff, 1996; Becker and Gather, 1999). However, the performance of these rules is still highly
dependent on multivariate normality of the bulk of the data. The aim of the method described here is
to remove this dependency. The first version of this method appeared in Santos-Pereira and Pires
(2002). In this talk we discuss some refinements and also the relation with a recently proposed
similar method (Hardin and Rocke, 2004).
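The usual rule mentioned in the abstract (flag an observation when its squared Mahalanobis distance from a robust centre exceeds a chi-square cutoff) can be illustrated with a minimal sketch. This is not the method presented in the talk, nor the estimators of the cited papers: the trimming scheme below is a simplified stand-in for a robust estimator, it handles two-dimensional data only, and it applies no consistency correction to the trimmed covariance. The function names, `keep_fraction`, `n_steps`, and the simulated data are all assumptions made for illustration.

```python
import math
import random
import statistics


def mean_cov_2d(points):
    """Sample mean and covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def sq_mahalanobis(p, center, cov):
    """Squared Mahalanobis distance, using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = p[0] - center[0], p[1] - center[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def robust_center_cov(points, keep_fraction=0.75, n_steps=10):
    """Simplified trimmed estimate (hypothetical helper, not MCD or MVE).

    Start from the points closest to the coordinate-wise median, then
    repeatedly re-estimate mean and covariance from the keep_fraction of
    points with the smallest Mahalanobis distances.
    """
    h = max(3, int(keep_fraction * len(points)))
    med = (statistics.median(p[0] for p in points),
           statistics.median(p[1] for p in points))
    subset = sorted(
        points, key=lambda p: (p[0] - med[0]) ** 2 + (p[1] - med[1]) ** 2
    )[:h]
    for _ in range(n_steps):
        center, cov = mean_cov_2d(subset)
        subset = sorted(
            points, key=lambda p: sq_mahalanobis(p, center, cov)
        )[:h]
    return mean_cov_2d(subset)


def chi2_cutoff_2d(alpha):
    """(1 - alpha) quantile of the chi-square distribution with 2 degrees
    of freedom, which has the closed form -2 * ln(alpha)."""
    return -2.0 * math.log(alpha)


def flag_outliers(points, alpha=0.025):
    """Indices of points whose robust squared distance exceeds the cutoff."""
    center, cov = robust_center_cov(points)
    cutoff = chi2_cutoff_2d(alpha)
    return [i for i, p in enumerate(points)
            if sq_mahalanobis(p, center, cov) > cutoff]


# Simulated data (an assumption for illustration): 100 standard normal
# points followed by 10 planted outliers near (8, 8).
rng = random.Random(0)
data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100)]
data += [(8 + rng.gauss(0, 0.2), 8 + rng.gauss(0, 0.2)) for _ in range(10)]
flagged = flag_outliers(data)
```

Because the cutoff assumes that the bulk of the data is normal, and the trimmed covariance is not rescaled, this rule also flags some regular observations. The dependence on normality is the limitation the abstract sets out to remove.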
Keywords
Outliers, Clustering, Discriminant analysis
Document type
conferenceObject
Citation
Pires, A. and Santos-Pereira, C. (2005). Using clustering and robust estimators to detect outliers in multivariate data. Proceedings of the international conference on robust statistics ICORS 2005. Available at Repositório UPT, http://hdl.handle.net/11328/2345
Access type
Open access