Using clustering and robust estimators to detect outliers in multivariate data

Date

2005

Language
English

Abstract

Outlier identification is important in many applications of multivariate analysis, either because there is a specific interest in finding anomalous observations or as a pre-processing task before applying some multivariate method, in order to protect the results from the possibly harmful effects of those observations. It is also of great interest in discriminant analysis if, when predicting group membership, one wants the option of labelling an observation as "does not belong to any of the available groups". The identification of outliers in multivariate data is usually based on the Mahalanobis distance. The use of robust estimates of the mean and the covariance matrix is advised in order to avoid the masking effect (Rousseeuw and van Zomeren, 1990; Rocke and Woodruff, 1996; Becker and Gather, 1999). However, the performance of these rules is still highly dependent on multivariate normality of the bulk of the data. The aim of the method described here is to remove this dependency. The first version of this method appeared in Santos-Pereira and Pires (2002). In this talk we discuss some refinements and also the relation to a recently proposed similar method (Hardin and Rocke, 2004).
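The conventional rule mentioned in the abstract (flag observations whose Mahalanobis distance exceeds a chi-square cutoff) and the masking effect can be illustrated with a minimal sketch. This is not the authors' clustering-based method, which the abstract does not specify. The simulated data, the function names, the 0.975 cutoff level, and the crude robust estimator (a coordinatewise median/MAD start followed by a few concentration steps, with no consistency correction) are all assumptions chosen for illustration.

```python
import numpy as np


def mahalanobis_sq(X, center, cov):
    """Squared Mahalanobis distance of each row of X from `center` under `cov`."""
    diff = X - center
    return np.sum(diff * np.linalg.solve(cov, diff.T).T, axis=1)


def concentration_estimate(X, h, n_iter=20):
    """Crude robust location/scatter (illustrative assumption, not a full MCD).

    Starts from the coordinatewise median and MAD, then repeatedly refits the
    mean and covariance on the h observations with the smallest distances.
    """
    center = np.median(X, axis=0)
    mad = 1.4826 * np.median(np.abs(X - center), axis=0)
    cov = np.diag(mad ** 2)
    for _ in range(n_iter):
        d2 = mahalanobis_sq(X, center, cov)
        keep = np.argsort(d2)[:h]
        center = X[keep].mean(axis=0)
        cov = np.cov(X[keep], rowvar=False)
    return center, cov


# Simulated bivariate data: 160 standard normal points plus a tight cluster
# of 40 planted outliers near (8, 8).
rng = np.random.default_rng(0)
inliers = rng.standard_normal((160, 2))
planted = 8.0 + 0.5 * rng.standard_normal((40, 2))
X = np.vstack([inliers, planted])
is_planted = np.arange(len(X)) >= len(inliers)

# For 2 degrees of freedom the chi-square quantile has the closed form
# -2 * ln(1 - q); for other dimensions a library routine would be needed.
cutoff = -2.0 * np.log(1.0 - 0.975)

# Classical rule: sample mean and covariance, both distorted by the cluster.
classical_d2 = mahalanobis_sq(X, X.mean(axis=0), np.cov(X, rowvar=False))
classical_flags = classical_d2 > cutoff

# Robust rule: estimates fitted on the most central 75% of the observations.
# Without a consistency correction the scatter is somewhat too small, so a
# few regular points may be flagged as well.
center, cov = concentration_estimate(X, h=int(0.75 * len(X)))
robust_flags = mahalanobis_sq(X, center, cov) > cutoff

print("planted outliers flagged (classical):", int(classical_flags[is_planted].sum()))
print("planted outliers flagged (robust):   ", int(robust_flags[is_planted].sum()))
```

In this setup the outlying cluster inflates the classical covariance along the direction joining the two groups, so the planted points tend to fall below the cutoff (masking), whereas the robust distances separate them clearly. Both rules still rely on the chi-square cutoff, that is, on approximate normality of the bulk of the data, which is the dependency the abstract aims to remove.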

Keywords

Outliers, Clustering, Discriminant analysis

Document Type

conferenceObject

Citation

Pires, A. and Santos-Pereira, C. (2005). Using clustering and robust estimators to detect outliers in multivariate data. Proceedings of the International Conference on Robust Statistics ICORS 2005. Available at Repositório UPT, http://hdl.handle.net/11328/2345

Access Type

Open Access
