
Test of fit and model selection based on likelihood function.


Abdolreza, Sayyareh (2007) Test of fit and model selection based on likelihood function. Doctorate in Statistics, ISPED, Unité 875 Biostatistique, Université de Bordeaux 2, AgroParistech 2007AGPT0020, p. 205.

Full text available as:

- These-Reza-avril2007Ber.pdf (1880 Kb)
License: Copyright

Abstract

Our work concerns inference about the AIC of Akaike (1973), a case of penalized likelihood, which, as an estimator of the Kullback-Leibler divergence, is intimately related to the maximum likelihood estimator. Within inferential statistics, in the context of hypothesis testing, the Kullback-Leibler divergence and the Neyman-Pearson lemma are two fundamental concepts. Both concern likelihood ratios: the Neyman-Pearson lemma deals with the error rate of the likelihood ratio test, and the Kullback-Leibler divergence is the expectation of the log-likelihood ratio.
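The two quantities named in the abstract can be illustrated with a minimal sketch. It is not taken from the thesis: the function names, the choice of two univariate normal densities, and the sample size are all assumptions made for illustration. It estimates the Kullback-Leibler divergence as an average log-likelihood ratio under the true density, compares that with the known closed form for normals, and shows the usual AIC formula.

```python
import math
import random


def normal_logpdf(x, mu, sigma):
    """Log-density of N(mu, sigma^2) at x."""
    return -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)


def kl_normal_closed_form(mu_f, sigma_f, mu_g, sigma_g):
    """Closed-form KL(f || g) for two univariate normal densities."""
    return (
        math.log(sigma_g / sigma_f)
        + (sigma_f**2 + (mu_f - mu_g) ** 2) / (2 * sigma_g**2)
        - 0.5
    )


def kl_monte_carlo(mu_f, sigma_f, mu_g, sigma_g, n=100_000, seed=0):
    """Estimate KL(f || g) as the mean log-likelihood ratio over draws from f."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(mu_f, sigma_f)
        total += normal_logpdf(x, mu_f, sigma_f) - normal_logpdf(x, mu_g, sigma_g)
    return total / n


def aic(max_log_likelihood, n_params):
    """AIC = -2 * maximized log-likelihood + 2 * number of parameters."""
    return -2.0 * max_log_likelihood + 2.0 * n_params
```

With the hypothetical pair f = N(0, 1) and g = N(1, 4), the Monte Carlo average should land close to the closed-form value, which is the sense in which the divergence is an expected log-likelihood ratio.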

EPrint type: Thesis (Doctorate)
Thesis supervisor: Bar-hen, Avner
Date: 22 June 2007
Thesis committee: Biernacki, Christophe; Saracco, Jérôme; Daudin, Jean Jacques; Commenges, Daniel
Doctoral school: ED 435 AGRICULTURE, ALIMENTATION, BIOLOGIE, ENVIRONNEMENTS ET SANTE
Discipline: Statistics
Collection: AgroParistech
Institution: AgroParistech
Laboratory: ISPED, Unité 875 Biostatistique, Université de Bordeaux 2
Subjects: 1. Mathematics and its applications
Free keywords: Akaike criterion, Confidence interval, Kullback-Leibler, Logistic regression, Model selection, Multiple regression, Variable selection
Code ID: 3400
Deposited by: Nadine Pontal
Deposited on: 12 February 2008

Bibliographic References

Akaike, H. (1973) Information theory and an extension of the maximum likelihood principle. Second International Symposium on Information Theory, Akadémiai Kiadó, 267-281.

Atkinson, A.C. (1970) A method for discriminating between models. Journal of the Royal Statistical Society B, 32, 323-344.

Berk, R.H. and Jones, D.H. (1979) Goodness-of-fit test statistics that dominate the Kolmogorov statistics. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 47, 47-59.

Biernacki, C. (2004) Testing for a Global Maximum of the Likelihood. Journal of Computational and Graphical Statistics, 14, 3, 657-674.

Bozdogan, H. (2000) Akaike's information criterion and recent developments in information complexity. Journal of Mathematical Psychology, 44, 62-91.

Chernoff, H. and Lehmann, E.L. (1954) The use of maximum likelihood estimates in χ² tests of goodness of fit. Ann. Math. Statist. 25, 579-586.

Cochran, W.G. (1952) The χ² test of goodness of fit. Ann. Math. Statist. 23, 315-345.

Commenges, D., Joly, P., Gegout-Petit, A. and Liquet, B. (2007) Choice between semi-parametric estimators of Markov and non-Markov multi-state models from generally coarsened observations. Scandinavian Journal of Statistics, in press.

Cox, D.R. (1961) Tests of separate families of hypotheses. Proceedings of the 4th Berkeley Symposium, Vol. 1 (University of California Press, Berkeley), 105-123.

Cox, D.R. (1962) Further results on tests of separate families of hypotheses. Journal of the Royal Statistical Society B, 24, 406-424.

Dastoor, N.K. (1983) Some aspects of testing non-nested hypotheses. Journal of Econometrics, 21, 213-228.

Davidson, R. and MacKinnon, J.G. (1981) Several tests for model specification in the presence of alternative hypotheses. Econometrica, 49, 781-793.

Fisher, G.R. and McAleer, M. (1981) Alternative procedures and associated tests of significance for non-nested hypotheses. Journal of Econometrics, 16, 103-119.

Hurvich, C.M. and Tsai, C.L. (1989) Regression and time series model selection criterion. Biometrika, 76, 297-307.

Ishiguro, M., Sakamoto, Y. and Kitagawa, G. (1997) Bootstrapping log likelihood and EIC, an extension of AIC. Annals of the Institute of Statistical Mathematics, 49, 411-434.

Jager, L. and Wellner, J.A. (2005) A new goodness of fit test: the reversed Berk-Jones statistic. http://bayes.stat.washington.edu/www/research/reports/2004/tr443.pdf.

Knight, K. (1999) Mathematical Statistics. Chapman and Hall.

Konishi, S. and Kitagawa, G. (1996) Generalized Information Criteria in Model Selection. Biometrika, 83, 4, 575-590.

Lehmann, E.L. (1998) Elements of Large-Sample Theory. Springer-Verlag, New York.

Lehmann, E.L. (1986) Testing Statistical Hypotheses. Wiley, New York.

Linhart, H. and Zucchini, W. (1986) Model Selection. Wiley, New York.

Mallows, C.L. (1973) Some comments on Cp. Technometrics, 15, 661-675.

McCullagh, P. and Nelder, J.A. (1989) Generalized Linear Models. Chapman and Hall.

Myung, I.J. (2000) The importance of complexity in model selection. Journal of Mathematical Psychology, 44, 190-204.

Pesaran, M.H. (1974) On the general problem of model selection. Review of Economic Studies, 41, 153-171.

Pesaran, M.H. and Deaton, A.S. (1978) Testing non-nested nonlinear regression models. Econometrica, 46, 667-694.

Shapiro, S.S. and Wilk, M.B. (1965) An analysis of variance test for normality. Biometrika, 52, 591-611.

Shapiro, S.S., Wilk, M.B. and Chen, H.J. (1968) A comparative study of various tests for normality. J. Amer. Statist. Ass. 63, 1343-1372.

Schwarz, G. (1978) Estimating the dimension of a model. Annals of Statistics, 6, 461-464.

Shimodaira, H. (1998) An application of multiple comparison techniques to model selection. Annals of the Institute of Statistical Mathematics, 50, No. 1, 1-13.

Shimodaira, H. (2001) Multiple comparisons of log-likelihoods and combining non-nested models with application to phylogenetic tree selection. Communications in Statistics, 30, 1751-1772.

Stephens, M.A. (1986) editors. Goodness-of-Fit Techniques. Marcel Dekker, New York.

Van der Vaart, A.W. (1998) Asymptotic Statistics. Cambridge University Press.

Vuong, Quang H. (1989) Likelihood ratio tests for model selection and non-nested hypotheses. Econometrica, 57, No. 2, 307-333.

White, H. (1982) Maximum Likelihood Estimation of Misspecified Models. Econometrica, 50(1), 1-26.

White, H. (1994) Estimation, Inference and Specification Analysis. Cambridge University Press.

Weisberg, S. and Bingham, C. (1975) An approximate analysis of variance test for non-normality suitable for machine calculation. Technometrics, 17, 133-134.

Yanagihara, H. and Ohmoto, C. (2005) On distribution of AIC in linear regression models. Journal of Statistical Planning and Inference, 133, 417-433.

Table of Contents

1 Introduction
1.1 Our Objective
1.2 Plan of Thesis

2 Reminders about models and some asymptotic results
2.1 Models
2.2 Model Selection
2.3 Goal of Model Selection and its means
2.4 Nested and Non-Nested Models
2.5 Probability Metrics
2.6 Akaike framework and his Theorem
2.7 Complexity in model selection
2.8 Asymptotic theory
2.9 Goodness of Fit Test and Classical Hypothesis Testing
2.10 Reminder on Theorems and Lemmas

3 Reminder on Goodness of Fit Tests
3.1 Testing fit to a fixed distribution
3.1.1 Basic Goodness of Fit Test
3.1.2 Tests on the basis of Functional Distance
3.2 Adaptation of tests coming from the fixed-distribution
3.3 Tests on the basis of Correlation and Regression
3.4 Tests on the basis of Likelihood Functions
3.4.1 Berk-Jones's statistics
3.4.2 Generalized Linear Models (GLMs) and Deviance

4 Motivation to Model Selection Tests
4.1 Introduction
4.2 Assumptions
4.3 Likelihood Function and Maximum Likelihood Estimator
4.3.1 Correctly Specified and Mis-Specified models
4.4 Metrics on spaces of probability
4.4.1 Kullback-Leibler Discrepancy (divergence)
4.5 Consistency of Maximum Likelihood Estimator
4.6 Akaike Information Criterion (AIC)
4.7 Distribution of Maximum Likelihood Estimator

5 Proposed test for Goodness of Fit Test: A test based on empirical likelihood ratio
5.1 Introduction
5.2 Our objective
5.3 Union-Intersection Test
5.4 Proposed test based on empirical likelihood ratio
5.4.1 Level of test
5.4.2 Comparison with Berk-Jones's test
5.4.3 Bahadur efficiency of proposed test

6 Proposed Model selection tests based on likelihood and AIC
6.1 Introduction
6.2 Known parameters case
6.3 Unknown parameters case
6.4 Test function
6.5 Variance estimation
6.6 Distribution of Tn under H1 and Power of Test
6.6.1 Distribution of Test Statistic Tn under H1
6.6.2 Power of Test
6.7 Consistency of Test
6.7.1 Power computation
6.7.2 Invariance

7 Test For Model Selection based on difference of AIC's: application to tracking interval for DEKL
7.1 Introduction
7.2 Objective
7.3 Non-Nested Models comparison
7.3.1 Motivation to Confidence Interval construction
7.3.2 Confidence Interval for DEKL
7.4 Logistic Regression

8 Conclusion and perspective

9 Bibliography

10 Appendix

I APPENDIX A
10.1 Introduction
10.2 Expected Kullback-Leibler Criteria and AIC
10.3 Hypothesis Testing
10.4 Simulation
10.4.1 Exploration of our result
10.4.2 Application to The Multiple Regression Model

II APPENDIX B
10.5 Introduction
10.6 Theory about inference of differences of AIC criteria
10.6.1 Estimating a difference of Kullback-Leibler divergences
10.6.2 Tracking interval for a difference of Kullback-Leibler divergences
10.6.3 Extension to regression models
10.7 Application to logistic regression: a simulation study
10.8 Choice of the best coding of age in a study of depression
10.8.1 The Paquid study
10.9 Discussion
