Abdolreza, Sayyareh (2007) Test of fit and model selection based on likelihood function. Doctoral thesis in Statistics, ISPED, Unité 875 Biostatistique, Université de Bordeaux 2, AgroParistech, 2007AGPT0020, 205 p.
Abstract
Our work concerns inference about the AIC (a case of penalized likelihood) of Akaike (1973), which, as an estimator of the Kullback-Leibler divergence, is closely related to the maximum likelihood estimator. Within inferential statistics, in the context of hypothesis testing, the Kullback-Leibler divergence and the Neyman-Pearson lemma are two fundamental concepts. Both concern likelihood ratios: the Neyman-Pearson lemma deals with the error rate of the likelihood ratio test, while the Kullback-Leibler divergence is the expectation of the log-likelihood ratio.
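For reference, the two quantities named in the abstract can be written in standard textbook notation. This is only a sketch: the symbols are assumptions that do not appear in the record itself, with $f$ the true density, $g$ a candidate model density, $L(\hat\theta)$ the likelihood at the maximum likelihood estimate, and $p$ the number of estimated parameters.

```latex
\mathrm{KL}(f \,\|\, g) = \mathbb{E}_f\!\left[\log \frac{f(X)}{g(X)}\right],
\qquad
\mathrm{AIC} = -2 \log L(\hat\theta) + 2p .
```

The first expression is the expected log-likelihood ratio mentioned above; the second is the usual penalized-likelihood form of Akaike's criterion.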
| EPrint type: | Thesis (Doctorate) |
|---|---|
| Thesis supervisor: | Bar-hen, Avner |
| Date: | 22 June 2007 |
| Thesis committee: | Biernacki, Christophe; Saracco, Jérôme; Daudin, Jean Jacques; Commenges, Daniel |
| Doctoral school: | ED 435 AGRICULTURE, ALIMENTATION, BIOLOGIE, ENVIRONNEMENTS ET SANTE |
| Discipline: | Statistics |
| Collection: | AgroParistech |
| Institution: | AgroParistech |
| Laboratory: | ISPED, Unité 875 Biostatistique, Université de Bordeaux 2 |
| Subjects: | 1. Mathematics and its applications |
| Free keywords: | Akaike criterion, Confidence interval, Kullback-Leibler, Logistic regression, Model selection, Multiple regression, Variable selection |
| ID code: | 3400 |
| Deposited by: | Nadine Pontal |
| Deposited on: | 12 February 2008 |
Bibliographic References
Akaike, H. (1973) Information theory and an extension of the maximum likelihood principle. Second International Symposium on Information Theory, Akadémiai Kiadó, 267-281.
Atkinson, A.C. (1970) A method for discriminating between models. Journal of the Royal Statistical Society B 32, 323-344.
Berk, R.H. and Jones, D.H. (1979) Goodness-of-fit test statistics that dominate the Kolmogorov statistics. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 47, 47-59.
Biernacki, C. (2004) Testing for a Global Maximum of the Likelihood. Journal of Computational and Graphical Statistics, 14, 3, 657-674.
Bozdogan, H. (2000) Akaike's information criterion and recent developments in information complexity. Journal of Mathematical Psychology, 44, 62-91.
Chernoff, H. and Lehmann, E.L. (1954) The use of maximum likelihood estimates in χ² tests of goodness of fit. Ann. Math. Statist. 25, 579-586.
Cochran, W.G. (1952) The χ² test of goodness of fit. Ann. Math. Statist. 23, 315-345.
Commenges, D., Joly, P., Gegout-Petit, A. and Liquet, B. (2007) Choice between semi-parametric estimators of Markov and non-Markov multi-state models from generally coarsened observations. Scandinavian Journal of Statistics, in press.
Cox, D.R. (1961) Tests of separate families of hypotheses. Proceedings of the 4th Berkeley Symposium, Vol. 1 (University of California Press, Berkeley), 105-123.
Cox, D.R. (1962) Further results on tests of separate families of hypotheses. Journal of the Royal Statistical Society B 24, 406-424.
Dastoor, N.K. (1983) Some aspects of testing non-nested hypotheses. Journal of Econometrics 21, 213-228.
Davidson, R. and MacKinnon, J.G. (1981) Several tests for model specification in the presence of alternative hypotheses. Econometrica 49, 781-793.
Fisher, G.R. and McAleer, M. (1981) Alternative procedures and associated tests of significance for non-nested hypotheses. Journal of Econometrics, 16, 103-119.
Hurvich, C.M. and Tsai, C.L. (1989) Regression and time series model selection criterion. Biometrika 76, 297-307.
Ishiguro, M., Sakamoto, Y. and Kitagawa, G. (1997) Bootstrapping log likelihood and EIC, an extension of AIC. Annals of the Institute of Statistical Mathematics, 49, 411-434.
Jager, L. and Wellner, J.A. (2005) A new goodness of fit test: the reversed Berk-Jones statistic. http://bayes.stat.washington.edu/www/research/reports/2004/tr443.pdf.
Knight, K. (1999) Mathematical Statistics. Chapman and Hall.
Konishi, S. and Kitagawa, G. (1996) Generalized Information Criteria in Model Selection. Biometrika 83, 4, 575-590.
Lehmann, E.L. (1998) Elements of Large-Sample Theory. Springer-Verlag, New York.
Lehmann, E.L. (1986) Testing Statistical Hypotheses. Wiley, New York.
Linhart, H. and Zucchini, W. (1986) Model Selection. Wiley, New York.
Mallows, C.L. (1973) Some comments on Cp. Technometrics, 15, 661-675.
McCullagh, P. and Nelder, J.A. (1989) Generalized Linear Models. Chapman and Hall.
Myung, I.J. (2000) The importance of complexity in model selection. Journal of Mathematical Psychology 44, 190-204.
Pesaran, M.H. (1974) On the general problem of model selection. Review of Economic Studies 41, 153-171.
Pesaran, M.H. and Deaton, A.S. (1978) Testing non-nested nonlinear regression models. Econometrica 46, 667-694.
Shapiro, S.S. and Wilk, M.B. (1965) An analysis of variance test for normality. Biometrika 52, 591-611.
Shapiro, S.S., Wilk, M.B. and Chen, H.J. (1968) A comparative study of various tests for normality. J. Amer. Statist. Ass. 63, 1343-1372.
Schwarz, G. (1978) Estimating the dimension of a model. Annals of Statistics, 6, 461-464.
Shimodaira, H. (1998) An application of multiple comparison techniques to model selection. Annals of the Institute of Statistical Mathematics 50, No. 1, 1-13.
Shimodaira, H. (2001) Multiple comparisons of log-likelihoods and combining non-nested models with application to phylogenetic tree selection. Communications in Statistics 30, 1751-1772.
Stephens, M.A. (1986) editors. Goodness-of-Fit Techniques. Marcel Dekker, New York.
Van der Vaart, A.W. (1998) Asymptotic Statistics. Cambridge University Press.
Vuong, Quang H. (1989) Likelihood ratio tests for model selection and non-nested hypotheses. Econometrica, 57, No. 2, 307-333.
White, H. (1982) Maximum Likelihood Estimation of Misspecified Models. Econometrica, 50(1), 1-26.
White, H. (1994) Estimation, Inference and Specification Analysis. Cambridge University Press.
Weisberg, S. and Bingham, C. (1975) An approximate analysis of variance test for non-normality suitable for machine calculation. Technometrics 17, 133-134.
Yanagihara, H. and Ohmoto, C. (2005) On distribution of AIC in linear regression models. Journal of Statistical Planning and Inference 133, 417-433.
Table of Contents
1 Introduction 19
1.1 Our Objective - 27
1.2 Plan of Thesis - 29
2 Reminders about models and some asymptotic results 30
2.1 Models - 30
2.2 Model Selection - 32
2.3 Goal of Model Selection and its means - 34
2.4 Nested and Non-Nested Models - 35
2.5 Probability Metrics - 36
2.6 Akaike framework and his Theorem - 37
2.7 Complexity in model selection - 38
2.8 Asymptotic theory - 42
2.9 Goodness of Fit Test and Classical Hypothesis Testing - 43
2.10 Reminder on Theorems and Lemmas - 45
3 Reminder on Goodness of Fit Tests 47
3.1 Testing fit to a fixed distribution - 47
3.1.1 Basic Goodness of Fit Test - 48
3.1.2 Tests on the basis of Functional Distance - 49
3.2 Adaptation of tests coming from the fixed-distribution - 51
3.3 Tests on the basis of Correlation and Regression - 52
3.4 Tests on the basis of Likelihood Functions - 54
3.4.1 Berk-Jones’s statistics - 54
3.4.2 Generalized Linear Models (GLMs) and Deviance - 55
4 Motivation to Model Selection Tests 61
4.1 Introduction - 61
4.2 Assumptions - 63
4.3 Likelihood Function and Maximum Likelihood Estimator - 65
4.3.1 Correctly Specified and Mis-Specified models - 68
4.4 Metrics on spaces of probability - 69
4.4.1 Kullback-Leibler Discrepancy (divergence) - 73
4.5 Consistency of Maximum Likelihood Estimator - 76
4.6 Akaike Information Criterion (AIC) - 77
4.7 Distribution of Maximum Likelihood Estimator - 79
5 Proposed test for Goodness of Fit Test: A test based on empirical likelihood ratio 82
5.1 Introduction - 82
5.2 Our objective - 84
5.3 Union-Intersection Test - 85
5.4 Proposed test based on empirical likelihood ratio - 86
5.4.1 Level of test - 89
5.4.2 Comparison with Berk-Jones’s test - 89
5.4.3 Bahadur efficiency of proposed test - 89
6 Proposed Model selection tests based on likelihood and AIC 92
6.1 Introduction - 92
6.2 Known parameters case - 98
6.3 Unknown parameters case - 106
6.4 Test function - 110
6.5 Variance estimation - 111
6.6 Distribution of Tn under H1 and Power of Test - 123
6.6.1 Distribution of Test Statistic Tn under H1 - 123
6.6.2 Power of Test - 127
6.7 Consistency of Test - 130
6.7.1 Power computation - 131
6.7.2 Invariance - 132
7 Test For Model Selection based on difference of AIC’s: application to tracking interval for DEKL 136
7.1 Introduction - 136
7.2 Objective - 138
7.3 Non-Nested Models comparison - 140
7.3.1 Motivation to Confidence Interval construction - 142
7.3.2 Confidence Interval for DEKL - 144
7.4 Logistic Regression - 149
8 Conclusion and perspective 156
9 Bibliography 161
10 Appendix 165
I APPENDIX A 166
10.1 Introduction - 1
10.2 Expected Kullback-Leibler Criteria and AIC - 5
10.3 Hypothesis Testing - 7
10.4 Simulation - 10
10.4.1 Exploration of our result - 10
10.4.2 Application to The Multiple Regression Model - 11
II APPENDIX B 20
10.5 Introduction - 22
10.6 Theory about inference of differences of AIC criteria - 24
10.6.1 Estimating a difference of Kullback-Leibler divergences - 24
10.6.2 Tracking interval for a difference of Kullback-Leibler divergences - 27
10.6.3 Extension to regression models - 29
10.7 Application to logistic regression: a simulation study - 30
10.8 Choice of the best coding of age in a study of depression - 32
10.8.1 The Paquid study - 32
10.9 Discussion - 34