AGR Stats Colloquium: Confidence sets for variable selection

Host Institution:

La Trobe University

Title of Seminar:

Confidence sets for variable selection

Speaker's Name:

Dr Davide Ferrari

Speaker's Institution:

The University of Melbourne

Time and Date:

Friday 13 September, 1.00pm (AEST)

Seminar Abstract:

We introduce the notion of variable selection confidence set (VSCS) for linear regression based on F-testing. The VSCS extends the usual notion of confidence intervals to the variable selection problem: a VSCS is a set of regression models that contains the true model with a given level of confidence. For noisy data, distinguishing among competing models is usually very difficult and the VSCS will contain many models; when the data are highly informative, the VSCS will contain a much smaller number of useful models. We advocate special attention to the set of lower boundary models (LBMs), which are the most parsimonious models that are not statistically significantly inferior to the full model at a given confidence level. Based on the LBMs, variable importance and measures of co-appearance importance of predictors can be naturally defined.

To date, the emphasis has been almost exclusively on selecting a single model or two. When there are many predictors, especially when their number is comparable to (or even larger than) the sample size, the hope of identifying the true or the unique best model is often unrealistic. Consequently, a better approach is to select a relatively small set of models that can all explain the data adequately at the given confidence level. This strategy identifies the most important variables in a principled way that goes beyond simply trusting the single lucky winner based on a model selection criterion.
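
As a rough illustration of the idea in the abstract (a sketch under stated assumptions, not the speaker's implementation), the Python snippet below builds a VSCS by exhaustive search: each candidate submodel is compared with the full model by a standard partial F-test, and it is kept when the test does not reject at level alpha. The simulated data, the function names (rss, vscs, lower_boundary_models) and the reading of an LBM as "a model in the VSCS with no proper submodel in the VSCS" are all assumptions made for this example. It uses NumPy and SciPy and is only practical for a small number of predictors.

```python
from itertools import combinations

import numpy as np
from scipy import stats


def rss(X, y, cols):
    """Residual sum of squares of the OLS fit on an intercept plus columns `cols`."""
    design = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return float(resid @ resid)


def vscs(X, y, alpha=0.05):
    """Return every predictor subset not rejected by an F-test against the full model."""
    n, p = X.shape
    full = tuple(range(p))
    rss_full = rss(X, y, full)
    df_full = n - p - 1  # residual degrees of freedom of the full model
    kept = [full]  # the full model belongs to the set by construction
    for k in range(p):
        for cols in combinations(range(p), k):
            q = p - k  # number of predictors dropped from the full model
            f_stat = ((rss(X, y, cols) - rss_full) / q) / (rss_full / df_full)
            if stats.f.sf(f_stat, q, df_full) > alpha:
                kept.append(cols)
    return kept


def lower_boundary_models(models):
    """Models in the set that have no proper submodel in the set (assumed LBM reading)."""
    sets = [set(m) for m in models]
    return [m for m, s in zip(models, sets) if not any(t < s for t in sets)]


# Hypothetical simulated data: only predictors 0 and 2 affect the response.
rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(size=n)

confidence_set = vscs(X, y, alpha=0.05)
lbms = lower_boundary_models(confidence_set)
print(len(confidence_set), "models in the VSCS")
print("Lower boundary models:", lbms)
```

With a strong signal like this one, the set stays small and every retained model includes the two active predictors; increasing the noise variance in the simulation would let more models pass the F-test and enlarge the set.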
