Null Distribution of the Test Statistic for Model Selection via Marginal Screening: Implications for Multivariate Regression Analysis

A.V. Rubanovich
V.A. Saenko


Article Fingerprint

ResearchID

1YN32



Abstract

Marginal screening (MS) is a computationally simple and commonly used dimension reduction procedure. In MS, a linear model is constructed from several top predictors, chosen according to the absolute value of their marginal correlations with the dependent variable. Importantly, when k predictors are selected out of m primary covariates, standard regression analysis may yield false-positive results if m >> k (Freedman’s paradox). In this work, we provide analytical expressions describing the null distribution of the test statistic for model selection via MS. Using the theory of order statistics, we show that under MS the common F-statistic is distributed as the mean of the k largest of m independent random variables, each having a χ² distribution with 1 degree of freedom. Based on this finding, we estimated critical p-values for multiple regression models after MS; comparing the p-values obtained in real studies with these critical values will help researchers avoid false-positive results. The analytical solutions obtained in this work are implemented in a free Excel spreadsheet program.
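The distributional claim in the abstract can be illustrated numerically. The following is a minimal Monte Carlo sketch, not the authors' analytical solution or their Excel program: it simulates the mean of the k largest of m independent χ² variables with 1 degree of freedom and reads off an upper quantile as an approximate critical value. The function names `simulate_top_k_mean` and `critical_value`, the number of simulations, and the seed are assumptions made for illustration only.

```python
import numpy as np


def simulate_top_k_mean(m, k, n_sim=20000, seed=0):
    """Simulate the mean of the k largest of m independent chi-square(1) variables.

    Hypothetical helper for illustration; returns an array of n_sim simulated values.
    """
    if not 1 <= k <= m:
        raise ValueError("k must satisfy 1 <= k <= m")
    rng = np.random.default_rng(seed)
    # A chi-square(1) variable is the square of a standard normal variable.
    x = rng.standard_normal((n_sim, m)) ** 2
    # After partitioning at index m - k, the last k columns of each row
    # hold the k largest values of that row (in arbitrary order).
    top_k = np.partition(x, m - k, axis=1)[:, m - k:]
    return top_k.mean(axis=1)


def critical_value(m, k, alpha=0.05, n_sim=20000, seed=0):
    """Approximate the upper-alpha critical value of the simulated null statistic."""
    stats = simulate_top_k_mean(m, k, n_sim=n_sim, seed=seed)
    return float(np.quantile(stats, 1.0 - alpha))


if __name__ == "__main__":
    # No screening (m == k) versus screening 5 predictors out of 100.
    print("m=5,   k=5:", critical_value(5, 5))
    print("m=100, k=5:", critical_value(100, 5))
```

With m = k there is no selection, and the statistic reduces to a χ² variable with k degrees of freedom divided by k; as m grows relative to k, the simulated critical value rises, which is the inflation that Freedman’s paradox describes.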

References

24 Cites in Article
  1. Mohammad Ahsanullah, Valery Nevzorov, Mohammad Shakil (2013). An Introduction to Order Statistics.
  2. K Alam, K Wallenius (1979). Distribution of a sum of order statistics.
  3. Barry Arnold, N Balakrishnan, H Nagaraja (2008). A First Course in Order Statistics.
  4. J Cohen (1988). Statistical Power Analysis for the Behavioral Sciences.
  5. J Cohen (1992). A Power Primer.
  6. George Diehr, Donald Hoflin (1974). Approximating the Distribution of the Sample R² in Best Subset Regressions.
  7. J Fan, Q Shao, W Zhou (2017). Are Discoveries Spurious? Distributions of Maximum Spurious Correlations and Their Applications.
  8. Dean Foster, Robert Stine (2006). Honest confidence intervals for the error variance in stepwise regression.
  9. David Freedman (1983). A Note on Screening Regression Equations.
  10. Christopher Genovese, Larry Wasserman (2009). Confidence sets for nonparametric wavelet regression.
  11. C Genovese, J Jin, L Wasserman, Z Yao (2012). Comparison of the lasso and marginal regression.
  12. T Hastie, R Tibshirani (2003). Expression arrays and the p >> n problem.
  13. J Lee, J Taylor (2014). Exact Post Model Selection Inference for Marginal Screening.
  14. J Leek (2016). Everyday Ethics: Top 10 Ethical Considerations in Using Telepractice.
  15. Paul Lukacs, Kenneth Burnham, David Anderson (2010). Model selection bias and Freedman’s paradox.
  16. Haikady Nagaraja (1980). Contributions to the theory of the selection differential and to order statistics.
  17. H Nagaraja (1982). Some Nondegenerate Limit Laws for the Selection Differential.
  18. H Nagaraja (1980). Order Statistics from Independent Exponential Random Variables and the Sum of the Top Order Statistics.
  19. A Rubanovich, N Khromov-Borisov (2016). Genetic risk assessment of the joint effect of several genes: Critical appraisal.
  20. David Salt, Subhash Ajmani, Ray Crichton, David Livingstone (2007). An Improved Approximation to the Estimation of the Critical F Values in Best Subset Regression.
  21. Stephen Stigler (1973). The Asymptotic Distribution of the Trimmed Mean.
  22. R Tibshirani, J Taylor, R Lockhart, R Tibshirani (2016). Exact Post-Selection Inference for Sequential Regression Procedures.
  23. (2016). Statistical Approaches to Gene X Environment Interactions for Complex Phenotypes.
  24. N Wray, J Yang, B Hayes, A L Price, M Goddard, P Visscher (2013). Pitfalls of predicting complex traits from SNPs.

Funding

No external funding was declared for this work.

Conflict of Interest

The authors declare no conflict of interest.

Ethical Approval

No ethics committee approval was required for this article type.

Data Availability

Not applicable for this article.

How to Cite This Article

A.V. Rubanovich. 2021. “Null Distribution of the Test Statistic for Model Selection via Marginal Screening: Implications for Multivariate Regression Analysis”. Global Journal of Science Frontier Research - G: Bio-Tech & Genetics GJSFR-G Volume 21 (GJSFR Volume 21 Issue G1): .

GJSFR Volume 21 Issue G1
Pg. 23-31
Journal Specifications

Crossref Journal DOI 10.17406/GJSFR

Print ISSN 0975-5896

e-ISSN 2249-4626

Keywords
Classification
GJSFR-G Classification: FOR Code: 060499
Version of record

v1.2

Issue date

October 21, 2021

Language
en

