Discriminative Gene Selection Employing Linear Regression Model

α Abid Hasan
σ Shaikh Jeeshan Kabeer
ρ Kamrul Hasan
Ѡ Md. Abdul Mottalib
α Islamic University of Technology

Article Fingerprint

ResearchID

CSTSDE4DFZY

Abstract

Microarray datasets enable the analysis of the expression of thousands of genes across hundreds of samples. Classifiers usually do not perform well with a large number of features (genes), as is the case with microarray datasets. A small number of informative and discriminative features is therefore desirable for efficient classification. Many feature selection approaches have been proposed that attempt sample classification based on the analysis of gene expression values. In this paper, a linear regression based feature selection algorithm for two-class microarray datasets is developed. It divides the training dataset into two subtypes based on the class information. Using one of the classes as the base condition, a linear regression based model is developed. Using this regression model, the divergence of each gene across the two classes is calculated, and genes with higher divergence values are selected as important features from the second subtype of the training data. The classification performance of the proposed approach is evaluated with SVM, Random Forest and AdaBoost classifiers. Results show that the proposed approach provides better accuracy than other existing approaches, namely ReliefF, CFS, a decision tree based attribute selector, and attribute selection using correlation analysis.
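The selection procedure summarized in the abstract can be illustrated with a minimal sketch. The abstract does not state which variables enter the regression or how divergence is measured, so the following choices are assumptions made for illustration only: the per-gene mean expression of the second class is regressed on the per-gene mean of the base class (one line fitted across all genes), and a gene's divergence is taken to be its absolute residual from that line. The function name `select_genes_by_regression_divergence` and its parameters are hypothetical and do not come from the paper.

```python
import numpy as np


def select_genes_by_regression_divergence(X, y, k):
    """Rank genes by their departure from a line fitted on the base class.

    X : array of shape (n_samples, n_genes) with expression values.
    y : binary labels; 0 marks the base class, 1 the second class.
    k : number of genes to keep.

    Assumption (not taken from the paper): divergence is the absolute
    residual of the second-class mean from a least-squares line that
    predicts it from the base-class mean.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    # Split the training data into two subtypes using the class labels.
    base_mean = X[y == 0].mean(axis=0)
    other_mean = X[y == 1].mean(axis=0)

    # Fit other_mean ~ slope * base_mean + intercept across all genes.
    slope, intercept = np.polyfit(base_mean, other_mean, deg=1)

    # Genes far from the fitted line behave differently in the two classes.
    divergence = np.abs(other_mean - (slope * base_mean + intercept))

    # Indices of the k genes with the largest divergence, largest first.
    top = np.argsort(divergence)[::-1][:k]
    return top, divergence
```

The returned indices could then be used to subset the expression matrix before training a classifier such as an SVM. Other readings of the abstract are possible, for example a separate regression per gene, and would give a different divergence score.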

Funding

No external funding was declared for this work.

Conflict of Interest

The authors declare no conflict of interest.

Ethical Approval

No ethics committee approval was required for this article type.

Data Availability

Not applicable for this article.

How to Cite This Article

Abid Hasan. 2013. “Discriminative Gene Selection Employing Linear Regression Model”. Global Journal of Computer Science and Technology - C: Software & Data Engineering GJCST-C Volume 13 (GJCST Volume 13 Issue C4): .

Journal Specifications

Crossref Journal DOI 10.17406/gjcst

Print ISSN 0975-4350

e-ISSN 0975-4172

Version of record

v1.2

Issue date

May 2, 2013

Language
en