An under-Sampled Approach for Handling Skewed Data Distribution using Cluster Disjuncts

Article ID

CSTSDEGP9YL

Syed Ziaur Rahman, Andhra University
Dr. G Samuel Vara Prasad Raju
Dr. Ali Mirza Mahmood
Abstract

In data mining and knowledge discovery, hidden and valuable knowledge is discovered from data sources. Traditional knowledge discovery algorithms are bottlenecked by the wide range of available data sources. Class imbalance is one of the problems that arises when data sources provide unequal class distributions, i.e., examples of one class in a training data set vastly outnumber examples of the other class(es). Researchers have rigorously studied several techniques to alleviate the problem of class imbalance, including resampling algorithms and feature selection approaches. In this paper, we present a new hybrid framework, dubbed Majority Under-sampling based on Cluster Disjunct (MAJOR_CD), for learning from skewed training data. This algorithm provides a simpler and faster alternative by using the cluster disjunct concept. We conduct experiments on twelve UCI data sets from various application domains, using five algorithms for comparison on six evaluation metrics. The empirical study suggests that MAJOR_CD is effective in addressing the class imbalance problem.
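
The abstract does not spell out the steps of MAJOR_CD, so the following is only a minimal sketch of the general idea it names: under-sampling the majority class cluster by cluster so that each sub-group of the majority class stays represented. It is a generic illustration, not the authors' algorithm. The function names `kmeans` and `undersample_majority`, the use of plain k-means, the proportional quota per cluster, and the parameters `k` and `seed` are all assumptions made for this example.

```python
import random
from collections import Counter


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on lists of floats; returns a cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Recompute each center as the mean of its members; keep the old center if empty.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def undersample_majority(X, y, k=3, seed=0):
    """Shrink the majority class to roughly the smallest class size, cluster by cluster.

    Assumption for illustration: clusters of the majority class stand in for its
    sub-concepts, and each non-empty cluster keeps at least one example.
    """
    counts = Counter(y)
    majority = max(counts, key=counts.get)
    n_target = min(counts.values())
    maj = [x for x, label in zip(X, y) if label == majority]
    rest = [(x, label) for x, label in zip(X, y) if label != majority]
    labels = kmeans(maj, k, seed=seed)
    rng = random.Random(seed)
    kept = []
    for c in range(k):
        members = [x for x, l in zip(maj, labels) if l == c]
        if not members:
            continue
        # Quota proportional to cluster size, with a floor of one example.
        quota = max(1, round(n_target * len(members) / len(maj)))
        kept.extend(rng.sample(members, min(quota, len(members))))
    X_new = [x for x, _ in rest] + kept
    y_new = [label for _, label in rest] + [majority] * len(kept)
    return X_new, y_new
```

Because the per-cluster quotas are rounded, the reduced majority class ends up close to, but not always exactly at, the size of the smallest class. All other classes are returned unchanged.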


Syed Ziaur Rahman. 2014. “. Global Journal of Computer Science and Technology – C: Software & Data Engineering GJCST-C Volume 14 (GJCST Volume 14 Issue C7): .

Journal Specifications

Crossref Journal DOI 10.17406/gjcst

Print ISSN 0975-4350

e-ISSN 0975-4172

