Query Join Processing over Uncertain Data for Decision Tree Classifiers

Dr. V. Yaswanth Kumar α
G. Kalyani σ
α Jawaharlal Nehru Technological University, Kakinada

Article Fingerprint

ResearchID: CSTSDE2EE3E

Abstract

Traditional decision tree classifiers work with data whose values are known and precise. These classifiers can be extended to handle data with uncertain information. Value uncertainty arises in many applications during the data collection process; example sources include measurement and quantization errors, data staleness, and multiple repeated measurements. Rather than abstracting uncertain data by statistical derivatives such as the mean and median, the accuracy of a decision tree classifier can be improved considerably if the complete information of a data item is used, by utilizing its Probability Density Function (PDF). In particular, an attribute value can be modelled as a range of possible values associated with a PDF. Work on PDF-modelled data has so far addressed only simple queries, such as range and nearest-neighbour queries; queries that join multiple relations have not been addressed. Given the significance of joins in databases, we address join queries over uncertain data. We propose semantics for the join operation, define probabilistic operators over uncertain data, and propose join algorithms that provide efficient execution of probabilistic joins, especially probabilistic threshold joins, which avoid the semantic complexities that arise when dealing with uncertain data. For this class of joins we develop three sets of optimization techniques: item-level, page-level, and index-level pruning. We will compare the performance of these techniques experimentally.
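To make the idea of a probabilistic threshold join with item-level pruning concrete, the following is a minimal sketch and not the authors' algorithm. It assumes each uncertain attribute value is given as a discretised PDF (a handful of value/probability pairs), that values in different relations are independent, and that two values "match" when they differ by at most a resolution c. The names `UncertainValue`, `match_probability`, and `threshold_join`, and the parameters `c` and `p`, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class UncertainValue:
    """Hypothetical model: an attribute value as a discretised PDF."""
    name: str
    samples: tuple  # ((value, probability), ...); probabilities sum to 1

    @property
    def lo(self):
        # Lower end of the range of possible values.
        return min(v for v, _ in self.samples)

    @property
    def hi(self):
        # Upper end of the range of possible values.
        return max(v for v, _ in self.samples)


def match_probability(a, b, c):
    """P(|A - B| <= c), assuming A and B are independent."""
    return sum(pa * pb
               for (va, pa), (vb, pb) in product(a.samples, b.samples)
               if abs(va - vb) <= c)


def threshold_join(r, s, c, p):
    """Return (a.name, b.name, prob) for pairs with match probability >= p."""
    result = []
    for a, b in product(r, s):
        # Item-level pruning (illustrative): if the two ranges are more
        # than c apart, the match probability is exactly 0, so the pair
        # can be skipped without evaluating the PDFs.
        if a.lo - b.hi > c or b.lo - a.hi > c:
            continue
        prob = match_probability(a, b, c)
        if prob >= p:
            result.append((a.name, b.name, prob))
    return result


# Two small relations of uncertain values (made-up illustrative data).
r = [UncertainValue("r1", ((10, 0.5), (12, 0.5))),
     UncertainValue("r2", ((30, 1.0),))]
s = [UncertainValue("s1", ((11, 0.5), (14, 0.5))),
     UncertainValue("s2", ((50, 1.0),))]

print(threshold_join(r, s, c=1, p=0.5))
```

In this sketch only the pair (r1, s1) survives the range check; the other three pairs are pruned before any probability is computed. Page-level and index-level pruning would apply the same kind of bound to groups of items rather than to single pairs, which this sketch does not attempt.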

Funding

No external funding was declared for this work.

Conflict of Interest

The authors declare no conflict of interest.

Ethical Approval

No ethics committee approval was required for this article type.

Data Availability

Not applicable for this article.

How to Cite This Article

Dr. V. Yaswanth Kumar. 2012. “Query Join Processing over Uncertain Data for Decision Tree Classifiers”. Global Journal of Computer Science and Technology - C: Software & Data Engineering GJCST-C Volume 12 (GJCST Volume 12 Issue C12): 19-22.

GJCST Volume 12 Issue C12
Pg. 19-22
Journal Specifications

Crossref Journal DOI 10.17406/gjcst

Print ISSN 0975-4350

e-ISSN 0975-4172

Version of record

v1.2

Issue date

August 21, 2012

Language
en