Probability of Semantic Similarity and N-grams Pattern Learning for Data Classification

Article ID

CSTITO5986

V Vineeth Kumar, JNTU Hyderabad
Dr. N Satyanarayana

Abstract

Semantic learning is an important mechanism for document classification, but most classification approaches consider only the content and word distribution. Traditional classification algorithms cannot accurately represent the meaning of a document because they do not take into account the semantic relations between words. In this paper, we present an approach to document classification that incorporates two similarity scoring methods: first, a semantic similarity method that computes the probable similarity based on Bayes' method, and second, n-gram pairs scored by a probability similarity based on frequent terms. Since semantic similarity and n-gram pairs can each play an important role, from separate views, in classifying a document, we design a semantic similarity learning (SSL) algorithm to improve the performance of document classification for large quantities of unclassified documents. The experimental evaluation shows an improvement in the accuracy and effectiveness of the proposal on unclassified documents.
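
The abstract does not state the scoring formulas, so the following Python sketch only illustrates the general idea of combining a Bayes-based class probability with an n-gram overlap score. It is not the authors' SSL algorithm. The function names, the Laplace smoothing, the bigram default, the weight `alpha`, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter

# Toy labeled documents (hypothetical data, for illustration only).
EXAMPLE_DOCS = [
    ("sports", "the team won the football match"),
    ("sports", "the player scored a goal in the match"),
    ("finance", "the bank raised interest rates"),
    ("finance", "the stock market fell as rates rose"),
]

def ngrams(tokens, n=2):
    """Return the list of consecutive n-token tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def train(labeled_docs, n=2):
    """Collect per-class document, word, and n-gram counts."""
    model = {}
    for label, text in labeled_docs:
        tokens = text.lower().split()
        entry = model.setdefault(
            label, {"docs": 0, "words": Counter(), "grams": Counter()}
        )
        entry["docs"] += 1
        entry["words"].update(tokens)
        entry["grams"].update(ngrams(tokens, n))
    return model

def bayes_posteriors(model, tokens):
    """Multinomial naive Bayes posteriors with Laplace smoothing (assumed)."""
    vocab = set()
    for entry in model.values():
        vocab.update(entry["words"])
    total_docs = sum(entry["docs"] for entry in model.values())
    log_scores = {}
    for label, entry in model.items():
        total_words = sum(entry["words"].values())
        score = math.log(entry["docs"] / total_docs)
        for tok in tokens:
            score += math.log((entry["words"][tok] + 1) / (total_words + len(vocab)))
        log_scores[label] = score
    # Normalize in a numerically stable way so the values sum to 1.
    top = max(log_scores.values())
    exps = {label: math.exp(s - top) for label, s in log_scores.items()}
    norm = sum(exps.values())
    return {label: v / norm for label, v in exps.items()}

def ngram_scores(model, tokens, n=2):
    """Fraction of the document's n-grams already seen in each class."""
    grams = ngrams(tokens, n)
    scores = {}
    for label, entry in model.items():
        if not grams:
            scores[label] = 0.0
        else:
            seen = sum(1 for g in grams if g in entry["grams"])
            scores[label] = seen / len(grams)
    return scores

def classify(model, text, n=2, alpha=0.5):
    """Blend the two scores; alpha is a hypothetical mixing weight."""
    tokens = text.lower().split()
    bayes = bayes_posteriors(model, tokens)
    overlap = ngram_scores(model, tokens, n)
    combined = {
        label: alpha * bayes[label] + (1 - alpha) * overlap[label]
        for label in model
    }
    best = max(combined, key=combined.get)
    return best, combined
```

Calling `classify(train(EXAMPLE_DOCS), "the team won the match")` returns the predicted label together with the per-class combined scores. A linear blend is only one way to merge the two views; the paper may combine them differently.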

V Vineeth Kumar. 2017. "Probability of Semantic Similarity and N-grams Pattern Learning for Data Classification." Global Journal of Computer Science and Technology – H: Information & Technology (GJCST-H), Volume 17, Issue H2.

Journal Specifications

Crossref Journal DOI 10.17406/gjcst

Print ISSN 0975-4350

e-ISSN 0975-4172

Classification
GJCST-H Classification: G.3 I.5, I.5.2