Annotated Bangla News Corpus and Lexicon Development with POS Tagging and Stemming

Abdul Matin

Tasnim Haider Chaudhury

M.S. Hossain

Asie Uzzaman

Md. Masum

α Shahjalal University of Science and Technology

Article Fingerprint

ResearchID

027C9

Abstract

In this paper, we develop a monolingual Bengali news corpus, built with a knowledge-based AI (Artificial Intelligence) technique from several widely read Bengali newspapers. It is intended as a reference corpus for lexicon development, morphological analysis, and automatic part-of-speech detection. The corpus contains 74,698 word forms. The words in the lexicon are manually annotated with tags covering part of speech, stem, morphemes, and other grammatical features, which are important for almost all Natural Language Processing (NLP) applications. The lexicon contains around 14 thousand entries.
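
To make the idea of a lexicon annotated with part of speech and stem more concrete, the following is a minimal sketch and not the authors' implementation. The suffix list, the `LexiconEntry` fields, the `UNK` placeholder tag, and the helper names `split_stem` and `build_lexicon` are all assumptions introduced for illustration. The sketch uses plain whitespace tokenization and naive suffix stripping, with part-of-speech tags supplied by hand.

```python
import unicodedata
from collections import Counter
from dataclasses import dataclass

# Illustrative suffix list (an assumption, far from complete).
# Sorted longest first so that a longer suffix is tried before a shorter one.
SUFFIXES = sorted(
    (unicodedata.normalize("NFC", s) for s in ["গুলো", "দের", "রা", "ের", "কে"]),
    key=len,
    reverse=True,
)


@dataclass
class LexiconEntry:
    """One lexicon row: a word form plus its annotations (hypothetical schema)."""
    word_form: str
    stem: str
    suffix: str
    pos: str = "UNK"   # placeholder tag for words without a manual annotation
    frequency: int = 0


def split_stem(word):
    """Strip the first matching suffix; return (stem, suffix)."""
    word = unicodedata.normalize("NFC", word)
    for suffix in SUFFIXES:
        stem = word[: -len(suffix)]
        # Require a stem of at least two code points to avoid over-stripping.
        if word.endswith(suffix) and len(stem) >= 2:
            return stem, suffix
    return word, ""


def build_lexicon(text, manual_tags):
    """Count whitespace-separated word forms and attach stem and POS annotations."""
    counts = Counter(unicodedata.normalize("NFC", t) for t in text.split())
    lexicon = {}
    for word, freq in counts.items():
        stem, suffix = split_stem(word)
        pos = manual_tags.get(word, "UNK")
        lexicon[word] = LexiconEntry(word, stem, suffix, pos, freq)
    return lexicon


if __name__ == "__main__":
    sample = "ছেলেরা দেশের ছেলেরা"
    for entry in build_lexicon(sample, {"ছেলেরা": "NOUN"}).values():
        print(entry)
```

A real stemmer for Bangla would need a much larger suffix inventory and rules for stacked suffixes and verb inflection; the sketch only shows how word form, stem, tag, and frequency can sit together in one entry.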


Funding

No external funding was declared for this work.

Conflict of Interest

The authors declare no conflict of interest.

Ethical Approval

No ethics committee approval was required for this article type.

Data Availability

Not applicable for this article.

How to Cite This Article

Abdul Matin. 2017. “Annotated Bangla News Corpus and Lexicon Development with POS Tagging and Stemming”. Global Journal of Research in Engineering - J: General Engineering GJRE-J Volume 17 (GJRE Volume 17 Issue J1): .

GJRE Volume 17 Issue J1
Pg. 5-12

Journal Specifications

Crossref Journal DOI 10.17406/gjre

Print ISSN 0975-5861

e-ISSN 2249-4596

Classification

GJRE-J Classification: FOR Code: 200402, 170203

Submission Received December 15, 2016
Peer Review Double Blind
Handling Editor
Accepted January 2, 2017
Published January 15, 2017

Version of record

v1.2

Issue date

May 18, 2017

Country: Bangladesh

Subject: Global Journal of Research in Engineering - J: General Engineering

Authors: Tasnim Haider Chaudhury, Abdul Matin, M.S. Hossain, Asie Uzzaman, Md. Masum
