Hybrid Technique for Arabic Text Compression

Article ID

CSTSDE2XAL2

Arafat Awajan, Princess Sumaya University for Technology
Enas Abu Jrai
Abstract

Arabic content on the Internet and other digital media is increasing exponentially, and the number of Arab users of these media has grown more than twentyfold over the past five years. There is a real need to reduce the storage space this content occupies and to support more efficient use, search, and retrieval of it. Techniques borrowed from other languages, and general-purpose data compression techniques that ignore the specific features of Arabic, have had limited success in terms of compression ratio. In this paper, we present a hybrid technique that uses the linguistic features of the Arabic language to improve the compression ratio of Arabic texts. The technique works in two phases. In the first phase, the text file is split into four different files using a multilayer model-based approach. In the second phase, each of these four files is compressed using the Burrows-Wheeler compression algorithm.
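The core of the second phase, the Burrows-Wheeler transform, can be illustrated with a minimal Python sketch. This is not the authors' implementation: it is a naive textbook version of the transform and its inverse, and the function names `bwt` and `inverse_bwt` and the sentinel character `\x03` are assumptions made for illustration. The transform is a reversible permutation that tends to group identical characters together; a practical compressor follows it with further coding stages, which are not shown here.

```python
def bwt(text: str, eos: str = "\x03") -> str:
    """Naive Burrows-Wheeler transform built from sorted rotations.

    `eos` is a hypothetical end-of-string sentinel; it must not occur
    in `text` and sorts before every ordinary letter.
    """
    if eos in text:
        raise ValueError("text must not contain the sentinel character")
    s = text + eos
    # Build every cyclic rotation of s, sort them, and keep the last column.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(transformed: str, eos: str = "\x03") -> str:
    """Invert the transform by repeatedly prepending the last column and sorting."""
    n = len(transformed)
    table = [""] * n
    for _ in range(n):
        table = sorted(transformed[i] + table[i] for i in range(n))
    # The original string is the one rotation that ends with the sentinel.
    original = next(row for row in table if row.endswith(eos))
    return original[:-1]


if __name__ == "__main__":
    sample = "كتب الكاتب كتابا"  # illustrative Arabic sample, not from the paper
    encoded = bwt(sample)
    assert inverse_bwt(encoded) == sample
    print(repr(encoded))
```

The rotation-sorting approach is quadratic in memory and is used here only for readability; production compressors build the same output with suffix arrays over fixed-size blocks.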

Arafat Awajan. 2015. “Hybrid Technique for Arabic Text Compression.” Global Journal of Computer Science and Technology – C: Software & Data Engineering, GJCST-C Volume 15 (GJCST Volume 15 Issue C1).

Journal Specifications

Crossref Journal DOI 10.17406/gjcst

Print ISSN 0975-4350

e-ISSN 0975-4172

Classification
C.1.3
