Agglomerative Hierarchical Clustering: An Introduction to Essentials. (3) Standardization, Normalization and Dimensionality Reduction of a Data Matrix

Refat Aljumily
Newcastle University


ResearchID: 6300W


Abstract

In a previous tutorial article I looked at a proximity coefficient and, in the light of that proximity, created a vector-distance matrix and used it to construct hierarchical trees with different hierarchical clustering methods, as a basis for exploratory multivariate analysis. The present article deals with three topics: (i) standardization, to correct for variation in variable scales, (ii) normalization, to correct for variation in sample length, and (iii) dimensionality reduction, or minimization of the data space. The choice of techniques reflects the author’s academic background and particular area of interest, but the techniques are not tied to a particular purpose: they are straightforwardly applicable to other kinds of data, and thus to a wide range of analyses in Linguistics. My treatment of these techniques is, necessarily, introductory and brief. I hope that this article will provide practitioners with an introductory overview of these techniques as used for cluster analysis of electronic corpora of linguistic data.
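The three topics named above can be illustrated with a minimal Python sketch. It is not taken from the article: the function names, the toy frequency matrix, and the specific choices (z-scores for standardization, rescaling each row to the mean sample length for normalization, and keeping the highest-variance columns for dimensionality reduction) are all assumptions made for illustration, and the article itself may use different procedures.

```python
from statistics import mean, pstdev, pvariance

# Hypothetical data matrix: rows are text samples, columns are variable counts.
counts = [[2, 4, 6], [1, 2, 3], [4, 4, 4]]

def standardize(matrix):
    """Assumed z-score form: (value - column mean) / column standard deviation."""
    columns = list(zip(*matrix))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stds)]
        for row in matrix
    ]

def normalize_by_length(matrix):
    """Assumed form: rescale each row so its total equals the mean row total."""
    totals = [sum(row) for row in matrix]
    mean_total = mean(totals)
    return [
        [x * mean_total / t if t > 0 else 0.0 for x in row]
        for row, t in zip(matrix, totals)
    ]

def keep_top_variance(matrix, k):
    """Assumed reduction: keep the k highest-variance columns, in original order."""
    columns = list(zip(*matrix))
    variances = [pvariance(col) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: variances[j], reverse=True)
    kept = sorted(ranked[:k])
    return [[row[j] for j in kept] for row in matrix]

if __name__ == "__main__":
    print(normalize_by_length(counts))
    print(standardize(counts))
    print(keep_top_variance(counts, 2))
```

In this toy matrix the first two rows have the same profile at different lengths, so length normalization maps them to the same vector, which is the effect normalization is meant to have before distances are computed.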


Funding

No external funding was declared for this work.

Conflict of Interest

The authors declare no conflict of interest.

Ethical Approval

No ethics committee approval was required for this article type.

Data Availability

Not applicable for this article.

How to Cite This Article

Refat Aljumily. 2016. “Agglomerative Hierarchical Clustering: An Introduction to Essentials. (3) Standardization, Normalization and Dimensionality Reduction of a Data Matrix”. Global Journal of Human-Social Science - G: Linguistics & Education GJHSS-G Volume 16 (GJHSS Volume 16 Issue G3): 55-63.


GJHSS Volume 16 Issue G3
Pg. 55-63
Journal Specifications

Crossref Journal DOI 10.17406/GJHSS

Print ISSN 0975-587X

e-ISSN 2249-460X

Classification
GJHSS-G Classification: FOR Code: 139999
Version of record

v1.2

Issue date

April 29, 2016

Language
en
