Agglomerative Hierarchical Clustering: An Introduction to Essentials (1) Proximity Coefficients and Creation of a Vector-Distance Matrix and (2) Construction of the Hierarchical Tree and a Selection of Methods

Refat Aljumily

Volume 16 Issue 3

Global Journal of Human-Social Science

This article concerns a particular type of cluster analysis, agglomerative hierarchical cluster analysis, and belongs to a series of five main parts. The first part deals with proximity coefficients and the creation of a vector-distance matrix. The second part deals with the construction of the hierarchical tree and introduces a selection of clustering methods. The third deals with a variety of ways to transform data prior to agglomerative cluster analysis. The fourth deals with measures and methods of cluster validity. The fifth and final part deals with hypothesis generation. The present article covers the first and second parts only. It explains how agglomerative cluster analysis works by applying it to a data matrix step by step. Different agglomerative hierarchical clustering methods are applied to a purpose-built data matrix, so that different cluster structures are obtained from the same dataset. The last three parts will be covered in the next publication(s). There are many articles, tutorials, and books on this subject. The article has two main objectives: (1) to keep the discussion short and easy to understand by (hopefully) any reader and (2) to develop the motivation for using agglomerative hierarchical clustering to analyse any high-dimensional data of interest with respect to some research question.
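As a rough illustration of the two steps summarised above, the following minimal sketch builds a distance matrix from row vectors and then merges the closest clusters until one remains. It is not the article's own implementation: the function names `distance_matrix` and `agglomerate`, the choice of Euclidean distance, and the four-row data matrix are all assumptions made here for illustration only.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+

def distance_matrix(rows):
    """Pairwise Euclidean distances between the row vectors."""
    n = len(rows)
    return [[dist(rows[i], rows[j]) for j in range(n)] for i in range(n)]

def agglomerate(rows, linkage=min):
    """Repeatedly merge the two closest clusters until one remains.

    linkage=min gives single linkage; linkage=max gives complete linkage.
    Returns the merge history as (cluster_a, cluster_b, height) tuples,
    where each cluster is a tuple of row indices.
    """
    d = distance_matrix(rows)
    clusters = [(i,) for i in range(len(rows))]  # start with singletons
    history = []
    while len(clusters) > 1:
        # Inter-cluster distance: linkage applied to all cross-pair distances.
        def between(pair):
            return linkage(d[i][j] for i in pair[0] for j in pair[1])
        a, b = min(combinations(clusters, 2), key=between)
        history.append((a, b, between((a, b))))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return history

# Hypothetical data matrix: four row vectors, invented for this sketch.
rows = [(0, 0), (2, 0), (5, 0), (9, 0)]
single = agglomerate(rows, linkage=min)
complete = agglomerate(rows, linkage=max)

for name, history in (("single", single), ("complete", complete)):
    print(name)
    for a, b, height in history:
        print(f"  merge {a} + {b} at distance {height}")
```

On this toy matrix the two methods agree on the first merge but not on the second, which is the sense in which different methods can produce different cluster structures from the same dataset.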