Clustering Provinces in Indonesia based on Community Welfare Indicators

Sekti Kartika Dini, Achmad Fauzan


The Preamble of the 1945 Constitution of the Republic of Indonesia explicitly states that the main tasks of the government of the Republic of Indonesia are to advance general prosperity, to develop the nation's intellectual life, and to realize social justice for all Indonesian people. Social inequality is a problem that Indonesians still face today. Solving it requires supporting data analysis as a basis for policy formulation. This research aims to cluster the provinces of Indonesia based on community welfare indicators using K-Means cluster analysis. K-Means was chosen because its variance value (0.101) is smaller than that of average linkage cluster analysis (0.152). Based on the data analysis, the provinces are grouped into three clusters: the first consists of 21 provinces, the second of 3 provinces, and the third of 10 provinces. Each cluster has different characteristics that the relevant parties can take into account when addressing the social welfare gap. In addition, to make the cluster results easier to understand, they are visualized with a Geographic Information System (GIS) on a map of Indonesia, with a different color gradation for each cluster.


Average Linkage Clustering; Clustering; GIS; K-Means; Public Welfare
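
The abstract selects K-Means over average linkage because it yields the smaller variance value. As a rough illustration of that kind of comparison, the sketch below assumes the "variance value" is the ratio of within-cluster variance to between-cluster variance (the abstract does not give the formula). The function names, the toy points, and the two candidate labelings are hypothetical and are not taken from the study's data.

```python
from collections import defaultdict


def mean_point(points):
    """Component-wise mean of a list of equal-length numeric tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def variance_ratio(points, labels):
    """Return Vw / Vb (assumed definition); smaller means tighter,
    better-separated clusters."""
    clusters = defaultdict(list)
    for p, lab in zip(points, labels):
        clusters[lab].append(p)
    n, k = len(points), len(clusters)
    if k < 2 or n <= k:
        raise ValueError("need at least 2 clusters and more points than clusters")
    grand = mean_point(points)
    within = 0.0   # sum of squared distances to each cluster's centroid
    between = 0.0  # size-weighted squared distances of centroids to grand mean
    for members in clusters.values():
        centroid = mean_point(members)
        within += sum(sq_dist(p, centroid) for p in members)
        between += len(members) * sq_dist(centroid, grand)
    vw = within / (n - k)
    vb = between / (k - 1)
    return vw / vb


# Made-up toy data: two candidate labelings of the same six points,
# standing in for the outputs of two different clustering methods.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
labels_a = [0, 0, 0, 1, 1, 1]
labels_b = [0, 0, 1, 1, 1, 1]

ratio_a = variance_ratio(points, labels_a)
ratio_b = variance_ratio(points, labels_b)
best = "A" if ratio_a < ratio_b else "B"
print(f"ratio A = {ratio_a:.4f}, ratio B = {ratio_b:.4f}, preferred: {best}")
```

In practice the two labelings would come from running each clustering method on the standardized welfare indicators with the same number of clusters, and the method with the smaller ratio would be preferred.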

