Abstract
Clustering chess games poses the challenge of accurately grouping games with similar strategies and positions, especially when their openings are similar. Previous research used Portable Game Notation (PGN) as the clustering feature, but because PGN records the sequence of moves, games that reach the same position through different move orders (transpositions) can appear dissimilar. This research addresses that limitation by evaluating Forsyth-Edwards Notation (FEN), which encodes the board position itself, as an alternative. Hierarchical clustering with complete linkage and K-means clustering were applied to 100 chess games at move depths of 20, 30, 40, and 60. Both methods effectively cluster games from the English Opening and the Queen's Gambit Declined, with FEN providing slightly better differentiation than PGN. However, French Defence variations remain difficult to separate, especially the Poulsen Attack and lines with 6.a3, because their positions are so similar. The study underlines the robustness of FEN for clustering tasks and its compatibility with hierarchical clustering, and it highlights the important role of move depth. The results provide a basis for refining clustering methods and for using larger datasets to deepen insight into chess strategies.
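The transposition issue described in the abstract can be illustrated with a minimal sketch. It assumes moves are given in coordinate form (such as "d2d4") and builds only the piece-placement field of a FEN string; it does no legality checking and ignores castling, en passant, and promotion. All function and variable names are hypothetical, and this is not the pipeline used in the study, which may well have relied on an existing PGN-to-FEN converter.

```python
# Sketch only: derive the piece-placement field of FEN from a move list.
# Assumptions: coordinate moves ("d2d4"), no legality checks, no castling,
# en passant, or promotion. Names here are illustrative, not from the paper.

START_ROWS = [
    "rnbqkbnr", "pppppppp", "........", "........",
    "........", "........", "PPPPPPPP", "RNBQKBNR",
]  # rank 8 first, matching FEN order


def start_board():
    """Return the initial position as a mutable 8x8 grid."""
    return [list(row) for row in START_ROWS]


def square_to_index(square):
    """Map a square such as 'e2' to (row, col), where row 0 is rank 8."""
    col = ord(square[0]) - ord("a")
    row = 8 - int(square[1])
    return row, col


def apply_move(board, move):
    """Move whatever stands on the source square to the target square."""
    r1, c1 = square_to_index(move[:2])
    r2, c2 = square_to_index(move[2:4])
    board[r2][c2] = board[r1][c1]
    board[r1][c1] = "."


def placement_field(board):
    """Encode the grid as the first FEN field; empty runs become digits."""
    ranks = []
    for row in board:
        out, empty = "", 0
        for cell in row:
            if cell == ".":
                empty += 1
            else:
                if empty:
                    out += str(empty)
                    empty = 0
                out += cell
        if empty:
            out += str(empty)
        ranks.append(out)
    return "/".join(ranks)


def position_after(moves):
    """Play the moves from the initial position and return the placement."""
    board = start_board()
    for move in moves:
        apply_move(board, move)
    return placement_field(board)


# Two move orders that reach the same Queen's Gambit Declined position.
game_a = ["d2d4", "d7d5", "c2c4", "e7e6", "b1c3", "g8f6"]  # 1.d4 d5 2.c4 e6 3.Nc3 Nf6
game_b = ["c2c4", "e7e6", "b1c3", "d7d5", "d2d4", "g8f6"]  # 1.c4 e6 2.Nc3 d5 3.d4 Nf6

print(game_a == game_b)                                   # move sequences differ
print(position_after(game_a) == position_after(game_b))   # positions coincide
```

A feature built from the move text treats the two games as different from the first move onward, while a position-based feature taken at this depth treats them as identical, which is the behaviour the abstract attributes to FEN.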
Article Details
Copyright (c) 2025 Feri Wijayanto

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
References
- Chess.com, "What are PGN & FEN?," [Online]. Available: https://support.chess.com/en/articles/8598397-what-are-pgn-fen. Accessed: Dec. 14, 2024.
- F. Wijayanto, "Clustering Analysis of Chess Portable Game Notation Text," Jurnal Sains, Nalar, dan Aplikasi Teknologi Informasi, vol. 3, no. 3, pp. 137–142, 2024. DOI: 10.20885/snati.v3.i3.42.
- D. A. Noever, M. Ciolino, and J. Kalin, "The Chess Transformer: Mastering Play using Generative Language Models," arXiv, preprint arXiv:2008.04057, Apr. 2021. [Online]. Available: https://arxiv.org/abs/2008.04057.
- I. Cero and J. Falligant, "Application of the generalized matching law to chess openings: A gambit analysis," Journal of Applied Behavior Analysis, vol. 53, no. 3, pp. 1570–1585, Sep. 2020. DOI: 10.1002/jaba.714.
- J. D. McCaffrey, "Programmatically Converting Chess PGN to FEN," May 15, 2024. [Online]. Available: https://jamesmccaffrey.wordpress.com/2024/05/15/programmatically-converting-chess-pgn-to-fen/. Accessed: Dec. 10, 2024.
- A. K. Jain, "Data clustering: 50 years beyond K-means," Pattern Recognition Letters, vol. 31, no. 8, pp. 651–666, 2010.
- R. S. de Oliveira and E. G. S. Nascimento, "Clustering by Similarity of Brazilian Legal Documents Using Natural Language Processing Approaches," Artificial Intelligence, vol. 297, p. 103506, Apr. 2021. DOI: 10.1016/j.artint.2021.103506.
- K. Raghav and L. Ahuja, "Chess Opening Analysis Using DBSCAN Clustering and Predictive Modeling," 2024 11th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO), Noida, India, 2024, pp. 1–5. DOI: 10.1109/ICRITO61523.2024.10522439.
- V. Păvăloaia, "Clustering Algorithms in Sentiment Analysis Techniques in Social Media – A Rapid Literature Review," International Journal of Advanced Computer Science and Applications (IJACSA), vol. 12, no. 6, pp. 245–251, 2021. DOI: 10.14569/IJACSA.2021.0120628.
- B. T. de Lima Nichio, A. M. R. de Oliveira, C. R. De Pierri, L. G. C. Santos, A. Q. Lejambre, R. Vialle, N. Da Rocha Coimbra, D. Guizelini, J. N. Marchaukoski, F. de Oliveira Pedrosa, and R. T. Raittz, "RAFTS3G: an efficient and versatile clustering software to analyses in large protein datasets," BMC Bioinformatics, vol. 22, no. 1, p. 106, Feb. 2021. DOI: 10.1186/s12859-021-04074-6.
- A. Ghosal, A. Nandy, A. Das, S. Goswami, and M. Panday, "A Short Review on Different Clustering Techniques and Their Applications," in Advances in Intelligent Systems and Computing, vol. 1245, Singapore: Springer, 2021, pp. 315–324. DOI: 10.1007/978-981-15-8127-4_29.
- C. Bauckhage, A. Drachen, and R. Sifa, "Clustering Game Behavior Data," IEEE Transactions on Computational Intelligence and AI in Games, vol. 7, no. 3, pp. 266–278, Sep. 2015. DOI: 10.1109/TCIAIG.2014.2341355.
- R. Scharnagl, "X-FEN: An Extension of Forsyth-Edwards Notation," [Online]. Available: https://en.wikipedia.org/wiki/X-FEN. Accessed: Dec. 21, 2024.
- ChessBase, "FEN Format - ChessBase 17," [Online]. Available: https://help.chessbase.com/CBase/17/Eng/index.html?fen_format.htm. Accessed: Dec. 21, 2024.
- A. K. Filippov, "Chess Notation in a Polycode Text (Based on the German Language)," World of Chess Research Journal, vol. 12, no. 4, pp. 75–89, 2023. DOI: 10.12345/chess.2023.004.
- A. Tanaka, Y. Ishitsuka, H. Ohta, A. Fujimoto, J. Yasunaga, and M. Matsuoka, "Systematic clustering algorithm for chromatin accessibility data and its application to hematopoietic cells," PLoS Computational Biology, vol. 18, no. 5, p. e1010023, May 2022. DOI: 10.1371/journal.pcbi.1010023.
- A. Zangari, M. Marcuzzo, M. Rizzo, L. Giudice, A. Albarelli, and A. Gasparetto, "Hierarchical Text Classification and Its Foundations: A Review of Current Research," Electronics, vol. 11, no. 5, p. 795, Mar. 2022. DOI: 10.3390/electronics11050795.
- M. Ahmed, S. Tiun, N. Omar, and N. Sani, "Short Text Clustering Algorithms, Application and Challenges: A Survey," Applied Sciences, vol. 12, no. 3, p. 1678, Feb. 2022. DOI: 10.3390/app12031678.
- M. Chaudhry, I. Shafi, M. Mahnoor, D. L. Ramírez Vargas, E. B. Thompson, and I. Ashraf, "A Systematic Literature Review on Identifying Patterns Using Unsupervised Clustering Algorithms: A Data Mining Perspective," Symmetry, vol. 14, no. 6, p. 1225, Jun. 2022. DOI: 10.3390/sym14061225.
- P. Chassy and F. Gobet, "Measuring Chess Experts' Single-Use Sequence Knowledge: An Archival Study of Departure from ‘Theoretical’ Openings," PLoS ONE, vol. 6, no. 11, p. e26692, Nov. 2011. DOI: 10.1371/journal.pone.0026692.
- Y. Gong, K. A. Ericsson, and J. H. Moxley, "Recall of Briefly Presented Chess Positions and Its Relation to Chess Skill," PLoS ONE, vol. 10, no. 3, p. e0118756, Mar. 2015. DOI: 10.1371/journal.pone.0118756.