Abstract

This study explores the safest variations of the French Defense using 5,156 chess games generated artificially with Stockfish 17. Unlike prior work that relies on historical game data, this method reduces theory bias by randomly selecting from the engine's top five moves at each position. We applied k-means clustering with cosine similarity to group move sequences by their evaluation scores, testing both two-cluster and three-cluster models. Stability was assessed with 50 resamples, each drawing 50% of the data. The three-cluster model, which includes a neutral group, was highly stable (ARI = 0.99) but only moderately cohesive (silhouette = 0.53), while the two-cluster model was more cohesive (silhouette = 0.65) but less stable (ARI = 0.68). Among the variations, e5 (Advance) and exd5 (Exchange) stood out: about 54% of games in each line fell into clusters favoring White, suggesting they are the safest and most reliable options. In contrast, Bb5+ performed well in simulation but poorly against real-world data, indicating theoretical risk. Overall, clustering simulated games surfaces strategic insights hidden from historical databases and confirms e5 and exd5 as strong, low-risk choices for White in the French Defense.
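The pipeline described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's actual code: the data here are random stand-ins for the per-game evaluation-score vectors, and cosine-similarity k-means is approximated by L2-normalizing the feature rows before ordinary Euclidean k-means (on the unit sphere, Euclidean distance is a monotone function of cosine distance). The resampling loop mirrors the stated protocol of 50 draws of 50% of the data, scored against the full-data partition with the adjusted Rand index.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score, silhouette_score
from sklearn.preprocessing import normalize

rng = np.random.default_rng(0)
# Hypothetical stand-in for the study's data: one evaluation-score
# vector per simulated game (e.g., engine evals along the move sequence).
X = rng.normal(size=(500, 20))

# Cosine-similarity k-means: L2-normalize rows, then run Euclidean k-means.
Xn = normalize(X)

def cluster(data, k, seed=0):
    """Return k-means labels for `data` with a fixed random seed."""
    return KMeans(n_clusters=k, n_init=10, random_state=seed).fit(data).labels_

# Full-data partition and its cohesion (silhouette under cosine distance).
labels3 = cluster(Xn, 3)
sil = silhouette_score(Xn, labels3, metric="cosine")
print(f"silhouette (k=3): {sil:.3f}")

# Stability: re-cluster 50 subsamples of 50% of the games and compare
# each subsample's labels with the full-data labels via the ARI.
aris = []
for b in range(50):
    idx = rng.choice(len(Xn), size=len(Xn) // 2, replace=False)
    sub_labels = cluster(Xn[idx], 3, seed=b)
    aris.append(adjusted_rand_score(labels3[idx], sub_labels))
print(f"mean ARI over resamples: {np.mean(aris):.3f}")
```

With random data the scores are near zero; on structured data such as the paper's evaluation vectors, the same loop yields the kind of ARI/silhouette trade-off the abstract reports for k = 2 versus k = 3.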

Keywords

clustering analysis; k-means with cosine similarity; cluster stability with resampling; adjusted Rand index; silhouette index

Article Details

How to Cite
Wijayanto, F. (2025). Mapping the Safest Routes: A Clustering Study of the French Defense. Jurnal Sains, Nalar, Dan Aplikasi Teknologi Informasi, 4(2), 54–61. https://doi.org/10.20885/snati.v4.i2.40910
