kn1ght
A work in progress...
Tokenizer Differences
kn1ght's tokenizer is optimized for chess's Portable Game Notation (PGN) format.
Note: kn1ght's tokenizer does not currently account for PGN metadata (Event, Site, Date, etc.), PGN comments ({...}), clock-time annotations ({[%clk ...]}), or other miscellaneous PGN data. It focuses only on the actual moves played in the game.
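Because the tokenizer only covers the moves themselves, raw PGN would presumably need to be reduced to bare movetext before tokenizing. The following is a minimal sketch of that kind of cleanup, not the project's actual preprocessing: the function name strip_pgn_to_moves is hypothetical, and keeping move numbers and the result token (e.g. 1-0) is an assumption about the expected input.

```python
import re


def strip_pgn_to_moves(pgn: str) -> str:
    """Hypothetical helper: reduce a PGN string to its bare movetext.

    Assumption: move numbers and the game result are kept as-is; adjust
    if the tokenizer expects a different movetext layout.
    """
    # Drop tag-pair lines such as [Event "..."] (PGN metadata).
    text = re.sub(r'^[ \t]*\[[A-Za-z0-9_]+ ".*"\][ \t]*$', "", pgn, flags=re.MULTILINE)
    # Drop brace comments, including clock notes like {[%clk 0:05:00]}.
    text = re.sub(r"\{[^}]*\}", "", text)
    # Drop numeric annotation glyphs such as $1.
    text = re.sub(r"\$\d+", "", text)
    # Drop Black move-number continuations such as "1..." that follow a comment.
    text = re.sub(r"\d+\.\.\.", "", text)
    # Collapse the leftover whitespace into single spaces.
    return " ".join(text.split())


example = """[Event "Test"]
[Site "?"]

1. e4 {[%clk 0:05:00]} e5 {a comment} 2. Nf3 Nc6 1-0"""

print(strip_pgn_to_moves(example))
```

The regular expressions here cover only the common cases named above (tag pairs, brace comments, clock annotations); variations and semicolon comments are not handled.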
The tokenizer has been trained on a small dataset of 3.5M chess games from ChessDB, cleaned up by Kaggle user milesh1.
Languages: Jupyter Notebook 99.5%, Python 0.5%
Created January 16, 2025
Updated January 28, 2025
