Limeta
Limeta (Lichess metadata) is a conversion tool for extracting chess game metadata from Lichess' public PGN files and exporting it to database tables.
What is a PGN file?
PGN (Portable Game Notation) files are plain-text files that provide records of chess games along with their metadata.
Although PGN files generally have the same structure, there are many variations which store different metadata fields.
Lichess PGN format
The extensive game database provided by Lichess (over 740 million games as of July 2019) is distributed as PGN files, each storing the data of millions of chess games. Every game within these files uses the following format.
This format is essentially the same as standard PGN, but with a few differences, particularly in the game record section and some of the metadata fields.
[Event "Rated Bullet tournament https://lichess.org/tournament/yc1WW2Ox"]
[Site "https://lichess.org/PpwPOZMq"]
[White "Abbot"]
[Black "Costello"]
[Result "0-1"]
[UTCDate "2017.04.01"]
[UTCTime "11:32:01"]
[WhiteElo "2100"]
[BlackElo "2000"]
[WhiteRatingDiff "-4"]
[BlackRatingDiff "+1"]
[WhiteTitle "FM"]
[BlackTitle "GM"]
[ECO "B30"]
[Opening "Sicilian Defense: Old Sicilian"]
[TimeControl "300+0"]
[Termination "Time forfeit"]
1. e4 { [%eval 0.17] [%clk 0:00:30] } 1... c5 { [%eval 0.19] [%clk 0:00:30] }
2. Nf3 { [%eval 0.25] [%clk 0:00:29] } 2... Nc6 { [%eval 0.33] [%clk 0:00:30] }
3. Bc4 { [%eval -0.13] [%clk 0:00:28] } 3... e6 { [%eval -0.04] [%clk 0:00:30] }
4. c3 { [%eval -0.4] [%clk 0:00:27] } 4... b5? { [%eval 1.18] [%clk 0:00:30] }
5. Bb3?! { [%eval 0.21] [%clk 0:00:26] } 5... c4 { [%eval 0.32] [%clk 0:00:29] }
6. Bc2 { [%eval 0.2] [%clk 0:00:25] } 6... a5 { [%eval 0.6] [%clk 0:00:29] }
7. d4 { [%eval 0.29] [%clk 0:00:23] } 7... cxd3 { [%eval 0.6] [%clk 0:00:27] }
8. Qxd3 { [%eval 0.12] [%clk 0:00:22] } 8... Nf6 { [%eval 0.52] [%clk 0:00:26] }
9. e5 { [%eval 0.39] [%clk 0:00:21] } 9... Nd5 { [%eval 0.45] [%clk 0:00:25] }
10. Bg5?! { [%eval -0.44] [%clk 0:00:18] } 10... Qc7 { [%eval -0.12] [%clk 0:00:23] }
11. Nbd2?? { [%eval -3.15] [%clk 0:00:14] } 11... h6 { [%eval -2.99] [%clk 0:00:23] }
12. Bh4 { [%eval -3.0] [%clk 0:00:11] } 12... Ba6? { [%eval -0.12] [%clk 0:00:23] }
13. b3?? { [%eval -4.14] [%clk 0:00:02] } 13... Nf4? { [%eval -2.73] [%clk 0:00:21] } 0-1
The data is divided into two sections:
Metadata
This section consists of data about the chess game itself along with information about the players of the game.
[Event "Rated Bullet tournament https://lichess.org/tournament/yc1WW2Ox"]
[Site "https://lichess.org/PpwPOZMq"]
[White "Abbot"]
[Black "Costello"]
[Result "0-1"]
[UTCDate "2017.04.01"]
[UTCTime "11:32:01"]
[WhiteElo "2100"]
[BlackElo "2000"]
[WhiteRatingDiff "-4"]
[BlackRatingDiff "+1"]
[WhiteTitle "FM"]
[BlackTitle "GM"]
[ECO "B30"]
[Opening "Sicilian Defense: Old Sicilian"]
[TimeControl "300+0"]
[Termination "Time forfeit"]
Limeta extracts these individual key-value pairs and stores them as one table record per game in the PGN file, where the fields of the table correspond to the keys above.
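As a rough illustration of this extraction step, the sketch below turns the tag lines of one game into a Ruby Hash. It is not Limeta's actual implementation: the `TAG_PAIR` pattern and the `parse_tags` helper are hypothetical names introduced only for this example.

```ruby
# Matches one PGN tag pair such as [White "Abbot"].
# (Hypothetical pattern, written for this example only.)
TAG_PAIR = /\A\[(\w+)\s+"(.*)"\]\z/

# Hypothetical helper: builds a Hash of metadata from the lines of one game.
# Lines that are not tag pairs (for example, moves) are ignored.
def parse_tags(lines)
  lines.each_with_object({}) do |line, tags|
    match = TAG_PAIR.match(line.strip)
    tags[match[1]] = match[2] if match
  end
end

game = <<~PGN
  [Site "https://lichess.org/PpwPOZMq"]
  [White "Abbot"]
  [Black "Costello"]
  [Result "0-1"]
  [WhiteElo "2100"]
PGN

record = parse_tags(game.lines)
puts record["White"]
puts record.keys.inspect
```

Each key of the resulting Hash would map to one field of the exported table, and each value to that field's entry for the game.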
Game record
The game record is an ordered sequence of the moves played in an individual chess game. These moves are recorded in standard algebraic notation (SAN). In the case of Lichess PGN files, each move may also be followed by a comment in braces holding a chess engine evaluation ([%eval ...], if enabled for that game) and, as in the example below, the player's remaining clock time ([%clk ...]).
1. e4 { [%eval 0.17] [%clk 0:00:30] } 1... c5 { [%eval 0.19] [%clk 0:00:30] }
2. Nf3 { [%eval 0.25] [%clk 0:00:29] } 2... Nc6 { [%eval 0.33] [%clk 0:00:30] }
3. Bc4 { [%eval -0.13] [%clk 0:00:28] } 3... e6 { [%eval -0.04] [%clk 0:00:30] }
4. c3 { [%eval -0.4] [%clk 0:00:27] } 4... b5? { [%eval 1.18] [%clk 0:00:30] }
5. Bb3?! { [%eval 0.21] [%clk 0:00:26] } 5... c4 { [%eval 0.32] [%clk 0:00:29] }
6. Bc2 { [%eval 0.2] [%clk 0:00:25] } 6... a5 { [%eval 0.6] [%clk 0:00:29] }
7. d4 { [%eval 0.29] [%clk 0:00:23] } 7... cxd3 { [%eval 0.6] [%clk 0:00:27] }
8. Qxd3 { [%eval 0.12] [%clk 0:00:22] } 8... Nf6 { [%eval 0.52] [%clk 0:00:26] }
9. e5 { [%eval 0.39] [%clk 0:00:21] } 9... Nd5 { [%eval 0.45] [%clk 0:00:25] }
10. Bg5?! { [%eval -0.44] [%clk 0:00:18] } 10... Qc7 { [%eval -0.12] [%clk 0:00:23] }
11. Nbd2?? { [%eval -3.15] [%clk 0:00:14] } 11... h6 { [%eval -2.99] [%clk 0:00:23] }
12. Bh4 { [%eval -3.0] [%clk 0:00:11] } 12... Ba6? { [%eval -0.12] [%clk 0:00:23] }
13. b3?? { [%eval -4.14] [%clk 0:00:02] } 13... Nf4? { [%eval -2.73] [%clk 0:00:21] } 0-1
As Limeta is only concerned with the metadata of chess games, this section is discarded during the parsing process.
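One way to picture this is a single pass over the file that keeps tag lines and skips everything else. The sketch below is an assumption about how such a pass could look, not Limeta's own parser. The `each_game_metadata` helper is hypothetical, the second game in the sample is made up for illustration, and the code assumes every game has at least one non-empty line of moves (or a result token) after its tags.

```ruby
require "stringio"

# Hypothetical helper: yields one Hash of tags per game read from an IO,
# skipping every line that is not a tag pair (that is, the game record).
def each_game_metadata(io)
  return enum_for(:each_game_metadata, io) unless block_given?

  tags = {}
  in_movetext = false
  io.each_line do |line|
    line = line.strip
    if (match = /\A\[(\w+)\s+"(.*)"\]\z/.match(line))
      # A tag line seen after move text marks the start of the next game.
      if in_movetext
        yield tags
        tags = {}
        in_movetext = false
      end
      tags[match[1]] = match[2]
    elsif !line.empty?
      in_movetext = true # move text is read but never stored
    end
  end
  yield tags unless tags.empty?
end

pgn = StringIO.new(<<~PGN)
  [White "Abbot"]
  [Black "Costello"]
  [Result "0-1"]

  1. e4 { [%eval 0.17] } 1... c5 { [%eval 0.19] } 0-1

  [White "Costello"]
  [Black "Abbot"]
  [Result "1-0"]

  1. d4 d5 1-0
PGN

games = each_game_metadata(pgn).to_a
puts games.length
puts games.first.inspect
```

Because the move text is never accumulated, memory use depends only on the size of one game's tags, which matters for files holding millions of games.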
Installation
To install the CLI:
$ gem install limeta
Usage
Once you have installed the CLI, you can inspect it using the command:
$ limeta -h
Commands:
limeta [FILES]
limeta --version, -v
Options:
-a, --adapter=ADAPTER
-c, [--conn=CONN]
-t, --table=TABLE
Limeta supports two database adapters: sqlite3 and psql.
Options
- --adapter, -a: Database adapter (Required)
The database adapter to use for connecting to the database.
Must be either sqlite3 or psql.
- --table, -t: Table name (Required)
The name of the database table to export the PGN metadata to.
- --conn, -c: Connection string (Not required)
The connection string for connecting to the database through the specified adapter. If not provided as an option argument, the user will be prompted to enter the information to form a connection string.
If using the sqlite3 adapter, the connection string is simply a path to the desired database file (with extension .sqlite3, .sqlite, .db or .sql).
Example: /Users/eonu/development/chess/data/lichess.sqlite3
If using the psql adapter, the connection string is in the form of postgresql://[user[:password]@][netloc][:port][/dbname]. Read more about connection strings in the PostgreSQL documentation.
Examples:
postgresql://eonu:munchy7@localhost:5432/chess
postgresql://eonu@46.101.90.215:5432/chess
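To make the two connection string shapes concrete, the sketch below checks a sqlite3 path against the extensions listed above and splits a psql URI into its parts using Ruby's standard `uri` library. How Limeta itself validates or parses these strings is not documented here. `SQLITE_EXTENSIONS` and `describe_connection` are hypothetical names for this example only.

```ruby
require "uri"

# Extensions accepted for the sqlite3 adapter, as listed above.
SQLITE_EXTENSIONS = %w[.sqlite3 .sqlite .db .sql].freeze

# Hypothetical helper: returns the parts of a connection string as a Hash,
# or raises ArgumentError if the string does not fit the adapter.
def describe_connection(adapter, conn)
  case adapter
  when "sqlite3"
    ext = File.extname(conn)
    unless SQLITE_EXTENSIONS.include?(ext)
      raise ArgumentError, "unexpected extension: #{ext.inspect}"
    end
    { path: conn }
  when "psql"
    uri = URI.parse(conn)
    unless uri.scheme == "postgresql"
      raise ArgumentError, "expected a postgresql:// URI"
    end
    {
      user: uri.user,
      password: uri.password, # nil when omitted
      host: uri.host,
      port: uri.port,
      dbname: uri.path.delete_prefix("/")
    }
  else
    raise ArgumentError, "unknown adapter: #{adapter}"
  end
end

puts describe_connection("sqlite3", "/Users/eonu/development/chess/data/lichess.sqlite3").inspect
puts describe_connection("psql", "postgresql://eonu@46.101.90.215:5432/chess").inspect
```

Omitting the password, as in the second call, leaves that part as nil, which a real tool could use as a cue to prompt for it.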
Examples
$ limeta *.pgn -a sqlite3 -t games -c /Users/eonu/development/chess/data/lichess.sqlite3
$ limeta . -a psql -t games -c postgresql://eonu@46.101.90.215:5432/chess
$ limeta 2018-02.pgn -a psql -t games
Contributors
All contributions to this repository are greatly appreciated. Contribution guidelines can be found here.
© 2019, Edwin Onuonga - Released under the MIT License.
Authored and maintained by Edwin Onuonga.