Translating FEN in a CQL Universe

This document (still under development) is a companion to the CQL by Example document presented on the Scid++ project site. The material that follows addresses a method which may be employed in the translation of a FEN string to and from a format which is perhaps more useful in a CQL query.

Such translations might be particularly useful in multipass queries or, for example, in saving manipulated (e.g., transformed) positions to a file in EPD format. Those applications of the language are not of much interest for most CQL users, and the solutions and examples presented are (we admit) a bit out there in left field. Caveat emptor.

Motivation

A FEN string gives us a convenient, compact way to carry around a board's state. But CQL filters (for the most part) don't evaluate a representation of the board manifest as a FEN string. Most filters are applied to and/or yield a set of squares represented as a set variable (bitboard). We need a method for translating between the compact format and a useful format.

While translation of FEN strings can be useful in a single pass query, the most common application probably is found in multipass queries where the persistent data is stored as a list of FEN strings. Version 6.1 of the language introduces the services that we'll need for passing along that data from one query to the next. We'll give a couple of examples at the end of this document.

Board representations

A board representation should include (at a minimum) the type and color of every piece on the board and the rank and file of the square occupied by the piece. A FEN string's piece placement data (PPD) field carries that information. But the format of that field is not particularly convenient, and so we set out in this document to provide a format and structure for the data that is more friendly to the CQL language.

Probably the most rational format for a board representation in a CQL context is the bitboard (set variable). But a single bitboard is limited in the amount of information that it can carry, and so we need a dozen of them to carry the same information that a FEN PPD carries (one for each piece colortype). We really don't want to have to name and manage a dozen set variables, but we can throw them all into a CQL dictionary and give the collection our one convenient handle.

We'll develop some methods for translating between a FEN PPD and a set of bitboards below. But first we'll give a visual dump that might assist in understanding the various formats that we recognize as board representations.

R7/8/8/1K6/1N6/1k6/4pp2/B7

Given the FEN PPD above, we'll construct the dictionary below with a bitboard per piece colortype.

Dictionary importBB with 12 entries: 
B: a1
K: b5
N: b4
P: []
Q: []
R: a8
b: []
k: b3
n: []
p: [e2,f2]
q: []
r: []

After shifting every bitboard right by one square, we get the representation below.

Dictionary exportBB with 12 entries: 
B: b1
K: c5
N: c4
P: []
Q: []
R: b8
b: []
k: c3
n: []
p: [f2,g2]
q: []
r: []

When translating back to a FEN PPD, we reorder the data above to give us ranks of piece types by file (the PBR format). Since the PPD is arranged by rank, the reordered board representation is considerably easier to process.

Dictionary PBR with 8 entries: 
1: Bb 
2: pf pg 
3: kc 
4: Nc 
5: Kc 
6: 
7: 
8: Rb 

Sliced pumpernickel

Or the next best thing...

Thank the CQL gods for the v6.1 release. At last we have a selection of awesome features which can be abused with abandon and without breaking something (we are renowned for breaking things).

The ~~ form of the while filter — a fantastic gift for the lazy hack — with the application of regular expressions, will iterate over any repeating pattern that can be squeezed into a string. Abusive how? We include in our definition of a repeating pattern any random sequence of characters. We are in iterator heaven.

To be sure, we take this abusive practice to a point of malpractice, but it's just too good to let go of. We hope the CQL developers can forgive the transgression.

The unfathomable solution

The translation machinery is not what one would call "elegant". We're rather feeble-minded down here on the farm and are simply happy that it works and that it serves our purpose.

Our decision to enlist the dictionary as a bitboard container brings along with it a minor awkwardness. From the online reference:

Dictionaries cannot currently be used as the actual argument to a function or returned as the value of a filter. Thus, whenever a dictionary is used, its actual name, as a variable, must be used.

And so we must name our incoming and outgoing dicts as a matter of convention. Not a big deal. We can do that. For our incoming collection of bitboards translated from FEN, we'll use the highly imaginative importBB handle. And on the outgoing end of things — translating back to the FEN format — the exportBB handle.

FEN<=>BB

The translation facilities comprising libfen.cql. The meat on the bone, so to speak. There's no point in trying to decipher the code. We can't even figure it out for ourselves. But it works. It gets the job done.

Note that we give local variables something of a "namespace" by prepending the library's extension to the variable names for names which are not "public". CQL has no scoping of names and so we employ a naming scheme to minimize the likelihood of naming conflicts between libraries and/or queries which employ those libraries.

  • FEN to BB
  • Take a FEN PPD and convert it to a collection of bitboards.
  • dictionary importBB[""] = []
    function fen2bb(FEN) {
      unbind importBB
      while ("KQRBNPkqrbnp" ~~ ".") importBB[\0] = []
    
      fen_Rank = 8
      while ((FEN ~~ "^([^ ]+)") ~~ "[^/]+") {
        fen_File = 1
        while (\0 ~~ ".")
          if int \0
            then fen_File += int \0
            else {
              importBB[\0] = importBB[\0] | makesquare(fen_File fen_Rank)
              fen_File += 1
            }
        fen_Rank -= 1
      }
    }
    

  • BB to FEN
  • Take a dictionary of [presumably manipulated] bitboards and convert them to a FEN PPD string. The data is first reorganized to a "by rank" ordering. We make no assumption about the ordering of the elements along the rank.
  • dictionary exportBB[""] = []
    function bb2fen() {
      dictionary fen_PBR  // pieces by rank
      unbind fen_PBR   while ("12345678" ~~ ".") fen_PBR[\0] = ""
    
      fen_Key ∊ exportBB
        fen_Square ∊ exportBB[fen_Key]
          fen_PBR[str(fen_Square)[1]] =
          fen_PBR[str(fen_Square)[1]] + fen_Key + str(fen_Square)[0] + " "
    
      fen_PPD = ""  // the FEN piece placement data field
      while ("87654321" ~~ ".") {
        fen_Key1 = \0
        fen_CES = 0  // contiguous empty squares
        while ("abcdefgh" ~~ ".") {
          fen_Key2 = \0
          fen_Found = 0
          while (fen_PBR[fen_Key1] ~~ "[^ ]+") {
            if fen_Key2 == \0[1] {
              if fen_CES > 0 then fen_PPD += str(fen_CES) and fen_CES = 0
              fen_PPD += \0[0]  fen_Found = 1
            }
          }
          if fen_Found == 0 then fen_CES += 1
        }
        if fen_CES > 0 then fen_PPD += str(fen_CES)
        if fen_Key1 != "1" then fen_PPD += "/"
      }
      fen_PPD
    }
    

Examples

We give examples of FEN translation for both single pass and multipass queries. Note the use of the include pseudo-directive in the CQL header, pulling in the library given above.

Single pass queries

Explanation...

  • Transforms to EPD
  • We'll demonstrate writing EPD entries to file of shifted boards that have been imported from FEN. The query is executed on a slice of a database of problems with non-standard starting positions. The EPD file will contain before and after entries for each initial position for easy comparison in an EPD viewer.1
  • cql(include libcql/libfen.cql quiet)
    initial
    
    fen2bb(fen)
    
    // Shift the board.
    unbind exportBB
    while ("KQRBNPkqrbnp" ~~ ".")
      exportBB[\0] = right 1 importBB[\0]
    
    FEN = bb2fen()
    
    if gamenumber == 1 then persistent ID = 0
    
    // Before and after positions.
    ID += 1
    writefile("/tmp/out.cqo"
      str(fen ~~ "^[^ ]+ [bw] " "- - c0 " \" "original position" \" "; id " \" ID \" \n))
    ID += 1
    writefile("/tmp/out.cqo"
      str(FEN + fen ~~ " [bw] " "- - c0 " \" "transformed position" \" "; id " \" ID \" \n))
    

  • Example II
  • Explanation...

Multiple pass queries

Version 6.1 of the CQL language gives us the ability to write persistent data to file, giving us a means of pasting together multiple passes, one query acting on the determinations of the prior query. The application of multiple passes represents a whole new universe for CQL. The possibilities are boundless. We'll give a couple of samples below.

  • What am I looking for?
  • This category of multipass query addresses a scenario in which the user has no clue about what the question is. The first pass is designed to determine the question, the second pass to answer the question.

    Suppose we wake up one morning (no coffee yet) wondering why, more prevalently with some pawn structures than with others, a queen exchange with queens on d1 and d8 and a recapture by the king is or is not likely to happen. What does the pawn structure have to do with the determination? Where is the correlation?

    We could throw together any number of single pass queries that would help to answer the question. But suppose we're wondering about positions having the same pawn structure in which the exchange is and is not accepted. What are the differences in the board configurations for a given pawn structure?

    The point is that we don't know what the pawn structures are that we're looking for. We could filter for all games in which the exchange is accepted and all games in which the exchange is not accepted and then step our way through every matching position looking for a pattern. Good luck with that. (A user points out that sorting by pawn structure in a single pass might produce better results. We hate it when that happens.)

    Or we could devise a multipass approach in which we collect a sample of pawn structures in which the exchange is accepted in the first pass, and then look for the same pawn structures in a second pass in which the exchange is declined. A contrived and pointless scenario? Perhaps. User's choice.

    In any event, we'll demonstrate how one would go about devising the multiple passes, not because they yield anything useful but as a demonstration of a technique that may be applied in a scenario that maybe would produce useful results.

  • cql(include libcql/libfen.cql)
    
    function pass1CQL() {
      dictionary noDups
      ply <= 20
      {line quiet lastposition
        --> move from Qd1 capture qd8 --> move from k capture Q
      }:{ if not noDups[fen]
            then {
              noDups[fen] = ""
              writefile("/tmp/out.cqo"  str(fen \n))
            } else false
        }
    }
    
    function pass2CQL() {
      initial
      if (gamenumber == 1) {
        dictionary Pawns
        Fens = readfile "/tmp/out.cqo"
        // Cherry pick the relevant.
        while (Fens ~~ "[^\n]+") {    
          fen2bb(\0)
          Pawns[str(importBB["P"]) + str(importBB["p"])] = ""
        }
      }
    
      find { 
        ply <= 20
        Qd1 attacks qd8  k attacks q
        Pawns[str(P) + str(p)]
      }:{
          not find { ply <= 20  move from Qd1 capture qd8 }
          comment("CQL " P p)
        }
    }
    
    pass1CQL()
    //pass2CQL()
    
  • In the first pass we collect FEN strings in the opening phase of the game in which the exchange has occurred. Duplicate positions are filtered out. In the follow-on pass we cherry pick the pawn structures from the positions and save them to a dict. For each game we look for matching pawn structures and ask if the line allows for the exchange but has no exchange.

    To select between pass one and pass two, just do a bit of commenting and uncommenting of the last two lines of the query. Windows users might have an issue with path designations for the files. We have no idea.


  • Example II
  • Explanation...

See also

Credits

This document is Copyright (c) 2019-2022 Lionel Hampton. The Chess Query Language was originally developed by Gady Costeff and Lewis Stiller. The upstream Scid vs PC project is managed by Steven Atkinson.

Footnotes

1 The Scid application, unfortunately, does not provide for friendly viewing of EPD files, as it does not allow for duplicate positions (who decided that?). We use the PyChess application for viewing an EPD file... not perfect but better.