Extended Position Description - Het Familieklokje

Section 16.2 from Portable Game Notation.

EPD is “Extended Position Description”; it is a standard for describing chess positions along with an extended set of structured attribute values using the ASCII character set. It is intended for data and command interchange among chessplaying programs. It is also intended for the representation of portable opening library repositories.

A single EPD record uses one text line of variable length composed of four data fields followed by zero or more operations. The four fields of the EPD specification are the same as the first four fields of the FEN specification.

A text file composed exclusively of EPD data records should have a file name with the suffix “.epd”.

1: History

EPD is based in part on the earlier FEN standard; it has added extensions for use with opening library preparation and also for general data and command interchange among advanced chess programs. EPD was developed by John Stanback and Steven Edwards; its first implementation is in Stanback’s master strength chessplaying program Zarkov.

2: Uses for an extended position notation

Like FEN, EPD can also be used for general position description. However, unlike FEN, EPD is designed to be expandable by the addition of new operations that provide new functionality as needs arise.

Many interesting chess problem sets represented using EPD can be found at the chess.uoknor.edu ftp site in the directory pub/chess/SAN_testsuites.

3: Data fields

EPD specifies the piece placement, the active color, the castling availability, and the en passant target square of a position. These can all fit on a single text line in an easily read format. The length of an EPD position description varies somewhat according to the position and any associated operations. In some cases, the description could be eighty or more characters in length and so may not fit conveniently on some displays. However, most EPD descriptions pass among programs only and these are not usually seen by program users.

(Note: due to the likelihood of future expansion of EPD, implementors are encouraged to have their programs handle EPD text lines of up to 1024 characters long.)

Each EPD data field is composed only of non-blank printing ASCII characters. Adjacent data fields are separated by a single ASCII space character.
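As a minimal sketch of the layout just described, the following Python function separates a record into its four data fields and the remaining operations text. The function name split_epd is hypothetical and not part of the standard, and the sketch assumes the record uses single spaces exactly as specified.

```python
def split_epd(line):
    """Split one EPD record into (four data fields, raw operations text)."""
    # At most four splits: the fifth part, if any, is the operations text.
    parts = line.strip().split(" ", 4)
    if len(parts) < 4:
        raise ValueError("an EPD record needs four data fields")
    fields = parts[:4]
    operations = parts[4] if len(parts) == 5 else ""
    return fields, operations


fields, operations = split_epd(
    'rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - id "start";'
)
```

Here fields holds the piece placement, active color, castling availability, and en passant target square, in that order.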

3.1: Piece placement data

The first field represents the placement of the pieces on the board. The board contents are specified starting with the eighth rank and ending with the first rank. For each rank, the squares are specified from file a to file h. White pieces are identified by uppercase SAN piece letters (“PNBRQK”) and black pieces are identified by lowercase SAN piece letters (“pnbrqk”). Empty squares are represented by the digits one through eight; the digit used represents the count of contiguous empty squares along a rank. A solidus character “/” is used to separate data of adjacent ranks.
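One way to read this field is to expand it into an eight by eight grid. The sketch below is illustrative only; the name parse_placement and the use of "." for an empty square are assumptions made for the example.

```python
def parse_placement(field):
    """Expand the piece placement field into 8 rows of 8 squares.

    Row 0 is the eighth rank, row 7 is the first rank; column 0 is file a.
    Empty squares are shown as "." (a choice made for this sketch).
    """
    ranks = field.split("/")
    if len(ranks) != 8:
        raise ValueError("expected eight ranks")
    board = []
    for rank in ranks:
        row = []
        for ch in rank:
            if ch in "12345678":
                row.extend(["."] * int(ch))   # run of empty squares
            elif ch in "PNBRQKpnbrqk":
                row.append(ch)                # one piece
            else:
                raise ValueError(f"unexpected character {ch!r}")
        if len(row) != 8:
            raise ValueError("each rank must describe eight squares")
        board.append(row)
    return board


board = parse_placement("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR")
```

With the initial position, board[0] is the black back rank and board[7][4] is the white king on e1.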

3.2: Active color

The second field represents the active color. A lower case “w” is used if White is to move; a lower case “b” is used if Black is the active player.

3.3: Castling availability

The third field represents castling availability. This indicates potential future castling that may or may not be possible at the moment due to blocking pieces or enemy attacks. If there is no castling availability for either side, the single character symbol “-” is used. Otherwise, a combination of from one to four characters is present. If White has kingside castling availability, the uppercase letter “K” appears. If White has queenside castling availability, the uppercase letter “Q” appears. If Black has kingside castling availability, the lowercase letter “k” appears. If Black has queenside castling availability, then the lowercase letter “q” appears. Those letters which appear will be ordered first uppercase before lowercase and second kingside before queenside. There is no white space between the letters.
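The ordering rule means that a well-formed field is either “-” or a selection of letters from “KQkq” kept in that order. A small Python check along those lines might look like this; the function name is hypothetical.

```python
def is_valid_castling(field):
    """Return True if the castling field follows the ordering rule above."""
    if field == "-":
        return True
    if not field:
        return False
    # Rebuild the field from the canonical order; duplicates, stray
    # characters, and misordered letters all fail the comparison.
    canonical = "".join(c for c in "KQkq" if c in field)
    return field == canonical
```

For example, "KQkq", "Kq", and "-" are accepted, while "qK" (wrong order) and "KK" (duplicate letter) are rejected.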

3.4: En passant target square

The fourth field is the en passant target square. If there is no en passant target square then the single character symbol “-” appears. If there is an en passant target square then it is represented by a lowercase file character immediately followed by a rank digit. Obviously, the rank digit will be “3” following a white pawn double advance (Black is the active color) or else be the digit “6” after a black pawn double advance (White being the active color).

An en passant target square is given if and only if the last move was a pawn advance of two squares. Therefore, an en passant target square field may have a square name even if there is no pawn of the opposing side that may immediately execute the en passant capture.
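The relationship between the rank digit and the active color can be checked mechanically. The following is a rough sketch with a hypothetical function name; it validates only the syntax and the rank rule, not whether a pawn actually stands in front of the target square.

```python
import re


def is_valid_en_passant(ep_field, active_color):
    """Check the en passant field against the active color ("w" or "b")."""
    if ep_field == "-":
        return True
    if not re.fullmatch(r"[a-h][36]", ep_field):
        return False
    # Rank 3 follows a white double advance (Black to move);
    # rank 6 follows a black double advance (White to move).
    expected_rank = "3" if active_color == "b" else "6"
    return ep_field[1] == expected_rank
```

After 1. e4 the field is "e3" with Black to move, so is_valid_en_passant("e3", "b") holds while is_valid_en_passant("e3", "w") does not.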

4: Operations

An EPD operation is composed of an opcode followed by zero or more operands and is concluded by a semicolon.

Multiple operations are separated by a single space character. If there is at least one operation present in an EPD line, it is separated from the last (fourth) data field by a single space character.

4.1: General format

An opcode is an identifier that starts with a letter character and may be followed by up to fourteen more characters. Each additional character may be a letter or a digit or the underscore character.

An operand is either a set of contiguous non-white space printing characters or a string. A string is a set of contiguous printing characters delimited by a quote character at each end. A string value must have less than 256 bytes of data.

If at least one operand is present in an operation, there is a single space between the opcode and the first operand. If more than one operand is present in an operation, there is a single blank character between every two adjacent operands. If there are no operands, a semicolon character is appended to the opcode to mark the end of the operation. If any operands appear, the last operand has an appended semicolon that marks the end of the operation.
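The delimiting rules above are enough to tokenize the operations part of a record. The sketch below is one possible reader, not a reference implementation; the name parse_operations is hypothetical. It returns every operand as a plain Python string and does not record whether the operand was quoted, which a fuller reader might want to keep.

```python
def parse_operations(text):
    """Parse operations text into a list of (opcode, [operands]) pairs."""
    operations = []
    tokens = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch == " ":
            i += 1
        elif ch == ";":
            # A semicolon closes the current operation.
            if not tokens:
                raise ValueError("operation without an opcode")
            operations.append((tokens[0], tokens[1:]))
            tokens = []
            i += 1
        elif ch == '"':
            # String operand: everything up to the next quote character.
            end = text.find('"', i + 1)
            if end == -1:
                raise ValueError("unterminated string operand")
            tokens.append(text[i + 1:end])
            i = end + 1
        else:
            # Opcode or non-string operand: contiguous printing characters.
            j = i
            while j < n and text[j] not in ' ;"':
                j += 1
            tokens.append(text[i:j])
            i = j
    if tokens:
        raise ValueError("missing terminating semicolon")
    return operations


ops = parse_operations('bm Nf3 e4; id "BK.01; test";')
```

In this example the semicolon inside the quoted string is kept as part of the “id” operand rather than ending the operation.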

Any given opcode appears at most once per EPD record. Multiple operations in a single EPD record should appear in ASCII order of their opcode names (mnemonics). However, a program reading EPD records may allow for operations not in ASCII order by opcode mnemonics; the semantics are the same in either case.

Some opcodes that allow for more than one operand may have special ordering requirements for the operands. For example, the “pv” (predicted variation) opcode requires its operands (moves) to appear in the order in which they would be played. All other opcodes that allow for more than one operand should have operands appearing in ASCII order. An example of the latter set is the “bm” (best move[s]) opcode; its operands are moves that are all immediately playable from the current position.
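A writer can apply both ordering rules when emitting a record. The sketch below assumes the operations are held in a dictionary that maps each opcode to a list of operand tokens already in their final text form (string operands already quoted). The names format_operations and ORDER_SENSITIVE are hypothetical, and the set lists only the opcodes that this document describes as order dependent (“pv”, “tcri”, “tcsi”).

```python
# Opcodes whose operand order carries meaning and must be preserved.
ORDER_SENSITIVE = {"pv", "tcri", "tcsi"}


def format_operations(operations):
    """Format a dict of opcode -> operand tokens as EPD operations text."""
    parts = []
    # sorted() compares code points, which matches ASCII order for ASCII text.
    for opcode in sorted(operations):
        operands = list(operations[opcode])
        if opcode not in ORDER_SENSITIVE:
            operands = sorted(operands)
        parts.append(" ".join([opcode] + operands) + ";")
    return " ".join(parts)


text = format_operations({
    "pv": ["e4", "e5", "Nf3"],
    "id": ['"BK.01"'],
    "bm": ["Qd1+", "Nf3"],
})
# text == 'bm Nf3 Qd1+; id "BK.01"; pv e4 e5 Nf3;'
```

The “bm” moves are sorted, while the “pv” moves keep the order in which they would be played.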

Some opcodes require one or more operands that are chess moves. These moves should be represented using SAN. If a different representation is used, there is no guarantee that the EPD will be read correctly during subsequent processing.

Some opcodes require one or more operands that are integers. Some opcodes may require that an integer operand must be within a given range; the details are described in the opcode list given below. A negative integer is formed with a hyphen (minus sign) preceding the integer digit sequence. An optional plus sign may be used for indicating a non-negative value, but such use is not required and is indeed discouraged.

Some opcodes require one or more operands that are floating point numbers. Some opcodes may require that a floating point operand must be within a given range; the details are described in the opcode list given below. A floating point operand is constructed from an optional sign character (“+” or “-”), a digit sequence (with at least one digit), a radix point (always “.”), and a final digit sequence (with at least one digit).
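The integer and floating point forms described above translate directly into regular expressions. This is a sketch; the names INTEGER_RE, FLOAT_RE, and classify_operand are chosen here for illustration.

```python
import re

# Optional sign followed by one or more digits.
INTEGER_RE = re.compile(r"[+-]?[0-9]+")
# Optional sign, digits, a radix point, and at least one more digit.
FLOAT_RE = re.compile(r"[+-]?[0-9]+\.[0-9]+")


def classify_operand(token):
    """Return "integer", "float", or "other" for a non-string operand."""
    if INTEGER_RE.fullmatch(token):
        return "integer"
    if FLOAT_RE.fullmatch(token):
        return "float"
    return "other"
```

Under these patterns "-32765" is an integer and "1.5" is a floating point number, while ".5" and "5." are neither because a digit sequence is required on both sides of the radix point.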

4.2: Opcode mnemonics

An opcode mnemonic used for archival storage and for interprogram communication starts with a lower case letter and is composed of only lower case letters, digits, and the underscore character (i.e., no upper case letters). These mnemonics will also all be at least two characters in length.

Opcode mnemonics used only by a single program or an experimental suite of programs should start with an upper case letter. This is so they may be easily distinguished should they inadvertently be encountered by other programs. When such a “private” opcode is demonstrated to be widely useful, it should be brought into the official list (appearing below) in a lower case form.

If a given program does not recognize a particular opcode, that operation is simply ignored; it is not signaled as an error.
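The naming rules from sections 4.1 and 4.2 can be combined into a small classifier. This is only a sketch; the function name and the category labels are invented for the example, and mnemonics that fit neither convention (such as a single lower case letter) are reported as "unclassified" because the text does not say how to treat them.

```python
import re

# Section 4.1: a letter followed by up to fourteen letters, digits, or underscores.
OPCODE_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,14}")
# Section 4.2: interchange mnemonics are lower case and at least two characters.
INTERCHANGE_RE = re.compile(r"[a-z][a-z0-9_]{1,14}")


def classify_opcode(opcode):
    """Return "interchange", "private", "unclassified", or "invalid"."""
    if not OPCODE_RE.fullmatch(opcode):
        return "invalid"
    if INTERCHANGE_RE.fullmatch(opcode):
        return "interchange"
    if opcode[0].isupper():
        return "private"
    return "unclassified"
```

A reader following the rule above would skip any operation whose opcode it does not recognize, whatever its category, rather than report an error.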

5: Opcode list

The opcodes are listed here in ASCII order of their mnemonics. Suggestions for new opcodes should be sent to the PGN standard coordinator listed near the start of this document.

5.1: Opcode “acn”: analysis count: nodes

The opcode “acn” takes a single non-negative integer operand. It is used to represent the number of nodes examined in an analysis. Note that the value may be quite large for some extended searches and so use of (at least) a long (four byte) representation is suggested.

5.2: Opcode “acs”: analysis count: seconds

The opcode “acs” takes a single non-negative integer operand. It is used to represent the number of seconds used for an analysis. Note that the value may be quite large for some extended searches and so use of (at least) a long (four byte) representation is suggested.

5.3: Opcode “am”: avoid move(s)

The opcode “am” indicates a set of zero or more moves, all immediately playable from the current position, that are to be avoided in the opinion of the EPD writer. Each operand is a SAN move; they appear in ASCII order.

5.4: Opcode “bm”: best move(s)

The opcode “bm” indicates a set of zero or more moves, all immediately playable from the current position, that are judged to be the best available by the EPD writer. Each operand is a SAN move; they appear in ASCII order.

5.5: Opcode “c0”: comment (primary, also “c1” through “c9”)

The opcode “c0” (lower case letter “c”, digit character zero) indicates a top level comment that applies to the given position. It is the first of ten ranked comments, each of which has a mnemonic formed from the lower case letter “c” followed by a single decimal digit. Each of these opcodes takes either a single string operand or no operand at all.

This ten member comment family of opcodes is intended for use as descriptive commentary for a complete game or game fragment. The usual processing of these opcodes is as follows:

1) At the beginning of a game (or game fragment), a move sequence scanning program initializes each element of its set of ten comment string registers to be null.

2) As the EPD record for each position in the game is processed, the comment operations are interpreted from left to right. (Actually, all operations in an EPD record are interpreted from left to right.) Because operations appear in ASCII order according to their opcode mnemonics, opcode “c0” (if present) will be handled prior to all other comment opcodes, then opcode “c1” (if present), and so forth until opcode “c9” (if present).

3) The processing of opcode “cN” (0 <= N <= 9) involves two steps. First, all comment string registers with an index equal to or greater than N are set to null. (This is the set “cN” through “c9”.) Second, and only if a string operand is present, the value of the corresponding comment string register is set equal to the string operand.
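The three steps above can be sketched as a small register update routine. The function name apply_ranked_ops is hypothetical, null is represented by None, and the operations are assumed to arrive as (opcode, operands) pairs in record order. The prefix parameter is an addition for this sketch so that the same routine can serve the “v0” through “v9” family of section 5.26, which is processed the same way.

```python
def apply_ranked_ops(registers, operations, prefix="c"):
    """Update ten string registers from the operations of one EPD record.

    registers:  list of ten entries, None meaning null
    operations: iterable of (opcode, [operands]) in left-to-right order
    """
    for opcode, operands in operations:
        if len(opcode) == 2 and opcode[0] == prefix and opcode[1] in "0123456789":
            n = int(opcode[1])
            # Step one: clear register N and every lower ranked register.
            for i in range(n, 10):
                registers[i] = None
            # Step two: store the operand, if one was given.
            if operands:
                registers[n] = operands[0]
    return registers


registers = [None] * 10            # start of a game: all registers null
apply_ranked_ops(registers, [("c0", ["Game 1"]), ("c1", ["Opening"])])
apply_ranked_ops(registers, [("c1", ["Middlegame"])])
```

After the second record, register 0 still holds "Game 1" and register 1 holds "Middlegame"; a later “c0;” with no operand would reset all ten registers to null.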

5.6: Opcode “ce”: centipawn evaluation

The opcode “ce” indicates the evaluation of the indicated position in centipawn units. It takes a single operand, an optionally signed integer that gives an evaluation of the position from the viewpoint of the active player; i.e., the player with the move. Positive values indicate a position favorable to the moving player while negative values indicate a position favorable to the passive player; i.e., the player without the move. A centipawn evaluation value close to zero indicates a neutral positional evaluation.

Values are restricted to integers that are equal to or greater than -32767 and are less than or equal to 32766.

A value greater than 32000 indicates the availability of a forced mate to the active player. The number of plies until mate is given by subtracting the evaluation from the value 32767. Thus, a winning mate in N fullmoves is a mate in ((2 * N) - 1) halfmoves (or ply) and has a corresponding centipawn evaluation of (32767 - ((2 * N) - 1)). For example, a mate on the move (mate in one) has a centipawn evaluation of 32766 while a mate in five has a centipawn evaluation of 32758.

A value less than -32000 indicates the availability of a forced mate to the passive player. The number of plies until mate is given by subtracting the evaluation from the value -32767 and then negating the result. Thus, a losing mate in N fullmoves is a mate in (2 * N) halfmoves (or ply) and has a corresponding centipawn evaluation of (-32767 + (2 * N)). For example, a mate after the move (losing mate in one) has a centipawn evaluation of -32765 while a losing mate in five has a centipawn evaluation of -32757.

A value of -32767 indicates an illegal position. A stalemate position has a centipawn evaluation of zero as does a position drawn due to insufficient mating material. Any other position known to be a certain forced draw also has a centipawn evaluation of zero.
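The mate encoding can be written as a pair of helper functions. This is a sketch with hypothetical names; it simply restates the formulas above and treats -32767 (illegal position) as carrying no mate distance.

```python
def mate_score(fullmoves, winning=True):
    """Centipawn evaluation for a forced mate in the given number of fullmoves."""
    if winning:
        # Winning mate in N fullmoves is mate in (2 * N) - 1 plies.
        return 32767 - (2 * fullmoves - 1)
    # Losing mate in N fullmoves is mate in 2 * N plies.
    return -32767 + 2 * fullmoves


def plies_to_mate(ce):
    """Plies until mate encoded in a "ce" value, or None if it is not a mate score."""
    if ce > 32000:
        return 32767 - ce
    if -32767 < ce < -32000:
        return -(-32767 - ce)
    return None
```

These reproduce the worked values in the text: mate_score(1) is 32766, mate_score(5) is 32758, mate_score(1, winning=False) is -32765, and mate_score(5, winning=False) is -32757.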

5.7: Opcode “dm”: direct mate fullmove count

The “dm” opcode is used to indicate the number of fullmoves until checkmate is to be delivered by the active color for the indicated position. It always takes a single operand which is a positive integer giving the fullmove count. For example, a position known to be a “mate in three” would have an operation of “dm 3;” to indicate this.

This opcode is intended for use with problem sets composed of positions requiring direct mate answers as solutions.

5.8: Opcode “draw_accept”: accept a draw offer

The opcode “draw_accept” is used to indicate that a draw offer made after the move that led to the indicated position is accepted by the active player. This opcode takes no operands.

5.9: Opcode “draw_claim”: claim a draw

The opcode “draw_claim” is used to indicate a claim by the active player that a draw exists. The draw is claimed because of a third time repetition or because of the fifty move rule or because of insufficient mating material. A supplied move (see the opcode “sm”) is also required to appear as part of the same EPD record. The draw_claim opcode takes no operands.

5.10: Opcode “draw_offer”: offer a draw

The opcode “draw_offer” is used to indicate that a draw is offered by the active player. A supplied move (see the opcode “sm”) is also required to appear as part of the same EPD record; this move is considered played from the indicated position. The draw_offer opcode takes no operands.

5.11: Opcode “draw_reject”: reject a draw offer

The opcode “draw_reject” is used to indicate that a draw offer made after the move that led to the indicated position is rejected by the active player. This opcode takes no operands.

5.12: Opcode “eco”: _Encyclopedia of Chess Openings_ opening code

The opcode “eco” is used to associate an opening designation from the _Encyclopedia of Chess Openings_ taxonomy with the indicated position. The opcode takes either a single string operand (the ECO opening name) or no operand at all. If an operand is present, its value is associated with an “ECO” string register of the scanning program. If there is no operand, the ECO string register of the scanning program is set to null.

The usage is similar to that of the “ECO” tag pair of the PGN standard.

5.13: Opcode “fmvn”: fullmove number

The opcode “fmvn” represents the fullmove number associated with the position. It always takes a single operand that is the positive integer value of the move number.

This opcode is used to explicitly represent the fullmove number in EPD that is present by default in FEN as the sixth field. Fullmove number information is usually omitted from EPD because it does not affect move generation (commonly needed for EPD-using tasks) but it does affect game notation (commonly needed for FEN-using tasks). Because of the desire for space optimization for large EPD files, fullmove numbers were dropped from EPD’s parent FEN. The halfmove clock information was similarly dropped.

5.14: Opcode “hmvc”: halfmove clock

The opcode “hmvc” represents the halfmove clock associated with the position. The halfmove clock of a position is equal to the number of plies since the last pawn move or capture. This information is used to implement the fifty move draw rule. It always takes a single operand that is the non-negative integer value of the halfmove clock.

This opcode is used to explicitly represent the halfmove clock in EPD that is present by default in FEN as the fifth field. Halfmove clock information is usually omitted from EPD because it does not affect move generation (commonly needed for EPD-using tasks) but it does affect game termination issues (commonly needed for FEN-using tasks). Because of the desire for space optimization for large EPD files, halfmove clock values were dropped from EPD’s parent FEN. The fullmove number information was similarly dropped.
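Since “hmvc” and “fmvn” carry the fifth and sixth FEN fields, an EPD record can be turned back into a FEN string once its operations have been parsed. The sketch below uses a hypothetical function name and assumes the operations are available as a dictionary from opcode to operand list. The fallback values of 0 and 1 used when an opcode is absent are an assumption of this example, not something the text specifies.

```python
def epd_to_fen(fields, operations):
    """Build a six-field FEN string from EPD data fields and operations.

    fields:     the four EPD data fields, in order
    operations: dict mapping opcode -> list of operand strings
    """
    # Assumed defaults when the record omits the opcode: clock 0, move 1.
    halfmove_clock = operations.get("hmvc", ["0"])[0]
    fullmove_number = operations.get("fmvn", ["1"])[0]
    return " ".join(list(fields) + [halfmove_clock, fullmove_number])


fen = epd_to_fen(
    ["rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR", "b", "KQkq", "e3"],
    {"hmvc": ["0"], "fmvn": ["1"]},
)
```

Here fen is the position after 1. e4 written as a full FEN record.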

5.15: Opcode “id”: position identification

The opcode “id” is used to provide a simple identifying label for the indicated position. It takes a single string operand.

This opcode is intended for use with test suites used for measuring chessplaying program strength. An example “id” operand for the seven hundred fifty seventh position of the one thousand one problems in Reinfeld’s _1001 Winning Chess Sacrifices and Combinations_ would be “WCSAC.0757” while the fifteenth position in the twenty four problem Bratko-Kopec test suite would have an “id” operand of “BK.15”.

5.16: Opcode “nic”: _New In Chess_ opening code

The opcode “nic” is used to associate an opening designation from the _New In Chess_ taxonomy with the indicated position. The opcode takes either a single string operand (the NIC opening name) or no operand at all. If an operand is present, its value is associated with an “NIC” string register of the scanning program. If there is no operand, the NIC string register of the scanning program is set to null.

The usage is similar to that of the “NIC” tag pair of the PGN standard.

5.17: Opcode “noop”: no operation

The “noop” opcode is used to indicate no operation. It takes zero or more operands, each of which may be of any type. The operation involves no processing. It is intended for use by developers for program testing purposes.

5.18: Opcode “pm”: predicted move

The “pm” opcode is used to provide a single predicted move for the indicated position. It has exactly one operand, a move playable from the position. This move is judged by the EPD writer to represent the best move available to the active player.

If a non-empty “pv” (predicted variation) line of play is also present in the same EPD record, the first move of the predicted variation is the same as the predicted move.

The “pm” opcode is intended for use as a general “display hint” mechanism.

5.19: Opcode “pv”: predicted variation

The “pv” opcode is used to provide a predicted variation for the indicated position. It has zero or more operands which represent a sequence of moves playable from the position. This sequence is judged by the EPD writer to represent the best play available.

If a “pm” (predicted move) operation is also present in the same EPD record, the predicted move is the same as the first move of the predicted variation.

5.20: Opcode “rc”: repetition count

The “rc” opcode is used to indicate the number of occurrences of the indicated position. It takes a single, positive integer operand. Any position, including the initial starting position, is considered to have an “rc” value of at least one. A value of three indicates a candidate for a draw claim by the position repetition rule.

5.21: Opcode “resign”: game resignation

The opcode “resign” is used to indicate that the active player has resigned the game. This opcode takes no operands.

5.22: Opcode “sm”: supplied move

The “sm” opcode is used to provide a single supplied move for the indicated position. It has exactly one operand, a move playable from the position. This move is the move to be played from the position.

The “sm” opcode is intended for use to communicate the most recent played move in an active game. It is used to communicate moves between programs in automatic play via a network. This includes correspondence play using e-mail and also programs acting as network front ends to human players.

5.23: Opcode “tcgs”: telecommunication: game selector

The “tcgs” opcode is one of the telecommunication family of opcodes used for games conducted via e-mail and similar means. This opcode takes a single operand that is a positive integer. It is used to select among various games in progress between the same sender and receiver.

5.24: Opcode “tcri”: telecommunication: receiver identification

The “tcri” opcode is one of the telecommunication family of opcodes used for games conducted via e-mail and similar means. This opcode takes two order dependent string operands. The first operand is the e-mail address of the receiver of the EPD record. The second operand is the name of the player (program or human) at the address who is the actual receiver of the EPD record.

5.25: Opcode “tcsi”: telecommunication: sender identification

The “tcsi” opcode is one of the telecommunication family of opcodes used for games conducted via e-mail and similar means. This opcode takes two order dependent string operands. The first operand is the e-mail address of the sender of the EPD record. The second operand is the name of the player (program or human) at the address who is the actual sender of the EPD record.

5.26: Opcode “v0”: variation name (primary, also “v1” through “v9”)

The opcode “v0” (lower case letter “v”, digit character zero) indicates a top level variation name that applies to the given position. It is the first of ten ranked variation names, each of which has a mnemonic formed from the lower case letter “v” followed by a single decimal digit. Each of these opcodes takes either a single string operand or no operand at all.

This ten member variation name family of opcodes is intended for use as traditional variation names for a complete game or game fragment. The usual processing of these opcodes is as follows:

1) At the beginning of a game (or game fragment), a move sequence scanning program initializes each element of its set of ten variation name string registers to be null.

2) As the EPD record for each position in the game is processed, the variation name operations are interpreted from left to right. (Actually, all operations in an EPD record are interpreted from left to right.) Because operations appear in ASCII order according to their opcode mnemonics, opcode “v0” (if present) will be handled prior to all other variation name opcodes, then opcode “v1” (if present), and so forth until opcode “v9” (if present).

3) The processing of opcode “vN” (0 <= N <= 9) involves two steps. First, all variation name string registers with an index equal to or greater than N are set to null. (This is the set “vN” through “v9”.) Second, and only if a string operand is present, the value of the corresponding variation name string register is set equal to the string operand.