Move the docs folder one level up; it was in src by mistake.

This commit is contained in:
Tomasz Sobczyk
2020-10-09 10:16:02 +02:00
committed by nodchip
parent ef57ac78a3
commit 2af4bf7eac
4 changed files with 0 additions and 0 deletions

docs/binpack.md
# Binpack
Binpack is a binary training data storage format designed to exploit chains of positions that differ by a single move. It is therefore very good at compactly storing data generated from real games (as opposed to, for example, random positions sourced from an opening book).
It is currently implemented as a single-header library in `extra/nnue_data_binpack_format.h`.
A rough description of the format follows, in a BNF-like notation.
```
[[nodiscard]] std::uint16_t signedToUnsigned(std::int16_t a) {
std::uint16_t r;
std::memcpy(&r, &a, sizeof(std::uint16_t));
if (r & 0x8000) r ^= 0x7FFF; // flip value bits if negative
r = (r << 1) | (r >> 15); // store sign bit at bit 0
return r;
}
file := <block>*
block := BINP<chain>*
chain := <stem><movetext>
stem := <pos><move><score><ply_and_result><rule50> (32 bytes)
pos := https://github.com/Sopel97/nnue_data_compress/blob/master/src/chess/Position.h#L1166 (24 bytes)
move := https://github.com/Sopel97/nnue_data_compress/blob/master/src/chess/Chess.h#L1044 (2 bytes)
score := signedToUnsigned(score) (2 bytes, big endian)
ply_and_result := ply bitwise_or (signedToUnsigned(result) << 14) (2 bytes, big endian)
rule50 := rule_50_counter (2 bytes, big endian)
// This is a small defect kept from an old version; I didn't want to break
// backwards compatibility. Effectively one byte is left free for future use,
// because rule50 always fits in one byte.
movetext := <count><move_and_score>*
count := number of plies in the movetext (2 bytes, big endian). Can be 0.
move_and_score := <encoded_move><encoded_score> (~2 bytes)
encoded_move := a variable-length move encoding that is complicated to explain here. See
                https://github.com/Sopel97/nnue_data_compress/blob/master/src/compress_file.cpp#L827
                and https://github.com/Sopel97/chess_pos_db/blob/master/docs/bcgn/variable_length.md
encoded_score := variable-width encoding (https://en.wikipedia.org/wiki/Variable-width_encoding)
                with a block size of 4 bits + 1 extension bit per block.
Encoded value is signedToUnsigned(-prev_score - current_score)
(scores are always seen from the perspective of side to move in <pos>, that's why the '-' before prev_score)
```

docs/convert.md
# Convert
`convert` allows conversion of training data between any of `.plain`, `.bin`, and `.binpack`.
As with all commands in Stockfish, `convert` can be invoked either from the command line (as `stockfish.exe convert ...`) or from the interactive prompt.
The syntax of this command is as follows:
```
convert from_path to_path [append]
```
`from_path` is the path to the file to convert from. The type of the data is deduced based on its extension (one of `.plain`, `.bin`, `.binpack`).
`to_path` is the path to an output file. The type of the data is deduced from its extension. If the file does not exist it is created.
The last argument is optional. If not specified then the output file will be truncated prior to any writes. If the last argument is `append` then the converted training data will be appended to the end of the output file.
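For example, a hypothetical session converting generated data and then appending more to the same output (file names are made up for illustration):

```
convert generated_kifu.bin generated_kifu.binpack
convert more_data.plain generated_kifu.binpack append
```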

docs/gensfen.md
# Gensfen
The `gensfen` command generates training data from self-play in a manner that suits training better than traditional games: it introduces random moves to diversify openings and evaluates positions at a fixed depth.
As with all commands in Stockfish, `gensfen` can be invoked either from the command line (as `stockfish.exe gensfen ...`, though this is not recommended because UCI options cannot be specified before `gensfen` executes) or from the interactive prompt.
It is recommended to set the `PruneAtShallowDepth` UCI option to `false` as it will increase the quality of fixed depth searches.
It is recommended to keep the `EnableTranspositionTable` UCI option at the default `true` value as it will make the generation process faster without noticeably harming the uniformity of the data.
`gensfen` takes named parameters in the form of `gensfen param_1_name param_1_value param_2_name param_2_value ...`.
Currently the following options are available:
`set_recommended_uci_options` - this is a modifier not a parameter, no value follows it. If specified then some UCI options are set to recommended values.
`depth` - minimum depth of evaluation of each position. Default: 3.
`depth2` - maximum depth of evaluation of each position. If not specified then the same as `depth`.
`nodes` - the number of nodes to use for evaluation of each position. This number is multiplied by the number of PVs of the current search. This does NOT override the `depth` and `depth2` options. If specified then whichever of depth or nodes limit is reached first applies.
`loop` - the number of training data entries to generate. 1 entry == 1 position. Default: 8000000000 (8B).
`output_file_name` - the name of the file to output to. If the extension is missing or doesn't match the selected training data format, the right extension will be appended. Default: `generated_kifu`.
`eval_limit` - evaluations with higher absolute value than this will not be written and will terminate a self-play game. Should not exceed 10000 which is VALUE_KNOWN_WIN, but is only hardcapped at mate in 2 (\~30000). Default: 3000
`random_move_minply` - the minimal ply at which a random move may be executed instead of a move chosen by search. Default: 1.
`random_move_maxply` - the maximal ply at which a random move may be executed instead of a move chosen by search. Default: 24.
`random_move_count` - maximum number of random moves in a single self-play game. Default: 5.
`random_move_like_apery` - either 0 or 1. If 1 then random king moves will be followed by a random king move from the opponent whenever possible with 50% probability. Default: 0.
`random_multi_pv` - the number of PVs used for determining the random move. If not specified then a truly random move will be chosen. If specified then a multiPV search will be performed and the random move will be one of the moves chosen by the search.
`random_multi_pv_diff` - Makes the multiPV random move selection consider only moves that are at most `random_multi_pv_diff` worse than the next best move. Default: 30000 (all multiPV moves).
`random_multi_pv_depth` - the depth to use for multiPV search for random move. Default: `depth2`.
`write_minply` - minimum ply for which the training data entry will be emitted. Default: 16.
`write_maxply` - maximum ply for which the training data entry will be emitted. Default: 400.
`save_every` - the number of training data entries per file. If not specified then there will be always one file. If specified there may be more than one file generated (each having at most `save_every` training data entries) and each file will have a unique number attached.
`random_file_name` - if specified then the output filename will be chosen randomly. Overrides `output_file_name`.
`write_out_draw_game_in_training_data_generation` - either 0 or 1. If 1 then training data from drawn games will be emitted too. Default: 1.
`use_draw_in_training_data_generation` - deprecated, alias for `write_out_draw_game_in_training_data_generation`
`detect_draw_by_consecutive_low_score` - either 0 or 1. If 1 then drawn games will be adjudicated when the score remains 0 for at least 8 plies after ply 80. Default: 1.
`use_game_draw_adjudication` - deprecated, alias for `detect_draw_by_consecutive_low_score`
`detect_draw_by_insufficient_mating_material` - either 0 or 1. If 1 then positions with insufficient mating material will be adjudicated as draws. Default: 1.
`sfen_format` - format of the training data to use. Either `bin` or `binpack`. Default: `binpack`.
`seed` - seed for the PRNG. Can be either a number or a string. If it's a string then its hash will be used. If not specified then the current time will be used.
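Putting some of the parameters above together, a hypothetical interactive session might look like this (the values are arbitrary examples, not tuned recommendations):

```
setoption name Threads value 4
setoption name PruneAtShallowDepth value false
gensfen depth 8 loop 10000000 output_file_name my_data sfen_format binpack seed 12345
```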

docs/learn.md
# Learn
The `learn` command trains a network from training data.
As with all commands in Stockfish, `learn` can be invoked either from the command line (as `stockfish.exe learn ...`, though this is not recommended because UCI options cannot be specified before `learn` executes) or from the interactive prompt.
`learn` takes named parameters in the form of `learn param_1_name param_1_value param_2_name param_2_value ...`. Unrecognized parameters form a list of paths to training data files.
It is recommended to set the `EnableTranspositionTable` UCI option to `false` to reduce the interference between qsearches which are used to provide shallow evaluation. Using TT may cause the shallow evaluation to diverge from the real evaluation of the net, hiding imperfections.
It is recommended to set the `PruneAtShallowDepth` UCI option to `false` as it will provide more accurate shallow evaluation.
It is **required** to set the `Use NNUE` UCI option to `pure` as otherwise the function being optimized will not always match the function being probed, in which case not much can be learned.
Currently the following options are available:
`set_recommended_uci_options` - this is a modifier not a parameter, no value follows it. If specified then some UCI options are set to recommended values.
`bat` - the size of a batch in multiples of 10000. This determines how many entries are read and shuffled at once during training. The default batch size is 1000000 (1M).
`targetdir` - path to the directory from which training data will be read. All files in this directory are read sequentially. If not specified then only the list of files from positional arguments will be used. If specified then files from the given directory will be used after the explicitly specified files.
`loop` - the number of times to loop over all training data.
`basedir` - the base directory for the paths. Default: "" (current directory)
`batchsize` - same as `bat` but doesn't scale by 10000
`lr` - initial learning rate. Default: 1.
`use_draw_games_in_training` - either 0 or 1. If 1 then draws will be used in training too. Default: 1.
`use_draw_in_training` - deprecated, alias for `use_draw_games_in_training`
`use_draw_games_in_validation` - either 0 or 1. If 1 then draws will be used in validation too. Default: 1.
`use_draw_in_validation` - deprecated, alias for `use_draw_games_in_validation`
`skip_duplicated_positions_in_training` - either 0 or 1. If 1 then a small hashtable will be used to try to eliminate duplicated positions from training. Default: 0.
`use_hash_in_training` - deprecated, alias for `skip_duplicated_positions_in_training`
`winning_probability_coefficient` - some magic value for winning probability. If you need to read this then don't touch it. Default: 1.0 / PawnValueEg / 4.0 * std::log(10.0)
`use_wdl` - either 0 or 1. If 1 then the evaluations will be converted to win/draw/loss percentages prior to learning on them. (Slightly changes the gradient because eval has a different derivative than wdl). Default: 0.
`lambda` - value in range [0..1]. 1 means that only evaluation is used for learning, 0 means that only game result is used. Values inbetween result in interpolation between the two contributions. See `lambda_limit` for when this is applied. Default: 1.0.
`lambda2` - value in range [0..1]. 1 means that only evaluation is used for learning, 0 means that only game result is used. Values inbetween result in interpolation between the two contributions. See `lambda_limit` for when this is applied. Default: 1.0.
`lambda_limit` - the maximum absolute score value for which `lambda` is used as opposed to `lambda2`. For positions with absolute evaluation higher than `lambda_limit` `lambda2` will be used. Default: 32000 (so always `lambda`).
`reduction_gameply` - the minimum ply after which positions won't be skipped. Positions at plies below this value are skipped with a probability that lessens linearly with the ply (reaching 0 at `reduction_gameply`). Default: 1.
`eval_limit` - positions with absolute evaluation higher than this will be skipped. Default: 32000 (nothing is skipped).
`save_only_once` - this is a modifier not a parameter, no value follows it. If specified then there will be only one network file generated.
`no_shuffle` - this is a modifier not a parameter, no value follows it. If specified then data within a batch won't be shuffled.
`nn_batch_size` - minibatch size used for learning. Should be smaller than batch size. Default: 1000.
`newbob_decay` - the learning rate will be multiplied by this factor every time a net is rejected (in other words, it controls LR drops). Default: 1.0 (no LR drops).
`newbob_num_trials` - determines after how many subsequent rejected nets the training process will be terminated. Default: 4.
`nn_options` - if you're reading this you don't use it. It passes messages directly to the network evaluation. I don't know what it can do either.
`eval_save_interval` - every `eval_save_interval` positions the network will be saved and either accepted or rejected (in which case an LR drop follows). Default: 100000000 (100M). (generally people use values in 10M-100M range)
`loss_output_interval` - every `loss_output_interval` positions, fitness statistics are displayed. Default: 1000000 (1M).
`validation_set_file_name` - path to the file with training data to be used for validation (loss computation and move accuracy)
`seed` - seed for the PRNG. Can be either a number or a string. If it's a string then its hash will be used. If not specified then the current time will be used.
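For illustration, a hypothetical interactive session using the parameters above (paths and values are made up, not recommendations):

```
setoption name Use NNUE value pure
setoption name EnableTranspositionTable value false
setoption name PruneAtShallowDepth value false
learn targetdir train_data loop 5 lr 1 eval_save_interval 50000000 validation_set_file_name val_data.binpack
```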
## Legacy subcommands and parameters
### Convert
`convert_plain`
`convert_bin`
`interpolate_eval`
`check_invalid_fen`
`check_illegal_move`
`convert_bin_from_pgn-extract`
`pgn_eval_side_to_move`
`convert_no_eval_fens_as_score_zero`
`src_score_min_value`
`src_score_max_value`
`dest_score_min_value`
`dest_score_max_value`
### Shuffle
`shuffle`
`buffer_size`
`shuffleq`
`shufflem`
`output_file_name`