Compare commits

124 Commits

Author SHA1 Message Date
FauziAkram
ce73441f20 Simplify sudden death time optimization
Passed Sudden Death STC:
https://tests.stockfishchess.org/tests/view/68455fe5375c2b77d9855351
LLR: 2.91 (-2.94,2.94) <-1.75,0.25>
Total: 49248 W: 13008 L: 12798 D: 23442
Ptnml(0-2): 309, 5491, 12821, 5687, 316

Passed Sudden Death LTC:
https://tests.stockfishchess.org/tests/view/6845a392375c2b77d98553cf
LLR: 3.01 (-2.94,2.94) <-1.75,0.25>
Total: 551070 W: 141699 L: 142031 D: 267340
Ptnml(0-2): 1923, 60608, 150916, 60054, 2034

Passed Standard STC:
https://tests.stockfishchess.org/tests/view/683c5ebb6ec7634154f9d989
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 142624 W: 36808 L: 36709 D: 69107
Ptnml(0-2): 302, 15448, 39745, 15483, 334

Passed Standard LTC:
https://tests.stockfishchess.org/tests/view/683f1a4f6ec7634154f9dc5a
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 146922 W: 37381 L: 37296 D: 72245
Ptnml(0-2): 69, 13552, 46117, 13671, 52

closes https://github.com/official-stockfish/Stockfish/pull/6132

Bench: 2249459
2025-07-02 18:41:46 +02:00
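The fishtest result lines quoted in these commit messages can be cross-checked with a little arithmetic. The sketch below is not fishtest's own code. It assumes the usual logistic Elo formula and that `Ptnml(0-2)` counts game pairs by pair score (0, 0.5, 1, 1.5, 2); the function names are illustrative.

```python
import math

def elo_from_wdl(wins, losses, draws):
    """Logistic Elo difference implied by a win/loss/draw record."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    return 400.0 * math.log10(score / (1.0 - score))

def games_from_ptnml(ptnml):
    """Each pentanomial bucket counts a pair of games, so games = 2 * pairs."""
    return 2 * sum(ptnml)

# Figures from the Sudden Death STC run of ce73441f20.
wins, losses, draws = 13008, 12798, 23442
ptnml = [309, 5491, 12821, 5687, 316]

# The pair counts should add up to the reported total number of games.
assert games_from_ptnml(ptnml) == wins + losses + draws
print(f"{elo_from_wdl(wins, losses, draws):+.2f} Elo")
```

A small positive value here is what a passed simplification test typically looks like: the patch is accepted because it is not measurably worse, not because it gains much.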
FauziAkram
e695b9537e Remove eval & beta diff from NM reduction
Passed STC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 43456 W: 11178 L: 10966 D: 21312
Ptnml(0-2): 114, 5078, 11114, 5326, 96
https://tests.stockfishchess.org/tests/view/6849ae13e84567164b5c9de9

Passed LTC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 63090 W: 16302 L: 16125 D: 30663
Ptnml(0-2): 37, 6837, 17603, 7048, 20
https://tests.stockfishchess.org/tests/view/684ab516e84567164b5ca02f

closes https://github.com/official-stockfish/Stockfish/pull/6134

Bench: 2249459
2025-07-02 18:41:46 +02:00
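The `(-2.94,2.94)` interval printed next to every LLR value is consistent with Wald's SPRT stopping thresholds at 5% error rates. The log itself does not state the error rates, so `alpha` and `beta` below are assumptions, and the function name is illustrative.

```python
import math

def sprt_bounds(alpha=0.05, beta=0.05):
    """Wald SPRT thresholds for the log-likelihood ratio.

    alpha: assumed type I error rate, beta: assumed type II error rate.
    """
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper

lower, upper = sprt_bounds()
print(f"({lower:.2f},{upper:.2f})")  # prints (-2.94,2.94)
```

Under this reading, a test stops and passes once the LLR reaches the upper bound, which is why the reported LLR values in this log sit just at or slightly beyond 2.94.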
mstembera
ce7254b5ea Optimize find_nnz() using AVX512
About a 1% speedup for ARCH x86-64-avx512 and x86-64-vnni512.

Note: This could be optimized further if we wanted to add an ARCH
supporting VBMI2, which is even more modern than VNNI.

https://en.wikichip.org/wiki/x86/avx512_vbmi2

closes https://github.com/official-stockfish/Stockfish/pull/6139

No functional change
2025-07-02 18:41:45 +02:00
MinetaS
ea85a54fef Fix trivial errors in Makefile
1. Remove "default" rule as "default" has no special meaning as a rule
name. Make runs the very first rule whose name doesn't begin with a dot
(which is "help" currently).

2. Make "format" rule not update dependencies.

closes https://github.com/official-stockfish/Stockfish/pull/6140

No functional change
2025-07-02 18:32:12 +02:00
Robert Nurnberg @ elitebook
84e2f3851d Introduce a constant for ValueList size in search()
Having the size of these lists in two separate places likely contributed
to the crashes seen during the recent tuning attempt
https://tests.stockfishchess.org/tests/view/685c413343ce022d15794536.

Thanks to @MinetaS for spotting this.

closes https://github.com/official-stockfish/Stockfish/pull/6142

No functional change
2025-07-02 18:32:02 +02:00
Daniel Monroe
3a0fff96cf Simplify quiet move streak logic
Passed non-regression STC
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 148960 W: 38409 L: 38312 D: 72239
Ptnml(0-2): 372, 17664, 38318, 17747, 379
https://tests.stockfishchess.org/tests/view/684c5773703522d4f129c5f7

Passed non-regression LTC
LLR: 2.96 (-2.94,2.94) <-1.75,0.25>
Total: 180720 W: 46188 L: 46130 D: 88402
Ptnml(0-2): 84, 19608, 50929, 19644, 95
https://tests.stockfishchess.org/tests/view/68505fa5703522d4f129cbab

closes https://github.com/official-stockfish/Stockfish/pull/6143

Bench: 2055894
2025-07-02 18:31:04 +02:00
Shawn Xu
318c948c4d Remove non-functional low-ply history fill
lowPlyHistory is always cleared at the start of `iterative_deepening`, so clearing it here is non-functional.

closes https://github.com/official-stockfish/Stockfish/pull/6144

No functional change
2025-07-02 18:31:04 +02:00
Daniel Monroe
a7a56c41f6 Simplify history term in futility pruning
Passed simplification STC
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 298816 W: 76814 L: 76881 D: 145121
Ptnml(0-2): 726, 35477, 77057, 35434, 714
https://tests.stockfishchess.org/tests/view/683f440f6ec7634154f9dc7f

Passed simplification LTC
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 237774 W: 60801 L: 60802 D: 116171
Ptnml(0-2): 91, 26088, 66532, 26083, 93
https://tests.stockfishchess.org/tests/view/68441189ffbc71bd236778de

closes https://github.com/official-stockfish/Stockfish/pull/6130

Bench: 2411502
2025-07-02 18:31:04 +02:00
pb00067
34b75f1575 Restore integrity of MovePicker::can_move_king_or_pawn
PR6005 was broken by PR6071

passed STC non regression
https://tests.stockfishchess.org/tests/view/6839791f6ec7634154f9d312
LLR: 2.96 (-2.94,2.94) <-1.75,0.25>
Total: 31776 W: 8353 L: 8130 D: 15293
Ptnml(0-2): 74, 3566, 8382, 3795, 71

passed LTC non-regression
https://tests.stockfishchess.org/tests/view/6839c87a6ec7634154f9d367
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 120756 W: 31015 L: 30899 D: 58842
Ptnml(0-2): 50, 12732, 34703, 12838, 55

closes https://github.com/official-stockfish/Stockfish/pull/6119

Bench: 1945300
2025-07-02 18:31:02 +02:00
disservin
15555e8f4a Disable linux gcc riscv64 (#6145)
Temporarily disable it, until we figure out the toolchain issues which are causing the crashes.

closes https://github.com/official-stockfish/Stockfish/pull/6145

No functional change
2025-06-29 12:33:20 +02:00
Shawn Xu
5337edfdb6 remove non-functional else
since we break out of the loop in the other branch

closes https://github.com/official-stockfish/Stockfish/pull/6116

no functional change
2025-06-02 22:12:37 +02:00
Daniel Monroe
9ac756695e reduce depth by 5 in probcut
Passed simplification STC
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 63328 W: 16402 L: 16213 D: 30713
Ptnml(0-2): 174, 7378, 16340, 7629, 143
https://tests.stockfishchess.org/tests/view/6833530e6ec7634154f9be7f

Passed simplification LTC
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 69936 W: 17795 L: 17625 D: 34516
Ptnml(0-2): 29, 7631, 19474, 7809, 25
https://tests.stockfishchess.org/tests/view/68335e386ec7634154f9c266

closes https://github.com/official-stockfish/Stockfish/pull/6120

Bench: 2307268
2025-06-02 22:09:11 +02:00
Daniel Monroe
254b6d5e85 Simplify corrections in extension margins
Passed simplification STC
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 96192 W: 25002 L: 24852 D: 46338
Ptnml(0-2): 242, 10868, 25716, 11038, 232
https://tests.stockfishchess.org/tests/view/683b44cb6ec7634154f9d6ac

Passed simplification LTC
LLR: 2.96 (-2.94,2.94) <-1.75,0.25>
Total: 83334 W: 21473 L: 21317 D: 40544
Ptnml(0-2): 37, 8877, 23674, 9051, 28
https://tests.stockfishchess.org/tests/view/683b79786ec7634154f9d75a

closes https://github.com/official-stockfish/Stockfish/pull/6117

Bench: 2294814
2025-06-02 22:05:11 +02:00
Shawn Xu
259bdaaa9f Remove an unnecessary bound check
When failing high, it is always true that `alpha < beta` and `beta <=
bestValue`. Therefore, if alpha and bestValue are not in the decisive range, it is
guaranteed that beta is not either.

closes https://github.com/official-stockfish/Stockfish/pull/6115

no functional change
2025-06-02 21:57:19 +02:00
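The argument in 259bdaaa9f is a sandwich: beta lies between alpha and bestValue, so it cannot be decisive if neither of them is. The sketch below checks this exhaustively on a small range. It assumes "decisive" means an absolute value at or above some threshold; `DECISIVE` is a hypothetical stand-in, not the engine's constant.

```python
DECISIVE = 20  # hypothetical threshold; the engine's real constant is not given in this log

def is_decisive(value):
    """Assumed definition: a score is decisive when its magnitude reaches the threshold."""
    return abs(value) >= DECISIVE

def beta_check_is_redundant(limit=25):
    """Exhaustively confirm that alpha < beta <= best, with alpha and best
    both non-decisive, implies beta is non-decisive."""
    for alpha in range(-limit, limit + 1):
        for best in range(alpha + 1, limit + 1):
            if is_decisive(alpha) or is_decisive(best):
                continue
            for beta in range(alpha + 1, best + 1):
                if is_decisive(beta):
                    return False  # counterexample found
    return True

print(beta_check_is_redundant())
```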
Shawn Xu
c9af7674bc Introduce Secondary TT Aging
When a high-depth TT entry fails to produce a cutoff, decrease the stored depth
by 1. This is intended to help cases such as #5023
(https://github.com/official-stockfish/Stockfish/issues/5023#issuecomment-2814209391),
where entries with extremely high depths prevent TT cutoffs, contributing to
search explosions.

Passed STC:
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 128800 W: 33502 L: 33053 D: 62245
Ptnml(0-2): 331, 15084, 33149, 15477, 359
https://tests.stockfishchess.org/tests/view/683958e56ec7634154f9d2a9

Passed LTC:
LLR: 2.97 (-2.94,2.94) <0.50,2.50>
Total: 63288 W: 16376 L: 16005 D: 30907
Ptnml(0-2): 26, 6712, 17798, 7081, 27
https://tests.stockfishchess.org/tests/view/683aa4026ec7634154f9d469

closes https://github.com/official-stockfish/Stockfish/pull/6113

Bench: 2144705
2025-06-02 21:55:55 +02:00
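The idea behind c9af7674bc can be sketched as follows. This is an illustration only: `TTEntry`, `probe_with_aging`, and the simplified cutoff condition are hypothetical, and "high-depth" is assumed here to mean that the stored depth is at least the current search depth.

```python
from dataclasses import dataclass

@dataclass
class TTEntry:
    """Hypothetical, simplified entry; not the engine's actual layout."""
    depth: int
    value: int
    is_lower_bound: bool

def probe_with_aging(entry, depth, beta):
    """Return True on a TT cutoff; otherwise age a deep-enough entry by one ply."""
    deep_enough = entry.depth >= depth
    fails_high = entry.is_lower_bound and entry.value >= beta
    if deep_enough and fails_high:
        return True
    if deep_enough:
        # Secondary aging: the entry was deep enough to cut but did not,
        # so its stored depth is decreased by 1.
        entry.depth -= 1
    return False

entry = TTEntry(depth=40, value=10, is_lower_bound=True)
print(probe_with_aging(entry, depth=8, beta=50), entry.depth)
```

With repeated failed probes, an extremely deep entry gradually loses depth, which matches the stated intent of keeping such entries from blocking TT cutoffs indefinitely.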
Daniel Monroe
3747a19937 Simplify away depth condition in IIR
Passed simplification STC
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 359520 W: 92714 L: 92849 D: 173957
Ptnml(0-2): 977, 42640, 92614, 42599, 930
https://tests.stockfishchess.org/tests/view/6833705d6ec7634154f9c302

Passed simplification LTC
LLR: 2.96 (-2.94,2.94) <-1.75,0.25>
Total: 201756 W: 51544 L: 51507 D: 98705
Ptnml(0-2): 89, 21965, 56728, 22012, 84
https://tests.stockfishchess.org/tests/view/68338e386ec7634154f9c790

Passed simplification VLTC
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 48558 W: 12675 L: 12492 D: 23391
Ptnml(0-2): 9, 4779, 14516, 4970, 5
https://tests.stockfishchess.org/tests/view/6838e0b26ec7634154f9d25b

closes https://github.com/official-stockfish/Stockfish/pull/6112

Bench: 2302583
2025-06-02 21:52:38 +02:00
Daniel Monroe
70ff5e3163 Simplify away cutoff term in prior countermove bonus
Passed simplification STC
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 61120 W: 16010 L: 15819 D: 29291
Ptnml(0-2): 150, 7105, 15869, 7276, 160
https://tests.stockfishchess.org/tests/view/683560226ec7634154f9ce0f

Passed simplification LTC
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 225090 W: 57555 L: 57543 D: 109992
Ptnml(0-2): 104, 24367, 63603, 24355, 116
https://tests.stockfishchess.org/tests/view/6836420c6ec7634154f9cf5c

closes https://github.com/official-stockfish/Stockfish/pull/6111

Bench: 2472910
2025-06-02 21:49:43 +02:00
Daniel Monroe
ddefd6eb6b Simplify away check term in statscore
Passed simplification STC
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 61696 W: 16031 L: 15841 D: 29824
Ptnml(0-2): 151, 7160, 16046, 7330, 161
https://tests.stockfishchess.org/tests/view/68353fcc6ec7634154f9cdd5

Passed simplification LTC
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 237990 W: 60994 L: 60995 D: 116001
Ptnml(0-2): 95, 25964, 66903, 25913, 120
https://tests.stockfishchess.org/tests/view/683642256ec7634154f9cf5e

closes https://github.com/official-stockfish/Stockfish/pull/6110

Bench: 2521003
2025-06-02 21:47:26 +02:00
Nonlinear2
8da3c2155a Simplify NMP eval in qsearch
Passed non-regression STC:
https://tests.stockfishchess.org/tests/view/6834e9436ec7634154f9cd6e
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 24864 W: 6626 L: 6394 D: 11844
Ptnml(0-2): 62, 2806, 6477, 3012, 75

Passed non-regression LTC:
https://tests.stockfishchess.org/tests/view/683598fd6ec7634154f9ce82
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 200148 W: 51461 L: 51424 D: 97263
Ptnml(0-2): 92, 21672, 56503, 21721, 86

closes https://github.com/official-stockfish/Stockfish/pull/6109

Bench: 2316591
2025-06-02 21:44:29 +02:00
Carlos Esparza
5695486db9 Fix outdated comment
closes https://github.com/official-stockfish/Stockfish/pull/6108

No functional change
2025-06-02 21:40:13 +02:00
Robert Nurnberg @ elitebook
9debc540e5 Fix clang-format version in CONTRIBUTING.md
closes https://github.com/official-stockfish/Stockfish/pull/6107

No functional change
2025-06-02 21:38:53 +02:00
Shawn Xu
d0212906bd Simplify stat eval history adjustment further
closes https://github.com/official-stockfish/Stockfish/pull/6106

bench 2074807
2025-06-02 21:37:37 +02:00
mstembera
29b0c07ac8 Simplify Position::pieces()
closes https://github.com/official-stockfish/Stockfish/pull/6104

No functional change
2025-06-02 21:28:18 +02:00
mstembera
d27298d7dc Remove unused threatenedPieces
threatenedPieces is no longer used since #6023

Also can_move_king_or_pawn() can be const.
Also remove a couple of redundant declarations.

closes https://github.com/official-stockfish/Stockfish/pull/6101

No functional change
2025-06-02 21:24:38 +02:00
Shawn Xu
dc85c5a4c9 Remove nnz lookup table load optimization
Passed Non-regression STC:
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 63296 W: 16491 L: 16311 D: 30494
Ptnml(0-2): 129, 6624, 17972, 6784, 139
https://tests.stockfishchess.org/tests/view/6833ce486ec7634154f9cb22

Passed 2nd Non-regression STC:
LLR: 2.97 (-2.94,2.94) <-1.75,0.25>
Total: 369568 W: 95314 L: 95451 D: 178803
Ptnml(0-2): 897, 40231, 102601, 40222, 833
https://tests.stockfishchess.org/tests/view/68355c956ec7634154f9ce07

closes https://github.com/official-stockfish/Stockfish/pull/6100

no functional change
2025-06-02 21:21:54 +02:00
Daniel Monroe
9fd40b9ea8 Simplify tt depth in stat eval history adjustment
Passed simplification STC
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 102208 W: 26498 L: 26349 D: 49361
Ptnml(0-2): 284, 12095, 26166, 12306, 253
https://tests.stockfishchess.org/tests/view/683354c76ec7634154f9be88

Passed simplification LTC
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 133422 W: 34050 L: 33945 D: 65427
Ptnml(0-2): 56, 14473, 37559, 14556, 67
https://tests.stockfishchess.org/tests/view/683363626ec7634154f9c298

closes https://github.com/official-stockfish/Stockfish/pull/6099

Bench: 2652411
2025-06-02 21:19:17 +02:00
Daniel Monroe
dfa176fc7e Small tt verify simplification
Also fix probcut comment

Passed non-regression STC
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 69728 W: 18080 L: 17909 D: 33739
Ptnml(0-2): 161, 7157, 20044, 7354, 148
https://tests.stockfishchess.org/tests/view/68324b116ec7634154f9b478

closes https://github.com/official-stockfish/Stockfish/pull/6094

No functional change
2025-06-02 21:15:56 +02:00
FauziAkram
9b79b75c9b Enforce minimum compiler versions
gcc 9.3
clang 10

Using an unsupported compiler version will generate an error;
older versions might miscompile SF.

CI: improve output on failed bench

closes https://github.com/official-stockfish/Stockfish/pull/6032

No functional change
2025-06-02 21:09:19 +02:00
FauziAkram
73c55e8949 Simplify Double Margin Formula
Passed STC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 267296 W: 69214 L: 69248 D: 128834
Ptnml(0-2): 760, 31511, 69141, 31475, 761
https://tests.stockfishchess.org/tests/view/682f5d9a6ec7634154f9b01e

Passed LTC:
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 67872 W: 17460 L: 17289 D: 33123
Ptnml(0-2): 25, 7238, 19243, 7401, 29
https://tests.stockfishchess.org/tests/view/6833074b6ec7634154f9b5ae

Passed VLTC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 118000 W: 30337 L: 30222 D: 57441
Ptnml(0-2): 15, 11783, 35289, 11898, 15
https://tests.stockfishchess.org/tests/view/683336c56ec7634154f9ba46

closes https://github.com/official-stockfish/Stockfish/pull/6097

Bench: 2312696
2025-05-25 21:31:12 +02:00
ppigazzini
e3adfaf8fc build & ci: update to NDK r27c API level 29
Update to the latest LTS version, NDK r27c (27.2.12479018);
previous NDK versions are unsupported by Google, see:
https://developer.android.com/ndk/downloads

A build with NDK r27c and API level < 29 returns this error:
"executable's TLS segment is underaligned: alignment is 8 (skew 0), needs to be at least 64 for ARM64 Bionic"

Update the API level to 29 to use the native ELF TLS and avoid the error:
https://android.googlesource.com/platform/bionic/+/HEAD/docs/elf-tls.md
https://android.googlesource.com/platform/bionic/+/HEAD/android-changes-for-ndk-developers.md#elf-tls-available-for-api-level-29

A dynamic link build of Stockfish uses these libraries:

ldd stockfish-android-armv8-dynamic-api35
        libm.so => /system/lib64/libm.so
        libdl.so => /system/lib64/libdl.so
        libc.so => /system/lib64/libc.so
        ld-android.so => /system/lib64/ld-android.so

ld-android.so : the dynamic linker used by Android (on Linux it is named ld-linux.so),
                responsible for loading and linking shared libraries into an executable at runtime.
libdl.so      : interface/library layer that provides functions for dynamic loading,
                relies on the underlying functionality provided by the dynamic linker
libm.so       : math library for Android
libc.so       : standard C library for Android

References:
Doc for native (C/C++) API
https://developer.android.com/ndk/guides/stable_apis

C libraries (libc, libm, libdl):
https://developer.android.com/ndk/guides/stable_apis#c_library

Bionic changes with API levels:
https://android.googlesource.com/platform/bionic/+/HEAD/docs/status.md

NDK r27c build system:
https://android.googlesource.com/platform/ndk/+/ndk-r27-release/docs/BuildSystemMaintainers.md

CI: Update to NDK r27c (27.2.12479018), the default version in GitHub runner,
to switch to a recent clang 18.

A PGO build requires static linking, because the NDK doesn't ship
the Android loaders (linker/linker64), see:
https://groups.google.com/g/android-ndk/c/3Ep6zD3xxSY

The API level should not be an issue when distributing a static build,
so use API 29, the oldest one not affected by the TLS alignment issue.

closes https://github.com/official-stockfish/Stockfish/pull/6081

No functional change
2025-05-25 21:28:53 +02:00
Кирилл Зарипов
bebffc5622 Adjust futility pruning thresholds using history
Passed STC:
https://tests.stockfishchess.org/tests/view/6833095a6ec7634154f9b5b3
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 56896 W: 14946 L: 14604 D: 27346
Ptnml(0-2): 117, 6674, 14561, 6942, 154

Passed LTC:
https://tests.stockfishchess.org/tests/view/6833179d6ec7634154f9b5da
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 200742 W: 51660 L: 51012 D: 98070
Ptnml(0-2): 96, 21520, 56473, 22204, 78

Passed Non-regression SMP STC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 29080 W: 7591 L: 7373 D: 14116
Ptnml(0-2): 38, 3178, 7881, 3414, 29
https://tests.stockfishchess.org/tests/view/6833689d6ec7634154f9c2ba

closes https://github.com/official-stockfish/Stockfish/pull/6092

Bench: 2305697
2025-05-25 21:24:09 +02:00
Shawn Xu
00b1540e01 Always Decrease Reduction on TTMove
Passed VVLTC w/ LTC Bounds:
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 57792 W: 15005 L: 14676 D: 28111
Ptnml(0-2): 2, 5241, 18082, 5568, 3
https://tests.stockfishchess.org/tests/view/682a0e3c6ec7634154f9a07e

Passed VVLTC w/ STC Bounds:
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 372298 W: 96342 L: 95655 D: 180301
Ptnml(0-2): 37, 34598, 116181, 35307, 26
https://tests.stockfishchess.org/tests/view/682a45b16ec7634154f9a3b3

STC Elo Estimate:
Elo: 0.15 ± 1.4 (95%) LOS: 58.3%
Total: 59612 W: 15414 L: 15388 D: 28810
Ptnml(0-2): 166, 6959, 15527, 6991, 163
nElo: 0.30 ± 2.8 (95%) PairsRatio: 1.00
https://tests.stockfishchess.org/tests/view/68335d276ec7634154f9c25c

closes https://github.com/official-stockfish/Stockfish/pull/6095

bench 2634355
2025-05-25 21:19:59 +02:00
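The `PairsRatio` figure in the Elo estimate above can be reproduced from the `Ptnml(0-2)` line. The sketch assumes the ratio is game pairs won (pair score 1.5 or 2) over game pairs lost (pair score 0 or 0.5), which agrees with the numbers reported in this log; the function name is illustrative.

```python
def pairs_ratio(ptnml):
    """Ratio of game pairs won to game pairs lost.

    ptnml: five counts of game pairs scoring 0, 0.5, 1, 1.5 and 2 points.
    """
    lost = ptnml[0] + ptnml[1]
    won = ptnml[3] + ptnml[4]
    return won / lost

# Ptnml from the STC Elo estimate of 00b1540e01.
print(round(pairs_ratio([166, 6959, 15527, 6991, 163]), 2))  # 1.0, matching "PairsRatio: 1.00"
```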
Daniel Samek
805a2c1672 Simplify FutilityMoveCount
Inlined condition, instead of a function.

closes https://github.com/official-stockfish/Stockfish/pull/6096

no functional change
2025-05-25 21:12:27 +02:00
Кирилл Зарипов
eb27d9420f Make ProbCut search shallower in cutNode
Passed STC:
https://tests.stockfishchess.org/tests/view/6832d2436ec7634154f9b4fc
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 455072 W: 118162 L: 117237 D: 219673
Ptnml(0-2): 1233, 53409, 117362, 54264, 1268

Passed LTC:
https://tests.stockfishchess.org/tests/view/6833323e6ec7634154f9ba17
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 128436 W: 32916 L: 32415 D: 63105
Ptnml(0-2): 50, 13737, 36137, 14250, 44

closes https://github.com/official-stockfish/Stockfish/pull/6093

Bench: 2232447
2025-05-25 21:05:53 +02:00
FauziAkram
fe7b9b14d2 Implement smoother reduction in time management
Smooth the time reduction in time management by replacing a conditional
assignment with a continuous, sigmoid-like function, giving a more gradual
adjustment.

Passed STC:
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 64448 W: 16838 L: 16492 D: 31118
Ptnml(0-2): 145, 7214, 17207, 7466, 192
https://tests.stockfishchess.org/tests/view/6829dc046ec7634154f99fba

Passed LTC:
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 407340 W: 104458 L: 103408 D: 199474
Ptnml(0-2): 196, 42281, 117664, 43335, 194
https://tests.stockfishchess.org/tests/view/6829fe1b6ec7634154f9a036

closes https://github.com/official-stockfish/Stockfish/pull/6091

No functional change
2025-05-25 21:03:16 +02:00
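The difference between a conditional assignment and a sigmoid-like replacement, as described in fe7b9b14d2, can be illustrated with a toy example. Every constant and name below (`threshold`, `center`, `width`, `low`, `high`, and the input `x`) is hypothetical; this is not the formula from the commit.

```python
import math

def stepped_reduction(x, threshold=8, low=0.7, high=1.5):
    """Conditional assignment: jumps from `high` to `low` at the threshold."""
    return low if x >= threshold else high

def smooth_reduction(x, center=8.0, width=2.0, low=0.7, high=1.5):
    """Sigmoid blend between the same two values, centred on `center`."""
    t = 1.0 / (1.0 + math.exp(-(x - center) / width))
    return high + (low - high) * t

for x in (4, 7, 8, 9, 12):
    print(x, stepped_reduction(x), round(smooth_reduction(x), 3))
```

The stepped version changes abruptly when `x` crosses the threshold, while the smooth version moves gradually between the same two limits, so small changes in the input no longer flip the output between extremes.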
Shawn Xu
e6ec4705a8 Remove deprecated arch from codeql
closes https://github.com/official-stockfish/Stockfish/pull/6090

no functional change
2025-05-25 21:00:38 +02:00
Nonlinear2
b1b5893a8e Minor code improvements
- Remove / add empty lines
- fix the `ttcapture` comment
- remove the `bonus` variable for `ttMoveHistory`
- remove unnecessary parentheses / brackets
- refactor the movepick good quiet stage
- rename `endMoves` to `endCur`, as the previous name suggests that it points to the end of all generated moves, which it does not.

closes https://github.com/official-stockfish/Stockfish/pull/6089

No functional change.

Co-Authored-By: xu-shawn <50402888+xu-shawn@users.noreply.github.com>
2025-05-25 20:59:27 +02:00
pb00067
f58d923fe0 Simplify & improve stalemate detection
The change is functional because we now also check for stalemate on captures and when not giving check.

Green STC test on stalemate-book
https://tests.stockfishchess.org/tests/view/682d878f6ec7634154f9ad2f
Elo: 2.29 ± 1.3 (95%) LOS: 100.0%
Total: 10000 W: 4637 L: 4571 D: 792
Ptnml(0-2): 2, 132, 4664, 202, 0
nElo: 12.42 ± 6.8 (95%) PairsRatio: 1.51

Green LTC test on stalemate-book
https://tests.stockfishchess.org/tests/view/682daa2d6ec7634154f9ad67
Elo: 0.80 ± 0.8 (95%) LOS: 96.9%
Total: 10000 W: 4727 L: 4704 D: 569
Ptnml(0-2): 0, 64, 4849, 87, 0
nElo: 6.51 ± 6.8 (95%) PairsRatio: 1.36

Passed non-regression test @ LTC
https://tests.stockfishchess.org/tests/view/682dd10d6ec7634154f9adb3
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 148512 W: 38135 L: 38046 D: 72331
Ptnml(0-2): 55, 15759, 42558, 15810, 74

N.B.: My only concern is that, due to future changes, a capture with negative SEE
might be returned in the GOOD_CAPTURE stage. In that case the assert in
can_move_king_or_pawn() will trigger, since we must guarantee that all moves
(including quiets) are generated in the movepicker when calling can_move_king_or_pawn().

closes https://github.com/official-stockfish/Stockfish/pull/6088

bench: 2178135
2025-05-25 20:55:49 +02:00
Shawn Xu
472cc764be Move SIMD code to a separate header
Passed Non-regression STC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 115328 W: 29903 L: 29777 D: 55648
Ptnml(0-2): 265, 12293, 32444, 12375, 287
https://tests.stockfishchess.org/tests/view/68300e086ec7634154f9b1d1

closes https://github.com/official-stockfish/Stockfish/pull/6086

no functional change
2025-05-25 20:52:57 +02:00
Disservin
2662d6bf35 Update clang-format to v20
closes https://github.com/official-stockfish/Stockfish/pull/6085

No functional change
2025-05-23 08:54:06 +02:00
Disservin
c13c1d2c30 Silence "may be used uninitialized" GCC 15 warning
closes https://github.com/official-stockfish/Stockfish/pull/6083

No functional change
2025-05-23 08:53:00 +02:00
Daniel Monroe
4f021cab3b Simplify allNode term in prior countermove
Passed simplification STC
LLR: 2.96 (-2.94,2.94) <-1.75,0.25>
Total: 53632 W: 14008 L: 13805 D: 25819
Ptnml(0-2): 136, 6253, 13869, 6388, 170
https://tests.stockfishchess.org/tests/view/6828f2b26ec7634154f99b5e

Passed simplification LTC
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 82482 W: 21202 L: 21045 D: 40235
Ptnml(0-2): 37, 8986, 23052, 9115, 51
https://tests.stockfishchess.org/tests/view/6829010a6ec7634154f99db3

closes https://github.com/official-stockfish/Stockfish/pull/6068

Bench: 2302782
2025-05-23 08:51:54 +02:00
ppigazzini
e03898b57c ci: add tests and artifacts for windows-11-arm
Integrate armv8 and armv8-dotprod builds on windows-11-arm in CI, creating the corresponding artifacts.
Correct the Makefile to drop warnings when providing a CXX, and add MINGW ARM64 to get_native_properties.sh.

fixes https://github.com/official-stockfish/Stockfish/issues/5640
closes https://github.com/official-stockfish/Stockfish/pull/6078

No functional change
2025-05-21 07:29:57 +02:00
Nonlinear2
54fb42ddf8 clean up code
**Non functional changes:**
in search.cpp:
- an unnecessary pair of parentheses in the IIR condition has been removed.
- refactored the stalemate trap detection code

in movepick.cpp:
- use the variables `from`, `to`, `piece`, `pieceType` and `capturedPiece` instead of calling the same functions multiple times in `MovePicker::score()`.
- rename `MovePicker::other_piece_types_mobile()`.

**Functional changes:**
- make sure the processed move is always legal in `MovePicker::other_piece_types_mobile()`.

passed non regression STC:
https://tests.stockfishchess.org/tests/view/6829da686ec7634154f99faf
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 95680 W: 24962 L: 24820 D: 45898
Ptnml(0-2): 221, 9622, 28025, 9738, 234

Passed non regression LTC:
https://tests.stockfishchess.org/tests/view/682a102c6ec7634154f9a086
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 117666 W: 30065 L: 29957 D: 57644
Ptnml(0-2): 45, 10173, 38291, 10277, 47

Run of 10k games on the stalemate opening book:
https://tests.stockfishchess.org/tests/view/682b114e6ec7634154f9aa2d
Elo: 0.76 ± 0.9 (95%) LOS: 95.3%
Total: 10000 W: 4637 L: 4615 D: 748
Ptnml(0-2): 0, 75, 4828, 97, 0
nElo: 5.83 ± 6.8 (95%) PairsRatio: 1.29

closes https://github.com/official-stockfish/Stockfish/pull/6080

Bench: 2422771
2025-05-21 07:25:40 +02:00
Shawn Xu
347e328fdb Simplify TT Replacement Strategy
Passed Non-regression STC:
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 50528 W: 13160 L: 12958 D: 24410
Ptnml(0-2): 132, 5681, 13439, 5877, 135
https://tests.stockfishchess.org/tests/view/682a8b296ec7634154f9a785

Passed Non-regression STC w/ High Hash Pressure:
LLR: 2.98 (-2.94,2.94) <-1.75,0.25>
Total: 30048 W: 7849 L: 7621 D: 14578
Ptnml(0-2): 75, 3390, 7884, 3582, 93
https://tests.stockfishchess.org/tests/view/682a9caf6ec7634154f9a7ae

Passed Non-regression LTC w/ High Hash Pressure:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 17610 W: 4584 L: 4362 D: 8664
Ptnml(0-2): 7, 1799, 4974, 2015, 10
https://tests.stockfishchess.org/tests/view/682ab3966ec7634154f9a8c8

closes https://github.com/official-stockfish/Stockfish/pull/6079

Bench: 2422771
2025-05-19 20:49:28 +02:00
Michael Chaly
56ea1fadf1 Tweak low ply history
Increase the low ply history maximum ply by 1 and also allow it to be used for scoring check evasions.

Passed STC:
https://tests.stockfishchess.org/tests/view/682a2db36ec7634154f9a358
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 66464 W: 17440 L: 17081 D: 31943
Ptnml(0-2): 191, 7717, 17056, 8078, 190

Passed LTC:
https://tests.stockfishchess.org/tests/view/682a3d406ec7634154f9a39c
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 384030 W: 98476 L: 97452 D: 188102
Ptnml(0-2): 180, 41564, 107522, 42550, 199

closes https://github.com/official-stockfish/Stockfish/pull/6075

Bench: 2422771
2025-05-19 20:47:34 +02:00
Daniel Samek
ccfa651968 Remove full depth search reduction when cutNode
Passed STC-simplification bounds:
https://tests.stockfishchess.org/tests/view/6829dd6d6ec7634154f99fd3
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 67872 W: 17629 L: 17443 D: 32800
Ptnml(0-2): 167, 7988, 17451, 8152, 178

Passed LTC-simplification bounds:
https://tests.stockfishchess.org/tests/view/6829f2176ec7634154f9a01c
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 94818 W: 24328 L: 24184 D: 46306
Ptnml(0-2): 52, 10246, 26667, 10394, 50

closes https://github.com/official-stockfish/Stockfish/pull/6074

bench: 2245168
2025-05-19 20:44:14 +02:00
Shawn Xu
0f102f3692 Simplify Quiet Early Move Penalty
Passed STC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 185344 W: 47995 L: 47939 D: 89410
Ptnml(0-2): 527, 21898, 47754, 21978, 515
https://tests.stockfishchess.org/tests/view/682a47536ec7634154f9a3bc

Passed LTC:
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 101706 W: 26050 L: 25912 D: 49744
Ptnml(0-2): 53, 11056, 28499, 11190, 55
https://tests.stockfishchess.org/tests/view/682a61736ec7634154f9a50e

closes https://github.com/official-stockfish/Stockfish/pull/6072

Bench: 2012032
2025-05-19 07:45:28 +02:00
mstembera
009632c465 Simplify handling of good/bad quiets
Simplify the handling of good/bad quiets and make it more similar to the way we
handle good/bad captures. The good quiet limit was adjusted from -7998 to
-14000 to keep the ratio of good/bad quiets about the same as master.  This
also fixes a "bug" that previously returned some bad quiets during the
GOOD_QUIET stage when some good quiets weren't sorted at low depths.

STC https://tests.stockfishchess.org/tests/view/6827a68c6ec7634154f9992b
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 75936 W: 19722 L: 19547 D: 36667
Ptnml(0-2): 186, 8937, 19589, 9028, 228

LTC https://tests.stockfishchess.org/tests/view/6828f8096ec7634154f99b82
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 104112 W: 26773 L: 26638 D: 50701
Ptnml(0-2): 51, 11363, 29098, 11488, 56

closes https://github.com/official-stockfish/Stockfish/pull/6071

Bench: 2007023
2025-05-19 07:42:56 +02:00
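The good/bad quiet split described in 009632c465 amounts to partitioning scored moves around a limit. The sketch below is a simplification under stated assumptions: only the limit value -14000 comes from the commit message, while the function name, the strict `>` comparison, and the move representation are hypothetical.

```python
GOOD_QUIET_LIMIT = -14000  # value quoted in the commit message

def split_quiets(scored_moves, limit=GOOD_QUIET_LIMIT):
    """Partition (move, score) pairs into good and bad quiets.

    Assumes "good" means a score strictly above the limit. Good quiets are
    returned best-first; bad quiets keep their original order.
    """
    good = sorted((m for m in scored_moves if m[1] > limit),
                  key=lambda m: m[1], reverse=True)
    bad = [m for m in scored_moves if m[1] <= limit]
    return good, bad

good, bad = split_quiets([("a", 5), ("b", -20000), ("c", 300), ("d", -14000)])
print(good, bad)
```

Because the partition depends only on the score, a bad quiet can no longer slip into the good stage merely because the list was left partly unsorted.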
Shawn Xu
39942db3ff Simplify In-Check Statscore
Passed Non-regression STC:
LLR: 2.98 (-2.94,2.94) <-1.75,0.25>
Total: 129760 W: 33701 L: 33580 D: 62479
Ptnml(0-2): 359, 15248, 33575, 15309, 389
https://tests.stockfishchess.org/tests/view/681a88193629b02d74b17123

Passed Non-regression LTC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 519612 W: 132224 L: 132512 D: 254876
Ptnml(0-2): 246, 56823, 145960, 56527, 250
https://tests.stockfishchess.org/tests/view/681f9ed43629b02d74b177c3

closes https://github.com/official-stockfish/Stockfish/pull/6070

bench: 2046462
2025-05-19 07:39:56 +02:00
Daniel Monroe
6e9b5af0f0 Check evaluation after ttMove before doing a tt cut
Passed STC
LLR: 2.97 (-2.94,2.94) <0.00,2.00>
Total: 239136 W: 62222 L: 61608 D: 115306
Ptnml(0-2): 675, 28046, 61525, 28634, 688
https://tests.stockfishchess.org/tests/view/681053293629b02d74b1668f

Passed LTC
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 448770 W: 115237 L: 114088 D: 219445
Ptnml(0-2): 177, 48128, 126619, 49291, 170
https://tests.stockfishchess.org/tests/view/681902de3629b02d74b16f6d

closes https://github.com/official-stockfish/Stockfish/pull/6069

Bench: 2035432
2025-05-19 07:37:15 +02:00
FauziAkram
4f76768fcf Remove a moveCount condition
Passed STC:
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 70816 W: 18315 L: 18134 D: 34367
Ptnml(0-2): 210, 8213, 18360, 8436, 189
https://tests.stockfishchess.org/tests/view/68248197a527315e07cccb2d

Passed LTC:
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 121770 W: 31248 L: 31130 D: 59392
Ptnml(0-2): 61, 13338, 33995, 13404, 87
https://tests.stockfishchess.org/tests/view/68272ff46ec7634154f998ad

closes https://github.com/official-stockfish/Stockfish/pull/6067

bench: 2319161
2025-05-19 07:31:26 +02:00
Mapika
1b6975ac41 Add quiet move streak tracking to search stack
Passed STC:
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 109344 W: 28473 L: 28053 D: 52818
Ptnml(0-2): 320, 12756, 28085, 13206, 305
https://tests.stockfishchess.org/tests/view/6828c43e6ec7634154f99a10

Passed LTC:
LLR: 2.96 (-2.94,2.94) <0.50,2.50>
Total: 76308 W: 19721 L: 19323 D: 37264
Ptnml(0-2): 39, 8145, 21386, 8547, 37
https://tests.stockfishchess.org/tests/view/6828f65a6ec7634154f99b72

closes https://github.com/official-stockfish/Stockfish/pull/6066

Bench: 2161814
2025-05-19 07:27:26 +02:00
Shawn Xu
6f445631ab Simplify Futility Margin
Passed STC Non-regression:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 159008 W: 41500 L: 41414 D: 76094
Ptnml(0-2): 501, 18821, 40759, 18937, 486
https://tests.stockfishchess.org/tests/view/680ff9e23629b02d74b1663a

Passed LTC Non-regression:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 163572 W: 41617 L: 41543 D: 80412
Ptnml(0-2): 90, 17755, 46024, 17825, 92
https://tests.stockfishchess.org/tests/view/6814dd973629b02d74b16bac

closes https://github.com/official-stockfish/Stockfish/pull/6065

Bench: 2018775
2025-05-19 07:22:20 +02:00
Shawn Xu
e4b0f37493 Shrink Enum Sizes
Passed STC:
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 110848 W: 28974 L: 28564 D: 53310
Ptnml(0-2): 302, 12118, 30132, 12612, 260
https://tests.stockfishchess.org/tests/view/68242770a527315e07ccca38

closes https://github.com/official-stockfish/Stockfish/pull/6063

no functional change
2025-05-19 07:17:39 +02:00
Shawn Xu
6b7e05f0c5 Simplify PCM TTMove Bonus
Passed Non-regression STC:
LLR: 2.97 (-2.94,2.94) <-1.75,0.25>
Total: 114048 W: 29597 L: 29459 D: 54992
Ptnml(0-2): 315, 13619, 29045, 13703, 342
https://tests.stockfishchess.org/tests/view/681e83533629b02d74b17701

Passed Non-regression LTC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 61014 W: 15582 L: 15405 D: 30027
Ptnml(0-2): 25, 6485, 17307, 6668, 22
https://tests.stockfishchess.org/tests/view/68226b523629b02d74b17b89

closes https://github.com/official-stockfish/Stockfish/pull/6061

bench 2016566
2025-05-19 07:13:39 +02:00
FauziAkram
07f6edf934 Refactor Position::pseudo_legal Pawn Move Check
use intermediate variables to make the statement easier to read

closes https://github.com/official-stockfish/Stockfish/pull/6045

No functional change
2025-05-19 07:12:11 +02:00
FauziAkram
c4e2479a75 Introducing a depth component to the penalty.
Passed STC:
LLR: 2.97 (-2.94,2.94) <0.00,2.00>
Total: 31648 W: 8358 L: 8050 D: 15240
Ptnml(0-2): 78, 3596, 8182, 3876, 92
https://tests.stockfishchess.org/tests/view/680fa73d3629b02d74b165a9

Passed LTC:
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 177720 W: 45524 L: 44920 D: 87276
Ptnml(0-2): 91, 19130, 49813, 19736, 90
https://tests.stockfishchess.org/tests/view/68109e2c3629b02d74b166ee

closes https://github.com/official-stockfish/Stockfish/pull/6060

Bench: 2251724
2025-05-13 20:47:50 +02:00
Shawn Xu
b5f11085dd Do more extensions in nodes far from the root
Parameters found by @FauziAkram in the latest tune.

Passed VVLTC w/ LTC Bound:
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 243264 W: 63452 L: 62774 D: 117038
Ptnml(0-2): 18, 22494, 75934, 23164, 22
https://tests.stockfishchess.org/tests/view/680e82b23629b02d74b15e27

Passed VVLTC w/ STC Bound:
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 37838 W: 9935 L: 9667 D: 18236
Ptnml(0-2): 6, 3383, 11873, 3651, 6
https://tests.stockfishchess.org/tests/view/681e707a3629b02d74b176e8

STC Elo Estimate:
Elo: -0.49 ± 2.4 (95%) LOS: 34.8%
Total: 20000 W: 5118 L: 5146 D: 9736
Ptnml(0-2): 55, 2365, 5182, 2349, 49
nElo: -0.96 ± 4.8 (95%) PairsRatio: 0.99
https://tests.stockfishchess.org/tests/view/6822af4b3629b02d74b17be6

closes https://github.com/official-stockfish/Stockfish/pull/6059

Bench: 2135382
2025-05-13 20:46:27 +02:00
Daniel Monroe
1f9af9966f Simplify futility pruning term
Passed simplification STC
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 49920 W: 12933 L: 12727 D: 24260
Ptnml(0-2): 141, 5865, 12752, 6051, 151
https://tests.stockfishchess.org/tests/view/681d01d33629b02d74b1756e

Passed simplification LTC
LLR: 2.96 (-2.94,2.94) <-1.75,0.25>
Total: 68808 W: 17573 L: 17402 D: 33833
Ptnml(0-2): 23, 7343, 19511, 7494, 33
https://tests.stockfishchess.org/tests/view/681e33843629b02d74b176b1

closes https://github.com/official-stockfish/Stockfish/pull/6058

Bench: 2041086
2025-05-13 20:42:36 +02:00
FauziAkram
40ef7b1212 Simplify probcut
Passed STC:
LLR: 2.97 (-2.94,2.94) <-1.75,0.25>
Total: 80800 W: 20947 L: 20774 D: 39079
Ptnml(0-2): 217, 9446, 20906, 9609, 222
https://tests.stockfishchess.org/tests/view/680e83163629b02d74b15e2a

Passed LTC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 359004 W: 91362 L: 91486 D: 176156
Ptnml(0-2): 177, 39133, 101007, 39007, 178
https://tests.stockfishchess.org/tests/view/680e95db3629b02d74b15e7a

closes https://github.com/official-stockfish/Stockfish/pull/6054

Bench: 2060860
2025-05-13 20:41:01 +02:00
gab8192
05e39527a8 Simplify low-ply history weight
Passed Non-regression STC:
LLR: 2.97 (-2.94,2.94) <-1.75,0.25>
Total: 116064 W: 30095 L: 29959 D: 56010
Ptnml(0-2): 333, 13753, 29731, 13875, 340
https://tests.stockfishchess.org/tests/view/681229f53629b02d74b168d3

Passed Non-regression LTC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 172824 W: 44096 L: 44031 D: 84697
Ptnml(0-2): 91, 18868, 48410, 18971, 72
https://tests.stockfishchess.org/tests/view/681474243629b02d74b16b44

closes https://github.com/official-stockfish/Stockfish/pull/6053

Bench: 2064179
2025-05-13 20:40:26 +02:00
FauziAkram
d4b405a5a6 Remove a cutNode condition
Allow some nodes to spawn even deeper lmr searches, also for cutNodes

Passed STC:
LLR: 2.96 (-2.94,2.94) <-1.75,0.25>
Total: 59072 W: 15387 L: 15190 D: 28495
Ptnml(0-2): 145, 6956, 15166, 7095, 174
https://tests.stockfishchess.org/tests/view/6810b4c13629b02d74b16714

Passed LTC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 93294 W: 23737 L: 23591 D: 45966
Ptnml(0-2): 47, 10116, 26169, 10274, 41
https://tests.stockfishchess.org/tests/view/681170613629b02d74b167f3

closes https://github.com/official-stockfish/Stockfish/pull/6052

Bench: 1937565
2025-05-13 20:39:23 +02:00
mstembera
63a2ab1510 Simplify Captures scoring
STC https://tests.stockfishchess.org/tests/view/680ee3b43629b02d74b160f3
LLR: 2.96 (-2.94,2.94) <-1.75,0.25>
Total: 200896 W: 52089 L: 52048 D: 96759
Ptnml(0-2): 592, 23821, 51589, 23846, 600

LTC https://tests.stockfishchess.org/tests/view/68106adf3629b02d74b166b1
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 128856 W: 33026 L: 32917 D: 62913
Ptnml(0-2): 54, 13916, 36384, 14015, 59

Replaces discovered checks by direct checks which are simpler and more
common.  It's also what's used for scoring quiets.

closes https://github.com/official-stockfish/Stockfish/pull/6047

Bench: 1886966
2025-05-13 20:39:02 +02:00
Shawn Xu
e9925b122f Simplify ttCapture LMR
Passed Non-regression STC:
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 51104 W: 13389 L: 13184 D: 24531
Ptnml(0-2): 182, 5940, 13068, 6215, 147
https://tests.stockfishchess.org/tests/view/680ef2503629b02d74b16498

Passed Non-regression LTC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 73350 W: 18804 L: 18638 D: 35908
Ptnml(0-2): 30, 7906, 20639, 8068, 32
https://tests.stockfishchess.org/tests/view/6810510e3629b02d74b1668a

closes https://github.com/official-stockfish/Stockfish/pull/6044

Bench: 1805151
2025-05-13 20:36:42 +02:00
mstembera
7afd9e859d Simplify MovePicker::score() by using piece types for clarity
STC https://tests.stockfishchess.org/tests/view/680ed6cc3629b02d74b160bf
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 200128 W: 51838 L: 51802 D: 96488
Ptnml(0-2): 457, 19956, 59247, 19902, 502

closes https://github.com/official-stockfish/Stockfish/pull/6039

No functional change
2025-05-13 20:34:40 +02:00
Michael Chaly
b73c8982df Another scaling revert
Revert recent passer because it seems to not scale for longer time
controls. Adjust comments accordingly.

Passed VVLTC SPRT with STC bounds:
https://tests.stockfishchess.org/tests/view/680ddff73629b02d74b15b3f
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 198316 W: 51317 L: 50846 D: 96153
Ptnml(0-2): 17, 18459, 61737, 18926, 19

Passed VVLTC SPRT with LTC bounds:
https://tests.stockfishchess.org/tests/view/680d5b7e3629b02d74b15a21
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 123274 W: 31738 L: 31283 D: 60253
Ptnml(0-2): 7, 11444, 38282, 11895, 9

closes https://github.com/official-stockfish/Stockfish/pull/6038

Bench: 2173845
2025-05-13 20:32:54 +02:00
Shawn Xu
81cc004060 Remove risk tolerance
Passed STC:
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 379328 W: 97567 L: 97724 D: 184037
Ptnml(0-2): 909, 44861, 98314, 44638, 942
https://tests.stockfishchess.org/tests/view/680defc63629b02d74b15b62

Passed LTC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 160752 W: 40762 L: 40685 D: 79305
Ptnml(0-2): 60, 17548, 45091, 17609, 68
https://tests.stockfishchess.org/tests/view/680e8ff43629b02d74b15e65

closes https://github.com/official-stockfish/Stockfish/pull/6037

Bench: 1897340
2025-05-13 20:31:37 +02:00
Carlos Esparza
63c6f22627 Simplify DirtyPiece
Simplifies the DirtyPiece struct.

passed STC: https://tests.stockfishchess.org/tests/view/68054c9798cd372e3aea05d7
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 36608 W: 9641 L: 9437 D: 17530
Ptnml(0-2): 89, 3630, 10668, 3822, 95

passed LTC: https://tests.stockfishchess.org/tests/view/6805514598cd372e3aea0783
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 127620 W: 31993 L: 31894 D: 63733
Ptnml(0-2): 42, 10983, 41665, 11074, 46

closes https://github.com/official-stockfish/Stockfish/pull/6016

No functional change
2025-05-13 20:30:22 +02:00
FauziAkram
ed6b8d179a Refactor futility_margin
Adds a small term to the futility calculation that depends on eval - beta.

Passed STC:
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 203136 W: 52797 L: 52239 D: 98100
Ptnml(0-2): 549, 23827, 52255, 24391, 546
https://tests.stockfishchess.org/tests/view/680e84a43629b02d74b15e2e

Passed LTC:
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 100356 W: 25950 L: 25507 D: 48899
Ptnml(0-2): 35, 10683, 28302, 11120, 38
https://tests.stockfishchess.org/tests/view/680ebcb03629b02d74b16040

closes https://github.com/official-stockfish/Stockfish/pull/6009
closes https://github.com/official-stockfish/Stockfish/pull/6041

Bench: 1815939

Co-authored-by: Michael Chaly <Vizvezdenec@gmail.com>
Co-authored-by: xu-shawn <50402888+xu-shawn@users.noreply.github.com>
2025-05-13 20:26:28 +02:00
pb00067
0f905b4e88 Preserve all moves in movepicker
Simplifies the method otherPieceTypesMobile quite a bit and makes it more
precise. It is more precise because the capturesSearched list does not
always contain all processed captures: although extremely rare, it can
happen that a 'good' capture gets pruned at step 14 and doesn't make it
into capturesSearched. (Functional change at higher depths.)

passed STC-simplification bounds
https://tests.stockfishchess.org/tests/view/68025974cd501869c6698153
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 273664 W: 15658 L: 15681 D: 242325
Ptnml(0-2): 166, 10368, 115802, 10315, 181

passed LTC-simplification bounds
https://tests.stockfishchess.org/tests/view/6804b86acd501869c6698673
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 96780 W: 24547 L: 24419 D: 47814
Ptnml(0-2): 30, 8466, 31286, 8562, 46

Applied changes requested by disservin & retested to be on the safe side

STC:
https://tests.stockfishchess.org/tests/view/6806110698cd372e3aea5919
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 107392 W: 27739 L: 27606 D: 52047
Ptnml(0-2): 266, 10867, 31306, 10982, 275

LTC:
https://tests.stockfishchess.org/tests/view/6806346198cd372e3aea5965
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 233484 W: 59106 L: 59103 D: 115275
Ptnml(0-2): 116, 22787, 70939, 22778, 122

closes https://github.com/official-stockfish/Stockfish/pull/6005

Bench: 1857323
2025-05-13 20:23:43 +02:00
FauziAkram
94e6c0498f VVLTC Tune
Passed VVLTC with LTC bounds:
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 12800 W: 3432 L: 3184 D: 6184
Ptnml(0-2): 1, 1098, 3954, 1346, 1
https://tests.stockfishchess.org/tests/view/680e255e3629b02d74b15d5e

Passed VVLTC with STC bounds:
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 14402 W: 3865 L: 3625 D: 6912
Ptnml(0-2): 0, 1236, 4490, 1474, 1
https://tests.stockfishchess.org/tests/view/680e4dfb3629b02d74b15da6

Passed VVLTC third test (removing an unrelated change):
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 25584 W: 6670 L: 6398 D: 12516
Ptnml(0-2): 4, 2290, 7932, 2562, 4
https://tests.stockfishchess.org/tests/view/680e74223629b02d74b15def

closes https://github.com/official-stockfish/Stockfish/pull/6036

Bench: 1857323
2025-04-27 20:47:47 +02:00
Daniel Monroe
af3692b2d0 Simplify second probcut to linear function of depth
Passed non-regression STC
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 57472 W: 14962 L: 14765 D: 27745
Ptnml(0-2): 140, 6715, 14817, 6936, 128
https://tests.stockfishchess.org/tests/view/680df1063629b02d74b15b69

Passed non-regression LTC
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 88416 W: 22499 L: 22348 D: 43569
Ptnml(0-2): 31, 9565, 24874, 9698, 40
https://tests.stockfishchess.org/tests/view/680df3a93629b02d74b15b7d

closes https://github.com/official-stockfish/Stockfish/pull/6035

Bench: 1792000
2025-04-27 20:31:40 +02:00
Michael Chaly
3e26d3acc7 Do more pruning in moves loop
Effectively reverts one commit from some months ago.

Passed VVLTC SPRT with STC bounds
https://tests.stockfishchess.org/tests/view/680d39373629b02d74b156d7
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 405058 W: 104843 L: 104111 D: 196104
Ptnml(0-2): 35, 38029, 125672, 38755, 38

Passed VVLTC SPRT with LTC bounds
https://tests.stockfishchess.org/tests/view/680d1a3b3629b02d74b1563d
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 57032 W: 14917 L: 14588 D: 27527
Ptnml(0-2): 6, 5202, 17768, 5537, 3

closes https://github.com/official-stockfish/Stockfish/pull/6031

Bench: 1643819
2025-04-27 20:28:22 +02:00
FauziAkram
267fd8a3d5 Tweak History Bonus
Inspired by @Ergodice, who first came up with the idea.

Passed STC:
LLR: 2.96 (-2.94,2.94) <0.00,2.00>
Total: 18400 W: 4867 L: 4576 D: 8957
Ptnml(0-2): 52, 2052, 4714, 2317, 65
https://tests.stockfishchess.org/tests/view/68062a3c98cd372e3aea5959

Passed LTC:
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 454338 W: 116461 L: 115294 D: 222583
Ptnml(0-2): 198, 49139, 127346, 50270, 216
https://tests.stockfishchess.org/tests/view/6806347c98cd372e3aea5967

Passed VLTC non-reg:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 385970 W: 98401 L: 98546 D: 189023
Ptnml(0-2): 51, 38958, 115105, 38827, 44
https://tests.stockfishchess.org/tests/view/680cfe873629b02d74b155cf

closes https://github.com/official-stockfish/Stockfish/pull/6030

Bench: 1715817
2025-04-27 20:14:38 +02:00
FauziAkram
e5aa4b48c6 Simplify Evasion Move Scoring
Passed STC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 160256 W: 41432 L: 41348 D: 77476
Ptnml(0-2): 485, 19034, 41028, 19074, 507
https://tests.stockfishchess.org/tests/view/680d242c3629b02d74b15662

Passed LTC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 103086 W: 26388 L: 26252 D: 50446
Ptnml(0-2): 41, 11174, 28982, 11300, 46
https://tests.stockfishchess.org/tests/view/680d47f83629b02d74b1571e

closes https://github.com/official-stockfish/Stockfish/pull/6029

Bench: 1937261
2025-04-27 19:47:08 +02:00
FauziAkram
f98c178960 Improve quiet moves bonus
Inspired by an old test by Peregrine.

Passed STC:
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 453024 W: 117244 L: 116316 D: 219464
Ptnml(0-2): 1336, 53355, 116258, 54171, 1392
https://tests.stockfishchess.org/tests/view/680ccacc3629b02d74b15532

Passed LTC:
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 140550 W: 35990 L: 35462 D: 69098
Ptnml(0-2): 65, 15152, 39319, 15668, 71
https://tests.stockfishchess.org/tests/view/680d2ed73629b02d74b15691

closes https://github.com/official-stockfish/Stockfish/pull/6028

Bench: 1708152
2025-04-27 19:45:37 +02:00
Myself
37cc2293ef Replace complex probcut function with a precomputed table.
Passed non-regression STC:
https://tests.stockfishchess.org/tests/view/680de2683629b02d74b15b46
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 116000 W: 29864 L: 29742 D: 56394
Ptnml(0-2): 215, 11811, 33854, 11877, 243

closes https://github.com/official-stockfish/Stockfish/pull/6026

No functional change
2025-04-27 19:45:00 +02:00
Shawn Xu
b0a7a34d3f Simplify malus calculation
closes https://github.com/official-stockfish/Stockfish/pull/6024

No functional change
2025-04-27 19:43:47 +02:00
Carlos Esparza
f0de8dc034 Simplify move ordering bonuses for putting piece en prise and escaping capture
Now there is also a penalty for exposing knights and bishops to capture by a pawn.

Passed STC:
https://tests.stockfishchess.org/tests/view/68074379878abf56f9a0d5b1
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 96512 W: 24841 L: 24687 D: 46984
Ptnml(0-2): 294, 11336, 24835, 11504, 287

Passed LTC:
https://tests.stockfishchess.org/tests/view/6808954a878abf56f9a0d76d
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 221328 W: 56271 L: 56255 D: 108802
Ptnml(0-2): 131, 24149, 62071, 24199, 114

closes https://github.com/official-stockfish/Stockfish/pull/6023

Bench: 1778227
2025-04-27 19:42:06 +02:00
Carlos Esparza
fda269a299 Introduce double incremental accumulator updates
When we need to update an accumulator by two moves and the second move
captures the piece moved in the first move, we can skip computing the
middle accumulator and cancel a feature add with a feature remove to
save work.
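
A minimal sketch of the cancellation described above, assuming a toy accumulator that is simply a sum of per-feature weight columns. The names `Accumulator`, `Weights`, `add_feature`, `remove_feature`, `update_two_steps`, and `update_fused`, as well as the sizes, are hypothetical and are not taken from the actual NNUE code:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical toy sizes; a real network is far larger.
constexpr std::size_t kFeatures = 8;
constexpr std::size_t kWidth    = 4;

using Accumulator = std::array<std::int32_t, kWidth>;
using Weights     = std::array<std::array<std::int32_t, kWidth>, kFeatures>;

inline void add_feature(Accumulator& acc, const Weights& w, std::size_t f) {
    for (std::size_t i = 0; i < kWidth; ++i)
        acc[i] += w[f][i];
}

inline void remove_feature(Accumulator& acc, const Weights& w, std::size_t f) {
    for (std::size_t i = 0; i < kWidth; ++i)
        acc[i] -= w[f][i];
}

// Naive path: apply move 1, then move 2 (which captures the piece just moved).
inline Accumulator update_two_steps(Accumulator acc, const Weights& w,
                                    std::size_t movedFrom, std::size_t movedTo,
                                    std::size_t capturerFrom, std::size_t capturerTo) {
    remove_feature(acc, w, movedFrom);
    add_feature(acc, w, movedTo);        // acc is now the middle accumulator
    remove_feature(acc, w, movedTo);     // the moved piece is captured
    remove_feature(acc, w, capturerFrom);
    add_feature(acc, w, capturerTo);
    return acc;
}

// Fused path: the add and the remove of the movedTo feature cancel, so
// neither is applied and the middle accumulator is never built.
inline Accumulator update_fused(Accumulator acc, const Weights& w,
                                std::size_t movedFrom,
                                std::size_t capturerFrom, std::size_t capturerTo) {
    remove_feature(acc, w, movedFrom);
    remove_feature(acc, w, capturerFrom);
    add_feature(acc, w, capturerTo);
    return acc;
}
```

Both paths yield the same final accumulator; the fused one touches three weight columns instead of five.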

Passed STC
https://tests.stockfishchess.org/tests/view/67f70b1c31d7cf8afdc45f51
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 72800 W: 18878 L: 18529 D: 35393
Ptnml(0-2): 160, 7711, 20374, 7930, 225

closes https://github.com/official-stockfish/Stockfish/pull/5988

No functional change
2025-04-27 19:27:18 +02:00
Shawn Xu
4e49f8dff9 Clean up search
* Correct IIR scaling comments
* Replace `(PvNode || cutNode)` with `!allNode`
* Consistent formatting for scaler tags
* Add comments to some recently-introduced LMR terms
* Add comments on PCM bonus tweaks
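
The `(PvNode || cutNode)` to `!allNode` replacement can be illustrated with a small sketch, assuming that `allNode` is defined as "neither PV node nor cut node". The `NodeType` enum and the helper names below are hypothetical stand-ins, not the engine's own types:

```cpp
// Hypothetical stand-in for the three kinds of search node.
enum class NodeType { PV, Cut, All };

struct NodeFlags {
    bool PvNode;
    bool cutNode;
    bool allNode;
};

// Assumption: allNode is derived as "not PV and not cut".
inline NodeFlags flags_for(NodeType t) {
    const bool pv  = (t == NodeType::PV);
    const bool cut = (t == NodeType::Cut);
    return NodeFlags{pv, cut, !(pv || cut)};
}

// The condition before and after the cleanup.
inline bool old_condition(const NodeFlags& f) { return f.PvNode || f.cutNode; }
inline bool new_condition(const NodeFlags& f) { return !f.allNode; }
```

Under that assumption the two conditions agree for every node type, so the rewrite is purely cosmetic.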

Passed Non-regression STC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 389472 W: 102457 L: 102622 D: 184393
Ptnml(0-2): 1676, 41887, 107798, 41676, 1699
https://tests.stockfishchess.org/tests/view/67a0ea670774dfd78deb23cd

closes https://github.com/official-stockfish/Stockfish/pull/5854

Bench: 1585741
2025-04-27 19:26:02 +02:00
Michael Chaly
27428a61c2 Allow some nodes to spawn even deeper lmr searches
This includes nodes that were PvNode on a previous ply and only for non cutNodes and movecounts < 8.

Passed STC:
https://tests.stockfishchess.org/tests/view/680cdf2e3629b02d74b15576
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 123552 W: 31979 L: 31539 D: 60034
Ptnml(0-2): 322, 14449, 31803, 14871, 331

Passed LTC:
https://tests.stockfishchess.org/tests/view/680cf09a3629b02d74b15599
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 114306 W: 29310 L: 28837 D: 56159
Ptnml(0-2): 51, 12247, 32090, 12708, 57

closes https://github.com/official-stockfish/Stockfish/pull/6022

bench: 1585741
2025-04-26 22:40:41 +02:00
Michael Chaly
4b58079485 Simplify and cleanup futility pruning for child nodes
This patch removes the (eval - beta) / 8 addition and adjusts constants
accordingly. It also moves every calculation into the futility_margin function.

Passed STC:
https://tests.stockfishchess.org/tests/view/6806d00f878abf56f9a0d524
LLR: 2.99 (-2.94,2.94) <-1.75,0.25>
Total: 483456 W: 124592 L: 124860 D: 234004
Ptnml(0-2): 1419, 57640, 123927, 57274, 1468

Passed LTC:
https://tests.stockfishchess.org/tests/view/680cceb33629b02d74b1554c
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 263868 W: 67076 L: 67104 D: 129688
Ptnml(0-2): 155, 28893, 73846, 28905, 135

closes https://github.com/official-stockfish/Stockfish/pull/6021

bench: 1618439
2025-04-26 22:38:14 +02:00
Myself
7e6a0c464b Check only if good see.
Passed STC:
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 106400 W: 27863 L: 27444 D: 51093
Ptnml(0-2): 320, 12431, 27275, 12858, 316
https://tests.stockfishchess.org/tests/view/6806239498cd372e3aea5949

Passed LTC:
LLR: 2.96 (-2.94,2.94) <0.50,2.50>
Total: 420534 W: 107594 L: 106494 D: 206446
Ptnml(0-2): 197, 45541, 117722, 46579, 228
https://tests.stockfishchess.org/tests/view/6806b4b3878abf56f9a0d4fc

closes https://github.com/official-stockfish/Stockfish/pull/6020

bench: 1675758
2025-04-26 22:33:49 +02:00
FauziAkram
5f32b3ed4b Remove unneeded return statement
closes https://github.com/official-stockfish/Stockfish/pull/6013

No functional change
2025-04-26 22:26:39 +02:00
breatn
f590767b91 Adaptive beta cut
passed STC:
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 197088 W: 51084 L: 50533 D: 95471
Ptnml(0-2): 547, 23201, 50577, 23592, 627
https://tests.stockfishchess.org/tests/view/680604d798cd372e3aea58fe

passed LTC
LLR: 2.96 (-2.94,2.94) <0.50,2.50>
Total: 127950 W: 32719 L: 32214 D: 63017
Ptnml(0-2): 69, 13825, 35673, 14348, 60
https://tests.stockfishchess.org/tests/view/6805eae498cd372e3aea588a

closes https://github.com/official-stockfish/Stockfish/pull/6012

bench 1579003
2025-04-26 22:23:26 +02:00
FauziAkram
8b85290313 Remove manual stack alignment workaround for GCC < 9.3
consequently using these old compilers will now error out.

closes https://github.com/official-stockfish/Stockfish/pull/6008

No functional change.
2025-04-26 22:15:08 +02:00
Nonlinear2
0dcfe096d6 Increase full depth search reduction when cutNode
In addition to the core patch, improve the use of `isTTMove`:

- this name was used to mean both `bestMove == ttData.move` and `move == ttData.move`, so I replaced the argument `isTTMove` of `update_all_stats` with `TTMove` directly.

- `ttData.move == move` was still used in some places instead of `ss->isTTMove`. I replaced these to be more consistent.

Passed STC:
https://tests.stockfishchess.org/tests/view/68057b8f98cd372e3aea3472
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 38400 W: 10048 L: 9734 D: 18618
Ptnml(0-2): 102, 4360, 9956, 4686, 96

Passed LTC:
https://tests.stockfishchess.org/tests/view/68057f7c98cd372e3aea3842
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 312666 W: 79494 L: 78616 D: 154556
Ptnml(0-2): 144, 33809, 87563, 34659, 158

closes https://github.com/official-stockfish/Stockfish/pull/6007

Bench: 1623376
2025-04-26 22:11:40 +02:00
Daniel Monroe
88a524c552 Tweak futility formula
Passed STC
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 248448 W: 64344 L: 63718 D: 120386
Ptnml(0-2): 750, 29172, 63783, 29740, 779
https://tests.stockfishchess.org/tests/view/68056f5598cd372e3aea2901

Passed LTC
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 118824 W: 30358 L: 29874 D: 58592
Ptnml(0-2): 59, 12797, 33228, 13257, 71
https://tests.stockfishchess.org/tests/view/6805675698cd372e3aea20d0

closes https://github.com/official-stockfish/Stockfish/pull/6004

bench 1839796
2025-04-26 22:06:18 +02:00
Stefan Geschwentner
7988de4aa3 Prefer discovered and double checks in capture ordering.
Increase weight for captured piece value if also a discovered or double check.

Passed STC:
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 142144 W: 36632 L: 36164 D: 69348
Ptnml(0-2): 429, 16470, 36788, 16974, 411
https://tests.stockfishchess.org/tests/view/68052d7498cd372e3ae9faaa

Passed LTC:
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 668328 W: 170837 L: 169238 D: 328253
Ptnml(0-2): 308, 72010, 187990, 73487, 369
https://tests.stockfishchess.org/tests/view/68053c9398cd372e3aea043b

closes https://github.com/official-stockfish/Stockfish/pull/6003

Bench: 1636625
2025-04-26 22:01:34 +02:00
FauziAkram
f6b0d53a99 Re-adding the 5th continuation history
The two simplifications (#5992 and #5978) are strongly correlated.

Passed STC:
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 98016 W: 25415 L: 25008 D: 47593
Ptnml(0-2): 291, 11509, 25005, 11908, 295
https://tests.stockfishchess.org/tests/view/6802774fcd501869c6698168

Passed LTC:
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 588588 W: 150073 L: 148633 D: 289882
Ptnml(0-2): 281, 63615, 165078, 65023, 297
https://tests.stockfishchess.org/tests/view/6804cbdacd501869c66987dc

closes https://github.com/official-stockfish/Stockfish/pull/6002

Bench: 1783315
2025-04-26 21:58:46 +02:00
Michael Chaly
449a8b017e Do second step of shallower search
Specifically if lmr depth is not lower than new depth and value is really bad.

Passed STC:
https://tests.stockfishchess.org/tests/view/6805444898cd372e3aea0494
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 77664 W: 20136 L: 19762 D: 37766
Ptnml(0-2): 214, 9006, 20039, 9338, 235

Passed LTC:
https://tests.stockfishchess.org/tests/view/680549b298cd372e3aea05ac
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 115458 W: 29447 L: 28971 D: 57040
Ptnml(0-2): 62, 12357, 32421, 12821, 68

closes https://github.com/official-stockfish/Stockfish/pull/6001

bench: 1515501
2025-04-26 21:52:36 +02:00
Shawn Xu
4176ad7b0a simplify risk tolerance
Passed Non-regression STC:
LLR: 2.98 (-2.94,2.94) <-1.75,0.25>
Total: 73408 W: 19028 L: 18844 D: 35536
Ptnml(0-2): 201, 8709, 18743, 8807, 244
https://tests.stockfishchess.org/tests/view/68051f3698cd372e3ae9f63a

Passed Non-regression LTC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 91236 W: 23193 L: 23045 D: 44998
Ptnml(0-2): 34, 9908, 25599, 10030, 47
https://tests.stockfishchess.org/tests/view/6805239498cd372e3ae9fa41

closes https://github.com/official-stockfish/Stockfish/pull/6000

bench 1864632
2025-04-26 21:49:38 +02:00
Shawn Xu
16cd38dba1 Tweak TT Move Reduction by TT Move History
Passed STC:
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 62496 W: 16197 L: 15844 D: 30455
Ptnml(0-2): 200, 7234, 16011, 7619, 184
https://tests.stockfishchess.org/tests/view/67f7fa9b31d7cf8afdc4609c

Passed LTC:
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 400470 W: 102068 L: 101012 D: 197390
Ptnml(0-2): 201, 43207, 112347, 44295, 185
https://tests.stockfishchess.org/tests/view/67f927b0cd501869c66975e0

Passed VVLTC Non-regression:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 125394 W: 32408 L: 32304 D: 60682
Ptnml(0-2): 3, 11702, 39188, 11796, 8
https://tests.stockfishchess.org/tests/view/6804c215cd501869c66986b9

closes https://github.com/official-stockfish/Stockfish/pull/5998

bench 1760988
2025-04-26 21:43:11 +02:00
Shawn Xu
b915ed702a remove StateInfo::next
unused.

closes https://github.com/official-stockfish/Stockfish/pull/5997

no functional change
2025-04-26 21:37:17 +02:00
Joost VandeVondele
f2507d0562 Add x86-64-avxvnni in CI
will result in the corresponding binaries being available for download

closes https://github.com/official-stockfish/Stockfish/pull/5990

No functional change
2025-04-26 21:34:26 +02:00
Shawn Xu
f273eea71f Remove non-functional accumulator reset
Passed Non-regression STC:
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 219360 W: 56600 L: 56583 D: 106177
Ptnml(0-2): 582, 23419, 61620, 23518, 541
https://tests.stockfishchess.org/tests/view/67fad20dcd501869c669780f

closes https://github.com/official-stockfish/Stockfish/pull/5986

no functional change
2025-04-26 21:30:01 +02:00
Joost VandeVondele
f9459e4c8e Use matching llvm-profdata
when using multiple clang compilers in parallel, it is necessary to use
a matching llvm-profdata, as the profile data format may change between
versions. To use the proper llvm-profdata binary, the one in the same directory
as the compiler is used. This allows for code like:

```bash
echo "Compiling clang"
for comp in clang++-11 clang++-12 clang++-13 clang++-14  clang++-15  clang++-16  clang++-17 clang++-18 clang++-19 clang++-20
do
   make -j profile-build CXX=$comp COMP=clang >& out.compile.$comp
   mv stockfish stockfish.$comp
done
```

after installing the required versions with the automatic installation script (https://apt.llvm.org/)

closes https://github.com/official-stockfish/Stockfish/pull/5958

No functional change
2025-04-26 21:25:18 +02:00
Daniel Monroe
3d18ad719b Skip 5th continuation history
Passed simplification STC
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 70208 W: 18098 L: 17914 D: 34196
Ptnml(0-2): 199, 8300, 17907, 8514, 184
https://tests.stockfishchess.org/tests/view/67ed889b31d7cf8afdc451cb

Passed simplification LTC
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 485004 W: 122703 L: 122956 D: 239345
Ptnml(0-2): 288, 53162, 135805, 53009, 238
https://tests.stockfishchess.org/tests/view/67ee810231d7cf8afdc452ea

closes https://github.com/official-stockfish/Stockfish/pull/5992

Bench: 1715901
2025-04-18 14:32:26 +02:00
Guenther Demetz
698c069bba Move node increment inside do_move function
Move node increment inside do_move function so that we can centralize
the increment into a single point.

closes https://github.com/official-stockfish/Stockfish/pull/5989

No functional change
2025-04-18 14:32:26 +02:00
Shawn Xu
2b4926e091 Simplify TT Move History
Part 1 passed Non-regression STC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 195552 W: 50394 L: 50349 D: 94809
Ptnml(0-2): 581, 23222, 50122, 23273, 578
https://tests.stockfishchess.org/tests/view/67eb6ea831d7cf8afdc44c30

Part 2 passed Non-regression STC:
LLR: 2.92 (-2.94,2.94) <-1.75,0.25>
Total: 181664 W: 46786 L: 46727 D: 88151
Ptnml(0-2): 517, 21403, 46974, 21380, 558
https://tests.stockfishchess.org/tests/view/67eb6f3331d7cf8afdc44c33

Passed Non-regression LTC:
LLR: 2.96 (-2.94,2.94) <-1.75,0.25>
Total: 155454 W: 39496 L: 39412 D: 76546
Ptnml(0-2): 77, 16950, 43580, 17052, 68
https://tests.stockfishchess.org/tests/view/67eee76531d7cf8afdc45358

Passed Non-regression VLTC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 118266 W: 30263 L: 30148 D: 57855
Ptnml(0-2): 11, 11844, 35309, 11957, 12
https://tests.stockfishchess.org/tests/view/67f2414a31d7cf8afdc45760

closes https://github.com/official-stockfish/Stockfish/pull/5987

Bench: 1792850
2025-04-18 14:32:26 +02:00
pb00067
d2d046c2a4 Improve stalemate detection during search
Currently SF’s quiescence search, like that of most alpha-beta based
engines, doesn’t check for stalemate, because doing so at each leaf
position is too expensive and costs Elo. However, in certain positions
this creates a blind spot for SF: it does not recognize soon enough that
the opponent can reach a stalemate by sacrificing his last mobile heavy
piece(s). This tactical motif and its measure are similar to zugzwang and
verification search: the measure itself does not gain Elo, but prevents
SF from losing/drawing games in an awkward way.

The fix consists of 3 measures:

1. Make qsearch check for stalemate on transitions to pure KP-material
   for the side to move when our last Rook/Queen has just been captured.
   In fact this is the scenario where stalemate happens with the highest
   frequency. The stalemate verification itself is optimized by merely
   checking for pawn pushes & king mobility (captures were already
   tried by qsearch).

2. Another culprit for the issue turned out to be SEE-based pruning for
   checks in step 14. Here the move forcing the stalemate (or forcing
   the opponent to not retake) often gets pruned away, and it then takes
   too much time to reach enough depth. To counter this we verify the
   following conditions:

-   a) the side to move is happy with a draw (alpha < 0)
-   b) we are about to sacrifice our last heavy & unique mobile piece in
    this position.
-   c) this piece doesn’t move away from our king ring, giving the king a
    new square to move to.
     When all 3 conditions are met we don’t prune the move, because there
     is a good chance that capturing the piece means stalemate.

3. Store terminal nodes (mates & stalemates) in TT with higher depth
   than searched depth. This prevents SF from:
- reanalyzing the node (= trying to generate legal moves) in vain at
  each iterative deepening step.
- overwriting an already correct draw evaluation from a previous shallow
  normal search by a qsearch which doesn’t recognize stalemate and might
  store a very erratic evaluation.
 This is due to the constant 4 in the TT-overwrite condition: d -
 DEPTH_ENTRY_OFFSET + 2 * pv > depth8 - 4, which allows qs to override
 entries made by normal searches with depth <= 4.
This third measure, however, is not essential for fixing the issue, but
tests (one of vdv's & one of mine) seem to suggest that this measure
brings some small benefit.
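
The depth comparison quoted in measure 3 can be sketched as follows. This isolates only that one comparison; any other clauses of the real replacement logic are not modelled. The value of `kDepthEntryOffset` is an assumption made for illustration, and `depth_allows_overwrite` and `to_depth8` are hypothetical helper names. The exact cutoff depth depends on the offset value and on the PV flag:

```cpp
// Assumed value for illustration only; the engine's constant may differ.
constexpr int kDepthEntryOffset = -3;

// Convert a search depth to the stored form (offset already subtracted).
inline int to_depth8(int depth) { return depth - kDepthEntryOffset; }

// Depth part of the overwrite test quoted in the commit message:
//   d - DEPTH_ENTRY_OFFSET + 2 * pv > depth8 - 4
// d      : depth of the entry about to be written
// pv     : whether the new entry comes from a PV node
// depth8 : depth already stored in the slot, in stored form
inline bool depth_allows_overwrite(int d, bool pv, int depth8) {
    return d - kDepthEntryOffset + 2 * static_cast<int>(pv) > depth8 - 4;
}
```

With these assumed numbers a depth-0 (qsearch) write can replace a shallow stored entry but not one stored with a large depth, which is why storing terminal nodes with a higher depth protects them.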

Another position where SF benefits from this fix is, for instance,
FEN 8/8/8/1B6/6p1/8/3K1Ppp/3N2kr w - - 0 1 bm f4 +M9

P.S.: This issue also highly depends on the net used, i.e. how good the
net is at evaluating such mobility-restricted positions. SF16 was pretty
good at solving 2rr4/5pBk/PqP3p1/1N3pPp/1PQ1bP1P/8/3R4/R4K2 b - - 0 40 bm Rxc6
(< 1 second), while SF16_1, with the introduction of the dual net, needs about
1.5 minutes and SF17.1 requires 3 minutes to find the drawing move Rxc6.

P.S.2: Using more threads produces indeterminism, and using high hash
pressure makes SF reevaluate explored positions more often, which makes
it more likely to solve the position. To have stable, meaningful results
I therefore tested with a single thread and low hash pressure.

Preliminary LTC test at 30k games
https://tests.stockfishchess.org/tests/view/67ece7a931d7cf8afdc44e18
Elo: 0.04 ± 2.0 (95%) LOS: 51.7%
Total: 24416 W: 6226 L: 6223 D: 11967
Ptnml(0-2): 12, 2497, 7185, 2504, 10
nElo: 0.09 ± 4.4 (95%) PairsRatio: 1.00

Passed LTC no-regression sprt
https://tests.stockfishchess.org/tests/view/67ee8e4631d7cf8afdc452fb
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 401556 W: 101612 L: 101776 D: 198168
Ptnml(0-2): 152, 42241, 116170, 42049, 166

closes https://github.com/official-stockfish/Stockfish/pull/5983
fixes https://github.com/official-stockfish/Stockfish/issues/5899

Bench: 1721673
2025-04-18 14:32:26 +02:00
FauziAkram
44efbaddea Simplify bonusScale calculation
Allowing this specific term to potentially become negative for low depth
values is ok, because the overall `bonusScale` is explicitly ensured to
be non-negative a few lines later: bonusScale = std::max(bonusScale, 0);
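
A minimal sketch of why a negative intermediate term is harmless here. Only the final `std::max(bonusScale, 0)` clamp comes from the message above; `depth_term`, `bonus_scale`, and the constants are hypothetical and chosen purely for illustration:

```cpp
#include <algorithm>

// Hypothetical depth-dependent term that goes negative at low depth.
// The constants are illustrative and not taken from the engine.
inline int depth_term(int depth) { return 50 * depth - 120; }

// Sum the terms, then clamp once at the end.
inline int bonus_scale(int otherTerms, int depth) {
    int bonusScale = otherTerms + depth_term(depth);
    bonusScale = std::max(bonusScale, 0);  // clamp quoted in the commit message
    return bonusScale;
}
```

Because the clamp is applied to the sum, an individual term may dip below zero without the overall scale ever becoming negative.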

Passed STC:
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 164928 W: 42446 L: 42368 D: 80114
Ptnml(0-2): 497, 19551, 42306, 19597, 513
https://tests.stockfishchess.org/tests/view/67debf0b8888403457d8736c

Passed LTC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 234942 W: 59539 L: 59537 D: 115866
Ptnml(0-2): 113, 25639, 65964, 25643, 112
https://tests.stockfishchess.org/tests/view/67e2e1c48888403457d87768

closes https://github.com/official-stockfish/Stockfish/pull/5979

Bench: 1933843
2025-04-18 14:32:26 +02:00
Daniel Monroe
904a016396 Don't use 5th continuation history in move ordering
Passed simplification STC
LLR: 2.96 (-2.94,2.94) <-1.75,0.25>
Total: 136960 W: 35374 L: 35262 D: 66324
Ptnml(0-2): 420, 16214, 35049, 16428, 369
https://tests.stockfishchess.org/tests/view/67e6d0ae6682f97da2178ee5

Passed simplification LTC
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 233016 W: 59063 L: 59059 D: 114894
Ptnml(0-2): 113, 25430, 65421, 25428, 116
https://tests.stockfishchess.org/tests/view/67e9f7fb31d7cf8afdc44ad4

closes https://github.com/official-stockfish/Stockfish/pull/5978

Bench: 1842721
2025-04-18 14:32:26 +02:00
FauziAkram
5f8e67a544 Remove combineLast3 optimization
Passed non-reg STC 1st:
LLR: 2.96 (-2.94,2.94) <-1.75,0.25>
Total: 67328 W: 17296 L: 17118 D: 32914
Ptnml(0-2): 158, 7095, 19011, 7211, 189
https://tests.stockfishchess.org/tests/view/67e6c2796682f97da2178ebe

Passed non-reg STC 2nd:
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 92288 W: 23885 L: 23734 D: 44669
Ptnml(0-2): 213, 10039, 25518, 10132, 242
https://tests.stockfishchess.org/tests/view/67ed6a2d31d7cf8afdc45190

closes https://github.com/official-stockfish/Stockfish/pull/5975

Bench: 1875196
2025-04-18 14:32:26 +02:00
mstembera
8d2eef2b1e Fix fused(): all control paths should return a value.
closes https://github.com/official-stockfish/Stockfish/pull/5973

No functional change
2025-04-18 14:32:26 +02:00
AliceRoselia
2af64d581b Update AUTHORS
closes https://github.com/official-stockfish/Stockfish/pull/5972

No functional change
2025-04-18 14:32:19 +02:00
FauziAkram
cf8b3637a0 Improve futility pruning
Adding a small term to the futility calculation that depends on eval - beta.
Refactored to a simpler form.

Passed STC:
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 74176 W: 19323 L: 18954 D: 35899
Ptnml(0-2): 226, 8576, 19117, 8941, 228
https://tests.stockfishchess.org/tests/view/67e6b0946682f97da2178eaf

Passed LTC:
LLR: 2.96 (-2.94,2.94) <0.50,2.50>
Total: 135090 W: 34499 L: 33983 D: 66608
Ptnml(0-2): 79, 14403, 38040, 14969, 54
https://tests.stockfishchess.org/tests/view/67e757cc6682f97da2178f62

closes https://github.com/official-stockfish/Stockfish/pull/5970

Bench: 1875196
2025-04-18 14:32:19 +02:00
Disservin
bb3eaf8def Add cstddef header
Fix missing header for std::size_t reported on discord.

closes https://github.com/official-stockfish/Stockfish/pull/5968

No functional change
2025-04-18 14:32:19 +02:00
mstembera
1577fa0470 Simplify Forward and Backward
Forward and Backward are not independent so simplify to a bool.

Clean up some MSVC warnings like "warning C4267: 'argument': conversion
from 'size_t' to 'int', possible loss of data", plus other minor
formatting changes.

This is a rebase of https://github.com/official-stockfish/Stockfish/pull/5912

closes https://github.com/official-stockfish/Stockfish/pull/5967

No functional change
2025-04-18 14:32:19 +02:00
Daniel Monroe
fb6a3e04ec Simply use non_pawn_material rather than summing tuned terms
Passed simplification STC
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 136576 W: 35285 L: 35175 D: 66116
Ptnml(0-2): 410, 16179, 34997, 16295, 407
https://tests.stockfishchess.org/tests/view/67e5dc736682f97da2178da6

Passed rebased simplification LTC
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 85482 W: 21812 L: 21658 D: 42012
Ptnml(0-2): 34, 9260, 24022, 9368, 57
https://tests.stockfishchess.org/tests/view/67e852cb31d7cf8afdc44966

closes https://github.com/official-stockfish/Stockfish/pull/5965

Bench: 2006483
2025-04-18 14:32:19 +02:00
Daniel Samek
15f34560f2 Update AUTHORS
closes https://github.com/official-stockfish/Stockfish/pull/5964

No functional change
2025-04-18 14:32:13 +02:00
Disservin
7beff18ef0 Retire Acc Pointer
Since @xu-shawn's refactor the acc pointer has become a bit unnecessary,
we can achieve the same behavior by using the dimension size to get the
right accumulator from the `AccumulatorState`. I think the acc pointer
has become a bit of a burden required to be passed through multiple
different functions together with the necessary templating required.

Passed Non-Regression STC:
https://tests.stockfishchess.org/tests/view/67ed600f31d7cf8afdc45183
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 279744 W: 72037 L: 72082 D: 135625
Ptnml(0-2): 673, 29918, 78767, 29809, 705

closes https://github.com/official-stockfish/Stockfish/pull/5942

No functional change
2025-04-18 13:54:09 +02:00
Shawn Xu
d7c04a9429 Introduce TT Move History Double Extensions
Passed VVLTC w/ STC bounds:
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 74918 W: 19436 L: 19120 D: 36362
Ptnml(0-2): 6, 6890, 23354, 7200, 9
https://tests.stockfishchess.org/tests/view/67e4a1088888403457d878bb

Passed VVLTC w/ LTC bounds:
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 111706 W: 29050 L: 28619 D: 54037
Ptnml(0-2): 13, 10218, 34959, 10651, 12
https://tests.stockfishchess.org/tests/view/67d6877c517865b4a2dfd36b

STC Elo Estimate:
Elo: 1.26 ± 2.0 (95%) LOS: 88.8%
Total: 30000 W: 7855 L: 7746 D: 14399
Ptnml(0-2): 104, 3531, 7630, 3622, 113
nElo: 2.44 ± 3.9 (95%) PairsRatio: 1.03
https://tests.stockfishchess.org/tests/view/67eacb8c31d7cf8afdc44b99

closes https://github.com/official-stockfish/Stockfish/pull/5961

Bench: 1887897
2025-04-02 17:53:05 +02:00
Daniel Samek
d942e13398 Less fail high cnt in the condition
Passed STC:
https://tests.stockfishchess.org/tests/view/67e027538888403457d87535
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 164000 W: 42535 L: 42034 D: 79431
Ptnml(0-2): 478, 19228, 42113, 19677, 504

Passed LTC:
https://tests.stockfishchess.org/tests/view/67e3c21b8888403457d87808
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 139176 W: 35500 L: 34975 D: 68701
Ptnml(0-2): 54, 15038, 38898, 15525, 73

closes https://github.com/official-stockfish/Stockfish/pull/5960

Bench: 1921404
2025-04-02 17:52:38 +02:00
Daniel Monroe
3d61f932cb Squash out post-lmr bonus variable
Squash out bonus variable for post-lmr now that the bonus is constant.

closes https://github.com/official-stockfish/Stockfish/pull/5953

No functional change
2025-04-02 17:43:40 +02:00
mstembera
ed89817f62 Various cleanups
Various simplifications, cleanups, consistency improvements, and warning
mitigations.

Passed Non-Regression STC:
https://tests.stockfishchess.org/tests/view/67e7dd2d6682f97da2178fd8
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 386848 W: 99593 L: 99751 D: 187504
Ptnml(0-2): 1024, 41822, 107973, 41498, 1107

closes https://github.com/official-stockfish/Stockfish/pull/5948

No functional change
2025-04-02 17:43:14 +02:00
mstembera
1a395f1b56 Remove pawn_attacks_bb()
Generalize attacks_bb() to handle pawns and remove pawn_attacks_bb()

https://tests.stockfishchess.org/tests/view/67e9496231d7cf8afdc44a2e
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 134688 W: 34660 L: 34553 D: 65475
Ptnml(0-2): 298, 14619, 37462, 14608, 357

closes https://github.com/official-stockfish/Stockfish/pull/5947

No functional change
2025-04-02 17:39:40 +02:00
Shawn Xu
d2cb927a04 Simplify TT cutoff conthist updates
Passed Non-regression STC:
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 86304 W: 22251 L: 22084 D: 41969
Ptnml(0-2): 250, 10214, 22123, 10249, 316
https://tests.stockfishchess.org/tests/view/67db60cd8c7f315cc372aae7

Passed Non-regression LTC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 199158 W: 50655 L: 50617 D: 97886
Ptnml(0-2): 103, 21579, 56182, 21607, 108
https://tests.stockfishchess.org/tests/view/67dcdc5b8c7f315cc372ac12

closes https://github.com/official-stockfish/Stockfish/pull/5945

Bench: 2069191
2025-04-02 17:37:02 +02:00
FauziAkram
dfef7e7520 Remove redundant assignment
closes https://github.com/official-stockfish/Stockfish/pull/5944

No functional change
2025-04-02 17:36:46 +02:00
Shawn Xu
ee35a51c40 Remove extra division
closes https://github.com/official-stockfish/Stockfish/pull/5943

No functional change
2025-04-02 17:34:37 +02:00
Shawn Xu
c2ff7a95c3 Cleanup fused updates
Passed Non-regression STC:
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 70656 W: 18257 L: 18077 D: 34322
Ptnml(0-2): 217, 7912, 18879, 8114, 206
https://tests.stockfishchess.org/tests/view/67e23ae78888403457d876d4

closes https://github.com/official-stockfish/Stockfish/pull/5941

No functional change
2025-04-02 17:34:31 +02:00
Disservin
0475c8653f Restore development
closes https://github.com/official-stockfish/Stockfish/pull/5956

No functional change
2025-04-02 17:23:32 +02:00
47 changed files with 1568 additions and 1395 deletions

View File

@@ -9,14 +9,14 @@ AllowAllParametersOfDeclarationOnNextLine: true
AllowShortCaseLabelsOnASingleLine: false
AllowShortEnumsOnASingleLine: false
AllowShortIfStatementsOnASingleLine: false
AlwaysBreakTemplateDeclarations: Yes
BreakTemplateDeclarations: Yes
BasedOnStyle: WebKit
BitFieldColonSpacing: After
BinPackParameters: false
BreakBeforeBinaryOperators: NonAssignment
BreakBeforeBraces: Custom
BraceWrapping:
AfterFunction: false
AfterFunction: false
AfterClass: false
AfterControlStatement: true
BeforeElse: true

View File

@@ -4,7 +4,7 @@
"name": "Android NDK aarch64",
"os": "ubuntu-22.04",
"simple_name": "android",
"compiler": "aarch64-linux-android21-clang++",
"compiler": "aarch64-linux-android29-clang++",
"emu": "qemu-aarch64",
"comp": "ndk",
"shell": "bash",
@@ -14,7 +14,7 @@
"name": "Android NDK arm",
"os": "ubuntu-22.04",
"simple_name": "android",
"compiler": "armv7a-linux-androideabi21-clang++",
"compiler": "armv7a-linux-androideabi29-clang++",
"emu": "qemu-arm",
"comp": "ndk",
"shell": "bash",
@@ -26,25 +26,25 @@
{
"binaries": "armv8-dotprod",
"config": {
"compiler": "armv7a-linux-androideabi21-clang++"
"compiler": "armv7a-linux-androideabi29-clang++"
}
},
{
"binaries": "armv8",
"config": {
"compiler": "armv7a-linux-androideabi21-clang++"
"compiler": "armv7a-linux-androideabi29-clang++"
}
},
{
"binaries": "armv7",
"config": {
"compiler": "aarch64-linux-android21-clang++"
"compiler": "aarch64-linux-android29-clang++"
}
},
{
"binaries": "armv7-neon",
"config": {
"compiler": "aarch64-linux-android21-clang++"
"compiler": "aarch64-linux-android29-clang++"
}
}
]

.github/ci/matrix.json vendored
View File

@@ -40,6 +40,18 @@
"ext": ".exe",
"sde": "/d/a/Stockfish/Stockfish/.output/sde-temp-files/sde-external-9.27.0-2023-09-13-win/sde.exe -future --",
"archive_ext": "zip"
},
{
"name": "Windows 11 Mingw-w64 Clang arm64",
"os": "windows-11-arm",
"simple_name": "windows",
"compiler": "clang++",
"comp": "clang",
"msys_sys": "clangarm64",
"msys_env": "clang-aarch64-clang",
"shell": "msys2 {0}",
"ext": ".exe",
"archive_ext": "zip"
}
],
"binaries": [
@@ -51,7 +63,9 @@
"x86-64-avx512",
"x86-64-vnni256",
"x86-64-vnni512",
"apple-silicon"
"apple-silicon",
"armv8",
"armv8-dotprod"
],
"exclude": [
{
@@ -84,12 +98,6 @@
"os": "macos-14"
}
},
{
"binaries": "x86-64-avxvnni",
"config": {
"os": "macos-14"
}
},
{
"binaries": "x86-64-avx512",
"config": {
@@ -108,12 +116,6 @@
"os": "macos-14"
}
},
{
"binaries": "x86-64-avxvnni",
"config": {
"ubuntu-22.04": null
}
},
{
"binaries": "x86-64-avxvnni",
"config": {
@@ -138,6 +140,54 @@
"os": "macos-13"
}
},
{
"binaries": "x86-64",
"config": {
"os": "windows-11-arm"
}
},
{
"binaries": "x86-64-sse41-popcnt",
"config": {
"os": "windows-11-arm"
}
},
{
"binaries": "x86-64-avx2",
"config": {
"os": "windows-11-arm"
}
},
{
"binaries": "x86-64-bmi2",
"config": {
"os": "windows-11-arm"
}
},
{
"binaries": "x86-64-avxvnni",
"config": {
"os": "windows-11-arm"
}
},
{
"binaries": "x86-64-avx512",
"config": {
"os": "windows-11-arm"
}
},
{
"binaries": "x86-64-vnni256",
"config": {
"os": "windows-11-arm"
}
},
{
"binaries": "x86-64-vnni512",
"config": {
"os": "windows-11-arm"
}
},
{
"binaries": "apple-silicon",
"config": {
@@ -147,7 +197,13 @@
{
"binaries": "apple-silicon",
"config": {
"os": "macos-13"
"os": "windows-11-arm"
}
},
{
"binaries": "apple-silicon",
"config": {
"os": "ubuntu-20.04"
}
},
{
@@ -155,6 +211,72 @@
"config": {
"os": "ubuntu-22.04"
}
},
{
"binaries": "apple-silicon",
"config": {
"os": "macos-13"
}
},
{
"binaries": "armv8",
"config": {
"os": "windows-2022"
}
},
{
"binaries": "armv8",
"config": {
"os": "ubuntu-20.04"
}
},
{
"binaries": "armv8",
"config": {
"os": "ubuntu-22.04"
}
},
{
"binaries": "armv8",
"config": {
"os": "macos-13"
}
},
{
"binaries": "armv8",
"config": {
"os": "macos-14"
}
},
{
"binaries": "armv8-dotprod",
"config": {
"os": "windows-2022"
}
},
{
"binaries": "armv8-dotprod",
"config": {
"os": "ubuntu-20.04"
}
},
{
"binaries": "armv8-dotprod",
"config": {
"os": "ubuntu-22.04"
}
},
{
"binaries": "armv8-dotprod",
"config": {
"os": "macos-13"
}
},
{
"binaries": "armv8-dotprod",
"config": {
"os": "macos-14"
}
}
]
}

View File

@@ -38,7 +38,7 @@ jobs:
if: runner.os == 'Linux'
run: |
if [ $COMP == ndk ]; then
NDKV="21.4.7075529"
NDKV="27.2.12479018"
ANDROID_ROOT=/usr/local/lib/android
ANDROID_SDK_ROOT=$ANDROID_ROOT/sdk
SDKMANAGER=$ANDROID_SDK_ROOT/cmdline-tools/latest/bin/sdkmanager

View File

@@ -25,11 +25,11 @@ jobs:
ref: ${{ github.event.pull_request.head.sha }}
- name: Run clang-format style check
uses: jidicula/clang-format-action@f62da5e3d3a2d88ff364771d9d938773a618ab5e # @v4.11.0
uses: jidicula/clang-format-action@4726374d1aa3c6aecf132e5197e498979588ebc8 # @v4.15.0
id: clang-format
continue-on-error: true
with:
clang-format-version: "18"
clang-format-version: "20"
exclude-regex: "incbin"
- name: Comment on PR
@@ -37,9 +37,9 @@ jobs:
uses: thollander/actions-comment-pull-request@fabd468d3a1a0b97feee5f6b9e499eab0dd903f6 # @v2.5.0
with:
message: |
clang-format 18 needs to be run on this PR.
clang-format 20 needs to be run on this PR.
If you do not have clang-format installed, the maintainer will run it when merging.
For the exact version please see https://packages.ubuntu.com/noble/clang-format-18.
For the exact version please see https://packages.ubuntu.com/plucky/clang-format-20.
_(execution **${{ github.run_id }}** / attempt **${{ github.run_attempt }}**)_
comment_tag: execution

View File

@@ -47,7 +47,7 @@ jobs:
- name: Build
working-directory: src
run: make -j build ARCH=x86-64-modern
run: make -j build
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v3

View File

@@ -63,13 +63,13 @@ jobs:
- name: Check compiler
run: $COMPCXX -v
- name: Show g++ cpu info
if: runner.os != 'macOS'
run: g++ -Q -march=native --help=target
- name: Show clang++ cpu info
if: runner.os == 'macOS'
run: clang++ -E - -march=native -###
- name: Show compiler cpu info
run: |
if [[ "$COMPCXX" == clang* ]]; then
$COMPCXX -E - -march=native -###
else
$COMPCXX -Q -march=native --help=target
fi
# x86-64 with newer extensions tests

View File

@@ -29,24 +29,25 @@ jobs:
shell: bash
- name: Android NDK aarch64
os: ubuntu-22.04
compiler: aarch64-linux-android21-clang++
compiler: aarch64-linux-android29-clang++
comp: ndk
run_armv8_tests: true
shell: bash
- name: Android NDK arm
os: ubuntu-22.04
compiler: armv7a-linux-androideabi21-clang++
compiler: armv7a-linux-androideabi29-clang++
comp: ndk
run_armv7_tests: true
shell: bash
- name: Linux GCC riscv64
os: ubuntu-22.04
compiler: g++
comp: gcc
run_riscv64_tests: true
base_image: "riscv64/alpine:edge"
platform: linux/riscv64
shell: bash
# Currently segfaults in the CI unrelated to a Stockfish change.
# - name: Linux GCC riscv64
# os: ubuntu-22.04
# compiler: g++
# comp: gcc
# run_riscv64_tests: true
# base_image: "riscv64/alpine:edge"
# platform: linux/riscv64
# shell: bash
- name: Linux GCC ppc64
os: ubuntu-22.04
compiler: g++
@@ -98,6 +99,14 @@ jobs:
msys_sys: clang64
msys_env: clang-x86_64-clang
shell: msys2 {0}
- name: Windows 11 Mingw-w64 Clang arm64
os: windows-11-arm
compiler: clang++
comp: clang
run_armv8_tests: true
msys_sys: clangarm64
msys_env: clang-aarch64-clang
shell: msys2 {0}
defaults:
run:
working-directory: src
@@ -118,7 +127,7 @@ jobs:
if: runner.os == 'Linux'
run: |
if [ $COMP == ndk ]; then
NDKV="21.4.7075529"
NDKV="27.2.12479018"
ANDROID_ROOT=/usr/local/lib/android
ANDROID_SDK_ROOT=$ANDROID_ROOT/sdk
SDKMANAGER=$ANDROID_SDK_ROOT/cmdline-tools/latest/bin/sdkmanager
@@ -302,8 +311,10 @@ jobs:
- name: Test armv8 build
if: matrix.config.run_armv8_tests
run: |
export PATH=${{ env.ANDROID_NDK_BIN }}:$PATH
export LDFLAGS="-static -Wno-unused-command-line-argument"
if [ $COMP == ndk ]; then
export PATH=${{ env.ANDROID_NDK_BIN }}:$PATH
export LDFLAGS="-static -Wno-unused-command-line-argument"
fi
make clean
make -j4 ARCH=armv8 build
../tests/signature.sh $benchref
@@ -311,8 +322,10 @@ jobs:
- name: Test armv8-dotprod build
if: matrix.config.run_armv8_tests
run: |
export PATH=${{ env.ANDROID_NDK_BIN }}:$PATH
export LDFLAGS="-static -Wno-unused-command-line-argument"
if [ $COMP == ndk ]; then
export PATH=${{ env.ANDROID_NDK_BIN }}:$PATH
export LDFLAGS="-static -Wno-unused-command-line-argument"
fi
make clean
make -j4 ARCH=armv8-dotprod build
../tests/signature.sh $benchref

View File

@@ -20,6 +20,7 @@ Alexander Kure
Alexander Pagel (Lolligerhans)
Alfredo Menezes (lonfom169)
Ali AlZhrani (Cooffe)
AliceRoselia
Andreas Jan van der Meulen (Andyson007)
Andreas Matthies (Matthies)
Andrei Vetrov (proukornew)
@@ -33,6 +34,7 @@ Artem Solopiy (EntityFX)
Auguste Pop
Balazs Szilagyi
Balint Pfliegel
Baptiste Rech (breatn)
Ben Chaney (Chaneybenjamini)
Ben Koshy (BKSpurgeon)
Bill Henry (VoyagerOne)
@@ -57,6 +59,7 @@ Dale Weiler (graphitemaster)
Daniel Axtens (daxtens)
Daniel Dugovic (ddugovic)
Daniel Monroe (Ergodice)
Daniel Samek (DanSamek)
Dan Schmidt (dfannius)
Dariusz Orzechowski (dorzechowski)
David (dav1312)
@@ -129,6 +132,7 @@ Kenneth Lee (kennethlee33)
Kian E (KJE-98)
kinderchocolate
Kiran Panditrao (Krgp)
Kirill Zaripov (kokodio)
Kojirion
Krisztián Peőcz
Krystian Kuzniarek (kuzkry)
@@ -145,6 +149,7 @@ Lucas Braesch (lucasart)
Lyudmil Antonov (lantonov)
Maciej Żenczykowski (zenczykowski)
Malcolm Campbell (xoto10)
Mark Marosi (Mapika)
Mark Tenzer (31m059)
marotear
Mathias Parnaudeau (mparnaudeau)

View File

@@ -59,7 +59,7 @@ discussion._
Changes to Stockfish C++ code should respect our coding style defined by
[.clang-format](.clang-format). You can format your changes by running
`make format`. This requires clang-format version 18 to be installed on your system.
`make format`. This requires clang-format version 20 to be installed on your system.
## Navigate

View File

@@ -130,7 +130,13 @@ case $uname_s in
esac
file_ext='tar'
;;
'CYGWIN'*|'MINGW'*|'MSYS'*) # Windows system with POSIX compatibility layer
'MINGW'*'ARM64'*) # Windows ARM64 system with POSIX compatibility layer
# TODO: older chips might be armv8, but we have no good way to detect, /proc/cpuinfo shows x86 info
file_os='windows'
true_arch='armv8-dotprod'
file_ext='zip'
;;
'CYGWIN'*|'MINGW'*|'MSYS'*) # Windows x86_64system with POSIX compatibility layer
get_flags
check_znver_1_2
set_arch_x86_64

View File

@@ -60,9 +60,9 @@ SRCS = benchmark.cpp bitboard.cpp evaluate.cpp main.cpp \
HEADERS = benchmark.h bitboard.h evaluate.h misc.h movegen.h movepick.h history.h \
nnue/nnue_misc.h nnue/features/half_ka_v2_hm.h nnue/layers/affine_transform.h \
nnue/layers/affine_transform_sparse_input.h nnue/layers/clipped_relu.h nnue/layers/simd.h \
nnue/layers/affine_transform_sparse_input.h nnue/layers/clipped_relu.h \
nnue/layers/sqr_clipped_relu.h nnue/nnue_accumulator.h nnue/nnue_architecture.h \
nnue/nnue_common.h nnue/nnue_feature_transformer.h position.h \
nnue/nnue_common.h nnue/nnue_feature_transformer.h nnue/simd.h position.h \
search.h syzygy/tbprobe.h thread.h thread_win32_osx.h timeman.h \
tt.h tune.h types.h uci.h ucioption.h perft.h nnue/network.h engine.h score.h numa.h memory.h
@@ -163,8 +163,8 @@ lsx = no
lasx = no
STRIP = strip
ifneq ($(shell which clang-format-18 2> /dev/null),)
CLANG-FORMAT = clang-format-18
ifneq ($(shell which clang-format-20 2> /dev/null),)
CLANG-FORMAT = clang-format-20
else
CLANG-FORMAT = clang-format
endif
@@ -533,14 +533,12 @@ ifeq ($(KERNEL),Darwin)
XCRUN = xcrun
endif
# To cross-compile for Android, NDK version r21 or later is recommended.
# In earlier NDK versions, you'll need to pass -fno-addrsig if using GNU binutils.
# Currently we don't know how to make PGO builds with the NDK yet.
# To cross-compile for Android, use NDK version r27c or later.
ifeq ($(COMP),ndk)
CXXFLAGS += -stdlib=libc++ -fPIE
CXXFLAGS += -stdlib=libc++
comp=clang
ifeq ($(arch),armv7)
CXX=armv7a-linux-androideabi16-clang++
CXX=armv7a-linux-androideabi29-clang++
CXXFLAGS += -mthumb -march=armv7-a -mfloat-abi=softfp -mfpu=neon
ifneq ($(shell which arm-linux-androideabi-strip 2>/dev/null),)
STRIP=arm-linux-androideabi-strip
@@ -549,7 +547,7 @@ ifeq ($(COMP),ndk)
endif
endif
ifeq ($(arch),armv8)
CXX=aarch64-linux-android21-clang++
CXX=aarch64-linux-android29-clang++
ifneq ($(shell which aarch64-linux-android-strip 2>/dev/null),)
STRIP=aarch64-linux-android-strip
else
@@ -557,14 +555,28 @@ ifeq ($(COMP),ndk)
endif
endif
ifeq ($(arch),x86_64)
CXX=x86_64-linux-android21-clang++
CXX=x86_64-linux-android29-clang++
ifneq ($(shell which x86_64-linux-android-strip 2>/dev/null),)
STRIP=x86_64-linux-android-strip
else
STRIP=llvm-strip
endif
endif
LDFLAGS += -static-libstdc++ -pie -lm -latomic
LDFLAGS += -static-libstdc++
endif
### Allow overwriting CXX from command line
ifdef COMPCXX
CXX=$(COMPCXX)
endif
# llvm-profdata must be version compatible with the specified CXX (be it clang, or the gcc alias)
# make -j profile-build CXX=clang++-20 COMP=clang
# Locate the version in the same directory as the compiler used,
# with fallback to a generic one if it can't be located
LLVM_PROFDATA := $(dir $(realpath $(shell which $(CXX) 2> /dev/null)))llvm-profdata
ifeq ($(wildcard $(LLVM_PROFDATA)),)
LLVM_PROFDATA := llvm-profdata
endif
ifeq ($(comp),icx)
@@ -581,11 +593,6 @@ else
endif
endif
### Allow overwriting CXX from command line
ifdef COMPCXX
CXX=$(COMPCXX)
endif
### Sometimes gcc is really clang
ifeq ($(COMP),gcc)
gccversion := $(shell $(CXX) --version 2>/dev/null)
@@ -694,7 +701,7 @@ endif
ifeq ($(avx512),yes)
CXXFLAGS += -DUSE_AVX512
ifeq ($(comp),$(filter $(comp),gcc clang mingw icx))
CXXFLAGS += -mavx512f -mavx512bw
CXXFLAGS += -mavx512f -mavx512bw -mavx512dq -mavx512vl
endif
endif
@@ -989,10 +996,6 @@ net:
format:
$(CLANG-FORMAT) -i $(SRCS) $(HEADERS) -style=file
# default target
default:
help
### ==========================================================================
### Section 5. Private Targets
### ==========================================================================
@@ -1081,7 +1084,7 @@ clang-profile-make:
all
clang-profile-use:
$(XCRUN) llvm-profdata merge -output=stockfish.profdata *.profraw
$(XCRUN) $(LLVM_PROFDATA) merge -output=stockfish.profdata *.profraw
$(MAKE) ARCH=$(ARCH) COMP=$(COMP) \
EXTRACXXFLAGS='-fprofile-use=stockfish.profdata' \
EXTRALDFLAGS='-fprofile-use ' \
@@ -1118,6 +1121,6 @@ icx-profile-use:
.depend: $(SRCS)
-@$(CXX) $(DEPENDFLAGS) -MM $(SRCS) > $@ 2> /dev/null
ifeq (, $(filter $(MAKECMDGOALS), help strip install clean net objclean profileclean config-sanity))
ifeq (, $(filter $(MAKECMDGOALS), help strip install clean net objclean profileclean format config-sanity))
-include .depend
endif

View File

@@ -32,7 +32,6 @@ uint8_t SquareDistance[SQUARE_NB][SQUARE_NB];
Bitboard LineBB[SQUARE_NB][SQUARE_NB];
Bitboard BetweenBB[SQUARE_NB][SQUARE_NB];
Bitboard PseudoAttacks[PIECE_TYPE_NB][SQUARE_NB];
Bitboard PawnAttacks[COLOR_NB][SQUARE_NB];
alignas(64) Magic Magics[SQUARE_NB][2];
@@ -86,8 +85,8 @@ void Bitboards::init() {
for (Square s1 = SQ_A1; s1 <= SQ_H8; ++s1)
{
PawnAttacks[WHITE][s1] = pawn_attacks_bb<WHITE>(square_bb(s1));
PawnAttacks[BLACK][s1] = pawn_attacks_bb<BLACK>(square_bb(s1));
PseudoAttacks[WHITE][s1] = pawn_attacks_bb<WHITE>(square_bb(s1));
PseudoAttacks[BLACK][s1] = pawn_attacks_bb<BLACK>(square_bb(s1));
for (int step : {-9, -8, -7, -1, 1, 7, 8, 9})
PseudoAttacks[KING][s1] |= safe_destination(s1, step);

View File

@@ -62,7 +62,6 @@ extern uint8_t SquareDistance[SQUARE_NB][SQUARE_NB];
extern Bitboard BetweenBB[SQUARE_NB][SQUARE_NB];
extern Bitboard LineBB[SQUARE_NB][SQUARE_NB];
extern Bitboard PseudoAttacks[PIECE_TYPE_NB][SQUARE_NB];
extern Bitboard PawnAttacks[COLOR_NB][SQUARE_NB];
// Magic holds all magic bitboards relevant data for a single square
@@ -103,17 +102,17 @@ constexpr Bitboard square_bb(Square s) {
// Overloads of bitwise operators between a Bitboard and a Square for testing
// whether a given bit is set in a bitboard, and for setting and clearing bits.
inline Bitboard operator&(Bitboard b, Square s) { return b & square_bb(s); }
inline Bitboard operator|(Bitboard b, Square s) { return b | square_bb(s); }
inline Bitboard operator^(Bitboard b, Square s) { return b ^ square_bb(s); }
inline Bitboard& operator|=(Bitboard& b, Square s) { return b |= square_bb(s); }
inline Bitboard& operator^=(Bitboard& b, Square s) { return b ^= square_bb(s); }
constexpr Bitboard operator&(Bitboard b, Square s) { return b & square_bb(s); }
constexpr Bitboard operator|(Bitboard b, Square s) { return b | square_bb(s); }
constexpr Bitboard operator^(Bitboard b, Square s) { return b ^ square_bb(s); }
constexpr Bitboard& operator|=(Bitboard& b, Square s) { return b |= square_bb(s); }
constexpr Bitboard& operator^=(Bitboard& b, Square s) { return b ^= square_bb(s); }
inline Bitboard operator&(Square s, Bitboard b) { return b & s; }
inline Bitboard operator|(Square s, Bitboard b) { return b | s; }
inline Bitboard operator^(Square s, Bitboard b) { return b ^ s; }
constexpr Bitboard operator&(Square s, Bitboard b) { return b & s; }
constexpr Bitboard operator|(Square s, Bitboard b) { return b | s; }
constexpr Bitboard operator^(Square s, Bitboard b) { return b ^ s; }
inline Bitboard operator|(Square s1, Square s2) { return square_bb(s1) | s2; }
constexpr Bitboard operator|(Square s1, Square s2) { return square_bb(s1) | s2; }
constexpr bool more_than_one(Bitboard b) { return b & (b - 1); }
@@ -155,11 +154,6 @@ constexpr Bitboard pawn_attacks_bb(Bitboard b) {
: shift<SOUTH_WEST>(b) | shift<SOUTH_EAST>(b);
}
inline Bitboard pawn_attacks_bb(Color c, Square s) {
assert(is_ok(s));
return PawnAttacks[c][s];
}
// Returns a bitboard representing an entire line (from board edge
// to board edge) that intersects the two given squares. If the given squares
@@ -216,10 +210,10 @@ inline int edge_distance(File f) { return std::min(f, File(FILE_H - f)); }
// Returns the pseudo attacks of the given piece type
// assuming an empty board.
template<PieceType Pt>
inline Bitboard attacks_bb(Square s) {
inline Bitboard attacks_bb(Square s, Color c = COLOR_NB) {
assert((Pt != PAWN) && (is_ok(s)));
return PseudoAttacks[Pt][s];
assert((Pt != PAWN || c < COLOR_NB) && (is_ok(s)));
return Pt == PAWN ? PseudoAttacks[c][s] : PseudoAttacks[Pt][s];
}
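
The merged lookup above can be sketched in isolation. The types below are simplified stand-ins, not the engine's own headers, and the sketch assumes the enum layout WHITE = 0, BLACK = 1, NO_PIECE_TYPE = 0, PAWN = 1, so that the two pawn rows occupy table rows the non-pawn pieces never use. `AttackTable` and `attacks_bb_sketch` are hypothetical names, and only the pawn rows are filled here.

```cpp
#include <cassert>
#include <cstdint>

using Bitboard = std::uint64_t;
enum Color { WHITE, BLACK, COLOR_NB };
enum PieceType { NO_PIECE_TYPE, PAWN, KNIGHT, PIECE_TYPE_NB = 8 };

constexpr Bitboard FileA = 0x0101010101010101ULL;
constexpr Bitboard FileH = FileA << 7;

// Pawn attacks for a set of pawns; square 0 is a1, square 63 is h8.
constexpr Bitboard pawn_attacks(Color c, Bitboard b) {
    return c == WHITE ? ((b & ~FileA) << 7) | ((b & ~FileH) << 9)
                      : ((b & ~FileA) >> 9) | ((b & ~FileH) >> 7);
}

// One table for everything: rows 0 and 1 hold white and black pawn attacks.
struct AttackTable {
    Bitboard rows[PIECE_TYPE_NB][64] = {};
    AttackTable() {
        for (int s = 0; s < 64; ++s)
        {
            rows[WHITE][s] = pawn_attacks(WHITE, Bitboard(1) << s);
            rows[BLACK][s] = pawn_attacks(BLACK, Bitboard(1) << s);
        }
    }
};

static const AttackTable Table{};

// Pawns are looked up by color, every other piece type by its own row.
template<PieceType Pt>
Bitboard attacks_bb_sketch(int s, Color c = COLOR_NB) {
    assert((Pt != PAWN || c < COLOR_NB) && s >= 0 && s < 64);
    return Pt == PAWN ? Table.rows[c][s] : Table.rows[Pt][s];
}
```

For example, `attacks_bb_sketch<PAWN>(12, WHITE)` (a white pawn on e2) returns the d3 and f3 bits.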

View File

@@ -38,17 +38,15 @@
namespace Stockfish {
// Returns a static, purely materialistic evaluation of the position from
// the point of view of the given color. It can be divided by PawnValue to get
// the point of view of the side to move. It can be divided by PawnValue to get
// an approximation of the material advantage on the board in terms of pawns.
int Eval::simple_eval(const Position& pos, Color c) {
int Eval::simple_eval(const Position& pos) {
Color c = pos.side_to_move();
return PawnValue * (pos.count<PAWN>(c) - pos.count<PAWN>(~c))
+ (pos.non_pawn_material(c) - pos.non_pawn_material(~c));
}
bool Eval::use_smallnet(const Position& pos) {
int simpleEval = simple_eval(pos, pos.side_to_move());
return std::abs(simpleEval) > 962;
}
bool Eval::use_smallnet(const Position& pos) { return std::abs(simple_eval(pos)) > 962; }
// Evaluate is the evaluator for the outer world. It returns a static evaluation
// of the position from the point of view of the side to move.
@@ -103,8 +101,6 @@ std::string Eval::trace(Position& pos, const Eval::NNUE::Networks& networks) {
Eval::NNUE::AccumulatorStack accumulators;
auto caches = std::make_unique<Eval::NNUE::AccumulatorCaches>(networks);
accumulators.reset(pos, networks, *caches);
std::stringstream ss;
ss << std::showpoint << std::noshowpos << std::fixed << std::setprecision(2);
ss << '\n' << NNUE::trace(pos, networks, *caches) << '\n';

View File

@@ -44,7 +44,7 @@ class AccumulatorStack;
std::string trace(Position& pos, const Eval::NNUE::Networks& networks);
int simple_eval(const Position& pos, Color c);
int simple_eval(const Position& pos);
bool use_smallnet(const Position& pos);
Value evaluate(const NNUE::Networks& networks,
const Position& pos,

View File

@@ -36,7 +36,7 @@ namespace Stockfish {
constexpr int PAWN_HISTORY_SIZE = 512; // has to be a power of 2
constexpr int CORRECTION_HISTORY_SIZE = 32768; // has to be a power of 2
constexpr int CORRECTION_HISTORY_LIMIT = 1024;
constexpr int LOW_PLY_HISTORY_SIZE = 4;
constexpr int LOW_PLY_HISTORY_SIZE = 5;
static_assert((PAWN_HISTORY_SIZE & (PAWN_HISTORY_SIZE - 1)) == 0,
"PAWN_HISTORY_SIZE has to be a power of 2");
@@ -166,6 +166,8 @@ struct CorrHistTypedef<NonPawn> {
template<CorrHistType T>
using CorrectionHistory = typename Detail::CorrHistTypedef<T>::type;
using TTMoveHistory = StatsEntry<std::int16_t, 8192>;
} // namespace Stockfish
#endif // #ifndef HISTORY_H_INCLUDED

View File

@@ -52,7 +52,6 @@ void memory_deleter(T* ptr, FREE_FUNC free_func) {
ptr->~T();
free_func(ptr);
return;
}
// Frees memory which was placed there with placement new.

View File

@@ -40,7 +40,7 @@ namespace Stockfish {
namespace {
// Version number or dev.
constexpr std::string_view version = "17.1";
constexpr std::string_view version = "dev";
// Our fancy logging facility. The trick here is to replace cin.rdbuf() and
// cout.rdbuf() with two Tie objects that tie cin and cout to a file stream. We

View File

@@ -317,6 +317,22 @@ void move_to_front(std::vector<T>& vec, Predicate pred) {
}
}
#if defined(__GNUC__) && !defined(__clang__)
#if __GNUC__ >= 13
#define sf_assume(cond) __attribute__((assume(cond)))
#else
#define sf_assume(cond) \
do \
{ \
if (!(cond)) \
__builtin_unreachable(); \
} while (0)
#endif
#else
// do nothing for other compilers
#define sf_assume(cond)
#endif
} // namespace Stockfish
#endif // #ifndef MISC_H_INCLUDED
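
A short usage sketch for the `sf_assume` macro added above. The macro is repeated so the example stands alone; `sum_blocks` is a hypothetical function, not part of the engine. The hint promises the compiler that the condition holds, so the caller must guarantee it; violating it is undefined behavior on the GCC paths and has no effect where the macro expands to nothing.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__GNUC__) && !defined(__clang__)
    #if __GNUC__ >= 13
        #define sf_assume(cond) __attribute__((assume(cond)))
    #else
        #define sf_assume(cond) \
            do \
            { \
                if (!(cond)) \
                    __builtin_unreachable(); \
            } while (0)
    #endif
#else
    // do nothing for other compilers
    #define sf_assume(cond)
#endif

// Hypothetical helper: the caller guarantees n is a multiple of 4,
// which may let the optimizer drop remainder handling in the loop.
inline int sum_blocks(const int* data, std::size_t n) {
    sf_assume(n % 4 == 0);
    int total = 0;
    for (std::size_t i = 0; i < n; ++i)
        total += data[i];
    return total;
}
```

The result is the same with or without the hint; only the generated code may differ.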

View File

@@ -134,7 +134,7 @@ ExtMove* generate_pawn_moves(const Position& pos, ExtMove* moveList, Bitboard ta
if (Type == EVASIONS && (target & (pos.ep_square() + Up)))
return moveList;
b1 = pawnsNotOn7 & pawn_attacks_bb(Them, pos.ep_square());
b1 = pawnsNotOn7 & attacks_bb<PAWN>(pos.ep_square(), Them);
assert(b1);

View File

@@ -20,6 +20,7 @@
#include <cassert>
#include <limits>
#include <utility>
#include "bitboard.h"
#include "misc.h"
@@ -55,6 +56,7 @@ enum Stages {
QCAPTURE
};
// Sort moves in descending order up to and including a given limit.
// The order of moves smaller than the limit is left unspecified.
void partial_insertion_sort(ExtMove* begin, ExtMove* end, int limit) {
@@ -125,74 +127,68 @@ void MovePicker::score() {
static_assert(Type == CAPTURES || Type == QUIETS || Type == EVASIONS, "Wrong type");
[[maybe_unused]] Bitboard threatenedByPawn, threatenedByMinor, threatenedByRook,
threatenedPieces;
Color us = pos.side_to_move();
[[maybe_unused]] Bitboard threatByLesser[QUEEN + 1];
if constexpr (Type == QUIETS)
{
Color us = pos.side_to_move();
threatenedByPawn = pos.attacks_by<PAWN>(~us);
threatenedByMinor =
pos.attacks_by<KNIGHT>(~us) | pos.attacks_by<BISHOP>(~us) | threatenedByPawn;
threatenedByRook = pos.attacks_by<ROOK>(~us) | threatenedByMinor;
// Pieces threatened by pieces of lesser material value
threatenedPieces = (pos.pieces(us, QUEEN) & threatenedByRook)
| (pos.pieces(us, ROOK) & threatenedByMinor)
| (pos.pieces(us, KNIGHT, BISHOP) & threatenedByPawn);
threatByLesser[KNIGHT] = threatByLesser[BISHOP] = pos.attacks_by<PAWN>(~us);
threatByLesser[ROOK] =
pos.attacks_by<KNIGHT>(~us) | pos.attacks_by<BISHOP>(~us) | threatByLesser[KNIGHT];
threatByLesser[QUEEN] = pos.attacks_by<ROOK>(~us) | threatByLesser[ROOK];
}
for (auto& m : *this)
{
const Square from = m.from_sq();
const Square to = m.to_sq();
const Piece pc = pos.moved_piece(m);
const PieceType pt = type_of(pc);
const Piece capturedPiece = pos.piece_on(to);
if constexpr (Type == CAPTURES)
m.value =
7 * int(PieceValue[pos.piece_on(m.to_sq())])
+ (*captureHistory)[pos.moved_piece(m)][m.to_sq()][type_of(pos.piece_on(m.to_sq()))];
m.value = (*captureHistory)[pc][to][type_of(capturedPiece)]
+ 7 * int(PieceValue[capturedPiece]) + 1024 * bool(pos.check_squares(pt) & to);
else if constexpr (Type == QUIETS)
{
Piece pc = pos.moved_piece(m);
PieceType pt = type_of(pc);
Square from = m.from_sq();
Square to = m.to_sq();
// histories
m.value = 2 * (*mainHistory)[pos.side_to_move()][m.from_to()];
m.value = 2 * (*mainHistory)[us][m.from_to()];
m.value += 2 * (*pawnHistory)[pawn_structure_index(pos)][pc][to];
m.value += (*continuationHistory[0])[pc][to];
m.value += (*continuationHistory[1])[pc][to];
m.value += (*continuationHistory[2])[pc][to];
m.value += (*continuationHistory[3])[pc][to];
m.value += (*continuationHistory[4])[pc][to] / 3;
m.value += (*continuationHistory[5])[pc][to];
// bonus for checks
m.value += bool(pos.check_squares(pt) & to) * 16384;
m.value += (bool(pos.check_squares(pt) & to) && pos.see_ge(m, -75)) * 16384;
// bonus for escaping from capture
m.value += threatenedPieces & from ? (pt == QUEEN && !(to & threatenedByRook) ? 51700
: pt == ROOK && !(to & threatenedByMinor) ? 25600
: !(to & threatenedByPawn) ? 14450
: 0)
: 0;
// malus for putting piece en prise
m.value -= (pt == QUEEN ? bool(to & threatenedByRook) * 49000
: pt == ROOK && bool(to & threatenedByMinor) ? 24335
: 0);
// penalty for moving to a square threatened by a lesser piece
// or bonus for escaping an attack by a lesser piece.
if (KNIGHT <= pt && pt <= QUEEN)
{
static constexpr int bonus[QUEEN + 1] = {0, 0, 144, 144, 256, 517};
int v = threatByLesser[pt] & to ? -95 : 100 * bool(threatByLesser[pt] & from);
m.value += bonus[pt] * v;
}
if (ply < LOW_PLY_HISTORY_SIZE)
m.value += 8 * (*lowPlyHistory)[ply][m.from_to()] / (1 + 2 * ply);
m.value += 8 * (*lowPlyHistory)[ply][m.from_to()] / (1 + ply);
}
else // Type == EVASIONS
{
if (pos.capture_stage(m))
m.value = PieceValue[pos.piece_on(m.to_sq())] + (1 << 28);
m.value = PieceValue[capturedPiece] + (1 << 28);
else
m.value = (*mainHistory)[pos.side_to_move()][m.from_to()]
+ (*continuationHistory[0])[pos.moved_piece(m)][m.to_sq()]
+ (*pawnHistory)[pawn_structure_index(pos)][pos.moved_piece(m)][m.to_sq()];
{
m.value = (*mainHistory)[us][m.from_to()] + (*continuationHistory[0])[pc][to];
if (ply < LOW_PLY_HISTORY_SIZE)
m.value += 2 * (*lowPlyHistory)[ply][m.from_to()] / (1 + ply);
}
}
}
}
// Returns the next move satisfying a predicate function.
@@ -200,7 +196,7 @@ void MovePicker::score() {
template<typename Pred>
Move MovePicker::select(Pred filter) {
for (; cur < endMoves; ++cur)
for (; cur < endCur; ++cur)
if (*cur != ttMove && filter())
return *cur++;
@@ -212,8 +208,7 @@ Move MovePicker::select(Pred filter) {
// picking the move with the highest score from a list of generated moves.
Move MovePicker::next_move() {
auto quiet_threshold = [](Depth d) { return -3560 * d; };
constexpr int goodQuietThreshold = -14000;
top:
switch (stage)
{
@@ -229,18 +224,19 @@ top:
case PROBCUT_INIT :
case QCAPTURE_INIT :
cur = endBadCaptures = moves;
endMoves = generate<CAPTURES>(pos, cur);
endCur = endCaptures = generate<CAPTURES>(pos, cur);
score<CAPTURES>();
partial_insertion_sort(cur, endMoves, std::numeric_limits<int>::min());
partial_insertion_sort(cur, endCur, std::numeric_limits<int>::min());
++stage;
goto top;
case GOOD_CAPTURE :
if (select([&]() {
// Move losing capture to endBadCaptures to be tried later
return pos.see_ge(*cur, -cur->value / 18) ? true
: (*endBadCaptures++ = *cur, false);
if (pos.see_ge(*cur, -cur->value / 18))
return true;
std::swap(*endBadCaptures++, *cur);
return false;
}))
return *(cur - 1);
@@ -250,29 +246,22 @@ top:
case QUIET_INIT :
if (!skipQuiets)
{
cur = endBadCaptures;
endMoves = beginBadQuiets = endBadQuiets = generate<QUIETS>(pos, cur);
endCur = endGenerated = generate<QUIETS>(pos, cur);
score<QUIETS>();
partial_insertion_sort(cur, endMoves, quiet_threshold(depth));
partial_insertion_sort(cur, endCur, -3560 * depth);
}
++stage;
[[fallthrough]];
case GOOD_QUIET :
if (!skipQuiets && select([]() { return true; }))
{
if ((cur - 1)->value > -7998 || (cur - 1)->value <= quiet_threshold(depth))
return *(cur - 1);
// Remaining quiets are bad
beginBadQuiets = cur - 1;
}
if (!skipQuiets && select([&]() { return cur->value > goodQuietThreshold; }))
return *(cur - 1);
// Prepare the pointers to loop over the bad captures
cur = moves;
endMoves = endBadCaptures;
cur = moves;
endCur = endBadCaptures;
++stage;
[[fallthrough]];
@@ -281,25 +270,25 @@ top:
if (select([]() { return true; }))
return *(cur - 1);
// Prepare the pointers to loop over the bad quiets
cur = beginBadQuiets;
endMoves = endBadQuiets;
// Prepare the pointers to loop over quiets again
cur = endCaptures;
endCur = endGenerated;
++stage;
[[fallthrough]];
case BAD_QUIET :
if (!skipQuiets)
return select([]() { return true; });
return select([&]() { return cur->value <= goodQuietThreshold; });
return Move::none();
case EVASION_INIT :
cur = moves;
endMoves = generate<EVASIONS>(pos, cur);
cur = moves;
endCur = endGenerated = generate<EVASIONS>(pos, cur);
score<EVASIONS>();
partial_insertion_sort(cur, endMoves, std::numeric_limits<int>::min());
partial_insertion_sort(cur, endCur, std::numeric_limits<int>::min());
++stage;
[[fallthrough]];
@@ -317,4 +306,18 @@ top:
void MovePicker::skip_quiet_moves() { skipQuiets = true; }
// this function must be called after all quiet moves and captures have been generated
bool MovePicker::can_move_king_or_pawn() const {
// SEE negative captures shouldn't be returned in GOOD_CAPTURE stage
assert(stage > GOOD_CAPTURE && stage != EVASION_INIT);
for (const ExtMove* m = moves; m < endGenerated; ++m)
{
PieceType movedPieceType = type_of(pos.moved_piece(*m));
if ((movedPieceType == PAWN || movedPieceType == KING) && pos.legal(*m))
return true;
}
return false;
}
} // namespace Stockfish
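
The new quiet-move threat term in `score()` can be read in isolation. The sketch below reproduces only that arithmetic, with two booleans standing in for the bitboard tests `threatByLesser[pt] & to` and `threatByLesser[pt] & from`; the function name `threat_adjustment` and the boolean parameters are assumptions made for illustration.

```cpp
#include <cassert>

enum PieceType { NO_PIECE_TYPE, PAWN, KNIGHT, BISHOP, ROOK, QUEEN, KING };

// toAttacked:   destination square is attacked by a lesser enemy piece
// fromAttacked: origin square is attacked by a lesser enemy piece
inline int threat_adjustment(PieceType pt, bool toAttacked, bool fromAttacked) {
    // Pawns and kings get no adjustment, as in the diff.
    if (pt < KNIGHT || pt > QUEEN)
        return 0;

    // Per-piece weights copied from the diff.
    constexpr int bonus[QUEEN + 1] = {0, 0, 144, 144, 256, 517};

    // Moving onto an attacked square is penalized; otherwise escaping
    // from an attacked square is rewarded.
    const int v = toAttacked ? -95 : 100 * int(fromAttacked);
    return bonus[pt] * v;
}
```

With these weights a queen escaping a lesser attacker scores 517 * 100 = 51700, the same as the old hard-coded queen bonus, while the other resulting values sit close to the constants the commit removes.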


@@ -50,6 +50,7 @@ class MovePicker {
MovePicker(const Position&, Move, int, const CapturePieceToHistory*);
Move next_move();
void skip_quiet_moves();
bool can_move_king_or_pawn() const;
private:
template<typename Pred>
@@ -57,7 +58,7 @@ class MovePicker {
template<GenType>
void score();
ExtMove* begin() { return cur; }
ExtMove* end() { return endMoves; }
ExtMove* end() { return endCur; }
const Position& pos;
const ButterflyHistory* mainHistory;
@@ -66,7 +67,7 @@ class MovePicker {
const PieceToHistory** continuationHistory;
const PawnHistory* pawnHistory;
Move ttMove;
ExtMove * cur, *endMoves, *endBadCaptures, *beginBadQuiets, *endBadQuiets;
ExtMove * cur, *endCur, *endBadCaptures, *endCaptures, *endGenerated;
int stage;
int threshold;
Depth depth;
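
With `beginBadQuiets` and `endBadQuiets` gone, the picker appears to walk the same quiet range twice, filtering by `goodQuietThreshold` each time. The sketch below shows that ordering on a plain vector; `ScoredMove` and `two_pass_order` are hypothetical, and the real picker also returns bad captures between the two passes.

```cpp
#include <cassert>
#include <vector>

struct ScoredMove {
    int id;     // stand-in for a real move
    int value;  // ordering score
};

// Same value as goodQuietThreshold in the diff.
constexpr int kGoodQuietThreshold = -14000;

// Returns move ids in the order two filtered passes over one range yield them.
inline std::vector<int> two_pass_order(const std::vector<ScoredMove>& quiets) {
    std::vector<int> order;

    // First pass (GOOD_QUIET): only moves strictly above the threshold.
    for (const auto& m : quiets)
        if (m.value > kGoodQuietThreshold)
            order.push_back(m.id);

    // Second pass (BAD_QUIET): the complementary set, in the same relative order.
    for (const auto& m : quiets)
        if (m.value <= kGoodQuietThreshold)
            order.push_back(m.id);

    return order;
}
```

Because the two predicates are exact complements, every quiet move is returned once and no extra pointers are needed to remember where the bad quiets start.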


@@ -23,7 +23,7 @@
#include "../../bitboard.h"
#include "../../position.h"
#include "../../types.h"
#include "../nnue_accumulator.h"
#include "../nnue_common.h"
namespace Stockfish::Eval::NNUE::Features {
@@ -58,13 +58,15 @@ void HalfKAv2_hm::append_changed_indices(Square ksq,
const DirtyPiece& dp,
IndexList& removed,
IndexList& added) {
for (int i = 0; i < dp.dirty_num; ++i)
{
if (dp.from[i] != SQ_NONE)
removed.push_back(make_index<Perspective>(dp.from[i], dp.piece[i], ksq));
if (dp.to[i] != SQ_NONE)
added.push_back(make_index<Perspective>(dp.to[i], dp.piece[i], ksq));
}
removed.push_back(make_index<Perspective>(dp.from, dp.pc, ksq));
if (dp.to != SQ_NONE)
added.push_back(make_index<Perspective>(dp.to, dp.pc, ksq));
if (dp.remove_sq != SQ_NONE)
removed.push_back(make_index<Perspective>(dp.remove_sq, dp.remove_pc, ksq));
if (dp.add_sq != SQ_NONE)
added.push_back(make_index<Perspective>(dp.add_sq, dp.add_pc, ksq));
}
// Explicit template instantiations
@@ -78,7 +80,7 @@ template void HalfKAv2_hm::append_changed_indices<BLACK>(Square ksq,
IndexList& added);
bool HalfKAv2_hm::requires_refresh(const DirtyPiece& dirtyPiece, Color perspective) {
return dirtyPiece.piece[0] == make_piece(perspective, KING);
return dirtyPiece.pc == make_piece(perspective, KING);
}
} // namespace Stockfish::Eval::NNUE::Features
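
The hunk above replaces the array-based `DirtyPiece` loop with fixed fields. The following sketch mirrors the new control flow using plain integers; the struct layout, the `SQ_NONE` value, and the names `DirtyPieceSketch`, `Change` and `append_changes` are assumptions for illustration, not the actual Stockfish definitions.

```cpp
#include <cassert>
#include <vector>

constexpr int SQ_NONE = 64;  // assumed sentinel for "no square"

// Hypothetical stand-in for the new DirtyPiece fields used in the diff.
struct DirtyPieceSketch {
    int pc;         // moved piece
    int from;       // origin square, always valid
    int to;         // destination square, or SQ_NONE
    int remove_sq;  // square of an additionally removed piece, or SQ_NONE
    int remove_pc;
    int add_sq;     // square of an additionally added piece, or SQ_NONE
    int add_pc;
};

struct Change {
    int sq;
    int pc;
};

// Same branch structure as the new append_changed_indices, minus make_index.
inline void append_changes(const DirtyPieceSketch& dp,
                           std::vector<Change>&    removed,
                           std::vector<Change>&    added) {
    removed.push_back({dp.from, dp.pc});
    if (dp.to != SQ_NONE)
        added.push_back({dp.to, dp.pc});
    if (dp.remove_sq != SQ_NONE)
        removed.push_back({dp.remove_sq, dp.remove_pc});
    if (dp.add_sq != SQ_NONE)
        added.push_back({dp.add_sq, dp.add_pc});
}
```

For a capture, under these assumptions, the moved piece contributes one removal and one addition and the captured piece contributes a second removal.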


@@ -25,7 +25,7 @@
#include <iostream>
#include "../nnue_common.h"
#include "simd.h"
#include "../simd.h"
/*
This file contains the definition for a fully connected layer (aka affine transform).
@@ -102,7 +102,7 @@ static void affine_transform_non_ssse3(std::int32_t* output,
product = vmlal_s8(product, inputVector[j * 2 + 1], row[j * 2 + 1]);
sum = vpadalq_s16(sum, product);
}
output[i] = Simd::neon_m128_reduce_add_epi32(sum);
output[i] = SIMD::neon_m128_reduce_add_epi32(sum);
#endif
}
@@ -191,20 +191,20 @@ class AffineTransform {
#if defined(USE_AVX512)
using vec_t = __m512i;
#define vec_set_32 _mm512_set1_epi32
#define vec_add_dpbusd_32 Simd::m512_add_dpbusd_epi32
#define vec_add_dpbusd_32 SIMD::m512_add_dpbusd_epi32
#elif defined(USE_AVX2)
using vec_t = __m256i;
#define vec_set_32 _mm256_set1_epi32
#define vec_add_dpbusd_32 Simd::m256_add_dpbusd_epi32
#define vec_add_dpbusd_32 SIMD::m256_add_dpbusd_epi32
#elif defined(USE_SSSE3)
using vec_t = __m128i;
#define vec_set_32 _mm_set1_epi32
#define vec_add_dpbusd_32 Simd::m128_add_dpbusd_epi32
#define vec_add_dpbusd_32 SIMD::m128_add_dpbusd_epi32
#elif defined(USE_NEON_DOTPROD)
using vec_t = int32x4_t;
#define vec_set_32 vdupq_n_s32
#define vec_add_dpbusd_32(acc, a, b) \
Simd::dotprod_m128_add_dpbusd_epi32(acc, vreinterpretq_s8_s32(a), \
SIMD::dotprod_m128_add_dpbusd_epi32(acc, vreinterpretq_s8_s32(a), \
vreinterpretq_s8_s32(b))
#endif
@@ -245,23 +245,20 @@ class AffineTransform {
#if defined(USE_AVX2)
using vec_t = __m256i;
#define vec_setzero() _mm256_setzero_si256()
#define vec_set_32 _mm256_set1_epi32
#define vec_add_dpbusd_32 Simd::m256_add_dpbusd_epi32
#define vec_hadd Simd::m256_hadd
#define vec_add_dpbusd_32 SIMD::m256_add_dpbusd_epi32
#define vec_hadd SIMD::m256_hadd
#elif defined(USE_SSSE3)
using vec_t = __m128i;
#define vec_setzero() _mm_setzero_si128()
#define vec_set_32 _mm_set1_epi32
#define vec_add_dpbusd_32 Simd::m128_add_dpbusd_epi32
#define vec_hadd Simd::m128_hadd
#define vec_add_dpbusd_32 SIMD::m128_add_dpbusd_epi32
#define vec_hadd SIMD::m128_hadd
#elif defined(USE_NEON_DOTPROD)
using vec_t = int32x4_t;
#define vec_setzero() vdupq_n_s32(0)
#define vec_set_32 vdupq_n_s32
#define vec_add_dpbusd_32(acc, a, b) \
Simd::dotprod_m128_add_dpbusd_epi32(acc, vreinterpretq_s8_s32(a), \
SIMD::dotprod_m128_add_dpbusd_epi32(acc, vreinterpretq_s8_s32(a), \
vreinterpretq_s8_s32(b))
#define vec_hadd Simd::neon_m128_hadd
#define vec_hadd SIMD::neon_m128_hadd
#endif
const auto inputVector = reinterpret_cast<const vec_t*>(input);
@@ -282,7 +279,6 @@ class AffineTransform {
output[0] = vec_hadd(sum0, biases[0]);
#undef vec_setzero
#undef vec_set_32
#undef vec_add_dpbusd_32
#undef vec_hadd
}


@@ -22,14 +22,12 @@
#define NNUE_LAYERS_AFFINE_TRANSFORM_SPARSE_INPUT_H_INCLUDED
#include <algorithm>
#include <array>
#include <cstdint>
#include <iostream>
#include "../../bitboard.h"
#include "../simd.h"
#include "../nnue_common.h"
#include "affine_transform.h"
#include "simd.h"
/*
This file contains the definition for a fully connected layer (aka affine transform) with block sparse input.
@@ -51,11 +49,7 @@ constexpr int constexpr_lsb(uint64_t bb) {
alignas(CacheLineSize) static constexpr struct OffsetIndices {
#if (USE_SSE41)
std::uint8_t offset_indices[256][8];
#else
std::uint16_t offset_indices[256][8];
#endif
constexpr OffsetIndices() :
offset_indices() {
@@ -74,56 +68,52 @@ alignas(CacheLineSize) static constexpr struct OffsetIndices {
} Lookup;
#if defined(__GNUC__) || defined(__clang__)
#define RESTRICT __restrict__
#elif defined(_MSC_VER)
#define RESTRICT __restrict
#else
#define RESTRICT
#endif
// Find indices of nonzero numbers in an int32_t array
template<const IndexType InputDimensions>
void find_nnz(const std::int32_t* input, std::uint16_t* out, IndexType& count_out) {
#if defined(USE_SSSE3)
#if defined(USE_AVX512)
using vec_t = __m512i;
#define vec_nnz(a) _mm512_cmpgt_epi32_mask(a, _mm512_setzero_si512())
#elif defined(USE_AVX2)
using vec_t = __m256i;
#if defined(USE_VNNI) && !defined(USE_AVXVNNI)
#define vec_nnz(a) _mm256_cmpgt_epi32_mask(a, _mm256_setzero_si256())
#else
#define vec_nnz(a) \
_mm256_movemask_ps( \
_mm256_castsi256_ps(_mm256_cmpgt_epi32(a, _mm256_setzero_si256())))
#endif
#elif defined(USE_SSSE3)
using vec_t = __m128i;
#define vec_nnz(a) \
_mm_movemask_ps(_mm_castsi128_ps(_mm_cmpgt_epi32(a, _mm_setzero_si128())))
#endif
using vec128_t = __m128i;
#define vec128_zero _mm_setzero_si128()
#define vec128_set_16(a) _mm_set1_epi16(a)
#if (USE_SSE41)
#define vec128_load(a) _mm_cvtepu8_epi16(_mm_loadl_epi64(a))
#else
#define vec128_load(a) _mm_load_si128(a)
#endif
#define vec128_storeu(a, b) _mm_storeu_si128(a, b)
#define vec128_add(a, b) _mm_add_epi16(a, b)
#elif defined(USE_NEON)
using vec_t = uint32x4_t;
static const std::uint32_t Mask[4] = {1, 2, 4, 8};
#define vec_nnz(a) vaddvq_u32(vandq_u32(vtstq_u32(a, a), vld1q_u32(Mask)))
using vec128_t = uint16x8_t;
#define vec128_zero vdupq_n_u16(0)
#define vec128_set_16(a) vdupq_n_u16(a)
#define vec128_load(a) vld1q_u16(reinterpret_cast<const std::uint16_t*>(a))
#define vec128_storeu(a, b) vst1q_u16(reinterpret_cast<std::uint16_t*>(a), b)
#define vec128_add(a, b) vaddq_u16(a, b)
#endif
constexpr IndexType InputSimdWidth = sizeof(vec_t) / sizeof(std::int32_t);
void find_nnz(const std::int32_t* RESTRICT input,
std::uint16_t* RESTRICT out,
IndexType& count_out) {
#ifdef USE_AVX512
constexpr IndexType SimdWidth = 16; // 512 bits / 32 bits
constexpr IndexType NumChunks = InputDimensions / SimdWidth;
const __m512i increment = _mm512_set1_epi32(SimdWidth);
__m512i base = _mm512_set_epi32(15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0);
IndexType count = 0;
for (IndexType i = 0; i < NumChunks; ++i)
{
const __m512i inputV = _mm512_load_si512(input + i * SimdWidth);
// Get a bitmask and gather non zero indices
const __mmask16 nnzMask = _mm512_test_epi32_mask(inputV, inputV);
const __m512i nnzV = _mm512_maskz_compress_epi32(nnzMask, base);
_mm512_mask_cvtepi32_storeu_epi16(out + count, 0xFFFF, nnzV);
count += popcount(nnzMask);
base = _mm512_add_epi32(base, increment);
}
count_out = count;
#else
using namespace SIMD;
constexpr IndexType InputSimdWidth = sizeof(vec_uint_t) / sizeof(std::int32_t);
// Inputs are processed InputSimdWidth at a time and outputs are processed 8 at a time so we process in chunks of max(InputSimdWidth, 8)
constexpr IndexType ChunkSize = std::max<IndexType>(InputSimdWidth, 8);
constexpr IndexType NumChunks = InputDimensions / ChunkSize;
constexpr IndexType InputsPerChunk = ChunkSize / InputSimdWidth;
constexpr IndexType OutputsPerChunk = ChunkSize / 8;
const auto inputVector = reinterpret_cast<const vec_t*>(input);
const auto inputVector = reinterpret_cast<const vec_uint_t*>(input);
IndexType count = 0;
vec128_t base = vec128_zero;
const vec128_t increment = vec128_set_16(8);
@@ -133,7 +123,7 @@ void find_nnz(const std::int32_t* input, std::uint16_t* out, IndexType& count_ou
unsigned nnz = 0;
for (IndexType j = 0; j < InputsPerChunk; ++j)
{
const vec_t inputChunk = inputVector[i * InputsPerChunk + j];
const vec_uint_t inputChunk = inputVector[i * InputsPerChunk + j];
nnz |= unsigned(vec_nnz(inputChunk)) << (j * InputSimdWidth);
}
for (IndexType j = 0; j < OutputsPerChunk; ++j)
@@ -147,13 +137,9 @@ void find_nnz(const std::int32_t* input, std::uint16_t* out, IndexType& count_ou
}
}
count_out = count;
#endif
}
#undef vec_nnz
#undef vec128_zero
#undef vec128_set_16
#undef vec128_load
#undef vec128_storeu
#undef vec128_add
#endif
// Sparse input implementation
@@ -232,27 +218,27 @@ class AffineTransformSparseInput {
using invec_t = __m512i;
using outvec_t = __m512i;
#define vec_set_32 _mm512_set1_epi32
#define vec_add_dpbusd_32 Simd::m512_add_dpbusd_epi32
#define vec_add_dpbusd_32 SIMD::m512_add_dpbusd_epi32
#elif defined(USE_AVX2)
using invec_t = __m256i;
using outvec_t = __m256i;
#define vec_set_32 _mm256_set1_epi32
#define vec_add_dpbusd_32 Simd::m256_add_dpbusd_epi32
#define vec_add_dpbusd_32 SIMD::m256_add_dpbusd_epi32
#elif defined(USE_SSSE3)
using invec_t = __m128i;
using outvec_t = __m128i;
#define vec_set_32 _mm_set1_epi32
#define vec_add_dpbusd_32 Simd::m128_add_dpbusd_epi32
#define vec_add_dpbusd_32 SIMD::m128_add_dpbusd_epi32
#elif defined(USE_NEON_DOTPROD)
using invec_t = int8x16_t;
using outvec_t = int32x4_t;
#define vec_set_32(a) vreinterpretq_s8_u32(vdupq_n_u32(a))
#define vec_add_dpbusd_32 Simd::dotprod_m128_add_dpbusd_epi32
#define vec_add_dpbusd_32 SIMD::dotprod_m128_add_dpbusd_epi32
#elif defined(USE_NEON)
using invec_t = int8x16_t;
using outvec_t = int32x4_t;
#define vec_set_32(a) vreinterpretq_s8_u32(vdupq_n_u32(a))
#define vec_add_dpbusd_32 Simd::neon_m128_add_dpbusd_epi32
#define vec_add_dpbusd_32 SIMD::neon_m128_add_dpbusd_epi32
#endif
static constexpr IndexType OutputSimdWidth = sizeof(outvec_t) / sizeof(OutputType);
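
As a reference point for the vectorized `find_nnz` paths above, here is a scalar sketch of what they compute: the indices of the nonzero 32-bit inputs, written densely to `out`. The name `find_nnz_scalar` and the explicit length parameter are assumptions; the real function takes the length as a template argument and reports the count through a reference.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Writes the index of every nonzero element of input[0..n) to out and
// returns how many indices were written. out must have room for n entries.
inline std::size_t find_nnz_scalar(const std::int32_t* input,
                                   std::size_t         n,
                                   std::uint16_t*      out) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i)
        if (input[i] != 0)
            out[count++] = static_cast<std::uint16_t>(i);
    return count;
}
```

The AVX512 path in the diff does the same thing sixteen elements at a time: a test mask marks the nonzero lanes, a compress packs the matching indices, and a popcount advances the output position.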


@@ -1,134 +0,0 @@
/*
Stockfish, a UCI chess playing engine derived from Glaurung 2.1
Copyright (C) 2004-2025 The Stockfish developers (see AUTHORS file)
Stockfish is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
Stockfish is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef STOCKFISH_SIMD_H_INCLUDED
#define STOCKFISH_SIMD_H_INCLUDED
#if defined(USE_AVX2)
#include <immintrin.h>
#elif defined(USE_SSE41)
#include <smmintrin.h>
#elif defined(USE_SSSE3)
#include <tmmintrin.h>
#elif defined(USE_SSE2)
#include <emmintrin.h>
#elif defined(USE_NEON)
#include <arm_neon.h>
#endif
namespace Stockfish::Simd {
#if defined(USE_AVX512)
[[maybe_unused]] static int m512_hadd(__m512i sum, int bias) {
return _mm512_reduce_add_epi32(sum) + bias;
}
[[maybe_unused]] static void m512_add_dpbusd_epi32(__m512i& acc, __m512i a, __m512i b) {
#if defined(USE_VNNI)
acc = _mm512_dpbusd_epi32(acc, a, b);
#else
__m512i product0 = _mm512_maddubs_epi16(a, b);
product0 = _mm512_madd_epi16(product0, _mm512_set1_epi16(1));
acc = _mm512_add_epi32(acc, product0);
#endif
}
#endif
#if defined(USE_AVX2)
[[maybe_unused]] static int m256_hadd(__m256i sum, int bias) {
__m128i sum128 = _mm_add_epi32(_mm256_castsi256_si128(sum), _mm256_extracti128_si256(sum, 1));
sum128 = _mm_add_epi32(sum128, _mm_shuffle_epi32(sum128, _MM_PERM_BADC));
sum128 = _mm_add_epi32(sum128, _mm_shuffle_epi32(sum128, _MM_PERM_CDAB));
return _mm_cvtsi128_si32(sum128) + bias;
}
[[maybe_unused]] static void m256_add_dpbusd_epi32(__m256i& acc, __m256i a, __m256i b) {
#if defined(USE_VNNI)
acc = _mm256_dpbusd_epi32(acc, a, b);
#else
__m256i product0 = _mm256_maddubs_epi16(a, b);
product0 = _mm256_madd_epi16(product0, _mm256_set1_epi16(1));
acc = _mm256_add_epi32(acc, product0);
#endif
}
#endif
#if defined(USE_SSSE3)
[[maybe_unused]] static int m128_hadd(__m128i sum, int bias) {
sum = _mm_add_epi32(sum, _mm_shuffle_epi32(sum, 0x4E)); //_MM_PERM_BADC
sum = _mm_add_epi32(sum, _mm_shuffle_epi32(sum, 0xB1)); //_MM_PERM_CDAB
return _mm_cvtsi128_si32(sum) + bias;
}
[[maybe_unused]] static void m128_add_dpbusd_epi32(__m128i& acc, __m128i a, __m128i b) {
__m128i product0 = _mm_maddubs_epi16(a, b);
product0 = _mm_madd_epi16(product0, _mm_set1_epi16(1));
acc = _mm_add_epi32(acc, product0);
}
#endif
#if defined(USE_NEON_DOTPROD)
[[maybe_unused]] static void
dotprod_m128_add_dpbusd_epi32(int32x4_t& acc, int8x16_t a, int8x16_t b) {
acc = vdotq_s32(acc, a, b);
}
#endif
#if defined(USE_NEON)
[[maybe_unused]] static int neon_m128_reduce_add_epi32(int32x4_t s) {
#if USE_NEON >= 8
return vaddvq_s32(s);
#else
return s[0] + s[1] + s[2] + s[3];
#endif
}
[[maybe_unused]] static int neon_m128_hadd(int32x4_t sum, int bias) {
return neon_m128_reduce_add_epi32(sum) + bias;
}
#endif
#if USE_NEON >= 8
[[maybe_unused]] static void neon_m128_add_dpbusd_epi32(int32x4_t& acc, int8x16_t a, int8x16_t b) {
int16x8_t product0 = vmull_s8(vget_low_s8(a), vget_low_s8(b));
int16x8_t product1 = vmull_high_s8(a, b);
int16x8_t sum = vpaddq_s16(product0, product1);
acc = vpadalq_s16(acc, sum);
}
#endif
}
#endif // STOCKFISH_SIMD_H_INCLUDED
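
The x86 `*_add_dpbusd_epi32` helpers in the removed header all implement the same per-lane operation. A scalar sketch of a single 32-bit lane, assuming the VNNI semantics of unsigned 8-bit times signed 8-bit products accumulated into 32 bits, looks like this; `dpbusd_lane` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// One 32-bit lane of a dpbusd-style update:
// acc += a[0]*b[0] + a[1]*b[1] + a[2]*b[2] + a[3]*b[3],
// where a holds unsigned bytes and b holds signed bytes.
inline std::int32_t dpbusd_lane(std::int32_t acc, const std::uint8_t a[4], const std::int8_t b[4]) {
    for (int k = 0; k < 4; ++k)
        acc += std::int32_t(a[k]) * std::int32_t(b[k]);
    return acc;
}
```

Note that the non-VNNI fallback in the header goes through `maddubs`, which saturates each pairwise sum to 16 bits, so the two paths only agree when those intermediate sums stay in range.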


@@ -212,21 +212,11 @@ NetworkOutput
Network<Arch, Transformer>::evaluate(const Position& pos,
AccumulatorStack& accumulatorStack,
AccumulatorCaches::Cache<FTDimensions>* cache) const {
// We manually align the arrays on the stack because with gcc < 9.3
// overaligning stack variables with alignas() doesn't work correctly.
constexpr uint64_t alignment = CacheLineSize;
#if defined(ALIGNAS_ON_STACK_VARIABLES_BROKEN)
TransformedFeatureType
transformedFeaturesUnaligned[FeatureTransformer<FTDimensions, nullptr>::BufferSize
+ alignment / sizeof(TransformedFeatureType)];
auto* transformedFeatures = align_ptr_up<alignment>(&transformedFeaturesUnaligned[0]);
#else
alignas(alignment) TransformedFeatureType
transformedFeatures[FeatureTransformer<FTDimensions, nullptr>::BufferSize];
#endif
alignas(alignment)
TransformedFeatureType transformedFeatures[FeatureTransformer<FTDimensions>::BufferSize];
ASSERT_ALIGNED(transformedFeatures, alignment);
@@ -284,20 +274,11 @@ NnueEvalTrace
Network<Arch, Transformer>::trace_evaluate(const Position& pos,
AccumulatorStack& accumulatorStack,
AccumulatorCaches::Cache<FTDimensions>* cache) const {
// We manually align the arrays on the stack because with gcc < 9.3
// overaligning stack variables with alignas() doesn't work correctly.
constexpr uint64_t alignment = CacheLineSize;
#if defined(ALIGNAS_ON_STACK_VARIABLES_BROKEN)
TransformedFeatureType
transformedFeaturesUnaligned[FeatureTransformer<FTDimensions, nullptr>::BufferSize
+ alignment / sizeof(TransformedFeatureType)];
auto* transformedFeatures = align_ptr_up<alignment>(&transformedFeaturesUnaligned[0]);
#else
alignas(alignment) TransformedFeatureType
transformedFeatures[FeatureTransformer<FTDimensions, nullptr>::BufferSize];
#endif
alignas(alignment)
TransformedFeatureType transformedFeatures[FeatureTransformer<FTDimensions>::BufferSize];
ASSERT_ALIGNED(transformedFeatures, alignment);
@@ -452,12 +433,10 @@ bool Network<Arch, Transformer>::write_parameters(std::ostream& stream,
// Explicit template instantiations
template class Network<
NetworkArchitecture<TransformedFeatureDimensionsBig, L2Big, L3Big>,
FeatureTransformer<TransformedFeatureDimensionsBig, &AccumulatorState::accumulatorBig>>;
template class Network<NetworkArchitecture<TransformedFeatureDimensionsBig, L2Big, L3Big>,
FeatureTransformer<TransformedFeatureDimensionsBig>>;
template class Network<
NetworkArchitecture<TransformedFeatureDimensionsSmall, L2Small, L3Small>,
FeatureTransformer<TransformedFeatureDimensionsSmall, &AccumulatorState::accumulatorSmall>>;
template class Network<NetworkArchitecture<TransformedFeatureDimensionsSmall, L2Small, L3Small>,
FeatureTransformer<TransformedFeatureDimensionsSmall>>;
} // namespace Stockfish::Eval::NNUE


@@ -32,6 +32,7 @@
#include "../types.h"
#include "nnue_accumulator.h"
#include "nnue_architecture.h"
#include "nnue_common.h"
#include "nnue_feature_transformer.h"
#include "nnue_misc.h"
@@ -110,13 +111,11 @@ class Network {
};
// Definitions of the network types
using SmallFeatureTransformer =
FeatureTransformer<TransformedFeatureDimensionsSmall, &AccumulatorState::accumulatorSmall>;
using SmallFeatureTransformer = FeatureTransformer<TransformedFeatureDimensionsSmall>;
using SmallNetworkArchitecture =
NetworkArchitecture<TransformedFeatureDimensionsSmall, L2Small, L3Small>;
using BigFeatureTransformer =
FeatureTransformer<TransformedFeatureDimensionsBig, &AccumulatorState::accumulatorBig>;
using BigFeatureTransformer = FeatureTransformer<TransformedFeatureDimensionsBig>;
using BigNetworkArchitecture = NetworkArchitecture<TransformedFeatureDimensionsBig, L2Big, L3Big>;
using NetworkBig = Network<BigNetworkArchitecture, BigFeatureTransformer>;


@@ -19,49 +19,43 @@
#include "nnue_accumulator.h"
#include <cassert>
#include <cstdint>
#include <initializer_list>
#include <memory>
#include <type_traits>
#include "../bitboard.h"
#include "../misc.h"
#include "../position.h"
#include "../types.h"
#include "network.h"
#include "nnue_architecture.h"
#include "nnue_common.h"
#include "nnue_feature_transformer.h"
#include "nnue_feature_transformer.h" // IWYU pragma: keep
#include "simd.h"
namespace Stockfish::Eval::NNUE {
#if defined(__GNUC__) && !defined(__clang__)
#define sf_assume(cond) \
do \
{ \
if (!(cond)) \
__builtin_unreachable(); \
} while (0)
#else
// do nothing for other compilers
#define sf_assume(cond)
#endif
using namespace SIMD;
namespace {
template<Color Perspective,
IncUpdateDirection Direction = FORWARD,
IndexType TransformedFeatureDimensions,
Accumulator<TransformedFeatureDimensions> AccumulatorState::*accPtr>
void update_accumulator_incremental(
const FeatureTransformer<TransformedFeatureDimensions, accPtr>& featureTransformer,
const Square ksq,
AccumulatorState& target_state,
const AccumulatorState& computed);
template<Color Perspective, IndexType TransformedFeatureDimensions>
void double_inc_update(const FeatureTransformer<TransformedFeatureDimensions>& featureTransformer,
const Square ksq,
AccumulatorState& middle_state,
AccumulatorState& target_state,
const AccumulatorState& computed);
template<Color Perspective, IndexType Dimensions, Accumulator<Dimensions> AccumulatorState::*accPtr>
void update_accumulator_refresh_cache(
const FeatureTransformer<Dimensions, accPtr>& featureTransformer,
const Position& pos,
AccumulatorState& accumulatorState,
AccumulatorCaches::Cache<Dimensions>& cache);
template<Color Perspective, bool Forward, IndexType TransformedFeatureDimensions>
void update_accumulator_incremental(
const FeatureTransformer<TransformedFeatureDimensions>& featureTransformer,
const Square ksq,
AccumulatorState& target_state,
const AccumulatorState& computed);
template<Color Perspective, IndexType Dimensions>
void update_accumulator_refresh_cache(const FeatureTransformer<Dimensions>& featureTransformer,
const Position& pos,
AccumulatorState& accumulatorState,
AccumulatorCaches::Cache<Dimensions>& cache);
}
@@ -71,63 +65,43 @@ void AccumulatorState::reset(const DirtyPiece& dp) noexcept {
accumulatorSmall.computed.fill(false);
}
const AccumulatorState& AccumulatorStack::latest() const noexcept {
return m_accumulators[m_current_idx - 1];
}
const AccumulatorState& AccumulatorStack::latest() const noexcept { return accumulators[size - 1]; }
AccumulatorState& AccumulatorStack::mut_latest() noexcept {
return m_accumulators[m_current_idx - 1];
}
AccumulatorState& AccumulatorStack::mut_latest() noexcept { return accumulators[size - 1]; }
void AccumulatorStack::reset(const Position& rootPos,
const Networks& networks,
AccumulatorCaches& caches) noexcept {
m_current_idx = 1;
update_accumulator_refresh_cache<WHITE, TransformedFeatureDimensionsBig,
&AccumulatorState::accumulatorBig>(
*networks.big.featureTransformer, rootPos, m_accumulators[0], caches.big);
update_accumulator_refresh_cache<BLACK, TransformedFeatureDimensionsBig,
&AccumulatorState::accumulatorBig>(
*networks.big.featureTransformer, rootPos, m_accumulators[0], caches.big);
update_accumulator_refresh_cache<WHITE, TransformedFeatureDimensionsSmall,
&AccumulatorState::accumulatorSmall>(
*networks.small.featureTransformer, rootPos, m_accumulators[0], caches.small);
update_accumulator_refresh_cache<BLACK, TransformedFeatureDimensionsSmall,
&AccumulatorState::accumulatorSmall>(
*networks.small.featureTransformer, rootPos, m_accumulators[0], caches.small);
void AccumulatorStack::reset() noexcept {
accumulators[0].reset({});
size = 1;
}
void AccumulatorStack::push(const DirtyPiece& dirtyPiece) noexcept {
assert(m_current_idx + 1 < m_accumulators.size());
m_accumulators[m_current_idx].reset(dirtyPiece);
m_current_idx++;
assert(size + 1 < accumulators.size());
accumulators[size].reset(dirtyPiece);
size++;
}
void AccumulatorStack::pop() noexcept {
assert(m_current_idx > 1);
m_current_idx--;
assert(size > 1);
size--;
}
template<IndexType Dimensions, Accumulator<Dimensions> AccumulatorState::*accPtr>
void AccumulatorStack::evaluate(const Position& pos,
const FeatureTransformer<Dimensions, accPtr>& featureTransformer,
AccumulatorCaches::Cache<Dimensions>& cache) noexcept {
template<IndexType Dimensions>
void AccumulatorStack::evaluate(const Position& pos,
const FeatureTransformer<Dimensions>& featureTransformer,
AccumulatorCaches::Cache<Dimensions>& cache) noexcept {
evaluate_side<WHITE>(pos, featureTransformer, cache);
evaluate_side<BLACK>(pos, featureTransformer, cache);
}
template<Color Perspective, IndexType Dimensions, Accumulator<Dimensions> AccumulatorState::*accPtr>
void AccumulatorStack::evaluate_side(
const Position& pos,
const FeatureTransformer<Dimensions, accPtr>& featureTransformer,
AccumulatorCaches::Cache<Dimensions>& cache) noexcept {
template<Color Perspective, IndexType Dimensions>
void AccumulatorStack::evaluate_side(const Position& pos,
const FeatureTransformer<Dimensions>& featureTransformer,
AccumulatorCaches::Cache<Dimensions>& cache) noexcept {
const auto last_usable_accum = find_last_usable_accumulator<Perspective, Dimensions, accPtr>();
const auto last_usable_accum = find_last_usable_accumulator<Perspective, Dimensions>();
if ((m_accumulators[last_usable_accum].*accPtr).computed[Perspective])
if ((accumulators[last_usable_accum].template acc<Dimensions>()).computed[Perspective])
forward_update_incremental<Perspective>(pos, featureTransformer, last_usable_accum);
else
@@ -139,91 +113,202 @@ void AccumulatorStack::evaluate_side(
// Find the last usable accumulator; this can be either a computed accumulator or the accumulator
// state just before a change that requires a full refresh.
template<Color Perspective, IndexType Dimensions, Accumulator<Dimensions> AccumulatorState::*accPtr>
template<Color Perspective, IndexType Dimensions>
std::size_t AccumulatorStack::find_last_usable_accumulator() const noexcept {
for (std::size_t curr_idx = m_current_idx - 1; curr_idx > 0; curr_idx--)
for (std::size_t curr_idx = size - 1; curr_idx > 0; curr_idx--)
{
if ((m_accumulators[curr_idx].*accPtr).computed[Perspective])
if ((accumulators[curr_idx].template acc<Dimensions>()).computed[Perspective])
return curr_idx;
if (FeatureSet::requires_refresh(m_accumulators[curr_idx].dirtyPiece, Perspective))
if (FeatureSet::requires_refresh(accumulators[curr_idx].dirtyPiece, Perspective))
return curr_idx;
}
return 0;
}
template<Color Perspective, IndexType Dimensions, Accumulator<Dimensions> AccumulatorState::*accPtr>
template<Color Perspective, IndexType Dimensions>
void AccumulatorStack::forward_update_incremental(
const Position& pos,
const FeatureTransformer<Dimensions, accPtr>& featureTransformer,
const std::size_t begin) noexcept {
const Position& pos,
const FeatureTransformer<Dimensions>& featureTransformer,
const std::size_t begin) noexcept {
assert(begin < m_accumulators.size());
assert((m_accumulators[begin].*accPtr).computed[Perspective]);
assert(begin < accumulators.size());
assert((accumulators[begin].acc<Dimensions>()).computed[Perspective]);
const Square ksq = pos.square<KING>(Perspective);
for (std::size_t next = begin + 1; next < m_current_idx; next++)
update_accumulator_incremental<Perspective>(featureTransformer, ksq, m_accumulators[next],
m_accumulators[next - 1]);
for (std::size_t next = begin + 1; next < size; next++)
{
if (next + 1 < size)
{
DirtyPiece& dp1 = accumulators[next].dirtyPiece;
DirtyPiece& dp2 = accumulators[next + 1].dirtyPiece;
assert((latest().*accPtr).computed[Perspective]);
if (dp1.to != SQ_NONE && dp1.to == dp2.remove_sq)
{
const Square captureSq = dp1.to;
dp1.to = dp2.remove_sq = SQ_NONE;
double_inc_update<Perspective>(featureTransformer, ksq, accumulators[next],
accumulators[next + 1], accumulators[next - 1]);
dp1.to = dp2.remove_sq = captureSq;
next++;
continue;
}
}
update_accumulator_incremental<Perspective, true>(
featureTransformer, ksq, accumulators[next], accumulators[next - 1]);
}
assert((latest().acc<Dimensions>()).computed[Perspective]);
}
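The new branch in `forward_update_incremental` merges two consecutive updates when the square a piece moves to in the first one is the square a piece is removed from in the second. The feature added by the first update and removed by the second cancels out, so both updates can be applied in one pass from the last computed state. A minimal scalar sketch of that cancellation follows. `Delta`, `apply_delta` and `merge_skipping` are hypothetical names, and a single `int` stands in for the accumulator row; the real code achieves the same effect by temporarily setting `dp1.to` and `dp2.remove_sq` to `SQ_NONE`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model: an update lists the feature indices it adds and removes.
struct Delta {
    std::vector<int> added;
    std::vector<int> removed;
};

// Apply one update to a scalar accumulator using per-feature weights.
int apply_delta(int acc, const std::vector<int>& weights, const Delta& d) {
    for (int i : d.added)
        acc += weights[i];
    for (int i : d.removed)
        acc -= weights[i];
    return acc;
}

// Merge two consecutive updates, dropping the feature that the first one
// adds and the second one removes (the piece that moved and was captured).
Delta merge_skipping(const Delta& first, const Delta& second, int cancelled) {
    Delta merged;
    for (int i : first.added)
        if (i != cancelled)
            merged.added.push_back(i);
    for (int i : second.added)
        merged.added.push_back(i);
    for (int i : first.removed)
        merged.removed.push_back(i);
    for (int i : second.removed)
        if (i != cancelled)
            merged.removed.push_back(i);
    return merged;
}
```

With a quiet move followed by a recapture on the destination square, the merged update has one added and two removed features, which matches the `added.size() == 1` and `removed.size() == 2` assertions in `double_inc_update`.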
template<Color Perspective, IndexType Dimensions, Accumulator<Dimensions> AccumulatorState::*accPtr>
template<Color Perspective, IndexType Dimensions>
void AccumulatorStack::backward_update_incremental(
const Position& pos,
const FeatureTransformer<Dimensions, accPtr>& featureTransformer,
const std::size_t end) noexcept {
const Position& pos,
const FeatureTransformer<Dimensions>& featureTransformer,
const std::size_t end) noexcept {
assert(end < m_accumulators.size());
assert(end < m_current_idx);
assert((latest().*accPtr).computed[Perspective]);
assert(end < accumulators.size());
assert(end < size);
assert((latest().acc<Dimensions>()).computed[Perspective]);
const Square ksq = pos.square<KING>(Perspective);
for (std::size_t next = m_current_idx - 2; next >= end; next--)
update_accumulator_incremental<Perspective, BACKWARDS>(
featureTransformer, ksq, m_accumulators[next], m_accumulators[next + 1]);
for (std::int64_t next = std::int64_t(size) - 2; next >= std::int64_t(end); next--)
update_accumulator_incremental<Perspective, false>(
featureTransformer, ksq, accumulators[next], accumulators[next + 1]);
assert((m_accumulators[end].*accPtr).computed[Perspective]);
assert((accumulators[end].acc<Dimensions>()).computed[Perspective]);
}
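The backward loop above switches its counter from `std::size_t` to `std::int64_t`. With an unsigned counter, the condition `next >= end` can never become false when `end` is 0, because decrementing past zero wraps around. The sketch below isolates that loop shape; `backward_indices` is a hypothetical helper written only for illustration and is not part of the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the indices size-2, size-3, ..., end in the
// order the loop visits them. The signed counter lets the loop terminate
// when end == 0 and also handles size < 2 without wrapping.
std::vector<std::size_t> backward_indices(std::size_t size, std::size_t end) {
    std::vector<std::size_t> visited;
    for (std::int64_t next = std::int64_t(size) - 2; next >= std::int64_t(end); next--)
        visited.push_back(std::size_t(next));
    return visited;
}
```

For example, `backward_indices(4, 0)` visits 2, 1, 0 and then stops, while a stack of size 1 yields no iterations at all.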
// Explicit template instantiations
template void
AccumulatorStack::evaluate<TransformedFeatureDimensionsBig, &AccumulatorState::accumulatorBig>(
const Position& pos,
const FeatureTransformer<TransformedFeatureDimensionsBig, &AccumulatorState::accumulatorBig>&
featureTransformer,
template void AccumulatorStack::evaluate<TransformedFeatureDimensionsBig>(
const Position& pos,
const FeatureTransformer<TransformedFeatureDimensionsBig>& featureTransformer,
AccumulatorCaches::Cache<TransformedFeatureDimensionsBig>& cache) noexcept;
template void
AccumulatorStack::evaluate<TransformedFeatureDimensionsSmall, &AccumulatorState::accumulatorSmall>(
const Position& pos,
const FeatureTransformer<TransformedFeatureDimensionsSmall, &AccumulatorState::accumulatorSmall>&
featureTransformer,
template void AccumulatorStack::evaluate<TransformedFeatureDimensionsSmall>(
const Position& pos,
const FeatureTransformer<TransformedFeatureDimensionsSmall>& featureTransformer,
AccumulatorCaches::Cache<TransformedFeatureDimensionsSmall>& cache) noexcept;
namespace {
template<Color Perspective,
IncUpdateDirection Direction,
IndexType TransformedFeatureDimensions,
Accumulator<TransformedFeatureDimensions> AccumulatorState::*accPtr>
template<typename VectorWrapper,
IndexType Width,
UpdateOperation... ops,
typename ElementType,
typename... Ts,
std::enable_if_t<is_all_same_v<ElementType, Ts...>, bool> = true>
void fused_row_reduce(const ElementType* in, ElementType* out, const Ts* const... rows) {
constexpr IndexType size = Width * sizeof(ElementType) / sizeof(typename VectorWrapper::type);
auto* vecIn = reinterpret_cast<const typename VectorWrapper::type*>(in);
auto* vecOut = reinterpret_cast<typename VectorWrapper::type*>(out);
for (IndexType i = 0; i < size; ++i)
vecOut[i] = fused<VectorWrapper, ops...>(
vecIn[i], reinterpret_cast<const typename VectorWrapper::type*>(rows)[i]...);
}
template<Color Perspective, IndexType Dimensions>
struct AccumulatorUpdateContext {
const FeatureTransformer<Dimensions>& featureTransformer;
const AccumulatorState& from;
AccumulatorState& to;
AccumulatorUpdateContext(const FeatureTransformer<Dimensions>& ft,
const AccumulatorState& accF,
AccumulatorState& accT) noexcept :
featureTransformer{ft},
from{accF},
to{accT} {}
template<UpdateOperation... ops,
typename... Ts,
std::enable_if_t<is_all_same_v<IndexType, Ts...>, bool> = true>
void apply(const Ts... indices) {
auto to_weight_vector = [&](const IndexType index) {
return &featureTransformer.weights[index * Dimensions];
};
auto to_psqt_weight_vector = [&](const IndexType index) {
return &featureTransformer.psqtWeights[index * PSQTBuckets];
};
fused_row_reduce<Vec16Wrapper, Dimensions, ops...>(
(from.acc<Dimensions>()).accumulation[Perspective],
(to.acc<Dimensions>()).accumulation[Perspective], to_weight_vector(indices)...);
fused_row_reduce<Vec32Wrapper, PSQTBuckets, ops...>(
(from.acc<Dimensions>()).psqtAccumulation[Perspective],
(to.acc<Dimensions>()).psqtAccumulation[Perspective], to_psqt_weight_vector(indices)...);
}
};
template<Color Perspective, IndexType Dimensions>
auto make_accumulator_update_context(const FeatureTransformer<Dimensions>& featureTransformer,
const AccumulatorState& accumulatorFrom,
AccumulatorState& accumulatorTo) noexcept {
return AccumulatorUpdateContext<Perspective, Dimensions>{featureTransformer, accumulatorFrom,
accumulatorTo};
}
template<Color Perspective, IndexType TransformedFeatureDimensions>
void double_inc_update(const FeatureTransformer<TransformedFeatureDimensions>& featureTransformer,
const Square ksq,
AccumulatorState& middle_state,
AccumulatorState& target_state,
const AccumulatorState& computed) {
assert(computed.acc<TransformedFeatureDimensions>().computed[Perspective]);
assert(!middle_state.acc<TransformedFeatureDimensions>().computed[Perspective]);
assert(!target_state.acc<TransformedFeatureDimensions>().computed[Perspective]);
FeatureSet::IndexList removed, added;
FeatureSet::append_changed_indices<Perspective>(ksq, middle_state.dirtyPiece, removed, added);
// You can't capture a piece that was just involved in castling, since the rook ends up
// on a square that the king passed over.
assert(added.size() < 2);
FeatureSet::append_changed_indices<Perspective>(ksq, target_state.dirtyPiece, removed, added);
assert(added.size() == 1);
assert(removed.size() == 2 || removed.size() == 3);
// Work around a compiler warning about uninitialized variables, reproduced on
// profile builds on Windows with GCC 14.2.0.
// TODO: remove once no longer needed
sf_assume(added.size() == 1);
sf_assume(removed.size() == 2 || removed.size() == 3);
auto updateContext =
make_accumulator_update_context<Perspective>(featureTransformer, computed, target_state);
if (removed.size() == 2)
{
updateContext.template apply<Add, Sub, Sub>(added[0], removed[0], removed[1]);
}
else
{
updateContext.template apply<Add, Sub, Sub, Sub>(added[0], removed[0], removed[1],
removed[2]);
}
target_state.acc<TransformedFeatureDimensions>().computed[Perspective] = true;
}
template<Color Perspective, bool Forward, IndexType TransformedFeatureDimensions>
void update_accumulator_incremental(
const FeatureTransformer<TransformedFeatureDimensions, accPtr>& featureTransformer,
const Square ksq,
AccumulatorState& target_state,
const AccumulatorState& computed) {
[[maybe_unused]] constexpr bool Forward = Direction == FORWARD;
[[maybe_unused]] constexpr bool Backwards = Direction == BACKWARDS;
const FeatureTransformer<TransformedFeatureDimensions>& featureTransformer,
const Square ksq,
AccumulatorState& target_state,
const AccumulatorState& computed) {
assert(Forward != Backwards);
assert((computed.*accPtr).computed[Perspective]);
assert(!(target_state.*accPtr).computed[Perspective]);
assert((computed.acc<TransformedFeatureDimensions>()).computed[Perspective]);
assert(!(target_state.acc<TransformedFeatureDimensions>()).computed[Perspective]);
// The size must be enough to contain the largest possible update.
// That might depend on the feature set and generally relies on the
@@ -238,188 +323,52 @@ void update_accumulator_incremental(
else
FeatureSet::append_changed_indices<Perspective>(ksq, computed.dirtyPiece, added, removed);
if (removed.size() == 0 && added.size() == 0)
assert(added.size() == 1 || added.size() == 2);
assert(removed.size() == 1 || removed.size() == 2);
assert((Forward && added.size() <= removed.size())
|| (!Forward && added.size() >= removed.size()));
// Work around a compiler warning about uninitialized variables, reproduced on
// profile builds on Windows with GCC 14.2.0.
// TODO: remove once no longer needed
sf_assume(added.size() == 1 || added.size() == 2);
sf_assume(removed.size() == 1 || removed.size() == 2);
auto updateContext =
make_accumulator_update_context<Perspective>(featureTransformer, computed, target_state);
if ((Forward && removed.size() == 1) || (!Forward && added.size() == 1))
{
std::memcpy((target_state.*accPtr).accumulation[Perspective],
(computed.*accPtr).accumulation[Perspective],
TransformedFeatureDimensions * sizeof(BiasType));
std::memcpy((target_state.*accPtr).psqtAccumulation[Perspective],
(computed.*accPtr).psqtAccumulation[Perspective],
PSQTBuckets * sizeof(PSQTWeightType));
assert(added.size() == 1 && removed.size() == 1);
updateContext.template apply<Add, Sub>(added[0], removed[0]);
}
else if (Forward && added.size() == 1)
{
assert(removed.size() == 2);
updateContext.template apply<Add, Sub, Sub>(added[0], removed[0], removed[1]);
}
else if (!Forward && removed.size() == 1)
{
assert(added.size() == 2);
updateContext.template apply<Add, Add, Sub>(added[0], added[1], removed[0]);
}
else
{
assert(added.size() == 1 || added.size() == 2);
assert(removed.size() == 1 || removed.size() == 2);
if (Forward)
assert(added.size() <= removed.size());
else
assert(removed.size() <= added.size());
// Workaround compiler warning for uninitialized variables, replicated on
// profile builds on windows with gcc 14.2.0.
// TODO remove once unneeded
sf_assume(added.size() == 1 || added.size() == 2);
sf_assume(removed.size() == 1 || removed.size() == 2);
#ifdef VECTOR
auto* accIn =
reinterpret_cast<const vec_t*>(&(computed.*accPtr).accumulation[Perspective][0]);
auto* accOut =
reinterpret_cast<vec_t*>(&(target_state.*accPtr).accumulation[Perspective][0]);
const IndexType offsetA0 = TransformedFeatureDimensions * added[0];
auto* columnA0 = reinterpret_cast<const vec_t*>(&featureTransformer.weights[offsetA0]);
const IndexType offsetR0 = TransformedFeatureDimensions * removed[0];
auto* columnR0 = reinterpret_cast<const vec_t*>(&featureTransformer.weights[offsetR0]);
if ((Forward && removed.size() == 1) || (Backwards && added.size() == 1))
{
assert(added.size() == 1 && removed.size() == 1);
for (IndexType i = 0;
i < TransformedFeatureDimensions * sizeof(WeightType) / sizeof(vec_t); ++i)
accOut[i] = vec_add_16(vec_sub_16(accIn[i], columnR0[i]), columnA0[i]);
}
else if (Forward && added.size() == 1)
{
assert(removed.size() == 2);
const IndexType offsetR1 = TransformedFeatureDimensions * removed[1];
auto* columnR1 = reinterpret_cast<const vec_t*>(&featureTransformer.weights[offsetR1]);
for (IndexType i = 0;
i < TransformedFeatureDimensions * sizeof(WeightType) / sizeof(vec_t); ++i)
accOut[i] = vec_sub_16(vec_add_16(accIn[i], columnA0[i]),
vec_add_16(columnR0[i], columnR1[i]));
}
else if (Backwards && removed.size() == 1)
{
assert(added.size() == 2);
const IndexType offsetA1 = TransformedFeatureDimensions * added[1];
auto* columnA1 = reinterpret_cast<const vec_t*>(&featureTransformer.weights[offsetA1]);
for (IndexType i = 0;
i < TransformedFeatureDimensions * sizeof(WeightType) / sizeof(vec_t); ++i)
accOut[i] = vec_add_16(vec_add_16(accIn[i], columnA0[i]),
vec_sub_16(columnA1[i], columnR0[i]));
}
else
{
assert(added.size() == 2 && removed.size() == 2);
const IndexType offsetA1 = TransformedFeatureDimensions * added[1];
auto* columnA1 = reinterpret_cast<const vec_t*>(&featureTransformer.weights[offsetA1]);
const IndexType offsetR1 = TransformedFeatureDimensions * removed[1];
auto* columnR1 = reinterpret_cast<const vec_t*>(&featureTransformer.weights[offsetR1]);
for (IndexType i = 0;
i < TransformedFeatureDimensions * sizeof(WeightType) / sizeof(vec_t); ++i)
accOut[i] = vec_add_16(accIn[i], vec_sub_16(vec_add_16(columnA0[i], columnA1[i]),
vec_add_16(columnR0[i], columnR1[i])));
}
auto* accPsqtIn =
reinterpret_cast<const psqt_vec_t*>(&(computed.*accPtr).psqtAccumulation[Perspective][0]);
auto* accPsqtOut =
reinterpret_cast<psqt_vec_t*>(&(target_state.*accPtr).psqtAccumulation[Perspective][0]);
const IndexType offsetPsqtA0 = PSQTBuckets * added[0];
auto* columnPsqtA0 =
reinterpret_cast<const psqt_vec_t*>(&featureTransformer.psqtWeights[offsetPsqtA0]);
const IndexType offsetPsqtR0 = PSQTBuckets * removed[0];
auto* columnPsqtR0 =
reinterpret_cast<const psqt_vec_t*>(&featureTransformer.psqtWeights[offsetPsqtR0]);
if ((Forward && removed.size() == 1)
|| (Backwards && added.size() == 1)) // added.size() == removed.size() == 1
{
for (std::size_t i = 0; i < PSQTBuckets * sizeof(PSQTWeightType) / sizeof(psqt_vec_t);
++i)
accPsqtOut[i] =
vec_add_psqt_32(vec_sub_psqt_32(accPsqtIn[i], columnPsqtR0[i]), columnPsqtA0[i]);
}
else if (Forward && added.size() == 1)
{
const IndexType offsetPsqtR1 = PSQTBuckets * removed[1];
auto* columnPsqtR1 =
reinterpret_cast<const psqt_vec_t*>(&featureTransformer.psqtWeights[offsetPsqtR1]);
for (std::size_t i = 0; i < PSQTBuckets * sizeof(PSQTWeightType) / sizeof(psqt_vec_t);
++i)
accPsqtOut[i] = vec_sub_psqt_32(vec_add_psqt_32(accPsqtIn[i], columnPsqtA0[i]),
vec_add_psqt_32(columnPsqtR0[i], columnPsqtR1[i]));
}
else if (Backwards && removed.size() == 1)
{
const IndexType offsetPsqtA1 = PSQTBuckets * added[1];
auto* columnPsqtA1 =
reinterpret_cast<const psqt_vec_t*>(&featureTransformer.psqtWeights[offsetPsqtA1]);
for (std::size_t i = 0; i < PSQTBuckets * sizeof(PSQTWeightType) / sizeof(psqt_vec_t);
++i)
accPsqtOut[i] = vec_add_psqt_32(vec_add_psqt_32(accPsqtIn[i], columnPsqtA0[i]),
vec_sub_psqt_32(columnPsqtA1[i], columnPsqtR0[i]));
}
else
{
const IndexType offsetPsqtA1 = PSQTBuckets * added[1];
auto* columnPsqtA1 =
reinterpret_cast<const psqt_vec_t*>(&featureTransformer.psqtWeights[offsetPsqtA1]);
const IndexType offsetPsqtR1 = PSQTBuckets * removed[1];
auto* columnPsqtR1 =
reinterpret_cast<const psqt_vec_t*>(&featureTransformer.psqtWeights[offsetPsqtR1]);
for (std::size_t i = 0; i < PSQTBuckets * sizeof(PSQTWeightType) / sizeof(psqt_vec_t);
++i)
accPsqtOut[i] = vec_add_psqt_32(
accPsqtIn[i], vec_sub_psqt_32(vec_add_psqt_32(columnPsqtA0[i], columnPsqtA1[i]),
vec_add_psqt_32(columnPsqtR0[i], columnPsqtR1[i])));
}
#else
std::memcpy((target_state.*accPtr).accumulation[Perspective],
(computed.*accPtr).accumulation[Perspective],
TransformedFeatureDimensions * sizeof(BiasType));
std::memcpy((target_state.*accPtr).psqtAccumulation[Perspective],
(computed.*accPtr).psqtAccumulation[Perspective],
PSQTBuckets * sizeof(PSQTWeightType));
// Difference calculation for the deactivated features
for (const auto index : removed)
{
const IndexType offset = TransformedFeatureDimensions * index;
for (IndexType i = 0; i < TransformedFeatureDimensions; ++i)
(target_state.*accPtr).accumulation[Perspective][i] -=
featureTransformer.weights[offset + i];
for (std::size_t i = 0; i < PSQTBuckets; ++i)
(target_state.*accPtr).psqtAccumulation[Perspective][i] -=
featureTransformer.psqtWeights[index * PSQTBuckets + i];
}
// Difference calculation for the activated features
for (const auto index : added)
{
const IndexType offset = TransformedFeatureDimensions * index;
for (IndexType i = 0; i < TransformedFeatureDimensions; ++i)
(target_state.*accPtr).accumulation[Perspective][i] +=
featureTransformer.weights[offset + i];
for (std::size_t i = 0; i < PSQTBuckets; ++i)
(target_state.*accPtr).psqtAccumulation[Perspective][i] +=
featureTransformer.psqtWeights[index * PSQTBuckets + i];
}
#endif
assert(added.size() == 2 && removed.size() == 2);
updateContext.template apply<Add, Add, Sub, Sub>(added[0], added[1], removed[0],
removed[1]);
}
(target_state.*accPtr).computed[Perspective] = true;
(target_state.acc<TransformedFeatureDimensions>()).computed[Perspective] = true;
}
template<Color Perspective, IndexType Dimensions, Accumulator<Dimensions> AccumulatorState::*accPtr>
void update_accumulator_refresh_cache(
const FeatureTransformer<Dimensions, accPtr>& featureTransformer,
const Position& pos,
AccumulatorState& accumulatorState,
AccumulatorCaches::Cache<Dimensions>& cache) {
using Tiling [[maybe_unused]] = SIMDTiling<Dimensions, Dimensions>;
template<Color Perspective, IndexType Dimensions>
void update_accumulator_refresh_cache(const FeatureTransformer<Dimensions>& featureTransformer,
const Position& pos,
AccumulatorState& accumulatorState,
AccumulatorCaches::Cache<Dimensions>& cache) {
using Tiling [[maybe_unused]] = SIMDTiling<Dimensions, Dimensions, PSQTBuckets>;
const Square ksq = pos.square<KING>(Perspective);
auto& entry = cache[ksq][Perspective];
@@ -448,12 +397,10 @@ void update_accumulator_refresh_cache(
}
}
auto& accumulator = accumulatorState.*accPtr;
auto& accumulator = accumulatorState.acc<Dimensions>();
accumulator.computed[Perspective] = true;
#ifdef VECTOR
const bool combineLast3 =
std::abs((int) removed.size() - (int) added.size()) == 1 && removed.size() + added.size() > 2;
vec_t acc[Tiling::NumRegs];
psqt_vec_t psqt[Tiling::NumPsqtRegs];
@@ -466,8 +413,8 @@ void update_accumulator_refresh_cache(
for (IndexType k = 0; k < Tiling::NumRegs; ++k)
acc[k] = entryTile[k];
std::size_t i = 0;
for (; i < std::min(removed.size(), added.size()) - combineLast3; ++i)
IndexType i = 0;
for (; i < std::min(removed.size(), added.size()); ++i)
{
IndexType indexR = removed[i];
const IndexType offsetR = Dimensions * indexR + j * Tiling::TileHeight;
@@ -477,60 +424,25 @@ void update_accumulator_refresh_cache(
auto* columnA = reinterpret_cast<const vec_t*>(&featureTransformer.weights[offsetA]);
for (IndexType k = 0; k < Tiling::NumRegs; ++k)
acc[k] = vec_add_16(acc[k], vec_sub_16(columnA[k], columnR[k]));
acc[k] = fused<Vec16Wrapper, Add, Sub>(acc[k], columnA[k], columnR[k]);
}
if (combineLast3)
for (; i < removed.size(); ++i)
{
IndexType indexR = removed[i];
const IndexType offsetR = Dimensions * indexR + j * Tiling::TileHeight;
auto* columnR = reinterpret_cast<const vec_t*>(&featureTransformer.weights[offsetR]);
IndexType indexA = added[i];
const IndexType offsetA = Dimensions * indexA + j * Tiling::TileHeight;
auto* columnA = reinterpret_cast<const vec_t*>(&featureTransformer.weights[offsetA]);
IndexType index = removed[i];
const IndexType offset = Dimensions * index + j * Tiling::TileHeight;
auto* column = reinterpret_cast<const vec_t*>(&featureTransformer.weights[offset]);
if (removed.size() > added.size())
{
IndexType indexR2 = removed[i + 1];
const IndexType offsetR2 = Dimensions * indexR2 + j * Tiling::TileHeight;
auto* columnR2 =
reinterpret_cast<const vec_t*>(&featureTransformer.weights[offsetR2]);
for (IndexType k = 0; k < Tiling::NumRegs; ++k)
acc[k] = vec_sub_16(vec_add_16(acc[k], columnA[k]),
vec_add_16(columnR[k], columnR2[k]));
}
else
{
IndexType indexA2 = added[i + 1];
const IndexType offsetA2 = Dimensions * indexA2 + j * Tiling::TileHeight;
auto* columnA2 =
reinterpret_cast<const vec_t*>(&featureTransformer.weights[offsetA2]);
for (IndexType k = 0; k < Tiling::NumRegs; ++k)
acc[k] = vec_add_16(vec_sub_16(acc[k], columnR[k]),
vec_add_16(columnA[k], columnA2[k]));
}
for (IndexType k = 0; k < Tiling::NumRegs; ++k)
acc[k] = vec_sub_16(acc[k], column[k]);
}
else
for (; i < added.size(); ++i)
{
for (; i < removed.size(); ++i)
{
IndexType index = removed[i];
const IndexType offset = Dimensions * index + j * Tiling::TileHeight;
auto* column = reinterpret_cast<const vec_t*>(&featureTransformer.weights[offset]);
IndexType index = added[i];
const IndexType offset = Dimensions * index + j * Tiling::TileHeight;
auto* column = reinterpret_cast<const vec_t*>(&featureTransformer.weights[offset]);
for (IndexType k = 0; k < Tiling::NumRegs; ++k)
acc[k] = vec_sub_16(acc[k], column[k]);
}
for (; i < added.size(); ++i)
{
IndexType index = added[i];
const IndexType offset = Dimensions * index + j * Tiling::TileHeight;
auto* column = reinterpret_cast<const vec_t*>(&featureTransformer.weights[offset]);
for (IndexType k = 0; k < Tiling::NumRegs; ++k)
acc[k] = vec_add_16(acc[k], column[k]);
}
for (IndexType k = 0; k < Tiling::NumRegs; ++k)
acc[k] = vec_add_16(acc[k], column[k]);
}
for (IndexType k = 0; k < Tiling::NumRegs; k++)
@@ -546,10 +458,10 @@ void update_accumulator_refresh_cache(
auto* entryTilePsqt =
reinterpret_cast<psqt_vec_t*>(&entry.psqtAccumulation[j * Tiling::PsqtTileHeight]);
for (std::size_t k = 0; k < Tiling::NumPsqtRegs; ++k)
for (IndexType k = 0; k < Tiling::NumPsqtRegs; ++k)
psqt[k] = entryTilePsqt[k];
for (std::size_t i = 0; i < removed.size(); ++i)
for (IndexType i = 0; i < removed.size(); ++i)
{
IndexType index = removed[i];
const IndexType offset = PSQTBuckets * index + j * Tiling::PsqtTileHeight;
@@ -559,7 +471,7 @@ void update_accumulator_refresh_cache(
for (std::size_t k = 0; k < Tiling::NumPsqtRegs; ++k)
psqt[k] = vec_sub_psqt_32(psqt[k], columnPsqt[k]);
}
for (std::size_t i = 0; i < added.size(); ++i)
for (IndexType i = 0; i < added.size(); ++i)
{
IndexType index = added[i];
const IndexType offset = PSQTBuckets * index + j * Tiling::PsqtTileHeight;
@@ -570,9 +482,9 @@ void update_accumulator_refresh_cache(
psqt[k] = vec_add_psqt_32(psqt[k], columnPsqt[k]);
}
for (std::size_t k = 0; k < Tiling::NumPsqtRegs; ++k)
for (IndexType k = 0; k < Tiling::NumPsqtRegs; ++k)
vec_store_psqt(&entryTilePsqt[k], psqt[k]);
for (std::size_t k = 0; k < Tiling::NumPsqtRegs; ++k)
for (IndexType k = 0; k < Tiling::NumPsqtRegs; ++k)
vec_store_psqt(&accTilePsqt[k], psqt[k]);
}
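The hand-written `vec_add_16` / `vec_sub_16` chains in this file are replaced by calls such as `apply<Add, Sub, Sub>(...)` and `fused<Vec16Wrapper, Add, Sub>(...)`, where a compile-time list of operations is paired with a list of weight rows. The definition of `fused` lives in `simd.h` and is not shown in this diff, so the following is only a scalar sketch of the idea under that assumption. `fused_row_update` and `apply_one` are hypothetical names, and the `UpdateOperation` enum here is a local stand-in.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Local stand-in for the operation tags used in the diff.
enum UpdateOperation { Add, Sub };

// Apply a single operation to one accumulator element.
template<UpdateOperation op>
constexpr std::int16_t apply_one(std::int16_t acc, std::int16_t w) {
    return op == Add ? std::int16_t(acc + w) : std::int16_t(acc - w);
}

// Read each input element once, apply every (operation, row) pair to it,
// and write the result once. ops and rows expand in lockstep.
template<UpdateOperation... ops, typename... Ts>
void fused_row_update(const std::int16_t* in,
                      std::int16_t*       out,
                      std::size_t         width,
                      const Ts*... rows) {
    static_assert(sizeof...(ops) == sizeof...(Ts), "one operation per row");
    for (std::size_t i = 0; i < width; ++i)
    {
        std::int16_t v = in[i];
        ((v = apply_one<ops>(v, rows[i])), ...);  // C++17 fold over the comma operator
        out[i] = v;
    }
}
```

A call such as `fused_row_update<Add, Sub, Sub>(in, out, 4, a, r0, r1)` computes `in + a - r0 - r1` element by element, which is the same shape as the `apply<Add, Sub, Sub>(added[0], removed[0], removed[1])` calls above.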


@@ -37,19 +37,10 @@ class Position;
namespace Stockfish::Eval::NNUE {
using BiasType = std::int16_t;
using PSQTWeightType = std::int32_t;
using IndexType = std::uint32_t;
struct Networks;
template<IndexType Size>
struct alignas(CacheLineSize) Accumulator;
struct AccumulatorState;
template<IndexType TransformedFeatureDimensions,
Accumulator<TransformedFeatureDimensions> AccumulatorState::*accPtr>
template<IndexType TransformedFeatureDimensions>
class FeatureTransformer;
// Class that holds the result of affine transformation of input features
@@ -121,6 +112,30 @@ struct AccumulatorState {
Accumulator<TransformedFeatureDimensionsSmall> accumulatorSmall;
DirtyPiece dirtyPiece;
template<IndexType Size>
auto& acc() noexcept {
static_assert(Size == TransformedFeatureDimensionsBig
|| Size == TransformedFeatureDimensionsSmall,
"Invalid size for accumulator");
if constexpr (Size == TransformedFeatureDimensionsBig)
return accumulatorBig;
else if constexpr (Size == TransformedFeatureDimensionsSmall)
return accumulatorSmall;
}
template<IndexType Size>
const auto& acc() const noexcept {
static_assert(Size == TransformedFeatureDimensionsBig
|| Size == TransformedFeatureDimensionsSmall,
"Invalid size for accumulator");
if constexpr (Size == TransformedFeatureDimensionsBig)
return accumulatorBig;
else if constexpr (Size == TransformedFeatureDimensionsSmall)
return accumulatorSmall;
}
void reset(const DirtyPiece& dp) noexcept;
};
@@ -128,54 +143,43 @@ struct AccumulatorState {
class AccumulatorStack {
public:
AccumulatorStack() :
m_accumulators(MAX_PLY + 1),
m_current_idx{} {}
accumulators(MAX_PLY + 1),
size{1} {}
[[nodiscard]] const AccumulatorState& latest() const noexcept;
void
reset(const Position& rootPos, const Networks& networks, AccumulatorCaches& caches) noexcept;
void reset() noexcept;
void push(const DirtyPiece& dirtyPiece) noexcept;
void pop() noexcept;
template<IndexType Dimensions, Accumulator<Dimensions> AccumulatorState::*accPtr>
void evaluate(const Position& pos,
const FeatureTransformer<Dimensions, accPtr>& featureTransformer,
AccumulatorCaches::Cache<Dimensions>& cache) noexcept;
template<IndexType Dimensions>
void evaluate(const Position& pos,
const FeatureTransformer<Dimensions>& featureTransformer,
AccumulatorCaches::Cache<Dimensions>& cache) noexcept;
private:
[[nodiscard]] AccumulatorState& mut_latest() noexcept;
template<Color Perspective,
IndexType Dimensions,
Accumulator<Dimensions> AccumulatorState::*accPtr>
void evaluate_side(const Position& pos,
const FeatureTransformer<Dimensions, accPtr>& featureTransformer,
AccumulatorCaches::Cache<Dimensions>& cache) noexcept;
template<Color Perspective, IndexType Dimensions>
void evaluate_side(const Position& pos,
const FeatureTransformer<Dimensions>& featureTransformer,
AccumulatorCaches::Cache<Dimensions>& cache) noexcept;
template<Color Perspective,
IndexType Dimensions,
Accumulator<Dimensions> AccumulatorState::*accPtr>
template<Color Perspective, IndexType Dimensions>
[[nodiscard]] std::size_t find_last_usable_accumulator() const noexcept;
template<Color Perspective,
IndexType Dimensions,
Accumulator<Dimensions> AccumulatorState::*accPtr>
void
forward_update_incremental(const Position& pos,
const FeatureTransformer<Dimensions, accPtr>& featureTransformer,
const std::size_t begin) noexcept;
template<Color Perspective, IndexType Dimensions>
void forward_update_incremental(const Position& pos,
const FeatureTransformer<Dimensions>& featureTransformer,
const std::size_t begin) noexcept;
template<Color Perspective,
IndexType Dimensions,
Accumulator<Dimensions> AccumulatorState::*accPtr>
void
backward_update_incremental(const Position& pos,
const FeatureTransformer<Dimensions, accPtr>& featureTransformer,
const std::size_t end) noexcept;
template<Color Perspective, IndexType Dimensions>
void backward_update_incremental(const Position& pos,
const FeatureTransformer<Dimensions>& featureTransformer,
const std::size_t end) noexcept;
std::vector<AccumulatorState> m_accumulators;
std::size_t m_current_idx;
std::vector<AccumulatorState> accumulators;
std::size_t size;
};
} // namespace Stockfish::Eval::NNUE
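The header change above drops the pointer-to-member template parameter (`Accumulator<Dimensions> AccumulatorState::*accPtr`) in favour of an `acc<Size>()` accessor that picks the member with `if constexpr`. A reduced, self-contained sketch of that pattern is shown below. The `State` type, the member names `bigAcc` / `smallAcc`, and the sizes `DimBig` / `DimSmall` are placeholders chosen for the example, not the values used by the engine.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

using IndexType = std::uint32_t;

// Illustrative sizes only; the real constants are defined elsewhere.
constexpr IndexType DimBig   = 16;
constexpr IndexType DimSmall = 4;

template<IndexType Size>
struct Accumulator {
    std::int16_t accumulation[2][Size];
    bool         computed[2];
};

struct State {
    Accumulator<DimBig>   bigAcc{};
    Accumulator<DimSmall> smallAcc{};

    // Select the member at compile time from the requested size.
    template<IndexType Size>
    auto& acc() noexcept {
        static_assert(Size == DimBig || Size == DimSmall, "Invalid size for accumulator");
        if constexpr (Size == DimBig)
            return bigAcc;
        else
            return smallAcc;
    }

    template<IndexType Size>
    const auto& acc() const noexcept {
        static_assert(Size == DimBig || Size == DimSmall, "Invalid size for accumulator");
        if constexpr (Size == DimBig)
            return bigAcc;
        else
            return smallAcc;
    }
};
```

Because the size alone identifies the member, every template in the diff can drop one parameter, and callers write `state.acc<Dimensions>()` instead of `state.*accPtr`.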


@@ -49,6 +49,12 @@ constexpr int L3Small = 32;
constexpr IndexType PSQTBuckets = 8;
constexpr IndexType LayerStacks = 8;
// If vector instructions are enabled, we update and refresh the
// accumulator tile by tile such that each tile fits in the CPU's
// vector registers.
static_assert(PSQTBuckets % 8 == 0,
"Per feature PSQT values cannot be processed at granularity lower than 8 at a time.");
template<IndexType L1, int L2, int L3>
struct NetworkArchitecture {
static constexpr IndexType TransformedFeatureDimensions = L1;


@@ -48,6 +48,11 @@
namespace Stockfish::Eval::NNUE {
using BiasType = std::int16_t;
using WeightType = std::int16_t;
using PSQTWeightType = std::int32_t;
using IndexType = std::uint32_t;
// Version of the evaluation file
constexpr std::uint32_t Version = 0x7AF32F20u;
@@ -76,7 +81,6 @@ constexpr std::size_t MaxSimdWidth = 32;
// Type of input feature after conversion
using TransformedFeatureType = std::uint8_t;
using IndexType = std::uint32_t;
// Round n up to be a multiple of base
template<typename IntType>
@@ -279,11 +283,6 @@ inline void write_leb_128(std::ostream& stream, const IntType* values, std::size
flush();
}
enum IncUpdateDirection {
FORWARD,
BACKWARDS
};
} // namespace Stockfish::Eval::NNUE
#endif // #ifndef NNUE_COMMON_H_INCLUDED


@@ -31,118 +31,10 @@
#include "nnue_accumulator.h"
#include "nnue_architecture.h"
#include "nnue_common.h"
#include "simd.h"
namespace Stockfish::Eval::NNUE {
using BiasType = std::int16_t;
using WeightType = std::int16_t;
using PSQTWeightType = std::int32_t;
// If vector instructions are enabled, we update and refresh the
// accumulator tile by tile such that each tile fits in the CPU's
// vector registers.
#define VECTOR
static_assert(PSQTBuckets % 8 == 0,
"Per feature PSQT values cannot be processed at granularity lower than 8 at a time.");
#ifdef USE_AVX512
using vec_t = __m512i;
using psqt_vec_t = __m256i;
#define vec_load(a) _mm512_load_si512(a)
#define vec_store(a, b) _mm512_store_si512(a, b)
#define vec_add_16(a, b) _mm512_add_epi16(a, b)
#define vec_sub_16(a, b) _mm512_sub_epi16(a, b)
#define vec_mulhi_16(a, b) _mm512_mulhi_epi16(a, b)
#define vec_zero() _mm512_setzero_epi32()
#define vec_set_16(a) _mm512_set1_epi16(a)
#define vec_max_16(a, b) _mm512_max_epi16(a, b)
#define vec_min_16(a, b) _mm512_min_epi16(a, b)
#define vec_slli_16(a, b) _mm512_slli_epi16(a, b)
// Inverse permuted at load time
#define vec_packus_16(a, b) _mm512_packus_epi16(a, b)
#define vec_load_psqt(a) _mm256_load_si256(a)
#define vec_store_psqt(a, b) _mm256_store_si256(a, b)
#define vec_add_psqt_32(a, b) _mm256_add_epi32(a, b)
#define vec_sub_psqt_32(a, b) _mm256_sub_epi32(a, b)
#define vec_zero_psqt() _mm256_setzero_si256()
#define NumRegistersSIMD 16
#define MaxChunkSize 64
#elif USE_AVX2
using vec_t = __m256i;
using psqt_vec_t = __m256i;
#define vec_load(a) _mm256_load_si256(a)
#define vec_store(a, b) _mm256_store_si256(a, b)
#define vec_add_16(a, b) _mm256_add_epi16(a, b)
#define vec_sub_16(a, b) _mm256_sub_epi16(a, b)
#define vec_mulhi_16(a, b) _mm256_mulhi_epi16(a, b)
#define vec_zero() _mm256_setzero_si256()
#define vec_set_16(a) _mm256_set1_epi16(a)
#define vec_max_16(a, b) _mm256_max_epi16(a, b)
#define vec_min_16(a, b) _mm256_min_epi16(a, b)
#define vec_slli_16(a, b) _mm256_slli_epi16(a, b)
// Inverse permuted at load time
#define vec_packus_16(a, b) _mm256_packus_epi16(a, b)
#define vec_load_psqt(a) _mm256_load_si256(a)
#define vec_store_psqt(a, b) _mm256_store_si256(a, b)
#define vec_add_psqt_32(a, b) _mm256_add_epi32(a, b)
#define vec_sub_psqt_32(a, b) _mm256_sub_epi32(a, b)
#define vec_zero_psqt() _mm256_setzero_si256()
#define NumRegistersSIMD 16
#define MaxChunkSize 32
#elif USE_SSE2
using vec_t = __m128i;
using psqt_vec_t = __m128i;
#define vec_load(a) (*(a))
#define vec_store(a, b) *(a) = (b)
#define vec_add_16(a, b) _mm_add_epi16(a, b)
#define vec_sub_16(a, b) _mm_sub_epi16(a, b)
#define vec_mulhi_16(a, b) _mm_mulhi_epi16(a, b)
#define vec_zero() _mm_setzero_si128()
#define vec_set_16(a) _mm_set1_epi16(a)
#define vec_max_16(a, b) _mm_max_epi16(a, b)
#define vec_min_16(a, b) _mm_min_epi16(a, b)
#define vec_slli_16(a, b) _mm_slli_epi16(a, b)
#define vec_packus_16(a, b) _mm_packus_epi16(a, b)
#define vec_load_psqt(a) (*(a))
#define vec_store_psqt(a, b) *(a) = (b)
#define vec_add_psqt_32(a, b) _mm_add_epi32(a, b)
#define vec_sub_psqt_32(a, b) _mm_sub_epi32(a, b)
#define vec_zero_psqt() _mm_setzero_si128()
#define NumRegistersSIMD (Is64Bit ? 16 : 8)
#define MaxChunkSize 16
#elif USE_NEON
using vec_t = int16x8_t;
using psqt_vec_t = int32x4_t;
#define vec_load(a) (*(a))
#define vec_store(a, b) *(a) = (b)
#define vec_add_16(a, b) vaddq_s16(a, b)
#define vec_sub_16(a, b) vsubq_s16(a, b)
#define vec_mulhi_16(a, b) vqdmulhq_s16(a, b)
#define vec_zero() \
vec_t { 0 }
#define vec_set_16(a) vdupq_n_s16(a)
#define vec_max_16(a, b) vmaxq_s16(a, b)
#define vec_min_16(a, b) vminq_s16(a, b)
#define vec_slli_16(a, b) vshlq_s16(a, vec_set_16(b))
#define vec_packus_16(a, b) reinterpret_cast<vec_t>(vcombine_u8(vqmovun_s16(a), vqmovun_s16(b)))
#define vec_load_psqt(a) (*(a))
#define vec_store_psqt(a, b) *(a) = (b)
#define vec_add_psqt_32(a, b) vaddq_s32(a, b)
#define vec_sub_psqt_32(a, b) vsubq_s32(a, b)
#define vec_zero_psqt() \
psqt_vec_t { 0 }
#define NumRegistersSIMD 16
#define MaxChunkSize 16
#else
#undef VECTOR
#endif
// Returns the inverse of a permutation
template<std::size_t Len>
constexpr std::array<std::size_t, Len>
@@ -184,64 +76,8 @@ void permute(T (&data)[N], const std::array<std::size_t, OrderSize>& order) {
}
}
// Compute optimal SIMD register count for feature transformer accumulation.
template<IndexType TransformedFeatureWidth, IndexType HalfDimensions>
class SIMDTiling {
#ifdef VECTOR
// We use __m* types as template arguments, which causes GCC to emit warnings
// about losing some attribute information. This is irrelevant to us as we
// only take their size, so the following pragma are harmless.
#if defined(__GNUC__)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wignored-attributes"
#endif
template<typename SIMDRegisterType, typename LaneType, int NumLanes, int MaxRegisters>
static constexpr int BestRegisterCount() {
constexpr std::size_t RegisterSize = sizeof(SIMDRegisterType);
constexpr std::size_t LaneSize = sizeof(LaneType);
static_assert(RegisterSize >= LaneSize);
static_assert(MaxRegisters <= NumRegistersSIMD);
static_assert(MaxRegisters > 0);
static_assert(NumRegistersSIMD > 0);
static_assert(RegisterSize % LaneSize == 0);
static_assert((NumLanes * LaneSize) % RegisterSize == 0);
const int ideal = (NumLanes * LaneSize) / RegisterSize;
if (ideal <= MaxRegisters)
return ideal;
// Look for the largest divisor of the ideal register count that does not exceed MaxRegisters
for (int divisor = MaxRegisters; divisor > 1; --divisor)
if (ideal % divisor == 0)
return divisor;
return 1;
}
#if defined(__GNUC__)
#pragma GCC diagnostic pop
#endif
public:
static constexpr int NumRegs =
BestRegisterCount<vec_t, WeightType, TransformedFeatureWidth, NumRegistersSIMD>();
static constexpr int NumPsqtRegs =
BestRegisterCount<psqt_vec_t, PSQTWeightType, PSQTBuckets, NumRegistersSIMD>();
static constexpr IndexType TileHeight = NumRegs * sizeof(vec_t) / 2;
static constexpr IndexType PsqtTileHeight = NumPsqtRegs * sizeof(psqt_vec_t) / 4;
static_assert(HalfDimensions % TileHeight == 0, "TileHeight must divide HalfDimensions");
static_assert(PSQTBuckets % PsqtTileHeight == 0, "PsqtTileHeight must divide PSQTBuckets");
#endif
};
// Input feature converter
template<IndexType TransformedFeatureDimensions,
Accumulator<TransformedFeatureDimensions> AccumulatorState::*accPtr>
template<IndexType TransformedFeatureDimensions>
class FeatureTransformer {
// Number of output dimensions for one side
@@ -342,16 +178,18 @@ class FeatureTransformer {
OutputType* output,
int bucket) const {
using namespace SIMD;
accumulatorStack.evaluate(pos, *this, *cache);
const auto& accumulatorState = accumulatorStack.latest();
const Color perspectives[2] = {pos.side_to_move(), ~pos.side_to_move()};
const auto& psqtAccumulation = (accumulatorState.*accPtr).psqtAccumulation;
const auto& psqtAccumulation = (accumulatorState.acc<HalfDimensions>()).psqtAccumulation;
const auto psqt =
(psqtAccumulation[perspectives[0]][bucket] - psqtAccumulation[perspectives[1]][bucket])
/ 2;
const auto& accumulation = (accumulatorState.*accPtr).accumulation;
const auto& accumulation = (accumulatorState.acc<HalfDimensions>()).accumulation;
for (IndexType p = 0; p < 2; ++p)
{


@@ -121,7 +121,6 @@ trace(Position& pos, const Eval::NNUE::Networks& networks, Eval::NNUE::Accumulat
};
AccumulatorStack accumulators;
accumulators.reset(pos, networks, caches);
// We estimate the value of each piece by doing a differential evaluation from
// the current base eval, simulating the removal of the piece from its square.
@@ -140,7 +139,7 @@ trace(Position& pos, const Eval::NNUE::Networks& networks, Eval::NNUE::Accumulat
{
pos.remove_piece(sq);
accumulators.reset(pos, networks, caches);
accumulators.reset();
std::tie(psqt, positional) = networks.big.evaluate(pos, accumulators, &caches.big);
Value eval = psqt + positional;
eval = pos.side_to_move() == WHITE ? eval : -eval;
@@ -157,7 +156,7 @@ trace(Position& pos, const Eval::NNUE::Networks& networks, Eval::NNUE::Accumulat
ss << board[row] << '\n';
ss << '\n';
accumulators.reset(pos, networks, caches);
accumulators.reset();
auto t = networks.big.trace_evaluate(pos, accumulators, &caches.big);
ss << " NNUE network contributions "

src/nnue/simd.h (new file, 406 lines)

@@ -0,0 +1,406 @@
/*
Stockfish, a UCI chess playing engine derived from Glaurung 2.1
Copyright (C) 2004-2025 The Stockfish developers (see AUTHORS file)
Stockfish is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
Stockfish is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef NNUE_SIMD_H_INCLUDED
#define NNUE_SIMD_H_INCLUDED
#if defined(USE_AVX2)
#include <immintrin.h>
#elif defined(USE_SSE41)
#include <smmintrin.h>
#elif defined(USE_SSSE3)
#include <tmmintrin.h>
#elif defined(USE_SSE2)
#include <emmintrin.h>
#elif defined(USE_NEON)
#include <arm_neon.h>
#endif
#include "../types.h"
#include "nnue_common.h"
namespace Stockfish::Eval::NNUE::SIMD {
// If vector instructions are enabled, we update and refresh the
// accumulator tile by tile such that each tile fits in the CPU's
// vector registers.
#define VECTOR
#ifdef USE_AVX512
using vec_t = __m512i;
using vec128_t = __m128i;
using psqt_vec_t = __m256i;
using vec_uint_t = __m512i;
#define vec_load(a) _mm512_load_si512(a)
#define vec_store(a, b) _mm512_store_si512(a, b)
#define vec_add_16(a, b) _mm512_add_epi16(a, b)
#define vec_sub_16(a, b) _mm512_sub_epi16(a, b)
#define vec_mulhi_16(a, b) _mm512_mulhi_epi16(a, b)
#define vec_zero() _mm512_setzero_epi32()
#define vec_set_16(a) _mm512_set1_epi16(a)
#define vec_max_16(a, b) _mm512_max_epi16(a, b)
#define vec_min_16(a, b) _mm512_min_epi16(a, b)
#define vec_slli_16(a, b) _mm512_slli_epi16(a, b)
// Inverse permuted at load time
#define vec_packus_16(a, b) _mm512_packus_epi16(a, b)
#define vec_load_psqt(a) _mm256_load_si256(a)
#define vec_store_psqt(a, b) _mm256_store_si256(a, b)
#define vec_add_psqt_32(a, b) _mm256_add_epi32(a, b)
#define vec_sub_psqt_32(a, b) _mm256_sub_epi32(a, b)
#define vec_zero_psqt() _mm256_setzero_si256()
#ifdef USE_SSSE3
#define vec_nnz(a) _mm512_cmpgt_epi32_mask(a, _mm512_setzero_si512())
#endif
#define vec128_zero _mm_setzero_si128()
#define vec128_set_16(a) _mm_set1_epi16(a)
#define vec128_load(a) _mm_load_si128(a)
#define vec128_storeu(a, b) _mm_storeu_si128(a, b)
#define vec128_add(a, b) _mm_add_epi16(a, b)
#define NumRegistersSIMD 16
#define MaxChunkSize 64
#elif USE_AVX2
using vec_t = __m256i;
using vec128_t = __m128i;
using psqt_vec_t = __m256i;
using vec_uint_t = __m256i;
#define vec_load(a) _mm256_load_si256(a)
#define vec_store(a, b) _mm256_store_si256(a, b)
#define vec_add_16(a, b) _mm256_add_epi16(a, b)
#define vec_sub_16(a, b) _mm256_sub_epi16(a, b)
#define vec_mulhi_16(a, b) _mm256_mulhi_epi16(a, b)
#define vec_zero() _mm256_setzero_si256()
#define vec_set_16(a) _mm256_set1_epi16(a)
#define vec_max_16(a, b) _mm256_max_epi16(a, b)
#define vec_min_16(a, b) _mm256_min_epi16(a, b)
#define vec_slli_16(a, b) _mm256_slli_epi16(a, b)
// Inverse permuted at load time
#define vec_packus_16(a, b) _mm256_packus_epi16(a, b)
#define vec_load_psqt(a) _mm256_load_si256(a)
#define vec_store_psqt(a, b) _mm256_store_si256(a, b)
#define vec_add_psqt_32(a, b) _mm256_add_epi32(a, b)
#define vec_sub_psqt_32(a, b) _mm256_sub_epi32(a, b)
#define vec_zero_psqt() _mm256_setzero_si256()
#ifdef USE_SSSE3
#if defined(USE_VNNI) && !defined(USE_AVXVNNI)
#define vec_nnz(a) _mm256_cmpgt_epi32_mask(a, _mm256_setzero_si256())
#else
#define vec_nnz(a) \
_mm256_movemask_ps( \
_mm256_castsi256_ps(_mm256_cmpgt_epi32(a, _mm256_setzero_si256())))
#endif
#endif
#define vec128_zero _mm_setzero_si128()
#define vec128_set_16(a) _mm_set1_epi16(a)
#define vec128_load(a) _mm_load_si128(a)
#define vec128_storeu(a, b) _mm_storeu_si128(a, b)
#define vec128_add(a, b) _mm_add_epi16(a, b)
#define NumRegistersSIMD 16
#define MaxChunkSize 32
#elif USE_SSE2
using vec_t = __m128i;
using vec128_t = __m128i;
using psqt_vec_t = __m128i;
using vec_uint_t = __m128i;
#define vec_load(a) (*(a))
#define vec_store(a, b) *(a) = (b)
#define vec_add_16(a, b) _mm_add_epi16(a, b)
#define vec_sub_16(a, b) _mm_sub_epi16(a, b)
#define vec_mulhi_16(a, b) _mm_mulhi_epi16(a, b)
#define vec_zero() _mm_setzero_si128()
#define vec_set_16(a) _mm_set1_epi16(a)
#define vec_max_16(a, b) _mm_max_epi16(a, b)
#define vec_min_16(a, b) _mm_min_epi16(a, b)
#define vec_slli_16(a, b) _mm_slli_epi16(a, b)
#define vec_packus_16(a, b) _mm_packus_epi16(a, b)
#define vec_load_psqt(a) (*(a))
#define vec_store_psqt(a, b) *(a) = (b)
#define vec_add_psqt_32(a, b) _mm_add_epi32(a, b)
#define vec_sub_psqt_32(a, b) _mm_sub_epi32(a, b)
#define vec_zero_psqt() _mm_setzero_si128()
#ifdef USE_SSSE3
#define vec_nnz(a) \
_mm_movemask_ps(_mm_castsi128_ps(_mm_cmpgt_epi32(a, _mm_setzero_si128())))
#endif
#define vec128_zero _mm_setzero_si128()
#define vec128_set_16(a) _mm_set1_epi16(a)
#define vec128_load(a) _mm_load_si128(a)
#define vec128_storeu(a, b) _mm_storeu_si128(a, b)
#define vec128_add(a, b) _mm_add_epi16(a, b)
#define NumRegistersSIMD (Is64Bit ? 16 : 8)
#define MaxChunkSize 16
#elif USE_NEON
using vec_t = int16x8_t;
using psqt_vec_t = int32x4_t;
using vec128_t = uint16x8_t;
using vec_uint_t = uint32x4_t;
#define vec_load(a) (*(a))
#define vec_store(a, b) *(a) = (b)
#define vec_add_16(a, b) vaddq_s16(a, b)
#define vec_sub_16(a, b) vsubq_s16(a, b)
#define vec_mulhi_16(a, b) vqdmulhq_s16(a, b)
#define vec_zero() vec_t{0}
#define vec_set_16(a) vdupq_n_s16(a)
#define vec_max_16(a, b) vmaxq_s16(a, b)
#define vec_min_16(a, b) vminq_s16(a, b)
#define vec_slli_16(a, b) vshlq_s16(a, vec_set_16(b))
#define vec_packus_16(a, b) reinterpret_cast<vec_t>(vcombine_u8(vqmovun_s16(a), vqmovun_s16(b)))
#define vec_load_psqt(a) (*(a))
#define vec_store_psqt(a, b) *(a) = (b)
#define vec_add_psqt_32(a, b) vaddq_s32(a, b)
#define vec_sub_psqt_32(a, b) vsubq_s32(a, b)
#define vec_zero_psqt() psqt_vec_t{0}
static constexpr std::uint32_t Mask[4] = {1, 2, 4, 8};
#define vec_nnz(a) vaddvq_u32(vandq_u32(vtstq_u32(a, a), vld1q_u32(Mask)))
#define vec128_zero vdupq_n_u16(0)
#define vec128_set_16(a) vdupq_n_u16(a)
#define vec128_load(a) vld1q_u16(reinterpret_cast<const std::uint16_t*>(a))
#define vec128_storeu(a, b) vst1q_u16(reinterpret_cast<std::uint16_t*>(a), b)
#define vec128_add(a, b) vaddq_u16(a, b)
#define NumRegistersSIMD 16
#define MaxChunkSize 16
#else
#undef VECTOR
#endif
struct Vec16Wrapper {
#ifdef VECTOR
using type = vec_t;
static type add(const type& lhs, const type& rhs) { return vec_add_16(lhs, rhs); }
static type sub(const type& lhs, const type& rhs) { return vec_sub_16(lhs, rhs); }
#else
using type = BiasType;
static type add(const type& lhs, const type& rhs) { return lhs + rhs; }
static type sub(const type& lhs, const type& rhs) { return lhs - rhs; }
#endif
};
struct Vec32Wrapper {
#ifdef VECTOR
using type = psqt_vec_t;
static type add(const type& lhs, const type& rhs) { return vec_add_psqt_32(lhs, rhs); }
static type sub(const type& lhs, const type& rhs) { return vec_sub_psqt_32(lhs, rhs); }
#else
using type = PSQTWeightType;
static type add(const type& lhs, const type& rhs) { return lhs + rhs; }
static type sub(const type& lhs, const type& rhs) { return lhs - rhs; }
#endif
};
enum UpdateOperation {
Add,
Sub
};
template<typename VecWrapper,
UpdateOperation... ops,
std::enable_if_t<sizeof...(ops) == 0, bool> = true>
typename VecWrapper::type fused(const typename VecWrapper::type& in) {
return in;
}
template<typename VecWrapper,
UpdateOperation update_op,
UpdateOperation... ops,
typename T,
typename... Ts,
std::enable_if_t<is_all_same_v<typename VecWrapper::type, T, Ts...>, bool> = true,
std::enable_if_t<sizeof...(ops) == sizeof...(Ts), bool> = true>
typename VecWrapper::type
fused(const typename VecWrapper::type& in, const T& operand, const Ts&... operands) {
switch (update_op)
{
case Add :
return fused<VecWrapper, ops...>(VecWrapper::add(in, operand), operands...);
case Sub :
return fused<VecWrapper, ops...>(VecWrapper::sub(in, operand), operands...);
default :
static_assert(update_op == Add || update_op == Sub,
"Only Add and Sub are currently supported.");
return typename VecWrapper::type();
}
}
#if defined(USE_AVX512)
[[maybe_unused]] static int m512_hadd(__m512i sum, int bias) {
return _mm512_reduce_add_epi32(sum) + bias;
}
[[maybe_unused]] static void m512_add_dpbusd_epi32(__m512i& acc, __m512i a, __m512i b) {
#if defined(USE_VNNI)
acc = _mm512_dpbusd_epi32(acc, a, b);
#else
__m512i product0 = _mm512_maddubs_epi16(a, b);
product0 = _mm512_madd_epi16(product0, _mm512_set1_epi16(1));
acc = _mm512_add_epi32(acc, product0);
#endif
}
#endif
#if defined(USE_AVX2)
[[maybe_unused]] static int m256_hadd(__m256i sum, int bias) {
__m128i sum128 = _mm_add_epi32(_mm256_castsi256_si128(sum), _mm256_extracti128_si256(sum, 1));
sum128 = _mm_add_epi32(sum128, _mm_shuffle_epi32(sum128, _MM_PERM_BADC));
sum128 = _mm_add_epi32(sum128, _mm_shuffle_epi32(sum128, _MM_PERM_CDAB));
return _mm_cvtsi128_si32(sum128) + bias;
}
[[maybe_unused]] static void m256_add_dpbusd_epi32(__m256i& acc, __m256i a, __m256i b) {
#if defined(USE_VNNI)
acc = _mm256_dpbusd_epi32(acc, a, b);
#else
__m256i product0 = _mm256_maddubs_epi16(a, b);
product0 = _mm256_madd_epi16(product0, _mm256_set1_epi16(1));
acc = _mm256_add_epi32(acc, product0);
#endif
}
#endif
#if defined(USE_SSSE3)
[[maybe_unused]] static int m128_hadd(__m128i sum, int bias) {
sum = _mm_add_epi32(sum, _mm_shuffle_epi32(sum, 0x4E)); //_MM_PERM_BADC
sum = _mm_add_epi32(sum, _mm_shuffle_epi32(sum, 0xB1)); //_MM_PERM_CDAB
return _mm_cvtsi128_si32(sum) + bias;
}
[[maybe_unused]] static void m128_add_dpbusd_epi32(__m128i& acc, __m128i a, __m128i b) {
__m128i product0 = _mm_maddubs_epi16(a, b);
product0 = _mm_madd_epi16(product0, _mm_set1_epi16(1));
acc = _mm_add_epi32(acc, product0);
}
#endif
#if defined(USE_NEON_DOTPROD)
[[maybe_unused]] static void
dotprod_m128_add_dpbusd_epi32(int32x4_t& acc, int8x16_t a, int8x16_t b) {
acc = vdotq_s32(acc, a, b);
}
#endif
#if defined(USE_NEON)
[[maybe_unused]] static int neon_m128_reduce_add_epi32(int32x4_t s) {
#if USE_NEON >= 8
return vaddvq_s32(s);
#else
return s[0] + s[1] + s[2] + s[3];
#endif
}
[[maybe_unused]] static int neon_m128_hadd(int32x4_t sum, int bias) {
return neon_m128_reduce_add_epi32(sum) + bias;
}
#endif
#if USE_NEON >= 8
[[maybe_unused]] static void neon_m128_add_dpbusd_epi32(int32x4_t& acc, int8x16_t a, int8x16_t b) {
int16x8_t product0 = vmull_s8(vget_low_s8(a), vget_low_s8(b));
int16x8_t product1 = vmull_high_s8(a, b);
int16x8_t sum = vpaddq_s16(product0, product1);
acc = vpadalq_s16(acc, sum);
}
#endif
// Compute optimal SIMD register count for feature transformer accumulation.
template<IndexType TransformedFeatureWidth, IndexType HalfDimensions, IndexType PSQTBuckets>
class SIMDTiling {
#ifdef VECTOR
// We use __m* types as template arguments, which causes GCC to emit warnings
// about losing some attribute information. This is irrelevant to us as we
// only take their size, so the following pragmas are harmless.
#if defined(__GNUC__)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wignored-attributes"
#endif
template<typename SIMDRegisterType, typename LaneType, int NumLanes, int MaxRegisters>
static constexpr int BestRegisterCount() {
constexpr std::size_t RegisterSize = sizeof(SIMDRegisterType);
constexpr std::size_t LaneSize = sizeof(LaneType);
static_assert(RegisterSize >= LaneSize);
static_assert(MaxRegisters <= NumRegistersSIMD);
static_assert(MaxRegisters > 0);
static_assert(NumRegistersSIMD > 0);
static_assert(RegisterSize % LaneSize == 0);
static_assert((NumLanes * LaneSize) % RegisterSize == 0);
const int ideal = (NumLanes * LaneSize) / RegisterSize;
if (ideal <= MaxRegisters)
return ideal;
// Look for the largest divisor of the ideal register count that does not exceed MaxRegisters
for (int divisor = MaxRegisters; divisor > 1; --divisor)
if (ideal % divisor == 0)
return divisor;
return 1;
}
#if defined(__GNUC__)
#pragma GCC diagnostic pop
#endif
public:
static constexpr int NumRegs =
BestRegisterCount<vec_t, WeightType, TransformedFeatureWidth, NumRegistersSIMD>();
static constexpr int NumPsqtRegs =
BestRegisterCount<psqt_vec_t, PSQTWeightType, PSQTBuckets, NumRegistersSIMD>();
static constexpr IndexType TileHeight = NumRegs * sizeof(vec_t) / 2;
static constexpr IndexType PsqtTileHeight = NumPsqtRegs * sizeof(psqt_vec_t) / 4;
static_assert(HalfDimensions % TileHeight == 0, "TileHeight must divide HalfDimensions");
static_assert(PSQTBuckets % PsqtTileHeight == 0, "PsqtTileHeight must divide PSQTBuckets");
#endif
};
}
#endif
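
The register-count rule in SIMDTiling can be checked by hand with a standalone restatement. The sketch below is not the engine's code: it passes the register size as a plain byte count instead of a SIMD type so that it compiles without intrinsics headers, and the name best_register_count and the example widths are assumptions made for illustration.

```cpp
#include <cstddef>

// Standalone restatement of the BestRegisterCount rule. The register size is
// a byte count here (an assumption of this sketch) rather than a SIMD type.
constexpr int best_register_count(std::size_t registerBytes,
                                  std::size_t laneBytes,
                                  int         numLanes,
                                  int         maxRegisters) {
    // Number of registers needed to hold every lane at once.
    const int ideal = int(numLanes * laneBytes / registerBytes);
    if (ideal <= maxRegisters)
        return ideal;

    // Otherwise take the largest divisor of `ideal` that still fits, so the
    // accumulator splits into equally sized tiles.
    for (int divisor = maxRegisters; divisor > 1; --divisor)
        if (ideal % divisor == 0)
            return divisor;

    return 1;
}

// Hypothetical widths, 16 available registers:
//   1024 int16 lanes, 64-byte registers: ideal = 32, largest divisor <= 16 is 16
//    768 int16 lanes, 64-byte registers: ideal = 24, largest divisor <= 16 is 12
//      8 int32 lanes, 32-byte registers: ideal = 1, returned directly
static_assert(best_register_count(64, 2, 1024, 16) == 16);
static_assert(best_register_count(64, 2, 768, 16) == 12);
static_assert(best_register_count(32, 4, 8, 16) == 1);
```

With 12 registers of 64 bytes, the tile height formula above gives 12 * 64 / 2 = 384 lanes per tile, which divides the hypothetical width of 768 as the static_assert in SIMDTiling requires.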


@@ -54,8 +54,8 @@ namespace {
constexpr std::string_view PieceToChar(" PNBRQK pnbrqk");
constexpr Piece Pieces[] = {W_PAWN, W_KNIGHT, W_BISHOP, W_ROOK, W_QUEEN, W_KING,
B_PAWN, B_KNIGHT, B_BISHOP, B_ROOK, B_QUEEN, B_KING};
static constexpr Piece Pieces[] = {W_PAWN, W_KNIGHT, W_BISHOP, W_ROOK, W_QUEEN, W_KING,
B_PAWN, B_KNIGHT, B_BISHOP, B_ROOK, B_QUEEN, B_KING};
} // namespace
@@ -270,7 +270,7 @@ Position& Position::set(const string& fenStr, bool isChess960, StateInfo* si) {
// a) side to move has a pawn threatening epSquare
// b) there is an enemy pawn in front of epSquare
// c) there is no piece on epSquare or behind epSquare
enpassant = pawn_attacks_bb(~sideToMove, st->epSquare) & pieces(sideToMove, PAWN)
enpassant = attacks_bb<PAWN>(st->epSquare, ~sideToMove) & pieces(sideToMove, PAWN)
&& (pieces(~sideToMove, PAWN) & (st->epSquare + pawn_push(~sideToMove)))
&& !(pieces() & (st->epSquare | (st->epSquare + pawn_push(sideToMove))));
}
@@ -321,7 +321,7 @@ void Position::set_check_info() const {
Square ksq = square<KING>(~sideToMove);
st->checkSquares[PAWN] = pawn_attacks_bb(~sideToMove, ksq);
st->checkSquares[PAWN] = attacks_bb<PAWN>(ksq, ~sideToMove);
st->checkSquares[KNIGHT] = attacks_bb<KNIGHT>(ksq);
st->checkSquares[BISHOP] = attacks_bb<BISHOP>(ksq, pieces());
st->checkSquares[ROOK] = attacks_bb<ROOK>(ksq, pieces());
@@ -487,8 +487,8 @@ Bitboard Position::attackers_to(Square s, Bitboard occupied) const {
return (attacks_bb<ROOK>(s, occupied) & pieces(ROOK, QUEEN))
| (attacks_bb<BISHOP>(s, occupied) & pieces(BISHOP, QUEEN))
| (pawn_attacks_bb(BLACK, s) & pieces(WHITE, PAWN))
| (pawn_attacks_bb(WHITE, s) & pieces(BLACK, PAWN))
| (attacks_bb<PAWN>(s, BLACK) & pieces(WHITE, PAWN))
| (attacks_bb<PAWN>(s, WHITE) & pieces(BLACK, PAWN))
| (attacks_bb<KNIGHT>(s) & pieces(KNIGHT)) | (attacks_bb<KING>(s) & pieces(KING));
}
@@ -498,7 +498,7 @@ bool Position::attackers_to_exist(Square s, Bitboard occupied, Color c) const {
&& (attacks_bb<ROOK>(s, occupied) & pieces(c, ROOK, QUEEN)))
|| ((attacks_bb<BISHOP>(s) & pieces(c, BISHOP, QUEEN))
&& (attacks_bb<BISHOP>(s, occupied) & pieces(c, BISHOP, QUEEN)))
|| (((pawn_attacks_bb(~c, s) & pieces(PAWN)) | (attacks_bb<KNIGHT>(s) & pieces(KNIGHT))
|| (((attacks_bb<PAWN>(s, ~c) & pieces(PAWN)) | (attacks_bb<KNIGHT>(s) & pieces(KNIGHT))
| (attacks_bb<KING>(s) & pieces(KING)))
& pieces(c));
}
@@ -597,10 +597,14 @@ bool Position::pseudo_legal(const Move m) const {
if ((Rank8BB | Rank1BB) & to)
return false;
if (!(pawn_attacks_bb(us, from) & pieces(~us) & to) // Not a capture
&& !((from + pawn_push(us) == to) && empty(to)) // Not a single push
&& !((from + 2 * pawn_push(us) == to) // Not a double push
&& (relative_rank(us, from) == RANK_2) && empty(to) && empty(to - pawn_push(us))))
// Check if it's a valid capture, single push, or double push
const bool isCapture = bool(attacks_bb<PAWN>(from, us) & pieces(~us) & to);
const bool isSinglePush = (from + pawn_push(us) == to) && empty(to);
const bool isDoublePush = (from + 2 * pawn_push(us) == to)
&& (relative_rank(us, from) == RANK_2) && empty(to)
&& empty(to - pawn_push(us));
if (!(isCapture || isSinglePush || isDoublePush))
return false;
}
else if (!(attacks_bb(type_of(pc), from, pieces()) & to))
@@ -698,7 +702,6 @@ DirtyPiece Position::do_move(Move m,
// our state pointer to point to the new (ready to be updated) state.
std::memcpy(&newSt, st, offsetof(StateInfo, key));
newSt.previous = st;
st->next = &newSt;
st = &newSt;
// Increment ply counters. In particular, rule50 will be reset to zero later on
@@ -707,9 +710,6 @@ DirtyPiece Position::do_move(Move m,
++st->rule50;
++st->pliesFromNull;
DirtyPiece dp;
dp.dirty_num = 1;
Color us = sideToMove;
Color them = ~us;
Square from = m.from_sq();
@@ -717,6 +717,12 @@ DirtyPiece Position::do_move(Move m,
Piece pc = piece_on(from);
Piece captured = m.type_of() == EN_PASSANT ? make_piece(them, PAWN) : piece_on(to);
DirtyPiece dp;
dp.pc = pc;
dp.from = from;
dp.to = to;
dp.add_sq = SQ_NONE;
assert(color_of(pc) == us);
assert(captured == NO_PIECE || color_of(captured) == (m.type_of() != CASTLING ? them : us));
assert(type_of(captured) != KING);
@@ -733,8 +739,7 @@ DirtyPiece Position::do_move(Move m,
st->nonPawnKey[us] ^= Zobrist::psq[captured][rfrom] ^ Zobrist::psq[captured][rto];
captured = NO_PIECE;
}
if (captured)
else if (captured)
{
Square capsq = to;
@@ -764,10 +769,8 @@ DirtyPiece Position::do_move(Move m,
st->minorPieceKey ^= Zobrist::psq[captured][capsq];
}
dp.dirty_num = 2; // 1 piece moved, 1 piece captured
dp.piece[1] = captured;
dp.from[1] = capsq;
dp.to[1] = SQ_NONE;
dp.remove_pc = captured;
dp.remove_sq = capsq;
// Update board and piece lists
remove_piece(capsq);
@@ -778,6 +781,8 @@ DirtyPiece Position::do_move(Move m,
// Reset rule 50 counter
st->rule50 = 0;
}
else
dp.remove_sq = SQ_NONE;
// Update hash key
k ^= Zobrist::psq[pc][from] ^ Zobrist::psq[pc][to];
@@ -800,9 +805,6 @@ DirtyPiece Position::do_move(Move m,
// Move the piece. The tricky Chess960 castling is handled earlier
if (m.type_of() != CASTLING)
{
dp.piece[0] = pc;
dp.from[0] = from;
dp.to[0] = to;
move_piece(from, to);
}
@@ -812,7 +814,7 @@ DirtyPiece Position::do_move(Move m,
{
// Set en passant square if the moved pawn can be captured
if ((int(to) ^ int(from)) == 16
&& (pawn_attacks_bb(us, to - pawn_push(us)) & pieces(them, PAWN)))
&& (attacks_bb<PAWN>(to - pawn_push(us), us) & pieces(them, PAWN)))
{
st->epSquare = to - pawn_push(us);
k ^= Zobrist::enpassant[file_of(st->epSquare)];
@@ -829,12 +831,9 @@ DirtyPiece Position::do_move(Move m,
remove_piece(to);
put_piece(promotion, to);
// Promoting pawn to SQ_NONE, promoted piece from SQ_NONE
dp.to[0] = SQ_NONE;
dp.piece[dp.dirty_num] = promotion;
dp.from[dp.dirty_num] = SQ_NONE;
dp.to[dp.dirty_num] = to;
dp.dirty_num++;
dp.add_pc = promotion;
dp.add_sq = to;
dp.to = SQ_NONE;
// Update hash keys
// Zobrist::psq[pc][to] is zero, so we don't need to clear it
@@ -901,6 +900,10 @@ DirtyPiece Position::do_move(Move m,
assert(pos_is_ok());
assert(dp.pc != NO_PIECE);
assert(!(bool(captured) || m.type_of() == CASTLING) ^ (dp.remove_sq != SQ_NONE));
assert(dp.from != SQ_NONE);
assert(!(dp.add_sq != SQ_NONE) ^ (m.type_of() == PROMOTION || m.type_of() == CASTLING));
return dp;
}
@@ -983,13 +986,10 @@ void Position::do_castling(
if (Do)
{
dp->piece[0] = make_piece(us, KING);
dp->from[0] = from;
dp->to[0] = to;
dp->piece[1] = make_piece(us, ROOK);
dp->from[1] = rfrom;
dp->to[1] = rto;
dp->dirty_num = 2;
dp->to = to;
dp->remove_pc = dp->add_pc = make_piece(us, ROOK);
dp->remove_sq = rfrom;
dp->add_sq = rto;
}
// Remove both pieces first since squares could overlap in Chess960
@@ -1012,7 +1012,6 @@ void Position::do_null_move(StateInfo& newSt, const TranspositionTable& tt) {
std::memcpy(&newSt, st, sizeof(StateInfo));
newSt.previous = st;
st->next = &newSt;
st = &newSt;
if (st->epSquare != SQ_NONE)
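
The hunks above fill a DirtyPiece with one moved piece (pc, from, to) plus an optional removal (remove_pc, remove_sq) and an optional addition (add_pc, add_sq). The sketch below shows one way a consumer might replay such a record on a plain 64-square array. It is an illustration only: the simplified Piece and Square aliases, the value 64 for SQ_NONE, the Board type and the apply function are assumptions, not the engine's accumulator code.

```cpp
#include <array>
#include <cstdint>

// Simplified stand-ins for the engine's types (assumptions for this sketch).
using Piece  = std::int8_t;
using Square = std::int8_t;
constexpr Piece  NO_PIECE = 0;
constexpr Square SQ_NONE  = 64;

struct DirtyPiece {
    Piece  pc;                 // moved piece, never NO_PIECE
    Square from, to;           // `to` is SQ_NONE for promotions
    Square remove_sq, add_sq;  // SQ_NONE when unused
    Piece  remove_pc, add_pc;
};

using Board = std::array<Piece, 64>;

// Replays one DirtyPiece. Removals are done before additions because the
// squares can overlap (the diff notes this for Chess960 castling).
constexpr Board apply(Board board, const DirtyPiece& dp) {
    board[dp.from] = NO_PIECE;
    if (dp.remove_sq != SQ_NONE)
        board[dp.remove_sq] = NO_PIECE;
    if (dp.to != SQ_NONE)
        board[dp.to] = dp.pc;
    if (dp.add_sq != SQ_NONE)
        board[dp.add_sq] = dp.add_pc;
    return board;
}

// Capture with promotion: pawn (1) on square 52 takes a rook (12) on square 61
// and becomes a queen (5). `to` is SQ_NONE, the queen arrives through add_sq.
constexpr Board promotion_capture_example() {
    Board b{};
    b[52] = 1;
    b[61] = 12;
    return apply(b, DirtyPiece{1, 52, SQ_NONE, 61, 61, 12, 5});
}

static_assert(promotion_capture_example()[52] == NO_PIECE);
static_assert(promotion_capture_example()[61] == 5);
```

Castling fits the same shape: the king travels through from and to, while the rook is expressed as a removal at rfrom and an addition at rto.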

View File

@@ -53,7 +53,6 @@ struct StateInfo {
Key key;
Bitboard checkersBB;
StateInfo* previous;
StateInfo* next;
Bitboard blockersForKing[COLOR_NB];
Bitboard pinners[COLOR_NB];
Bitboard checkSquares[PIECE_TYPE_NB];
@@ -87,9 +86,9 @@ class Position {
std::string fen() const;
// Position representation
Bitboard pieces(PieceType pt = ALL_PIECES) const;
Bitboard pieces() const; // All pieces
template<typename... PieceTypes>
Bitboard pieces(PieceType pt, PieceTypes... pts) const;
Bitboard pieces(PieceTypes... pts) const;
Bitboard pieces(Color c) const;
template<typename... PieceTypes>
Bitboard pieces(Color c, PieceTypes... pts) const;
@@ -165,7 +164,6 @@ class Position {
bool pos_is_ok() const;
void flip();
// Used by NNUE
StateInfo* state() const;
void put_piece(Piece pc, Square s);
@@ -216,11 +214,11 @@ inline bool Position::empty(Square s) const { return piece_on(s) == NO_PIECE; }
inline Piece Position::moved_piece(Move m) const { return piece_on(m.from_sq()); }
inline Bitboard Position::pieces(PieceType pt) const { return byTypeBB[pt]; }
inline Bitboard Position::pieces() const { return byTypeBB[ALL_PIECES]; }
template<typename... PieceTypes>
inline Bitboard Position::pieces(PieceType pt, PieceTypes... pts) const {
return pieces(pt) | pieces(pts...);
inline Bitboard Position::pieces(PieceTypes... pts) const {
return (byTypeBB[pts] | ...);
}
inline Bitboard Position::pieces(Color c) const { return byColorBB[c]; }
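
The new variadic pieces() replaces the recursive overload with a fold expression over byTypeBB. A minimal self-contained sketch of that pattern follows. MiniPosition and example() are hypothetical names, and the enum is a cut-down copy used only so the snippet compiles on its own.

```cpp
#include <cstdint>

using Bitboard = std::uint64_t;

// Cut-down copy of the piece-type enum (illustration only).
enum PieceType : std::int8_t {
    NO_PIECE_TYPE, PAWN, KNIGHT, BISHOP, ROOK, QUEEN, KING,
    ALL_PIECES    = 0,
    PIECE_TYPE_NB = 8
};

struct MiniPosition {
    Bitboard byTypeBB[PIECE_TYPE_NB] = {};

    // Zero arguments: all pieces. This non-template overload wins overload
    // resolution, so the fold below is never instantiated with an empty pack.
    constexpr Bitboard pieces() const { return byTypeBB[ALL_PIECES]; }

    // One or more piece types: OR the per-type bitboards with a unary fold.
    template<typename... PieceTypes>
    constexpr Bitboard pieces(PieceTypes... pts) const {
        return (byTypeBB[pts] | ...);
    }
};

// Hypothetical board with rooks on two squares and a queen on one.
constexpr MiniPosition example() {
    MiniPosition p{};
    p.byTypeBB[ROOK]       = 0x81;
    p.byTypeBB[QUEEN]      = 0x08;
    p.byTypeBB[ALL_PIECES] = 0x89;
    return p;
}

static_assert(example().pieces(ROOK) == 0x81);
static_assert(example().pieces(ROOK, QUEEN) == 0x89);
static_assert(example().pieces() == 0x89);
```

The separate zero-argument overload matters because a unary fold over `|` is ill-formed for an empty pack.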

File diff suppressed because it is too large


@@ -75,7 +75,8 @@ struct Stack {
bool ttHit;
int cutoffCnt;
int reduction;
bool isTTMove;
bool isPvNode;
int quietMoveStreak;
};
@@ -292,6 +293,8 @@ class Worker {
CorrectionHistory<NonPawn> nonPawnCorrectionHistory;
CorrectionHistory<Continuation> continuationCorrectionHistory;
TTMoveHistory ttMoveHistory;
private:
void iterative_deepening();


@@ -164,7 +164,7 @@ class ThreadPool {
std::vector<std::unique_ptr<Thread>> threads;
std::vector<NumaIndex> boundThreadToNumaNode;
uint64_t accumulate(std::atomic<uint64_t> Search::Worker::*member) const {
uint64_t accumulate(std::atomic<uint64_t> Search::Worker::* member) const {
uint64_t sum = 0;
for (auto&& th : threads)


@@ -85,16 +85,13 @@ void TimeManagement::init(Search::LimitsType& limits,
// with constants are involved.
const int64_t scaleFactor = useNodesTime ? npmsec : 1;
const TimePoint scaledTime = limits.time[us] / scaleFactor;
const TimePoint scaledInc = limits.inc[us] / scaleFactor;
// Maximum move horizon
int centiMTG = limits.movestogo ? std::min(limits.movestogo * 100, 5000) : 5051;
// If less than one second, gradually reduce mtg
if (scaledTime < 1000 && double(centiMTG) / scaledInc > 5.051)
{
if (scaledTime < 1000)
centiMTG = scaledTime * 5.051;
}
// Make sure timeLeft is > 0 since we may use it as a divisor
TimePoint timeLeft =
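
The effect of the simplified sudden-death branch can be worked through with a few numbers. The sketch below isolates the move-horizon computation as it reads without the scaledInc test, assuming that variant is the one added by this change. The function name centi_moves_to_go and the use of std::int64_t for the time value are assumptions made so the snippet stands alone.

```cpp
#include <algorithm>
#include <cstdint>

// Move horizon in hundredths of a move, following the hunk above.
// scaledTime is the remaining time in milliseconds (assumed 64-bit here).
constexpr int centi_moves_to_go(int movestogo, std::int64_t scaledTime) {
    // Maximum move horizon
    int centiMTG = movestogo ? std::min(movestogo * 100, 5000) : 5051;

    // With less than one second left, shrink the horizon in proportion
    // to the remaining time.
    if (scaledTime < 1000)
        centiMTG = int(scaledTime * 5.051);

    return centiMTG;
}

static_assert(centi_moves_to_go(0, 60000) == 5051);   // sudden death, plenty of time
static_assert(centi_moves_to_go(40, 60000) == 4000);  // 40 moves to go
static_assert(centi_moves_to_go(80, 60000) == 5000);  // capped at 50 moves
static_assert(centi_moves_to_go(0, 400) == 2020);     // 400 ms left: 400 * 5.051
```

At 999 ms the scaled value is 5045, so the horizon stays close to the 5051 default at the one-second boundary.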


@@ -22,11 +22,11 @@
#include <cstdint>
#include "misc.h"
#include "types.h"
namespace Stockfish {
class OptionsMap;
enum Color : int8_t;
namespace Search {
struct LimitsType;


@@ -110,6 +110,8 @@ void TTEntry::save(
value16 = int16_t(v);
eval16 = int16_t(ev);
}
else if (depth8 + DEPTH_ENTRY_OFFSET >= 5 && Bound(genBound8 & 0x3) != BOUND_EXACT)
depth8--;
}
@@ -234,8 +236,8 @@ std::tuple<bool, TTData, TTWriter> TranspositionTable::probe(const Key key) cons
// Find an entry to be replaced according to the replacement strategy
TTEntry* replace = tte;
for (int i = 1; i < ClusterSize; ++i)
if (replace->depth8 - replace->relative_age(generation8) * 2
> tte[i].depth8 - tte[i].relative_age(generation8) * 2)
if (replace->depth8 - replace->relative_age(generation8)
> tte[i].depth8 - tte[i].relative_age(generation8))
replace = &tte[i];
return {false,
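
The replacement loop above scores each entry in a cluster as depth minus relative age and replaces the entry with the lowest score. A small sketch of that selection follows, using the unweighted form from the second line pair of the hunk. Entry, pick_replacement and the example values are hypothetical. In the engine the age comes from relative_age(generation8) rather than from a stored field.

```cpp
#include <cstddef>

// Hypothetical flattened entry: stored depth plus a precomputed relative age.
struct Entry {
    int depth8;
    int relativeAge;
};

// Returns the index of the entry with the lowest (depth8 - relativeAge),
// keeping the earliest entry on ties, as the loop in the hunk does.
template<std::size_t N>
constexpr std::size_t pick_replacement(const Entry (&cluster)[N]) {
    std::size_t replace = 0;
    for (std::size_t i = 1; i < N; ++i)
        if (cluster[replace].depth8 - cluster[replace].relativeAge
            > cluster[i].depth8 - cluster[i].relativeAge)
            replace = i;
    return replace;
}

// Scores: 20 - 0 = 20, 12 - 8 = 4, 6 - 0 = 6. The old but fairly deep
// middle entry has the lowest score and is the one replaced.
constexpr Entry ExampleCluster[3] = {{20, 0}, {12, 8}, {6, 0}};
static_assert(pick_replacement(ExampleCluster) == 1);
```

With the doubled age weight from the first line pair, the middle entry would score 12 - 16 = -4 and still be chosen. The two variants only differ when ages and depths are closer together.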


@@ -37,7 +37,9 @@
// | only in 64-bit mode and requires hardware with pext support.
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#if defined(_MSC_VER)
// Disable some silly and noisy warnings from MSVC compiler
@@ -55,9 +57,15 @@
// _WIN32 Building on Windows (any)
// _WIN64 Building on Windows 64 bit
#if defined(__GNUC__) && (__GNUC__ < 9 || (__GNUC__ == 9 && __GNUC_MINOR__ <= 2)) \
&& defined(_WIN32) && !defined(__clang__)
#define ALIGNAS_ON_STACK_VARIABLES_BROKEN
// Enforce minimum GCC version
#if defined(__GNUC__) && !defined(__clang__) \
&& (__GNUC__ < 9 || (__GNUC__ == 9 && __GNUC_MINOR__ < 3))
#error "Stockfish requires GCC 9.3 or later for correct compilation"
#endif
// Enforce minimum Clang version
#if defined(__clang__) && (__clang_major__ < 10)
#error "Stockfish requires Clang 10.0 or later for correct compilation"
#endif
#define ASSERT_ALIGNED(ptr, alignment) assert(reinterpret_cast<uintptr_t>(ptr) % alignment == 0)
@@ -108,13 +116,13 @@ using Bitboard = uint64_t;
constexpr int MAX_MOVES = 256;
constexpr int MAX_PLY = 246;
enum Color {
enum Color : int8_t {
WHITE,
BLACK,
COLOR_NB = 2
};
enum CastlingRights {
enum CastlingRights : int8_t {
NO_CASTLING,
WHITE_OO,
WHITE_OOO = WHITE_OO << 1,
@@ -130,7 +138,7 @@ enum CastlingRights {
CASTLING_RIGHT_NB = 16
};
enum Bound {
enum Bound : int8_t {
BOUND_NONE,
BOUND_UPPER,
BOUND_LOWER,
@@ -181,13 +189,13 @@ constexpr Value QueenValue = 2538;
// clang-format off
enum PieceType {
enum PieceType : std::int8_t {
NO_PIECE_TYPE, PAWN, KNIGHT, BISHOP, ROOK, QUEEN, KING,
ALL_PIECES = 0,
PIECE_TYPE_NB = 8
};
enum Piece {
enum Piece : std::int8_t {
NO_PIECE,
W_PAWN = PAWN, W_KNIGHT, W_BISHOP, W_ROOK, W_QUEEN, W_KING,
B_PAWN = PAWN + 8, B_KNIGHT, B_BISHOP, B_ROOK, B_QUEEN, B_KING,
@@ -201,26 +209,24 @@ constexpr Value PieceValue[PIECE_NB] = {
using Depth = int;
enum : int {
// The following DEPTH_ constants are used for transposition table entries
// and quiescence search move generation stages. In regular search, the
// depth stored in the transposition table is literal: the search depth
// (effort) used to make the corresponding transposition table value. In
// quiescence search, however, the transposition table entries only store
// the current quiescence move generation stage (which should thus compare
// lower than any regular search depth).
DEPTH_QS = 0,
// For transposition table entries where no searching at all was done
// (whether regular or qsearch) we use DEPTH_UNSEARCHED, which should thus
// compare lower than any quiescence or regular depth. DEPTH_ENTRY_OFFSET
// is used only for the transposition table entry occupancy check (see tt.cpp),
// and should thus be lower than DEPTH_UNSEARCHED.
DEPTH_UNSEARCHED = -2,
DEPTH_ENTRY_OFFSET = -3
};
// The following DEPTH_ constants are used for transposition table entries
// and quiescence search move generation stages. In regular search, the
// depth stored in the transposition table is literal: the search depth
// (effort) used to make the corresponding transposition table value. In
// quiescence search, however, the transposition table entries only store
// the current quiescence move generation stage (which should thus compare
// lower than any regular search depth).
constexpr Depth DEPTH_QS = 0;
// For transposition table entries where no searching at all was done
// (whether regular or qsearch) we use DEPTH_UNSEARCHED, which should thus
// compare lower than any quiescence or regular depth. DEPTH_ENTRY_OFFSET
// is used only for the transposition table entry occupancy check (see tt.cpp),
// and should thus be lower than DEPTH_UNSEARCHED.
constexpr Depth DEPTH_UNSEARCHED = -2;
constexpr Depth DEPTH_ENTRY_OFFSET = -3;
// clang-format off
enum Square : int {
enum Square : int8_t {
SQ_A1, SQ_B1, SQ_C1, SQ_D1, SQ_E1, SQ_F1, SQ_G1, SQ_H1,
SQ_A2, SQ_B2, SQ_C2, SQ_D2, SQ_E2, SQ_F2, SQ_G2, SQ_H2,
SQ_A3, SQ_B3, SQ_C3, SQ_D3, SQ_E3, SQ_F3, SQ_G3, SQ_H3,
@@ -236,7 +242,7 @@ enum Square : int {
};
// clang-format on
enum Direction : int {
enum Direction : int8_t {
NORTH = 8,
EAST = 1,
SOUTH = -NORTH,
@@ -248,7 +254,7 @@ enum Direction : int {
NORTH_WEST = NORTH + WEST
};
enum File : int {
enum File : int8_t {
FILE_A,
FILE_B,
FILE_C,
@@ -260,7 +266,7 @@ enum File : int {
FILE_NB
};
enum Rank : int {
enum Rank : int8_t {
RANK_1,
RANK_2,
RANK_3,
@@ -274,23 +280,19 @@ enum Rank : int {
// Keep track of what a move changes on the board (used by NNUE)
struct DirtyPiece {
Piece pc; // this is never allowed to be NO_PIECE
Square from, to; // to should be SQ_NONE for promotions
// Number of changed pieces
int dirty_num;
// Max 3 pieces can change in one move. A promotion with capture moves
// both the pawn and the captured piece to SQ_NONE and the piece promoted
// to from SQ_NONE to the capture square.
Piece piece[3];
// From and to squares, which may be SQ_NONE
Square from[3];
Square to[3];
// if {add,remove}_sq is SQ_NONE, {add,remove}_pc is allowed to be
// uninitialized
// castling uses add_sq and remove_sq to remove and add the rook
Square remove_sq, add_sq;
Piece remove_pc, add_pc;
};
#define ENABLE_INCR_OPERATORS_ON(T) \
inline T& operator++(T& d) { return d = T(int(d) + 1); } \
inline T& operator--(T& d) { return d = T(int(d) - 1); }
constexpr T& operator++(T& d) { return d = T(int(d) + 1); } \
constexpr T& operator--(T& d) { return d = T(int(d) - 1); }
ENABLE_INCR_OPERATORS_ON(PieceType)
ENABLE_INCR_OPERATORS_ON(Square)
@@ -303,10 +305,10 @@ constexpr Direction operator+(Direction d1, Direction d2) { return Direction(int
constexpr Direction operator*(int i, Direction d) { return Direction(i * int(d)); }
// Additional operators to add a Direction to a Square
constexpr Square operator+(Square s, Direction d) { return Square(int(s) + int(d)); }
constexpr Square operator-(Square s, Direction d) { return Square(int(s) - int(d)); }
inline Square& operator+=(Square& s, Direction d) { return s = s + d; }
inline Square& operator-=(Square& s, Direction d) { return s = s - d; }
constexpr Square operator+(Square s, Direction d) { return Square(int(s) + int(d)); }
constexpr Square operator-(Square s, Direction d) { return Square(int(s) - int(d)); }
constexpr Square& operator+=(Square& s, Direction d) { return s = s + d; }
constexpr Square& operator-=(Square& s, Direction d) { return s = s - d; }
// Toggle color
constexpr Color operator~(Color c) { return Color(c ^ BLACK); }
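The switch from `inline` to `constexpr` for the compound operators, together with the `int8_t` underlying type, can be checked with a small standalone sketch. The enums below are reduced stand-ins with only a few enumerators, and `advance` is a hypothetical helper that is not part of the source.

```cpp
#include <cstdint>

// Reduced stand-ins; the real enums list every square and direction.
enum Square : std::int8_t { SQ_A1 = 0, SQ_H1 = 7, SQ_A2 = 8, SQ_A8 = 56, SQ_NONE = 64 };
enum Direction : std::int8_t { NORTH = 8, EAST = 1, SOUTH = -NORTH, WEST = -EAST };

// Same shape as the operators in the diff: arithmetic is done in int and
// the result is converted back to the enum type.
constexpr Square  operator+(Square s, Direction d) { return Square(int(s) + int(d)); }
constexpr Square  operator-(Square s, Direction d) { return Square(int(s) - int(d)); }
constexpr Square& operator+=(Square& s, Direction d) { return s = s + d; }
constexpr Square& operator-=(Square& s, Direction d) { return s = s - d; }

// Hypothetical helper: step a square a number of times in one direction.
// This is usable in constant expressions only because += is constexpr.
constexpr Square advance(Square s, Direction d, int steps) {
    for (int i = 0; i < steps; ++i)
        s += d;
    return s;
}

// With the int8_t underlying type each value occupies a single byte.
static_assert(sizeof(Square) == 1);
static_assert(sizeof(Direction) == 1);
static_assert(advance(SQ_A1, NORTH, 1) == SQ_A2);
```

With `inline` instead of `constexpr` on `operator+=`, the last `static_assert` would not compile, since `advance` could no longer be evaluated at compile time.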
@@ -334,7 +336,7 @@ constexpr Piece make_piece(Color c, PieceType pt) { return Piece((c << 3) + pt);
constexpr PieceType type_of(Piece pc) { return PieceType(pc & 7); }
inline Color color_of(Piece pc) {
constexpr Color color_of(Piece pc) {
assert(pc != NO_PIECE);
return Color(pc >> 3);
}
@@ -429,6 +431,14 @@ class Move {
std::uint16_t data;
};
template<typename T, typename... Ts>
struct is_all_same {
static constexpr bool value = (std::is_same_v<T, Ts> && ...);
};
template<typename... Ts>
constexpr auto is_all_same_v = is_all_same<Ts...>::value;
} // namespace Stockfish
#endif // #ifndef TYPES_H_INCLUDED
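The `is_all_same` trait added at the end of this file uses a C++17 fold expression over `std::is_same_v`. A minimal standalone sketch of how it behaves follows; `sum_same` is a hypothetical helper added only for illustration and does not appear in the source.

```cpp
#include <type_traits>

// Same definition as in the diff: true when every type in Ts equals T.
template<typename T, typename... Ts>
struct is_all_same {
    static constexpr bool value = (std::is_same_v<T, Ts> && ...);
};

template<typename... Ts>
constexpr auto is_all_same_v = is_all_same<Ts...>::value;

// Hypothetical helper: adds its arguments, but only compiles when all of
// them have exactly the same type. Requires at least one argument.
template<typename... Ts>
constexpr auto sum_same(Ts... xs) {
    static_assert(is_all_same_v<Ts...>, "all arguments must share one type");
    return (xs + ...);
}

static_assert(is_all_same_v<int, int, int>);
static_assert(!is_all_same_v<int, long>);
// A single type is trivially "all the same": the fold over an empty pack
// with && evaluates to true.
static_assert(is_all_same_v<int>);
```

Note that the comparison is exact: `int` and `const int`, or `int` and `long`, count as different types.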

@@ -33,7 +33,7 @@ namespace Stockfish {
class Position;
class Move;
class Score;
enum Square : int;
enum Square : int8_t;
using Value = int;
class UCIEngine {

@@ -2,16 +2,26 @@
# obtain and optionally verify Bench / signature
# if no reference is given, the output is deliberately limited to just the signature
STDOUT_FILE=$(mktemp)
STDERR_FILE=$(mktemp)
error()
{
echo "running bench for signature failed on line $1"
echo "===== STDOUT ====="
cat "$STDOUT_FILE"
echo "===== STDERR ====="
cat "$STDERR_FILE"
rm -f "$STDOUT_FILE" "$STDERR_FILE"
exit 1
}
trap 'error ${LINENO}' ERR
# obtain
eval "$WINE_PATH ./stockfish bench" > "$STDOUT_FILE" 2> "$STDERR_FILE" || error ${LINENO}
signature=$(grep "Nodes searched : " "$STDERR_FILE" | awk '{print $4}')
signature=`eval "$WINE_PATH ./stockfish bench 2>&1" | grep "Nodes searched : " | awk '{print $4}'`
rm -f "$STDOUT_FILE" "$STDERR_FILE"
if [ $# -gt 0 ]; then
# compare to given reference
@@ -28,4 +38,4 @@ if [ $# -gt 0 ]; then
else
# just report signature
echo $signature
fi
fi