This workaround possibly addresses a problem specific to Windows and gcc. See e.g.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412#c25
On Linux with gcc 8, this patch brings roughly an 8% speedup.
However, it probably needs some testing in the wild.
Includes a workaround for an old msys make (3.81) installation (fixes #2984)
No functional change
Change the condition from three friendly pieces to two. This means that we now only extend castling on the king side if there are no friendly pieces other than the king and rook. On the queen side, we only extend if there is a rook and one other friendly piece, or if there is a single rook and no other friendly piece, though the latter is very rare.
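The condition described above can be sketched as follows. This is a minimal illustration, not the code from the patch: it assumes the check counts friendly non-pawn pieces on the half of the board being castled towards, and the names `KingSideMask`, `QueenSideMask` and `extendCastling` are hypothetical.

```cpp
#include <bitset>
#include <cstdint>

// Hypothetical board-half masks with a1 = bit 0 and h8 = bit 63:
// files a-d form the queen side, files e-h the king side.
constexpr std::uint64_t QueenSideMask = 0x0F0F0F0F0F0F0F0FULL;
constexpr std::uint64_t KingSideMask  = 0xF0F0F0F0F0F0F0F0ULL;

// Assumed rule: extend a castling move only if at most two friendly
// non-pawn pieces stand on the half of the board castled towards.
inline bool extendCastling(std::uint64_t friendlyNonPawns, bool kingSide) {
    const std::uint64_t half = kingSide ? KingSideMask : QueenSideMask;
    return std::bitset<64>(friendlyNonPawns & half).count() <= 2;
}
```

Under this assumption the king on the e-file is counted on the king side, which is why king-side castling allows only king and rook, while queen-side castling allows the rook plus one more piece.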
STC:
LLR: 3.20 (-2.94,2.94) {-0.50,1.50}
Total: 31144 W: 4086 L: 3903 D: 23155
Ptnml(0-2): 227, 2843, 9278, 2968, 256
https://tests.stockfishchess.org/tests/view/5f31487f9081672066537516
LTC:
LLR: 2.93 (-2.94,2.94) {0.25,1.75}
Total: 57816 W: 3786 L: 3538 D: 50492
Ptnml(0-2): 92, 2991, 22488, 3251, 86
https://tests.stockfishchess.org/tests/view/5f3167c3908167206653753d
closes https://github.com/official-stockfish/Stockfish/pull/2980
Bench: 4244812
Joint work gvreuls / vondele
* Download the default NNUE net in AppVeyor
* Download the net in Travis CI with `make net`
* Adjust tests to cover more archs, speedup instrumented testing
* Introduce 'mixed' bench as default, with further options:
classical, NNUE, mixed.
mixed (default) and NNUE require the default net to be present,
which can be obtained with
```
make net
```
Further examples (first is equivalent to `./stockfish bench`):
```
./stockfish bench 16 1 13 default depth mixed
./stockfish bench 16 1 13 default depth classical
./stockfish bench 16 1 13 default depth NNUE
```
The net is now downloaded automatically if needed for `profile-build`
(usual `build` works fine without net present)
PGO gives a nice speedup on fishtest:
passed STC:
LLR: 2.93 (-2.94,2.94) {-0.50,1.50}
Total: 3360 W: 469 L: 343 D: 2548
Ptnml(0-2): 20, 246, 1030, 356, 28
https://tests.stockfishchess.org/tests/view/5f31b5499081672066537569
passed LTC:
LLR: 2.97 (-2.94,2.94) {0.25,1.75}
Total: 8824 W: 609 L: 502 D: 7713
Ptnml(0-2): 8, 430, 3438, 519, 17
https://tests.stockfishchess.org/tests/view/5f31c87b908167206653757c
closes https://github.com/official-stockfish/Stockfish/pull/2931
fixes https://github.com/official-stockfish/Stockfish/issues/2907
requires fishtest updates before commit
Bench: 4290577
This patch allows old x86 CPUs, from AMD K8 (which the x86-64 baseline
targets) all the way down to the Pentium MMX, to benefit from NNUE with
a performance hit versus hand-written eval comparable to that on more
modern processors.
NPS of the bench with NNUE enabled on a Pentium III 1.13 GHz (using the
MMX code):
master: 38951
this patch: 80586
NPS of the bench with NNUE enabled using baseline x86-64 arch, which is
how linux distros are likely to package stockfish, on a modern CPU
(using the SSE2 code):
master: 882584
this patch: 1203945
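For reference, the relative gain implied by these figures can be computed as below. This is only a small arithmetic sketch; the helper name `speedup` is hypothetical.

```cpp
// Ratio of patched to master nodes per second (values above 1 mean faster).
constexpr double speedup(double masterNps, double patchNps) {
    return patchNps / masterNps;
}

// NPS figures quoted above.
constexpr double MmxSpeedup  = speedup(38951.0, 80586.0);     // Pentium III, MMX code
constexpr double Sse2Speedup = speedup(882584.0, 1203945.0);  // x86-64 baseline, SSE2 code
```

This works out to roughly 2.07x for the MMX code and roughly 1.36x for the SSE2 code.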
closes https://github.com/official-stockfish/Stockfish/pull/2956
No functional change.
Makefile targets x86-64-sse42, x86-sse3 are removed; x86-64-sse41
is renamed to x86-64-sse41-popcnt (it did enable popcnt).
Makefile variables sse3, sse42, their associated compilation flags
and code in misc.cpp are removed.
closes https://github.com/official-stockfish/Stockfish/pull/2922
No functional change
Despite the use of alignas, the generated (avx2/avx512) code needs to use
unaligned loads with older gcc (e.g. confirmed crash with gcc 7.3/mingw on abrok).
Better performance thus requires gcc >= 9 on hardware supporting avx2/avx512.
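A compiler gate of this kind might look like the sketch below. It is an assumption-laden illustration rather than the patch itself: the names `UseAlignedLoads`, `Buffer`, `isAligned` and `canUseAlignedLoad` are hypothetical, and the runtime alignment check is added only to make the idea testable without intrinsics.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed gate: treat gcc older than 9 as unable to guarantee the alignment
// requested by alignas. clang also defines __GNUC__, so it is excluded here.
#if defined(__GNUC__) && !defined(__clang__) && (__GNUC__ < 9)
constexpr bool UseAlignedLoads = false;
#else
constexpr bool UseAlignedLoads = true;
#endif

// Hypothetical 32-byte-aligned buffer, the alignment an aligned AVX2 load expects.
struct alignas(32) Buffer {
    std::int16_t values[16];
};

// True if the address is a multiple of the given alignment in bytes.
inline bool isAligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Aligned loads are only chosen when the compiler gate allows them
// and the address really is aligned; otherwise fall back to unaligned loads.
inline bool canUseAlignedLoad(const void* p, std::size_t alignment) {
    return UseAlignedLoads && isAligned(p, alignment);
}
```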
closes https://github.com/official-stockfish/Stockfish/pull/2969
No functional change
The idea of this patch is that positions are usually more complex and harder to evaluate when there are more pawns.
This patch adjusts NNUE threshold usage depending on the number of pawns in the position: if the pawn count is < 3 we use the
classical evaluation more often, for pawn count = 3 the patch is non-functional,
and with pawn count > 3 the NNUE evaluation is used more often.
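One way such a pawn-dependent threshold could be expressed is sketched below. The constants 460 and 20 are illustrative assumptions, chosen only so that a pawn count of 3 reproduces an assumed previous threshold of 520; the names `BaseThreshold`, `PerPawnBonus`, `nnueThreshold` and `useClassical` are hypothetical and not taken from the patch.

```cpp
#include <cstdlib>

// Assumed illustrative constants: 460 + 20 * 3 == 520.
constexpr int BaseThreshold = 460;
constexpr int PerPawnBonus  = 20;

// Threshold grows with the number of pawns on the board.
constexpr int nnueThreshold(int pawnCount) {
    return BaseThreshold + PerPawnBonus * pawnCount;
}

// Classical evaluation is used when the material imbalance reaches the
// threshold; otherwise the NNUE evaluation is used.
inline bool useClassical(int materialImbalance, int pawnCount) {
    return std::abs(materialImbalance) >= nnueThreshold(pawnCount);
}
```

With these assumed values, an imbalance of 500 selects the classical evaluation when no pawns remain but selects NNUE with three or more pawns.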
passed STC
https://tests.stockfishchess.org/tests/view/5f2f02d09081672066536b1f
LLR: 2.96 (-2.94,2.94) {-0.50,1.50}
Total: 36520 W: 5011 L: 4823 D: 26686
Ptnml(0-2): 299, 3482, 10548, 3594, 337
passed LTC
https://tests.stockfishchess.org/tests/view/5f2f4c329081672066536b5c
LLR: 2.98 (-2.94,2.94) {0.25,1.75}
Total: 39272 W: 2630 L: 2433 D: 34209
Ptnml(0-2): 53, 2066, 15218, 2229, 70
closes https://github.com/official-stockfish/Stockfish/pull/2960
Bench: 4084753
This patch tries to run multiple LTO threads in parallel, speeding up
the build process of optimized builds if the -j make parameter is used.
This mitigates the longer linking times of optimized builds since the
integration of the NNUE code. Roughly 2x build speedup.
I tried a similar patch some two years ago, but it ran into trouble
with old compiler versions at the time. Since we're on the C++17 standard
now, these old compilers should be obsolete.
closes https://github.com/official-stockfish/Stockfish/pull/2943
No functional change.