Add NNUE evaluation

This patch ports the efficiently updatable neural network (NNUE) evaluation to Stockfish.

Both the NNUE and the classical evaluations are available, and can be used to
assign a value to a position that is later used in alpha-beta (PVS) search to find the
best move. The classical evaluation computes this value as a function of various chess
concepts, handcrafted by experts, tested and tuned using fishtest. The NNUE evaluation
computes this value with a neural network based on basic inputs. The network is optimized
and trained on the evalutions of millions of positions at moderate search depth.

The NNUE evaluation was first introduced in shogi, and ported to Stockfish afterward.
It can be evaluated efficiently on CPUs, and exploits the fact that only parts
of the neural network need to be updated after a typical chess move.
[The nodchip repository](https://github.com/nodchip/Stockfish) provides additional
tools to train and develop the NNUE networks.

This patch is the result of contributions of various authors, from various communities,
including: nodchip, ynasu87, yaneurao (initial port and NNUE authors), domschl, FireFather,
rqs, xXH4CKST3RXx, tttak, zz4032, joergoster, mstembera, nguyenpham, erbsenzaehler,
dorzechowski, and vondele.

This new evaluation needed various changes to fishtest and the corresponding infrastructure,
for which tomtor, ppigazzini, noobpwnftw, daylen, and vondele are gratefully acknowledged.

The first networks have been provided by gekkehenker and sergiovieri, with the latter
net (nn-97f742aaefcd.nnue) being the current default.

The evaluation function can be selected at run time with the `Use NNUE` (true/false) UCI option,
provided the `EvalFile` option points the the network file (depending on the GUI, with full path).

The performance of the NNUE evaluation relative to the classical evaluation depends somewhat on
the hardware, and is expected to improve quickly, but is currently on > 80 Elo on fishtest:

60000 @ 10+0.1 th 1
https://tests.stockfishchess.org/tests/view/5f28fe6ea5abc164f05e4c4c
ELO: 92.77 +-2.1 (95%) LOS: 100.0%
Total: 60000 W: 24193 L: 8543 D: 27264
Ptnml(0-2): 609, 3850, 9708, 10948, 4885

40000 @ 20+0.2 th 8
https://tests.stockfishchess.org/tests/view/5f290229a5abc164f05e4c58
ELO: 89.47 +-2.0 (95%) LOS: 100.0%
Total: 40000 W: 12756 L: 2677 D: 24567
Ptnml(0-2): 74, 1583, 8550, 7776, 2017

At the same time, the impact on the classical evaluation remains minimal, causing no significant
regression:

sprt @ 10+0.1 th 1
https://tests.stockfishchess.org/tests/view/5f2906a2a5abc164f05e4c5b
LLR: 2.94 (-2.94,2.94) {-6.00,-4.00}
Total: 34936 W: 6502 L: 6825 D: 21609
Ptnml(0-2): 571, 4082, 8434, 3861, 520

sprt @ 60+0.6 th 1
https://tests.stockfishchess.org/tests/view/5f2906cfa5abc164f05e4c5d
LLR: 2.93 (-2.94,2.94) {-6.00,-4.00}
Total: 10088 W: 1232 L: 1265 D: 7591
Ptnml(0-2): 49, 914, 3170, 843, 68

The needed networks can be found at https://tests.stockfishchess.org/nns
It is recommended to use the default one as indicated by the `EvalFile` UCI option.

Guidelines for testing new nets can be found at
https://github.com/glinscott/fishtest/wiki/Creating-my-first-test#nnue-net-tests

Integration has been discussed in various issues:
https://github.com/official-stockfish/Stockfish/issues/2823
https://github.com/official-stockfish/Stockfish/issues/2728

The integration branch will be closed after the merge:
https://github.com/official-stockfish/Stockfish/pull/2825
https://github.com/official-stockfish/Stockfish/tree/nnue-player-wip

closes https://github.com/official-stockfish/Stockfish/pull/2912

This will be an exciting time for computer chess, looking forward to seeing the evolution of
this approach.

Bench: 4746616
This commit is contained in:
nodchip
2020-08-05 17:11:15 +02:00
committed by Joost VandeVondele
parent 9587eeeb5e
commit 84f3e86790
59 changed files with 2474 additions and 245 deletions

View File

@@ -38,11 +38,12 @@ PGOBENCH = ./$(EXE) bench
### Source and object files
SRCS = benchmark.cpp bitbase.cpp bitboard.cpp endgame.cpp evaluate.cpp main.cpp \
material.cpp misc.cpp movegen.cpp movepick.cpp pawns.cpp position.cpp psqt.cpp \
search.cpp thread.cpp timeman.cpp tt.cpp uci.cpp ucioption.cpp tune.cpp syzygy/tbprobe.cpp
search.cpp thread.cpp timeman.cpp tt.cpp uci.cpp ucioption.cpp tune.cpp syzygy/tbprobe.cpp \
nnue/evaluate_nnue.cpp nnue/features/half_kp.cpp
OBJS = $(notdir $(SRCS:.cpp=.o))
VPATH = syzygy
VPATH = syzygy:nnue:nnue/features
### Establish the operating system name
KERNEL = $(shell uname -s)
@@ -67,7 +68,14 @@ endif
# prefetch = yes/no --- -DUSE_PREFETCH --- Use prefetch asm-instruction
# popcnt = yes/no --- -DUSE_POPCNT --- Use popcnt asm-instruction
# sse = yes/no --- -msse --- Use Intel Streaming SIMD Extensions
# sse3 = yes/no --- -msse3 --- Use Intel Streaming SIMD Extensions 3
# ssse3 = yes/no --- -mssse3 --- Use Intel Supplemental Streaming SIMD Extensions 3
# sse41 = yes/no --- -msse4.1 --- Use Intel Streaming SIMD Extensions 4.1
# sse42 = yes/no --- -msse4.2 --- Use Intel Streaming SIMD Extensions 4.2
# avx2 = yes/no --- -mavx2 --- Use Intel Advanced Vector Extensions 2
# pext = yes/no --- -DUSE_PEXT --- Use pext x86_64 asm-instruction
# avx512 = yes/no --- -mavx512bw --- Use Intel Advanced Vector Extensions 512
# neon = yes/no --- -DUSE_NEON --- Use ARM SIMD architecture
#
# Note that Makefile is space sensitive, so when adding new architectures
# or modifying existing flags, you have to make sure there are no extra spaces
@@ -81,7 +89,15 @@ bits = 64
prefetch = no
popcnt = no
sse = no
sse3 = no
ssse3 = no
sse41 = no
sse42 = no
avx2 = no
pext = no
avx512 = no
neon = no
ARCH = x86-64-modern
### 2.2 Architecture specific
ifeq ($(ARCH),general-32)
@@ -111,11 +127,70 @@ ifeq ($(ARCH),x86-64)
sse = yes
endif
ifeq ($(ARCH),x86-64-sse3)
arch = x86_64
prefetch = yes
sse = yes
sse3 = yes
endif
ifeq ($(ARCH),x86-64-sse3-popcnt)
arch = x86_64
prefetch = yes
sse = yes
sse3 = yes
popcnt = yes
endif
ifeq ($(ARCH),x86-64-ssse3)
arch = x86_64
prefetch = yes
sse = yes
sse3 = yes
ssse3 = yes
endif
ifeq ($(ARCH),x86-64-sse41)
arch = x86_64
prefetch = yes
popcnt = yes
sse = yes
sse3 = yes
ssse3 = yes
sse41 = yes
endif
ifeq ($(ARCH),x86-64-modern)
arch = x86_64
prefetch = yes
popcnt = yes
sse = yes
sse3 = yes
ssse3 = yes
sse41 = yes
endif
ifeq ($(ARCH),x86-64-sse42)
arch = x86_64
prefetch = yes
popcnt = yes
sse = yes
sse3 = yes
ssse3 = yes
sse41 = yes
sse42 = yes
endif
ifeq ($(ARCH),x86-64-avx2)
arch = x86_64
prefetch = yes
popcnt = yes
sse = yes
sse3 = yes
ssse3 = yes
sse41 = yes
sse42 = yes
avx2 = yes
endif
ifeq ($(ARCH),x86-64-bmi2)
@@ -123,9 +198,28 @@ ifeq ($(ARCH),x86-64-bmi2)
prefetch = yes
popcnt = yes
sse = yes
sse3 = yes
ssse3 = yes
sse41 = yes
sse42 = yes
avx2 = yes
pext = yes
endif
ifeq ($(ARCH),x86-64-avx512)
arch = x86_64
prefetch = yes
popcnt = yes
sse = yes
sse3 = yes
ssse3 = yes
sse41 = yes
sse42 = yes
avx2 = yes
pext = yes
avx512 = yes
endif
ifeq ($(ARCH),armv7)
arch = armv7
prefetch = yes
@@ -136,6 +230,14 @@ ifeq ($(ARCH),armv8)
arch = armv8-a
prefetch = yes
popcnt = yes
neon = yes
endif
ifeq ($(ARCH),apple-silicon)
arch = arm64
prefetch = yes
popcnt = yes
neon = yes
endif
ifeq ($(ARCH),ppc-32)
@@ -154,8 +256,8 @@ endif
### ==========================================================================
### 3.1 Selecting compiler (default = gcc)
CXXFLAGS += -Wall -Wcast-qual -fno-exceptions -std=c++11 $(EXTRACXXFLAGS)
DEPENDFLAGS += -std=c++11
CXXFLAGS += -Wall -Wcast-qual -fno-exceptions -std=c++17 $(EXTRACXXFLAGS)
DEPENDFLAGS += -std=c++17
LDFLAGS += $(EXTRALDFLAGS)
ifeq ($(COMP),)
@@ -249,8 +351,8 @@ endif
endif
ifeq ($(KERNEL),Darwin)
CXXFLAGS += -arch $(arch) -mmacosx-version-min=10.9
LDFLAGS += -arch $(arch) -mmacosx-version-min=10.9
CXXFLAGS += -arch $(arch) -mmacosx-version-min=10.15
LDFLAGS += -arch $(arch) -mmacosx-version-min=10.15
endif
### Travis CI script uses COMPILER to overwrite CXX
@@ -283,8 +385,8 @@ endif
### 3.2.2 Debugging with undefined behavior sanitizers
ifneq ($(sanitize),no)
CXXFLAGS += -g3 -fsanitize=$(sanitize) -fuse-ld=gold
LDFLAGS += -fsanitize=$(sanitize) -fuse-ld=gold
CXXFLAGS += -g3 -fsanitize=$(sanitize)
LDFLAGS += -fsanitize=$(sanitize)
endif
### 3.3 Optimization
@@ -322,7 +424,7 @@ endif
### 3.6 popcnt
ifeq ($(popcnt),yes)
ifeq ($(arch),$(filter $(arch),ppc64 armv8-a))
ifeq ($(arch),$(filter $(arch),ppc64 armv8-a arm64))
CXXFLAGS += -DUSE_POPCNT
else ifeq ($(comp),icc)
CXXFLAGS += -msse3 -DUSE_POPCNT
@@ -331,11 +433,61 @@ ifeq ($(popcnt),yes)
endif
endif
ifeq ($(avx2),yes)
CXXFLAGS += -DUSE_AVX2
ifeq ($(comp),$(filter $(comp),gcc clang mingw))
CXXFLAGS += -mavx2
endif
endif
ifeq ($(avx512),yes)
CXXFLAGS += -DUSE_AVX512
ifeq ($(comp),$(filter $(comp),gcc clang mingw))
CXXFLAGS += -mavx512bw
endif
endif
ifeq ($(sse42),yes)
CXXFLAGS += -DUSE_SSE42
ifeq ($(comp),$(filter $(comp),gcc clang mingw))
CXXFLAGS += -msse4.2
endif
endif
ifeq ($(sse41),yes)
CXXFLAGS += -DUSE_SSE41
ifeq ($(comp),$(filter $(comp),gcc clang mingw))
CXXFLAGS += -msse4.1
endif
endif
ifeq ($(ssse3),yes)
CXXFLAGS += -DUSE_SSSE3
ifeq ($(comp),$(filter $(comp),gcc clang mingw))
CXXFLAGS += -mssse3
endif
endif
ifeq ($(sse3),yes)
CXXFLAGS += -DUSE_SSE3
ifeq ($(comp),$(filter $(comp),gcc clang mingw))
CXXFLAGS += -msse3
endif
endif
ifeq ($(neon),yes)
CXXFLAGS += -DUSE_NEON
endif
ifeq ($(arch),x86_64)
CXXFLAGS += -DUSE_SSE2
endif
### 3.7 pext
ifeq ($(pext),yes)
CXXFLAGS += -DUSE_PEXT
ifeq ($(comp),$(filter $(comp),gcc clang mingw))
CXXFLAGS += -msse4 -mbmi2
CXXFLAGS += -mbmi2
endif
endif
@@ -381,15 +533,23 @@ help:
@echo "Supported targets:"
@echo ""
@echo "build > Standard build"
@echo "profile-build > PGO build"
@echo "profile-build > Standard build with PGO"
@echo "strip > Strip executable"
@echo "install > Install executable"
@echo "clean > Clean up"
@echo "net > Download the default nnue net"
@echo ""
@echo "Supported archs:"
@echo ""
@echo "x86-64-bmi2 > x86 64-bit with pext support (also enables SSE4)"
@echo "x86-64-modern > x86 64-bit with popcnt support (also enables SSE3)"
@echo "x86-64-avx512 > x86 64-bit with avx512 support"
@echo "x86-64-bmi2 > x86 64-bit with bmi2 support"
@echo "x86-64-avx2 > x86 64-bit with avx2 support"
@echo "x86-64-sse42 > x86 64-bit with sse42 support"
@echo "x86-64-modern > x86 64-bit with sse41 support (x86-64-sse41)"
@echo "x86-64-sse41 > x86 64-bit with sse41 support"
@echo "x86-64-ssse3 > x86 64-bit with ssse3 support"
@echo "x86-64-sse3-popcnt > x86 64-bit with sse3 and popcnt support"
@echo "x86-64-sse3 > x86 64-bit with sse3 support"
@echo "x86-64 > x86 64-bit generic"
@echo "x86-32 > x86 32-bit (also enables SSE)"
@echo "x86-32-old > x86 32-bit fall back for old hardware"
@@ -397,6 +557,7 @@ help:
@echo "ppc-32 > PPC 32-bit"
@echo "armv7 > ARMv7 32-bit"
@echo "armv8 > ARMv8 64-bit"
@echo "apple-silicon > Apple silicon ARM64"
@echo "general-64 > unspecified 64-bit"
@echo "general-32 > unspecified 32-bit"
@echo ""
@@ -409,17 +570,20 @@ help:
@echo ""
@echo "Simple examples. If you don't know what to do, you likely want to run: "
@echo ""
@echo "make build ARCH=x86-64 (This is for 64-bit systems)"
@echo "make build ARCH=x86-32 (This is for 32-bit systems)"
@echo "make -j build ARCH=x86-64 (This is for 64-bit systems)"
@echo "make -j build ARCH=x86-32 (This is for 32-bit systems)"
@echo ""
@echo "Advanced examples, for experienced users: "
@echo ""
@echo "make build ARCH=x86-64 COMP=clang"
@echo "make profile-build ARCH=x86-64-bmi2 COMP=gcc COMPCXX=g++-4.8"
@echo "make -j build ARCH=x86-64-modern COMP=clang"
@echo "make -j profile-build ARCH=x86-64-bmi2 COMP=gcc COMPCXX=g++-4.8"
@echo ""
@echo "The selected architecture $(ARCH) enables the following configuration: "
@echo ""
@$(MAKE) ARCH=$(ARCH) COMP=$(COMP) config-sanity
.PHONY: help build profile-build strip install clean objclean profileclean \
.PHONY: help build profile-build strip install clean net objclean profileclean \
config-sanity icc-profile-use icc-profile-make gcc-profile-use gcc-profile-make \
clang-profile-use clang-profile-make
@@ -453,14 +617,21 @@ install:
clean: objclean profileclean
@rm -f .depend *~ core
net:
$(eval nnuenet := $(shell grep EvalFile ucioption.cpp | grep Option | sed 's/.*\(nn-[a-z0-9]\{12\}.nnue\).*/\1/'))
@echo "Default net: $(nnuenet)"
$(eval nnuedownloadurl := https://tests.stockfishchess.org/api/nn/$(nnuenet))
$(eval curl_or_wget := $(shell if hash curl 2>/dev/null; then echo "curl -sL"; elif hash wget 2>/dev/null; then echo "wget -qO-"; fi))
@if test -f "$(nnuenet)"; then echo "Already available."; else echo "Downloading $(nnuedownloadurl)"; $(curl_or_wget) $(nnuedownloadurl) > $(nnuenet); fi
# clean binaries and objects
objclean:
@rm -f $(EXE) *.o ./syzygy/*.o
@rm -f $(EXE) *.o ./syzygy/*.o ./nnue/*.o ./nnue/features/*.o
# clean auxiliary profiling files
profileclean:
@rm -rf profdir
@rm -f bench.txt *.gcda *.gcno
@rm -f bench.txt *.gcda *.gcno ./syzygy/*.gcda ./nnue/*.gcda ./nnue/features/*.gcda
@rm -f stockfish.profdata *.profraw
default:
@@ -485,7 +656,14 @@ config-sanity:
@echo "prefetch: '$(prefetch)'"
@echo "popcnt: '$(popcnt)'"
@echo "sse: '$(sse)'"
@echo "sse3: '$(sse3)'"
@echo "ssse3: '$(ssse3)'"
@echo "sse41: '$(sse41)'"
@echo "sse42: '$(sse42)'"
@echo "avx2: '$(avx2)'"
@echo "pext: '$(pext)'"
@echo "avx512: '$(avx512)'"
@echo "neon: '$(neon)'"
@echo ""
@echo "Flags:"
@echo "CXX: $(CXX)"
@@ -499,12 +677,19 @@ config-sanity:
@test "$(optimize)" = "yes" || test "$(optimize)" = "no"
@test "$(arch)" = "any" || test "$(arch)" = "x86_64" || test "$(arch)" = "i386" || \
test "$(arch)" = "ppc64" || test "$(arch)" = "ppc" || \
test "$(arch)" = "armv7" || test "$(arch)" = "armv8-a"
test "$(arch)" = "armv7" || test "$(arch)" = "armv8-a" || test "$(arch)" = "arm64"
@test "$(bits)" = "32" || test "$(bits)" = "64"
@test "$(prefetch)" = "yes" || test "$(prefetch)" = "no"
@test "$(popcnt)" = "yes" || test "$(popcnt)" = "no"
@test "$(sse)" = "yes" || test "$(sse)" = "no"
@test "$(sse3)" = "yes" || test "$(sse3)" = "no"
@test "$(ssse3)" = "yes" || test "$(ssse3)" = "no"
@test "$(sse41)" = "yes" || test "$(sse41)" = "no"
@test "$(sse42)" = "yes" || test "$(sse42)" = "no"
@test "$(avx2)" = "yes" || test "$(avx2)" = "no"
@test "$(pext)" = "yes" || test "$(pext)" = "no"
@test "$(avx512)" = "yes" || test "$(avx512)" = "no"
@test "$(neon)" = "yes" || test "$(neon)" = "no"
@test "$(comp)" = "gcc" || test "$(comp)" = "icc" || test "$(comp)" = "mingw" || test "$(comp)" = "clang"
$(EXE): $(OBJS)