Clang-10.0.0 poses as gcc-4.2:
$ clang++ -E -dM - </dev/null | grep GNUC
This means that Clang is using the workaround for the alignment bug of gcc-8
even though it does not have the bug (as far as I know).
This patch should speed up AVX2 and AVX512 compiles on Windows (when using Clang),
because it disables (for Clang) the gcc workaround we had introduced in this commit:
875183b310
closes https://github.com/official-stockfish/Stockfish/pull/3050
No functional change.
In recent Macs, it is possible to use the Clang compiler provided by Apple
to compile Stockfish out of the box, and this is the method used by default
in our Makefile (the Makefile sets the macosx-version-min=10.14 flag to select
the right libc++ library for the Clang compiler with recent c++17 support).
But it is quite possible to compile and run Stockfish on older Macs! Below
we describe a method to install a recent GNU compiler on these Macs, to get
the c++17 support. We have tested the following procedure to install gcc10 on
machines running Mac OS 10.7, Mac OS 10.9 and Mac OS 10.13:
1) install XCode for your machine.
2) install Apple command-line developer tools for XCode, by typing the following
command in a Terminal:
```
sudo xcode-select --install
```
3) go to the Stockfish "src" directory, then try a default build and run Stockfish:
```
make clean
make build
make net
./stockfish
```
4) if step 3 worked, congrats! You have a compiler recent enough on your Mac
to compile Stockfish. If not, continue with step 5 to install GNU gcc10 :-)
5) install the MacPorts package manager (https://www.macports.org/install.php),
for instance using the fast method in the "macOS Package (.pkg) Installer"
section of the page.
6) use the "port" command to install the gcc10 package of MacPorts by typing the
following command:
```
sudo port install gcc10
```
With this step, MacPorts will install the gcc10 compiler under the name "g++-mp-10"
in the /opt/local/bin directory:
```
which g++-mp-10
/opt/local/bin/g++-mp-10 <--- answer
```
7) You can now go back to the "src" directory of Stockfish, and try to build
Stockfish by pointing at the right compiler:
```
make clean
make build COMP=gcc COMPCXX=/opt/local/bin/g++-mp-10
make net
./stockfish
```
8) Enjoy Stockfish on Macintosh!
See this pull request for further discussion:
https://github.com/official-stockfish/Stockfish/pull/3049
No functional change
Changes to deal with compilation (particularly profile-build) on macOS.
(1) The default toolchain has gcc masquerading as clang,
the previous Makefile was not picking up the required changes
to the different profiling tools.
(2) The previous Makefile test for gccisclang occurred before
a potential overwrite of CXX by COMPCXX
(3) llvm-profdata no longer runs as a command on macOS and
instead is invoked by ``xcrun llvm-profdata``
(4) Needs to support use of true gcc using e.g.
``make build ... COMPCXX=g++-10``
(5) enable profile-build in travis for macOS
closes https://github.com/official-stockfish/Stockfish/pull/3043
No functional change
some GUIs do not show the error message when the engine terminates in the no-net case, as it is send to cerr.
Instead send it as an info string, which the GUI will more likely display.
closes https://github.com/official-stockfish/Stockfish/pull/3031
No functional change.
add new ARCH targets
x86-32-sse41-popcnt > x86 32-bit with sse41 and popcnt support
x86-32-sse2 > x86 32-bit with sse2 support
x86-32 > x86 32-bit generic (with mmx and sse support)
retire x86-32-old (use general-32)
closes https://github.com/official-stockfish/Stockfish/pull/3022
No functional change.
The easiest way to use the NDK in conjunction with this Makefile (tested on linux-x86_64):
1. Download the latest NDK (r21d) from Google from https://developer.android.com/ndk/downloads
2. Place and unzip the NDK in $HOME/ndk folder
3. Export the path variable e.g., `export PATH=$PATH:$HOME/ndk/android-ndk-r21d/toolchains/llvm/prebuilt/linux-x86_64/bin`
4. cd to your Stockfish/src dir
5. Issue `make -j ARCH=armv8 COMP=ndk build` (use `ARCH=armv7` or `ARCH=armv7-neon` for older CPUs)
6. Optionally `make -j ARCH=armv8 COMP=ndk strip`
7. That's all. Enjoy!
Improves support from Raspberry Pi (incomplete?) and compiling on arm in general
closes https://github.com/official-stockfish/Stockfish/pull/3015
fixes https://github.com/official-stockfish/Stockfish/issues/2860
fixes https://github.com/official-stockfish/Stockfish/issues/2641
Support is still fragile as we're missing CI on these targets. Nevertheless tested with:
```bash
# build crosses from ubuntu 20.04 on x86 to various arch/OS combos
# tested with suitable packages installed
# (build-essentials, mingw-w64, g++-arm-linux-gnueabihf, NDK (r21d) from google)
# cross to Android
export PATH=$HOME/ndk/android-ndk-r21d/toolchains/llvm/prebuilt/linux-x86_64/bin:$PATH
make clean && make -j build ARCH=armv7 COMP=ndk && make -j build ARCH=armv7 COMP=ndk strip
make clean && make -j build ARCH=armv7-neon COMP=ndk && make -j build ARCH=armv7-neon COMP=ndk strip
make clean && make -j build ARCH=armv8 COMP=ndk && make -j build ARCH=armv8 COMP=ndk strip
# cross to Raspberry Pi
make clean && make -j build ARCH=armv7 COMP=gcc COMPILER=arm-linux-gnueabihf-g++
make clean && make -j build ARCH=armv7-neon COMP=gcc COMPILER=arm-linux-gnueabihf-g++
# cross to Windows
make clean && make -j build ARCH=x86-64-modern COMP=mingw
```
No functional change
This patch fixes the byte order when reading 16- and 32-bit values from the network file on a big-endian machine.
Bytes are ordered in read_le() using unsigned arithmetic, which doesn't need tricks to determine the endianness of the machine. Unfortunately the compiler doesn't seem to be able to optimise the ordering operation, but reading in the weights is not a time-critical operation and the extra time it takes should not be noticeable.
Big endian systems are still untested with NNUE.
fixes#3007
closes https://github.com/official-stockfish/Stockfish/pull/3009
No functional change.
Increases the use of NNUE evaluation in positions without captures/pawn moves,
by increasing the NNUEThreshold threshold with rule50_count.
This patch will force Stockfish to use NNUE eval more and more in materially
unbalanced positions, when it seems that the classical eval is struggling to
win and only manages to shuffle. This will ask the (slower) NNUE eval to
double-check the potential fortress branches of the search tree, but only
when necessary.
passed STC:
https://tests.stockfishchess.org/tests/view/5f36f1bf11a9b1a1dbf192d8
LLR: 2.93 (-2.94,2.94) {-0.50,1.50}
Total: 51824 W: 5836 L: 5653 D: 40335
Ptnml(0-2): 264, 4356, 16512, 4493, 287
passed LTC:
https://tests.stockfishchess.org/tests/view/5f37836111a9b1a1dbf1936d
LLR: 2.93 (-2.94,2.94) {0.25,1.75}
Total: 29768 W: 1747 L: 1590 D: 26431
Ptnml(0-2): 33, 1347, 11977, 1484, 43
closes https://github.com/official-stockfish/Stockfish/pull/3011
Bench: 4173967
Do not show the details of the default architecture for a simple "make help"
invocation, as the details are most likely to confuse beginners. Instead we
make it clear which architecture is the default and put an example at the end
of the Makefile as an incentative to use "make help ARCH=blah" to discover
the flags used by the different architectures.
```
make help
make help ARCH=x86-64-ssse3
```
Also clean-up and modernize a bit the Makefile examples while at it.
closes https://github.com/official-stockfish/Stockfish/pull/2996
No functional change
Adds support for Vector Neural Network Instructions (avx512), as available on Intel Cascade Lake
The _mm512_dpbusd_epi32() intrinsic (vpdpbusd instruction) is taylor made for NNUE.
on a cascade lake CPU (AWS C5.24x.large, gcc 10) NNUE eval is at roughly 78% nps of classical
(single core test)
bench 1024 1 24 default depth:
target classical NNUE ratio
vnni 2207232 1725987 78.20
avx512 2216789 1671734 75.41
avx2 2194006 1611263 73.44
modern 2185001 1352469 61.90
closes https://github.com/official-stockfish/Stockfish/pull/2987
No functional change
I mentioned this a while back in discord, but nothing seems to have ever come from it. Anyway, to the best of my knowledge most current training data gen is being done at relatively low fixed depths. With this in mind, the change to not allow LMP in PvNodes should result in a fairly significant increase in strength and reliability of the PV.
fails to build on that target, because of missing Intel Intrinsics.
macOS has posix_memalign() since ~2014 so we can simplify the code and just use that for all Apple platforms.
closes https://github.com/official-stockfish/Stockfish/pull/2985
No functional change.