Use block sparse input for the first layer.

Use block sparse input for the first fully connected layer on architectures with at least SSSE3.

Depending on the CPU architecture, this yields a speedup of up to 10%, e.g.

```
Result of 100 runs of 'bench 16 1 13 default depth NNUE'

base (...ockfish-base) =     959345  +/- 7477
test (...ckfish-patch) =    1054340  +/- 9640
diff                   =     +94995  +/- 3999

speedup        = +0.0990
P(speedup > 0) =  1.0000

CPU: 8 x AMD Ryzen 7 5700U with Radeon Graphics
Hyperthreading: on
```

Passed STC:
https://tests.stockfishchess.org/tests/view/6485aa0965ffe077ca12409c
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 8864 W: 2479 L: 2223 D: 4162
Ptnml(0-2): 13, 829, 2504, 1061, 25

This commit includes a net with reordered weights, to increase the likelihood of block sparse inputs,
but otherwise equivalent to the previous master net (nn-ea57bea57e32.nnue).

Activation data collected with https://github.com/AndrovT/Stockfish/tree/log-activations, running bench 16 1 13 varied_1000.epd depth NNUE on this data. Net parameters permuted with https://gist.github.com/AndrovT/9e3fbaebb7082734dc84d27e02094cb3.

closes https://github.com/official-stockfish/Stockfish/pull/4612

No functional change
This commit is contained in:
AndrovT
2023-06-11 03:24:04 +02:00
committed by Joost VandeVondele
parent b7ee7290b5
commit 38e61663d8
4 changed files with 290 additions and 2 deletions

View File

@@ -159,6 +159,7 @@ Norman Schmidt (FireFather)
notruck
Ofek Shochat (OfekShochat, ghostway)
Ondrej Mosnáček (WOnder93)
Ondřej Mišina (AndrovT)
Oskar Werkelin Ahlin
Pablo Vazquez
Panthee