Add a UCI option "Zugzwang detection", OFF by default, that
enables correct detection of zugzwang.
This is just to let 1.7.1 be 100% compatible with 1.7 and
should be removed after release.
Verified to be 100% functionally equivalent to 1.7.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It fails in the test position:
8/7P/8/8/K2b4/p7/1k6/1B6 b - -
It is not clear why, but we revert because it fixes the issue.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In this case use a normal VALUE_TYPE_LOWER TT type instead of
VALUE_TYPE_NS_LO. This allows us to get a TT cut-off in a few more nodes.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This patch fixes an issue with zugzwang well explained by Tord:
"Assume that a zugzwang position occurs at iteration N,
at a search depth d, with d < 6*OnePly. The null move search
fails high, and no verification search is done, because the
depth is too small. The position gets stored in the transposition
table with a good score and a depth of d.
Now, consider what happens when the same position occurs at iteration
N+1, this time with a depth of d+OnePly (i.e. one ply deeper than at
the previous iteration). Once again, the null move search fails
high. The point is that the verification search will also fail high,
because of an instant transposition table cutoff caused by the value
stored in the TT during the previous iteration."
With this patch we simply do not allow TT cutoffs at the root node
of a null move verification search if the TT value was found by a
null search.
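A minimal sketch of the rule, with illustrative names (only VALUE_TYPE_LOWER
and VALUE_TYPE_NS_LO come from these patches; the real TT probing code differs):

    enum ValueType { VALUE_TYPE_NONE, VALUE_TYPE_LOWER, VALUE_TYPE_NS_LO };

    bool allow_tt_cutoff(ValueType type, bool atNullVerificationRoot) {
        // A lower bound stored by a null search must not short-circuit the
        // verification search that is supposed to double-check it.
        return !(atNullVerificationRoot && type == VALUE_TYPE_NS_LO);
    }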
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Add VALUE_TYPE_NS_LO to enum ValueType and use it when
saving a value from a null search in the TT.
Currently no action is performed; the next patch will enable
the new type.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Reported warning is:
warning #2514-D: pointless comparison of unsigned
integer with a negative constant
Spotted by Richard Lloyd.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Under gcc we have:
warning: dereferencing type-punned pointer will break
strict-aliasing rules
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Use the full depth, not the reduced one. This allows us
to avoid doing a null search when, in the same
position and at the same or greater depth, the
null search already failed high.
A very small increase, if any.
After 963 games at 1+0
Mod vs Orig: +158 =657 -147 +4 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In this form it is even more evident that we have some
issue there to be fixed sooner rather than later.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
After the previous patch we no longer need the call parameters.
This fixes a couple of warnings under MSVC.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
After 500 games at 5+0 on my QUAD (3 days) there
is no difference with the old version, so it is probably
a feature that doesn't scale with search depth.
So revert for now; perhaps we should re-add it in a
different form.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
MSVC complains about an implicit conversion from double to int.
Also small comment tweaks.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In the beginning use milder reduction and at the end be
more aggressive.
After 1500 games on Joona's QUAD
Mod - Orig: 791 - 720 +16 elo
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Causes a very small functional change which is not observable with
our usual set of test positions.
However, the change is observable, e.g., with the following position:
4k3/3r4/5Q2/6K1/8/8/8/8 w - - 0 1
go depth 24
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
If the null search fails high, return the null search value instead of beta.
With TT hash there may be a small advantage for fail-soft since
storing slightly better bounds may cause slightly more hash hits.
After 990 games at 1+0
Mod vs Orig +171 =665 -154 +6 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Array castleRightsMask[] is not static because it can
be different for different positions, so make it a
Position data member. This allows us to remove tricky
hacks needed to take into account that, although it was
defined static, it could change.
Theoretically copying a position is now a bit slower because
we also need to copy an array of 64 integers, but because in
split() we don't copy the position anymore, just keep the
pointer, the added burden is not measurable even in the MP case.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Also don't use the __cpuid() intrinsic for Intel under
Linux because it gives wrong results when detecting HT;
use the gcc version instead. Finally, clean up the code.
The error was due to the changed __cpuid() signature for
the gcc compiler.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This will allow the function to be used also for other
purposes than detecting POPCNT.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It will be used by future patches and also rearranges some
half-cooked code that mistakenly ended up in master in the
past.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Currently, in case of a fail high/low we re-search with
a window increased by 2*AspirationDelta at the first
attempt. This patch instead makes the re-search be
done with an increase of just AspirationDelta;
in case of a consecutive fail we widen by
2*AspirationDelta, and so on.
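A self-contained sketch of the widening scheme described above; names, the
AspirationDelta value and the callback are assumptions, not the real code:

    #include <algorithm>

    const int AspirationDelta = 16;     // assumed step size
    const int ValueInfinite   = 30000;

    int aspiration_search(int prevScore, int (*searchWindow)(int alpha, int beta))
    {
        int widenings = 0;
        int alpha = prevScore - AspirationDelta;
        int beta  = prevScore + AspirationDelta;

        while (true)
        {
            int value = searchWindow(alpha, beta);
            if (value > alpha && value < beta)
                return value;                      // inside the window, done

            ++widenings;                           // 1st re-search widens by delta,
            if (value <= alpha)                    // the 2nd by 2*delta, and so on
                alpha = std::max(alpha - widenings * AspirationDelta, -ValueInfinite);
            else
                beta  = std::min(beta  + widenings * AspirationDelta,  ValueInfinite);
        }
    }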
After Joona's test:
Orig - Mod: 850 - 890
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
One for fail highs and one for fail lows; this
should reduce the average window size for positions
that first fail high and then low (or vice versa).
After ~2000 games on Joona's quad we have:
Mod - Orig: 1012- 975 (+6 elo)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Instead of leaving it uninitialized or scattered in the code,
as is the case for ExtraSearchTime.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Currently we use the original sorting after a fail low to
re-search at a wider window. This patch instead sorts the
moves according to the latest available move scores.
Strangely there is no functional change, although there should be one.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Changed FutilityMarginsMatrix dimensions to be a power of two
so that the compiler can produce faster access code.
Introduced print_pv_info() to remove some redundant code in
root_search().
The remaining stuff is trivial and documentation tweaks.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Under Windows we use CreateThread() to set up threads and
we pass a pointer to a variable that receives the thread
identifier, but this parameter is optional and we don't
use it, so remove it.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It happens that NULL is 0, but its conventional meaning is
a null pointer, so replace it with an explicit 0 integer value.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Consider slightly negative captures as good if at low
depth and far from beta.
After 999 games at 1+0
Mod vs Orig +169 =694 -136 +11 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It raises an assert under Windows; it is not clear why, but it
happens that idle_loop() is called with an incorrect threadID and
the assert triggered is:
assert(threadID >= 0 && threadID < MAX_THREADS);
So revert the patch for now, but we should understand why it
fails.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We can't do it with full guarantee anyway because
there is always a possible race between the setting of
state to THREAD_SLEEPING and actual sleeping.
So just remove the imperfect code to avoid misunderstandings.
This reflects what we have done in wake_sleeping_threads() in
the previous patch.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Currently there is no guarantee that threads are sleeping
when calling wake_sleeping_threads() because put_threads_to_sleep()
returns without waiting for threads to actually sleep.
Assert can be easily triggered calling put_threads_to_sleep() and
wake_sleeping_threads() in a tight loop.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This option likely has very little impact on playing strength and style,
so I see no need to keep it configurable.
No functional change
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
search() is used as a "leading star" and other routines
are modified according to it.
No functional change
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Was broken and fixing it would be too messy.
Now this option is only activated in single-thread mode.
No functional change
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Just to avoid misunderstandings.
The true staticValue is available through the search stack.
No functional change
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Bug spotted by Jouni Uski and fix suggested by Pablo Vazquez
Also add a note that we are not always handling the fifty-move rule correctly.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
I cannot see any connection between the two.
Also add one FIXME for illogical behaviour.
No functional change
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
I don't know if enumerating sections is a good idea,
but to me the code is more readable this way.
No functional change
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
I cannot see any reason to do this. Even this is not enough to fix
the theoretical race on Windows, which doesn't seem to cause any
problems in practice anyhow.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Resurrect the extra check for sleeping in the POSIX code.
This is necessary to prevent ugly races between
thread_broadcast and thread_cond_wait.
After a thread has woken up, it marks itself as available.
Another thread must not do this, because of a possible race.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
So we can use a const value instead of a pointer in
split().
Also pass NULL instead of a faked address of alpha in
case split is called from a non-PV node.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The intrinsic __popcnt64() returns an unsigned __int64; cast it
to an integer and silence the warning.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Joona says that sp_update_pv() does not pass the split point
boundaries, so there is no risk of corrupting data from another
split point. Also the race on thread_should_stop() is harmless
because of this.
So revert the patch and go back to a single lock.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Joona says:
1. We should not be afraid of "AllThreadsShouldExit" flag.
Because when this is set to true we _must_not_ be searching (= All
splits must have been undone).
And if we are not searching it's impossible that some other thread
could give us work to do. So setting state to THREAD_AVAILABLE
doesn't do any harm. If you want to add check for this, you could do
it like this:
if (threads[threadID].state == THREAD_WORKISWAITING)
{
+   assert(!AllThreadsShouldExit);
    threads[threadID].state = THREAD_SEARCHING;
2a. If waitSp->cpus == 0, setting the state to THREAD_AVAILABLE does
no harm either, because the helpful master concept dictates that _only_
our own slave can book us. If we don't have any slaves, no one has the
right to book us.
2b. If point (2a) is not correct then your extra check only adds an extra race:
In SMP code, checking for waitSp->cpus > 0 is not enough. It's possible that
our slave immediately exits and another thread
books us as a slave while our state is still
THREAD_AVAILABLE. So instead of adding an extra level of security we have
just introduced an extra race.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
When we find a cut-off, lock the whole split point chain,
not only the current one, to avoid races when two threads, running
on different split points where one is an ancestor of the other,
find a beta cut-off at the same time; in this case we want only
one of them to call sp_update_pv().
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This is a per split-point request, not per-thread. When we find
a beta cut-off in the current thread's split point or in some
ancestor of the current split point, threads should immediately
stop the search and return to idle_loop().
The check is done by thread_should_stop(), which now looks only
at the split point's chain.
No functional change and a good simplification.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This is easier to follow and also reduces the points
where state changes to mainly idle_loop() and split().
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This is because when we are below 4 * OnePly, the null move
will directly jump to qsearch and, if we are below beta,
our opponent is above beta and will get an immediate
stand pat cut-off.
So basically this patch is just optimizing away useless
evaluation calls. dbg_hit_on() runs show that this heuristic
is correct in >99% of cases. The transposition table probably causes
some inaccuracy?
After 1148 games on QUAD
mod-orig: 583 - 565 +5 elo
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In this case it is dangerous because in split() we reset the flag to
false, but if it was set due to a cut-off higher in the tree we
completely miss that and go on with the full search.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Unlike the other flags this is not a state flag, i.e. it does
not define a state for the thread, but a request, because
after we raise the 'stopRequest' flag the corresponding thread is
not stopped, but continues to run for a while until it returns
from sp_search() in idle_loop().
It is important that the name reflects this.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Among them, 'stop' and 'printCurrentLineRequest' could have
random values, so reset them to a known state before leaving the
search.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This is the first nice effect of the previous patch!
Because thread_should_stop() should be declared 'const' we
need to remove the setting of the 'stop' flag to true, which
turns out to be a bug because thread_should_stop() is called
outside of lock protection while the 'stop' flag is a volatile
shared variable, so it cannot be changed when not under lock.
Note that this bug fires ONLY when we use more than 2 threads,
so commonly only on a QUAD or OCTAL machine.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The main aim of this patch is to consolidate all the thread-related stuff
behind a single class interface so as to avoid messing with global flags
and having thread code scattered among non-thread related stuff.
Another advantage is that access to thread variables is now
more controlled; in particular we can differentiate between
read and write accesses by means of different interfaces, so it
is simpler to understand how a function is related to threads.
Lastly, this rewrite is the base for future code consolidations and
simplifications that are easier now that we have only one thread
access point.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Ensure threads are sleeping when leaving init_threads() and
the newly introduced put_threads_to_sleep().
Also ensure threads are not sleeping when leaving
wake_sleeping_threads().
As a side effect we now leave think() with all the threads
(but the main one) guaranteed to sleep. So when we enter
again in think(), after the opponent next move, we know
threads must be sleeping.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Wait inside wake_sleeping_threads() for the threads to be
effectively and reliably woken up.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Will be used by future patches. Also:
- Renamed Idle to AllThreadsShouldSleep
- Explicitly initialized AllThreadsShouldExit and AllThreadsShouldSleep
in init_thread() instead of using an anonymous global initialization.
- Rewrote the idle_loop() while condition to avoid a 'break' statement
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Patch from Richard Lloyd (slightly edited by me); following is the list
of changes as described by the author:
src/Makefile:
- Added PREFIX and BINDIR for the install: rule.
- Added a "make hpux" line to the help: rule.
- Added "make test"/"make check" rule that runs the $(PGOBENCH) command.
- "make clean" now additionally removes core and bench.txt.
- Added an hpux: rule.
- Added an install: rule to mkdir $(BINDIR), copy $(EXE) to $(BINDIR) and
then strip it.
- "make strip" now ensures that $(EXE) is built first before trying to
strip it.
- Hide errors and output from the g++ command used by the .depend: rule and
then touch .depend in case g++ isn't available.
- Hide errors from the "include .depend" in case .depend doesn't exist
(e.g. directly after a "make clean").
src/book.cpp and src/book.h:
- HP-UX's aCC really didn't like the const keywords used for the
Book::file_name() definitions, so they were removed. I checked that this
didn't affect a Linux build and it was still fine.
src/misc.cpp:
- HP-UX uses <sys/pstat.h> and pstat_getdynamic() to determine the number of
CPU cores, so added conditional code for that (if pstat_getdynamic() fails,
set the number of cores to 1).
src/tt.cpp:
- <xmmintrin.h> and _mm_prefetch() seem highly specific to the Intel x86(_64)
and gcc platforms - neither exist in HP-UX, so conditionally avoid that
code in HP-UX's case. Perhaps some sort of define is needed here
such as -DHAS_MM_PREFETCH that could be #ifdef'ed for instead?
Even after these changes, it's more convenient for HP-UX users to edit the
default: rule in the Makefile to run "$(MAKE) hpux" before they build
stockfish, but that's not a big deal if they're warned about that first (the
same applies to all other builds other than the standard "$(MAKE) gcc" one).
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The compiler complains because in Book we have a destructor but no
copy constructor and assignment operator (warnings C4511 and C4512);
note that after adding them (just declared) you now also need a
default constructor!
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is a hidden bug waiting to fire. The main problem is
that ss[ply] is overwritten by search() and qsearch() called
from IID and razoring, so that we cannot hold a pointer to a
local EvalInfo variable.
For instance if we go razoring we overwrite the pointer
with the address of a variable local to qsearch(); when we return
from qsearch() the variable goes out of scope and now ss[ply].evalInfo
holds a stale pointer!
Because we are not looking for trouble we go through the
safe route and remove it entirely.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The gain value is multiplied by 16 to be of comparable magnitude
to negative history, on average.
This patch shows very good results in tactical tests, but
started very badly in real games, so I have run two test matches.
After 896 games at 1+0
Mod vs Orig +187 =525 -184 +1 ELO
After 999 games at 1+0
Mod vs Orig +223 =590 -186 +13 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We can do this only when needed; if we get a cut-off
before, we skip sorting entirely. This reduces sorting
time by about 20%.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In search routines we use information from the previous ply
and init killers two plies ahead.
So to me it seems correct to copy 4 SearchStack items
in split():
ply - 1, ply, ply + 1, ply + 2
Because
a) we do not split at the root (ply == 0)
b) ply < PLY_MAX and the SearchStack size is PLY_MAX_PLUS_2
there should be no risk of underflows or overflows.
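A sketch of the four-item copy (ply - 1 .. ply + 2) reasoned about above; the
struct contents, PLY_MAX value and function name are placeholders, not the
real split() code:

    #include <cstring>

    const int PLY_MAX = 100;                 // assumed value
    const int PLY_MAX_PLUS_2 = PLY_MAX + 2;

    struct SearchStack { int killers[2]; int currentMove; int eval; };

    void copy_split_stack(SearchStack* dst, const SearchStack* src, int ply)
    {
        // Safe because split() is never called at the root (ply >= 1) and
        // ply < PLY_MAX, while both arrays hold PLY_MAX_PLUS_2 entries, so
        // indices ply - 1 .. ply + 2 stay inside the bounds.
        std::memcpy(dst + ply - 1, src + ply - 1, 4 * sizeof(SearchStack));
    }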
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
With negative history we no longer have a
lot of zero scores, so just split the moves into
positive and non-positive sets.
The speed up is almost zero; we cannot test speed directly
because the node count changed due to the reordering, but I have
verified sorting is correct. With a profiler I have
seen we gain a little in sort_moves() and lose a little
in insertion_sort(), so the net effect is almost zero,
but the code is simpler.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It was only used to control the StopOnPonderHit variable.
Now use the FailLow variable instead.
The patch has a minor effect on time management when pondering is on.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The noProblemFound condition is never true.
This was verified by running an 800-game 1+0 match on a 1 CPU computer.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Because H.move_ordering_score() can return negative values,
some negative-SEE moves could be searched before non-negative-SEE
moves with negative history.
This patch restores proper ordering.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Now that history can go negative and is almost always
non-zero, we have no more reason to also use the psqt term.
After 994 games at 1+0
Mod vs Orig +204 =597 -193 +4 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Instead of piece-from-to; this way it is similar
to what we already do for history.
Almost no change, but it seems a bit simpler this way.
After 995 games at 1+0
Mod vs Orig +207 =596 -192 +5 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It seems more standard conformant. Also added a bit of
description directly from Tord.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We erroneously added the same scaling function twice
to the endgame map.
The problem was detected by valgrind because it resulted in a memory
leak of the first added scaling function.
Bug introduced by 30e8f0c9ad6a473 of 13/02/2009
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We had an overflow due to using an integer for the hash size;
now we use a size_t as we should, so we can increase to
a higher limit.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Normally it's up to the GUI to check option limits,
but we could receive the new value directly from the user
via the terminal. So let's check the bounds anyway.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
With the new target 'make gcc-popcnt' it is now
possible to compile with hardware POPCNT
support enabled also with gcc. Until now this was possible only
for the Intel and MSVC compilers.
When this instruction is supported by the CPU, for instance
on the Intel i7 or i5 family, the produced binary is a bit faster.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This reverts commit c43c5fe9e0.
Produces following build error after 'make clean', 'make icc' under Mandriva
with icc version 11.0
Makefile:306: .depend: No such file or directory
In file included from tt.cpp:28:
/usr/lib/gcc/i586-manbo-linux-gnu/4.3.2/include/xmmintrin.h:35:3: error: #error "SSE instruction set not enabled"
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Futility captures alone do not seem an improvement.
Perhaps it is the combination of stand pat + futility that is winning,
so revert for now and continue testing from a standard
base until we find the correct recipe.
After 999 games at 1+0
Mod vs Orig +231 0506 -201 +10 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In some rare cases we can have a search tree explosion
due to a perpetual check or to a very long non-capture TT
sequence.
This avoids the tree explosion by not following TT moves that
are not captures or promotions when we are below the
'generate checks' depth.
Idea suggested by Richard Vida.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Not a biggie, but it is a reduced pruning patch that doesn't
seem to hurt, so it is welcome ;-)
After 999 games at 1+0
Mod vs Orig +207 =601 -191 +6 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
At ply == OnePly (common case) we avoid some useless
floating point computation.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Previously input like "setoption name Use Search Log value true "
(note space at the end of the line) didn't work.
Now parse the value the same way as the option name. This way we implicitly
left- and right-trim the value.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Initializing a high-level object at startup is very dangerous,
because low-level infrastructure is not yet initialized.
For example Position's constructor calls find_checkers(), which
calls attackers_to(), which depends on various global bitboard arrays
that are not yet initialized. I think we are lucky not to crash.
RootPosition.from_fen(StartPosition) is called immediately after
all initializations are made in uci_main_loop(), which is the
correct behaviour.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Joona's new aspiration window. The main idea is to always
re-search aspiration fail highs/lows at the same
ply and use a much smaller aspiration window than previously.
Testing result is very positive.
1CPU:
953-1149
4CPU:
545 - 656
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Avoid taking the lock two times in a tight sequence, the first
in get_next_move() and the second to update sp->moves.
Do it all with one lock and so retire the now useless locked version
of get_next_move().
Also fix a theoretical race due to the comparison sp->bestValue < sp->beta
being done outside of lock protection. Finally fix another (harmless but time
wasting) race that could occur because thread_should_stop() is also
called outside of lock protection.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In a debug run with 2 threads the following
assert fires after some minutes:
assert(value > -VALUE_INFINITE && value < VALUE_INFINITE);
in search(), line 1615.
I am not able to understand why; anyhow, reverted for the moment.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This avoids us forgetting some very much needed tests: now that
futility has changed in one whole big chunk, we need to
fine-tune every split change.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This will be useful for using the gains table in move
ordering along with the history table.
No functional change and a big code removal.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
With the current search control system, I can see absolutely no
reason to classify a fixed-time search as an infinite search.
So remove the old, dated hack.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The value of uip.eof() should not be trusted.
Input like "go infinite searchmoves " (note the space at the end of the line)
causes problems.
Check the return value of (uip >> token) instead.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In less than 1% of cases value > sp->bestValue, so avoid
a useless lock in the common case. This is the same change
already applied to sp_search().
Also SplitPoint futilityValue is no longer volatile because it
never changes after it has been assigned in split().
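A self-contained sketch of the check-before-lock pattern (illustrative types;
the real SplitPoint code differs): the cheap unlocked test filters out the
common case, and the test is repeated under the lock because another thread
may have improved bestValue in the meantime.

    #include <mutex>

    struct SplitPointLike {
        std::mutex lock;
        volatile int bestValue;
    };

    void update_best_value(SplitPointLike& sp, int value)
    {
        if (value <= sp.bestValue)            // cheap unlocked test: true in
            return;                           // ~99% of the calls

        std::lock_guard<std::mutex> guard(sp.lock);
        if (value > sp.bestValue)             // re-check while holding the lock
            sp.bestValue = value;
    }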
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
When we have more than 2 threads we do an array
access with index 'Threads[slave].activeSplitPoints - 1'.
This should be >= 0 because we tested the variable just a
few statements before, but because it is a shared variable
it could be that the 'slave' thread sets the value to zero
just after we test it, so that when we use the decremented
variable for the array access we crash.
Bug spotted by Bruno Causse.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Unify the start loop for master and slave threads. Also guarantee
that all the 'stop' flags are set to false before the first slave
is started. This should do no harm because only the master thread can
set a slave's 'stop' flag to true, so there should be no race, but
better safe than sorry.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Only pure blocking evasions are candidates
for pruning.
After 998 games at 1+0
Mod vs Orig +215 =596 -187 +10 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In move_to_san() we create a new position by copy just
to detect if the move gives check. This could be very costly in
line_to_san(), which calls move_to_san() for every move, so
create the position only once and pass a reference to move_to_san().
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
When pondering, threads are put to sleep, but when thinking
the threads are parked in idle_loop() in a tight polling loop
checking the workIsWaiting flag.
So before we set the slave's workIsWaiting flag we have to
guarantee that all the slave data is already set up, because the
slave can start at any moment from there.
Rearrange the last loop to fix this race.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Once we have allocated our slave threads and we have removed
the master from the available threads, we can safely release the lock
so that the lengthy search stack copy operation will not
impact lock contention.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
A pointer is enough because after a split point has been
set up, master and slave threads end up calling sp_search() or
sp_search_pv(), and there a full copy of the split point position is
done again; note that even the master does another copy (of itself)
and this is done before any do_move() call, so that the master Position
is never updated between split() and sp_search().
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
And detach the splitPoint Position from the master one.
So we duplicate the StateInfo only once in split() instead
of once for each thread in sp_search().
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Renamed the functions a bit to make clearer what
we are actually doing when we create a Position object,
and explained how StateInfo works.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Also removed some trailing whitespaces and aligned
indentation to current standard.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Only the previous, the current and the next ply SearchStack
are copied.
This reduces split overhead especially at low depth (high ply)
and with many threads.
Possibly no functional change (it is not easy to prove in SMP)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The init_eval() function corrupted the static array castleRightsMask[]
in the Position class, resulting in instant crashes in most Chess960
games. Fixed by repairing the damage directly after the function is
called. Also modified the Position::to_fen() function to display
castle rights correctly for Chess960 positions, and added sanity checks
for uncastled rook files in Position::is_ok().
It is good programming practice to verify that a system
call has indeed succeeded.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
When a search fails high, sp->alpha is increased and
slave threads are requested to stop.
So we have to check for a stop request before starting a search,
otherwise we could end up with sp->alpha >= sp->beta,
leading to an assert in a debug run in search_pv().
This patch fixes the assert and gets rid of some possible races.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
maximum depth during a "go movetime ..." search. This prevents
Stockfish from hanging forever after finding a mate in two or
three while running a test suite at a level of a few seconds
per move.
No functional change when playing games at normal time controls.
In qsearch() try to get a cutoff with the help of an
extra check if we are already very near.
A small increase in actual games but a good result in tactical
test sets, where this patch makes SF more tactical.
Mod vs Orig +197 =620 -181 +6 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Louis Zulli reports a miscompile with g++-4.4 from MacPorts.
Namely, enum Value is compiled as an unsigned instead of a signed integer
and this yields an issue in score_string(), where float(v) is incorrectly
cast when Value v is negative.
This patch ensures that the compiler chooses a signed type to store a Value.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The value is never used uninitialized, but MSVC is not
smart enough to detect this by itself :-(
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Use the same scoring system used for evasions. A small increase,
if any, but it should be in, at least for completeness.
After 999 games at 1+0
Mod vs Orig +208 =601 -190 +6 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Now that we have position static score we don't
need to call evaluate() a second time.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This will allow wider access to attack
information, for instance from MovePicker.
Note that the 'eval' field becomes obsolete; it is kept
just because when we get a position score from the TT
we update 'eval' even without an EvalInfo object.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In sp_search_pv() we do an LMR search using sp->alpha; at the end
we detect a fail high with the condition (value > sp->alpha), but if
another thread has increased sp->alpha during our LMR search we
could fail to detect a fail high event, because value will be equal
to the old alpha and so smaller than the new one.
This patch fixes this SMP bug and also changes the non-SMP versions
of the search to keep the code style in sync.
of the search to keep code style in sync.
Bug spotted by Bruno Causse.
No functional change (for single CPU case)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
futilityValue is now calculated immediately after
staticValue, so remove a small bunch of unused code.
No functional change
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
MSVC raises a "use of partially uninitialized variable" warning for
futilityValue and staticValue, but this is not true because when !isCheck
the variables are never used; anyhow, silence the warning.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This is less prone to bugs because now it's up to the
compiler not to forget this important initialization.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
No change in functionality signature
The only functional change is that when we reach PLY_MAX,
we now return VALUE_DRAW instead of evaluating the position.
But we reach PLY_MAX only when the position is dead drawn and
the transposition table is filled with draw scores, so this
shouldn't matter at all.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This way razoring is always based on the exact evaluation and
follows a simple formula.
Joona's test results are positive:
32-bit 1CPU:
Mod - Orig: 1073 - 993
64-bit 4CPU:
Mod - Orig: 759 - 721
Functionality Signature: 11448962
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Don't clamp to zero if a move continues to fail.
After 946 games at 1+0
Mod vs Orig +208 =562 -176 +12 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Also retire the razoring margins vector and use
a simpler formula instead.
Now that we use a more accurate static evaluation,
try to avoid useless null searches when we are well
below beta. And for the same reason increase the
razoring a bit.
After 972 games at 1+0
Mod vs Orig +224 =558 -190 +12 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It means we have already received a "stop" or "quit" command.
This fixes a hang in a tactical test in the Fritz GUI. Bug
introduced by the previous bug fix :-(
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
According to the UCI standard, once the engine receives the 'go infinite'
command it should search until the "stop" command and not exit
the search without being told so, even if PLY_MAX has been reached.
The patch is quite invasive because it cleans up some hacks used
by the fixed depth and fixed nodes modes, mainly during benchmarks.
Bug found by Pascal Georges.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
After deep tests Louis Zulli found on his OCTAL machine that
the best setup for an 8 core CPU is as follows:
"Threads" = 8
"Minimum Split Depth" = 6 or 7 (mSD)
"Maximum Number of Threads per Split Point" = not important (MNTpSP)
Here are testing results:
mSD7 (8 threads) vs mSD4 (8 threads): 291 - 120 - 589
mSD6 vs mSD7: 168 - 188 - 644
mSD6-MNTpSP5 vs mSD6-MNTpSP6: 172 - 172 - 656
SF-7threads vs SF-8threads: 179 - 204 - 617
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
So as to have the same layout and be as similar as
possible. The only functional change is that now we
try the ttMove first also in PV nodes and at the end
we save the ttMove, as happens in search(). This
should have almost zero impact on ELO but it seems
the correct thing to do.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This is an important design change because we now
compute the evaluation in each node.
This is a 2.0-type change!
After 977 games at 1+0
Mod vs Orig +236 =538 -202 51.74% +12 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
If after 'moves' there is a space then we crash.
The problem is that operator>>() trims whitespace, so that
after 'moves' has been extracted we are still not at eof(),
but the remaining string contains only spaces. So the next
extraction operation uip >> token ends up with an unchanged token
value that remains 'moves'; this garbage value is then fed
to RootPosition.do_move() through move_from_string(), which does
not detect the invalid move value, leading to a crash.
This bug is triggered by the Shredder 12 interface under Mac, which
puts a space after 'moves' without any actual move list.
Bug fixed by Justin Blanchard.
After reviewing the UCI parsing code I spotted other possible weak
points due to the fact that we don't test if the last extraction
operation has been successful. So I have extended Justin's patch
to fix the remaining possible holes in uci.cpp.
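A minimal sketch of the defensive parsing pattern (illustrative names, not
the real uci.cpp code): every extraction is checked, so a trailing space
after 'moves' simply ends the loop instead of reusing the stale token.

    #include <sstream>
    #include <string>

    void parse_move_list(std::istringstream& uip)
    {
        std::string token;
        while (uip >> token)    // extraction is checked: a trailing space
        {                       // ends the loop instead of leaving the
                                // previous token ("moves") in place
            // here 'token' is a move string like "e2e4" to be converted
            // (e.g. via something like move_from_string()) and played
        }
    }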
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
According to the standard, en passant is recorded in the FEN string regardless
of whether there is a pawn in position to make an en passant capture.
Internally, instead, we set the ep square only if the pawn can be captured.
So teach from_fen() to correctly handle this difference.
Bug reported and fixed by Justin Blanchard.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Try to get a position evaluation better than
the quick one with the help of the TT.
This allows the null search conditions and
the chosen reductions to be more accurate.
After 908 games at 1+0
Mod vs Orig +209 =526 -173 +14 ELO
Functionality Signature: 16627355
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
A linear rule, less aggressive than Dann's.
It seems to scale well with depth. We will need to
verify against a weaker engine whether it keeps the score.
After 999 games at 1+0 on my Dual Core
Mod vs Orig +232 =534 -207 +9 ELO
After 1000 games by Martin Thoresen on his QUAD at 1+0
Mod vs Orig 521/479 52.10%
Functionality Signature: 17655312
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The most interesting thing is a bit of rewriting
and simplification in connected_moves().
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is not useful because it is safe to call
get_next_move() multiple times when phase == PH_STOP.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Because that's the correct meaning. Note that also the
corresponding UCI option has been renamed.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
After history accounting rewrite in 1.6, a small
tweak of history parameters seems positive.
Note that these are not to be considered the optimal
values, just a wild guess that proved good.
Finding the optimal values would require a much longer
testing time.
After 967 games at 1+0
Mod vs Orig 240 529 198 +15 ELO
Functionality Signature: 21222553
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
According to the standard, the compiler is free to choose
the enum's underlying type as long as it can hold its values.
Also, a cast to short and a right shift are implementation
defined in the case of a signed integer.
Normally all compilers implement this stuff in
the "usual" way, but gcc with -O3 and -O2 aggressively
pushes the language to its limits to squeeze
out even the last bit of speed. And this broke our
not 100% standard conforming code.
The fix is to rewrite the Score enum and the 16-bit
word extraction functions in a way that is 100% standard
compliant and with no speed regression on gcc nor on
the other compilers.
Verified it works on all compilers and with equivalent
functionality.
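A self-contained sketch of one fully standard-compliant way to pack two
16-bit values into a 32-bit integer and extract them again; the actual
Score code may use a different mechanism, and the names and the use of
memcpy here are purely illustrative.

    #include <cstdint>
    #include <cstring>

    typedef int32_t Score;   // eg in the upper 16 bits, mg in the lower 16 bits

    inline Score make_score(int mg, int eg)
    {
        // Values are assumed small enough that eg * 0x10000 + mg cannot overflow.
        return Score(eg) * 0x10000 + Score(mg);
    }

    inline int mg_value(Score s)
    {
        // Reinterpret the low half as signed via memcpy: no shifts of
        // negative values, no type-punned pointers.
        uint16_t lo = uint16_t(uint32_t(s));
        int16_t mg;
        std::memcpy(&mg, &lo, sizeof(mg));
        return mg;
    }

    inline int eg_value(Score s)
    {
        // The +0x8000 compensates the borrow a negative mg half produces,
        // so that two Scores can still be added as plain integers.
        uint16_t hi = uint16_t((uint32_t(s) + 0x8000u) >> 16);
        int16_t eg;
        std::memcpy(&eg, &hi, sizeof(eg));
        return eg;
    }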
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The compiler is allowed to choose the size of an enum variable
based on the values it is expected to store. So force the compiler
to use at least a 32-bit integer type for the Score.
MSVC and Intel do not change, while gcc under -O3 is affected
by this change.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We cannot cast a pointer type to an unrelated pointer type.
This is a violation of the strict aliasing rules.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is not clear why it is not true, even in the single-thread
case, but as a matter of fact it is not!
So remove it.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Unfortunately we need to slow down to -O1 to be sure
it always works.
Note that sometimes it works also with -O2 or even -O3,
but the user has to try for himself.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Instead it should be read from the corresponding UCI
option "Book File".
Bug reported and fixed by Justin Blanchard (Arch Linux)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Was a long-standing hidden bug from Glaurung times,
triggered only now that we enable IID at non-PV nodes.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We want to increase the opportunities
of doing an exclusion search.
After 999 games at 1+0
Mod vs Orig +216 =574 -209 50.35% 503.0/999 +2 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Instead decrement history value on failure.
After 999 games at 1+0
Mod vs Orig +236 =558 -204 51.60% +11 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This way we avoid overwriting valuable TT entries which
are needed to calculate the exclusion search extension for the PV.
Mod - Orig: 483 - 410 (+28 elo!)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
After 999 games we are almost equal (+2 ELO),
but we have a good result against Rybka
Rybka 2.3.2a mp 32-bit vs Mod 254.5 - 242.5 +152/-140/=205 51.21%
Rybka 2.3.2a mp 32-bit vs Orig 259.5 - 236.5 +151/-128/=217 52.32%
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Now we always try to filter out moves; we will have
more wasted evaluation calls, but also more pruned
nodes.
After 786 games
Mod vs Orig +196 =413 -177 +8 ELO
Verified also against Rybka it increases score to 50-51%
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We might be asked to ponder a mate or stalemate position.
This being the case, simply wait for stop or ponderhit.
Currently we crash.
The UCI specs aren't clear on the issue, but it costs nothing to
add a little check.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The problem is that
istream& operator>> (istream& is, char* str);
according to the C++ documentation, "Ends extraction when the
next character is either a valid whitespace or a null character,
or if the End-Of-File is reached."
So if the parameter value is a string with spaces, the currently
used instruction 'ss >> ret;' copies the characters only up to the first
whitespace and not the whole string.
Use a specialization of get_option_value() to fix this corner case.
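A self-contained sketch of the idea: for string-valued options, read the rest
of the line instead of a single whitespace-delimited token, then trim it.
The function name is illustrative, not the real ucioption code.

    #include <sstream>
    #include <string>

    std::string get_string_option_value(std::istringstream& ss)
    {
        std::string value;
        std::getline(ss, value);                    // take everything left

        size_t b = value.find_first_not_of(" \t");  // then trim both ends
        size_t e = value.find_last_not_of(" \t");
        return b == std::string::npos ? std::string()
                                      : value.substr(b, e - b + 1);
    }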
Bug reported by xiaozhi
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Now we use a formula to calculate margins on the fly.
The node count has changed because we fixed a leftover from when
we were still using FutilityMargins to calculate futilityValue
in the case where we had the evaluation score in the TT.
Also a small indentation fix.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Increase pruning at low depths while toning it down a bit at
higher depths (linearizing the logarithmic behaviour a bit).
This goes together with IncrementalFutilityMargin decreased
to 4 to compensate for the bigger pruning effect.
The total of pruned nodes is more or less the same. We go from 36%
of nodes after pruning to 37% with this patch.
After 999 games at 1+0
Mod vs Orig +250 =526 -223 +9 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
If we get as far as the exclusion search call then
we know that move == ttMove == tte->move().
But using ttMove in the search call while, in the excluded search
conditions, we have used tte->move() could be a little bit suboptimal.
On the other hand, using tte->move() also in the search call is a bit ugly,
so opt for the third choice, which is the cleanest, because from the
conditions the reader easily understands that we are talking about ttMove
and that we want to exclude the move we are evaluating at that moment.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
If tte->move() != MOVE_NONE then tte->move() == ttMove.
What could happen is that we have a ttMove without a tte,
or we have a tte but tte->move() == MOVE_NONE.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Due to IID we could have a ttMove and not a tte, or,
even if we have a tte, they could belong to different
searches, so that the depth and type of the tte don't
have the same origin as the ttMove.
To fix this we always use the tte entry in the excluded search
condition and, after an IID, we re-probe the TT.
No functional change, apart from a possible crash fix.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
They gave bad results:
Mod - Orig: 361 - 404
Master is now verified to be functionally equivalent to F_63.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Because we have a lot of zero scores (around 30% of moves)
it is a good idea to do a couple of presorting loops across
the move list and shuffle the moves a bit, so that with a
small effort we end up with 3 groups of moves: positive
scores, zero scores and negative scores.
We have two advantages:
1) We don't need to sort the zero scores
2) Sorting two small groups is faster than sorting a single big one
The speed up is about 2%.
Because equally scored moves could be reordered in a different way
this is not a "no functional change", although I have verified that
the output list is always correctly sorted.
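A self-contained sketch of the presorting idea described above (illustrative
types; the real sorting code differs): partition the list into positive, zero
and negative scores, then sort only the two small non-zero groups.

    #include <algorithm>
    #include <vector>

    struct MoveStack { int move; int score; };

    void sort_skipping_zero_scores(std::vector<MoveStack>& moves)
    {
        auto byScoreDesc = [](const MoveStack& a, const MoveStack& b)
                           { return a.score > b.score; };

        // First pass: positives to the front; second pass: zeros before negatives.
        auto zeroBegin = std::partition(moves.begin(), moves.end(),
                                        [](const MoveStack& m) { return m.score > 0; });
        auto negBegin  = std::partition(zeroBegin, moves.end(),
                                        [](const MoveStack& m) { return m.score == 0; });

        std::sort(moves.begin(), zeroBegin, byScoreDesc);   // small positive group
        std::sort(negBegin, moves.end(), byScoreDesc);      // small negative group
        // The large zero-score group in the middle needs no sorting at all.
    }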
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We already reset the loseOnTime flag at the beginning of
a new game, so we can simplify the logic there a bit.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
* The new version is documented and the logic should be easier to follow
* Add an extra check to not use LSN with an x moves / y seconds time control
* The new code fixes some rare cases where the old code (still) causes the program to lose on time at move 1.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
* The function is called only in one place
* It must not be called elsewhere
* The function call is easily replaced with a simple one-line condition
No functional change (tested with usual set + 2000 random positions)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
* The first one is without any documentation; the code is working just fine,
so there seems to be nothing that really should be fixed.
* The second one requests emergency measures on an aspiration fail low
when we are running out of time and are without a good move.
After a very long time, I've come to the conclusion that this is
impossible to fix, so remove the request.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Use an exponential law instead of a linear one for
history pruning.
This should prune more at low depths and a bit less
at high depths.
After 965 games
Mod vs Orig +233 =504 -228 +2 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Correct beta by the razor margin when calling qsearch.
After 1019 games on Joona's QUAD
Mod - Orig: 524 - 495 (+10 elo)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Game phase is strictly a function of the material
combination, so its natural place is MaterialInfo,
not Position.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Call 'perft 6' directly to search up to depth 6*OnePly
instead of the old 'perft depth 6'.
It is more in line with what other engines do. Also a bit
of cleanup while there.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Move the code to the caller and also move the mob_area
computation out of evaluate_pieces(). The code flow is
clearer and it is also faster.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
When false (the common case) we avoid updating the checkers
bitboard, which, although not so costly, slows down this
very hot and critical path a bit.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Avoid a 64-bit load by using a pointer. It saves a couple of push/pop
instructions, so the advantage is only theoretical, but anyway we use
pop_1st_bit() as a reference implementation for 32-bit systems, so
we keep it more for documentation purposes than for other reasons.
The pointer idea is from Eric Mullins.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It will be overwritten anyway.
Also other small touches that seem to increase
speed more than the whole enum Score patch series :-(
Optimization is really a black art.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Increases performance because now we use one integer
for both midgame and endgame scores.
Unfortunately the latest patches seem to have reduced the
speed a bit, so at the end we are more or less at the same
performance level as at the beginning. But this patch series
also introduced some code cleanup, which is the main reason
we commit anyway.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Instead of MgPieceSquareTable[16][64] and EgPieceSquareTable[16][64].
This allows fetching mg and eg values from adjacent words in memory.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Save midgame and endgame scores in a union so as to
operate on both values in one instruction.
This patch just introduces the infrastructure and changes
EvalInfo to use a single Score value instead of mgValue
and egValue.
Speed is more or less the same because we still don't use
unified midgame-endgame tables, where the single-assignment
optimization can prove effective.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Try internal iterative deepening not only when we don't
have a TT move, but also if the search depth is more than 4*OnePly
higher than the TT move depth.
In some tests it seems that in around 20% of cases the ttMove changes!
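The trigger condition, sketched with illustrative names (depth values are in
the engine's internal units; this is not the actual code):

    bool want_iid(bool haveTTMove, int depth, int ttDepth, int OnePly)
    {
        // IID when there is no TT move, or the stored entry is much shallower.
        return !haveTTMove || depth > ttDepth + 4 * OnePly;
    }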
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Also remove some fallback templates that prevent a
compile error in case the user runs 'make icc-profile-popcnt'
from a non-supported machine.
We want to fail loudly in that case instead of silently
falling back to a non-popcount compilation.
Updated documentation too.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Variable 'f' in the 'for' loop scope hides the same-named
one in the outer scope.
Of course no functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Thanks to Eric Mullins we now have an endian-friendly
pop_1st_bit(), and the need to use the -fno-strict-aliasing
compiler option with GCC is also removed.
Speed is almost as fast, with a very small difference if any in the
perft test, so I assume almost no difference in real games.
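A self-contained sketch of an endian-friendly, alias-safe pop_1st_bit() for
32-bit targets; the real implementation by Eric Mullins differs, this only
illustrates avoiding type-punned pointer access.

    #include <cstdint>

    // Assumes *b != 0, like the real pop_1st_bit().
    int pop_1st_bit(uint64_t* b)
    {
        uint64_t bb = *b;
        *b = bb & (bb - 1);                        // clear the lowest set bit

        unsigned lo = unsigned(bb);                // scan 32-bit halves separately:
        unsigned half = lo ? lo : unsigned(bb >> 32);   // same result on little-
        int index = lo ? 0 : 32;                   // and big-endian machines

        while (!(half & 1)) { half >>= 1; ++index; }
        return index;
    }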
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This allows us to avoid the generation of the
evasion moves if we already have a TT move, and
in case we have a cut-off we skip evasion generation
altogether.
The node count has changed because now we try the TT move _before_
generating evasions. The search on the TT move alters the
piece lists, so that when we come back to generate evasions
we build the move list in a different order and this alters
the node count.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is no longer needed to know the dc candidates
inside MovePicker, so avoid calculating them there.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Big cleanup and simplification of pawn evasions, which
are now pseudo-legal like the remaining moves. This
allows us to remove a lot of tricky code.
Verified against perft: no functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This allows a big simplification in move generation
that will be committed with the next patch, and makes
the handling of evasions similar to the other types of moves.
This patch plus the next seem to improve also on
the performance side, because after 640 games to
verify there are no hidden regressions we are at +9 ELO.
Verified with perft: no functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Generate captures of the checking piece and blocking
evasions in one go.
Also reduce one indentation level by returning early
when we have a double check.
Verified with perft: no functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It doesn't seem to help, perhaps because we
return an approximate SEE score instead of the
real negative score, so that we have some sub-optimal
ordering of bad captures or evasions that compensates
for the speed up.
Anyhow after 999 games at 1+0
Mod vs Orig +240 =514 -245 -2 ELO
So there is almost no harm in removing it and making the code simpler.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Always try the ttMove first. Then try good captures ordered
by MVV/LVA, then non-captures whose destination square is not
under attack, ordered by history value, and at the end
bad captures and non-captures with a negative SEE. This
last group is ordered by SEE score.
After 999 games at 1+0
Mod vs Orig +254 =546 -199 +19 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Because we only generate legal moves we can assume
a king cannot be recaptured, so we can safely return
immediately with the captured piece score. If the move
turns out to be illegal it will be pruned anyhow,
independently of its SEE value. This gives a good speed up,
especially now that we SEE-test all the evasions, which
are always legal and very often are king moves.
Another optimization catches almost 15% of cases; unfortunately
we have already calculated the very expensive attacks, so
the benefits are not so big anyway.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This patch cuts 30% of SEE calculations; as a drawback
a returned negative value is no longer always correct if
a shortcut is found.
This could impact move ordering when it is based on a negative SEE
score, for example bad captures and evasions.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Check generation is used only in qsearch and
only at Depth(0); castling moves that give check
are very rare overall and almost non-existent
at Depth(0).
So retire this almost never used code that adds
a small but consistent slow down to the normal path.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Because discovered checks are very rare it is better to handle
them all in one go and strip them out of the usual check generation
function.
Also rewrite direct check generation to use piece lists instead
of pop_1st_bit().
On the perft test we have a +6% speed up and it is verified we
generate the same moves, although in a different order.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Patch from Joona, with the extension to benchmark and the inclusion
of Depth(0) move generation by me.
Note that to also test qsearch, and in particular check
generation, a change in the end condition is needed.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Verified against tuning branch.
After 100 games at 1+0 on Joona QUAD
Mod - Orig: 527.5 - 471.5 (+20 elo)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Give a bonus for each kind of attacked piece. Bonus
value is based on the type of attacked piece and the
type of attacking one.
Penalize pieces attacked by enemy pawns; also in
this case the penalty value depends on the type of
the attacked piece.
This patch obsoletes, as redundant, the increased mobility
count of the attacked squares, which is then removed.
After 956 games at 1+0
Mod vs Orig +262 =462 -232 51.57% 493.0/956 +11 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Small code cleanup, and a bit faster too.
The only functional change is that in the extension
code at PV nodes we extend promotions and not only captures
when the condition is met.
This is practically an undetectable change and has
no impact on strength.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Even if the SEE is negative there is always a good possibility
that the TT move is a cut move anyway. For instance a lot of
BxN exchanges that have a negative SEE can very easily be
good exchanges.
A nice side effect is a slightly reduced frequency of
see_sign() calls.
After 643 games at 1+0
Mod vs Orig +174 =327 -142 52.49% 337.5/643 +17 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
When the move list is very small, as capture lists normally
are, it is faster to pick the best move with a linear
scan, one per cycle.
This has the added advantage that the picked capture is
very possibly a cut-off move, so that further searches are
avoided. For non-captures it is still faster to sort in
advance.
Because the scan-and-pick algorithm is not stable, the node
count has changed.
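A minimal sketch of the scan-and-pick idea (the names are illustrative): instead of sorting the whole list up front, each call swaps the best remaining entry to the front of the unsearched part.
#include <algorithm>

struct MoveStack { int move; int score; };

// Assumes cur != end. One linear scan per picked move.
MoveStack* pick_best(MoveStack* cur, MoveStack* end) {
    std::swap(*cur, *std::max_element(cur, end,
        [](const MoveStack& a, const MoveStack& b) { return a.score < b.score; }));
    return cur;
}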
After 885 games at 1+0
Mod vs Orig +196 =510 -179 50.96% 451.0/885
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Only in less than 2% of cases do we have a new sp->bestValue,
so check before locking and save a costly lock operation
most of the time.
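A sketch of the idea with illustrative names: the lock is taken only in the rare case where the new value could actually improve sp->bestValue.
#include <mutex>

struct SplitPoint {
    std::mutex lock;
    volatile int bestValue;
};

void update_best(SplitPoint& sp, int value) {
    if (value <= sp.bestValue)          // cheap unlocked test, true >98% of the time
        return;

    std::lock_guard<std::mutex> guard(sp.lock);
    if (value > sp.bestValue)           // re-check under the lock
        sp.bestValue = value;
}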
Patch suggested by Joona.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Use futility pruning also in split points.
Do not use history pruning in split points when
getting mated.
After 1000 games on Joona QUAD
Orig - Mod: 496 - 504
Added an optimization to avoid a costly lock in the
very common case that sp->futilityValue <= sp->bestValue.
A test on a dual CPU shows only 114 hits on 23196 events,
so avoid a lock in all the other cases.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is stable and also a bit faster than std::sort()
on the typical small move lists that we need to handle.
Verified to have the same functionality as std::stable_sort().
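A minimal insertion-sort sketch (illustrative, not necessarily the committed code): it is stable and cheap on the short lists involved here.
struct MoveStack { int move; int score; };

void insertion_sort(MoveStack* begin, MoveStack* end) {
    for (MoveStack* p = begin + 1; p < end; ++p) {
        MoveStack tmp = *p;
        MoveStack* q = p;
        for (; q != begin && (q - 1)->score < tmp.score; --q)
            *q = *(q - 1);   // shift lower-scored entries to the right
        *q = tmp;            // equal scores keep their original order (stable)
    }
}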
After 999 games at 1+0
Mod vs Orig +240 =534 -225 50.75% 507.0/999 +5 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
If after the first 2 + int(depth) tried moves we still
have no move that takes us out of a mate, then do not
prune the following moves; it is more important to
escape the mate than to speed up the search.
This fixes an odd behaviour regarding mates: for example,
the following diagram is a mate in 4, not in 3 as wrongly
reported before this patch.
1B2n3/8/2R5/5p2/3kp1n1/4p3/B3K3/8 w - - bm #4;
The performance impact should be minimal; the increase
in searched nodes is less than 0.1%.
Idea and patch by Joona
After 999 games at 1+0
Mod vs Orig +193 =604 -202 -3 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Marco's new code for evaluating two unstoppable passed pawns where
one pawn promotes a single ply before the other tried to detect
cases where the pawn that promotes first could immediately capture
the pawn that promotes a ply later, but didn't work in cases where
the two pawns are on the same file. An example of this is the
following position:
8/8/3K4/2P5/2p5/3k4/8/8 w - -
With the new code, such positions are handled correctly.
In this case we would call evaluate() while in check, and this
is not allowed.
Bug found testing with reduced PLY_MAX value as suggested
by Miguel A. Ballicora on talkchess.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Add a rule for the situation when one side queens exactly
one ply before the other. To avoid difficult (but luckily rare)
cases we only handle the case of free paths to promotion for
both sides.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Fix a condition for the x-ray attack of a queen or
a rook behind one of our pawns. The previous condition did
not check whether the enemy slider behind our pawn is
really attacking the pawn.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Unfortunately the std::stable_sort() implementation in gcc is
horrendously slow. We have a big performance regression on
Linux systems (-20%!).
So revert the commit and wait to fix the issue in a different
way, perhaps with our own home-grown sorting that should be
comparable in speed to std::sort().
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The standard does not mandate std::sort() to be stable, so we
can have, and actually do have, different node counts on
different platforms.
So use the platform independent std::stable_sort() and get the
same functionality on any platform (Windows, Unix, Mac OS)
and with any compiler (MSVC, gcc or Intel C++).
This sort is theoretically slower, but profiling shows only a very
minimal drop in performance, probably due to the fact that
the set to sort is very small: mainly captures and, less
frequently, non-captures, so we are talking of 30-40 moves
in the worst average case. Sorting algorithms are built to work on
thousands or even millions of elements. With such small sets
the performance difference seems not noticeable.
After 999 games at 1+0
Mod vs Orig +234 =523 -242 -3 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We want the piece list to be terminated by SQ_NONE.
This happens for all pieces except the pawns, which,
being 8, make the inner loop exit just before writing
the SQ_NONE value at the tail of the list.
This bug was hidden because we currently don't use the
piece list to scan pawns, but this will change in the
future, and in any case the initialization should be done
correctly for the whole array to avoid subtle bugs
later.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This allows resolving a lot of addresses at compile time
instead of via an indirect access at runtime.
The speed-up on a pgo compile is 1.3%.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is not equivalent because the check for the
50 moves rule gets badly affected.
// Draw by the 50 moves rule?
if (st->rule50 > 100 || (st->rule50 == 100 && !is_check()))
return true;
So we _really_ need two counters.
Thanks to Joona and Tord for being patient with a silly guy ;-)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In case we reach ply == PLY_MAX we exit the function
writing
pv[PLY_MAX] = MOVE_NONE;
And because SearchStack is defined as:
struct SearchStack {
Move pv[PLY_MAX];
Move currentMove;
.....
We end up with the unwanted assignment
SearchStack.currentMove = MOVE_NONE;
Fortunately this is harmless because currentMove is not used where
extract_pv() is called. But nevertheless this is a bug that
needs to be fixed.
Thanks to Uri Blass for spotting this.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Null moves can artificially create a repetition
draw where in fact there is none.
So use a second counter, so that the repetition history
is reset after a null move.
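A sketch of the second counter, with assumed field names: repetition detection only looks back as far as both counters allow, so positions on the other side of a null move can never count as repetitions.
#include <algorithm>

struct StateInfo {
    int rule50;         // plies since the last pawn move or capture
    int pliesFromNull;  // plies since the last null move, reset by do_null_move()
};

// Only positions within this window are candidates for a repetition draw.
int repetition_window(const StateInfo& st) {
    return std::min(st.rule50, st.pliesFromNull);
}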
Idea from Joona.
After 999 games at 1+0
Mod vs Orig +238 =553 -208 51.50% 514.5/999 +10 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
When pondering, InfiniteSearch == false but myTime == 0, so that
NodesBetweenPolls = 1000 instead of the standard value.
The patch fixes the bug and is more robust because it checks
myTime directly for a non-zero value, without relying on
an indirect test (InfiniteSearch in this case).
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Instead of checking the time every 100 nodes in the last second,
and every 1000 nodes in the last five seconds, Stockfish now checks
every 1000 nodes in the last second and every 5000 nodes in the last
five seconds. This was tested in 1036 games at a time control of
40 moves/10 seconds, and no losses on time occurred.
Also fixed a bug pointed out by Marco: In infinite mode, myTime
is actually 0, but of course we still don't want to check the time
more frequently than the standard once per 30000 nodes in this
case.
In the RootMoveList c'tor we allocate a search stack and then
call qsearch directly.
There init_node() is called, which clears all the fields of the
search stack entry for the current ply but not the killer moves:
the killer moves it clears correspond to ply+2.
In id_loop() this is not a problem because the killer moves of
the corresponding ply are cleared anyway a few instructions later,
but in the RootMoveList c'tor we leave them uninitialized.
This patch fixes this very old bug. It comes directly
from the Glaurung age.
Bug spotted by Valgrind.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In particular, don't use an array of StateInfo; this
avoids a possible overflow and is in any case redundant.
Also pass the size of the pv[] array as an argument to avoid
a second possible overflow there.
Fix suggested by Joona.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It turned out that we used do_move_bb to update the king and rook
bitboards when making and unmaking castling moves, which obviously
doesn't work in Chess960, where the source and destination squares
for the king or rook could be identical.
No functional change in normal chess.
If an enemy piece is defended by a pawn don't give the
extra mobility bonus in case we attack it.
Joona says that a "paralyzing pawn" is usually worth nothing.
On Joona QUAD after 964 games:
Orig - Patch_2: 191 - 218 - 555 (+ 10 elo)
On my PC after 999 games at 1+0:
Mod vs Orig +227 =550 -222 50.25% 502.0/999 +2 ELO
In both cases we tested against the original version (without
increased mobility), not against the previous patch that instead
seems to fail on Joona QUAD:
Orig vs. Prev.Patch: 237 - 217 - 627 (-6 elo)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Now in mobility we count enemy attacked pieces as
empty squares.
With this patch we try to give a higher score to positions
where the number of attacked pieces is higher.
After 999 games at 1+0
Mod vs Orig +262 =517 -219 52.15% 520.5/998 +15 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
They are invariant with respect to the pawn structure, so it
makes sense to store them together with the pawn info instead
of recalculating them each time evaluate() is called.
Speed up is around 1%
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This satisfies a specific user request of 28/8/2009
"The only issue I have is that during multiPV analysis, the depth 1
best move score is not reported by the engine (reporting for the best
move begins at depth 2). I need it at depth 1 also. Would it be
possible to make this modification in future versions? This would be
of great help as otherwise I will have to use a lesser engine.
The goal of my project is to calculate the ELO performance in a game
and also the ELO rating of individual moves. For this I need depth 1
scores for lower rated performances. I intend to distribute the program
for free upon completion.
Thanks, Jack Welbourne"
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is used only in weight_option(), so inline it there.
Also unroll the color loop for evaluate_space(), and
finally some assorted code style fixes.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Use templates to manually unroll the loops so that
many values can be calculated at compile time, or at
runtime but with a fast direct memory access instead of
an indirect one.
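A minimal sketch of the technique (illustrative names): making the color a template parameter turns a runtime loop into two inlined calls whose indices are compile-time constants.
enum Color { WHITE, BLACK };

// 'Us' is a compile-time constant, so every table indexed by it becomes
// a direct memory access instead of an indirect one.
template<Color Us>
int evaluate_side(const int bonusByColor[2]) {
    return bonusByColor[Us];
}

int evaluate_both(const int bonusByColor[2]) {
    return evaluate_side<WHITE>(bonusByColor) - evaluate_side<BLACK>(bonusByColor);
}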
This change gives a speed-up of 3.5% on the pgo build !!! :-)
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Move to what we already do in generate_piece_moves().
This simple patch gives a speed-up of 1.4% !!
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Use equivalent Windows function _ftime() instead.
This patch also removes two long standing warnings
under MSVC.
No functional change and no change for non-Windows systems.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This patch makes the piece list always terminated by SQ_NONE,
so that we can use a simpler and faster loop in move
generation.
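The point of the SQ_NONE terminator, in a minimal sketch with illustrative types: the scan needs no separate piece counter.
enum Square { SQ_A1 = 0, /* ..., */ SQ_NONE = 64 };

int count_pieces(const Square* pieceList) {
    int n = 0;
    for (const Square* s = pieceList; *s != SQ_NONE; ++s)
        ++n;   // the sentinel replaces an explicit end-of-list index
    return n;
}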
Speedup is about 0.6%.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
With this very simple patch we get a speed boost
of 0.8% on my PC !
Sometimes we devise the most complex tricks to increase speed,
when instead the best results come from the simplest solutions.
No functional change of course ;-)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
A better and more specific name. Also a bit of code reshuffle.
Verified No functional change and No performance change
for the whole series.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Rewrite in the form normally used in other similar
functions like generate_pawn_noncaptures().
This allows easier reading of the pawn move generators
and simplifies the code a bit.
No functional change (tested on more than 100M nodes).
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
A bunch of trivial code style and comment fixes.
Among them there is a real fix for a subtle case
involving promotion moves.
We currently check that a pawn push to the 8th/1st rank
must be a promotion, but we don't check the contrary,
i.e. that a pawn push to a different rank must NOT be
a promotion. Note that, funnily enough, we perform this
check for all the other pieces, but not for the pawns!
This patch fixes this real corner case.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Remove a bunch of difficult and tricky code that tests the
legality of castling and en-passant moves and instead use a slower
but simpler check against the list of generated legal moves.
Because these moves are very rare the performance impact
is small, but the code simplification is very big: almost 100 lines
of difficult code removed!
No functional change. No performance change (strangely enough
there is not even a minimal performance regression in pgo builds,
but instead a slight and unexpected increase).
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In qsearch at depth 0 we generate only captures and checks.
Queen promotion moves are generated among the captures, but
under-promotion moves (both captures and non-captures) are
never generated, even if they could give a discovered check.
This patch fixes this limitation by extending generate_pawn_noncaptures()
to also generate check moves when required.
Apart from adding the (rare) case of an under-promotion that gives
discovered check, the patch is also a good cleanup because it removes
generate_pawn_checks() altogether.
This patch does the code clean-up but does not enable the functional
change, so as to allow easier debugging.
No functional change and no performance change (actually a very
small speed increase).
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We are also generating king moves that give check!
Of course these moves are illegal, so they are in any case
filtered out in MovePicker. Nevertheless we should avoid
generating them.
Also simplify the code a bit.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Idea from Joona.
After 999 games at 1+0 on my Intel Core 2 Duo
Orig - Mod: +215 =538 -226 (+11 ELO)
On Joona QUAD after 845 games at 1+0
Orig - Mod: 151 - 181 - 513 (+13 elo)
So it seems a good change !
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
When a node fails low and bestValue is still equal to
the original static node evaluation, then save this
in TT along with usual info.
This will allow us to avoid a future costly evaluation() call.
This patch extends to failed low nodes what we already do
for failed high ones.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Still not clear whether it helps and, especially, how it
helps. So revert for now, to avoid any influence on the
features now under test.
With this patch we go back to being functionally
equivalent to patch e33c94883 F_53.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
After 828 games at 1+0
Mod vs Orig +191 =447 -190 50.06% 414.5/828
So almost no difference. The patch is committed more for
documentation purposes than for other reasons.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is in line with attackers_to() and is shorter, and the
piece argument is already redundant because it is passed as a
template parameter anyway.
Also integrate pawn_attacks_from() into the attacks_from()
family, so as to have a uniform attack info API.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Use the definition in the few places where it is needed.
As a nice side effect there is also an optimization in
generate_evasions(), where the bitboard of enemy pieces
is computed only once, outside of a tight loop.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is a bit longer but much easier to understand, especially
for people new to the sources. I remember it was not trivial
for me to understand that the returned attack bitboard refers to
attacks launched from the given square, not attacks against the
given square.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Most of them are not required to be public and are
used in one place only, so remove them and use their
definitions directly.
Also rename piece_attacks_square() to piece_attacks()
to align with the current naming policy.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
These functions return a bitboard of attacking pieces,
not the attacks themselves, so reflect this in the name.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Instead of pawn_attacks(Color c, Square s), define it as
pawn_attacks(Square s, Color c) to be more aligned with
the other attack info functions.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Remove the undefined functions sliding_attacks() and ray_attacks()
and retire square_is_attacked(), using the corresponding definition
instead. This makes it clearer that we are computing full attack
info for the given square.
Also fix some obsolete comments in the move generation functions.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It seems that it works better without compensating
for the drifted value when saving the static evaluation in the TT.
After 818 games at 1+0
Mod vs Orig +217 =429 -172 52.75% 431.5/818 +19 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This avoids inclusion of a bunch of not very commonly
used headers from windows.h
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Return the bitboard with the pawn attacks for both colors,
so as to be aligned with the meaning of the other piece_attacks<Piece>
templates.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
One of the most time critical functions is move_is_check(),
and in particular the call to type_of_piece_on(from) in the
switch statement.
This call looks up the board[] array and can be slow if board[from]
is not already cached. A few instructions earlier in the execution
stream we check the move for legality with pl_move_is_legal().
This patch changes pl_move_is_legal() to use type_of_piece_on(from)
to check for a king move, so that board[from] is automatically
cached in L1 and ready to be used by the closely following move_is_check().
Another advantage is that the call to king_square(us) in pl_move_is_legal()
is avoided most of the time.
The speed-up of this nice and tricky patch is 0.7%!
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This patch is built on Tord's idea to use functions instead of
templates to access the position's bitboards. This has the added
advantage that we don't need fallback functions for cases where the
piece type or the color is a variable and not a constant.
Also added Joona's suggested workaround for requesting two piece
types at once, like bishop_and_queens() and rook_and_queens().
No functionality or performance change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Use a single template to get the bitboard representation of
the position given the piece type as a constant.
This removes almost 80 lines of code and introduces a
uniform notation for querying by piece type.
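A minimal sketch of such a single accessor template (the member layout is an assumption): the piece type is a constant, so each call compiles to one array load with a constant index.
#include <cstdint>

typedef uint64_t Bitboard;
enum PieceType { PAWN, KNIGHT, BISHOP, ROOK, QUEEN, KING, PIECE_TYPE_NB };
enum Color { WHITE, BLACK, COLOR_NB };

struct PositionSketch {
    Bitboard byTypeBB[PIECE_TYPE_NB][COLOR_NB];

    template<PieceType Pt>
    Bitboard pieces(Color c) const { return byTypeBB[Pt][c]; }
};

// Usage: pos.pieces<ROOK>(WHITE) instead of a dedicated rooks(WHITE) accessor.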
No functional change and no performance change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
After 934 games at 1+0
Mod vs Orig +228 =493 -213 50.80% 474.5/934 +6 ELO
So it seems not negative, and there is also the added
benefit of unifying the use of LMRPVMoves in search_pv()
and in the root list.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
I managed to completely mismerge the correct values
for the QuadraticCoefficientsOppositeColor table :-(
Now it corresponds to the tuning branch for real.
After 999 games at 1+0
Mod vs Orig +247 =512 -240 50.35% 503.0/999 +2 ELO
So almost no change, but the new values come from the
same tuning session as the others, so it makes more sense
to use these ones.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Because of a hard-to-spot single-character bug in connected_moves(),
the discovered check code had no effect whatsoever. The condition
in the if (...) statement at the beginning of the code would always
return false.
Thanks to Edsel Apostol for pointing out this bug!
It is used mainly in a bunch of inline oneliners
just below its definition. So substitute it with
the explicit definition and avoid information hiding.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is clearer that only in that case the move number is
correct; otherwise it is only a partial quantity, the number of
moves of that phase.
In the case of PH_EVASIONS, instead, we have only one phase.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The index[] and pieceList[] arrays are not guaranteed to be
invariant under a do_move() + undo_move() sequence when a
capture move is involved.
The reason is that the captured piece is removed from
the list and replaced with the last one in do_move(),
while in undo_move() it is added back, but at the end of
the list.
Because index[] and pieceList[] are used in move generation
to scan the pieces, moves will be generated
in a different order before and after a do_move() + undo_move()
sequence, for instance the one in Position::has_mate_threat().
After the latest patches, move generation can now also be invoked
by the MovePicker c'tor, and this explains why the order of
picked moves is different if the MovePicker object is instantiated
before or after a Position::has_mate_threat() call.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It seems that pos.has_mate_threat() changes the position!
So calling the MovePicker c'tor before or after the
has_mate_threat() call changes things!
The bug was unhidden by the previous patch that makes the MovePicker
c'tor generate, score and sort good captures under some circumstances.
Because scoring the captures is position dependent, it seems that
the moves returned by MovePicker are different when the c'tor is
called before has_mate_threat().
Of course this is only a workaround, because the real bug is still
hidden :-(
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
If we don't have TT moves to search, skip the
useless loop associated with the TT_MOVES phase.
Another 1% speed boost that brings this series
to +6.2% against the original revision 595a90df.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This not only cleans up the code but gives another
speed boost of 1.8%.
Since revision 595a90dfd0 we have increased pgo compiled binary
speed by a whopping +5.2% without any functional change!!
This is really awesome considering that we have also
cut the line count by 25 lines.
Sometimes we spend days getting an extra 1% from move
generation, while instead the biggest optimizations come
from anonymous and apparently dull parts of the code.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Does not seem to improve on the standard, latest results
from Joona after 2040 games are negative:
Orig - Mod: 454 - 424 - 1162
And it is more or less the same as what I got a few days ago.
So revert for now.
Verified same functionality as 595a90dfd.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Use the same way of looping along the move list as used for
the other move kinds, so as to be consistent in get_next_move().
And a bit of the usual clean-up too, but just a bit.
It is even a bit (+0.3%) faster now. ;-)
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Always after the TT move but before the captures.
This seems a better setup than the version before this
patch.
After 999 games at 1+0
Mod - Orig +252 =527 -220 +11 ELO
Unfortunately it does not seem to improve on the standard
version, with null move outside of MovePicker (595a90df) and
the latest speed-up patches added in.
After 999 games at 1+0
Mod - Standard +244 =506 -249 -2 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This avoids calculating the array entry position
at each access and gives another boost of almost 1%.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In MovePicker we get the next move with pick_move_from_list(),
then check whether the return value is equal to MOVE_NONE, and
in that case we update the state to the new phase.
This patch reorders the flow so that now from pick_move_from_list(),
renamed get_next_move(), we directly call go_next_phase() to
generate and sort the next bunch of moves when there are no more
moves to try. This avoids always checking the value returned by
pick_move_from_list(), and the flow is more linear and natural.
Also use a local variable instead of a pointer dereference in a
time critical switch statement in get_next_move().
With this patch alone we have an incredible speed-up of 3.2%!
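A skeleton of the reordered flow with assumed names (the real phase logic is omitted): the refill happens inside get_next_move(), so the caller never has to test for MOVE_NONE between phases.
struct MoveStack { int move; int score; };

struct MovePickerSketch {
    MoveStack  moves[256];
    MoveStack* curMove;
    MoveStack* lastMove;
    bool       noMorePhases;

    MovePickerSketch() : curMove(moves), lastMove(moves), noMorePhases(false) {}

    // Placeholder: the real code generates and sorts the next bunch of moves,
    // or flags that all phases are exhausted.
    void go_next_phase() { noMorePhases = true; }

    int get_next_move() {
        while (curMove == lastMove && !noMorePhases)
            go_next_phase();                      // refill instead of returning MOVE_NONE
        return curMove == lastMove ? 0 /* MOVE_NONE */ : (curMove++)->move;
    }
};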
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Try good captures before null move when depth < 3 * OnePly.
Use this kind of null move also in Depth == OnePly.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Original patch from Joona with added optimizations
by me.
Great cleanup of MovePicker with a speed improvement of 1%.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Explicitly write the conditions for a pawn push to the 7th
rank and for a passed pawn instead of wrapping them in
redundant helpers.
Also retire the now unused move_is_pawn_push_to_7th()
and the never used move_was_passed_pawn_push() and
move_is_deep_pawn_push().
Function extension() is so time critical that this
simple patch speeds up the pgo compile by 0.5%, and
it is also clearer what actually happens there.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Remove the 'b' uint32_t local variable.
Optimized assembly is more or less the same
(one 'mov' instruction less), but now it is
written in a way more similar to the final assembly
flow, so it should be easier for the compiler to optimize.
Also guarantee that BitTable[] is always aligned.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Verified correct against tuning branch.
After 999 games at 1+0
Mod vs Orig +257 =510 -232 51.20% +9 ELO
Very small increase but an increase anyway !
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The following new targets were added:
* osx-icc32: 32-bit x86 compiled with icpc.
* osx-icc64: 64-bit x86 compiled with icpc.
* osx-icc32-profile: 32-bit x86 compiled with icpc and pgo.
* osx-icc64-profile: 64-bit x86 compiled with icpc and pgo.
There were two asserts.
The first was raised because is_ok() was called at the
beginning of do_castle_move() and this is wrong after
the last code reformatting because at that point the state
is already modified by the caller do_move().
The second, raised by debugIncrementalEval, was due to a
rounding error in compute_value() that occurs because
TempoValueEndgame was updated to an odd number by patch
"Merge Joona Kiiski evaluation tweaks" (3ed603cd) of 13/3/2009.
This line in compute_value() is the guilty one:
result += (side_to_move() == WHITE)? TempoValue / 2 : -TempoValue / 2;
The fix is to increment TempoValueEndgame so that it is even.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This patch seems bigger than it actually is.
It just moves some code around and adds a bit of coding style fixes
to do_move() and undo_move(), so as to have uniform naming in both
functions.
The diffstat for the whole patch series is
239 insertions(+), 426 deletions(-)
And final MSVC pgo build is even a bit faster:
Before 448.051 nodes/sec
After 453.810 nodes/sec (+1.3%)
No functional change (tested on more than 100M nodes).
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Integrate undo_ep_move in undo_move(); this reduces the line count
and improves code readability.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Integrate do_ep_move in undo_move(); this reduces the line count
and improves code readability.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Integrate do_promotion_move() in do_move(); this reduces the line count
and improves code readability.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Integrate do_ep_move in do_move(); this reduces the line count
and improves code readability.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In the MovePicker c'tor we access, during initialization, one of
the MainSearchPhaseIndex..QsearchWithoutChecksPhaseIndex globals.
Postpone the definition of PhaseTable[] to just after them, so that
when PhaseTable[] is accessed later in get_next_move()
it is already present in L1/L2.
It works like an implicit prefetch of PhaseTable[].
Also shrink PhaseTable[] to 16 bytes, so it fits in an L1 cache line,
by using uint8_t instead of int.
This apparently innocuous patch gives an astonishing speed-up
of 1.6% under MSVC 2010 beta, pgo optimized!
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It was due to a missing -msse compiler option !
Without this option the CPU silently discards
prefetcht2 instructions during execution.
Also added a (gcc documented) hack to prevent the Intel
compiler from optimizing away the prefetches.
Special thanks to Heinz for testing and suggesting
improvements, and to Jim for testing icc on Windows.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
But this time with the guarantee of an always aligned
access so that prefetching is not adversely impacted.
On Joona PC
1+0, 64Mb hash:
Orig - Mod: 174 - 237 - 359
Instead after 1000 games at 1+0 with 128MB hash size
we are at + 1 ELO (just 4 games of difference).
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
After fixing the cpu frequency with the RightMark tool I was
able to speed-test all the different prefetch combinations.
Here are the results:
OS Windows Vista 32bit, MSVC compile
CPU Intel Core 2 Duo T5220 1.55 GHz
bench on depth 12, 1 thread, 26552844 nodes searched
results in nodes/sec
no-prefetch
402486, 402005, 402767, 401439, 403060
single prefetch (aligned 64)
410145, 409159, 408078, 410443, 409652
double prefetch (aligned 64) 0+32
414739, 411238, 413937, 414641, 413834
double prefetch (aligned 64) 0+64
413537, 414337, 413537, 414842, 414240
And now also some crazy stuff:
single prefetch (aligned 128)
410145, 407395, 406230, 410050, 409949
double prefetch (aligned 64) 0+0
409753, 410044, 409456
single prefetch (aligned 64) +32
408379, 408272, 406809
single prefetch (aligned 64) +64
408279, 409059, 407395
So it seems the best is a double prefetch at the address +32 or +64;
I will choose the second one because it seems more natural to me.
It is still a mystery why it doesn't work under Linux :-(
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Always prefetch from a cache line boundary. It seems
that if the prefetch address is not cache line aligned then
performance is adversely impacted.
Hopefully we will reuse those 32 bits of padding for something
useful in the future.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This fixes a compile error under Linux with gcc when
the Intel dev libraries are not present.
Also simplify the previous patch by moving the TT definition
from search.cpp to tt.cpp, so as to avoid passing a
pointer to the TT to the current position.
Finally simplify do_move(): now we miss a prefetch in the
rare case of setting an en-passant square, but the code is
much cleaner and the performance penalty is almost zero.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Move the prefetching code inside do_move(), so as to allow
very early prefetching and to put as many instructions
as possible between the prefetch and the following retrieve().
With this patch retrieve() times are cut by another 25%.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
TT.retrieve() is the most time consuming function
because it almost always involves a very slow RAM access.
The TT table is so big that it is never cached. This patch
prefetches the TT data just after a move is done, so that
the subsequent TT.retrieve() will be very fast.
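A sketch of the prefetch helper (assumed name): as soon as the new position key is known inside do_move(), the TT cache line it maps to is touched so the later probe hits cache instead of RAM.
#if defined(_MSC_VER)
#include <xmmintrin.h>
#endif

inline void prefetch(const void* addr) {
#if defined(_MSC_VER)
    _mm_prefetch(static_cast<const char*>(addr), _MM_HINT_T0);
#else
    __builtin_prefetch(addr);   // GCC / Clang / Intel
#endif
}

// do_move() would call prefetch(TT.first_entry(newKey)) right after computing
// the new position key (first_entry() here is an assumed helper name).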
Profiling with VTune shows that TT.retrieve() times are
cut almost in half!
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Shrink the key to 32 bits instead of 64. To still avoid
collisions, use the high 32 bits of the position key as the
TT key and the low 32 bits to retrieve the correct
cluster index in the table.
With this patch the size of TTEntry shrinks to 96 bits instead
of 128, and the cluster of 4 TTEntry sums to 48 bytes instead
of 64.
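A minimal sketch of the key split (field and helper names are assumptions): the low 32 bits select the cluster, the high 32 bits are stored in the entry and compared on probe.
#include <cstddef>
#include <cstdint>

struct TTEntry { uint32_t key32; /* move, value, depth, ... */ };

size_t cluster_index(uint64_t posKey, size_t numClusters) {
    return static_cast<size_t>(static_cast<uint32_t>(posKey)) % numClusters;
}

bool entry_matches(const TTEntry& e, uint64_t posKey) {
    return e.key32 == static_cast<uint32_t>(posKey >> 32);
}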
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Binaries are always built with the symbol table in, to ease
debugging and profiling.
It is now possible to run:
make strip
to remove the symbol table from the compiled binary. This
could be useful to prepare the release version.
Patch by Heinz van Saanen.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The current formula enables LMR when
i + MultiPV >= LMRPVMoves
It means that, for instance, if MultiPV == 1 then LMR
starts being considered at move i = LMRPVMoves - 1,
while if MultiPV == 3 it starts earlier,
at move i = LMRPVMoves - 3.
With this patch the formula becomes
i >= MultiPV + LMRPVMoves - 2
So that LMR will always start after LMRPVMoves - 1 moves
from the last PV move.
No functional change when MultiPV == 1
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Access pos.piece_count() only once and avoid some
branches in the inner loop.
Profiling with VTune shows a 20% speed improvement in
get_material_info(), and it is also a bit more cleaned
up this way ;-)
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
And put it in an already existing one so as to
optimize a bit.
Also additional clean-ups and code shuffling
all around the place.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
And also build the symbol table. It can easily be stripped
after the .exe is done, and it is necessary for profiling.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
After Joona's direct testing with ~2000 games it seems
the values after 100.000 games do not give any advantage,
so revert for now.
Score of Stockfish_0 vs Stockfish_15: 491 - 392 - 1102
Score of Stockfish_0 vs Stockfish_40: 461 - 439 - 1076
Score of Stockfish_0 vs Stockfish_65: 442 - 518 - 1018 (13 elo)
Score of Stockfish_0 vs Stockfish_100: 504 - 502 - 984
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Currently the minimum split depth is set automatically to 6
when the number of CPUs is more than 4. I believe this is a bad
idea since, for example, my quad (4 CPUs with hyperthreading) is
detected as an 8 CPU computer. I've manually lowered the number
of Threads, but so far I have played all games with Minimum
Split Depth set to 6!
Since 4 CPU computers with hyperthreading are quite common and
8 CPU computers extremely rare (I expect we can get a direct
jump to 16 or 32 cores), this automatic adjusting is likely
to do more harm than good. Add a note in Readme.txt, so that
those rare 8 CPU owners can manually tweak the "Minimum Split
Depth" parameter.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
previously used move_is_legal to verify that the move from the TT
was legal, and the old version of move_is_legal only works when
the side to move is not in check. Fixed this by adding a separate,
slower version of move_is_legal which works even when the side to
move is in check.
down the transposition table.
When the search was stopped before a fail high at the root was
resolved, Stockfish would often print a very short PV, sometimes
consisting of just a single move. This was not only a little
user-unfriendly, but also harmed the strength a little in
ponder-on games: Single-move PVs mean that there is no ponder
move to search.
It is perhaps worth considering to remove the pv[][] array
entirely, and always build the entire PV from the transposition
table. This would simplify the source code somewhat and probably
make the program infinitesimally faster, at the expense of
sometimes getting shorter PVs or PVs with rubbish moves near
the end.
Added the UCI_LimitStrength and the UCI_Elo options, with an Elo
range of 2100-2900. When UCI_LimitStrength is enabled, the number
of threads is set to 1, and the search speed is slowed down according
to the chosen Elo level.
Todo:
1. Implement Elo levels below 2100 by blundering on purpose and/or
crippling the evaluation.
2. Automatically calibrate the maximum Elo by measuring the CPU speed
during program initialization, perhaps by doing some bitboard
computations and measuring the time taken.
No functional change when UCI_LimitStrength is false (the default).
It is like a never finished painting. Everyday a little touch
more.
But this time it is very little ;-)
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Small micro-optimization in this very
time critical function.
Use a bitwise 'or' instead of a logical 'or' to avoid branches
in the assembly, and use the result to skip a handful of checks.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Verified it is equivalent to the tuning branch results
with parameter values sampled after 100.000 games.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Some unwanted changes to the Makefile slipped in with patch
"Introduced the UCI_AnalyseMode option".
Revert them. No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This is more similar to how get_material_info() and
get_pawn_info() work and also removes some clutter from
evaluate_king().
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
When ordering moves we push all captures with negative SEE values
to the badCaptures[] array during the scoring phase.
This patch delays the costly SEE call until the move is actually
picked in pick_move_from_list(); this way we save some SEE calls
in case we get a cutoff.
It seems we have a speed gain of about 1-1.5 % in terms of nodes/sec
and profiling seems to confirm the small but real speed increase.
Idea from Pablo Vazquez on talkchess.com
http://www.talkchess.com/forum/viewtopic.php?t=29018&start=20
It would be a non-functional change, but actually it is not,
because now the set being sorted is different, and std::sort(),
which is not a stable sort, does not guarantee that the order of
equally scored moves remains the same as before.
After 952 games at 1+0 we are below the error bar, almost equal,
just 6 games of difference (+2 ELO).
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Do not call count_1s_max_15() if not necessary, as it is
not in the common case (>95%).
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Use a polynomial weighted evaluation to calculate the
material value.
This is far more flexible and elegant than applying
a series of single heuristic rules as before.
Also correct a design issue in which we returned two
values, one for the middle game and one for the endgame, while
instead, because the game phase is a function of the board
material itself, only one value should be calculated and
used for both middle game and endgame.
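A minimal sketch of a quadratic (second order polynomial) material term, with made-up coefficients: every pair of piece counts contributes through one coefficient table, so interactions such as the bishop pair fall out of the same formula.
int material_value(const int pieceCount[6], const int Q[6][6]) {
    int value = 0;
    for (int i = 0; i < 6; ++i)
        for (int j = 0; j <= i; ++j)
            value += pieceCount[i] * pieceCount[j] * Q[i][j];
    return value;   // a single value, used for both middle game and endgame
}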
Verified it is equivalent to the tuning branch results with
parameter values sampled after 40.000 games.
After 999 games at 1+0
Mod vs Orig +277 =482 -240 51.85% 518.0/999 +13 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
To use the same naming rule as the other types and
to be compatible with inttypes.h, used under Linux.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We do not accept mate values returned by a null search,
but always do a full search in those cases.
So the variable mateThreat, which is set only if the null move
search returns a mate value, is always false.
Restore the functionality of mateThreat by moving the
assignment to where it can actually be triggered.
After 999 games at 1+0
Mod vs Orig +253 =517 -229 51.20% +8 ELO
Bug reported by xiaozhi
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Tord says that using a lower horizon at PV nodes
looks strange and inconsistent with the general
philosophy of our search (i.e. always being more
conservative at PV nodes). So set LMR at 3 also
on search_pv().
Test result after 601 games seems to confirm this.
Mod vs Orig +156 =318 -127 52.41% 315.0/601 +17 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Test extension of LMR horizon to 3 plies alone, without
touching null move search. To keep the patch minimal we still
don't change LMR horizon in PV search. This will be the object
of the next patch.
Result seems good after 998 games:
Mod vs Orig +252/=518/-228 51.20% 511.0/998 +8 ELO
So dynamic null move reduction seems a bit stronger than
fixed reduction, even with the LMR horizon set to 3.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Revert to an LMR horizon of 2 plies. Only if the parent move
is a null move increase it to 3, so as to avoid the bad combination
of null move reduction + LMR reduction. This is a more
aggressive patch than the previous one, but it seems we are
going in the wrong direction.
After 531 games the result is not good:
Mod vs Orig +123/=265/-143 48.12% 255.5/531 -13 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Set the null move reduction to R=4, but increase the LMR horizon
to 3 plies. The two tweaks are related and should compensate for
the combined effect of null move + LMR reduction at shallow
depths.
Idea from Tord.
After 999 games at 1+0
Mod vs Orig +251 =522 -225 51.30% + 9 ELO
On Tord iMac Core 2 Duo 2.8 GHz, one thread,
Mac OS X 10.6, at 1+0 time control we have:
Mod vs Orig 994-1006 -1.4 ELO
But Orig version is pgo compiled and Mod is not.
The PGO compiled version is about 8% faster, which
corresponds to about 7 Elo points. This means that
results are reasonably consistent.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Code that compiles cleanly under MSVC triggers one
compile error (correct) under Intel C++ and two(!)
under gcc.
The first is the same one complained about by Intel, but the second
is an interesting corner case of the C++ standard (there are many)
that is correctly spotted only by gcc.
Both MSVC and Intel pass it silently, probably to avoid
breaking people's code.
Now we are fully C++ compliant ;-)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This avoids duplicating storage allocation for every file
where they are used.
Note that simple numeric constants can remain in the header because
they are automatically folded by the compiler.
Patch suggested by Tord.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Push the templatization even further to chip away some code,
and take the opportunity to show a neat template trick ;-)
OK, I would say we can stop here now... it is quickly becoming
a style exercise, but we are not Boost developers, so let's give it a stop.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
a static eval cached in the transposition table would always equal the static
eval of the current position. This is in general not true, because the cached
value could be from a previous search with different evaluation parameter
settings, or from a search from the opposite side (Stockfish's evaluation
function is asymmetric by default).
We really don't need to have global endgame functions. We can
allocate them on the heap at initialization time and store the
corresponding pointers directly in the function maps. To avoid
leaks we just need to remember to deallocate them in the map d'tor.
These functions are always created in pairs, one for each color,
so remove a lot of redundant hard-coded info and just use the minimum
required: the type and the corresponding name string.
This greatly simplifies the code and is also less error prone:
now it is much simpler to add a new specialized endgame function, just
add the corresponding enum in endgame.h and the obvious add_xx()
call in the EndgameFunctions c'tor, and of course, the most important part,
the EvaluationFunction<xxx>::apply() specialization in endgame.cpp.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This feature makes sense during development, but
it doesn't seem to make sense for normal users.
Also fix a possible race where the GUI adjudicates
the game a fraction of a second before the engine sets
the looseOnTime flag, so that the engine would bogusly wait until
it runs out of time at the beginning of the next game.
The fix is to always reset looseOnTime at the beginning
of a new game.
Race condition spotted by Tord.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This is another move serialization macro, but this time
focused on pawn moves where the 'from' square is given as
a delta from the 'to' square.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is very rare that we have pawns on the 7th (2nd) rank, so we
can skip the promotion handling stuff in most cases.
With this patch pawn move generation is almost 20% faster.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Most of the time we are interested only in the sign of SEE,
namely whether a capture is negative or not.
If the capturing piece is smaller than the captured one we
already know SEE cannot be negative, and this information
is enough most of the time. And of course it is much
faster to detect than a full SEE.
Note that in case see_sign() is negative, the returned
value is exactly the see() value; this is very important,
especially for ordering capturing moves.
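A sketch of the shortcut (piece values and the see() stub are illustrative): when the capturing piece is worth no more than the captured one, the exchange cannot lose material and the full SEE can be skipped.
enum PieceType { NO_PIECE_TYPE, PAWN, KNIGHT, BISHOP, ROOK, QUEEN, KING };

const int SeeValue[] = { 0, 100, 325, 325, 500, 975, 10000 };

// Placeholder for the full static exchange evaluation.
int see(PieceType capturing, PieceType captured) {
    return SeeValue[captured] - SeeValue[capturing];
}

int see_sign(PieceType capturing, PieceType captured) {
    if (SeeValue[capturing] <= SeeValue[captured])
        return 1;                          // sign only: known to be non-negative
    return see(capturing, captured);       // exact value, needed when it can be negative
}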
With this patch the calls to the costly see() are reduced
by almost 30%.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Some variables were global due to some old and now removed code,
but they can now be moved to local scope.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The verification test gives a useless result.
After 999 games at 1+0
Mod vs Orig +250 =503 -246 50.20% +1 ELO
So we are well below our radar level. Nevertheless
there are 100.000 games on the Joona QUAD that we could
take into account, and they show that this tweak perhaps
has something good in it, although very little.
The verification test shows it should not be a regression, at
least not a big one even in the worst case, so apply the
change anyway and keep fingers crossed ;-)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It turned out that the input sent to set_option_value() when it is called by
set_option() in uci.cpp always started with at least one whitespace. In most
cases, this is not a problem, because the majority of UCI options have numeric
values. It did, however, cause a problem for UCI options with non-numerical
values, like options of type CHECK and COMBO. In particular, changing the
value of an option of type CHECK didn't work, because the comparisons with
"true" and "false" would always return false. This means that the "Ponder"
and "UCI_Chess960" options haven't been working for a while.
Unfortunately this tweak does not give good results.
After 894 games at 1+0 we have:
Mod vs Orig +205/-236/=453 48.27% -12 ELO !!
Perhaps we should test again, but in the meantime
we are going to revert this.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
A promotion move is not considered a possible evasion, though it could be one.
Bug introduced by patch
"Convert also generate_pawn_blocking_evasions() to new API" (7/5/2009).
Bug spotted by Kenny Dail.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Set gcc as default compiler on Linux, also compile
with symbols stripped to shrink binary file.
Original patch by Heinz van Saanen.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Predefined macro __INTEL_COMPILER is defined only for Intel,
while _MSC_VER is defined for both Intel C++ and MSVC.
So rearrange the ifdefs to take this into account: test __INTEL_COMPILER
first, and only if it is not defined check _MSC_VER for MSVC.
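In preprocessor terms the ordering looks like this (a sketch):
#if defined(__INTEL_COMPILER)
    // Intel C++ specific code: must be tested first, because on Windows
    // the Intel compiler also defines _MSC_VER.
#elif defined(_MSC_VER)
    // MSVC specific code
#else
    // other compilers (gcc, ...)
#endif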
Patch suggested by Joona.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Add a new argument to bench to specify the name of the
file where timing information will be saved for each
benchmark session.
This argument is optional; if it is not specified the file will
not be created.
Original patch by Heinz van Saanen
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This is mainly intended to allow 64 bit compiles on any
system and to avoid crashing when a binary, compiled on a
box where POPCNT is not supported, is run on a Core i7
system or a similar CPU.
What could happen is that when compiling on a standard 64 bit
system, because the correct headers for the POPCNT intrinsic
are not found, the compiler creates dummy bit count functions
instead; these are never called at runtime on the machine where
Stockfish has been compiled. But if we run the same binary on a
Core i7 system, because POPCNT is detected at run time, the dummy
bit count functions will be called, giving false results that will
crash the application.
Note that it would be possible to fall back on the software bit count in
these cases, but this is even more subtle, because the POPCNT path is not
optimized, so we would have an application that works but at sub-optimal
speed. Better to crash, so at least the user is loudly warned that there
is something wrong.
If, instead, Stockfish is compiled on a Core i7 system with POPCNT
enabled, then, provided the PGO compile has been done properly, the same
binary will run at optimal speed _both_ on the Core i7 machine and on any
other standard 64 bit machine. This is the ideal mode for binary distribution.
Finally, this patch disables bsfq support under Windows, because it seems
inline assembly is not supported by either MSVC or the Intel Windows version.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Also rename DISABLE_POPCNT_SUPPORT to NO_POPCNT and simplify the
macro logic a bit.
Always define a __popcnt64() or _mm_popcnt_u64() template; if the proper
function with the same name is defined in the intrinsics header, it
will be chosen first, otherwise we fall back on the dummy template,
which is never called at runtime anyway because cpu_has_popcnt() returns
false.
This fixes the compile error reported by Jim.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Avoid the indirect call of piece_of_color_and_type(c, PAWN) and its
alias pawns(c) in the pawn evaluation loop; instead use the pawn
bitboards, accessed only once before entering the loop.
Also explicitly mark functions as static to better self-document.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It was erroneously changed along with the 32 bit one in the recent
patch "Retire USE_COMPACT_ROOK_ATTACKS...".
Also another clean-up of define magics: move compiler
specific definitions into types.h and remove redundant cruft.
Now this ugly macro mess seems more reasonable.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
On 64 bit systems we can use the bsfq instruction to find
the least significant set bit of a bitboard.
This is a patch for the GCC and Intel compilers to take advantage
of that and get a 2% speed-up.
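A sketch of a bsfq based pop_1st_bit() for 64 bit GCC-style compilers (the helper name matches the one mentioned earlier in this log; the asm form is illustrative):
#include <cstdint>

typedef uint64_t Bitboard;

inline int pop_1st_bit(Bitboard* b) {
    unsigned long long index;
    __asm__("bsfq %1, %0" : "=r" (index) : "rm" (*b));
    *b &= *b - 1;                 // clear the bit we just found
    return static_cast<int>(index);
}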
Original patch from Heinz van Saanen, adapted to current tree
by me.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This greatly simplifies bitboard.cpp, which now has only two setups,
for 32 and 64 bit CPUs respectively, according to the IS_64BIT define
that is set automatically but can be tweaked manually in
bitboard.h.
No functional change in either 32 or 64 bits.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Testing on Joona QUAD failed to give any
advantage. Actually we had a little loss:
Mod - Orig: 342.0 - 374.0
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This is the backport of the tuned piece values.
We also needed to change the psqt tables so that their
values, which are relative to the piece values, remain the same.
Almost no change after 999 games:
Mod vs Orig 594-495, +2 ELO points, so well within the error bar.
It was somewhat expected given the very small change in the
parameter values.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Instead of adding and subtracting the psqt values corresponding to
the move's starting and destination squares, do it in one
go with the helper function pst_delta<>().
This simplifies the code and also better documents that what
we need is a delta value, not an absolute one.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Rename it to move_is_promotion() to be clearer; also add
a new function move_promotion_piece() to get the
promotion piece type in the few places where it is needed.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Under MS Visual C++ the debug window always closes unconditionally
when the program exits; this is bad because we want to read the results first.
So limit this kludge to Windows only.
Original patch by Heinz van Saanen.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Micro-optimization in do_move(): a quick check
lets us avoid updating the castle rights in almost 90%
of cases.
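One way such a shortcut can be implemented (an assumption, not necessarily the patch's exact form): a per-square mask of the castle rights a square can affect, so most moves skip the update entirely.
typedef int CastleRights;

CastleRights castleRightsMask[64];   // filled once at startup

inline bool may_change_castle_rights(int from, int to) {
    return (castleRightsMask[from] | castleRightsMask[to]) != 0;
}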
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
When we are hunting for a mate, the transposition table is filled
with mate scores. The current implementation of aspiration search
can't cope with this very well.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The problem is that npMaterial is compared to the _endgame_ value
of a rook, although npMaterial is always (also in the endgame!)
calculated using _middlegame_ values.
The bug was hidden as long as the rook's middlegame
and endgame values were the same.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
These are the tuned values of mobility and outposts
after 100.000 games on Joona QUAD.
After 999 games at 1+0
Mod vs Orig +248 =537 -214 51.70% +12 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
When the SEE piece values changed in aaad48464b
of 9/12/2008 we forgot to update the value assigned in
the case of a captured king.
In that patch we changed the SEE piece values without
proper testing. It is probably a good idea to run some
tests with the old Glaurung values.
Bug spotted by Joona.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Move the TT object away from the heavily write-accessed NodesSincePoll
and also, inside TT, isolate the heavily accessed 'writes' variable.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We can have false positives, but these are filtered out
anyhow by the following conditions, so they are harmless.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This was needed by an old optimization in the sorting of
non-captures that is now obsoleted by the new std::sort()
approach.
Also remove the unused depth member data. Interestingly,
this has been unused since the Glaurung days.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
These are the tuned psqt values after 100.000 games
on Joona QUAD. Results seem very good.
On PC 1 after 999 games
Mod vs Orig +261 =511 -227 51.70 % +12 ELO
On PC 2 after 913 games
Mod vs Orig +254 =448 -211 52.35 % +16 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Creating a History object requires clearing the History tables;
although fast, this is a useless job in san.cpp, where History is used
just as a dummy argument for the MovePicker c'tor.
So use a file-scoped constant instead of creating a new History()
object each time the MovePicker c'tor is called, as in move_ambiguity().
This optimizes pretty_pv() through the following calling chain:
pretty_pv() -> line_to_san() -> move_to_san() -> move_ambiguity()
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This also allows us to better track what is already
optimized and what still needs optimization.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
King evaluation is special in any case and as an added
benefit we can use the HasPopCnt optimization also for king.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This is an old patch; it was part of a series, but it is
also good on its own as a cleanup.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Also move NodesSincePoll away from the cache line
of other heavily read-accessed variables.
Fortunately we no longer have write access contention,
but there is still read access contention in some cases.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
They are always true anyway and are heavily used file scope
variables where there could be SMP contention, although read only.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This reduces contention and reduces the history trashing
effect, especially at high search depths.
No functional change for the single CPU case.
After 999 games at 1+0 on Dual Core Intel we have
Mod vs Orig +233 =526 -240 -2 ELO
We need to test at longer time controls and possibly on
a QUAD, where we could foresee an improvement.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This is a first step for future patches and in
any case seems a nice thing to do.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In Position we store a pointer to a StateInfo record
kept outside of the Position object.
When copying a position we also copy that pointer, so
after the copy we have two Position objects pointing
to the same StateInfo record. This can be dangerous,
so fix it by also copying the StateInfo record into
the new Position object and letting the new st pointer
point to it. This completely detaches the copied
Position from the original one.
Also rename setStartState() to saveState() and clean up
the API to state more clearly what the function does.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
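A minimal sketch of the detach-on-copy idea with simplified stand-in types (the real Position and StateInfo have many more fields; this is an illustration under those assumptions, not the actual patch):

#include <cstdint>

struct StateInfo {
    uint64_t key;
    int rule50;
};

class Position {
public:
    Position() : startState(), st(&startState) {}

    // Before the fix, copying duplicated only the st pointer and both
    // positions shared one external StateInfo. Now the record is copied
    // into the new object and st re-pointed to it, fully detaching the copy.
    Position(const Position& other) : startState(*other.st), st(&startState) {}

    // Renamed from setStartState(): copy an external state record inside
    // the object and make st point to it.
    void saveState(const StateInfo& si) {
        startState = si;
        st = &startState;
    }

    const StateInfo* state() const { return st; }

private:
    StateInfo startState;   // storage owned by this Position
    StateInfo* st;          // points to the current state record
};

int main() {
    Position p1;
    Position p2 = p1;                // p2 no longer aliases p1's StateInfo
    return p2.state() == p1.state(); // 0: they point to different records
}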
Do not penalize when there are enemy non-pawn pieces in our
advancing pawn's path, especially if they can be attacked by us.
The patch is mine, but the original idea, and also the fix of a
first, wrong version of the patch, is from Eelco de Groot.
Tests with Joona's framework seem to confirm the patch is good:
Results for patch 'disabled' based on 5776 games: Win percentage:
41.309 (+- 0.526) [+- 1.053]
Results for patch 'enabled' based on 6400 games: Win percentage:
42.422 (+- 0.500) [+- 1.000]
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Tests on Joona's luxury iCore7 QUAD show that the speed increase
over the standard 64-bit routine is between 3% and 4%,
so it seems a good thing to have. Also, the startup feedback
about the compile and the hardware detection can be a useful
debug tool.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In MovePicker, consider killer moves as a separate
phase from non-capture picking.
Note that this change guarantees that killer1 is always
tried before killer2. Until now, because the scoring difference
between the two moves was just 1 point, if the psqt tables gave
killer1 a lower value than killer2, the latter was tried first.
After 999 games at 1+0 we have
Mod vs Orig: +245 =527 -227 +6 ELO
Not a lot, but the patch is worth having anyway.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
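A rough sketch of the implied phase sequence (illustrative only, not the actual Stockfish phase table): killers get their own phase, so killer1 is always tried before killer2, independently of any non-capture scoring.

enum MovegenPhase {
    PH_TT_MOVE,        // transposition table move
    PH_GOOD_CAPTURES,  // winning and equal captures
    PH_KILLERS,        // killer1 first, then killer2, in a fixed order
    PH_NONCAPTURES,    // remaining quiet moves, sorted by history/psqt
    PH_BAD_CAPTURES,
    PH_STOP
};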
It shows almost no improvement and adds a good deal of
complexity, so remove it for now. No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We don't need different names for a function and a template;
the compiler will know which one to use.
This lets us restore the original count_1s_xx() names instead of
sw_count_1s_xxx and simplifies the code a bit.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
With this patch, at application startup a line is printed
with info about the use of optimized 64-bit routines and
hardware POPCNT.
Also allow disabling POPCNT support during PGO compiles,
to exercise the software-only fallback path.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Detect POPCNT instruction support at runtime and use it.
If POPCNT is not supported we add no overhead at all, so
we lose no speed in the standard case.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
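A minimal sketch of runtime POPCNT detection via CPUID, assuming GCC/Clang on x86; the function names and software fallback are illustrative and not the actual Stockfish code:

#include <cstdint>
#include <iostream>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#  include <cpuid.h>
#endif

// Returns true if the CPU advertises the POPCNT instruction
// (CPUID leaf 1, ECX bit 23). On non-x86 builds we report false.
static bool hasHardwarePopcnt() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    unsigned eax, ebx, ecx, edx;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return (ecx & (1u << 23)) != 0;
#endif
    return false;
}

// Software fallback used when POPCNT is not available
static int softwareCount1s(uint64_t b) {
    int n = 0;
    for (; b; b &= b - 1)
        n++;
    return n;
}

int main() {
    std::cout << "Hardware POPCNT: " << (hasHardwarePopcnt() ? "yes" : "no") << std::endl;
    std::cout << softwareCount1s(0xF0F0ULL) << std::endl; // prints 8
}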
Unfortunately, due to Chess960 compatibility we cannot
extend this to castling, where the destination squares
are not guaranteed to be empty.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Avoid a clear_bit() + set_bit() sequence and update the bitboards
with a single xor instruction.
This is faster and simplifies the code.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
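A minimal, self-contained sketch of the single-xor idea (the helper name and square numbering are illustrative):

#include <cstdint>

typedef uint64_t Bitboard;

// Instead of clear_bit(&b, from); set_bit(&b, to); update both squares
// with a single xor of a from/to mask.
inline void moveBit(Bitboard& b, int from, int to) {
    Bitboard move_bb = (1ULL << from) | (1ULL << to);
    b ^= move_bb; // clears 'from' and sets 'to' in one operation
}

int main() {
    Bitboard occupied = 1ULL << 12;   // piece on e2 (square 12)
    moveBit(occupied, 12, 28);        // e2 -> e4
    return occupied == (1ULL << 28) ? 0 : 1;
}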
This gives more weight to newer entries.
After 999 games at 1'+ 0" we have:
Mod vs Orig +233/-208/=558 51.25% +9 ELO
Confirmed by another session of 437 games:
Mod vs Orig +109/-92/=236 51.95% +14 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Use the plain array lookup in the only place where it
is used. This removes an unnecessary indirection and better
clarifies what the code does.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This is the same change we have already done in search.cpp,
this time for evaluation.cpp.
If a variable will be populated by reading a UCI option,
do not hard-code its default value.
This avoids being misled when reading the sources.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This should reduce concurrent accesses in the SMP case.
Suggestion by Tord Romstad.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Both by adding a comment and by changing the API to
reflect that only the destination square and the moved piece
matter for history.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Aspiration window search must be disabled in the
multi-PV case.
We missed one point where the aspiration window should
be disabled in this case.
Patch from Joona, with a little added edit by me.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Use a better condition to find candidate direct-check pawns.
In particular, consider only pawns on the ranks in front of the
enemy king; this greatly reduces the candidate pawns bitboard,
which is now empty more than 90% of the time, so we can skip
further tests early.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Be more strict: it is not enough that the dc bitboard is non-empty,
it must also include at least one pawn.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We can parametrize on the capture direction too.
Also use a single template parameter and calculate the
remaining parameters (at compile time) directly in the function.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
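A self-contained sketch of the idea: a pawn-capture generator parametrized on the direction alone, with the anti-wrap file mask derived from it at compile time (names, deltas and the square mapping are illustrative, not the actual generator):

#include <cstdint>

typedef uint64_t Bitboard;

const Bitboard FileABB = 0x0101010101010101ULL;
const Bitboard FileHBB = 0x8080808080808080ULL;

enum Delta { DELTA_NE = 9, DELTA_NW = 7 }; // white pawn capture directions

// Single template parameter: the capture direction. The file mask that
// prevents wrap-around is derived from it at compile time.
template<Delta D>
inline Bitboard pawnCaptures(Bitboard pawns, Bitboard targets) {
    const Bitboard mask = (D == DELTA_NE ? ~FileHBB : ~FileABB);
    return ((pawns & mask) << D) & targets;
}

int main() {
    Bitboard pawns   = 1ULL << 12;                   // white pawn on e2
    Bitboard targets = (1ULL << 21) | (1ULL << 19);  // enemy pieces on f3 and d3
    Bitboard caps = pawnCaptures<DELTA_NE>(pawns, targets)
                  | pawnCaptures<DELTA_NW>(pawns, targets);
    return caps == targets ? 0 : 1;
}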
Centralize all global resource management in a single object
and avoid a bunch of scattered exit() calls.
This is more reliable and cleaner, and sticks closer to good
C++ practice.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Move the closing of the file into the Book destructor. This
guarantees we never leave the file open and also simplifies
Book usage.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Let Book derive directly from std::ifstream
and rewrite the functions accordingly.
Also simplify file reading by using operator>>().
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
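A minimal sketch of how the last two Book changes could fit together; the entry layout and the big-endian handling of the real book format are omitted, and the names here are illustrative:

#include <cstdint>
#include <fstream>
#include <string>

// The book *is* an input stream, so opening, seeking and reading need no
// extra wrapper state.
class Book : public std::ifstream {
public:
    void open(const std::string& fileName) {
        std::ifstream::open(fileName.c_str(), std::ios::in | std::ios::binary);
    }

    // Read one raw 64-bit field from the current position
    Book& operator>>(uint64_t& n) {
        read(reinterpret_cast<char*>(&n), sizeof(n));
        return *this;
    }

    ~Book() { if (is_open()) close(); } // closing handled by the destructor
};

int main() {
    Book book;
    book.open("book.bin");   // hypothetical book file name
    uint64_t key;
    if (book && (book >> key))
        return 0;
    return 1;
}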
If we have a correct white pawn move but the pawn is
black (or vice versa), we fail to detect the move as illegal.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Use pieces_of_type() instead of pieces_of_color_and_type()
in a hot loop and cut almost 10% off SEE execution time.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Instead of a position, because the key is all that we
need.
The interface is clearer and also a tiny bit faster.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
// Only declared but not defined. We don't want to multiply two scores due to
// a very high risk of overflow. So user should explicitly convert to integer.
inline Score operator*(Score s1, Score s2);

////
//// Constants and variables
////

@@ -62,16 +106,16 @@ enum Value {
 ///
 /// Values modified by Joona Kiiski

-const Value PawnValueMidgame   = Value(0x0CC);
-const Value PawnValueEndgame   = Value(0x101);
-const Value KnightValueMidgame = Value(0x332);
+const Value PawnValueMidgame   = Value(0x0C6);
+const Value PawnValueEndgame   = Value(0x102);
+const Value KnightValueMidgame = Value(0x331);
 const Value KnightValueEndgame = Value(0x34E);
-const Value BishopValueMidgame = Value(0x345);
-const Value BishopValueEndgame = Value(0x356);
-const Value RookValueMidgame   = Value(0x4F8);
-const Value RookValueEndgame   = Value(0x500);
-const Value QueenValueMidgame  = Value(0x9D5);
-const Value QueenValueEndgame  = Value(0x9FB);
+const Value BishopValueMidgame = Value(0x344);
+const Value BishopValueEndgame = Value(0x359);
+const Value RookValueMidgame   = Value(0x4F6);
+const Value RookValueEndgame   = Value(0x4FE);
+const Value QueenValueMidgame  = Value(0x9D9);
+const Value QueenValueEndgame  = Value(0x9FE);

 const Value PieceValueMidgame[17] = {
   Value(0),
@@ -93,10 +137,9 @@ const Value PieceValueEndgame[17] = {
   Value(0), Value(0), Value(0)
 };

-/// Bonus for having the side to move
-const Value TempoValueMidgame = Value(48);
-const Value TempoValueEndgame = Value(21);
+/// Bonus for having the side to move (modified by Joona Kiiski)
+const Score TempoValue = make_score(48, 22);

////