Fix TT comment and static_assert()

Comment is based on a misunderstanding of what unaligned memory access is. Here
is an article that explains it very clearly:
https://www.kernel.org/doc/Documentation/unaligned-memory-access.txt

No matter how we define TTEntry or TTCluster, there will never be any unaligned
memory access. This is because the compiler knows the alignment rules, and does
the necessary adjustments to make sure unaligned memory access does not occur.
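
As a minimal sketch of the point above: the struct below is hypothetical (it is
not the real TTEntry layout) and only illustrates that the compiler places each
member at an offset that is a multiple of its alignment, inserting padding
where needed.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical struct for illustration only; not the real TTEntry layout.
struct Example {
  std::uint8_t  a;  // 1 byte
  std::uint32_t b;  // the compiler pads before b so that it is naturally aligned
  std::uint16_t c;
};

// Every member sits at an offset that is a multiple of its own alignment.
static_assert(offsetof(Example, b) % alignof(std::uint32_t) == 0, "b is aligned");
static_assert(offsetof(Example, c) % alignof(std::uint16_t) == 0, "c is aligned");

// sizeof is a multiple of the struct's alignment, so every element of an
// Example[] array is aligned as well.
static_assert(sizeof(Example) % alignof(Example) == 0, "array elements stay aligned");
```

The assertions hold whatever the member order, which is why the definition of
TTEntry or Cluster cannot cause an unaligned access by itself.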

The issue being addressed here has nothing to do with unaligned memory access. It
is about cache performance. In order to achieve best cache performance:
- we prefetch the cacheline as soon as possible.
- we ensure that TT clusters do not spread across two cachelines. If they did,
  we would need to prefetch 2 cachelines, which could hurt cache performance.
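
A rough sketch of the second point, assuming a 64-byte cache line. The names
kCacheLineSize and linesTouched are made up for this illustration. It counts
how many cache lines an object touches, which is also how many prefetches
would be needed to bring it in.

```cpp
#include <cstddef>

// Assumed value for illustration; 64 bytes is common on x86-64.
constexpr std::size_t kCacheLineSize = 64;

// Number of cache lines touched by an object of `size` bytes (size > 0) that
// starts `offset` bytes after a cache line aligned base address.
constexpr std::size_t linesTouched(std::size_t offset, std::size_t size) {
  return (offset + size - 1) / kCacheLineSize - offset / kCacheLineSize + 1;
}

static_assert(linesTouched(32, 32) == 1, "second 32-byte cluster stays in line 0");
static_assert(linesTouched(48, 32) == 2, "a cluster at offset 48 straddles two lines");
```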

Therefore the true conditions to achieve this are:
1/ The start address of the TT is cache line aligned.
void TranspositionTable::resize() enforces this.
2/ The TT cluster size should *divide* the cache line size. Currently, we pack 2
clusters per cache line. It used to be 1 before "TT sardines". The ratio does
not matter; all we want is to fit an integer number of clusters per cache
line.
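
The two conditions can be checked with a small sketch, assuming a 64-byte cache
line and a table whose start address is cache line aligned. kLine,
fitsInOneLine and noneCross are hypothetical names, not part of the code base.

```cpp
#include <cstddef>

constexpr std::size_t kLine = 64;  // assumed cache line size

// True if cluster number i, of clusterSize bytes, lies within a single cache
// line, given that the table starts on a cache line boundary.
constexpr bool fitsInOneLine(std::size_t i, std::size_t clusterSize) {
  return (i * clusterSize) / kLine == (i * clusterSize + clusterSize - 1) / kLine;
}

// True if none of the first n clusters crosses a cache line boundary.
constexpr bool noneCross(std::size_t n, std::size_t clusterSize) {
  return n == 0 || (fitsInOneLine(n - 1, clusterSize) && noneCross(n - 1, clusterSize));
}

// 32 divides 64: two clusters per line, none of them crosses a boundary.
static_assert(kLine % 32 == 0 && noneCross(64, 32), "32-byte clusters never cross");

// 24 does not divide 64: cluster 2 covers bytes 48..71 and crosses a boundary.
static_assert(kLine % 24 != 0 && !noneCross(64, 24), "24-byte clusters do cross");
```

This is the reasoning behind testing divisibility in the static_assert rather
than one fixed ratio: any cluster size that divides the cache line size passes.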

No functional change.

Resolves #506
lucasart
2015-11-20 23:23:53 -08:00
committed by Joona Kiiski
parent 93195555ed
commit 328098d027


@@ -76,8 +76,9 @@ private:
 /// A TranspositionTable consists of a power of 2 number of clusters and each
 /// cluster consists of ClusterSize number of TTEntry. Each non-empty entry
 /// contains information of exactly one position. The size of a cluster should
-/// not be bigger than a cache line size. In case it is less, it should be padded
-/// to guarantee always aligned accesses.
+/// divide the size of a cache line size, to ensure that clusters never cross
+/// cache lines. This ensures best cache performance, as the cacheline is
+/// prefetched, as soon as possible.
 class TranspositionTable {
@@ -86,10 +87,10 @@ class TranspositionTable {
   struct Cluster {
     TTEntry entry[ClusterSize];
-    char padding[2]; // Align to the cache line size
+    char padding[2]; // Align to a divisor of the cache line size
   };
-  static_assert(sizeof(Cluster) == CacheLineSize / 2, "Cluster size incorrect");
+  static_assert(CacheLineSize % sizeof(Cluster) == 0, "Cluster size incorrect");
 public:
  ~TranspositionTable() { free(mem); }