
Minimax Search Algorithm

At its core, Gillespie uses the minimax algorithm to search the game tree. The algorithm explores possible future positions by alternating between maximizing White's advantage and minimizing Black's advantage, assuming both players make optimal moves.

Core Concept

1. Generate Moves: From the current position, generate all legal moves
2. Recursive Search: For each move, recursively search the opponent's responses
3. Evaluate Leaves: At maximum depth, evaluate the position using MoE-NNUE
4. Propagate Values: Back-propagate the best values up the tree

MAX vs MIN: At each level, the algorithm alternates between maximizing (White's turn) and minimizing (Black's turn) the evaluation score.
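The four steps above can be condensed into a short sketch. This is a minimal illustration, not Gillespie's actual implementation: `evaluate`, `legal_moves`, and `make_move` are hypothetical stand-ins for the engine's NNUE evaluation, move generator, and move application.

```python
def minimax(position, depth, maximizing, evaluate, legal_moves, make_move):
    """Plain minimax: alternate MAX (White) and MIN (Black) levels,
    evaluating leaves at maximum depth and propagating values up."""
    moves = legal_moves(position)
    if depth == 0 or not moves:
        return evaluate(position)          # leaf: static NNUE evaluation
    children = (minimax(make_move(position, m), depth - 1, not maximizing,
                        evaluate, legal_moves, make_move) for m in moves)
    return max(children) if maximizing else min(children)
```

On any toy game tree this returns the same root value as exhaustive enumeration, which is what makes it a useful correctness baseline for the pruned search described next.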

Search Tree Visualization

                    ROOT (MAX) = +0.5
             /             |             \
        e4 (MIN)       Nf3 (MIN)      d4 (MIN)
          +0.5           +0.2           +0.3
         /    \         /    \         /    \
      +0.7   +0.5    +0.3   +0.2    +0.4   +0.3
                Leaf nodes (NNUE eval)

Nodes searched: ~2.5M nodes/sec
Typical depth: 20-25 ply

Alpha-Beta Pruning

Alpha-beta pruning is an optimization that dramatically reduces the number of nodes searched. It eliminates branches that cannot possibly affect the final decision, achieving the same result as minimax but orders of magnitude faster.

How It Works

Alpha (α): The best value the maximizer can guarantee so far. Initialized to -∞; only increases during search.

Beta (β): The best value the minimizer can guarantee so far. Initialized to +∞; only decreases during search.

Cutoff condition: When β ≤ α, the remaining moves at this node cannot improve the result and are pruned.
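As a sketch, this is the same recursion as minimax with the α/β bounds and the cutoff test added (the helper names are hypothetical stand-ins, as before, not the engine's actual code):

```python
INF = float("inf")

def alphabeta(position, depth, alpha, beta, maximizing,
              evaluate, legal_moves, make_move):
    """Minimax with alpha-beta cutoffs: alpha/beta track what each side can
    already guarantee; when beta <= alpha the remaining siblings are pruned."""
    moves = legal_moves(position)
    if depth == 0 or not moves:
        return evaluate(position)
    value = -INF if maximizing else INF
    for m in moves:
        child = alphabeta(make_move(position, m), depth - 1, alpha, beta,
                          not maximizing, evaluate, legal_moves, make_move)
        if maximizing:
            value = max(value, child)
            alpha = max(alpha, value)      # raise the maximizer's guarantee
        else:
            value = min(value, child)
            beta = min(beta, value)        # lower the minimizer's guarantee
        if beta <= alpha:                  # cutoff: this branch cannot matter
            break
    return value
```

Called with the initial window (-∞, +∞), it returns exactly the minimax value while skipping leaves that cannot affect the root.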

Efficiency Gains

Pure minimax:            b^d nodes
Alpha-beta (average):    b^(3d/4) nodes
Alpha-beta (perfect):    b^(d/2) nodes
Speedup factor:          ~100-1000×

b = branching factor (~35 for chess), d = search depth

Move Ordering Impact

The effectiveness of alpha-beta pruning heavily depends on move ordering. Gillespie employs several techniques to search promising moves first:

• Hash Move: Try the move stored in the transposition table first
• MVV-LVA: Most Valuable Victim - Least Valuable Attacker ordering for captures
• Killer Moves: Quiet moves that caused cutoffs at the same depth
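A sketch of this ordering scheme. The piece values and the `(victim, attacker)` capture encoding are illustrative conventions, not Gillespie's internal move representation, and the sketch assumes the four input lists are disjoint:

```python
# Coarse piece values used only for ordering, never for evaluation.
PIECE_VALUE = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}

def mvv_lva_key(capture):
    """Higher score = searched earlier: weight the victim heavily so the most
    valuable victim sorts first, ties broken by the least valuable attacker."""
    victim, attacker = capture
    return PIECE_VALUE[victim] * 10 - PIECE_VALUE[attacker]

def order_moves(hash_move, captures, killers, quiets):
    """Hash move first, then captures by MVV-LVA, then killers, then the rest."""
    ordered = [hash_move] if hash_move is not None else []
    ordered += sorted(captures, key=mvv_lva_key, reverse=True)
    ordered += killers + quiets
    return ordered
```

Note the effect of the secondary key: pawn-takes-queen sorts ahead of queen-takes-queen, because winning the same victim with a cheaper attacker risks less material.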

Advanced Search Techniques

Quiescence Search

Extends the search beyond the nominal depth in tactical positions (captures, checks, promotions) to avoid the horizon effect, so unstable positions are never evaluated statically.

Average extension: +8 ply
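A negamax-style sketch of stand-pat quiescence. The helpers are assumptions: `evaluate` returns the score from the side to move's perspective, and `noisy_moves` generates only captures, checks, and promotions:

```python
def quiescence(position, alpha, beta, evaluate, noisy_moves, make_move):
    """Search only 'noisy' moves until the position is quiet, using the
    static evaluation as a stand-pat floor to avoid the horizon effect."""
    stand_pat = evaluate(position)
    if stand_pat >= beta:
        return beta                        # position already fails high
    alpha = max(alpha, stand_pat)          # side to move can decline to capture
    for m in noisy_moves(position):
        score = -quiescence(make_move(position, m), -beta, -alpha,
                            evaluate, noisy_moves, make_move)
        if score >= beta:
            return beta
        alpha = max(alpha, score)
    return alpha
```

The stand-pat floor is what stabilizes the result: a side is never forced into a losing capture merely because only captures are searched.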

Transposition Table

Hash table storing previously evaluated positions. Different move orders can reach identical positions; the table prevents redundant evaluation.

Table size: 16 GB
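A minimal always-replace table keyed by Zobrist hash, as a sketch: real entries also store a bound type and best move, and Gillespie's replacement policy is not specified here.

```python
from dataclasses import dataclass

@dataclass
class TTEntry:
    key: int        # full Zobrist key, to detect index collisions
    depth: int      # search depth at which the score was computed
    score: float

class TranspositionTable:
    """Fixed-size hash table indexed by (Zobrist key mod table size)."""
    def __init__(self, n_slots):
        self.slots = [None] * n_slots

    def store(self, key, depth, score):
        self.slots[key % len(self.slots)] = TTEntry(key, depth, score)

    def probe(self, key, depth):
        e = self.slots[key % len(self.slots)]
        # Usable only if it is the same position, searched at least as deep.
        if e is not None and e.key == key and e.depth >= depth:
            return e.score
        return None
```

Storing the full key alongside the slot index is what lets two positions that collide on the index be told apart, which matters at the hit rates quoted below.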

Null Move Pruning

If passing the turn (which is illegal in chess) still yields a good position, the current position is likely strong. This allows aggressive pruning in non-zugzwang positions.

Depth reduction: R = 3 ply
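The idea in sketch form, with R = 3 as above. `search`, `make_null`, `in_check`, and `has_only_pawns` are hypothetical hooks, and real engines use finer zugzwang detection than a pawns-only test:

```python
def try_null_move(position, depth, beta, search, make_null,
                  in_check, has_only_pawns, R=3):
    """If giving the opponent a free move still scores >= beta, prune.
    Skipped in check and in pawn-only endings, where zugzwang can make
    the 'passing never helps' assumption unsound."""
    if depth <= R or in_check(position) or has_only_pawns(position):
        return None                       # null move not applicable here
    # Reduced-depth, narrow-window search after "passing" the turn.
    score = -search(make_null(position), depth - 1 - R, -beta, -beta + 1)
    return beta if score >= beta else None
```

The caller treats a non-`None` result as an immediate fail-high cutoff and otherwise falls through to the normal move loop.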

Late Move Reductions

Later moves in a well-ordered search are less likely to be best. They are searched at reduced depth initially and re-searched at full depth if they exceed expectations.

Depth reduction: 1-4 ply
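A toy reduction schedule consistent with the 1-4 ply range above; the exact formula is an assumption, since engines tune these tables empirically. If the reduced search unexpectedly beats alpha, the move is re-searched at full depth.

```python
import math

def lmr_reduction(move_index, depth, is_tactical):
    """How many plies to shave off a late move's search depth: nothing for
    the first few moves, shallow nodes, or tactical moves; afterwards a
    logarithmically growing reduction clamped to the 1-4 ply range."""
    if move_index < 3 or depth < 3 or is_tactical:
        return 0
    return max(1, min(4, int(math.log2(move_index))))
```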

Search Statistics

Nodes/second:  2.5M
Average depth: 22 ply
TT hit rate:   87%
Prune rate:    65%

MoE-NNUE Architecture

Mixture of Experts

Traditional chess engines employ a single neural network for position evaluation. Gillespie pioneers a Mixture of Experts (MoE) approach, utilizing multiple specialized NNUE models that dynamically adapt to different game phases and position types.

NNUE (Efficiently Updatable Neural Network): A neural network architecture designed for chess evaluation that can be incrementally updated as moves are made, enabling real-time position assessment at millions of nodes per second.

System Overview

1. Position Analysis: Extract features such as material balance, king safety, pawn structure, and piece mobility
2. Expert Selection: The gating network routes the position to specialized expert models based on game phase
3. Evaluation Output: A weighted combination of expert outputs produces the final centipawn evaluation

Architecture Flow

Position Input → Feature Extraction → Gating Network → Expert Models (×5) → Evaluation

Expert Model Specialization

Each expert model is trained on specific game phases and position types, developing deep understanding of contextual evaluation:

Opening Expert

Specializes in moves 1-15. Focus: development, center control, king safety preparation.

Trained positions: 847K

Middlegame Expert

Specializes in moves 16-40. Focus: tactical combinations, piece coordination, attack patterns.

Trained positions: 1.2M

Endgame Expert

Specializes in moves 40+. Focus: king activity, pawn promotion, zugzwang positions.

Trained positions: 923K

Tactical Expert

Activates on sharp positions. Focus: sacrifices, forcing sequences, mate threats.

Trained positions: 654K

Positional Expert

Handles quiet, closed positions. Focus: space advantage, piece placement, long-term planning.

Trained positions: 782K

Gating Network Mechanism

The gating network is a lightweight neural network that analyzes position features and determines which expert models to activate. It outputs a probability distribution over all experts.

Input Features (64-dim vector)

• Move number: normalized to 0-1
• Material count: 32 pieces → 5 bits
• Position complexity: 0-100 scale
• Tactical density: threats per square
• Pawn structure: hash encoding

SOFTMAX OUTPUT: The gating network applies softmax activation to produce weights that sum to 1.0, enabling smooth blending between experts during transitional positions.

Example Gate Activations

Opening Expert:    0.78
Middlegame Expert: 0.15
Endgame Expert:    0.02
Tactical Expert:   0.03
Positional Expert: 0.02

Position context: Move 8, Italian Game opening, material equal, low tactical complexity. The weights sum to 1.00, with the opening expert dominant.
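The softmax step and top-2 selection can be sketched directly; this assumes nothing about the engine beyond the softmax output described above.

```python
import math

def gate_weights(logits):
    """Softmax over the gating network's raw outputs: positive weights
    summing to 1.0, so experts can be blended smoothly."""
    m = max(logits)                        # subtract the max for stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(weights, k=2):
    """Indices of the k most strongly activated experts."""
    return sorted(range(len(weights)), key=weights.__getitem__, reverse=True)[:k]
```

For the example activations above, the top-2 selection picks the opening and middlegame experts, which between them carry 0.93 of the total weight.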

Network Architecture Details

Gating Network

Input layer:    64 features → position encoding
Hidden layer 1: 128 neurons, ReLU activation
Hidden layer 2: 64 neurons, ReLU activation
Output layer:   5 neurons, softmax activation
Parameters:     12.8K
Inference time: ~2 μs

Expert NNUE Models

Input layer:       768 features → HalfKP encoding
Hidden layers:     4 × [512→256→128→64], ClippedReLU
Accumulator:       incremental update cache
Output layer:      1 neuron → centipawn evaluation
Parameters (each): 49.4M
Inference time:    ~15 μs

Complete Forward Pass

Step 1: Extract position features from the board state   ~1 μs
Step 2: Gating network computes expert weights           ~2 μs
Step 3: Evaluate top-2 experts in parallel (AVX2)        ~15 μs
Step 4: Weighted combination of expert outputs           <1 μs
Total:  Final evaluation                                 ~19 μs
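Steps 3-4 in miniature: evaluate only the top-2 experts and combine their outputs. This is a sketch — `experts` stands in for the five NNUE models, and renormalizing the selected gate weights (rather than using them raw) is an assumption:

```python
def moe_evaluate(position, experts, weights, k=2):
    """Weighted combination of the top-k experts' centipawn outputs.
    Only k of the expert models are actually run (sparse activation)."""
    top = sorted(range(len(weights)), key=weights.__getitem__, reverse=True)[:k]
    total = sum(weights[i] for i in top)   # renormalize the selected weights
    return sum(weights[i] / total * experts[i](position) for i in top)
```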

Performance Metrics

MoE-NNUE vs Single-Model NNUE

Opening phase accuracy:  single-model 84.2% → MoE 92.5% (+8.3%)
Middlegame accuracy:     single-model 89.7% → MoE 95.8% (+6.1%)
Endgame accuracy:        single-model 82.1% → MoE 93.5% (+11.4%)

Computational Overhead

Single-model inference: ~15 μs
MoE inference (top-2):  ~19 μs
Overhead:               +26.7%

EFFICIENCY TRADE-OFF: The 27% computational overhead yields 8-11% accuracy improvements across all game phases, resulting in +180 Elo gain against baseline single-model engine.

Net Elo improvement: +180 vs. single-model NNUE

Training Methodology

Two-Stage Training Pipeline

Stage 1: Expert Pre-Training

  • Each expert trained independently on phase-specific positions
  • Self-play games filtered by move number and complexity
  • Loss function: mean squared error on centipawn values
  • Training time: ~120 hours per expert on 8× A100 GPUs

Stage 2: Gating Network Training

  • Expert weights frozen; only the gating network is trainable
  • Reinforcement learning via self-play tournaments
  • Reward signal: game outcome + evaluation accuracy
  • Training time: ~80 hours on a single machine

Training Data Statistics

Total games:  4.3M
Positions:    187M
GPU hours:    680
Dataset size: 2.1 TB

Bullet Trainer: High-Velocity Training System

The Bullet Trainer is Gillespie's specialized training infrastructure designed for rapid iteration and high-throughput position generation. It generates training data through ultra-fast self-play at drastically reduced time controls.

System Architecture

01
Distributed Engine Cluster
512 CPU cores running parallel self-play games at 1+0 time control
02
Position Filtering
Filter out trivial positions, duplicates, and theoretical endgames
03
Quality Scoring
Rank positions by complexity, tactical richness, and learning value
04
Batch Assembly
Group positions into balanced training batches by game phase

Generation Rate

Games/hour:     ~43K
Positions/hour: ~1.8M

Training Loop

Initial Bootstrap (Epoch 0)

Train on human GM games (Lichess Elite Database). This establishes baseline chess knowledge.

Self-Play Generation (Loop)

The current model plays against itself, with exploration noise added to move selection for diversity.

Model Update (Train)

Train on the new positions, backpropagating game outcomes. Weights are updated via the Adam optimizer.

Validation (Test)

Play 1000 games vs. the previous checkpoint. Keep the new model only if the Elo gain exceeds +10.

Position Quality Metrics

Complexity Score

Weighted sum of: piece count, mobility, tactical opportunities, pawn structure variety

Threshold: ≥ 18.0

Evaluation Variance

Standard deviation of evaluations across multiple search depths; high variance indicates an uncertain position.

Ideal range: 0.3 - 1.2

Uniqueness Hash

Zobrist hash with piece-square feature encoding; prevents near-duplicate positions within a training batch.

Collision rate: < 0.01%

Data Augmentation

• Horizontal Flip: Mirror positions along the vertical axis (a-file ↔ h-file)
• Color Swap: Invert the board and swap piece colors (White ↔ Black)
• Noise Injection: Add ±0.05 random noise to evaluation targets
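The two geometric augmentations in sketch form, using 0-63 square indices with a1 = 0 — an assumed convention, not necessarily the engine's. In a full implementation, castling rights and the evaluation sign must also be adjusted on a color swap:

```python
def horizontal_flip(square):
    """Mirror a square across the vertical axis: a-file <-> h-file."""
    rank, file = divmod(square, 8)
    return rank * 8 + (7 - file)

def color_swap(square, piece):
    """Mirror the board vertically and swap piece color (uppercase = White,
    lowercase = Black), turning a White position into its Black mirror."""
    rank, file = divmod(square, 8)
    return (7 - rank) * 8 + file, piece.swapcase()
```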

Hardware Infrastructure

CPU cluster:  64× AMD EPYC 7763
GPU training: 8× NVIDIA A100 80GB
Storage:      128 TB NVMe RAID
Network:      100 Gbps InfiniBand
Total cost:   ~$450K

Implementation & Optimization

CPU Optimizations

SIMD Vectorization

AVX2/AVX-512 instructions for parallel expert evaluation. Process 8 positions simultaneously with 256-bit registers.

Incremental Updates

NNUE accumulator caching reduces recomputation. Only affected neurons updated when pieces move.
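The incremental update in sketch form with plain Python lists; real NNUE accumulators are SIMD-vectorized int16 arrays, and `weights[f]` here stands in for the first-layer weight column of feature `f`:

```python
def update_accumulator(acc, weights, removed, added):
    """Instead of recomputing the first hidden layer from scratch, subtract
    the columns of features a move switched off and add those it switched on."""
    for f in removed:
        for j, w in enumerate(weights[f]):
            acc[j] -= w
    for f in added:
        for j, w in enumerate(weights[f]):
            acc[j] += w
    return acc
```

A quiet move touches only a handful of features, so the cost is a few column additions rather than a full matrix-vector product — the key to evaluating millions of positions per second.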

Sparse Activation

Top-K expert selection (K=2). Only 40% of experts evaluated per position, reducing inference cost.

Memory Layout

Expert models (×5): 1.2 GB
Gating network:     51 KB
Accumulator cache:  128 MB
Hash tables:        16 GB
Total memory:       ~17.3 GB

Quantization

INT8 quantization reduces model size by 75% with <2% accuracy loss:

FP32: 1.2 GB → INT8: 312 MB
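Symmetric per-tensor INT8 quantization in sketch form. The actual scheme (per-layer scales, zero points, rounding mode) is not specified here, so this shows only the core idea; it assumes a nonzero weight tensor:

```python
def quantize_int8(values):
    """Map FP32 weights onto [-127, 127] with a single scale factor;
    each weight shrinks from 4 bytes to 1 — the 75% size reduction."""
    scale = max(abs(v) for v in values) / 127.0
    return [round(v / scale) for v in values], scale

def dequantize(quantized, scale):
    """Recover approximate FP32 weights for inference."""
    return [q * scale for q in quantized]
```

The round-trip error per weight is bounded by the scale factor, which is where the quoted <2% accuracy loss comes from in practice.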