ARMS & HAT Memory Lab
Interactive demonstrations of the Hierarchical Attention Tree (HAT) and Attention Reasoning Memory Store (ARMS) systems. Explore how AI memory can be spatial, hierarchical, and persistent.
Experiments
HAT Tree Visualization
Status: active. Interactive 3D visualization of the Hierarchical Attention Tree structure, showing how attention states are organized by session, document, and chunk.
4096D Coordinate Explorer
Status: active. Explore how attention states map to coordinates in high-dimensional space. Visualize clustering and discrimination between topics.
Memory Retrieval Demo
Status: active. Query the HAT index and see how beam search navigates the tree to find relevant attention states.
Compression Analyzer
Status: planned. Analyze attention pattern sparsity and compression ratios across different model architectures.
About This Lab
The ARMS & HAT Memory Lab provides interactive tools to explore our breakthrough research in AI memory systems. These experiments demonstrate how:
- HAT organizes attention states into navigable hierarchical trees
- ARMS stores states at coordinate positions in 4096-dimensional space
- Memory retrieval achieves 100% recall with O(log n) complexity
The Core Innovations
HAT: Structure Over Learning
Traditional approximate nearest-neighbor indexes (HNSW, Annoy, FAISS) learn topology from the data. HAT exploits structure that is already known:
```
Traditional: Points → Learn topology → Navigate
HAT:         Points → Use known hierarchy → Navigate directly
```
Result: 100% recall (vs. 70% for HNSW) and 70× faster index construction.
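"Use known hierarchy, navigate directly" can be sketched in a few lines. The `Node` class and `descend` function below are illustrative names and a toy 2-D setup, not the HatIndex API; the point is that each query is a greedy centroid match per level rather than a search over a learned graph.

```python
import numpy as np

# Minimal sketch (not the real HatIndex): each node keeps a centroid,
# and a query descends the known hierarchy by best centroid match.
class Node:
    def __init__(self, centroid, children=None, payload=None):
        self.centroid = np.asarray(centroid, dtype=float)
        self.children = children or []
        self.payload = payload  # set on leaves only

def descend(node, query):
    """Navigate the known hierarchy directly: O(depth) centroid comparisons."""
    q = np.asarray(query, dtype=float)
    while node.children:
        node = max(
            node.children,
            key=lambda c: q @ c.centroid
            / (np.linalg.norm(q) * np.linalg.norm(c.centroid)),
        )
    return node.payload

# Two "documents" with distinct centroids, each holding one chunk leaf
doc_a = Node([1.0, 0.0], [Node([0.9, 0.1], payload="chunk about topic A")])
doc_b = Node([0.0, 1.0], [Node([0.1, 0.9], payload="chunk about topic B")])
root = Node([0.5, 0.5], [doc_a, doc_b])

print(descend(root, [1.0, 0.1]))  # prints "chunk about topic A"
```

Because the hierarchy is given rather than learned, there is no construction-time graph building, which is where the build-time advantage over HNSW-style indexes comes from.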
ARMS: Position Is Memory
Traditional memory systems project states to lower dimensions, losing information at each step. ARMS stores states at their actual coordinate positions:
```
Traditional: State → Project → Index → Retrieve → Reconstruct (lossy)
ARMS:        State → Store AT coords → Retrieve → Inject (lossless)
```
Result: 5,372× compression with exact restoration
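The contrast can be sketched minimally. `CoordStore` below is an illustrative stand-in for the ARMS idea (the full state is stored addressed by its own coordinates, so nothing is projected away), not the real implementation:

```python
import numpy as np

# Illustrative sketch of "position is memory" (assumed mechanics, not the
# real ARMS API): instead of projecting a state down and reconstructing it
# later, the full state is stored keyed by its coordinates, so retrieval
# returns it bit-exact.
class CoordStore:
    def __init__(self):
        self._slots = {}

    def store(self, coords, state):
        key = tuple(np.round(coords, 6))  # the coordinate position is the address
        self._slots[key] = state.copy()

    def retrieve(self, coords):
        return self._slots[tuple(np.round(coords, 6))]

state = np.random.default_rng(0).standard_normal(4096)  # a 4096-D attention state
coords = state  # in this sketch, the state's own coordinates address it
arms = CoordStore()
arms.store(coords, state)
restored = arms.retrieve(coords)
print(np.array_equal(restored, state))  # exact restoration: True
```

The compression claim is separate from this addressing scheme: exactness comes from never leaving the state's native space, while compression comes from how the stored states themselves are encoded.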
Interactive Experiments
1. HAT Tree Visualization
Explore the hierarchical structure of a HAT index:
- Global root containing all memory
- Session nodes (conversation boundaries)
- Document nodes (topic groupings)
- Chunk leaves (individual attention states)
See how centroids propagate up the tree and how beam search navigates down.
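The centroid roll-up the visualization shows can be sketched with toy 2-D vectors. Mean-based centroids are an assumption for illustration, not a statement about the lab's actual aggregation:

```python
import numpy as np

# Sketch of leaf states rolling up into parent centroids (toy data;
# mean aggregation is an assumption made for illustration).
chunks = {
    "doc1": np.array([[1.0, 0.0], [0.8, 0.2]]),  # chunk-level attention states
    "doc2": np.array([[0.0, 1.0], [0.2, 0.8]]),
}

# chunk → document: each document's centroid summarizes its chunks
doc_centroids = {doc: states.mean(axis=0) for doc, states in chunks.items()}

# document → session: the session centroid summarizes its documents
session_centroid = np.mean(list(doc_centroids.values()), axis=0)

print(doc_centroids["doc1"])  # [0.9 0.1]
print(session_centroid)       # [0.5 0.5]
```

Beam search then runs the same summaries in reverse: it compares the query against session centroids first, then document centroids, then chunk leaves.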
2. 4096D Coordinate Explorer
Visualize how attention states cluster in high-dimensional space:
- t-SNE projections showing topic separation
- UMAP embeddings revealing structure
- Cross-topic similarity: -0.33 (excellent discrimination)
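A tiny illustration of the separation the explorer visualizes, using made-up 2-D vectors and plain cosine similarity (the lab works in 4096-D; the -0.33 figure above is the lab's measurement, not reproduced here):

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy, hand-picked vectors: two topics pointing in roughly opposite
# directions, so cross-topic similarity comes out negative.
topic_a = np.array([[1.0, -0.3], [0.9, -0.4]])
topic_b = np.array([[-0.3, 1.0], [-0.4, 0.9]])

within = cosine(topic_a[0], topic_a[1])
across = np.mean([cosine(a, b) for a in topic_a for b in topic_b])

print(f"within-topic: {within:.2f}")   # close to 1
print(f"cross-topic:  {across:.2f}")   # negative
```

Negative mean cross-topic similarity means topic clusters are not merely far apart but point in opposing directions, which is what makes centroid routing discriminative.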
3. Memory Retrieval Demo
Watch HAT find memories in real-time:
- Enter a query
- See beam search expand candidates at each level
- Observe centroid similarity scores
- View retrieved attention states
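The candidate expansion the demo animates can be sketched as a standard beam search over centroids. This is a hedged toy implementation (dict-based tree, uniform depth assumed, raw dot products as scores), not the real HAT code:

```python
import numpy as np

def beam_search(root, query, beam_width=2):
    """Keep the best `beam_width` nodes at each level instead of one greedy pick."""
    q = np.asarray(query, dtype=float)
    score = lambda n: float(q @ n["centroid"])
    frontier = [root]
    while any(n["children"] for n in frontier):
        # Expand every frontier node, then keep the top candidates by centroid score
        candidates = [c for n in frontier for c in n["children"]]
        frontier = sorted(candidates, key=score, reverse=True)[:beam_width]
    return [n["id"] for n in frontier]

leaf = lambda i, c: {"id": i, "centroid": np.array(c), "children": []}
tree = {"id": "root", "centroid": np.array([0.0, 0.0]), "children": [
    {"id": "doc1", "centroid": np.array([1.0, 0.0]),
     "children": [leaf("c1", [1.0, 0.1]), leaf("c2", [0.9, -0.1])]},
    {"id": "doc2", "centroid": np.array([0.0, 1.0]),
     "children": [leaf("c3", [0.1, 1.0])]},
]}

print(beam_search(tree, [1.0, 0.0]))  # ['c1', 'c2']
```

With a beam width larger than one, a near-miss at the document level can still be recovered at the chunk level, which is how high recall survives a purely top-down walk.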
4. Compression Analyzer (Coming Soon)
Analyze why attention patterns are highly compressible:
- ~90% of attention weights fall below 1% of the peak magnitude
- Similar queries → similar patterns (cacheable)
- Compression potential: 5,000-18,000×
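A rough sketch of the kind of sparsity analysis the tool will run, on synthetic weights engineered so ~90% of entries are near zero. The numbers it produces are illustrative only, not the quoted 5,000-18,000× figures:

```python
import numpy as np

# Synthetic "attention weights": scale ~90% of entries down so most of the
# magnitude sits on a small significant subset (an engineered toy, not real
# model weights).
rng = np.random.default_rng(0)
weights = rng.standard_normal(10_000)
weights[rng.random(10_000) < 0.9] *= 0.001

# Count entries below 1% of the peak magnitude
threshold = 0.01 * np.abs(weights).max()
significant = np.abs(weights) >= threshold
sparsity = 1.0 - significant.mean()

# Naive compression from keeping only the significant entries
ratio = weights.size / significant.sum()

print(f"near-zero fraction: {sparsity:.0%}")
print(f"keep-only-significant compression: {ratio:.0f}x")
```

Real compression ratios also exploit pattern reuse across similar queries, which is why the quoted potential is far larger than this naive keep-the-significant-entries bound.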
Key Metrics
| System | Metric | Value |
|---|---|---|
| HAT | Recall@10 | 100% |
| HAT | Build Time vs HNSW | 70× faster |
| HAT | Query Latency | 3.1ms |
| ARMS | Compression Ratio | 5,372× |
| ARMS | Cross-topic Similarity | -0.33 |
| Combined | Context Extension | 6× (10K → 60K+ tokens) |
The Hippocampus Model
Our architecture mirrors human memory:
| Human Memory | System Equivalent |
|---|---|
| Working memory (7±2 items) | Current context window |
| Short-term memory | Recent session containers |
| Long-term episodic | HAT hierarchical storage |
| Memory consolidation | Consolidation phases (α/β/δ/θ) |
| Hippocampal indexing | Centroid-based routing |
Technical Stack
- Core: Rust (performance-critical paths)
- Bindings: PyO3 (Python integration)
- Index: Custom HAT + FAISS baseline
- Encoder: Sentence Transformers
- Visualization: React Three Fiber + D3
Getting Started
```python
from arms_hat import HatIndex

# Assumes `encoder` is a sentence encoder (e.g. a Sentence Transformers model)
# and `conversation` is a list of message strings.

# Create index (1536-dimensional embeddings, cosine similarity)
index = HatIndex.cosine(1536)

# Start a conversation (session)
index.new_session()

# Add messages
for message in conversation:
    embedding = encoder.encode(message)
    index.add(embedding, message)

# Query memory
query = encoder.encode("What did we discuss about X?")
results = index.near(query, k=10)

# Retrieve with 100% accuracy
for state_id, similarity in results:
    print(f"Found: {state_id} @ {similarity:.3f}")
```
Research Applications
- Persistent conversation memory: cross-session context
- Knowledge graph construction: structured fact extraction
- Model debugging: inspect attention patterns
- Compute caching: skip redundant attention computation
- Multi-agent memory: shared attention manifolds
This lab is part of ongoing research at Automate Capture. Experiments are interactive and run in your browser where possible.