AGORA
Nodes: 61 / 92
Peak: 61
Loss: 2.922
TPS: 122 kTPS
Protocol overhead: 37%
Pluralis: 92%
Wait: 4 min
Showing single run
Pluralis-8B (Live)
Model: Pluralis-8B · Dataset: FineWeb-Edu (1.3T)
Waitlist: 0 in queue · 4 min wait
Pipeline: 0 Stages · 0 Workers
No swarm data
Trainers: NA active
Training loss (cross-entropy)
Throughput (tokens/sec)
Throughput per TFLOP: 4.27 tok/s/TFLOP
Tokens/sec divided by total dense BF16 TFLOPs, i.e. the ratio of training output to swarm compute.
Centralised baseline: TorchTitan FSDP2 + compile, BF16, 8× H100 → 6.74 tok/s/TFLOP. See the Analysis tab for the full breakdown and sources.
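As a rough illustration of the ratio defined above, the following sketch divides the headline throughput (122 kTPS) by the total swarm compute (28.42k TFLOPS). The function name and variables are hypothetical, and the inputs are the rounded figures shown on this page, so the result only approximates the displayed 4.27.

```python
def tokens_per_sec_per_tflop(tokens_per_sec: float, total_tflops: float) -> float:
    """Training throughput normalised by total dense BF16 compute (hypothetical helper)."""
    if total_tflops <= 0:
        raise ValueError("total_tflops must be positive")
    return tokens_per_sec / total_tflops


# Rounded dashboard figures (assumed inputs, not exact telemetry).
swarm_tps = 122_000        # 122 kTPS
swarm_tflops = 28_420      # 28.42k TFLOPS
baseline_ratio = 6.74      # centralised baseline, tok/s/TFLOP

swarm_ratio = tokens_per_sec_per_tflop(swarm_tps, swarm_tflops)
relative_to_baseline = swarm_ratio / baseline_ratio

print(f"swarm: {swarm_ratio:.2f} tok/s/TFLOP")
print(f"fraction of centralised baseline: {relative_to_baseline:.2f}")
```

With these rounded inputs the swarm delivers roughly two thirds of the centralised baseline's output per unit of compute.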
Total swarm TFLOPS: 28.42k
Total dense BF16 compute, split by operator: Pluralis 92% · Contributor 8%
Mean MFU: 20.5%
Model FLOP utilisation: achieved ÷ peak FLOPs (Chinchilla 6N approximation, 8B parameters, BF16 peak as the denominator).
Centralised reference: 40% MFU · 8× H100 · Llama 3 8B · BF16 · FSDP2 + torch.compile, as published in the TorchTitan benchmarks · Dec 2024.
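The MFU definition above can be sketched with the common 6N approximation (about 6 FLOPs per parameter per trained token). This is a minimal sketch, not the dashboard's actual computation: the helper name is hypothetical, N is assumed to be exactly 8e9 from the "8B" label, and the inputs are the rounded figures shown on this page, so the result lands near, not exactly on, the displayed 20.5%.

```python
def model_flop_utilisation(tokens_per_sec: float, n_params: float, peak_tflops: float) -> float:
    """Achieved training FLOP/s divided by peak FLOP/s, using the 6N approximation."""
    if peak_tflops <= 0:
        raise ValueError("peak_tflops must be positive")
    achieved_flops_per_sec = 6.0 * n_params * tokens_per_sec  # forward + backward pass
    peak_flops_per_sec = peak_tflops * 1e12                   # TFLOPS -> FLOP/s
    return achieved_flops_per_sec / peak_flops_per_sec


# Rounded dashboard figures (assumed inputs).
mfu = model_flop_utilisation(
    tokens_per_sec=122_000,   # 122 kTPS
    n_params=8e9,             # assumed from the "8B" label
    peak_tflops=28_420,       # 28.42k dense BF16 TFLOPS
)
print(f"MFU: {mfu:.1%}")
```

The 6N rule ignores attention FLOPs, so figures computed this way can differ from benchmarks that count them.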
Evaluations
No data yet for Pluralis-8B
Benchmark results will appear here as the training run progresses.