KellyBench tested eight frontier AI models on a full Premier League betting season. Every model lost money, exposing limits in long-horizon agent reasoning.