AI Now Rivals Prediction Markets in Real-World Event Forecasting, Groundbreaking Study Reveals
Machines are calling the shots—and getting scarily good at it. Artificial intelligence now matches prediction markets in forecasting real-world events, according to new research. That means algorithms go toe-to-toe with crowd wisdom—and Wall Street’s favourite gambling parlours.
How AI pulled it off
Neural networks digest insane amounts of data (news, historical trends, social chatter), then spit out probabilities with unnerving accuracy. No gut feelings, no emotional bias. Just cold, hard math going head-to-head with human consensus.
Why traders should care
Prediction markets have long been finance’s dirty little secret for hedging real-world risk. Now AI bypasses the middleman—and the spread. Some hedge funds are already quietly layering these models into high-stakes decision pipelines. The edge? Speed, scale, and no lunch breaks.
But remember—past performance is never gospel, even when delivered by silicon. Just ask anyone who’s ever leveraged ‘can’t-miss’ algo signals into a margin call.
“By anchoring evaluations in unresolved, real-world events, Prophet Arena ensures a level playing field. There is no pre-training advantage, no secret fine-tuning trick, no leakage of test samples,” the Prophet Arena team said in the benchmark’s official blog post.
The benchmark says it is trying to address a fundamental question about artificial intelligence: “Can AI systems reliably predict the future by connecting the dots across existing real-world information?”
Early results suggest they can. GPT-5 currently leads the leaderboard with a Brier score of 82.21%. Meanwhile, OpenAI's o3-mini model has emerged as the profit champion, generating the highest average returns when its predictions are translated into simulated bets. (A correctly backed underdog pays out far more per dollar than a favourite, so a model that spots underpriced long shots can earn outsized returns.)
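For readers unfamiliar with the metric: the textbook Brier score is the mean squared difference between a forecast probability and what actually happened (1 if the event occurred, 0 if not), so lower is better. A leaderboard figure where higher reads as better suggests a rescaled variant, but the article does not say how Prophet Arena computes its number, so the sketch below shows only the standard definition. The function name and the sample forecasts are made up for illustration.

```python
def brier_score(forecasts, outcomes):
    """Standard Brier score: mean squared error between forecast
    probabilities (0..1) and binary outcomes (1 = happened, 0 = did not).
    Lower is better; 0.0 is a perfect forecaster."""
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("need equal-length, non-empty sequences")
    squared_errors = [(p - o) ** 2 for p, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical forecasts for three events and how they resolved.
example_forecasts = [0.9, 0.2, 0.6]
example_outcomes = [1, 0, 0]

print(round(brier_score(example_forecasts, example_outcomes), 4))
```

A forecaster who always says 50% scores 0.25 under this definition, which is a handy baseline for judging whether a model adds any information.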
DeepSeek R1 appears to be the contrarian AI in the group, frequently making predictions that diverge sharply from both other models and market consensus. That probably makes it a poor model to trust if you want to make a quick buck on Myriad Markets.


The platform reveals distinct "personalities" among AI models when facing identical information. In one example, when predicting whether AI regulation would become federal law before 2026, the market assigned just a 25% probability. But the models diverged wildly: Qwen 3 predicted 75%, GPT-4.1 estimated 60%, while Llama 4 Maverick stayed conservative at 35%.
In another case, o3-mini earned a simulated $9 return on a $1 bet by correctly predicting Toronto FC would beat San Diego FC in a Major League Soccer match. The model gave Toronto a 30% chance of winning, while the market priced it at just 11%. Toronto won.
"(Prophet Arena) tests models' forecasting capability, a high form of intelligence that demands a broad range of capabilities, including understanding existing information and news sources, reasoning under uncertainty, and making time-sensitive predictions about unfolding events," the researchers wrote.
Prophet Arena also enables human-AI collaboration. Users can supply additional news and context to see how predictions shift, while AI models provide detailed rationales for their forecasts.
As prediction markets themselves integrate AI—Kalshi recently partnered with Elon Musk's Grok, while Polymarket generates AI-powered market summaries—Prophet Arena offers the first systematic comparison of machine forecasting against collective human judgment.
And if the models get really good at it, their forecasts could be purely factual, with no sentiment or emotion playing a role in the decisions. They could potentially match or exceed the wisdom of crowds, changing the way institutions approach risk assessment, investment decisions, and strategic planning.
The Prophet Arena platform continues updating daily as events resolve, providing an evolving picture of whether artificial intelligence can truly predict the future by connecting today's dots.