Shocking Study: 97% of AI Agents Fail Miserably at Basic Upwork Tasks

Published: 2025-11-06 18:25:06

Research found that AI agents can’t complete 97% of tasks on Upwork to even a basic standard

AI hype meets harsh reality: New research exposes a glaring gap between promise and performance in autonomous digital labor.

The cold hard numbers

When put to the test on real-world freelance platforms, artificial intelligence couldn't deliver even baseline competency for 97% of assigned jobs. The findings deliver a sobering reality check to VC-funded startups promising 'AI revolution.'

Why this matters for the future of work

While chatbots dazzle with conversational parlor tricks, practical task execution remains firmly in human territory. The study suggests current-gen AI still can't replace basic cognitive labor—despite what your crypto-bro coworker claims about 'decentralized autonomous organizations.'

Perhaps the only thing less reliable than AI workers? The tokenomics behind most Web3 'disruptors' banking on them.

Researchers believe AI won’t replace jobs any time soon

The researchers discovered that AI agents struggle with multi-step workflows, taking initiative, or using judgment. They also agreed that AI will not replace jobs for a while yet. 

According to research by the European Broadcasting Union and BBC, AI models, including ChatGPT, Copilot, and Perplexity, are not good at reporting news. The research found that AI models fail to meet key criteria, such as sourcing, accuracy, generating text, and distinguishing between opinion and fact.

At least one significant issue appeared in 45% of AI answers. Serious sourcing problems showed up in 31% of answers, and 20% contained major accuracy issues, including outdated information and hallucinated details. Gemini performed worst of the models tested, with significant issues in 76% of its responses.

Research from Freelance.com found that AI-generated cover letters have undermined the application process, leading employers to hire fewer people or even the wrong ones. The firm also reported that skilled workers in the top quintile for ability are being hired 19% less often than before, while those in the lowest quintile are being hired 14% more often.

The study corroborates an MIT research report from August, which concluded that 95% of organizations have garnered zero return from their collective $30 billion investment in AI. According to WorldTest from MIT and Basis Research, AI agents can match patterns and predict words, but struggle to build internal models of the world.

The study involved 129 tasks across 43 interactive worlds, which required the AIs to predict hidden aspects of the world, plan sequences of actions to achieve a goal, and determine when the rules of the environment changed. The researchers also tested 517 humans on the same tasks and found that humans achieve near-optimal scores while AI models frequently fail. 

The researchers argued that humans perform better on these tasks because they intuitively understand their environments, adjust their perspectives, run experiments, start from scratch, and explore strategically. According to the study, adding more compute to existing models is not a reliable fix either: it helps in only 25 of the 43 environments.

Crypto and AI Czar warns of AI-driven censorship on social media 

MIT Sloan researchers and SAFE Security found that AI drives 80% of ransomware attacks. According to a study of 2,800 ransomware attacks by the Cybersecurity Arms Race, adversarial AI was found to be automating entire attack sequences, including creating malware, phishing campaigns, and deepfake phone calls for social engineering purposes.

Researcher Kevin Beaumont disagrees with the research, claiming that generative AI isn’t a major part of any of them. Researcher Marcus Hutchins also called the paper absurd, adding that he burst out laughing.

“The paper is almost complete nonsense; it’s jaw-droppingly bad. It’s so bad it’s difficult to know where to start.”

–Kevin Beaumont, Security Researcher at Medium.

Crypto and AI Czar David Sacks also stated that he’s concerned that the censorship on social media and search engines seen in recent years will become thoroughly dystopian with the advent of generative AI. He argued that the term “woke AI” is insufficient to explain what’s going on because it somehow trivializes the issue. He pointed to Orwellian AI, which he claims distorts the answers, lies, and rewrites history in real-time to serve the current political agenda of those in power.
