TECHNOLOGY MEA
© Copyright TECHNOLOGY MEA. All Rights Reserved.

Is Math Out? Mario Challenges AI to the Ultimate Test

Last updated: March 17, 2025 6:07 PM

Researchers have tested artificial intelligence’s ability to adapt quickly through the classic game Super Mario Bros.

The Claude 3.7 model excelled in fast responses and jump planning, while other models faced noticeable difficulties.

The experiment raised questions about how relevant game-based tests are to real-world AI capabilities.

As the quest to measure AI abilities continues, researchers are turning to an approach that goes beyond traditional mathematical tests and into the realm of games, which are both fun and genuinely challenging.

Following Anthropic’s testing of its latest Claude 3.7 Sonnet model in Pokémon, a fresh attempt emerged using the iconic Super Mario Bros., a game released by Nintendo in 1985. This now serves as a new testing platform for AI’s capabilities, symbolizing a shift from classic logical puzzles to dynamic jumping challenges.

This approach comes from the Hao AI Lab at the University of California, San Diego, where researchers tested multiple advanced AI models using Super Mario Bros. as an evaluation tool. Rather than relying on traditional metrics, the team decided to assess AI in an environment that humans instinctively understand.

To carry out the experiment, they ran the game in an emulator and connected it to GamingAgent, a custom framework developed by the lab. The framework gave each AI model basic control instructions, simple guidance rules, and real-time screenshots of the game, and the model controlled Mario’s movements by generating Python code.

Although Super Mario Bros. is a relatively simple platform game, the researchers found that it required the AI to engage in complex planning and adapt rapidly. Success depended not only on computational power but also on making strategic decisions and performing precise, sequential actions in a fast-changing environment.
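The screenshot-in, Python-out loop can be pictured with a small sketch. This is not the GamingAgent API: every name here (`Frame`, `stub_model`, `press`, `run_episode`) and the instruction text are hypothetical. A fixed rule stands in for the AI model, and a single number stands in for the screenshot, so the example runs without an emulator.

```python
from dataclasses import dataclass

# Hypothetical instruction text, written only for this illustration.
INSTRUCTIONS = "If an obstacle or enemy is near, jump; otherwise run right."


@dataclass
class Frame:
    """Stand-in for a screenshot: distance in tiles to the nearest obstacle."""
    obstacle_distance: int


def stub_model(instructions: str, frame: Frame) -> str:
    """Placeholder for a model call (assumption, not a real API).

    A real model would receive the instructions plus an image and return
    Python source text. Here a fixed rule plays that role.
    """
    if frame.obstacle_distance <= 2:
        return "press('jump')"
    return "press('right')"


def run_episode(frames: list[Frame]) -> list[str]:
    """Feed each frame to the model and execute the code it returns."""
    pressed: list[str] = []

    def press(button: str) -> None:
        # A real harness would forward this input to the emulator.
        pressed.append(button)

    for frame in frames:
        action_code = stub_model(INSTRUCTIONS, frame)
        # Run the generated snippet with only `press` exposed.
        # Emptying the builtins keeps the demo tidy; it is not a real sandbox.
        exec(action_code, {"__builtins__": {}}, {"press": press})
    return pressed


if __name__ == "__main__":
    demo = [Frame(5), Frame(3), Frame(2), Frame(6)]
    print(run_episode(demo))
```

In this toy run the stub presses right until the obstacle is two tiles away, jumps once, and then keeps running. A real setup would replace `stub_model` with a call to a vision-capable model and `press` with an emulator input.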

At the conclusion of the experiment, Anthropic’s Claude 3.7 stood out as the most impressive model, displaying rapid responses and well-timed jumps while avoiding enemies. Claude 3.5 also performed well, whereas GPT-4o from OpenAI and Gemini 1.5 Pro from Google struggled to keep pace with the demands of the game. The real surprise was that models designed for step-by-step logical reasoning fared worse than models without that design.

Researchers highlighted timing as a key factor in this test, noting that a fraction of a second could make the difference between success and failure. Models relying on deep logical reasoning tend to process information in sequential steps, which makes them slower to respond to quickly changing scenarios, leading to frequent game losses.
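A rough calculation shows why response time matters so much. The sketch below assumes the game runs at about 60 frames per second (roughly the NES rate on NTSC hardware) and keeps running while the model decides. The latency values are hypothetical, not measurements from the experiment.

```python
FPS = 60  # assumption: approximate frame rate of the game


def frames_elapsed(decision_seconds: float, fps: int = FPS) -> int:
    """Whole frames the game advances while the model is still deciding."""
    return int(decision_seconds * fps)


if __name__ == "__main__":
    for latency in (0.25, 1.0, 3.0):  # hypothetical decision times in seconds
        print(f"{latency} s of thinking = {frames_elapsed(latency)} frames")
```

Under these assumptions, a model that takes three seconds to decide lets 180 frames go by before it acts, which is ample time for Mario to walk into an enemy or off a ledge.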

While using games to assess AI abilities isn’t new, some experts question how relevant these tests are to real-world AI. Games tend to be abstract and relatively simple, and they can supply a practically unlimited amount of training data, unlike the unpredictable and intricate real world.

In this context, AI researcher Andrej Karpathy raised concerns about an “evaluation crisis” in the field, suggesting that current testing methods, especially those involving games, might not provide an accurate picture of true AI progress.

An interesting and somewhat amusing question arises: If AI struggles to navigate the Mushroom Kingdom, can we trust it to handle the complexities of the real world? While the Super Mario test is an exciting way to explore AI’s capabilities, it also serves as a reminder of the challenges that even seemingly simple tasks present. For those interested in exploring this further, the Hao AI Lab has released the GamingAgent framework as open source on GitHub.
