    What’s next for AI and math

    MIT Technology Review’s What’s Next series looks across industries, trends, and technologies to give you a first look at the future. You can read the rest of them here.

    The way DARPA tells it, math is stuck in the past. In April, the US Defense Advanced Research Projects Agency kicked off a new initiative called expMath—short for Exponentiating Mathematics—that it hopes will speed up the rate of progress in a field of research that underpins a wide range of crucial real-world applications, from computer science to medicine to national security.

    “Math is the source of huge impact, but it’s done more or less as it’s been done for centuries—by people standing at chalkboards,” DARPA program manager Patrick Shafto said in a video introducing the initiative. 

    The modern world is built on mathematics. Math lets us model complex systems such as the way air flows around an aircraft, the way financial markets fluctuate, and the way blood flows through the heart. And breakthroughs in advanced mathematics can unlock new technologies such as cryptography, which is essential for private messaging and online banking, and data compression, which lets us shoot images and video across the internet.

    But advances in math can be years in the making. DARPA wants to speed things up. The goal for expMath is to encourage mathematicians and artificial-intelligence researchers to develop what DARPA calls an AI coauthor, a tool that might break large, complex math problems into smaller, simpler ones that are easier to grasp and—so the thinking goes—quicker to solve.

    Mathematicians have used computers for decades, to speed up calculations or check whether certain mathematical statements are true. The new vision is that AI might help them crack problems that were previously uncrackable.  
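    As a concrete (and hypothetical, not drawn from the article) illustration of that traditional role, the short Python sketch below checks that Goldbach's conjecture (every even number greater than 2 is the sum of two primes) holds for every even number up to 10,000. Verifying small cases like this is exactly the kind of work computers have long done for mathematicians; proving the statement for all numbers is another matter entirely.

```python
# A minimal illustration (not from the article) of the kind of brute-force
# checking mathematicians have long delegated to computers: verifying that
# Goldbach's conjecture holds for all even numbers up to a chosen bound.

def is_prime(n: int) -> bool:
    """Trial-division primality test; fine for small n."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    f = 3
    while f * f <= n:
        if n % f == 0:
            return False
        f += 2
    return True

def goldbach_holds_up_to(limit: int) -> bool:
    """Check that every even number in [4, limit] is a sum of two primes."""
    for n in range(4, limit + 1, 2):
        if not any(is_prime(p) and is_prime(n - p) for p in range(2, n // 2 + 1)):
            print(f"counterexample found: {n}")
            return False
    return True

if __name__ == "__main__":
    # Prints True for 10,000; the conjecture itself remains unproven in general.
    print(goldbach_holds_up_to(10_000))
```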

    But there’s a huge difference between AI that can solve the kinds of problems set in high school—math that the latest generation of models has already mastered—and AI that could (in theory) solve the kinds of problems that professional mathematicians spend careers chipping away at.

    On one side are tools that might be able to automate certain tasks that math grads are employed to do; on the other are tools that might be able to push human knowledge beyond its existing limits.

    Here are three ways to think about that gulf.

    1/ AI needs more than just clever tricks

    Large language models are not known to be good at math. They make things up and can be persuaded that 2 + 2 = 5. But newer versions of this tech, especially so-called large reasoning models (LRMs) like OpenAI’s o3 and Anthropic’s Claude 4 Thinking, are far more capable—and that’s got mathematicians excited.

    This year, a number of LRMs, which try to solve a problem step by step rather than spit out the first result that comes to them, have achieved high scores on the American Invitational Mathematics Examination (AIME), a test given to the top 5% of US high school math students.

    At the same time, a handful of new hybrid models that combine LLMs with some kind of fact-checking system have also made breakthroughs. Emily de Oliveira Santos, a mathematician at the University of São Paulo, Brazil, points to Google DeepMind’s AlphaProof, a system that combines an LLM with DeepMind’s game-playing model AlphaZero, as one key milestone. Last year AlphaProof became the first computer program to match the performance of a silver medallist at the International Math Olympiad, one of the most prestigious mathematics competitions in the world.

    And in May, a Google DeepMind model called AlphaEvolve discovered better results than anything humans had yet come up with for more than 50 unsolved mathematics puzzles and several real-world computer science problems.

    The uptick in progress is clear. “GPT-4 couldn’t do math much beyond undergraduate level,” says de Oliveira Santos. “I remember testing it at the time of its release with a problem in topology, and it just couldn’t write more than a few lines without getting completely lost.” But when she gave the same problem to OpenAI’s o1, an LRM released in January, it nailed it.

    Does this mean such models are all set to become the kind of coauthor DARPA hopes for? Not necessarily, she says: “Math Olympiad problems often involve being able to carry out clever tricks, whereas research problems are much more explorative and often have many, many more moving pieces.” Success at one type of problem-solving may not carry over to another.

    Others agree. Martin Bridson, a mathematician at the University of Oxford, thinks the Math Olympiad result is a great achievement. “On the other hand, I don’t find it mind-blowing,” he says. “It’s not a change of paradigm in the sense that ‘Wow, I thought machines would never be able to do that.’ I expected machines to be able to do that.”

    That’s because even though the problems in the Math Olympiad—and similar high school or undergraduate tests like AIME—are hard, there’s a pattern to a lot of them. “We have training camps to train high school kids to do them,” says Bridson. “And if you can train a large number of people to do those problems, why shouldn’t you be able to train a machine to do them?”

    Sergei Gukov, a mathematician at the California Institute of Technology who coaches Math Olympiad teams, points out that the style of question does not change too much between competitions. New problems are set each year, but they can be solved with the same old tricks.

    “Sure, the specific problems didn’t appear before,” says Gukov. “But they’re very close—just a step away from zillions of things you have already seen. You immediately realize, ‘Oh my gosh, there are so many similarities—I’m going to apply the same tactic.’” As hard as competition-level math is, kids and machines alike can be taught how to beat it.

    That's not true for most unsolved math problems. Bridson is president of the Clay Mathematics Institute, a nonprofit US-based research organization best known for setting up the Millennium Prize Problems in 2000—seven of the most important unsolved problems in mathematics, with a $1 million prize to be awarded to the first person to solve each of them. (One problem, the Poincaré conjecture, was solved in the early 2000s, with the prize awarded in 2010; the others, which include P versus NP and the Riemann hypothesis, remain open.) "We're very far away from AI being able to say anything serious about any of those problems," says Bridson.

    And yet it’s hard to know exactly how far away, because many of the existing benchmarks used to evaluate progress are maxed out. The best new models already outperform most humans on tests like AIME.

    To get a better idea of what existing systems can and cannot do, a startup called Epoch AI has created a new test called FrontierMath, released in December. Instead of co-opting math tests developed for humans, Epoch AI worked with more than 60 mathematicians around the world to come up with a set of math problems from scratch.

    FrontierMath is designed to probe the limits of what today’s AI can do. None of the problems have been seen before and the majority are being kept secret to avoid contaminating training data. Each problem demands hours of work from expert mathematicians to solve—if they can solve it at all: some of the problems require specialist knowledge to tackle.

    FrontierMath is set to become an industry standard. It’s not yet as popular as AIME, says de Oliveira Santos, who helped develop some of the problems: “But I expect this to not hold for much longer, since existing benchmarks are very close to being saturated.”

    On AIME, the best large language models (Anthropic's Claude 4, OpenAI's o3 and o4-mini, Google DeepMind's Gemini 2.5 Pro, xAI's Grok 3) now score around 90%. On FrontierMath, o4-mini scores 19% and Gemini 2.5 Pro scores 13%. That's still remarkable, but there's clear room for improvement.

    FrontierMath should give the best sense yet of just how fast AI is progressing at math. But there are some problems that are still too hard for computers to take on.

    2/ AI needs to manage really vast sequences of steps

    Squint hard enough and in some ways math problems start to look the same: to solve them you need to take a sequence of steps from start to finish. The problem is finding those steps. 

    “Pretty much every math problem can be formulated as path-finding,” says Gukov. What makes some problems far harder than others is the number of steps on that path. “The difference between the Riemann hypothesis and high school math is that with high school math the paths that we’re looking for are short—10 steps, 20 steps, maybe 40 in the longest case.” The steps are also repeated between problems.

    “But to solve the Riemann hypothesis, we don’t have the steps, and what we’re looking for is a path that is extremely long”—maybe a million lines of computer proof, says Gukov.

    Finding very long sequences of steps can be thought of as a kind of complex game. It’s what DeepMind’s AlphaZero learned to do when it mastered Go and chess. A game of Go might only involve a few hundred moves. But to win, an AI must find a winning sequence of moves among a vast number of possible sequences. Imagine a number with 100 zeros at the end, says Gukov.

    But that’s still tiny compared with the number of possible sequences that could be involved in proving or disproving a very hard math problem: “A proof path with a thousand or a million moves involves a number with a thousand or a million zeros,” says Gukov. 
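    To put rough numbers on that comparison (the branching factor of 10 below is an illustrative assumption, not a figure from the article): if each step of a path offers about b admissible moves and the path is d steps long, the number of candidate paths grows like b to the power d.

```latex
% Candidate paths grow exponentially in the path length d:
\[
  \#\{\text{candidate paths}\} \approx b^{d},
  \qquad\text{e.g.}\quad
  b = 10,\; d = 100 \;\Rightarrow\; 10^{100} \text{ (a number with 100 zeros)},
\]
\[
  b = 10,\; d = 1000 \;\Rightarrow\; 10^{1000},
  \qquad
  b = 10,\; d = 10^{6} \;\Rightarrow\; 10^{1\,000\,000},
\]
% matching Gukov's description of proof paths whose counts carry a thousand
% or a million zeros.
```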

    No AI system can sift through that many possibilities. To address this, Gukov and his colleagues developed a system that shortens the length of a path by combining multiple moves into single supermoves. It’s like having boots that let you take giant strides: instead of taking 2,000 steps to walk a mile, you can now walk it in 20.

    The challenge was figuring out which moves to replace with supermoves. In a series of experiments, the researchers came up with a system in which one reinforcement-learning model suggests new moves and a second model checks to see if those moves help.
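    The sketch below is a toy version of that idea, not Gukov's actual system: a random proposer stands in for the reinforcement-learning model that suggests supermoves, and a brute-force breadth-first search stands in for the model that checks whether they help. The state space (integers reached by the primitive moves +1 and ×2) and all names in the code are invented for illustration.

```python
# Toy sketch of the "supermove" idea (not Gukov's system): one component
# proposes macro-moves built from primitive moves, another checks whether
# adding them shortens solution paths. Random proposal and breadth-first
# search stand in for the two reinforcement-learning models in the article.
import random
from collections import deque

PRIMITIVES = {"+1": lambda x: x + 1, "*2": lambda x: x * 2}

def path_length(start, target, moves, cap=100_000):
    """Number of moves in the shortest path from start to target (BFS)."""
    seen, queue = {start}, deque([(start, 0)])
    while queue and len(seen) < cap:
        state, depth = queue.popleft()
        if state == target:
            return depth
        for step in moves.values():
            nxt = step(state)
            if 0 <= nxt <= 10 * target and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return None  # unreachable within the cap

def propose_supermove(length=3):
    """Proposer: a random composition of primitives, treated as one move."""
    names = random.choices(list(PRIMITIVES), k=length)
    steps = [PRIMITIVES[n] for n in names]
    def supermove(x, steps=steps):
        for step in steps:
            x = step(x)
        return x
    return "·".join(names), supermove

def helps(supermove, problems, base_moves):
    """Checker: does the supermove shorten the total solution length?"""
    extended = dict(base_moves, SUPER=supermove)
    before = sum(path_length(s, t, base_moves) for s, t in problems)
    after = sum(path_length(s, t, extended) for s, t in problems)
    return after < before

if __name__ == "__main__":
    problems = [(1, 97), (3, 200), (1, 513)]
    for _ in range(20):
        name, sm = propose_supermove()
        if helps(sm, problems, PRIMITIVES):
            print(f"keeping supermove {name}")
```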

    They used this approach to make a breakthrough in a math problem called the Andrews-Curtis conjecture, a puzzle that has been unsolved for 60 years. It’s a problem that every professional mathematician will know, says Gukov.

    (An aside for math stans only: The AC conjecture states that a particular way of describing the trivial group, the group containing just a single element, can be translated into a different but equivalent description with a certain sequence of steps. Most mathematicians think the AC conjecture is false, but nobody knows how to prove that. Gukov himself admits that it is an intellectual curiosity rather than a practical problem, but it is an important problem for mathematicians nonetheless.)
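    For readers who want the formal version, a standard formulation of the conjecture (stated here for reference, not quoted from the article) reads:

```latex
% Andrews–Curtis conjecture (standard formulation).
% A "path" in the article's sense is exactly a sequence of the moves below.
\textbf{Conjecture.} Every balanced presentation of the trivial group
\[
  \langle x_1, \dots, x_n \mid r_1, \dots, r_n \rangle
\]
can be transformed into the trivial presentation
$\langle x_1, \dots, x_n \mid x_1, \dots, x_n \rangle$
by a finite sequence of moves of the following kinds:
\begin{itemize}
  \item replace some relator $r_i$ by its inverse $r_i^{-1}$;
  \item replace some $r_i$ by $r_i r_j$ for some $j \neq i$;
  \item replace some $r_i$ by $w\, r_i\, w^{-1}$ for a word $w$ in the generators.
\end{itemize}
```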

    Gukov and his colleagues didn't solve the AC conjecture, but they showed that a potential counterexample proposed 40 years ago, one that would have implied the conjecture is false, does not in fact disprove it. "It's been a major direction of attack for 40 years," says Gukov. With the help of AI, they showed that this direction was a dead end.

    “Ruling out possible counterexamples is a worthwhile thing,” says Bridson. “It can close off blind alleys, something you might spend a year of your life exploring.” 

    True, Gukov checked off just one piece of one esoteric puzzle. But he thinks the approach will work in any scenario where you need to find a long sequence of unknown moves, and he now plans to try it out on other problems.

    “Maybe it will lead to something that will help AI in general,” he says. “Because it’s teaching reinforcement learning models to go beyond their training. To me it’s basically about thinking outside of the box—miles away, megaparsecs away.”  

    3/ Can AI ever provide real insight?

    Thinking outside the box is exactly what mathematicians need to solve hard problems. Math is often thought to involve robotic, step-by-step procedures. But advanced math is an experimental pursuit, involving trial and error and flashes of insight.

    That’s where tools like AlphaEvolve come in. Google DeepMind’s latest model asks an LLM to generate code to solve a particular math problem. A second model then evaluates the proposed solutions, picks the best, and sends them back to the LLM to be improved. After hundreds of rounds of trial and error, AlphaEvolve was able to come up with solutions to a wide range of math problems that were better than anything people had yet come up with. But it can also work as a collaborative tool: at any step, humans can share their own insight with the LLM, prompting it with specific instructions.
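    At its core, the loop described above is an evolutionary generate-and-evaluate cycle. The sketch below shows only the shape of that loop; it is not DeepMind's implementation. In the real system an LLM writes and revises code and a second model scores it, whereas here mutate_candidates stands in for the LLM, score stands in for the evaluator, and the "problem" (hitting a target total with a small set of coin values) is invented purely for illustration.

```python
# A highly simplified stand-in for the AlphaEvolve-style loop the article
# describes: propose candidates, score them, keep the best, and feed them
# back to be improved. All names and the toy problem here are illustrative.
import random

TARGET = 289
COINS = [1, 5, 17, 23]

def score(candidate: list[int]) -> float:
    """Evaluator: prefer exact totals reached with as few coins as possible."""
    return -(abs(TARGET - sum(candidate)) * 100 + len(candidate))

def mutate_candidates(parents: list[list[int]], n: int) -> list[list[int]]:
    """Proposer: produce new candidates by tweaking the current best ones."""
    children = []
    for _ in range(n):
        child = list(random.choice(parents))
        roll = random.random()
        if roll < 0.4 or not child:
            child.append(random.choice(COINS))                          # add a coin
        elif roll < 0.8:
            child.pop(random.randrange(len(child)))                     # drop a coin
        else:
            child[random.randrange(len(child))] = random.choice(COINS)  # swap one
        children.append(child)
    return children

def evolve(rounds: int = 300, population: int = 30, keep: int = 5):
    best = [[random.choice(COINS)] for _ in range(keep)]
    for _ in range(rounds):
        pool = best + mutate_candidates(best, population)
        pool.sort(key=score, reverse=True)  # evaluate and pick the best ...
        best = pool[:keep]                  # ... then send them back to improve
    return best[0]

if __name__ == "__main__":
    winner = evolve()
    print(f"best candidate sums to {sum(winner)} using {len(winner)} coins")
```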

    This kind of exploration is key to advanced mathematics. “I’m often looking for interesting phenomena and pushing myself in a certain direction,” says Geordie Williamson, a mathematician at the University of Sydney in Australia. “Like: ‘Let me look down this little alley. Oh, I found something!’”

    Williamson worked with Meta on an AI tool called PatternBoost, designed to support this kind of exploration. PatternBoost can take a mathematical idea or statement and generate similar ones. “It’s like: ‘Here’s a bunch of interesting things. I don’t know what’s going on, but can you produce more interesting things like that?’” he says.

    Such brainstorming is essential work in math. It's how new ideas get conjured. Take the icosahedron, says Williamson: "It's a beautiful example of this, which I kind of keep coming back to in my own work." The icosahedron is a 20-sided 3D object whose faces are all triangles (think of a 20-sided die). It is the largest of a family of exactly five perfectly regular solids, the Platonic solids; the others are the tetrahedron (four faces), the cube (six), the octahedron (eight), and the dodecahedron (12).

    Remarkably, the fact that there are exactly five of these objects was proved by mathematicians in ancient Greece. “At the time that this theorem was proved, the icosahedron didn’t exist,” says Williamson. “You can’t go to a quarry and find it—someone found it in their mind. And the icosahedron goes on to have a profound effect on mathematics. It’s still influencing us today in very, very profound ways.”
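    The ancient argument Williamson alludes to still fits in a few lines. Using Euler's polyhedron formula V - E + F = 2 (a standard derivation, included here for context rather than taken from the article), one can show why the family stops at five:

```latex
% Why there are exactly five Platonic solids. Suppose every face has p edges
% and q faces meet at every vertex, with p, q >= 3. Counting edge incidences
% gives pF = 2E and qV = 2E, so Euler's formula V - E + F = 2 becomes
\[
  \frac{2E}{q} - E + \frac{2E}{p} = 2
  \quad\Longrightarrow\quad
  \frac{1}{p} + \frac{1}{q} = \frac{1}{2} + \frac{1}{E} > \frac{1}{2}.
\]
% The only integer pairs (p, q) with p, q >= 3 satisfying this are
% (3,3), (4,3), (3,4), (5,3), (3,5): the tetrahedron, cube, octahedron,
% dodecahedron, and icosahedron.
```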

    For Williamson, the exciting potential of tools like PatternBoost is that they might help people discover future mathematical objects like the icosahedron that go on to shape the way math is done. But we’re not there yet. “AI can contribute in a meaningful way to research-level problems,” he says. “But we’re certainly not getting inundated with new theorems at this stage.”

    Ultimately, it comes down to the fact that machines still lack what you might call intuition or creative thinking. Williamson sums it up like this: We now have AI that can beat humans when it knows the rules of the game. “But it’s one thing for a computer to play Go at a superhuman level and another thing for the computer to invent the game of Go.”

    “I think that applies to advanced mathematics,” he says. “Breakthroughs come from a new way of thinking about something, which is akin to finding completely new moves in a game. And I don’t really think we understand where those really brilliant moves in deep mathematics come from.”

    Perhaps AI tools like AlphaEvolve and PatternBoost are best thought of as advance scouts for human intuition. They can discover new directions and point out dead ends, saving mathematicians months or years of work. But the true breakthroughs will still come from the minds of people, as has been the case for thousands of years.

    For now, at least. “There’s plenty of tech companies that tell us that won’t last long,” says Williamson. “But you know—we’ll see.” 
