    TestSprite launches an open-source command-line tool to help AI agents check their own work

    By InfoForTech, June 11, 2026

    TestSprite Inc., maker of an autonomous artificial intelligence-powered software testing tool, today announced that it has open-sourced its command-line interface tool, which lets AI coding agents verify their own work.

    As the AI coding revolution has rolled in, autonomous coding tools have become smarter and enabled developers to prompt their way to entire applications overnight. Code gets written faster as a result, but software can also come off the digital assembly line with unseen bugs that the unit tests run by agentic tools fail to catch.

    In too many cases, an AI agent reports a feature as complete even though some of the tests failed, weren’t written correctly, were incomplete or were simply skipped. Other times a coding agent writes a function that appears to run correctly but hides a bug that triggers only in an edge case a customer will hit under particular circumstances (even 1 in 1,000 is too often). In the worst case, the change breaks some other part of the codebase altogether.

    “That’s exactly what’s driving developers crazy,” said founder and Chief Executive Yunhao Jiao. “You use AI, you ship something new, you fix one thing and then boom, another thing crashes. Even the best agent in our competition broke 12% of the features that already worked. That’s the gap a verifier closes.”

    TestSprite said today’s release provides a command-line interface, a tool run from the terminal, which gives coding agents a real quality assurance loop rather than a spot check.

    The coding agent describes a behavior once. TestSprite then runs it in the cloud the way a real user might, driving a live browser or hitting a live application programming interface rather than mocks. It then returns a single, self-consistent failure report: the failing step and its neighbors, screenshots, a Document Object Model manifest, the test source, a root-cause hypothesis and a recommended fix.

    The AI coding agent can then read the data, fix the code and rerun.

    This becomes the test loop. Every time the agent completes a phase of work, TestSprite adds dozens of new tests, so coverage grows alongside the codebase. The result is a safety net that closes coverage gaps and keeps catching breakage as the application grows more complex.
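
    The loop described above can be sketched in a few lines of Python. This is a toy model under stated assumptions: the app is a plain dictionary, each check is a function, and verify_fix_loop, run_phases, feature_works and toy_fix are hypothetical names that do not correspond to any TestSprite command or API.

```python
from typing import Callable, List, Optional

# A check returns None on success, or a failure description otherwise.
Check = Callable[[dict], Optional[str]]
Fix = Callable[[dict, str], None]


def verify_fix_loop(app: dict, checks: List[Check], fix: Fix,
                    max_rounds: int = 5) -> bool:
    """Run every check, fix the first failure, and rerun until green."""
    for _ in range(max_rounds):
        failures = [msg for check in checks if (msg := check(app)) is not None]
        if not failures:
            return True
        fix(app, failures[0])
    # Final rerun after the last fix attempt.
    return all(check(app) is None for check in checks)


def run_phases(app: dict, phases: List[List[Check]], fix: Fix) -> bool:
    """Grow the suite each phase and re-verify everything built so far."""
    suite: List[Check] = []
    for new_checks in phases:
        suite.extend(new_checks)  # coverage grows with the codebase
        if not verify_fix_loop(app, suite, fix):
            return False
    return True


def feature_works(name: str) -> Check:
    """Toy check: a feature 'works' when its flag is set in the app dict."""
    def check(app: dict) -> Optional[str]:
        return None if app.get(name) else name
    return check


def toy_fix(app: dict, failure: str) -> None:
    """Toy fix: flip the flag named in the failure description."""
    app[failure] = True
```

    Because run_phases reruns the whole accumulated suite rather than only the newest checks, a change that breaks an earlier feature is caught in the same phase that introduced it.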

    The TestSprite CLI is open source under the Apache 2.0 license and available today. Installation is simple using “npm install -g @testsprite/cli” for machines with Node.js 2.0 or higher. Documentation and reference are available on GitHub.

    CoderCup: Publicly refereed AI agent coding battle

    In addition to the CLI open-source announcement, TestSprite launched CoderCup, a public competition and leaderboard in which AI agents build and deploy the same app against a single clock.

    The company used its newly open-sourced CLI as a neutral referee, in a nod to the soccer World Cup, which also kicked off today. The test agent scored each phase and linked each score to the public evidence supporting it.

    In the first event, several frontier agents went head-to-head, including Anthropic PBC’s Claude Code, OpenAI Group PBC’s Codex and Google LLC’s Antigravity, with TestSprite publishing the full results and per-phase scores openly at codercup.ai.

    “Most benchmarks score AI coding agents on a single number, but that’s not what developers actually feel,” Jiao said. “What matters day to day is stuff no leaderboard captures.”

    Those metrics include what agents get right the first time, how often they break something that used to work, and whether they can recover on their own.
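
    One way to turn those three ideas into numbers is sketched below in Python. The definitions are assumptions chosen for illustration (the article does not give CoderCup's scoring formulas), and the function and key names are hypothetical.

```python
from typing import Dict, List

PhaseResult = Dict[str, bool]  # feature name -> passed in this phase


def _ratio(hits: int, total: int) -> float:
    """Return hits/total, or 0.0 when there is nothing to count."""
    return hits / total if total else 0.0


def score(phases: List[PhaseResult]) -> Dict[str, float]:
    """Compute first-pass, regression and recovery rates across phases.

    Assumed definitions (not CoderCup's):
      first_pass - share of features that pass the first time they appear
      regression - share of previously passing features that now fail
      recovery   - share of previously failing features that now pass
    """
    seen = set()
    first_hits = first_total = 0
    broke = could_break = 0
    recovered = could_recover = 0
    previous: PhaseResult = {}
    for phase in phases:
        for feature, passed in phase.items():
            if feature not in seen:
                seen.add(feature)
                first_total += 1
                first_hits += passed
            elif feature in previous:
                if previous[feature]:
                    could_break += 1
                    broke += not passed
                else:
                    could_recover += 1
                    recovered += passed
        previous = phase
    return {
        "first_pass": _ratio(first_hits, first_total),
        "regression": _ratio(broke, could_break),
        "recovery": _ratio(recovered, could_recover),
    }
```

    Under these definitions, a regression figure like the 12% quoted earlier would mean that roughly one in eight checks on previously working features failed after a later phase.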

    The frontier players that took the field showed distinct strengths and weaknesses. Claude Code stood out for consistency, whereas Codex and Antigravity were the quickest overall, each finishing in under 100 cumulative minutes.

    Beijing Moonshot AI Technology Co. Ltd.’s Kimi went the opposite way: it was the slowest on the clock, at around 350 minutes, but the slow pace paid off. Despite being smaller and cheaper, Kimi posted the highest correctness in the field at 0.89 and the lowest total cost, outclassing agents many times its size.

    Agents that ran the fastest were rarely the ones that made the grade. Every agent, even the most stalwart, kept breaking work it had already completed.

    “We built CoderCup to make those things visible. The soccer faceoff is the fun part; the metrics underneath are the real point,” Jiao added.

    Image: SiliconANGLE/Microsoft Designer
