    10 best practices for optimizing generative and agentic AI costs

    By InfoForTech | June 14, 2026 | 7 min read

    As enterprises scale initiatives, the cost of developing, deploying and operating generative artificial intelligence models rises significantly. The shift toward AI agents can further increase costs because of poor architecture, limited operational maturity and weak governance.

    Information technology leaders can adopt these 10 best practices for optimizing costs, enabling them to achieve quicker business value and operational efficiency:

    1. Be objective about model accuracy, performance and cost tradeoffs

    Selecting the right model requires IT leaders to weigh accuracy, performance and cost objectively. A model tailored to the use case can deliver better performance and lower inference costs.

    Additionally, most application programming interface providers charge for input and output tokens separately, while some charge based on the number of characters. Normalizing these pricing models for a given use case enables an apples-to-apples comparison.
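
    As a minimal sketch of that normalization, the Python below converts per-token and per-character pricing into a single cost-per-request figure for one use case. The provider names, the prices and the ratio of four characters per token are illustrative assumptions, not published rates; real ratios vary by tokenizer and language.

```python
from dataclasses import dataclass

# Assumption for illustration only: average characters per token.
CHARS_PER_TOKEN = 4.0


@dataclass(frozen=True)
class Pricing:
    """Price per 1,000 units, where a unit is a token or a character."""
    name: str
    unit: str  # "token" or "char"
    input_per_1k: float
    output_per_1k: float


def cost_per_request(p: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Return the cost of one request in the pricing's currency."""
    if p.unit == "token":
        in_units, out_units = input_tokens, output_tokens
    elif p.unit == "char":
        # Convert the token-based workload into an estimated character count.
        in_units = input_tokens * CHARS_PER_TOKEN
        out_units = output_tokens * CHARS_PER_TOKEN
    else:
        raise ValueError(f"unknown pricing unit: {p.unit}")
    return (in_units * p.input_per_1k + out_units * p.output_per_1k) / 1000


# Hypothetical providers and prices.
provider_a = Pricing("provider-a", "token", input_per_1k=0.50, output_per_1k=1.50)
provider_b = Pricing("provider-b", "char", input_per_1k=0.10, output_per_1k=0.40)

# A use case averaging 2,000 input tokens and 500 output tokens per request.
for p in (provider_a, provider_b):
    print(p.name, round(cost_per_request(p, 2000, 500), 4))
```

    Expressing both providers in the same unit for the same workload is what makes the comparison fair; the workload profile matters as much as the list price.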

    Lastly, IT leaders should run extended pilots to vet their total cost of ownership assumptions and uncover any surprises or hidden costs.

    2. Create an AI model sandbox to promote safety, model choice and price transparency

    An excellent way for IT leaders to enable safe experimentation is to create an AI sandbox, which features available models in a self-service manner as part of a model catalog, underpinned by basic security and privacy principles.

    Besides creating an AI sandbox, IT leaders should create model cards for the models available in it, so that users have better visibility into how and where to use them. They should also make model costs transparent to users via reporting tools, which enables users to make more economical choices without jeopardizing accuracy or performance.
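
    One way to picture a model card paired with cost reporting is sketched below. The `ModelCard` fields, the model names, the prices and the usage log format are all hypothetical; an actual catalog would follow whatever schema the organization's platform uses.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelCard:
    """Hypothetical catalog entry; field names are illustrative."""
    model_id: str
    intended_use: str
    limitations: str
    cost_per_1k_tokens: float  # assumed blended input/output price


def usage_report(cards, usage_log):
    """Sum spend per (team, model) from (team, model_id, tokens) records."""
    by_id = {c.model_id: c for c in cards}
    totals = defaultdict(float)
    for team, model_id, tokens in usage_log:
        totals[(team, model_id)] += tokens / 1000 * by_id[model_id].cost_per_1k_tokens
    return dict(totals)


catalog = [
    ModelCard("small-model", "classification, short summaries", "weak on long reasoning", 0.2),
    ModelCard("large-model", "complex analysis, drafting", "higher latency and cost", 3.0),
]
log = [
    ("marketing", "large-model", 40_000),
    ("marketing", "small-model", 100_000),
    ("support", "small-model", 250_000),
]
report = usage_report(catalog, log)
for (team, model_id), spend in sorted(report.items()):
    print(f"{team:10s} {model_id:12s} {spend:8.2f}")
```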

    3. Balance upfront and operational costs in model augmentation and customization

    When customizing gen AI models, IT leaders must balance upfront investments, such as prompt engineering, retrieval-augmented generation and fine-tuning, with ongoing inference costs. Running costs can be optimized by effective context engineering or even by efficiently fine-tuning a model on a specific dataset through instruction tuning or continued pretraining.

    To balance costs, IT leaders should consider augmentation and customization options sequentially, moving to a more advanced approach only if a simpler one doesn’t meet the required output quality. They can also control gen AI costs by curating context inputs so that each inference uses only the necessary information.
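
    Curating context inputs could look something like the sketch below, which keeps only the most relevant retrieved chunks that fit a token budget. The function name, the relevance scores and the sample text are hypothetical, and counting whitespace-separated words is a stand-in for the model's real tokenizer.

```python
def curate_context(chunks, max_tokens):
    """Pick the most relevant chunks that fit within a token budget.

    `chunks` is a list of (relevance_score, text) pairs. Token counts are
    approximated by whitespace-separated words, which is an assumption;
    a real system would use the model's own tokenizer.
    """
    selected, used = [], 0
    # Visit chunks from most to least relevant, skipping any that overflow.
    for score, text in sorted(chunks, key=lambda c: c[0], reverse=True):
        cost = len(text.split())
        if used + cost <= max_tokens:
            selected.append(text)
            used += cost
    return selected, used


# Hypothetical retrieved chunks with relevance scores.
chunks = [
    (0.9, "Refund policy: 30 days with receipt."),
    (0.2, "Company history since 1985 and founders."),
    (0.7, "Refunds are issued to the original payment method."),
]
selected, used = curate_context(chunks, max_tokens=15)
print(used, selected)
```

    Because input tokens are billed on every call, trimming low-relevance context reduces cost on each inference rather than once.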

    4. Understand the tradeoffs of self-hosting

    Self-hosting gen AI models (often on-premises) can seem attractive for businesses seeking increased control and data privacy. IT leaders must be aware of the potential tradeoffs, as the list of cost drivers for self-hosting is extensive.

    The most underestimated cost is the specialized talent required to operate gen AI at scale. IT leaders will have to consider the complexity and cost implications before opting to self-host gen AI models. They’ll also need to evaluate their organization’s capacity for the upfront investment, ongoing maintenance and expertise required.

    5. Proactively manage software-as-a-service applications

    SaaS vendors are packaging AI agents in inconsistent ways via bundled offerings, forced upgrades, optional tiers and add-ons. Each carries different cost, adoption and lock-in risks for organizations.

    IT leaders will need to evaluate the real productivity impact of AI features, negotiate transparent cost attribution and avoid enterprise-wide upgrades without proven return on investment. They should adopt a use-case-driven upgrade strategy, enabling AI only for roles or workflows where measurable gains justify the spend. At the same time, they should establish strict usage and access governance to prevent consumption sprawl and surprise costs, and demand transparent AI cost breakdowns from vendors.

    6. Negotiate new pricing models for agentic AI

    As AI agent pricing models evolve to align more closely with expectations on value delivery, IT leaders who anchor their investments in clear business value will be best positioned to ensure long-term impact and sustainable returns.

    IT leaders can support this by pushing SaaS vendors for flexible and predictable pricing models. They can also run controlled AI agent pilots and track the cost per task, time saved and outcomes. From there, they can build internal benchmarks and agree on value-based pricing metrics before scaling.
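
    The pilot metrics mentioned here can be computed from a simple task log, as in the hedged sketch below. The `TaskRun` record, its fields and the sample numbers are hypothetical; how cost and time saved are attributed to a task is left to the organization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskRun:
    """One agent task from a pilot; all fields are illustrative."""
    cost: float           # spend attributed to this task
    minutes_saved: float  # estimate versus the manual baseline
    succeeded: bool       # whether the outcome was accepted


def pilot_benchmark(runs):
    """Summarize cost per task, success rate and time saved per unit of spend."""
    if not runs:
        raise ValueError("at least one task run is required")
    total_cost = sum(r.cost for r in runs)
    successes = [r for r in runs if r.succeeded]
    saved = sum(r.minutes_saved for r in successes)
    return {
        "tasks": len(runs),
        "success_rate": len(successes) / len(runs),
        "cost_per_task": total_cost / len(runs),
        "cost_per_successful_task": total_cost / len(successes) if successes else float("inf"),
        "minutes_saved_per_unit_cost": saved / total_cost if total_cost else 0.0,
    }


# Hypothetical pilot data.
runs = [
    TaskRun(cost=0.40, minutes_saved=12, succeeded=True),
    TaskRun(cost=0.60, minutes_saved=0, succeeded=False),
    TaskRun(cost=0.50, minutes_saved=8, succeeded=True),
    TaskRun(cost=0.50, minutes_saved=10, succeeded=True),
]
summary = pilot_benchmark(runs)
print(summary)
```

    Dividing total spend by successful tasks, not all tasks, charges failed attempts to the outcomes that were actually delivered, which is closer to what a value-based pricing discussion needs.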

    7. Automate model selection, caching and routing

    Cost differences between models make manual selection challenging for IT leaders, so automated model selection is an attractive solution.

    A new category of tools called AI gateways can help control costs by enforcing policies to track and manage access to AI services and by providing features such as caching and model routing to reduce costs.

    IT leaders should create a systematic decision process for selecting different large language models for different tasks to reduce costs while achieving the required performance. This first step toward automation in itself can lead to large cost savings. Additionally, they should use AI gateways as a cost optimization and governance control plane that shapes how all AI usage occurs across the enterprise.
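
    To make routing and caching concrete, here is a toy gateway written under stated assumptions: the routing table, the model names and the `fake_backend` function are hypothetical, and the cache matches exact prompts only. It is not the API of any particular AI gateway product.

```python
import hashlib

# Hypothetical routing policy: task type -> model chosen for that task.
ROUTES = {
    "classification": "small-model",
    "summarization": "medium-model",
    "reasoning": "large-model",
}
DEFAULT_MODEL = "medium-model"


class Gateway:
    """Routes each request to a model and caches exact-match responses."""

    def __init__(self, call_model):
        self._call_model = call_model  # function (model, prompt) -> response
        self._cache = {}
        self.cache_hits = 0
        self.backend_calls = 0

    def complete(self, task_type, prompt):
        model = ROUTES.get(task_type, DEFAULT_MODEL)
        # Cache key covers both the model and the prompt text.
        key = hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()
        if key in self._cache:
            self.cache_hits += 1
            return self._cache[key]
        self.backend_calls += 1
        response = self._call_model(model, prompt)
        self._cache[key] = response
        return response


def fake_backend(model, prompt):
    # Stand-in for a real model call, for illustration only.
    return f"[{model}] {prompt[:20]}"


gateway = Gateway(fake_backend)
gateway.complete("classification", "Is this ticket urgent?")
gateway.complete("classification", "Is this ticket urgent?")  # served from cache
gateway.complete("reasoning", "Plan the migration.")
print(gateway.backend_calls, gateway.cache_hits)
```

    Keeping the routing table in one place is the point of the design: the policy can be changed centrally without touching the applications that call the gateway.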

    8. Build a shared RAG platform to prevent duplication

    A shared RAG platform is essential since it can prevent every team from building its own ingestion, chunking and embedding pipelines, which can lead to massive data and infrastructure duplication, and other issues.

    IT leaders should stand up a unified ingestion and embedding service, deploy a governed shared vector store and expose standardized APIs that teams can use for all gen AI applications and agents. They should also enforce policies that prevent team-level RAG sprawl and continuously monitor retrieval quality and cost metrics to optimize over time.
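
    A unified ingestion and embedding service might avoid duplicate work roughly as follows. This is a sketch only: `SharedIngestion`, the fixed-size word chunking and the `toy_embed` function are hypothetical stand-ins, and the in-memory dictionary takes the place of a governed vector store.

```python
import hashlib


def chunk_text(text, chunk_size=50):
    """Split text into chunks of at most `chunk_size` words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


class SharedIngestion:
    """Embeds each unique chunk once, keyed by a hash of its content."""

    def __init__(self, embed):
        self._embed = embed  # function chunk -> embedding vector
        self._store = {}     # content hash -> (chunk text, embedding)
        self.embed_calls = 0

    def ingest(self, text, chunk_size=50):
        keys = []
        for chunk in chunk_text(text, chunk_size):
            key = hashlib.sha256(chunk.encode("utf-8")).hexdigest()
            if key not in self._store:
                # Only content not seen before incurs embedding cost.
                self.embed_calls += 1
                self._store[key] = (chunk, self._embed(chunk))
            keys.append(key)
        return keys


def toy_embed(chunk):
    # Stand-in embedding for illustration only.
    return [float(len(chunk)), float(len(chunk.split()))]


service = SharedIngestion(toy_embed)
doc = "Employees may carry over up to five unused vacation days into the next calendar year."
team_a_keys = service.ingest(doc, chunk_size=8)
team_b_keys = service.ingest(doc, chunk_size=8)  # second team, same document
print(service.embed_calls, team_a_keys == team_b_keys)
```

    When two teams ingest the same document, the second request reuses the stored embeddings instead of paying for them again.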

    9. Educate users on cost-effective use of gen AI

    Users must understand how to use gen AI efficiently to avoid waste and unnecessary costs. With so many applications, models and platforms to choose from, users should be educated on cost management best practices.

    IT leaders should organize workshops where employees can experiment with LLMs and AI agents and analyze successful and unsuccessful prompts to illustrate best practices and common pitfalls.

    10. Analyze visible and hidden costs on an ongoing basis

    Gen AI platform investments have a number of visible and hidden costs that need to be considered upfront to make an informed decision, including data costs, talent costs and application setup and integration costs.

    IT leaders will need to evaluate these cost factors and include them in their total-cost-of-ownership assessment from the start. Additionally, they must focus on mitigating the key cost drivers: the variable costs that make the biggest difference to the TCO.
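
    A back-of-the-envelope TCO model can show which cost drivers dominate over time. In the sketch below, every cost item and figure is a made-up placeholder, and the split into one-time and monthly costs is a simplifying assumption.

```python
def total_cost_of_ownership(one_time, monthly, months):
    """Return (total, breakdown) for the period.

    `one_time` maps cost items to a single upfront amount; `monthly` maps
    cost items to a recurring amount that is multiplied by `months`.
    """
    breakdown = dict(one_time)
    for item, per_month in monthly.items():
        breakdown[item] = breakdown.get(item, 0) + per_month * months
    return sum(breakdown.values()), breakdown


# Placeholder figures for illustration only.
one_time = {"application setup and integration": 120_000, "data preparation": 60_000}
monthly = {"inference": 15_000, "talent": 40_000, "data storage and pipelines": 5_000}

total, breakdown = total_cost_of_ownership(one_time, monthly, months=36)
top_driver = max(breakdown, key=breakdown.get)
print(total, top_driver)
```

    Re-running the calculation with different time horizons or usage assumptions is a quick way to see how sensitive the total is to each recurring cost.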

    As organizations move from pilots to production, costs can escalate quickly. By implementing these 10 best practices, IT leaders can maximize the return on their gen AI investment and unlock its full potential.

    Arun Chandrasekaran is a distinguished VP analyst at Gartner, within the global CIO practice, where his research focuses on artificial intelligence. He wrote this article for SiliconANGLE. Chandrasekaran and other Gartner analysts will present how CIOs and IT executives can become agents of change in their organizations and harness AI for digital transformation at the Gartner IT Symposium/Xpo in Orlando, Florida, Oct. 19-22.

