    Scientists found the key to controlling AI behavior

    By InfoForTech, February 20, 2026

    For years, the inner workings of large language models (LLMs) like Llama and Claude have been compared to a “black box”: vast, complex, and notoriously difficult to steer. But a team of researchers from UC San Diego and MIT has published a study in the journal Science suggesting that this box isn’t quite as mysterious as we thought.

    The team has discovered that complex concepts within AI – ranging from specific languages like Hindi to abstract ideas like conspiracy theories – are actually stored as simple, straight lines, or vectors, within the model’s mathematical space.

    The researchers traced these directions with a tool called the Recursive Feature Machine (RFM), a feature extraction technique that identifies the linear patterns representing concepts, from moods and fears to complex reasoning. Once a concept’s direction is mapped, it can be “nudged”: by adding or subtracting the corresponding vector inside the model, the team could alter the model’s behavior without expensive retraining or complicated prompts.
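
To make the "add or subtract a vector" idea concrete, here is a minimal sketch on synthetic data. It is not the study's method: the direction is estimated with a simple difference of means rather than RFM, and the toy activations, the hidden size, and the names `concept_direction`, `steer`, and `projection` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 16  # hypothetical hidden size, far smaller than a real LLM layer

# Hypothetical ground-truth direction, used only to build the toy data.
true_dir = np.zeros(dim)
true_dir[0] = 1.0

# Toy "activations": examples that express the concept are shifted along true_dir.
without_concept = rng.normal(size=(200, dim))
with_concept = rng.normal(size=(200, dim)) + 3.0 * true_dir


def concept_direction(pos, neg):
    """Unit vector from the mean 'without' activation to the mean 'with' activation.

    Difference of means is a stand-in here; the study uses RFM instead.
    """
    diff = pos.mean(axis=0) - neg.mean(axis=0)
    return diff / np.linalg.norm(diff)


def steer(hidden, direction, strength):
    """Add (strength > 0) or subtract (strength < 0) the concept direction."""
    return hidden + strength * direction


def projection(hidden, direction):
    """How strongly a hidden state points along the concept direction."""
    return float(hidden @ direction)


direction = concept_direction(with_concept, without_concept)
hidden = rng.normal(size=dim)            # one toy hidden state
boosted = steer(hidden, direction, 4.0)
suppressed = steer(hidden, direction, -4.0)

for name, h in [("suppressed", suppressed), ("original", hidden), ("boosted", boosted)]:
    print(f"{name:>10}: {projection(h, direction):+.2f}")
```

Because the direction has unit length, steering with strength `s` shifts the projection onto the concept by exactly `s`. In a real model the same addition would be applied to a layer's hidden states during generation.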

    The method is also efficient. Using a single standard GPU (the NVIDIA A100), the team could identify and steer a concept in under a minute with fewer than 500 training samples.

    The practical applications of this “surgical” approach to AI are immediate. In one experiment, researchers steered a model to improve its ability to translate Python code into C++. By isolating the “logic” of the code from the “syntax” of the language, the steered model outperformed standard versions that were simply asked to “translate” via a text prompt.

    The researchers also found that internal “probing” of these vectors is a more effective way to catch AI hallucinations or toxic content than asking the AI to judge its own work. Essentially, the model often “knows” it is lying or being toxic internally, even if its final output suggests otherwise. By looking at the internal math, researchers can spot these issues before a single word is generated.
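
The idea of probing can be sketched with a plain linear probe trained on activations. Everything below is an assumption for illustration: the data is synthetic, the probe is ordinary least squares rather than the study's RFM-based detector, and `fit_probe` and `probe_scores` are hypothetical helper names.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 16  # hypothetical hidden size

# Toy activations: label 1 (e.g. "hallucinating") shifts the state along one axis.
concept = np.zeros(dim)
concept[3] = 1.0
labels = rng.integers(0, 2, size=400)
acts = rng.normal(size=(400, dim)) + 3.0 * labels[:, None] * concept


def fit_probe(x, y):
    """Fit a least-squares linear probe (weights plus a bias term)."""
    xb = np.hstack([x, np.ones((len(x), 1))])
    w, *_ = np.linalg.lstsq(xb, y.astype(float), rcond=None)
    return w


def probe_scores(w, x):
    """Score activations; values above 0.5 are treated as 'concept present'."""
    xb = np.hstack([x, np.ones((len(x), 1))])
    return xb @ w


# Train on the first 300 examples and evaluate on the held-out 100.
w = fit_probe(acts[:300], labels[:300])
flags = probe_scores(w, acts[300:]) > 0.5
accuracy = float((flags == labels[300:].astype(bool)).mean())
print(f"held-out probe accuracy: {accuracy:.2f}")
```

The probe reads only the internal state, so in principle it can raise a flag before any text is generated, which is the property the researchers highlight.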

    However, the same technology that makes AI safer could also make it more dangerous. The study demonstrated that by “decreasing” the importance of the concept of refusal, the researchers could effectively “jailbreak” the models. In tests, steered models bypassed their own guardrails to provide instructions on illegal activities or promote debunked conspiracy theories.

    Perhaps the most surprising finding was the universality of these concepts. A “conspiracy theorist” vector extracted from English data worked just as effectively when the model was speaking Chinese or Hindi. This supports the “Linear Representation Hypothesis” – the idea that AI models organize human knowledge in a structured, linear way that transcends individual languages.

    The study focused on open-source models such as Meta’s Llama and DeepSeek, and also included OpenAI’s GPT-4o, but the researchers believe the findings apply across the board. They also report that as models get larger and more sophisticated, they become more steerable, not less.

    The team’s next goal is to refine these steering methods so they adapt to specific user inputs in real time, potentially leading to a future where AI isn’t just a chatbot we talk to, but a system we can mathematically “tune” for accuracy and safety.
