Learn from our AI Security Experts

Discover every model. Secure every workflow. Prevent AI attacks - without slowing innovation.

All Resources

Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.
Insights
xx
min read

RSAC 2025 Takeaways

RSA Conference 2025 may be over, but conversations are still echoing about what’s possible with AI and what’s at risk. This year’s theme, “Many Voices. One Community,” reflected the growing understanding that AI security isn’t a challenge one company or sector can solve alone. It takes shared responsibility, diverse perspectives, and purposeful collaboration.

Security for AI
Insights
xx
min read

Universal Bypass Discovery: Why AI Systems Everywhere Are at Risk

HiddenLayer researchers have developed the first single, universal prompt injection technique, post-instruction hierarchy, that successfully bypasses safety guardrails across nearly all major frontier AI models. This includes models from OpenAI (GPT-4o, GPT-4o-mini, and even the newly announced GPT-4.1), Google (Gemini 1.5, 2.0, and 2.5), Microsoft (Copilot), Anthropic (Claude 3.7 and 3.5), Meta (Llama 3 and 4 families), DeepSeek (V3, R1), Qwen (2.5 72B), and Mixtral (8x22B).

No items found.
Insights
xx
min read

How To Secure Agentic AI

Artificial Intelligence is entering a new chapter defined not just by generating content but by taking independent, goal-driven action. This evolution is called agentic AI. These systems don’t simply respond to prompts; they reason, make decisions, contact tools, and carry out tasks across systems, all with limited human oversight. In short, they are the architects of their own workflows.

No items found.
Insights
xx
min read

What’s New in AI

The past year brought significant advancements in AI across multiple domains, including multimodal models, retrieval-augmented generation (RAG), humanoid robotics, and agentic AI.

Security for AI
Insights
xx
min read

Securing Agentic AI: A Beginner's Guide

The rise of generative AI has unlocked new possibilities across industries, and among the most promising developments is the emergence of agentic AI. Unlike traditional AI systems that respond to isolated prompts, agentic AI systems can plan, reason, and take autonomous action to achieve complex goals.

Security for AI
Insights
xx
min read

AI Red Teaming Best Practices

Organizations deploying AI must ensure resilience against adversarial attacks before models go live. This blog covers best practices for <a href="https://hiddenlayer.com/innovation-hub/a-guide-to-ai-red-teaming/">AI red teaming, drawing on industry frameworks and insights from real-world engagements by HiddenLayer’s Professional Services team.

Security for AI
Insights
xx
min read

AI Security: 2025 Predictions Recommendations

It’s time to dust off the crystal ball once again! Over the past year, AI has truly been at the forefront of cyber security, with increased scrutiny from attackers, defenders, developers, and academia. As various forms of generative AI drive mass AI adoption, we find that the threats are not lagging far behind, with LLMs, RAGs, Agentic AI, integrations, and plugins being a hot topic for researchers and miscreants alike.

Security for AI
Insights
xx
min read

Securely Introducing Open Source Models into Your Organization

Open source models are powerful tools for data scientists, but they also come with risks. If your team downloads models from sources like Hugging Face without security checks, you could introduce security threats into your organization. You can eliminate this risk by introducing a process that scans models for vulnerabilities before they enter your organization and are utilized by data scientists. You can ensure that only safe models are used by leveraging HiddenLayer's Model Scanner combined with your CI/CD platform. In this blog, we'll walk you through how to set up a system where data scientists request models, security checks run automatically, and approved models are stored in a safe location like cloud storage, a model registry, or Databricks Unity Catalog.

Gen AI
Security for AI
Insights
xx
min read

Enhancing AI Security with HiddenLayer’s Refusal Detection

Security risks in AI applications are not one-size-fits-all. A system processing sensitive customer data presents vastly different security challenges compared to one that aggregates internet data for market analysis. To effectively safeguard an AI application, developers and security professionals must implement comprehensive mechanisms that instruct models to decline contextually malicious requests—such as revealing personally identifiable information (PII) or ingesting data from untrusted sources. Monitoring these refusals provides an early and high-accuracy warning system for potential malicious behavior.

No items found.

Understand AI Security, Clearly Defined

Explore our glossary to get clear, practical definitions of the terms shaping AI security, governance, and risk management.