Most of the industry is focused on using AI to reason about vulnerabilities.
However, frontier models require more than reasoning to be harnessed in modern security programs. They require memory, context, and systems that drive work to closure.
Anthropic’s own follow-up to their research points to the same conclusion: once discovery is solved, closing the patch gap becomes the defining problem for security teams.
If it’s not patched, it’s still exploitable - no matter how well it was triaged. Most security programs aren’t built for that world. A small security team cannot effectively interface with a large engineering organization to burn down the backlog. There simply isn’t enough human security talent to keep up.
So while the industry is solving for reasoning, the real gap is execution.
A handful of security engineers cannot coordinate, negotiate, and drive remediation across thousands of developers.
In practice, product security teams struggle with tasks downstream of discovery, like triage, exploit validation, negotiating with engineering, and chasing fixes across the org.
That’s thousands of hours of labor. And the humans to perform that labor do not exist.
At Nullify we’ve built a system of agents that replaces that work. It is delivering the outcomes typically expected from a fully ramped product security team or managed offering in areas like automated vulnerability remediation: customers can now patch real exploits at scale within their SLAs without being constrained by headcount.
We didn’t start here. The first agents we built at Nullify were focused on investigating findings to triage them for exploitability and impact. As we made progress here and replaced legacy scanners by helping security teams focus on findings that were important to fix, we realised that even with a perfectly triaged backlog there is a clear dependency on labor to drive work to closure.
Signal to noise on the backlog is just one subset of the product security value chain, and while customers had fewer, higher-quality issues, the work didn’t go away - it just moved.
Even with perfect triage, the same pattern kept showing up: findings would enter the backlog, and then get stuck in the last mile.
That last mile looks like:
Debating over ownership, severity, and timing, all while SLAs slip waiting on approval, isn’t security work. It’s a coordination problem born from the lack of security headcount that can interface with engineering and drive work to closure.
Across customers, we saw the same thing: the majority of time wasn’t spent validating our findings — it was spent negotiating, routing, and following up on fixes.
Improving triage solved only a small part of the end-to-end product security value chain.
It didn’t remove the bottleneck. It only made it more visible.
Better signal just meant more confidence in the issue and a faster handoff to engineering, which only made the same bottlenecks in coordination, ownership, and execution more apparent.
This is when it became clear:
The last mile isn’t a small part of the problem — it is the problem.
In a post-Mythos world, where discovery is effectively infinite, this layer becomes the limiting factor. Not detection or reasoning, but execution. If a vulnerability isn’t fixed, it doesn’t matter how well it was triaged.
It’s still exploitable.
That’s the gap most product security programs are leaving open, and it’s where breaches will happen more and more.
Our answer to this gap was Campaigns, an agentic interface for orchestrating the resolution of vulnerabilities at scale.
The shift we wanted Campaigns to make in how security teams run their product security programs was inspired by Amazon’s Security Guardians, an internal Amazon program built to empower developers to act as embedded security champions within their own product teams. These volunteers act as a "security conscience," bridging the gap between development and central security teams to improve security by design, reduce bottlenecks, and accelerate secure product launches.
It scales through distributed ownership.
But distributed ownership only works if execution does. The result is a different operating model:
But most companies cannot hire enough product security headcount in central security to build a program like that.
With Campaigns, Nullify enables you to align your remediation goals with your security program’s objectives, no matter your headcount.
First, tell Nullify which subset of the backlog you want remediated and by what date, and how many of your developers’ story points you want assigned to reviewing Nullify’s fixes. Or let Nullify generate default campaigns based on what it thinks you need to fix first with the capacity you have available, using its understanding of your company’s security posture, the risk model in Vault, and your backlog.
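To make those inputs concrete, here is a minimal sketch of what a campaign definition could look like. It is not Nullify’s actual API: the `Finding` and `Campaign` types, the `review_points` estimate, and the `plan_campaign` function are all hypothetical names invented for illustration, and the greedy selection rule is an assumption about one reasonable way to spend a review budget.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date

# Hypothetical ordering: lower number means fix sooner.
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass(frozen=True)
class Finding:
    id: str
    severity: str        # one of the SEVERITY_ORDER keys
    review_points: int   # assumed estimate of story points to review the fix


@dataclass(frozen=True)
class Campaign:
    name: str
    severities: frozenset[str]   # the subset of the backlog in scope
    due: date                    # the date remediation should be done by
    review_budget_points: int    # developer story points set aside for review


def plan_campaign(campaign: Campaign, backlog: list[Finding]) -> list[Finding]:
    """Pick in-scope findings, most severe first, until the review budget is spent."""
    in_scope = [f for f in backlog if f.severity in campaign.severities]
    in_scope.sort(key=lambda f: (SEVERITY_ORDER[f.severity], f.id))

    selected: list[Finding] = []
    remaining = campaign.review_budget_points
    for finding in in_scope:
        # Skip findings whose review cost no longer fits in the budget.
        if finding.review_points <= remaining:
            selected.append(finding)
            remaining -= finding.review_points
    return selected


backlog = [
    Finding("A", "high", 3),
    Finding("B", "critical", 5),
    Finding("C", "low", 1),
    Finding("D", "high", 4),
]
campaign = Campaign(
    name="Q3 criticals and highs",
    severities=frozenset({"critical", "high"}),
    due=date(2025, 9, 30),
    review_budget_points=8,
)
planned_ids = [f.id for f in plan_campaign(campaign, backlog)]
print(planned_ids)
```

In this toy backlog, the campaign covers the critical finding and one high finding, then stops because the remaining high finding would exceed the eight story points set aside for review. A real system would presumably rank by much richer context than a severity label.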
With Campaigns, the concept of distributed security ownership is democratized as the labor constraint is removed.
Customers achieve the output of a fully ramped product security team — without being constrained by headcount.
Exploits are not just identified, but patched at scale.
Backlogs don’t grow. They get burned down in alignment with engineering, because security can align its remediation goals with the program’s broader objectives.
With a system designed for execution:
Entire classes of work disappear:
What remains is a system that takes responsibility for outcomes.
This is the missing layer in AI security: execution.
Next, we’ll go deeper into how we solved triage — how Vault assembles context and ontology across code, cloud, and business logic to validate exploitability with agentic tool use, and how we built systems to ingest and reason over unstructured organizational context.