jixiaxue 知识库
blog / anthropic-blog · 2026-05-23-glasswing-initial-update

2026-05-23-glasswing-initial-update

3 个章节 · 0 条产出 · 0 条证据
2026-05-23

Project Glasswing:初步进展报告

Project Glasswing 初步进展报告 — 结构化总结

一句话概要

Anthropic 公布 Project Glasswing 首月战报:Claude Mythos Preview 联合 50 家合作伙伴,在全球最关键的软件中发现超过一万个高危/严重漏洞,安全瓶颈从”找漏洞”彻底转向”修漏洞”。

核心数据

指标数值
合作伙伴数量~50 家
合作伙伴发现的高危/严重漏洞> 10,000 个
扫描的开源项目> 1,000 个
开源高危/严重漏洞(模型估计)6,202 个
开源漏洞总计(含中低危)23,019 个
经人工审核的真阳性率90.6%
经审核确认为高危/严重62.4%(1,094 个)
已向维护者披露的高危/严重漏洞530 个
已修补并发布公告65 个
平均修补周期2 周
Claude Security 企业修补漏洞数2,100+(3 周内)

关键发现

1. 漏洞发现效率飞跃

  • 多数合作伙伴的漏洞发现速率提高 10 倍以上
  • Cloudflare:发现 2,000 个漏洞(400 高危/严重),假阳性率优于人类
  • Mozilla:Firefox 150 中发现 271 个漏洞,比 Opus 4.6 在 Firefox 148 中多 10 倍

2. 外部权威验证

  • 英国 AI 安全研究所:Mythos Preview 是首个端到端攻克两个网络靶场的模型
  • XBOW:在 Web 漏洞利用基准中”显著超越所有现有模型”
  • ExploitBench / ExploitGym:两个学术基准中均为最强表现者

3. 典型案例

  • wolfSSL 证书伪造漏洞(CVE-2026-5194):构造可伪造证书的利用代码,影响数十亿设备
  • 银行反欺诈:帮助合作伙伴银行检测并阻止 150 万美元欺诈性电汇

4. 行业连锁反应

  • Palo Alto Networks 最新版本补丁量增至 5 倍
  • 微软宣布 Patch Tuesday 补丁数量将”持续增长”
  • Oracle 漏洞发现和修复速度提升数倍

核心矛盾:发现快 vs. 修复慢

AI 发现漏洞的速度远超人类修补能力,形成”漏洞堆积”。开源维护者产能严重受限,部分维护者甚至要求放慢披露速度。攻防不对称性被本质颠覆——Mythos 级模型使攻击成本趋近于零,迫使整个安全生态系统重新适应。

Anthropic 的应对措施

措施说明
Claude Security(公测)Enterprise 客户代码库扫描 + 修复建议
网络验证计划允许安全专业人员绕过某些安全限制进行合法研究
安全工具包开放Skills、扫描框架、威胁模型构建器向合格客户开放
OpenSSF Alpha-Omega 合作支持开源维护者处理漏洞报告
Claude for Open Source为开源维护者和贡献者提供支持
学术基准资助支持 ExploitBench、ExploitGym 开发

未来方向

  1. 扩展合作伙伴:与美国及盟国政府合作,将 Glasswing 扩展到更多组织
  2. Mythos 级别模型发布:待开发出足够强大的安全措施后,才会正式公开发布
  3. 持续开源扫描:继续扫描开源代码,预计高危漏洞数量将继续上升

战略意义

Project Glasswing 本质上是一场时间竞赛:在其他 AI 公司开发出同等能力模型(可能缺乏安全措施)之前,尽可能多地加固关键基础设施。Anthropic 选择了”先防后攻”的路线——不公开发布 Mythos 级模型,而是通过受控合作让防御者先建立优势。

Project Glasswing:初步进展报告

Project Glasswing:初步进展报告

上个月,我们启动了 Project Glasswing——一项协作计划,旨在利用 AI 在日益强大的模型被攻击者利用之前,率先加固全球最关键的软件。

自那以来,我们与大约 50 个合作伙伴共同使用 Claude Mythos Preview,在全球最具系统重要性的软件中发现了超过一万个高危或严重漏洞。软件安全的进展过去受限于漏洞发现速度,而现在,瓶颈变成了验证、披露和修补 AI 发现的大量漏洞的速度。

在这篇文章中,我们将讨论 Project Glasswing 运行头几周以来在网络安全这一关键挑战上的所学。我们聚焦于 Mythos Preview 性能的早期公开证据、扫描数千个开源项目的初步结果,以及这些进展对当今网络防御者的意义。我们还将介绍 Project Glasswing 的下一步计划,以及我们对未来发布 Mythos 级别模型的思考。

早期成果

关于讨论 Mythos Preview 发现的方式

软件行业的惯例是在漏洞被发现后 90 天内披露(如果在 90 天前已有补丁,则在补丁发布后约 45 天披露)。这为终端用户在攻击者利用漏洞之前更新软件留出了时间。我们自己的协调漏洞披露政策也采取了这种方式。

然而,这意味着已披露的漏洞是 AI 模型网络能力加速前沿的滞后指标:我们目前还无法在不危及终端用户的情况下完整公布合作伙伴使用 Mythos Preview 的发现细节。因此,我们提供模型性能的代表性示例和截至目前的汇总统计数据。一旦 Mythos Preview 发现的漏洞补丁被广泛部署,我们将提供更多详细信息。

来自合作伙伴和外部测试者的证据

Project Glasswing 的初始合作伙伴构建和维护着互联网及其他关键基础设施运转所依赖的基础软件。修复其代码中的缺陷可以降低依赖这些软件的众多组织的风险,从而降低数十亿终端用户的风险。

运行一个月后,大多数合作伙伴各自在其软件中发现了数百个严重或高危漏洞,合计超过一万个。多个合作伙伴表示,他们的漏洞发现速率提高了十倍以上。例如,Cloudflare 在其关键路径系统中发现了 2,000 个漏洞(其中 400 个为高危或严重级别),假阳性率被 Cloudflare 团队认为优于人类测试人员。

这与外部测试者对 Mythos Preview 性能的体验以及近期对该模型的额外评估一致:

  • 英国 AI 安全研究所报告称,Mythos Preview 是首个端到端解决其两个网络靶场(多步网络攻击模拟)的模型;
  • Mozilla 在测试 Mythos Preview 时,在 Firefox 150 中发现并修复了 271 个漏洞——是使用 Claude Opus 4.6 在 Firefox 148 中发现数量的十倍以上;
  • XBOW(独立安全平台)报告称,Mythos Preview 在其网络漏洞利用基准测试中”显著超越所有现有模型”,在 token 级别提供了”前所未有的精确度”;
  • ExploitBenchExploitGym 是两个新发布的学术基准测试,用于衡量模型的漏洞利用开发能力,Mythos Preview 在两项中均为最强表现者。我们在前沿红队博客上详细讨论了这些基准测试对该模型的评价。

更普遍地,我们看到已修补的软件正以更快速度推出。Palo Alto Networks 的最新版本包含了五倍以上的补丁。微软报告称新补丁数量将”在一段时间内持续增长”。Oracle 在其产品和云上发现并修复漏洞的速度比以前快了数倍

Mythos Preview 还被证明对其他类型的安全工作有用。例如,在我们的一家 Glasswing 合作伙伴银行中,Mythos Preview 在威胁行为者入侵客户邮箱账户并拨打欺诈电话后,帮助检测并阻止了一笔 150 万美元的欺诈性电汇。

开源软件

过去几个月,Anthropic 使用 Mythos Preview 扫描了超过 1,000 个开源项目,这些项目共同支撑着互联网的大部分运转——也包括我们自己的基础设施。

到目前为止,Mythos Preview 估计发现了 6,202 个高危或严重漏洞(总计 23,019 个,包括中低危级别)。

其中 1,752 个高危或严重级别漏洞已由六家独立安全研究公司之一(少数情况下由我们自行)仔细评估。其中 90.6%(1,587 个)被证实为有效的真阳性,62.4%(1,094 个)被确认为高危或严重级别。这意味着,即使 Mythos Preview 不再发现新漏洞,按照当前审核后的真阳性率,它有望在开源代码中发现近 3,900 个高危或严重漏洞——这还不包括为 Project Glasswing 合作伙伴发现的漏洞。需要说明的是,我们打算继续扫描开源代码一段时间,因此预计这个数字还会上升。

Mythos Preview 检测到的一个开源漏洞示例来自 wolfSSL——一个以安全性著称的开源加密库,被全球数十亿设备使用。Mythos Preview 构建了一个漏洞利用,攻击者可以伪造证书,使其能够(例如)托管一个银行或电子邮件提供商的假网站。该网站对终端用户看起来完全合法,但实际上由攻击者控制。我们将在未来几周发布对这个已修补漏洞(编号 CVE-2026-5194)的完整技术分析。

如上所述,修复此类漏洞的瓶颈在于人类进行分类、报告、设计和部署补丁的能力。利用 Mythos Preview,发现漏洞本身已变得容易得多。我们创建了一个开源漏洞仪表板(如下),展示了我们披露流程的不同阶段,并将持续跟踪进展。该仪表板显示所有严重级别的漏洞,而非仅限于 Mythos Preview 初始评估为高危或严重的子集。注意每个阶段的急剧下降,反映了验证和修复每个漏洞所需的大量人力投入。

我们的漏洞分类流程非常严格。首先,我们或合作的外部安全公司复现 Mythos 发现的问题并重新评估其严重程度。确认漏洞真实后,我们检查是否已有修复措施,并向软件维护者撰写详细报告。我们在此处非常谨慎:除了维护开源软件的常规挑战外,维护者还面临着大量低质量 AI 生成的漏洞报告的冲击。确实,多位维护者告诉我们他们目前产能严重受限,一些维护者甚至要求我们放慢披露速度,因为他们需要更多时间设计补丁。(平均而言,Mythos Preview 发现的高危或严重漏洞需要两周来修补。)

应维护者要求,我们有时会直接披露漏洞,不进行进一步评估。我们已报告了 1,129 个此类未审核的漏洞,其中 Mythos Preview 估计有 175 个为高危或严重级别。

我们估计到目前为止已向维护者披露了 530 个高危或严重漏洞。这基于直接披露情况下 Claude 对严重程度的评估,以及可用情况下维护者或安全合作伙伴的评估。另有 827 个已确认漏洞(以同样方式估计为高危或严重级别)正在尽快披露中。

已报告的 530 个高危或严重漏洞中,75 个已被修补,65 个已获公开安全公告。补丁数量仍相对较低,原因有三:第一,我们仍处于协调漏洞披露政策规定的 90 天窗口期早期,预计很快会有更多补丁。第二,我们可能低估了补丁数量,因为某些漏洞在没有公开公告的情况下被修补——这些情况下我们依靠 Claude 自行扫描检测补丁。第三,低补丁量反映了一个真实问题:即使我们的披露速度相对缓慢,Mythos Preview 仍在给本已超负荷的安全生态系统增加负担。

发现漏洞的相对容易与修复漏洞的困难形成了网络安全的重大挑战。成功应对这一挑战将使我们的软件比以前安全得多。下面我们讨论网络防御者可以采取的一些适应措施。

适应网络安全的新阶段

具备类似 Mythos Preview 网络安全技能的模型将很快更广泛地可用。软件行业迫切需要更大规模的努力来管理这些模型将产生的大量发现。

目前,从发现漏洞到创建补丁,再到补丁被终端用户广泛部署,通常存在很长的时间差。这为攻击者利用关键软件留下了重大窗口。Mythos 级别的模型显著缩短了发现和利用漏洞所需的时间和成本,放大了与这些时间差相关的风险。最终,Mythos 级别的模型将使开发者能够在部署前捕获漏洞,从而构建更安全的软件。但在这个过渡期——漏洞被快速发现而缓慢修补——存在新的风险。

软件开发者和用户应立即行动以降低这些风险。以下建议并非新鲜事物,许多研究人员(包括 Anthropic 的)正在开发更好、更持久的解决方案。在此期间,做好基本功至关重要:

  • 软件开发者应缩短补丁周期,尽快提供安全修复。合理使用公开可用的 AI 模型可以在此方面提供帮助;我们正在构建工具并分享研究成果以支持这一点(详见下文)。开发者还应尽可能方便用户安装更新,帮助用户保持软件最新状态;在可行的范围内,对仍在运行已知有漏洞软件的用户应更加主动地提醒。
  • 网络防御者应缩短补丁测试和部署时间。美国国家标准与技术研究院和英国国家网络安全中心等机构制定的关键控制措施现在更加重要,因为它们在不依赖单个补丁及时到位的情况下提升安全性。这些措施包括加固网络默认配置、强制实施多因素认证、保持全面的日志用于检测和响应等。

使用公开可用 AI 模型进行网络防御的工具

许多通用模型已经能够发现大量软件漏洞,即使它们无法像 Claude Mythos Preview 那样发现最复杂的漏洞或有效地利用它们。Project Glasswing 已经促使许多组织使用这些通用模型对自己的代码库采取行动;我们正在努力使这变得更加容易。

首先,我们为 Claude Enterprise 客户发布了公测版 Claude Security。这是一个帮助团队扫描代码库漏洞并生成修复建议的工具。在发布后的三周内,Claude Opus 4.7 已被用于修补超过 2,100 个漏洞。(这比上述开源修补更快,主要是因为企业在修复自己的代码,而开源修复通常需要志愿维护者通过协调披露来完成。)

我们还启动了网络验证计划,允许安全专业人员在合法的网络安全用途(如漏洞研究、渗透测试和红队演练)中使用我们的模型,而不受某些旨在防止网络滥用的安全措施的限制。

现在,我们正在按需向符合条件的客户安全团队提供我们和合作伙伴在 Mythos Preview 中使用的工具。我们的目标是让用户更容易地从高能力公开模型中获得最佳性能,而无需复杂的配置。此版本包括:

  • 我们和合作伙伴构建和分享的 skills(用于重复工作的自定义指令);
  • 一个框架,帮助 Claude 映射代码库、启动扫描子代理、分类发现结果并编写报告;
  • 一个威胁模型构建器,用于映射代码库以识别潜在攻击目标并相应优先排序模型的工作。

我们的 Project Glasswing 合作伙伴思科也最近开源了其 Foundry Security Spec,以帮助其他防御者建立类似的评估系统。

支持生态系统

我们与开源安全基金会的 Alpha-Omega 项目建立了合作关系,以支持该基金会协助维护者处理和分类漏洞报告的工作。我们还在继续发表研究,探索前沿模型能力如何最好地支持网络防御者。

我们还支持了 ExploitBenchExploitGym 的开发——这两个新基准测试允许研究人员跟踪前沿 AI 模型的漏洞利用开发能力的变化,我们在此处进行了讨论。我们通过外部研究人员访问计划支持其他高质量定量基准测试的开发。最后,Claude for Open Source 为维护者和贡献者提供支持,我们承诺未来将扫描我们自身采用的任何开源包。

Project Glasswing 的下一步

AI 进步的速度意味着,与 Mythos Preview 同等能力的模型将很快被许多不同的 AI 公司开发出来。目前,没有任何公司——包括 Anthropic——开发出足够强大的安全措施来防止此类模型被滥用并可能造成严重危害。这就是我们尚未向公众发布 Mythos 级别模型的原因。但这也是我们启动 Project Glasswing 的原因:如果一个同等能力的模型在没有此类安全措施的情况下被发布,世界上几乎任何人利用有缺陷的软件将很快变得极其廉价和容易。

Glasswing 帮助最具系统重要性的网络防御者获得不对称优势。然而,迫切需要尽可能多的组织加强其网络防御。我们希望我们的通用模型,以及随之提供的新工具、资源和研究,能够支持这些组织改善其网络安全态势。

接下来,我们将与关键合作伙伴——包括美国及盟国政府——合作,将 Project Glasswing 扩展到更多合作伙伴。在不久的将来,一旦我们开发出所需的更强大的安全措施,我们期待通过正式发布使 Mythos 级别的模型广泛可用。

在这些风险的另一端,有一个令人鼓舞的世界等待着我们:重要代码的加固程度远超今天,黑客攻击的流行程度大幅降低。虽然障碍重重,但我们仍然相信 Project Glasswing 可以帮助我们到达那里。

Project Glasswing: An initial update

Project Glasswing: An initial update

Last month, we launched Project Glasswing, our collaborative effort to secure the world’s most critical software before increasingly capable AI models can be turned against it.

Since then, we and our approximately 50 partners have used Claude Mythos Preview to find more than ten thousand high- or critical-severity vulnerabilities across the most systemically important software in the world. Progress on software security used to be limited by how quickly we could find new vulnerabilities. Now it’s limited by how quickly we can verify, disclose, and patch the large numbers of vulnerabilities found by AI.

In this post, we discuss what we’ve learned about this critical challenge for cybersecurity in the first weeks of Project Glasswing. We focus on the early public evidence of Mythos Preview’s performance, on the initial results of our effort to scan thousands of open-source software projects, and on what this progress means for cyberdefenders today. We also cover what to expect next from Project Glasswing, and how we’re thinking about releasing Mythos-class models in the future.

Our early results

Our approach to discussing Mythos Preview’s findings

The software industry’s longstanding convention is to disclose new vulnerabilities 90 days after they’re discovered (or, if a patch is created before the 90 days is up, around 45 days after the patch becomes available). This allows time for end users to update their software before a vulnerability can be exploited by attackers. Our own Coordinated Vulnerability Disclosure policy takes this approach.

However, this means that disclosed vulnerabilities are a lagging indicator of the accelerating frontier of AI models’ cyber capabilities: we’re not yet at the point where we can fully detail our partners’ findings with Mythos Preview without putting end users at risk. Instead, we provide illustrative examples of the model’s performance, along with aggregate statistics on our progress to date. Once patches for the vulnerabilities that Mythos Preview has discovered are widely deployed, we’ll provide much more detail about what we’ve learned.

Evidence from our partners and external testers

Project Glasswing’s initial partners build and maintain software that is fundamental to the functioning of the internet and other essential infrastructure. Fixing flaws in their code reduces risk for the many other organizations that rely on it, and therefore reduces risk for billions of end users.

After one month, most partners have each found hundreds of critical- or high-severity vulnerabilities in their software. Collectively, they’ve found more than ten thousand. Several have told us that their rate of bug-finding has increased by more than a factor of ten. For instance, Cloudflare has found 2,000 bugs (400 of which are high- or critical-severity) across their critical-path systems, with a false positive rate that Cloudflare’s team considers better than human testers.

This tallies with external testers’ experience of Mythos Preview’s performance, and with recent additional evaluations of the model:

  • The UK’s AI Security Institute reports that Mythos Preview is the first model to solve both of their cyber ranges (simulations of multistep cyberattacks) end to end;
  • Mozilla found and fixed 271 vulnerabilities in Firefox 150 while testing Mythos Preview—over ten times more than they found in Firefox 148 with Claude Opus 4.6;
  • XBOW, an independent security platform, reports that Mythos Preview is a “significant step up over all existing models” on its web exploit benchmark, and provides “absolutely unprecedented precision” on a token-for-token basis;
  • ExploitBench and ExploitGym, two recently released academic benchmarks for measuring models’ exploit development capabilities, show Mythos Preview as the strongest performer. We discuss what these benchmarks tell us about the model in more detail on our Frontier Red Team blog.

More generally, we’re now seeing that patched software is being rolled out much more quickly. The latest Palo Alto Networks release included over five times as many patches as usual. Microsoft has reported that the number of new patches they’ll release will “continue trending larger for some time.” And Oracle is finding and fixing vulnerabilities across its products and cloud multiple times faster than before.

Mythos Preview has also proved useful for other kinds of security work. For example, at one of our Glasswing partner banks, Mythos Preview helped to detect and prevent a fraudulent $1.5 million wire transfer after a threat actor compromised a customer’s email account and made spoof phone calls.

Open-source software

For the last few months, Anthropic has used Mythos Preview to scan more than 1,000 open-source projects, which collectively underpin much of the internet—and much of our own infrastructure.

So far, Mythos Preview has found what it estimates are 6,202 high- or critical-severity vulnerabilities in these projects (out of 23,019 in total, including those it estimates as medium- or low-severity).

1,752 of those high- or critical-rated vulnerabilities have now been carefully assessed by one of six independent security research firms, or in a small number of cases by ourselves. Of these, 90.6% (1,587) have proved to be valid true positives, and 62.4% (1,094) were confirmed as either high- or critical-severity. That means that even if Mythos Preview finds no further vulnerabilities, at our current post-triage true-positive rates, it’s on track to have surfaced nearly 3,900 high- or critical-severity vulnerabilities in open-source code—in addition to those it has found for Project Glasswing’s partners. To be clear, we intend to continue scanning open-source code for some time, so we expect this number to rise.

One example of an open-source vulnerability that Mythos Preview detected was in wolfSSL, an open-source cryptography library that’s known for its security and is used by billions of devices worldwide. Mythos Preview constructed an exploit that would let an attacker forge certificates that would (for instance) allow them to host a fake website for a bank or email provider. The website would look perfectly legitimate to an end user, despite being controlled by the attacker. We’ll release our full technical analysis of this now-patched vulnerability (assigned CVE-2026-5194) in the coming weeks.

As we noted above, the bottleneck in fixing bugs like these is the human capacity to triage, report, and design and deploy patches for them. Finding them in the first place has become vastly more straightforward with Mythos Preview. We’ve created a dashboard of the open-source vulnerabilities we’ve scanned, below, which shows the different steps in our disclosure process and will track our progress over time. This shows vulnerabilities of all severity levels, rather than only the subset initially assessed as high- or critical-severity by Mythos Preview. Note the steep drop-off at each phase, reflecting the amount of human effort required to verify and fix each of the vulnerabilities.

Our dashboard of open-source vulnerabilities, showing vulnerabilities of all severities (rather than only those estimated high- or critical-severity by Mythos Preview).

Our process for triaging vulnerabilities is intensive. First, we or one of the external security firms we work with reproduce the issue that Mythos has found and re-assess its severity. Once we’ve confirmed that a vulnerability is real, we check for whether there are already fixes in place, and write a detailed report to the software’s maintainers. We take considerable care here: on top of the regular challenges of maintaining open-source software, maintainers have been facing a deluge of low-quality, AI-generated bug reports. Indeed, several maintainers have told us they’re currently severely capacity constrained, and some have even asked us to slow down our rate of our disclosures because they need more time to design patches. (On average, a high- or critical-severity bug found by Mythos Preview takes two weeks to patch.)

On maintainers’ request, we sometimes disclose bugs directly, without further assessment. We’ve now reported 1,129 such unvetted bugs, of which Mythos Preview estimated that 175 were high- or critical-severity.

We estimate that we’ve disclosed 530 high- or critical-severity bugs to maintainers so far. This is based on Claude’s assessment of severity in the case of direct disclosures, and maintainers’ or our security partners’ assessment where available. There are a further 827 confirmed vulnerabilities (estimated as high- or critical-severity in the same manner) that we’re aiming to disclose as quickly as possible.

75 of the 530 high- or critical-severity bugs we’ve reported have now been patched, and 65 of those have been given public advisories. The number of patches is still relatively low for three reasons. First, we’re still early in the 90-day window that’s set out in our Coordinated Vulnerability Disclosure policy: we expect many more patches to land soon. Second, we are likely to be undercounting patches because some vulnerabilities are patched without a public advisory: in those cases, we’re reliant on scanning for the patches ourselves using Claude. Third, the low volume of patches reflects a genuine problem: even at our relatively slow pace of disclosures, Mythos Preview is adding to an already-overloaded security ecosystem.

The relative ease of finding vulnerabilities compared with the difficulty of fixing them amounts to a major challenge for cybersecurity. Confronting this challenge successfully will make our software far safer than before. Below we discuss some ways that cyber defenders can adapt.

Adapting to a new phase of cybersecurity

Models with similar cybersecurity skills to Mythos Preview will soon be more broadly available. There is a clear need for a larger effort across the software industry to manage the volume of findings that these models will generate.

Currently, there’s often a long lag between the discovery of a vulnerability, the creation of a patch for it, and the time when the patch is widely deployed by end users. This leaves open a significant window for attackers to exploit critical software. Mythos-class models significantly shrink the time and cost required to find and exploit vulnerabilities, magnifying the risk associated with these time lags. Ultimately, Mythos-class models will enable developers to build far more secure software by catching bugs before they are deployed. But this interim period—while vulnerabilities are being rapidly discovered and slowly patched—presents new risks.

Software developers and users should act now to reduce their exposure to these risks. The advice below is not new, and many researchers (including at Anthropic) are currently working on better and more durable solutions. In the meantime, it’s important to get the basics right:

  • Software developers should shorten their patch cycles and make security fixes available as quickly as possible. The thoughtful use of publicly available AI models can help here; we’re building tools and sharing our research to support this (more details below). Developers should also help their users stay up-to-date with their software by making it as easy as possible to install updates; to the extent feasible, they should be more persistent with users who are still running software with known vulnerabilities.
  • Network defenders should shorten their patch testing and deployment timelines. The critical controls laid out by organizations like the National Institute of Standards and Technology and the UK’s National Cyber Security Centre are now all the more important, since they improve security without depending on any single patch landing in time. These include steps like hardening networks’ default configurations, enforcing multi-factor authentication, and keeping comprehensive logs for detection and response.

Tools for cyberdefense with publicly available AI models

Many generally-available models can already find large numbers of software vulnerabilities, even if they can’t find the most sophisticated vulnerabilities or exploit them as effectively as Claude Mythos Preview. Project Glasswing has already spurred many other organizations to take action on their own codebases with these generally-available models; we’re working to make this much easier to do.

To begin, we’ve released Claude Security in public beta for Claude Enterprise customers. It’s a tool that helps teams scan their codebases for vulnerabilities, and which can generate proposed fixes for them. In the three weeks since launch, Claude Opus 4.7 has been used to patch over 2,100 vulnerabilities. (This is faster than the open-source patching described above in large part because enterprises are fixing their own code, whereas open-source fixes usually require volunteer maintainers who work through coordinated disclosure.)

We’ve also begun our Cyber Verification Program, which allows security professionals using our models for legitimate cybersecurity purposes (such as vulnerability research, penetration testing, and red-teaming) to do so without certain safeguards designed to prevent cyber misuse.

Now, we’re making the tools that we and our partners have used with Mythos Preview available to qualifying customers’ security teams on request. Our aim is to make it much easier to get the best performance out of highly capable public models without extensive setup. This release includes:

  • The skills (custom instructions for repeated work) that we and our partners have built and shared;
  • A harness that helps Claude map the codebase, spin up scanning subagents, triage its findings, and write reports;
  • A threat model builder, which maps a codebase to identify potential targets for attack and prioritizes the model’s work accordingly.

Cisco, one of our Project Glasswing partners, has also recently open-sourced its Foundry Security Spec to help other defenders build an evaluation system similar to the one they use themselves.

Supporting the ecosystem

We’ve formed a partnership with the Open Source Security Foundation’s Alpha-Omega project, which will support the foundation’s efforts to assist maintainers in processing and triaging bug reports. We’re also continuing to publish research into how frontier model capabilities can best support cyberdefenders.

We’ve also supported the development of ExploitBench and ExploitGym, the two new benchmarks that allow researchers to track frontier AI models’ exploit development capabilities over time, as we discuss here. We’re supporting the development of other high-quality quantitative benchmarks through our External Researcher Access Program. Finally, Claude for Open Source supports maintainers and contributors, and we’re committing to scan any open-source package that we adopt ourselves in the future.

What’s next for Project Glasswing

The speed of AI progress means that models as capable as Mythos Preview will soon be developed by many different AI companies. At present, no company—including Anthropic—has developed safeguards strong enough to prevent such models from being misused and potentially causing severe harm. That is why we have yet to release Mythos-class models to the public. But it’s also why we began Project Glasswing: if a similarly capable model is released without such safeguards, it will soon become dramatically cheaper and easier for almost anyone in the world to exploit flawed software.

Glasswing helps the most systemically important cyber defenders gain an asymmetric advantage. However, there is an urgent need for as many organizations as possible to shore up their cyber defenses. We hope that our generally available models, and the new tools, resources, and research we’re providing to accompany them, will support those organizations to improve their cybersecurity posture.

Next, we will work with critical partners—including US and allied governments—to expand Project Glasswing to additional partners. And in the near future, once we’ve developed the far stronger safeguards we need, we look forward to making Mythos-class models available through a general release.

On the far side of these risks, there’s an encouraging world available to us: one in which important code is hardened far better than it is today, and in which hacking is far less prevalent. There are many obstacles, but we’re nonetheless confident that Project Glasswing can help get us there.

2028: Two scenarios for global AI leadership

Our views on the AI competition between the US and China.

Read more

Teaching Claude why

New research on how we’ve reduced agentic misalignment.

Read more

Natural Language Autoencoders: Turning Claude’s thoughts into text

AI models like Claude talk in words but think in numbers. In this study, we train Claude to translate its thoughts into human-readable text.

Read more