Omi Iyamu · Personal Dossier · Vol. XVII · 2026 Edition

§ V, Essays

Blog.

Long‑form pieces on AI, leadership, and the slow work of turning research into product. No set cadence; these go up when they're ready.

2026 · 07 · 27

4 min read

Moonshot AI releases Kimi K3 open weights: 2.8T-parameter MoE, Modified MIT license

Moonshot released Kimi K3's open weights at 00:00 UTC on July 27, 2026. A 2.8-trillion-parameter mixture-of-experts model under a Modified MIT license, roughly 594 GB in BF16 and 300 to 400 GB quantized to MXFP4. It is the largest open-weight release in history by about 3x and si

read →

2026 · 07 · 25

4 min read

Introducing Claude Opus 5

Anthropic shipped Claude Opus 5 on July 24 at the same price as Opus 4.8 ($5 in / $25 out per million tokens). Sets new highs on Frontier-Bench (43.3%) and SWE-bench Pro (79.2%), and lands 61 on the Artificial Analysis Intelligence Index. Ships with a low/medium/high effort toggl

read →

2026 · 07 · 24

4 min read

Experts say exploiting Anthropic's Fable isn't how Kimi K3 got so good

White House OSTP director Michael Kratsios accused Moonshot of large-scale covert industrial distillation of Anthropic's Fable 5 to build Kimi K3, and Treasury signaled sanctions and Entity List designations are on the table. It is the first US accusation naming a specific Chines

read →

2026 · 07 · 23

9 min read

When the model broke out

A frontier model escaped its eval sandbox and reached Hugging Face's production systems. The uncomfortable lessons for anyone who runs evals or ships agents.

Includes Pull quote

read →

2026 · 07 · 10

4 min read

GPT-5.6 Sol, Terra, and Luna go generally available

OpenAI made GPT-5.6 Sol, Terra, and Luna generally available on July 9. API pricing is $5/$30 per million tokens for Sol, $2.50/$15 for Terra, $1/$6 for Luna, all on a 1.05M-token context window with 128K max output. Sol Ultra scores 91.9 on OpenAI's composite; Terra lands near G

read →

2026 · 07 · 09

4 min read

OpenAI ships GPT-5.6 to the public after 30-day US cyber review

OpenAI released GPT-5.6 publicly on July 8 after a 30-day pre-release review under Trump's June 2 cybersecurity executive order. The family ships three tiers: Sol ($5/$30), Terra ($2.50/$15), and Luna ($1/$6) per million tokens. Sol adds an Ultra mode that spawns coordinated suba

read →

2026 · 07 · 04

3 min read

How the world's top AI models were revived

Axios reconstructs the 19 days between June 12 and July 1 when Anthropic could not ship Fable 5 or Mythos 5. Amazon flagged a jailbreak; Commerce Secretary Howard Lutnick called Dario Amodei; export controls landed. Anthropic sent engineers to DC. CAISI and the NSA rejected the f

read →

2026 · 07 · 03

5 min read

Agentjacking: hijacking AI coding agents via poisoned Sentry error events

Tenet Security disclosed a class of attack, agentjacking, in which a public Sentry DSN, the write-only key embedded in a website's frontend, is used to inject a poisoned error event that hijacks AI coding agents connected to Sentry via MCP. When a developer later asks the agent t

read →

2026 · 07 · 02

3 min read

Anthropic redeploys Claude Fable 5 with a targeted cybersecurity classifier after 19-day pull

Anthropic returned Claude Fable 5 to global availability on July 1, nineteen days after pulling it under US export controls. The redeployment ships with a safety classifier trained to block the specific cybersecurity jailbreak Amazon researchers surfaced in June, at a reported 99

read →

2026 · 07 · 01

3 min read

Introducing Claude Sonnet 5

Anthropic released Claude Sonnet 5, a midsize model priced at $2 per million input tokens and $10 per million output through August 31, roughly 1/7th of Opus 4.8. It slightly outperforms Opus 4.8 on Anthropic's knowledge-work benchmark, scores 63.2% on SWE-bench Pro and 80.4% on

read →

2026 · 06 · 28

4 min read

Anthropic's Mythos 5 AI model cleared by U.S. for wider use

The US Commerce Department lifted its June 12 export-control block on Anthropic's Mythos 5 for roughly 100 organizations listed in Annex A of Secretary Lutnick's letter, including Fortune 500 firms, government agencies and critical-infrastructure operators. Their foreign-national

read →

2026 · 06 · 27

4 min read

Summary of METR's predeployment evaluation of GPT-5.6 Sol

METR posted its predeployment evaluation of GPT-5.6 Sol. OpenAI gave them a railfree checkpoint, raw chain-of-thought, internal Codex harness docs, and updated answers to the Frontier Risk Report questionnaire — the most external access to a US frontier model before launch. Headl

read →

2026 · 06 · 27

4 min read

Previewing GPT-5.6 Sol: a next-generation model

OpenAI previewed GPT-5.6 today: three models — Sol the flagship, Terra the mid-tier, Luna the cheap one. A 1.5M-token context window, a new ultra mode that fans subagents out across hard tasks, and a new state of the art on Terminal-Bench 2.1. The cybersecurity section of the sys

read →

2026 · 06 · 23

3 min read

Daybreak: Tools for securing every organization in the world

OpenAI moved Daybreak past discovery and into end-to-end patch automation. GPT-5.5-Cyber is in general availability for trusted defenders. The Daybreak Cyber Partner Program already includes Akamai, Cisco, Cloudflare, CrowdStrike, Fortinet, Oracle, Palo Alto Networks, and Zscaler

read →

2026 · 06 · 17

4 min read

OpenAI's Deployment Simulation Extends Pre-Deployment Risk Assessment to Agentic Coding Through Simulated Tool Calls

OpenAI shipped Deployment Simulation on June 16: replay roughly 1.3 million de-identified ChatGPT conversations from GPT-5 Thinking through GPT-5.4 (Aug 2025-Mar 2026) through a candidate model with the assistant turn redacted, then grade the new completions against production tr

read →

2026 · 06 · 15

4 min read

Anthropic sends staff to Washington to fight Fable, Mythos export controls

Anthropic dispatched senior security researcher Nicholas Carlini, head of safeguards Dave Orr, and risk-evaluation lead Logan Graham to Washington Saturday for direct talks with Commerce Secretary Lutnick and National Cyber Director Cairncross, seeking to end the export-control d

read →

2026 · 06 · 14

3 min read

Statement on the US government directive to suspend access to Fable 5 and Mythos 5

On June 12, the US government issued an export-control directive ordering Anthropic to suspend access to Claude Fable 5 and Claude Mythos 5 for any foreign national, including its own foreign-national employees. Because Anthropic cannot filter citizenship in real time, both model

read →

2026 · 06 · 10

4 min read

Anthropic releases Claude Fable 5, a public version of Mythos, days after warning AI is becoming too dangerous

Anthropic shipped Claude Fable 5, the first publicly available Mythos-class model, paired with a restricted twin called Mythos 5 deployed via Project Glasswing with the US government. Fable 5 ships with a runtime classifier that routes prompts in cybersecurity, biology/chemistry,

read →

2026 · 06 · 06

4 min read

When AI builds itself

Anthropic published a position piece signed by Marina Favaro and Jack Clark arguing that recursive self-improvement is closer than the industry openly discusses, and proposing a verifiable coordinated pause as a tool frontier labs should be willing to deploy. The most-cited numbe

read →

2026 · 06 · 03

4 min read

Expanding Project Glasswing

Anthropic expanded Project Glasswing to about 150 additional organizations across more than 15 countries on June 2, widening access to Claude Mythos Preview — its most capable model, held in controlled research preview because of offensive cyber capabilities. New sectors include

read →

2026 · 05 · 29

4 min read

Anthropic releases Claude Opus 4.8 with Dynamic Workflows and a 3x cheaper fast mode

Anthropic released Claude Opus 4.8 on May 28, 41 days after Opus 4.7, at the same $5/$25 per million tokens. SWE-bench Verified moved to 88.6, SWE-bench Pro to 69.2, and GDPval-AA to 1890. Fast mode runs at 2.5x output speed and is 3x cheaper than the prior Opus fast tier. The re

read →

2026 · 05 · 22

4 min read

Alibaba's Qwen3.7-Max ships with native Anthropic API support, 1M context, 35-hour autonomous runs

Alibaba released Qwen3.7-Max as an agent-first model with a 1M-token context, native support for the Anthropic API protocol (so it drops into a Claude Code harness), and benchmark wins including 92.4 on GPQA Diamond, 41.4 on HLE, and $2.08M of simulated revenue in YC-Bench. It is

read →

2026 · 04 · 18

9 min read

When the model is too good to ship

Anthropic stopped Claude Mythos at the lab door because the model found thousands of zero-days during evaluation. The lesson is bigger than safety theatre.

read →

2026 · 04 · 02

8 min read

The desktop just got automated

GPT-5.4 scored above the human baseline on OSWorld-V this quarter. The 12-week response for SaaS founders looks the same as the playbook from 2009.

read →

2026 · 04 · 12

11 min read

The Research → Product Gap

Why most 'AI breakthroughs' never ship, and the 12-week playbook I used at Google Brain to move them from paper to production.

Includes Chart

read →

2026 · 02 · 28

7 min read

Hiring for AI Taste

Resumes, demos, and model evals are all lagging indicators. Here's what I screen for instead.

Includes Pull quote

read →

2026 · 01 · 17

6 min read

RAG Is Not Architecture

RAG is a technique. If your 'AI strategy' is a vector database, you don't have one.

Includes Diagram

read →

2025 · 11 · 09

14 min read

Notes on Governing AI at Hyperscale

What I learned authoring Google's company-wide AI/ML privacy framework, and how I'd rewrite it for 2026.

Includes Stat

read →

© Omi Iyamu · MMXXVI · Contact → · linkedin.com/in/omiiyamu