Omi Iyamu · Personal Dossier · Vol. XVII · 2026 Edition
Omi Iyamu.
§, The Briefs

The Brief.

A short weekly note on AI, what caught my eye, what I'm building, and what I'm telling portfolio CTOs. Lands every Wednesday.

Free · Wednesdays · ~5 min read
Brief № 060 · 2026 · 06 · 08

The week the rule clocks started

Anthropic put a number on its own acceleration (80%) and asked for a pause. Congress, the White House, and Illinois each shipped something with a clock on it.

§ 01, Intro

Rules came from every direction this week. Anthropic published a position piece asking for a coordinated pause and admitted Claude now writes more than 80% of its own code. Congress, the White House, and Illinois each shipped something with a timer on it — while NVIDIA opened a 550B model on the same day, in the opposite direction.

§ 02, When AI builds itself

Anthropic published a position piece this week arguing that recursive self-improvement is closer than the industry acknowledges, and proposing a verifiable, coordinated pause as a tool labs should be willing to use. The number that lands is theirs: more than 80% of the code merged into Anthropic's own codebase is now written by Claude. The lab proposing the pause is the one whose models are accelerating its own development the most. Two things to do with it. Read the original before reading the takes. Then pull the same stat for your own team. If the answer is not a percentage, getting to one is the work. If it is, you now have a number to track.
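If you want a starting point for that stat, here is a minimal sketch. It assumes your tooling can tag each merged change as AI-written or not, which is the hard part and is not solved here. The `MergedChange` record, its fields, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class MergedChange:
    """One merged change. Fields are hypothetical and depend on your tooling."""
    lines_added: int
    ai_authored: bool  # assumption: your team tags AI-written changes somehow


def ai_share(changes: List[MergedChange]) -> Optional[float]:
    """Return the fraction of merged lines written by AI, or None if nothing merged."""
    total = sum(c.lines_added for c in changes)
    if total == 0:
        return None
    ai_lines = sum(c.lines_added for c in changes if c.ai_authored)
    return ai_lines / total


# Made-up sample data, for illustration only.
sample = [
    MergedChange(lines_added=120, ai_authored=True),
    MergedChange(lines_added=30, ai_authored=False),
    MergedChange(lines_added=50, ai_authored=True),
]

share = ai_share(sample)
print(f"AI-authored share of merged lines: {share:.0%}")
```

Lines added is a crude unit. Counting merged changes or reviewed diffs may suit your team better; what matters is picking one definition and tracking it over time.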

§ 03, Expanding Project Glasswing

Anthropic expanded Project Glasswing to 150 organizations across 15+ countries this week: operators in power, water, healthcare, communications, and hardware. Each one gets gated access to Claude Mythos Preview, the company's most capable model, held in research preview because of its offensive cyber capabilities. Mythos has surfaced more than 10,000 high or critical vulnerabilities since the program launched in April. The structural choice is the interesting part. Anthropic is treating its strongest model as a controlled asset, distributed to vendors where a successful attack on the codebase would compromise 100M-plus users. That is the inverse of the usual capability-race posture, and the more useful one to learn from.

§ 04, NVIDIA Nemotron 3 Ultra: 550B-parameter open-weights frontier model

NVIDIA opened the weights of Nemotron 3 Ultra on June 4: a 550B-total / 55B-active MoE with a Mamba-Transformer hybrid architecture, 1M-token context, pre-trained in NVFP4. Published numbers worth holding onto: 5x faster inference, ~300 output tokens per second, 30% lower cost on agentic tasks vs other open frontier models, and 48 on the Artificial Analysis intelligence index — second among US open-weight models. The architecture is the interesting part. Mamba state-space layers avoid full quadratic attention, which is most of why the 1M context is tractable at this throughput. If the agent-cost number survives real workloads, the middle ground between expensive closed APIs and weaker open models just thinned out. Open weights at this size still means a real serving bill. I want independent throughput numbers before I move anything off Claude or Gemini.
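A back-of-the-envelope sketch of why the architecture matters at this context length. It compares how full self-attention and a linear scan scale with sequence length, and nothing more: constants, memory traffic, and the attention layers a hybrid still keeps are all ignored, so treat the ratio as an intuition, not a benchmark. The function names are mine, not NVIDIA's.

```python
def pairwise_attention_ops(context_tokens: int) -> int:
    """Token-pair comparisons for full self-attention: grows as n^2."""
    return context_tokens * context_tokens


def linear_scan_ops(context_tokens: int) -> int:
    """Per-token state updates for a state-space style scan: grows as n."""
    return context_tokens


TOTAL_PARAMS_B = 550   # total parameters in billions, from the brief
ACTIVE_PARAMS_B = 55   # active parameters per token in billions, from the brief
CONTEXT = 1_000_000    # 1M-token context, from the brief

# Share of the model's weights that are active for any one token.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

# How much faster quadratic work grows than linear work at full context.
ratio = pairwise_attention_ops(CONTEXT) // linear_scan_ops(CONTEXT)

print(f"Active parameter fraction per token: {active_fraction:.0%}")
print(f"Quadratic vs linear work at {CONTEXT:,} tokens: {ratio:,}x")
```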

§ 05, Introducing new capabilities to GPT-Rosalind

GPT-Rosalind got a real update on June 3, not a marketing one. OpenAI layered GPT-5.5's agentic coding onto stronger medicinal-chemistry and genomics reasoning, and shipped one number worth holding onto: 31% fewer tokens than GPT-5.5 on long-horizon quantitative biology analyses, with accuracy gains. Token cost compounds across drug-discovery pipelines; that gap is the difference between a model your comp-bio team runs quietly and one that gets killed at the next budget review. Named partners are Amgen, Moderna, the Allen Institute, Thermo Fisher Scientific — research teams, not pilot logos. The research preview now opens to eligible organizations worldwide for the first time. Two things I am watching: whether the cost gap survives tool-heavy agentic loops on dirty experimental data, and whether the global-access program is real or paperwork. If you have early hands-on, send me a note.

§ 06, Introducing MAI-Code-1-Flash

Microsoft shipped seven new in-house models this week, but the one worth reading the model card for is MAI-Code-1-Flash. It is a 5B-parameter coding model trained on production Copilot harnesses and licensed data, and it beats Claude Haiku 4.5 on every coding benchmark Microsoft tested, including a 16-point lead on SWE-Bench Pro at roughly 60% fewer tokens. The pattern matters more than the result. Microsoft is moving away from 'the best model wins' and toward 'the best model for this surface, on this harness, at this token budget.' For teams shipping coding tools, the bar to justify routing to a frontier model is rising again.

§ 07, NVIDIA OpenShell ships as an open-source agent sandbox runtime, with Cadence's L

The NVIDIA Computex announcement I am most interested in is not a model. It is OpenShell, an open-source agent security and governance runtime. The runtime is a sandbox. Enterprises define what an agent can do, what it cannot touch, and what requires human approval. Boundary, blast radius, approval. Three things every production agent needs and few frameworks ship honestly. Cadence Design Systems also announced a Level-5 autonomous ChipStack agent built on Nemotron and secured by OpenShell. That part matters. The runtime ships with a real production deployment from day one, in a domain, chip design, that has zero tolerance for an agent that quietly wires the wrong net. This is the agent-governance pattern I have been writing about. NVIDIA and Microsoft are pushing it into the default Windows runtime, which is how patterns become defaults.
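To make boundary, blast radius, and approval concrete, here is a minimal default-deny policy check. This is not OpenShell's API or policy format, which I have not reproduced here. The action names, the `POLICY` shape, and the `evaluate` function are all invented for illustration.

```python
from enum import Enum
from typing import Dict, Set


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    NEEDS_APPROVAL = "needs_approval"


# Hypothetical policy: the action names are invented for this sketch.
POLICY: Dict[str, Set[str]] = {
    "allowed": {"read_netlist", "run_simulation"},
    "approval_required": {"modify_netlist"},
}


def evaluate(action: str, policy: Dict[str, Set[str]] = POLICY) -> Decision:
    """Default-deny: anything not explicitly listed is refused."""
    # Approval is checked first so a double-listed action stays gated.
    if action in policy["approval_required"]:
        return Decision.NEEDS_APPROVAL
    if action in policy["allowed"]:
        return Decision.ALLOW
    return Decision.DENY


for action in ("read_netlist", "modify_netlist", "delete_project"):
    print(action, "->", evaluate(action).value)
```

The design choice worth copying is the last line of `evaluate`: an action nobody thought to list is denied, not allowed.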

§ 08, Rep. Lori Trahan is working across the aisle on AI regulation. Some in her party

The Great American AI Act discussion draft is the first substantive federal AI bill from this Congress. 269 pages, bipartisan, with three operational pieces worth tracking: a three-year preemption of state laws on AI model development, mandatory catastrophic-risk frameworks for frontier developers above $500M in revenue, and an incident reporting clock of 15 days for critical events and 24 hours for imminent risk of death or serious injury. The Center for AI Standards and Innovation gets $300M over three years to run it. Both Anthropic and OpenAI support the draft. Americans for Responsible Innovation, Public Citizen, and the Alliance for Secure AI do not. The public fight will be over preemption. The change to your work is the 15-day clock. Draft the report you would file. The shape of that draft will tell you which parts of your safety stack are written down and which only live in someone's head.
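A small sketch of the two reporting clocks, useful as a first line in an incident runbook. The 15-day and 24-hour windows come from the summary above. When the clock legally starts is not something I have pinned down from the draft, so `discovered_at` is an assumption, as are the severity labels.

```python
from datetime import datetime, timedelta, timezone

# Windows as summarized in this brief. The severity keys are my own labels.
REPORTING_WINDOWS = {
    "critical": timedelta(days=15),
    "imminent_harm": timedelta(hours=24),
}


def report_deadline(discovered_at: datetime, severity: str) -> datetime:
    """Return the latest filing time, assuming the clock starts at discovery."""
    if severity not in REPORTING_WINDOWS:
        raise ValueError(f"unknown severity: {severity!r}")
    return discovered_at + REPORTING_WINDOWS[severity]


# Example timestamp, chosen arbitrarily for illustration.
discovered = datetime(2026, 6, 8, 9, 0, tzinfo=timezone.utc)
for severity in REPORTING_WINDOWS:
    print(severity, "->", report_deadline(discovered, severity).isoformat())
```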

§ 09, Trump signs AI Innovation and Security executive order with voluntary 30-day mod

The Trump White House signed an AI Innovation and Security executive order on Tuesday. The most quoted provision: AI companies can voluntarily submit their most powerful models for federal review up to 30 days before release. The earlier draft had a 90-day review window; that timeline did not survive. The order also creates an AI cybersecurity clearinghouse to share vulnerability information and instructs federal agencies to build cyber-capability benchmarks for AI models. What it explicitly does not do is create a mandatory licensing or pre-clearance regime. For US-based AI developers, the practical change in the next quarter is small. The interesting question is what happens to the voluntary review process the first time a company declines.

§ 10, Illinois SB 315 clears the House 110-0, heads to Pritzker

The Illinois House passed SB 315, the Artificial Intelligence Safety Measures Act, 110-0 on May 27, and the bill is on Governor Pritzker's desk. If signed, Illinois becomes the first US state to require annual independent third-party safety audits of frontier model developers, plus critical incident reporting, whistleblower protections, and civil penalties. Illinois would join New York (RAISE Act) and California (SB 53) as the third state to write a frontier-model statute in nine months. The third-party audit requirement is the change worth tracking. It is harder to deflect than self-attestation, and that kind of requirement tends to harden industry norms faster than legislators expect. The other practical consequence: a US frontier lab now needs a single compliance posture across California, New York, Illinois, and the EU AI Act. The cheapest path is to write to the strictest and apply it everywhere.

§ 11, The compliance gap that could expose your AI systems

ComplyAdvantage's annual compliance survey landed Monday. Two numbers worth pinning. Ninety-four percent of compliance leaders believe existing and incoming AI regulations will prove effective. Fewer than three in five describe their own AI oversight programs as fully mature. That gap — between belief in the rules and readiness to follow them — is where most enforcement actions will land in the next twelve months. The survey also flags the structural shift agentic AI is forcing on compliance teams: 'the model decided' is no longer a defensible answer. Compliance officers now have to articulate how a decision was reached, whether it can be reproduced, and who owns it. For fintech and regulated-domain teams, the lesson is unglamorous and overdue. Build the decision-context capture layer before the regulator asks for it.
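A minimal sketch of what a decision-context record could hold, mapped to the three questions above: how the decision was reached (inputs and model version), whether it can be reproduced (a stable fingerprint), and who owns it. The `DecisionRecord` fields and the `capture` helper are hypothetical, not taken from the survey or any specific product.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DecisionRecord:
    """One captured decision. All field names are hypothetical."""
    decision_id: str
    model_version: str
    inputs: dict
    output: str
    owner: str        # the accountable person or team
    recorded_at: str  # ISO 8601 timestamp, UTC

    def fingerprint(self) -> str:
        """Stable hash of what determines the decision, for reproduction checks."""
        payload = json.dumps(
            {"model_version": self.model_version, "inputs": self.inputs},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def to_log_line(self) -> str:
        """Serialize for an append-only log."""
        return json.dumps(asdict(self), sort_keys=True)


def capture(decision_id: str, model_version: str, inputs: dict,
            output: str, owner: str) -> DecisionRecord:
    """Build a record stamped with the current UTC time."""
    return DecisionRecord(
        decision_id=decision_id,
        model_version=model_version,
        inputs=inputs,
        output=output,
        owner=owner,
        recorded_at=datetime.now(timezone.utc).isoformat(),
    )


record = capture("d-001", "risk-model-v3", {"amount": 1200, "country": "DE"},
                 "flag_for_review", "payments-compliance")
print(record.fingerprint()[:16], record.owner)
```

Two decisions with the same model version and inputs share a fingerprint, which gives you a cheap way to check whether a rerun is comparing like with like.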

§ 12, OpenAI rolls out Dreaming V3, a self-updating memory architecture for ChatGPT

OpenAI began rolling out Dreaming V3, its new memory architecture for ChatGPT, on June 4 for Plus and Pro users in the US. Two numbers from the release: internal factual-recall eval moved from 41.5% in 2024 to 82.8% in 2026, and the compute needed to serve memory dropped roughly 5x. The interesting design choice is the transparency surface. Users can see what ChatGPT believes about them, dismiss it, correct it, or instruct it. Memory the user can audit is a different product from memory that just accumulates. The right pattern to copy if you ship agents with persistent state: the model should never know more about the user than the user can read in a UI. The wrong pattern: a hidden notes file you have to ask for. The question to ask your team is which one you are shipping today.
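Here is the auditable-memory pattern as a minimal sketch. It is not how Dreaming V3 is built, which OpenAI has not detailed in anything I am relying on here. The `AuditableMemory` class and its method names are hypothetical. The one property it enforces is that the model's context and the user's view come from the same store.

```python
from typing import Dict


class AuditableMemory:
    """Memory where the model sees exactly what the user can see."""

    def __init__(self) -> None:
        self._facts: Dict[str, str] = {}

    def remember(self, key: str, value: str) -> None:
        """Store or overwrite a belief about the user."""
        self._facts[key] = value

    def visible_to_user(self) -> Dict[str, str]:
        """Everything stored, returned as a copy for display in a UI."""
        return dict(self._facts)

    def correct(self, key: str, value: str) -> None:
        """User fixes an existing belief; unknown keys are an error."""
        if key not in self._facts:
            raise KeyError(key)
        self._facts[key] = value

    def dismiss(self, key: str) -> None:
        """User removes a belief; dismissing twice is harmless."""
        self._facts.pop(key, None)

    def context_for_model(self) -> Dict[str, str]:
        """Same store as the user's view: there is no hidden notes file."""
        return dict(self._facts)


memory = AuditableMemory()
memory.remember("role", "backend engineer")
memory.remember("city", "Lagos")
memory.correct("role", "engineering manager")
memory.dismiss("city")
print(memory.visible_to_user())
```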

§ 13, Uber exhausts its full 2026 AI budget by mid-April

Uber burned through its full 2026 AI budget by mid-April, four months in. The COO said the company cannot draw a clean line from rising token consumption to useful consumer features shipped. Per-engineer monthly API costs ran $500 to $2,000, well past internal forecasts, driven by Claude Code and Cursor. About 11% of Uber's backend code updates now come from AI agents. Two notes worth keeping. First, internal leaderboards that rewarded tool adoption probably amplified the spend. Adoption is not an outcome. Second, this is the first credible production datum from a hyperscale engineering org that disagrees with the consensus narrative on AI ROI. The number to track this quarter is shipped-features-per-AI-dollar against a control group. Most companies still do not have a control group.
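A rough sketch of the metric, assuming you can hold out a comparable control group that gets little or no AI tooling. The `TeamQuarter` record and the numbers are made up; the only figure borrowed from above is a per-engineer spend inside the $500 to $2,000 monthly range. The result is only meaningful if the two groups do similar work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TeamQuarter:
    """One group's quarter. Fields and values are hypothetical."""
    features_shipped: int
    engineers: int
    ai_spend_usd: float


def features_per_engineer(team: TeamQuarter) -> float:
    return team.features_shipped / team.engineers


def incremental_features_per_ai_dollar(treated: TeamQuarter,
                                       control: TeamQuarter) -> float:
    """Extra features per engineer, divided by extra AI spend per engineer."""
    uplift = features_per_engineer(treated) - features_per_engineer(control)
    extra_spend = (treated.ai_spend_usd / treated.engineers
                   - control.ai_spend_usd / control.engineers)
    if extra_spend <= 0:
        raise ValueError("treated group must spend more on AI than control")
    return uplift / extra_spend


# Made-up data: 40 engineers at $1,000 per month for three months vs no AI spend.
treated = TeamQuarter(features_shipped=60, engineers=40, ai_spend_usd=120_000.0)
control = TeamQuarter(features_shipped=50, engineers=40, ai_spend_usd=0.0)

per_dollar = incremental_features_per_ai_dollar(treated, control)
print(f"Incremental features per $1,000 of AI spend: {per_dollar * 1000:.3f}")
```

A negative result is possible and informative: it means the control group shipped more per engineer despite spending less.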

§ 14, Anthropic closes Series H at ~$965B, surpasses OpenAI as most valuable AI lab

Anthropic closed its Series H at $65 billion in financing and a $965 billion post-money valuation, putting it ahead of OpenAI as the most valuable private AI lab. Annualized revenue run-rate jumped from roughly $10 billion in late 2025 to $47 billion now, a curve almost entirely driven by Claude Code. The number that matters to me is the run-rate, not the valuation. A coding product moving roughly $37 billion in twelve months is the clearest evidence yet that the frontier-model-as-developer-tool thesis works at the high end of the market. The strategic question for everyone building outside the labs is whether the same compounding curve is available in any vertical that is not pure code, or whether code is the unusual case. I think code is the unusual case. I would like to be wrong.

§ 15, Gemini Spark goes live for US Google AI Ultra subscribers

Google's Gemini Spark went live on May 29 for US Google AI Ultra subscribers. It is Google's first serious shot at a 24/7 personal agent. It reads across Gmail, Docs, and Slides, takes actions across those apps, and runs in the background on phone and laptop. The product is labeled Beta. Two things stand out. First, the architectural decision to put a persistent agent inside Workspace, where most knowledge workers already live, is a sharper distribution play than another standalone app. Second, the privacy surface is large. What gets logged, what gets shared with the broader Gemini training pipeline, and what default permissions the agent inherits from your Google account are the questions worth asking before turning this on for a team.

§ 16, NVIDIA launches Cosmos 3, an open foundation model for physical AI

NVIDIA dropped Cosmos 3 at Computex this morning, an open foundation model for physical AI. One mixture-of-transformers handles vision reasoning, world generation, and action prediction inside a single model. Training data was 20 trillion tokens: about a billion images, 400 million real and synthetic videos, ambient audio, text, and action data from humans and robots. Two sizes are on Hugging Face today, Nano and Super. The architectural claim is the interesting part: a single shared representation across perception, prediction, and action, instead of modality-specific models bolted together. I am most curious about Nano. If the small variant runs locally on a robot or vehicle, the dependency on cloud inference for safety-critical loops finally has an off-ramp.

§ 17, Closer

If you ship to a regulated surface, draft what your 15-day incident report would say — the gaps you find are the work. If you ship agents, read the OpenShell runtime before the default Windows surface catches up. Reply if you have early hands-on with Nemotron 3 Ultra; I want independent throughput numbers before I move anything off Claude or Gemini.

© Omi Iyamu · MMXXVI · linkedin.com/in/omiiyamu