AI Weekly Digest

Anthropic $65B / IPO, Nemotron 3 Ultra, Claude Code Plugins

Friday, June 5, 2026 - Weekly AI briefing · (last 7 days)

Anthropic dominated the week: a $65B Series H at a $965B valuation, a confidential IPO filing, Opus 4.8 shipping with dynamic workflows and effort controls, and the claim that 80% of its production code is now Claude-authored. Meanwhile, NVIDIA released Nemotron 3 Ultra (550B MoE, 55B active, 1M context) as the strongest US open-weights model, Microsoft launched its MAI model family at Build, and Claude Code gained auto-loaded plugins from .claude/skills, which is directly relevant to your skills-framework workflow. The capital intensity of the race is now undeniable: Alphabet announced an $80B stock raise purely for AI compute.

Launches & releases this week

Models

  • Claude Opus 4.8 — Incremental Opus update with adjustable effort controls, dynamic workflows in Claude Code, and cheaper fast mode; tripled GPT-5.5’s ARC-AGI-3 score. (TLDR AI)
  • Nemotron 3 Ultra — 550B MoE (55B active), 1M context, NVFP4 quant, 300+ tok/s, scoring 48 on Artificial Analysis Intelligence Index. (TLDR AI)
  • MAI-Thinking-1 & MAI Family — Microsoft shipped 7 MAI models; Thinking-1 is 1T params (35B active MoE), 256K context, 97% AIME 2025. (TLDR AI)
  • MiniMax M3 — Open-weights multimodal agent model with 1M-token context, sparse attention, and desktop computer-use capability. (TLDR AI)
  • Grok Build 0.1 — xAI’s coding model in public beta: 100+ tok/s, $1/$2 per million tokens in/out, integrates with Cursor. (TLDR AI)
  • Mellum2 12B MoE — JetBrains open-sourced a 12B MoE model (Apache 2.0) optimised for routing, sub-agents, and code workflows. (JetBrains AI blog)
  • Qwen3.7-Plus — Multimodal agent model unifying vision/language for GUI+CLI operation in a single agent loop. (TLDR AI)
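
The MoE figures in the list above are easier to compare with a little arithmetic. The following is a back-of-envelope sketch in Python using only the parameter counts quoted in this digest. It assumes a 4-bit format stores 0.5 bytes per weight and ignores scaling factors, KV cache, and activations, so real memory requirements will be higher. The helper names are hypothetical.

```python
# Back-of-envelope MoE arithmetic using only figures quoted in the digest.
# Assumption: 4 bits per weight = 0.5 bytes; block scaling factors, KV cache,
# and activations are ignored, so actual memory use is higher than this.

def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Share of parameters used per token, as a fraction of the total."""
    return active_params_b / total_params_b


def weight_memory_gb(total_params_b: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return total_params_b * 1e9 * bits_per_weight / 8 / 1e9


# (total parameters, active parameters), both in billions, as listed above.
models = {
    "Nemotron 3 Ultra": (550.0, 55.0),
    "MAI-Thinking-1": (1000.0, 35.0),
}

for name, (total_b, active_b) in models.items():
    print(f"{name}: {active_fraction(total_b, active_b):.1%} active per token")

# Only Nemotron 3 Ultra is listed with a 4-bit (NVFP4) quantization.
print(f"Nemotron 3 Ultra weights at 4 bits: ~{weight_memory_gb(550.0, 4):.0f} GB")
```

All parameters still have to be resident in memory even though only a fraction is active per token, which is why the weight footprint is computed from the total count and not the active count.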

Features & Tools

  • Claude Code Dynamic Workflows — Claude breaks tasks into parallel subtasks; Jarred Sumner rewrote Bun from Zig to Rust (750K lines) in 11 days. (TLDR AI)
  • Claude Code v2.1.157 Plugins — Plugins in .claude/skills auto-load without marketplace; claude plugin init scaffolds new plugins. (GitHub: anthropics/claude-code)
  • Cursor 3.7 + Teams Pricing — Cursor 3.7 ships canvas improvements; new Premium seat tier and higher Teams limits for heavy agent users. (Cursor changelog)
  • ChatGPT Dreaming V3 — New memory synthesis system for ChatGPT that consolidates preferences across conversations; rolling out to Plus/Pro. (TLDR AI)

Products

  • OpenAI Models on AWS — OpenAI frontier models and Codex now GA on AWS Bedrock with native security and billing integration. (TLDR AI)

Deals & Partnerships

Stories to follow

Agent cost & verification at scale

Uber capped Claude Code usage after blowing its AI budget in four months. Anthropic itself reports 80% AI-authored code with 8× volume per engineer. Cognition now runs more Devin sessions asynchronously than interactively, making verified-before-merge a hard requirement. The pattern: agent output is exploding, but cost control and verification infrastructure haven’t kept pace — exactly the leaf-node verification problem you’ve written about.
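
To make the cost-control side concrete, here is a minimal sketch of a hard spending cap checked before each agent request. It is not how Uber or any vendor implements limits. The `BudgetGuard` class, the $5 cap, and the token counts are hypothetical, and the per-token prices are borrowed from the Grok Build listing above purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class BudgetGuard:
    """Tracks spend against a fixed cap and rejects requests that would exceed it."""
    cap_usd: float
    input_usd_per_mtok: float   # price per million input tokens
    output_usd_per_mtok: float  # price per million output tokens
    spent_usd: float = 0.0

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        """Estimated cost of one request in USD."""
        return (input_tokens * self.input_usd_per_mtok
                + output_tokens * self.output_usd_per_mtok) / 1_000_000

    def try_spend(self, input_tokens: int, output_tokens: int) -> bool:
        """Record the request and return True if it fits under the cap."""
        request_cost = self.cost(input_tokens, output_tokens)
        if self.spent_usd + request_cost > self.cap_usd:
            return False  # reject without recording any spend
        self.spent_usd += request_cost
        return True


# Illustrative prices: $1 in / $2 out per million tokens.
guard = BudgetGuard(cap_usd=5.0, input_usd_per_mtok=1.0, output_usd_per_mtok=2.0)
print(guard.try_spend(2_000_000, 1_000_000))  # costs $4.00, fits under the cap
print(guard.try_spend(1_000_000, 500_000))    # would add $2.00, so it is rejected
print(guard.spent_usd)
```

Because output token counts are only known after a request completes, a real guard would need to reserve an estimate up front and reconcile afterwards. This sketch assumes the counts are known in advance.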

Open-weights models closing the gap

Nemotron 3 Ultra, MiniMax M3, Mellum2, and Qwen3.7-Plus all shipped in one week — each targeting agentic workloads with long context, tool use, or MoE efficiency. The open-weights tier is now explicitly competing on agent orchestration, not just chat benchmarks. For your LiteLLM gateway and local routing, the viable model menu just expanded significantly.

  • Nemotron 3 Ultra — 550B MoE, 55B active, 1M context, 300+ tok/s — strongest US open model. (TLDR AI)
  • MiniMax M3 — 1M-token context, sparse attention, frontier coding and desktop computer-use. (TLDR AI)
  • Mellum2 12B MoE open-sourced — 12B MoE under Apache 2.0; designed for routing, sub-agents, and low-latency code. (JetBrains AI blog)
  • How far behind are open models? — Open models trail frontier by 4-6 months on public benchmarks; gap widening slightly. (TLDR AI)
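
For local routing, one simple way to use an expanded model menu is to filter candidates by whether the prompt fits their stated context window. The sketch below is plain Python and not LiteLLM's API. `ModelEntry`, `candidates`, and the output reserve are hypothetical. "1M context" is assumed to mean 1,000,000 tokens, and models whose context length is not stated in this digest are marked unknown and skipped.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ModelEntry:
    name: str
    context_tokens: Optional[int]  # None means the digest does not state it


# Context sizes come from this digest; 1M is assumed to be 1,000,000 tokens.
MENU = [
    ModelEntry("Nemotron 3 Ultra", 1_000_000),
    ModelEntry("MiniMax M3", 1_000_000),
    ModelEntry("Mellum2 12B MoE", None),
    ModelEntry("Qwen3.7-Plus", None),
]


def candidates(menu: List[ModelEntry], prompt_tokens: int,
               reserve_output_tokens: int = 8_192) -> List[str]:
    """Return names of models whose known context fits prompt plus output reserve."""
    needed = prompt_tokens + reserve_output_tokens
    return [
        m.name
        for m in menu
        if m.context_tokens is not None and m.context_tokens >= needed
    ]


print(candidates(MENU, prompt_tokens=300_000))
print(candidates(MENU, prompt_tokens=995_000))  # exceeds 1M once the reserve is added
```

The returned names could then be mapped to whatever deployment names the gateway uses. Excluding models with unknown context is a conservative choice; filling in verified limits would widen the menu.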

Claude Code hardening for teams

This week’s Claude Code releases (v2.1.157–165) added auto-loaded plugins from .claude/skills, managed version pinning (requiredMinimumVersion), OTEL metric labels for team/repo slicing, auto mode on Bedrock/Vertex, and security prompts before writing to shell startup files. Taken together, these are enterprise-governance features — exactly what you need for overnight-agent-factory setups where multiple headless agents run unsupervised.

  • Claude Code v2.1.157 — Auto-loaded plugins from .claude/skills; claude plugin init scaffolding. (GitHub: anthropics/claude-code)
  • Claude Code v2.1.163 — Version pinning via managed settings; /plugin list command; hook return values. (GitHub: anthropics/claude-code)
  • Claude Code v2.1.161 — OTEL resource attributes as metric labels; claude agents shows fan-out progress. (GitHub: anthropics/claude-code)
  • Claude Code v2.1.158 — Auto mode now available on Bedrock, Vertex, and Foundry for Opus 4.7/4.8. (GitHub: anthropics/claude-code)

What I’m watching

pewdiepie-archdaemon/odysseus

54.3k★ · Python Self-hosted AI workspace.

zgwl/chinese-buy-us-stock-guide

3.2k★ US stock guide

Gloridust/WechatOnCloud

2.1k★ · TypeScript Yunwei WOC: cloud WeChat, connect freely

b-nnett/goose

2.1k★ · Rust Goose Swift proof-of-concept README

asz798838958/aBaiAutoplus

1.5k★ · Python Multi-platform AI account auto-registration and management · one-click ChatGPT Plus activation via protocol-based payment

Read this weekend

Slow down to speed up when working with AI agents

Gergely Orosz directly addresses the quality/debt problem you face when agents generate 2× more code than six months ago. It offers a practical framing of when to throttle agent output, and is the management-layer complement to your leaf-node verification thinking.

Quote of the week

A substantial patch used to imply substantial effort, and that effort was a reasonable proxy for good faith. That assumption no longer holds.

Andreas Kling, Ladybird


Sources unavailable this week: r/ChatGPTCoding top, r/ClaudeAI top, r/LocalLLaMA top, r/MachineLearning top

Auto-curated weekly by Claude Opus 4.7 from Ben’s Bites, Cursor changelog, Don’t Worry About the Vase (Zvi), Exponential View (Azeem Azhar), GitHub: BerriAI/litellm, GitHub: anthropics/claude-code, GitHub: cline/cline, GitHub: ggml-org/llama.cpp, GitHub: huggingface/transformers, GitHub: langchain-ai/langchain, GitHub: langchain-ai/langgraph, GitHub: ollama/ollama, GitHub: vllm-project/vllm, Hugging Face blog, Import AI (Jack Clark), Interconnects (Nathan Lambert), JetBrains AI blog, LangChain blog, Latent Space, Lenny’s Newsletter, NVIDIA developer blog, Not Boring (Packy McCormick), One Useful Thing (Ethan Mollick), OpenAI blog, SaaStr (Jason Lemkin), Simon Willison, TLDR AI, The Algorithmic Bridge (Alberto Romero), The Pragmatic Engineer (Gergely Orosz), Together AI blog, Tomasz Tunguz, Understanding AI (Timothy B. Lee), Vercel blog, smol.ai news. Source list and editorial profile maintained by Daniel.