Grok Build CLI, Cursor Cloud Agents, Claude Code v2.1.143

xAI launched Grok Build, a terminal coding agent with worktrees, subagents, and headless mode — directly competing with Claude Code.

Must read

Introducing Grok Build — Terminal coding agent with AGENTS.md, MCP, worktrees, subagents, and headless mode — feature-parity play against your Claude Code setup.
Cursor Cloud Agent Development Environments — Configurable cloud dev environments for parallel autonomous agents — directly relevant to your overnight-agent-factory pattern.
Claude Code v2.1.143 — Plugin dependency enforcement, projected context cost per turn, and a worktree.bgIsolation ‘none’ setting that lets background sessions edit the working copy directly.
Claude Code v2.1.142 — New claude agents flags (--add-dir, --mcp-config, --plugin-dir, --model, --effort) for configuring dispatched background sessions. Fast mode now defaults to Opus 4.7.
Secure, Scalable Agent Sandbox Infrastructure — Practical architecture for sandboxing code-executing agents — isolate the agent, not the tool. Directly applicable to your headless agent fleet.

Tools & Frameworks

Raindrop Workshop

Gives Claude Code the ability to read traces, write evals against codebases, and self-heal — supports TypeScript, Python, Go, Rust.

Why this matters: Self-healing eval loop for your Claude Code workflows.

Cline Agent Runtime SDK

@cline/sdk: open-source framework with checkpoints, MCPs, subagents, cron jobs — runnable from CI/CD pipelines.

Why this matters: Embeddable agent runtime for GitHub Actions pipelines.

Genkit Middleware

Composable hooks intercepting generation calls for retries, human approval before destructive tool calls, and full observability. Supports TypeScript and Python.

Why this matters: Middleware pattern for hardening agentic apps in production.
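
To make the pattern concrete, here is a minimal Python sketch of composable middleware around a generation call. It is not Genkit's API: the names with_retries, require_approval, with_logging, and compose are hypothetical, and a "generation call" is assumed to be a plain function from a request dict to a response dict.

```python
from typing import Callable, Dict, List, Set

# Assumption: a generation call maps a request dict to a response dict.
Generate = Callable[[Dict], Dict]
Middleware = Callable[[Generate], Generate]


def with_retries(max_attempts: int) -> Middleware:
    """Retry the wrapped call when it raises RuntimeError."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    def middleware(next_call: Generate) -> Generate:
        def wrapped(request: Dict) -> Dict:
            last_error = None
            for _ in range(max_attempts):
                try:
                    return next_call(request)
                except RuntimeError as err:
                    last_error = err
            raise last_error
        return wrapped
    return middleware


def require_approval(destructive: Set[str], approve: Callable[[str], bool]) -> Middleware:
    """Short-circuit destructive tool calls unless the approver says yes."""
    def middleware(next_call: Generate) -> Generate:
        def wrapped(request: Dict) -> Dict:
            tool = request.get("tool")
            if tool in destructive and not approve(tool):
                return {"status": "rejected", "tool": tool}
            return next_call(request)
        return wrapped
    return middleware


def with_logging(log: List[str]) -> Middleware:
    """Record one line per call for observability."""
    def middleware(next_call: Generate) -> Generate:
        def wrapped(request: Dict) -> Dict:
            response = next_call(request)
            log.append(f"{request.get('tool')} -> {response['status']}")
            return response
        return wrapped
    return middleware


def compose(base: Generate, *middlewares: Middleware) -> Generate:
    """Wrap base so that the first middleware listed is the outermost layer."""
    call = base
    for mw in reversed(middlewares):
        call = mw(call)
    return call
```

Ordering is the main design choice: putting the approval check outside the retry layer means a rejected destructive call is never retried, and putting logging outermost means rejections are recorded too.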

Codex Hooks and Programmatic Tokens

Codex now supports hooks (scripts at key task points) and scoped programmatic tokens for Business/Enterprise automation.

Why this matters: Enables headless Codex automation comparable to your Claude Code dispatch setup.

LangSmith Engine

Watches production traces, clusters failures into named issues, and proposes targeted fixes with eval coverage.

Why this matters: Automated triage for agent failures at scale.
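
As a rough illustration of what "clustering failures into issues" can mean at its simplest, the sketch below groups trace errors by a normalized signature. This is an assumption-laden toy, not how LangSmith Engine works: the trace shape (dicts with "id" and "error" keys) and the helpers signature and cluster_failures are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Replace hex addresses and numbers so variable details do not split a cluster.
_VARIABLE_PARTS = re.compile(r"0x[0-9a-fA-F]+|\d+")


def signature(message: str) -> str:
    """Normalize an error message into a stable cluster key."""
    return _VARIABLE_PARTS.sub("<n>", message).lower().strip()


def cluster_failures(traces: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Group trace ids by normalized error signature."""
    groups = defaultdict(list)
    for trace in traces:
        groups[signature(trace["error"])].append(trace["id"])
    return dict(groups)
```

A production system would likely use semantic similarity rather than string normalization, but even this coarse grouping turns a long failure log into a short list of distinct issues.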

LangSmith LLM Gateway

Runtime governance with spend limits, PII redaction, and trace continuity built into the agent lifecycle.

Why this matters: Comparable to your LiteLLM gateway; adds governance layer.
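
For a sense of what a governance layer adds in front of a model call, here is a small Python sketch of a spend limit plus email redaction. It is illustrative only and does not reflect LangSmith's or LiteLLM's APIs: SpendGate, redact_pii, guarded_call, and the send callback are hypothetical names, and real PII detection covers far more than email addresses.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Assumption: email addresses stand in for PII in this sketch.
_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_pii(text: str) -> str:
    """Mask email addresses before the prompt leaves the gateway."""
    return _EMAIL.sub("[REDACTED_EMAIL]", text)


class BudgetExceeded(Exception):
    """Raised when a call would push spend past the configured limit."""


@dataclass
class SpendGate:
    limit_usd: float
    spent_usd: float = 0.0

    def charge(self, cost_usd: float) -> None:
        """Reserve budget for a call, or refuse if it would exceed the limit."""
        if self.spent_usd + cost_usd > self.limit_usd:
            raise BudgetExceeded(
                f"limit {self.limit_usd} USD would be exceeded by {cost_usd} USD"
            )
        self.spent_usd += cost_usd


def guarded_call(
    gate: SpendGate,
    prompt: str,
    estimated_cost_usd: float,
    send: Callable[[str], str],
) -> str:
    """Check the budget first, then forward a redacted prompt to the model."""
    gate.charge(estimated_cost_usd)
    return send(redact_pii(prompt))
```

Checking the budget before sending means a refused call costs nothing, at the price of relying on a cost estimate rather than the actual bill.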

Open Models & Local

Ollama v0.24.0 — Codex App

Ships a built-in Codex App with worktree support, git functionality, and an embedded browser for previewing local servers.

Why this matters: Local agent coding environment without cloud dependency.

Orthrus-Qwen3-8B: 7.8× tokens/forward

Multi-token prediction head on frozen Qwen3-8B backbone achieves up to 7.8× throughput with provably identical output distribution.

Why this matters: Massive local inference speedup on Apple Silicon without quality loss.
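
The "identical output distribution" claim is worth unpacking. The source does not say how Orthrus verifies its drafted tokens, so the sketch below uses the standard speculative-sampling accept/reject rule as an assumed stand-in: accept a drafted token x with probability min(1, p(x)/q(x)), otherwise resample from the normalized residual max(0, p - q). The function name speculative_output_distribution is hypothetical; it computes the exact distribution of the emitted token so the equality with the target p can be checked directly.

```python
from typing import List


def speculative_output_distribution(p: List[float], q: List[float]) -> List[float]:
    """Exact distribution of the emitted token under accept/reject verification.

    p is the target (backbone) distribution, q is the draft distribution.
    """
    # Probability that token x is drafted AND accepted: q(x) * min(1, p(x)/q(x)),
    # which simplifies to min(p(x), q(x)).
    accepted = [min(pi, qi) for pi, qi in zip(p, q)]
    reject_mass = 1.0 - sum(accepted)

    # On rejection, resample from the normalized residual max(0, p - q).
    residual = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
    total = sum(residual)
    if total == 0.0:
        # Draft and target agree exactly, so nothing is ever rejected.
        return accepted
    return [a + reject_mass * r / total for a, r in zip(accepted, residual)]
```

Because the residual mass sums to exactly the rejection probability, each entry reduces to min(p, q) + max(0, p - q) = p, which is why a verified draft can speed up decoding without changing what the model would have sampled.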

Qwen 3.6 35B MTP tested at 1M+ tokens

A user reports a ~1.5× tok/s improvement with MTP-enabled Qwen 35B at 300K context while building a full iterative project via Roo.

Why this matters: Real-world MTP performance data for local coding workflows.

llama.cpp b9174

Restructures UI to tools/ui folder with new naming conventions; incremental maintenance release.

Why this matters: Housekeeping — no action needed but tracks your local stack.

Industry & Trends

Anthropic Passes OpenAI in Business Adoption

Anthropic quadrupled business adoption in 12 months; more businesses used Anthropic than OpenAI in April 2026.

Why this matters: Validates your bet on Claude as primary stack.

Cerebras $5.5B IPO at ~$40B Valuation

Largest IPO of 2026; 20× oversubscribed. Cerebras serves trillion-parameter models, including OpenAI's GPT-5.4/5.5.

Why this matters: Watch but don’t act — signals inference hardware demand growth.

Microsoft Shopping for OpenAI Replacement

Following its amended deal with OpenAI, Microsoft is reportedly looking to acquire Inception (diffusion-based LMs) as it diversifies beyond OpenAI.

Why this matters: Multi-provider strategy increasingly validated at platform level.

Vercel AI Gateway Production Trends

7 months of traffic across 200K+ teams shows rapid agentic workload growth, increasing open-source model adoption, and heavy multi-model routing.

Why this matters: Benchmarks for your LiteLLM gateway routing decisions.

Agent-Driven App Rewrites Now Viable

Simon Willison reports a medium-sized company completed a coding-agent-driven rewrite of legacy mobile apps — languages are no longer lock-in.

Why this matters: Evidence for the one-person-team leverage shift you write about.

New Gemini Model at Google I/O Tuesday

Google is reportedly announcing a new Gemini model at I/O that is roughly on par with GPT-5.5.

Why this matters: Potential new option for your model gateway routing.

How OpenAI Built the Codex Windows Sandbox

Detailed engineering of Codex’s sandboxing: constrained file access, networking, and local commands while preserving agent effectiveness.

Why this matters: Reference architecture for sandboxing your own headless agents.

Sources unavailable today: AI Tidbits (Sahar Mor), Sourcegraph blog

Auto-curated daily by Claude Opus 4.7 from Ben’s Bites, Don’t Worry About the Vase (Zvi), Exponential View (Azeem Azhar), GitHub: anthropics/claude-code, GitHub: ggml-org/llama.cpp, GitHub: ollama/ollama, Hugging Face blog, LangChain blog, Latent Space, NVIDIA developer blog, OpenAI blog, Simon Willison, TLDR AI, The Algorithmic Bridge (Alberto Romero), The Pragmatic Engineer (Gergely Orosz), Together AI blog, Understanding AI (Timothy B. Lee), Vercel blog, r/ClaudeAI top, r/LocalLLaMA top, smol.ai news. Source list and editorial profile maintained by Daniel.