AI Briefing

Claude Code v2.1.141, Cursor Changelog, Codex Windows Sandbox

Thursday, 14 May 2026 - AI News · (last 24 hours)

Claude Code ships hook notifications and workspace identity federation; Cursor and OpenAI Codex also push updates on the same day.

Must read

Tools & Frameworks

Ollama v0.23.4: vision model support in opencode, Claude tool-result fix

ollama launch opencode now accepts image inputs for vision models; fixes Claude tool-calling formatting with local image paths.

Why this matters: Relevant if you route local models through Ollama via LiteLLM.

Cline v3.83.0: MCP server validation fix, OpenRouter Qwen cache

Fixes validation for MCP servers that require object params; enables prompt cache control for Qwen models on OpenRouter.

Why this matters: MCP validation fix matters if you test in-house MCP servers with Cline.

TextGen (oobabooga) ships as native desktop app: open-source LM Studio alternative

text-generation-webui reborn as a no-install Electron desktop app for Windows, Linux, and macOS with polished UI.

Why this matters: Another local inference frontend option for your Apple Silicon setup.

Vercel: Trusted Sources for Deployment Protection via OIDC

Protected deployments now accept short-lived OIDC tokens from authorised Vercel projects and external services, replacing long-lived bypass secrets.

Why this matters: You deploy on Vercel; this tightens CI/CD auth for protected previews.

Open Models & Local

24+ tok/s from Qwen 3.6 35B-A3B on GTX 1080 with 128k context via TurboQuant KV

llama.cpp’s TurboQuant/RotorQuant KV cache quantisation fits 128k context in 8 GB VRAM; Gemma 4 26B-A4B also tested at ~20 tok/s.

Why this matters: Shows MoE models are now viable for agentic coding on modest hardware.

llama.cpp Docker images for MTP (multi-token prediction) models

Pre-built Docker images simplify running MTP-enabled Gemma 4 and similar models without compiling from the MTP branch.

Why this matters: Lowers the barrier to testing MTP speedups in your local-plus-cloud hybrid workflow.

llama.cpp b9133: continue generation on reasoning models

Server and WebUI now support assistant prefill continuation on reasoning models, routing thinking tags correctly for CoT streaming.

Why this matters: Enables partial-generation resume for local reasoning models in agentic loops.

Ovis2.6-80B-A3B: MoE multimodal model with 3B active params

80B-total / 3B-active MoE multimodal LLM with long-context, high-res document understanding, and visual reasoning capabilities.

Why this matters: Watch, but don’t act; the MoE efficiency is interesting, but there is no coding eval signal yet.

Web search for AI agents degrading as Google kills free index, Cloudflare blocks bots

Google’s free search API drops to 50 domains; Cloudflare + GoDaddy now challenge all AI bot traffic by default, breaking agent web-search tooling.

Why this matters: Directly impacts any MCP web-search server or Firecrawl-based pipeline you run.

Simon Willison built the Datasette blog with OpenAI Codex desktop, shared full transcript

Full Markdown session transcript of building a blog with Codex desktop, a real-world agentic coding workflow example from a respected practitioner.

Why this matters: Useful comparison point for your own Claude Code workflow documentation.

Latent Space: Codex rises, Claude meters programmatic usage

Latent Space reports on Codex’s rise among the major coding agents and on Anthropic’s new metering for programmatic Claude usage.

Why this matters: Signals potential cost changes for your Claude Code API spend.


Sources unavailable today: AI Tidbits (Sahar Mor), Andrej Karpathy, Ben’s Bites, Chip Huyen, CrewAI blog, Eric Jang, Exponential View (Azeem Azhar), GitLab blog, Google DeepMind blog, Google Research blog, Hacker News (AI), Hugging Face blog, LangChain blog, Sourcegraph blog, TLDR AI, The Algorithmic Bridge (Alberto Romero), The Pragmatic Engineer (Gergely Orosz), r/ClaudeAI top, smol.ai news, swyx.io

Auto-curated daily by Claude Opus 4.7 from Cursor changelog, Don’t Worry About the Vase (Zvi), GitHub: anthropics/claude-code, GitHub: cline/cline, GitHub: ggml-org/llama.cpp, GitHub: ollama/ollama, Latent Space, NVIDIA developer blog, OpenAI blog, Simon Willison, Vercel blog, r/LocalLLaMA top, r/MachineLearning top. Source list and editorial profile maintained by Daniel.