AI Briefing

Gemini 3.5 Flash, Anthropic Acquires Stainless, Composer 2.5

Wednesday, May 20, 2026 - AI News · (last 24 hours)

At I/O, Google shipped Gemini 3.5 Flash to GA with improved agentic execution. Anthropic acquired Stainless, and Cursor released Composer 2.5.

Must read

Tools & Frameworks

Cursor Changelog May 19

Cursor shipped its May 19 changelog alongside the Composer 2.5 blog post detailing RL-based agent improvements.

Why this matters: Check for incremental fixes beyond the Composer 2.5 headline.

Cline CLI v3.0.9

Adds concurrent plugin loading, cached tool descriptors, and fuzzy @-mention file picker restore, with an emphasis on startup speed improvements.

Why this matters: Relevant if evaluating Cline as a headless alternative.

Manus Scheduled Tasks 2.0

Tasks now run with persistent context across projects and apps, enabling continuity in automated workflows.

Why this matters: Comparable to your overnight-agent-factory pattern.

LangSmith Engine: An Agent for Improving Agents

LangChain details how they built an agent that iterates on other agents’ prompts and evals inside LangSmith.

Why this matters: Meta-agent eval pattern relevant to your team’s agent orchestration.

Agent Evaluation: A Detailed Guide

Comprehensive guide covering realistic harnesses, long-horizon testing, and outcome-oriented eval for production agents.

Why this matters: Directly applicable to verifying your 22,000-line-PR agent outputs.

Open Models & Local

Qwen3.7 Preview lands on Arena

Qwen3.7 Max Preview ranks 13th overall in Text Arena; Plus Preview ranks 16th in Vision Arena.

Why this matters: Tracks the Qwen family you run locally via Ollama.

Political censorship inside Qwen3.5-9B’s weights

Censorship is a small circuit layered on top of intact factual knowledge, and it can be read and disabled without fine-tuning.

Why this matters: Actionable if you deploy Qwen locally and need uncensored outputs.

llama.cpp b9235: MTP clean-up

Major MTP (multi-token prediction) clean-up: re-enables p-min with MTP drafts, fixes ngram spec acceptance logic.

Why this matters: MTP speculative decoding boosts local inference speed on Apple Silicon.

HRM-Text: 1B model trainable for ~$800

1B text-gen model trained on 8 H100s in ~50 hours using 130–600× less compute than standard foundation models.

Why this matters: Watch — interesting architecture efficiency, not yet coding-focused.

KV cache quantization benchmarks: q5 deserves more attention

Thorough PPL/KLD benchmarks on Qwen 3.6 27B at 64k–128k context show q5 outperforms TurboQuant; symmetric q8 wastes VRAM.

Why this matters: Directly useful for your local Qwen quantisation choices.

Google I/O 2026 Roundup: Flash, Spark, Antigravity 2.0

Covers Gemini 3.5 Flash GA, Spark background agents, Omni video model, and Antigravity 2.0 IDE features from I/O Day 1.

Why this matters: Single-page overview of everything Google shipped today.

Gemini 3.5 Flash on Vercel AI Gateway

Vercel AI Gateway now routes to Gemini 3.5 Flash with medium thinking level default and parallel agentic loops.

Why this matters: You deploy on Vercel — one-click access to the new model.

Together AI: Coding agent inference benchmarks

Together AI reports 31% more TPS than TensorRT-LLM, 2× better TTFT at saturation, and 76% lower cost than Claude Opus 4.6 for coding agents.

Why this matters: Cost/latency data relevant if you route coding tasks away from Anthropic.

AI’s impact on software engineers in 2026 (Part 2)

Gergely Orosz covers tradeoffs of AI tooling adoption at company level, with survey data on what’s changed in two years.

Why this matters: Useful framing for your own team’s adoption story and talks.

xAI launches Skills for Grok

Users can teach Grok persistent functions it remembers across interactions — similar to Claude’s skills/memory pattern.

Why this matters: Watch — validates the skills-framework pattern you’re building on.

Jury dismisses all Musk claims against OpenAI

Musk’s lawsuit against Altman and OpenAI was dismissed after the jury ruled that he waited too long to file. He plans to appeal.

Why this matters: Removes an existential legal overhang from the OpenAI ecosystem.


Sources unavailable today: Eric Jang

Auto-curated daily by Claude Opus 4.7 from Ben’s Bites, Cursor changelog, Don’t Worry About the Vase (Zvi), GitHub: anthropics/claude-code, GitHub: cline/cline, GitHub: ggml-org/llama.cpp, Hacker News (AI), Hugging Face blog, LangChain blog, Latent Space, Lenny’s Newsletter, NVIDIA developer blog, OpenAI blog, Simon Willison, TLDR AI, The Algorithmic Bridge (Alberto Romero), The Pragmatic Engineer (Gergely Orosz), Together AI blog, Vercel blog, r/ClaudeAI top, r/LocalLLaMA top, r/MachineLearning top. Source list and editorial profile maintained by Daniel.