MCP Spec Overhaul, Anthropic Mythos 1, DeepSeek V4 Price Cut
Dienstag, 26. Mai 2026 - AI News · (letzte 24h)
The next MCP specification release candidate drops with a stateless HTTP core, OAuth alignment, and breaking changes shipping 28 July.
Must read
- MCP 2026-07-28 Specification Release Candidate — Stateless HTTP core, OAuth/OIDC auth, and breaking changes — your in-house MCP servers will need migration before July 28.
- Anthropic prepares Mythos 1 for Claude Code and Claude Security — A new model tier purpose-built for Claude Code and security scanning is imminent; Opus 4.8 also rumoured.
- Anthropic’s march to profitability — Claude Code alone at $2.5B revenue validates the agentic-dev bet; Anthropic expects profit by October IPO.
- DeepSeek V4 Pro permanent 75% price cut — Permanent cost reduction on a frontier-class model directly lowers your LiteLLM routing costs for DeepSeek workloads.
- Evaluating Multi-Agent Systems at Scale — Population-level trace analysis for agentic systems — directly applicable to verifying your overnight agent factory output.
Tools & Frameworks
Reasonix: DeepSeek-native terminal coding agent
Terminal coding agent engineered for prefix-cache stability; designed to run unattended with low token costs across long sessions.
Why this matters: Potential alternative to Claude Code for DeepSeek-routed headless agent tasks.
Perplexity open-sources Bumblebee security scanner
Read-only scanner that identifies risky packages, extensions, and AI tool configs on developer machines — now open source.
Why this matters: Useful for auditing your team’s MCP server and extension surface area.
Anthropic plans Claude Memory Files
Memory Files distributes notes across multiple structured documents organised by topic, project, or context.
Why this matters: Structured persistent memory aligns with your .md-based agent memory patterns.
Cline v3.85.0: DeepSeek V4, Gemini 3.5 Flash support
Adds DeepSeek V4 Flash/Pro, Gemini 3.5 Flash, GPT-5.5 on SAP AI Core, and fixes Vertex Claude routing.
Why this matters: Model roster update if any of your team uses Cline alongside Cursor.
Open Models & Local
Gemini 3.5 Flash (Low) outperforms (High) on SWE tasks
Gemini 3.5 Flash (Low) generates ~45% fewer tokens than Medium and generally beats High on SWE benchmarks.
Why this matters: Counter-intuitive routing signal: lower thinking budget = better code — relevant for your LiteLLM config.
MiMo-V2.5-coder Q2 quantisation released
Q2 quant of MiMo-V2.5-coder fits in 128GB RAM; reported as strong alternative to Qwen3.6 and DS4 for coding with reliable tool calling.
Why this matters: If you have a Mac Studio with 128GB, this is a credible local coding model with agentic tool-call support.
Qwen3.6 35B A3B: current king for local agentic use?
Community consensus: Qwen3.6 35B MoE (IQ4_NL) leads for local agentic tool-calling; Gemma4 and GLM 4.7 Flash loop or break.
Why this matters: Practical field report for choosing your local model in hybrid routing setups.
NuExtract3: 4B VLM for OCR and structured extraction (Apache-2.0)
4B open-weight model based on Qwen3.5-4B; handles PDFs, forms, tables → Markdown extraction, Apache-2.0 licensed.
Why this matters: Directly useful for identity/RegTech document processing pipelines you could self-host.
llama.cpp b9310: checkpoint fix for agentic sessions
Fixes checkpoint creation so long agentic sessions don’t stall on trivial follow-up prompts — critical for tools like opencode.
Why this matters: If you run local models for agentic coding, this eliminates a painful stall bug.
Industry & Trends
How the engineer behind Claude Cowork actually uses it
Felix Rieseberg (Anthropic) demos building 3D house walkthroughs from floor plans, auto-tracking promises, and a $20 hardware buddy with Claude.
Why this matters: First-party insight into Anthropic’s own dogfooding of Claude Cowork — patterns you can steal.
METR AI time horizons graph contains severe errors
NYU researcher documents numerous compounding flaws in the widely-cited METR Long Tasks benchmark for coding capabilities.
Why this matters: If you’ve cited METR internally to justify agentic investment, the methodology is now contested.
Hugging Face: AI Agent Terms Worth Getting Right
Defines harness, scaffold, and other agent terminology with precision — useful shared vocabulary for cross-team alignment.
Why this matters: Helpful reference when your team debates agent architecture boundaries.
Auto-curated daily by Claude Opus 4.7 from Exponential View (Azeem Azhar), GitHub: BerriAI/litellm, GitHub: cline/cline, GitHub: ggml-org/llama.cpp, Hugging Face blog, Lenny’s Newsletter, OpenAI blog, Simon Willison, TLDR AI, The Algorithmic Bridge (Alberto Romero), r/ClaudeAI top, r/LocalLLaMA top, r/MachineLearning top. Source list and editorial profile maintained by Daniel.