Prompting in the Agentic Era — From Prompt Engineering to System Design
How prompting has evolved from clever tricks to system-level specification — CLAUDE.md files, structured outputs, tool-use patterns, and when prompting still matters.
The Shift: From Prompt Engineering to System Specification
In 2023-2024, “prompt engineering” was the hot skill. Developers spent hours crafting elaborate prompts with precise wording, chain-of-thought instructions, and few-shot examples to coax better outputs from models. There was a cottage industry of “prompt engineering” courses and certifications.
In 2026, that world looks quaint. Not because prompting doesn’t matter — it does — but because the nature of what matters has changed fundamentally. Modern frontier models (Claude 4, GPT-5, Gemini 2.5) are dramatically better at understanding intent without elaborate scaffolding. More importantly, the shift to agentic AI and tool use means the critical skill is no longer writing clever individual prompts but designing systems that guide AI behavior over multi-step workflows.
What Matters Now
1. CLAUDE.md and Project Specification Files
The most impactful “prompting” I do in 2026 isn’t a prompt at all — it’s writing CLAUDE.md files. These project-level specification documents tell AI agents everything they need to know about a codebase: architecture decisions, tech stack, coding conventions, design system tokens, file naming patterns, testing strategies.
A well-crafted CLAUDE.md file is worth more than a thousand clever prompts because it sets context once and applies to every interaction. It’s the difference between telling an employee what to do on each task versus giving them an onboarding document that lets them make good decisions autonomously.
What goes in a good CLAUDE.md:
- Project overview and purpose
- Tech stack and key dependencies
- Architecture patterns and conventions
- File structure and naming rules
- Design system tokens and component patterns
- Testing expectations
- Common commands (build, test, lint, deploy)
- Domain-specific terminology and business rules
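The checklist above can be sketched as a skeleton file. Everything here is an illustrative placeholder — the project, stack, and conventions are invented for the example, not a prescribed layout:

```markdown
# CLAUDE.md

## Project Overview
Internal order-management dashboard for the support team.

## Tech Stack
- TypeScript, React 18, Vite
- PostgreSQL via Prisma

## Conventions
- Components live in `src/components/`, one file per component, PascalCase names.
- All colors come from design tokens in `src/theme/tokens.ts`; never hard-code hex values.

## Testing
- Vitest for unit tests; every new component needs a matching `*.test.tsx` file.

## Common Commands
- `npm run dev` — local dev server
- `npm test` — run unit tests
- `npm run lint` — ESLint + Prettier check

## Domain Notes
- "Order" and "shipment" are distinct entities; one order can have many shipments.
```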
2. System Prompts and Tool Definitions
For production AI features, the “prompt” is really a system design problem:
- System prompts define the AI’s role, constraints, and behavioral boundaries. They’re less about tricks and more about clear specification of what the system should and shouldn’t do.
- Tool definitions (via MCP or function calling) specify what actions the AI can take. The quality of your tool descriptions directly impacts how well the model uses them. Clear parameter descriptions, good examples, and explicit constraints matter enormously.
- Structured output schemas (JSON schema, Pydantic models) ensure the AI returns data in formats your application can reliably parse. This replaces the old pattern of hoping the model formats its response correctly.
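To make the second and third points concrete, here is a minimal sketch of a tool definition and a structured-output check. The wrapper keys follow the common function-calling JSON shape, but exact field names vary by provider, and `lookup_order` and `parse_review` are illustrative names, not a real API:

```python
import json

# Tool definition in a function-calling style (illustrative shape).
# The description text is what the model actually reads when deciding
# whether and how to call the tool, so it carries most of the weight.
lookup_order_tool = {
    "name": "lookup_order",
    "description": (
        "Fetch a single order by its ID. Use this before answering any "
        "question about order status; never guess order details."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "order_id": {
                "type": "string",
                "description": "Order identifier, e.g. 'ORD-10482'.",
            },
        },
        "required": ["order_id"],
    },
}

# Structured-output schema the application can parse reliably,
# instead of hoping the model formats free text correctly.
review_schema = {
    "type": "object",
    "properties": {
        "is_duplicate": {"type": "boolean"},
        "fraud_risk": {"type": "string", "enum": ["low", "medium", "high"]},
        "reason": {"type": "string"},
    },
    "required": ["is_duplicate", "fraud_risk", "reason"],
}

def parse_review(raw: str) -> dict:
    """Parse a model response and enforce the schema's required keys."""
    data = json.loads(raw)
    missing = [k for k in review_schema["required"] if k not in data]
    if missing:
        raise ValueError(f"response missing keys: {missing}")
    return data
```

In production you would validate against the full schema (e.g. with a JSON Schema validator or Pydantic model) rather than just checking required keys, but the principle is the same: the format contract lives in code, not in prompt wording.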
3. Evaluation Over Intuition
The 2024 approach: “Does this prompt feel like it works?” The 2026 approach: systematic evaluation. For any production AI feature, you need:
- A test set of representative inputs
- Clear success criteria (accuracy, format compliance, safety)
- Automated evaluation pipelines
- A/B testing for prompt changes
“Vibes-based testing” is the prompt engineering equivalent of not writing unit tests. It works until it doesn’t, and then you’re debugging production failures with no reproducibility.
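A systematic evaluation loop can be tiny. This sketch stubs out the model call (`call_model` is a stand-in for a real client, and the test set is invented) but shows the structure: a fixed dataset, explicit success criteria, and an aggregate score you can compare across prompt versions:

```python
# Minimal evaluation harness: fixed test set, explicit success criteria,
# aggregate score -- instead of "does this prompt feel like it works?".
TEST_SET = [
    {"input": "Order ORD-1 and ORD-1 placed 5s apart, same items.",
     "expect_duplicate": True},
    {"input": "Single order ORD-2, normal checkout.",
     "expect_duplicate": False},
]

def call_model(prompt: str, text: str) -> dict:
    # Stub so the harness runs standalone; replace with a real API call.
    return {"is_duplicate": "ORD-1 and ORD-1" in text}

def evaluate(prompt: str) -> float:
    """Return fraction of test cases passing the success criteria."""
    passed = 0
    for case in TEST_SET:
        out = call_model(prompt, case["input"])
        # Criteria: output is a parseable boolean AND the label is correct.
        if (isinstance(out.get("is_duplicate"), bool)
                and out["is_duplicate"] == case["expect_duplicate"]):
            passed += 1
    return passed / len(TEST_SET)
```

Run `evaluate()` on every prompt variant before shipping it; a score you can diff is what makes A/B testing of prompt changes possible.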
What Still Works from Classic Prompting
Not everything from the prompt engineering era is obsolete. These fundamentals still apply:
Role and Context Setting
Models still perform better when they understand the perspective to take. But in 2026, this is more often set in system prompts or CLAUDE.md files than in individual user messages.
```
You are a senior order management specialist reviewing flagged transactions. Focus on identifying potential duplicate orders and fraud indicators.
```
Chain-of-Thought for Complex Reasoning
Asking models to think step-by-step still improves accuracy on complex tasks. Some models (Claude with extended thinking, o1/o3 series) do this automatically, but explicit CoT instructions help when using smaller or local models.
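When you do prompt for explicit chain-of-thought, it helps to pin down where the final answer lands so your code can extract it past the reasoning. A sketch, with invented prompt wording:

```python
# Explicit CoT instruction for models that don't reason step-by-step
# automatically, plus a parser that skips the reasoning and keeps only
# the final answer line.
COT_PROMPT = (
    "A store sold 14 items on Monday and twice as many on Tuesday. "
    "How many items were sold in total?\n"
    "Think step by step, then give the final answer on a line "
    "starting with 'Answer:'."
)

def extract_answer(response: str) -> str:
    """Return the text after 'Answer:' on the model's final-answer line."""
    for line in response.splitlines():
        if line.startswith("Answer:"):
            return line.removeprefix("Answer:").strip()
    raise ValueError("no answer line found")
```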
Few-Shot Examples for Format Specification
When you need output in a very specific format, showing 1-2 examples is still more effective than describing the format abstractly. This is especially true for structured data extraction and classification tasks.
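A classification prompt with two worked examples, for instance, constrains the output format far more tightly than a paragraph describing it would. The labels and tickets below are illustrative:

```python
# Few-shot format specification: two worked examples pin down the exact
# output shape (one lowercase label, nothing else).
FEW_SHOT_PROMPT = """Classify each support ticket as one of: billing, shipping, other.
Respond with exactly one lowercase label.

Ticket: "I was charged twice this month."
Label: billing

Ticket: "My package shows delivered but never arrived."
Label: shipping

Ticket: "{ticket}"
Label:"""

def build_prompt(ticket: str) -> str:
    # Normalize quotes so the ticket text can't break the example format.
    return FEW_SHOT_PROMPT.format(ticket=ticket.replace('"', "'"))
```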
Explicit Constraints
Setting boundaries remains important: maximum length, forbidden topics, required sections, output format. Models follow constraints better than ever, but you still need to state them.
What’s Less Important Now
Elaborate Prompt Scaffolding
You no longer need to trick models into good behavior with elaborate role-playing scenarios or psychological manipulation (“You will be penalized for wrong answers”). Modern models understand straightforward instructions.
Prompt Chaining for Simple Tasks
Multi-prompt chains that were necessary to work around context or capability limits are often unnecessary now. A single well-specified request to a frontier model handles what used to require 3-4 chained prompts.
Hyper-Specific Wording
The obsession with exact phrasing (“Use ‘analyze’ not ‘look at’”) has faded. Frontier models are robust to paraphrasing. Clarity of intent matters; specific word choice rarely does.
The New Prompting Checklist (2026 Edition)
For production AI features:
- Define clear system prompts with role, constraints, and safety boundaries
- Design tool definitions with precise descriptions and parameter schemas
- Specify structured output formats (JSON schema)
- Build evaluation datasets before optimizing prompts
- Version control your prompts alongside your code
- Use MCP for tool integration rather than bespoke function calling
For agentic development workflows:
- Write comprehensive CLAUDE.md files for every project
- Define clear task specifications before invoking agents
- Include acceptance criteria and test expectations in specs
- Keep context focused — don’t dump everything, provide what’s relevant
For ad hoc usage (research, writing, analysis):
- State the goal clearly and directly
- Provide relevant context upfront
- Specify output format if it matters
- Iterate naturally — modern models handle conversational refinement well
- Use extended thinking / reasoning modes for complex analytical tasks
Practical Template for System Prompts
```
You are [role] for [product/system].

## Core Behavior
[What the system does and how it should approach tasks]

## Constraints
- [Hard boundaries: topics to avoid, actions not to take]
- [Safety requirements]
- [Output format requirements]

## Tools Available
[Brief description of each tool and when to use it]

## Examples
[1-2 representative input/output pairs]

## Error Handling
[What to do when uncertain, when tools fail, when input is ambiguous]
```
The meta-lesson: prompting has matured from an art into an engineering discipline. The “10x prompt engineer” of 2024 has become the “AI system designer” of 2026 — someone who thinks about the entire pipeline from system prompt to tool definitions to evaluation to monitoring. The individual clever prompt matters less; the system design matters more.