Context Engineering and Management
Overview
Context engineering is the practice of designing and managing the information provided to LLMs to optimize their performance on specific tasks [23].
"Context engineering is providing the right information and tools in the right format so the LLM can accomplish a task. This is the number one job of AI engineers."
Prompt Engineering Techniques
Zero-Shot Prompting
Direct task instruction without examples [24]:
Classify the sentiment of this review as positive, negative, or neutral:
"The product arrived on time and works great!"
Sentiment:
Few-Shot Prompting
Provide examples to guide the model [25]:
Classify the sentiment:
Review: "Terrible quality, broke after one day"
Sentiment: negative
Review: "It's okay, nothing special"
Sentiment: neutral
Review: "Best purchase I've ever made!"
Sentiment: positive
Review: "The product arrived on time and works great!"
Sentiment:
Chain-of-Thought (CoT)
Encourage step-by-step reasoning [26]:
Solve this problem step by step:
A store has 50 apples. They sell 23 apples in the morning
and receive a shipment of 35 apples. How many apples do they have now?
Let's think through this step by step:
1. Starting apples: 50
2. After morning sales: 50 - 23 = 27
3. After shipment: 27 + 35 = 62
Answer: 62 apples
Context Window Management
The Challenge
LLMs have finite context windows, and their ability to attend to relevant content tends to degrade as the context grows longer. Effective context management is therefore crucial for agent performance.
Strategies
| Strategy | Description | Trade-off |
|---|---|---|
| Truncation | Remove oldest content | May lose important context |
| Summarization | Compress older content | Loses detail, adds latency |
| Selective Retrieval | Only include relevant content | May miss connections |
| Hierarchical | Multi-level summaries | Complex to implement |
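The first two strategies in the table can be sketched as follows. This is a minimal illustration under stated assumptions: tokens are approximated by whitespace-separated words, and `count_tokens`, `truncate_oldest`, `summarize_oldest`, and the placeholder `fake_summary` are hypothetical names, with the summarizer standing in for what would normally be an LLM call.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    """Crude token estimate (assumption): one token per whitespace-separated word."""
    return len(text.split())


def truncate_oldest(messages: List[str], budget: int) -> List[str]:
    """Truncation: drop the oldest messages until the rest fit in the budget."""
    kept: List[str] = []
    used = 0
    # Walk from newest to oldest so the most recent messages are kept first.
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    return list(reversed(kept))


def summarize_oldest(
    messages: List[str],
    budget: int,
    summarize: Callable[[List[str]], str],
) -> List[str]:
    """Summarization: replace the messages that do not fit with one summary."""
    recent = truncate_oldest(messages, budget)
    older = messages[: len(messages) - len(recent)]
    if not older:
        return recent
    # The summary itself consumes tokens; a real system would budget for it.
    return [summarize(older)] + recent


def fake_summary(old: List[str]) -> str:
    """Placeholder summarizer (assumption): a real one would call an LLM."""
    return f"[summary of {len(old)} earlier messages]"


history = [
    "user: my order 12345 is late",
    "agent: checking the carrier status now",
    "user: any update on the delivery",
]
truncated = truncate_oldest(history, budget=12)
summarized = summarize_oldest(history, budget=12, summarize=fake_summary)
print(truncated)
print(summarized)
```

With a 12-word budget, truncation keeps only the two most recent messages, while summarization keeps the same two and prepends a one-line stand-in for the dropped message. That is the detail-versus-latency trade-off the table describes.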
Context Compression
Techniques for fitting more information into limited context [27]:
Techniques
- Extractive Summarization: Select key sentences
- Abstractive Summarization: Generate concise summaries
- Entity Extraction: Keep only key entities and relationships
- Structured Representation: Convert to compact formats (JSON, tables)
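As a rough sketch of the structured-representation technique, a small formatter can turn a flat record into compact `Key: value` lines. The helper name `compress_record` and the field names are assumptions made for illustration; the data reuses the customer record from the example that follows.

```python
from typing import Dict, List


def compress_record(record: Dict[str, str], per_line: int = 3) -> str:
    """Render a flat record as compact 'Key: value' pairs, a few per line."""
    pairs = [f"{key}: {value}" for key, value in record.items()]
    lines: List[str] = []
    for start in range(0, len(pairs), per_line):
        lines.append(" | ".join(pairs[start:start + per_line]))
    return "\n".join(lines)


# Hypothetical field names; values taken from the document's own example.
customer = {
    "User": "John Smith",
    "Age": "35",
    "Location": "NYC",
    "Customer since": "Jan 2020",
    "Purchases": "47 ($3,245.67)",
    "Payment": "CC *4532",
}
compact = compress_record(customer)
print(compact)
```

Dropping the connective prose trades some readability for a shorter context. How many tokens this saves depends on the tokenizer in use.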
Example
# Before compression (verbose)
"""
The user John Smith, who is 35 years old and lives in
New York City, has been a customer since January 2020.
He has made 47 purchases totaling $3,245.67. His preferred
payment method is credit card ending in 4532. He has
contacted support 3 times, most recently about a shipping
delay on order #12345.
"""
# After compression (structured)
"""
User: John Smith | Age: 35 | Location: NYC
Customer since: Jan 2020 | Purchases: 47 ($3,245.67)
Payment: CC *4532 | Support contacts: 3
Recent issue: Shipping delay (Order #12345)
"""
Position-Aware Context
LLMs tend to use content at the beginning and end of the context more reliably than content in the middle (the "lost in the middle" phenomenon).
Best Practices
- Front-load Critical Information: Put most important content first
- Repeat Key Instructions: Reiterate important points at the end
- Use Clear Delimiters: Separate sections with clear markers
- Prioritize Recency: Recent information is often more relevant
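The practices above could be combined in a prompt assembler along these lines. This is a sketch, not a prescribed layout: the `build_context` function, the `###` delimiters, and the sample strings are all assumptions chosen for illustration.

```python
from typing import List


def build_context(instructions: str, documents: List[str], question: str) -> str:
    """Assemble a prompt with key instructions first and repeated at the end."""
    # Front-load the critical instructions.
    parts = ["### INSTRUCTIONS", instructions]
    # Clear delimiters keep each document separate. Documents are assumed to
    # be ordered oldest to newest, so the most recent sits nearest the question.
    for index, doc in enumerate(documents, start=1):
        parts.append(f"### DOCUMENT {index}")
        parts.append(doc)
    parts.append("### QUESTION")
    parts.append(question)
    # Repeat the key instructions so they also appear at the end of the context.
    parts.append("### REMINDER")
    parts.append(instructions)
    return "\n".join(parts)


prompt = build_context(
    instructions="Answer using only the documents. Cite the document number.",
    documents=["Older note about the order.", "Newer note about the order."],
    question="What is the latest status of the order?",
)
print(prompt)
```

Repeating the instructions costs a few extra tokens, which is usually a small price when the middle of the context is long.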
Filesystem-Based Context Engineering
Deep Agents use the filesystem as a context management tool, enabling:
- Persistent Storage: Information survives context resets
- Selective Loading: Only load relevant files
- Structured Organization: Organize by topic/task
- Version Control: Track changes over time
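A minimal sketch of persistent storage and selective loading, using only the Python standard library. It is not the Deep Agents API: `save_note` and `load_relevant` are hypothetical helpers, a temporary directory stands in for the agent's workspace, and the file names mirror the example structure that follows. Version control is left out of the sketch.

```python
import tempfile
from pathlib import Path
from typing import List


def save_note(workspace: Path, relative_path: str, text: str) -> Path:
    """Persist a note to disk so it survives a context reset."""
    target = workspace / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")
    return target


def load_relevant(workspace: Path, keyword: str) -> List[str]:
    """Selective loading: read only files whose name contains the keyword."""
    matches = sorted(p for p in workspace.rglob("*.md") if keyword in p.name)
    return [p.read_text(encoding="utf-8") for p in matches]


# A temporary directory stands in for the agent's workspace (assumption).
with tempfile.TemporaryDirectory() as tmp:
    workspace = Path(tmp)
    save_note(workspace, "plan.md", "1. Research topic A")
    save_note(workspace, "findings/topic_a.md", "Topic A notes")
    save_note(workspace, "findings/topic_b.md", "Topic B notes")
    # Only the matching file is pulled back into context.
    loaded = load_relevant(workspace, "topic_a")

print(loaded)
```

Matching on file names is the simplest possible relevance test; a real agent might instead consult an index or let the model choose which files to open.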
Example Structure
/workspace
├── plan.md              # Current task plan
├── findings/            # Research results
│   ├── topic_a.md
│   └── topic_b.md
├── memories/            # Long-term knowledge
│   ├── user_prefs.md
│   └── past_tasks.md
└── scratch/             # Temporary working files
    └── draft.md
Context Engineering Best Practices
- Prioritize Ruthlessly: Identify and keep only critical information
- Be Position-Aware: Place critical info at beginning and end of context
- Use External Memory: Vector databases for long-term storage
- Compress, Don't Truncate: Summarize older information
- Tune Prompts: Guide attention with prompt engineering
System Prompt Design
Components
| Component | Purpose | Example |
|---|---|---|
| Role Definition | Set agent identity | "You are a helpful coding assistant" |
| Capabilities | Define what agent can do | "You can search the web and execute code" |
| Constraints | Set boundaries | "Never share personal information" |
| Output Format | Specify response structure | "Always respond in JSON format" |
| Examples | Demonstrate expected behavior | Few-shot examples |
Example System Prompt
You are an AI research assistant specializing in technical documentation.
CAPABILITIES:
- Search the web for information
- Read and analyze documents
- Write structured reports
CONSTRAINTS:
- Always cite sources
- Acknowledge uncertainty
- Stay focused on the task
OUTPUT FORMAT:
- Use markdown formatting
- Include section headers
- Provide references at the end
When asked to research a topic:
1. First search for relevant information
2. Analyze and synthesize findings
3. Present a structured summary with citations
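The components from the table above can also be assembled programmatically, which keeps each part easy to edit on its own. The sketch below is one possible arrangement: `section` and `build_system_prompt` are hypothetical helpers, and the strings are taken from the example system prompt.

```python
from typing import List


def section(title: str, items: List[str]) -> str:
    """Format one labeled section as a title line followed by a bullet list."""
    return "\n".join([f"{title}:"] + [f"- {item}" for item in items])


def build_system_prompt(
    role: str,
    capabilities: List[str],
    constraints: List[str],
    output_format: List[str],
) -> str:
    """Join the role line and the labeled sections into one system prompt."""
    return "\n".join([
        role,
        section("CAPABILITIES", capabilities),
        section("CONSTRAINTS", constraints),
        section("OUTPUT FORMAT", output_format),
    ])


system_prompt = build_system_prompt(
    role="You are an AI research assistant specializing in technical documentation.",
    capabilities=["Search the web for information", "Read and analyze documents"],
    constraints=["Always cite sources", "Acknowledge uncertainty"],
    output_format=["Use markdown formatting", "Include section headers"],
)
print(system_prompt)
```

Keeping the components as separate arguments makes it straightforward to swap a constraint or an output rule without rewriting the whole prompt.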