Source: https://developers.openai.com/cookbook/examples/gpt-5/gpt-5-2_prompting_guide Crawl date: 2026-03-22
GPT-5.2 Prompting Guide
1. Introduction
GPT-5.2 is OpenAI’s newest flagship model designed for enterprise and agentic workloads. It delivers higher accuracy, stronger instruction following, and more disciplined execution across complex workflows. Building on GPT-5.1, this model improves token efficiency on medium-to-complex tasks, produces cleaner formatting with less unnecessary verbosity, and shows clear gains in structured reasoning, tool grounding, and multimodal understanding.
The model is especially well-suited for production agents prioritizing reliability, evaluability, and consistent behavior. It performs strongly across coding, document analysis, finance, and multi-tool agentic scenarios, often matching or exceeding leading models on task completion. While it works well out of the box for many use cases, explicit prompting remains crucial for maximizing performance in real production systems.
2. Key Behavioral Differences
Compared with GPT-5 and GPT-5.1, GPT-5.2 delivers:
- More deliberate scaffolding: Builds clearer plans and intermediate structure by default; benefits from explicit scope and verbosity constraints.
- Generally lower verbosity: More concise and task-focused, though still prompt-sensitive.
- Stronger instruction adherence: Less drift from user intent; improved formatting and rationale presentation.
- Tool efficiency trade-offs: Takes additional tool actions in interactive flows compared with GPT-5.1; can be optimized via prompting.
- Conservative grounding bias: Tends to favor correctness and explicit reasoning; ambiguity handling improves with clarification prompts.
3. Prompting Patterns
3.1 Controlling Verbosity and Output Shape
Provide clear and concrete length constraints, especially in enterprise and coding agents. The guide recommends:
- Default: 3–6 sentences or ≤5 bullets for typical answers
- For simple yes/no questions: ≤2 sentences
- For complex multi-step tasks: 1 overview paragraph followed by ≤5 tagged bullets (What changed, Where, Risks, Next steps, Open questions)
Key principles:
- Use clear, structured responses balancing informativeness with conciseness
- Break down information into digestible chunks using formatting
- Avoid long narrative paragraphs; prefer compact bullets and short sections
- Do not rephrase the user’s request unless it changes semantics
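The length limits above can be checked mechanically, for example inside an eval harness. What follows is a minimal sketch and not part of the guide: the function name `check_output_shape`, the `kind` labels, and the naive sentence-splitting regex are all assumptions made for illustration, and it treats the sentence and bullet limits as two independent caps.

```python
import re

# Limits taken from the list above; the keys are hypothetical labels.
LIMITS = {
    "default": {"max_sentences": 6, "max_bullets": 5},
    "yes_no": {"max_sentences": 2, "max_bullets": 0},
}


def check_output_shape(text: str, kind: str = "default") -> list[str]:
    """Return a list of violations; an empty list means the answer fits."""
    limits = LIMITS[kind]
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    bullets = [ln for ln in lines if ln.startswith(("- ", "* "))]
    prose = " ".join(ln for ln in lines if ln not in bullets)
    # Naive split: a sentence ends at ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", prose) if s]

    problems = []
    if len(sentences) > limits["max_sentences"]:
        problems.append(f"{len(sentences)} sentences > {limits['max_sentences']}")
    if len(bullets) > limits["max_bullets"]:
        problems.append(f"{len(bullets)} bullets > {limits['max_bullets']}")
    return problems
```

A check like this is only a coarse signal; abbreviations and decimals will confuse the sentence splitter, so it suits trend monitoring better than hard gating.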
3.2 Preventing Scope Drift
GPT-5.2 is stronger at structured code but may produce more code than a minimal UX spec calls for. Explicit constraints prevent unnecessary expansion:
- Explore and understand existing design systems deeply
- Implement EXACTLY and ONLY what the user requests
- No extra features, added components, or UX embellishments
- Style aligned to the design system
- Do NOT invent colors, shadows, tokens, animations, or new UI elements unless requested or necessary
- When instructions are ambiguous, choose the simplest valid interpretation
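One way to keep constraints like these maintainable is to store them as named rule lists and render them into a single instruction string. The sketch below is illustrative only: `build_instructions` and the tag name `scope_constraints` are hypothetical, and wrapping rules in tags is an assumed convention rather than something this summary prescribes.

```python
def build_instructions(blocks: dict[str, list[str]]) -> str:
    """Render each named block as a tagged section of bullet rules."""
    sections = []
    for tag, rules in blocks.items():
        body = "\n".join(f"- {rule}" for rule in rules)
        sections.append(f"<{tag}>\n{body}\n</{tag}>")
    return "\n\n".join(sections)


# Hypothetical tag name; the rules are copied from the list above.
instructions = build_instructions({
    "scope_constraints": [
        "Implement EXACTLY and ONLY what the user requests.",
        "No extra features, added components, or UX embellishments.",
        "When instructions are ambiguous, choose the simplest valid interpretation.",
    ],
})
```

The resulting string could then be supplied as the system or developer instructions of a request. Keeping each block separate makes it easier to change one constraint at a time when tuning.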
3.3 Long-Context and Recall
For inputs longer than ~10k tokens (multi-chapter docs, long threads, multiple PDFs):
- First produce a short internal outline of key sections relevant to the user’s request
- Re-state the user’s constraints explicitly (jurisdiction, date range, product, team) before answering
- Anchor claims to sections rather than speaking generically
- Quote or paraphrase fine details (dates, thresholds, clauses) when answers depend on them
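These long-context rules only need to be attached when the input is actually long. The sketch below shows one possible gate. It assumes a crude estimate of about four characters per token, which is a heuristic and not a real tokenizer, and the names `estimate_tokens` and `with_long_context_rules` are hypothetical.

```python
# Rules paraphrased from the list above.
LONG_CONTEXT_RULES = "\n".join([
    "- First produce a short internal outline of the sections relevant to the request.",
    "- Re-state the user's constraints explicitly before answering.",
    "- Anchor claims to sections rather than speaking generically.",
    "- Quote or paraphrase fine details when the answer depends on them.",
])


def estimate_tokens(text: str) -> int:
    """Rough estimate (assumption: ~4 characters per token)."""
    return len(text) // 4


def with_long_context_rules(base: str, document: str, threshold: int = 10_000) -> str:
    """Append the long-context rules only when the document exceeds the threshold."""
    if estimate_tokens(document) <= threshold:
        return base
    return f"{base}\n\n{LONG_CONTEXT_RULES}"
```

For production use, a real tokenizer count would be more reliable than the character heuristic.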
3.4 Handling Ambiguity and Hallucination Risk
When questions are ambiguous or underspecified:
- Explicitly call this out
- Ask up to three precise clarifying questions, OR
- Present 2–3 plausible interpretations with clearly labeled assumptions
For external facts that may have changed:
- Answer in general terms stating that details may have changed
- Never fabricate exact figures, line numbers, or external references when uncertain
- Use language like “Based on the provided context…” instead of absolute claims
High-risk self-check (for legal, financial, compliance, safety contexts):
- Re-scan answers for unstated assumptions
- Identify specific numbers or claims not grounded in context
- Soften overly strong language (“always,” “guaranteed,” etc.)
- Explicitly state assumptions
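The self-check is written as an instruction for the model, but part of it can also be approximated in application code as a second line of defense. The sketch below flags sentences containing overly strong terms. Only "always" and "guaranteed" come from the text above; the other terms and the function name `flag_strong_language` are assumptions for illustration.

```python
import re

# "always" and "guaranteed" are from the guide; the rest are assumed additions.
STRONG_TERMS = ("always", "never", "guaranteed", "certainly", "definitely")


def flag_strong_language(draft: str) -> list[tuple[str, str]]:
    """Return (term, sentence) pairs for sentences that use absolute wording."""
    hits = []
    for sentence in re.split(r"(?<=[.!?])\s+", draft.strip()):
        for term in STRONG_TERMS:
            if re.search(rf"\b{term}\b", sentence, flags=re.IGNORECASE):
                hits.append((term, sentence))
    return hits
```

Flagged sentences could be sent back to the model with a request to soften the wording or state the underlying assumption.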
4. Compaction (Extending Effective Context)
For long-running, tool-heavy workflows exceeding the standard context window, GPT-5.2 with Reasoning supports response compaction via the /responses/compact endpoint.
When to use compaction:
- Multi-step agent flows with many tool calls
- Long conversations where earlier turns must be retained
- Iterative reasoning beyond the maximum context window
Key properties:
- Produces opaque, encrypted items
- Designed for continuation, not inspection
- Compatible with GPT-5.2 and Responses API
- Safe to run repeatedly in long sessions
Best practices:
- Monitor context usage and plan ahead to avoid hitting limits
- Compact after major milestones, not every turn
- Keep prompts functionally identical when resuming to avoid behavior drift
- Treat compacted items as opaque; don’t parse internals
Example code:
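from openai import OpenAI
import json

client = OpenAI()

# First turn: an ordinary Responses API call.
response = client.responses.create(
    model="gpt-5.2",
    input=[
        {
            "role": "user",
            "content": "write a very long poem about a dog.",
        },
    ],
)

# Serialize the output items so they can be passed back as input.
output_json = [msg.model_dump() for msg in response.output]

# Compact the conversation so far: the original user turn plus the first output item.
compacted_response = client.responses.compact(
    model="gpt-5.2",
    input=[
        {
            "role": "user",
            "content": "write a very long poem about a dog.",
        },
        output_json[0],
    ],
)

# The compacted items are opaque; print them only to inspect the overall shape.
print(json.dumps(compacted_response.model_dump(), indent=2))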
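The advice to monitor context usage and compact after milestones rather than every turn can be expressed as a small decision rule. This is a sketch under stated assumptions: the class name `ContextBudget`, the 70% soft threshold, and the 90% hard ceiling are hypothetical values chosen for illustration, and the window size must be set for the model actually in use.

```python
from dataclasses import dataclass


@dataclass
class ContextBudget:
    window_tokens: int        # assumed context window size for your model
    soft_limit: float = 0.7   # hypothetical: compact at milestones above this
    hard_limit: float = 0.9   # hypothetical: compact regardless above this

    def should_compact(self, used_tokens: int, at_milestone: bool) -> bool:
        """Compact at milestones once usage is high, or always near the ceiling."""
        usage = used_tokens / self.window_tokens
        if usage >= self.hard_limit:
            return True
        return at_milestone and usage >= self.soft_limit
```

The soft threshold keeps compaction aligned with natural phase boundaries, while the hard ceiling acts as a safety net for phases that run longer than planned.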
5. Agentic Steerability and User Updates
GPT-5.2 is strong on agentic scaffolding and multi-step execution when prompted well. Key tweaks include:
- Clamp verbosity of updates (shorter, more focused)
- Make scope discipline explicit
Updated specification:
- Send brief updates (1–2 sentences) only when starting a new major phase or discovering plan-changing information
- Avoid narrating routine tool calls (“reading file…”, “running tests…”)
- Each update must include at least one concrete outcome (“Found X”, “Confirmed Y”, “Updated Z”)
- Do not expand the task beyond what the user asked; call out new work as optional
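To measure how well an agent follows this update specification, each update can be scored with a simple check. The sketch below is an illustration, not the guide's method: the phrase lists are seeded only from the examples quoted above, and `check_update` is a hypothetical name.

```python
import re

ROUTINE_PHRASES = ("reading file", "running tests")  # examples quoted above
OUTCOME_VERBS = ("found", "confirmed", "updated")    # verbs from the examples above


def check_update(update: str) -> list[str]:
    """Return the rules an update breaks; an empty list means it complies."""
    problems = []
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", update.strip()) if s]
    lowered = update.lower()
    if len(sentences) > 2:
        problems.append("longer than 2 sentences")
    if any(phrase in lowered for phrase in ROUTINE_PHRASES):
        problems.append("narrates a routine tool call")
    if not any(re.search(rf"\b{verb}\b", lowered) for verb in OUTCOME_VERBS):
        problems.append("no concrete outcome")
    return problems
```

Keyword matching will miss outcomes phrased with other verbs, so a real harness would likely extend the lists or use a model-based grader.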
6. Tool-Calling and Parallelism
GPT-5.2 improves on GPT-5.1 in tool reliability and scaffolding. Best practices include:
- Describe tools crisply: 1–2 sentences for what they do and when to use them
- Encourage parallelism for scanning codebases, vector stores, or multi-entity operations
- Require verification steps for high-impact operations (orders, billing, infra changes)
Tool usage rules:
- Prefer tools over internal knowledge whenever needing fresh or user-specific data
- Parallelize independent reads when possible to reduce latency
- After any write/update tool call, briefly restate: what changed, where (ID or path), and any follow-up validation performed
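On the application side, when the model requests several independent reads in one turn, they can be executed concurrently, and write results can be summarized in a fixed shape. The following sketch uses the standard library's `ThreadPoolExecutor`; the helper names and the stand-in read functions are hypothetical, and it assumes the reads have no side effects or ordering dependencies.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_reads_in_parallel(reads: dict[str, Callable[[], str]]) -> dict[str, str]:
    """Run independent, side-effect-free read calls concurrently."""
    with ThreadPoolExecutor(max_workers=max(1, len(reads))) as pool:
        futures = {name: pool.submit(fn) for name, fn in reads.items()}
        return {name: future.result() for name, future in futures.items()}


def describe_write(what: str, where: str, validation: str) -> str:
    """Restate a write in the three-part shape described above."""
    return f"Changed: {what}. Where: {where}. Validation: {validation}."


# Hypothetical read tools standing in for real lookups.
results = run_reads_in_parallel({
    "readme": lambda: "contents of README",
    "config": lambda: "contents of config",
})
```

Writes are deliberately left out of the parallel helper, since high-impact operations benefit from running one at a time with a verification step in between.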
7. Structured Extraction, PDF, and Office Workflows
GPT-5.2 shows strong improvements in this area. Key recommendations:
- Always provide a schema or JSON shape for output; use structured outputs for strict schema adherence
- Distinguish between required and optional fields
- Ask for “extraction completeness” and handle missing fields explicitly
Example extraction specification:
Always follow the schema exactly (no extra fields). If a field is not present in the source, set it to null rather than guessing. Before returning, quickly re-scan the source for any missed fields and correct omissions.
For multi-table/multi-file extraction:
- Serialize per-document results separately
- Include a stable ID (filename, contract title, page range)
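The schema rules above (no extra fields, null for anything absent, required versus optional, a stable ID per document) can be enforced after the model returns. The sketch below is a minimal illustration: the contract fields in `SCHEMA` and the function name `normalize_extraction` are hypothetical examples, not fields the guide defines.

```python
from typing import Any

# Hypothetical schema: field name -> whether the field is required.
SCHEMA = {
    "party_a": True,
    "party_b": True,
    "effective_date": True,
    "renewal_term": False,
}


def normalize_extraction(doc_id: str, raw: dict[str, Any]) -> dict[str, Any]:
    """Keep only schema fields, fill absent ones with None, and report gaps."""
    fields = {name: raw.get(name) for name in SCHEMA}
    missing_required = [
        name for name, required in SCHEMA.items()
        if required and fields[name] is None
    ]
    return {"doc_id": doc_id, "fields": fields, "missing_required": missing_required}
```

Returning one record per document, keyed by `doc_id`, keeps multi-file results separable, and `missing_required` gives a direct signal for when to trigger a re-scan.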
8. Prompt Migration Guide to GPT-5.2
Migration mapping for reasoning_effort:
| Current Model | Target Model | Target reasoning_effort | Notes |
|---|---|---|---|
| GPT-4o | GPT-5.2 | none | Treat 4o/4.1 migrations as “fast/low-deliberation” by default |
| GPT-4.1 | GPT-5.2 | none | Same mapping as GPT-4o |
| GPT-5 | GPT-5.2 | same value (except minimal → none) | Preserve none/low/medium/high |
| GPT-5.1 | GPT-5.2 | same value | Preserve existing effort selection |
Note: The default reasoning level for GPT-5 is medium; for GPT-5.1 and GPT-5.2 it is none.
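The mapping table and the note on defaults can be encoded as a small lookup, which is handy when migrating many call sites at once. This is a sketch: the function name `target_reasoning_effort` and the lowercase model identifiers are assumptions, and an unset effort is resolved to the source model's default as stated in the note.

```python
from typing import Optional


def target_reasoning_effort(current_model: str, current_effort: Optional[str] = None) -> str:
    """Return the reasoning_effort to pin on GPT-5.2 for a given source setup."""
    model = current_model.lower()
    if model in ("gpt-4o", "gpt-4.1"):
        return "none"  # treat as fast/low-deliberation by default
    if model == "gpt-5":
        effort = current_effort or "medium"  # GPT-5 default is medium
        return "none" if effort == "minimal" else effort
    if model == "gpt-5.1":
        return current_effort or "none"  # GPT-5.1 default is none
    raise ValueError(f"no migration mapping for {current_model!r}")
```

Note the GPT-5 case with no explicit effort: because the defaults differ, preserving behavior means pinning `medium` on GPT-5.2 rather than leaving the parameter unset.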
Migration steps:
- Switch models without changing prompts yet. Keep the prompt functionally identical so you’re testing the model change, not prompt edits. Make one change at a time.
- Pin reasoning_effort. Explicitly set GPT-5.2 reasoning_effort to match the prior model’s latency/depth profile.
- Run Evals for a baseline. After model + effort are aligned, run your eval suite.
- If regressions occur, tune the prompt. Use targeted constraints (verbosity/format/schema, scope discipline) to restore parity or improve.
- Re-run Evals after each small change. Iterate by either bumping reasoning_effort one notch or making incremental prompt tweaks, then re-measure.
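The baseline-then-iterate loop in these steps depends on comparing eval results between runs. The sketch below shows one simple comparison, assuming each run is a mapping from a case ID to pass/fail; the function names and that data shape are hypothetical and not tied to any particular eval framework.

```python
def pass_rate(results: dict[str, bool]) -> float:
    """Fraction of eval cases that passed."""
    return sum(results.values()) / len(results)


def find_regressions(baseline: dict[str, bool], candidate: dict[str, bool]) -> list[str]:
    """Return case IDs that passed on the baseline but fail (or are absent) on the candidate."""
    return sorted(
        case for case, passed in baseline.items()
        if passed and not candidate.get(case, False)
    )
```

Looking at per-case regressions rather than only the aggregate pass rate helps, because an unchanged overall score can hide cases that flipped in opposite directions.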
9. Web Search and Research
GPT-5.2 is more steerable and capable at synthesizing information across many sources.
Best practices:
- Specify the research bar upfront: Tell the model how to perform search, whether to follow second-order leads, resolve contradictions, and include citations. State explicitly how far to go.
- Constrain ambiguity by instruction: Instruct the model to cover all plausible intents comprehensively; don’t ask clarifying questions. Require breadth and depth when uncertainty exists.
- Dictate output shape and tone: Set expectations for structure (Markdown, headers, tables), clarity (define acronyms, use concrete examples), and voice (conversational, persona-adaptive).
Web search rules:
- Act as an expert research assistant; default to comprehensive, well-structured answers
- Prefer web research over assumptions whenever facts may be uncertain or incomplete
- Research all parts of the query, resolve contradictions, and follow important second-order implications
- Do not ask clarifying questions; instead cover all plausible user intents
- Write clearly using Markdown; define acronyms, use concrete examples, maintain natural tone
10. Conclusion
GPT-5.2 represents a meaningful step forward for teams building production-grade agents prioritizing accuracy, reliability, and disciplined execution. It delivers stronger instruction following, cleaner output, and more consistent behavior across complex, tool-heavy workflows.
Most existing prompts migrate cleanly when reasoning effort, verbosity, and scope constraints are preserved during transition. Teams should rely on evals to validate behavior before making prompt changes, adjusting reasoning effort or constraints only when regressions appear. With explicit prompting and measured iteration, GPT-5.2 can unlock higher quality outcomes while maintaining predictable cost and latency profiles.
Appendix: Example Prompt for a Web Research Agent
The guide includes a comprehensive example prompt covering:
Core Mission
Answer questions fully with sufficient evidence for skeptical readers. Never invent facts. Default to detailed, useful answers unless explicitly asked for brevity. Go one step further by adding high-value adjacent material.
Persona
Be the world’s greatest research assistant. Engage warmly and honestly, avoiding ungrounded flattery. Adopt requested personas. Use natural, conversational tone unless the subject matter requires seriousness.
Factuality and Accuracy
Must browse the web and include citations for non-creative queries unless explicitly told not to. Must browse for time-sensitive topics, up-to-date or niche topics, travel planning, recommendations, generic topics, navigational queries, and any ambiguous terms.
Citations Required
Include citations after paragraphs containing web-derived claims. Don’t invent citations. Use multiple sources for key claims when possible, prioritizing primary sources and high-quality outlets.
How You Research
Conduct deep research for comprehensive answers. Start with multiple targeted searches using parallel searches when helpful. Deeply and thoroughly research until you have sufficient information. Begin broad enough to capture the main answer and most likely interpretations. Add targeted follow-up searches to fill gaps, resolve disagreements, or confirm important claims. Keep iterating until additional searching is unlikely to materially change the answer.
Writing Guidelines
- Be direct: Start answering immediately
- Be comprehensive: Answer every part of the user’s query with detailed, long responses
- Use simple language: full sentences, short words, concrete verbs, active voice
- Avoid jargon unless conversation indicates the user is an expert
- Use readable formatting with Markdown, plain-text labels, bullets, and tables for comparisons
- Do NOT add follow-up questions unless explicitly requested
Required Value-Add Behavior
- Provide concrete examples whenever helpful
- Don’t be overly brief by default
- Provide additional well-researched material that clearly helps the user’s goal
- Do a completeness pass: answer every subpart, include explanations and concrete details, include tradeoffs where relevant
Handling Ambiguity Without Asking Questions
- Never ask clarifying questions unless explicitly requested
- If the query is ambiguous, state your best-guess interpretation plainly, then comprehensively cover the most likely intent
- If multiple intents exist, cover each one thoroughly rather than asking questions
If You Cannot Fully Comply
- Don’t lead with blunt refusal if you can safely provide something helpful
- First deliver what you can, then clearly state limitations
- If something cannot be verified, explain plainly what you did verify, what remains unknown, and the best next step