来源：https://developers.openai.com/cookbook/examples/gpt-5/gpt-5-2_prompting_guide 爬取日期：2026-03-22

GPT-5.2 Prompting Guide

1. Introduction

GPT-5.2 is OpenAI’s newest flagship model designed for enterprise and agentic workloads. It delivers higher accuracy, stronger instruction following, and more disciplined execution across complex workflows. Building on GPT-5.1, this model improves token efficiency on medium-to-complex tasks, produces cleaner formatting with less unnecessary verbosity, and shows clear gains in structured reasoning, tool grounding, and multimodal understanding.

The model is especially well-suited for production agents prioritizing reliability, evaluability, and consistent behavior. It performs strongly across coding, document analysis, finance, and multi-tool agentic scenarios, often matching or exceeding leading models on task completion. While it works well out of the box for many use cases, explicit prompting remains crucial for maximizing performance in real production systems.

2. Key Behavioral Differences

Compared with GPT-5 and GPT-5.1, GPT-5.2 delivers:

More deliberate scaffolding: Builds clearer plans and intermediate structure by default; benefits from explicit scope and verbosity constraints.
Generally lower verbosity: More concise and task-focused, though still prompt-sensitive.
Stronger instruction adherence: Less drift from user intent; improved formatting and rationale presentation.
Tool efficiency trade-offs: Takes additional tool actions in interactive flows compared with GPT-5.1; can be optimized via prompting.
Conservative grounding bias: Tends to favor correctness and explicit reasoning; ambiguity handling improves with clarification prompts.

3. Prompting Patterns

3.1 Controlling Verbosity and Output Shape

Provide clear and concrete length constraints, especially in enterprise and coding agents. The guide recommends:

Default: 3–6 sentences or ≤5 bullets for typical answers
For simple yes/no questions: ≤2 sentences
For complex multi-step tasks: 1 overview paragraph followed by ≤5 tagged bullets (What changed, Where, Risks, Next steps, Open questions)

Key principles:

Use clear, structured responses balancing informativeness with conciseness
Break down information into digestible chunks using formatting
Avoid long narrative paragraphs; prefer compact bullets and short sections
Do not rephrase the user’s request unless it changes semantics

3.2 Preventing Scope Drift

GPT-5.2 is stronger at structured code but may produce more code than minimal UX specs. Explicit constraints prevent unnecessary expansion:

Explore and understand existing design systems deeply
Implement EXACTLY and ONLY what the user requests
No extra features, added components, or UX embellishments
Style aligned to the design system
Do NOT invent colors, shadows, tokens, animations, or new UI elements unless requested or necessary
When instructions are ambiguous, choose the simplest valid interpretation

3.3 Long-Context and Recall

For inputs longer than ~10k tokens (multi-chapter docs, long threads, multiple PDFs):

First produce a short internal outline of key sections relevant to the user’s request
Re-state the user’s constraints explicitly (jurisdiction, date range, product, team) before answering
Anchor claims to sections rather than speaking generically
Quote or paraphrase fine details (dates, thresholds, clauses) when answers depend on them

3.4 Handling Ambiguity and Hallucination Risk

When questions are ambiguous or underspecified:

Explicitly call this out
Ask up to 1–3 precise clarifying questions, OR
Present 2–3 plausible interpretations with clearly labeled assumptions

For external facts that may have changed:

Answer in general terms stating that details may have changed
Never fabricate exact figures, line numbers, or external references when uncertain
Use language like “Based on the provided context…” instead of absolute claims

High-risk self-check (for legal, financial, compliance, safety contexts):

Re-scan answers for unstated assumptions
Identify specific numbers or claims not grounded in context
Soften overly strong language (“always,” “guaranteed,” etc.)
Explicitly state assumptions

4. Compaction (Extending Effective Context)

For long-running, tool-heavy workflows exceeding the standard context window, GPT-5.2 with Reasoning supports response compaction via the /responses/compact endpoint.

When to use compaction:

Multi-step agent flows with many tool calls
Long conversations where earlier turns must be retained
Iterative reasoning beyond the maximum context window

Key properties:

Produces opaque, encrypted items
Designed for continuation, not inspection
Compatible with GPT-5.2 and Responses API
Safe to run repeatedly in long sessions

Best practices:

Monitor context usage and plan ahead to avoid hitting limits
Compact after major milestones, not every turn
Keep prompts functionally identical when resuming to avoid behavior drift
Treat compacted items as opaque; don’t parse internals

Example code:

from openai import OpenAI
import json

client = OpenAI()

response = client.responses.create(
   model="gpt-5.2",
   input=[
       {
           "role": "user",
           "content": "write a very long poem about a dog.",
       },
   ]
)

output_json = [msg.model_dump() for msg in response.output]

compacted_response = client.responses.compact(
   model="gpt-5.2",
   input=[
       {
           "role": "user",
           "content": "write a very long poem about a dog.",
       },
       output_json[0]
   ]
)

print(json.dumps(compacted_response.model_dump(), indent=2))

5. Agentic Steerability and User Updates

GPT-5.2 is strong on agentic scaffolding and multi-step execution when prompted well. Key tweaks include:

Clamp verbosity of updates (shorter, more focused)
Make scope discipline explicit

Updated specification:

Send brief updates (1–2 sentences) only when starting a new major phase or discovering plan-changing information
Avoid narrating routine tool calls (“reading file…”, “running tests…”)
Each update must include at least one concrete outcome (“Found X”, “Confirmed Y”, “Updated Z”)
Do not expand the task beyond what the user asked; call out new work as optional

6. Tool-Calling and Parallelism

GPT-5.2 improves on 5.1 in tool reliability and scaffolding. Best practices include:

Describe tools crisply: 1–2 sentences for what they do and when to use them
Encourage parallelism for scanning codebases, vector stores, or multi-entity operations
Require verification steps for high-impact operations (orders, billing, infra changes)

Tool usage rules:

Prefer tools over internal knowledge whenever needing fresh or user-specific data
Parallelize independent reads when possible to reduce latency
After any write/update tool call, briefly restate: what changed, where (ID or path), and any follow-up validation performed

7. Structured Extraction, PDF, and Office Workflows

GPT-5.2 shows strong improvements in this area. Key recommendations:

Always provide a schema or JSON shape for output; use structured outputs for strict schema adherence
Distinguish between required and optional fields
Ask for “extraction completeness” and handle missing fields explicitly

Example extraction specification:

Always follow the schema exactly (no extra fields). If a field is not present in the source, set it to null rather than guessing. Before returning, quickly re-scan the source for any missed fields and correct omissions.

For multi-table/multi-file extraction:

Serialize per-document results separately
Include a stable ID (filename, contract title, page range)

8. Prompt Migration Guide to GPT-5.2

Migration mapping for reasoning_effort:

Current Model	Target Model	Target reasoning_effort	Notes
GPT-4o	GPT-5.2	none	Treat 4o/4.1 migrations as “fast/low-deliberation” by default
GPT-4.1	GPT-5.2	none	Same mapping as GPT-4o
GPT-5	GPT-5.2	same value (except minimal → none)	Preserve none/low/medium/high
GPT-5.1	GPT-5.2	same value	Preserve existing effort selection

Note: Default reasoning level for GPT-5 is medium; for GPT-5.1 and GPT-5.2 is none.

Migration steps:

Switch models without changing prompts yet. Keep the prompt functionally identical so you’re testing the model change—not prompt edits. Make one change at a time.
Pin reasoning_effort. Explicitly set GPT-5.2 reasoning_effort to match the prior model’s latency/depth profile.
Run Evals for a baseline. After model + effort are aligned, run your eval suite.
If regressions occur, tune the prompt. Use targeted constraints (verbosity/format/schema, scope discipline) to restore parity or improve.
Re-run Evals after each small change. Iterate by either bumping reasoning_effort one notch or making incremental prompt tweaks—then re-measure.

9. Web Search and Research

GPT-5.2 is more steerable and capable at synthesizing information across many sources.

Best practices:

Specify the research bar upfront: Tell the model how to perform search, whether to follow second-order leads, resolve contradictions, and include citations. State explicitly how far to go.
Constrain ambiguity by instruction: Instruct the model to cover all plausible intents comprehensively; don’t ask clarifying questions. Require breadth and depth when uncertainty exists.
Dictate output shape and tone: Set expectations for structure (Markdown, headers, tables), clarity (define acronyms, use concrete examples), and voice (conversational, persona-adaptive).

Web search rules:

Act as an expert research assistant; default to comprehensive, well-structured answers
Prefer web research over assumptions whenever facts may be uncertain or incomplete
Research all parts of the query, resolve contradictions, and follow important second-order implications
Do not ask clarifying questions; instead cover all plausible user intents
Write clearly using Markdown; define acronyms, use concrete examples, maintain natural tone

10. Conclusion

GPT-5.2 represents a meaningful step forward for teams building production-grade agents prioritizing accuracy, reliability, and disciplined execution. It delivers stronger instruction following, cleaner output, and more consistent behavior across complex, tool-heavy workflows.

Most existing prompts migrate cleanly when reasoning effort, verbosity, and scope constraints are preserved during transition. Teams should rely on evals to validate behavior before making prompt changes, adjusting reasoning effort or constraints only when regressions appear. With explicit prompting and measured iteration, GPT-5.2 can unlock higher quality outcomes while maintaining predictable cost and latency profiles.

Appendix: Example Prompt for a Web Research Agent

The guide includes a comprehensive example prompt covering:

Core Mission

Answer questions fully with sufficient evidence for skeptical readers. Never invent facts. Default to detailed, useful answers unless explicitly asked for brevity. Go one step further by adding high-value adjacent material.

Persona

Be the world’s greatest research assistant. Engage warmly and honestly, avoiding ungrounded flattery. Adopt requested personas. Use natural, conversational tone unless the subject matter requires seriousness.

Factuality and Accuracy

Must browse the web and include citations for non-creative queries unless explicitly told not to. Must browse for time-sensitive topics, up-to-date or niche topics, travel planning, recommendations, generic topics, navigational queries, and any ambiguous terms.

Citations Required

Include citations after paragraphs containing web-derived claims. Don’t invent citations. Use multiple sources for key claims when possible, prioritizing primary sources and high-quality outlets.

How You Research

Conduct deep research for comprehensive answers. Start with multiple targeted searches using parallel searches when helpful. Deeply and thoroughly research until you have sufficient information. Begin broad enough to capture the main answer and most likely interpretations. Add targeted follow-up searches to fill gaps, resolve disagreements, or confirm important claims. Keep iterating until additional searching is unlikely to materially change the answer.

Writing Guidelines

Be direct: Start answering immediately
Be comprehensive: Answer every part of the user’s query with detailed, long responses
Use simple language: full sentences, short words, concrete verbs, active voice
Avoid jargon unless conversation indicates the user is an expert
Use readable formatting with Markdown, plain-text labels, bullets, and tables for comparisons
Do NOT add follow-up questions unless explicitly requested

Required Value-Add Behavior

Provide concrete examples whenever helpful
Don’t be overly brief by default
Provide additional well-researched material that clearly helps the user’s goal
Do a completeness pass: answer every subpart, include explanations and concrete details, include tradeoffs where relevant

Handling Ambiguity Without Asking Questions

Never ask clarifying questions unless explicitly requested
If the query is ambiguous, state your best-guess interpretation plainly, then comprehensively cover the most likely intent
If multiple intents exist, cover each one thoroughly rather than asking questions

If You Cannot Fully Comply

Don’t lead with blunt refusal if you can safely provide something helpful
First deliver what you can, then clearly state limitations
If something cannot be verified, explain plainly what you did verify, what remains unknown, and the best next step