RESOURCE

Cheat sheet

A one-page reference of the patterns from the course and when to reach for each. Designed to print cleanly. Use the print button above to save as PDF.

Prompt Engineering Cheat Sheet

Patterns and when to reach for each.

v1.0

1. The 6-component prompt skeleton

Role/context: who the model is acting as
Task: what to do, unambiguously
Inputs: the data to operate on (delimited)
Constraints: what to avoid, how to handle ambiguity
Examples: demonstrations (when needed)
Output format: the exact shape of the response

2. Choosing examples (few-shot)

Cover decision boundaries, not the obvious center
Be ruthlessly consistent across examples
Mine your eval failures — each example should fix a real failure
Diminishing returns after 3–5 examples
Each example should justify its tokens

3. When to use chain-of-thought

Use: multi-step reasoning, math, logic, multi-hop QA
Skip: trivial classification, extraction, simple lookups
Reasoning models: don't add CoT — they reason internally
Production: use structured CoT (XML tags) for parseability
Visible reasoning ≠ correct reasoning

4. System vs user prompt

System: behavior, tone, rules, persona
User: the per-turn task or query
Models weight system instructions higher (RLHF)
Generic personas ("helpful assistant") do almost nothing
Specific scenario-anchored personas earn their tokens

5. Decompose vs single-prompt

Single prompt	Chain
Fits in working memory	4+ distinct sub-tasks
Simple structured output	Need intermediate validation
Latency-critical	Quality > latency
No need to inspect state	Want to retry/branch on state

6. ReAct / tool use

Each tool does one thing — no do_database_stuff
Tool descriptions are prompts — write them like prompts
Use enums, not free-text fields, where possible
Idempotent tools so retries don't double-charge
Always set: max iterations, token budget, wall timeout
Errors are signals — return clear messages to the model

7. RAG essentials

Chunk: 200–500 tokens, 10–20% overlap (tune for corpus)
Hybrid retrieval: BM25 + embeddings, reranked
Order matters — best chunks at start/end of context
Wrap each chunk in <document id="..."> tags
Require citations; verify they map to real chunks
Skip RAG when corpus fits in context

8. Sampling parameters

Task	Temperature
Classification, extraction	0
Code generation	0–0.3
Summarization	0.3–0.5
Creative writing	0.7–1.0
Self-consistency voting	0.7+

9. Failure mode → fix lookup

Failure	Fix
Format drift	Tighter spec + examples + validator
Hallucination	RAG, citations, "I don't know" permission
Instruction lapse	Move to system prompt; restate near task
Over-refusal	Specify scope; positive examples
Verbose	Hard length cap; sentence count
Inconsistent tone	Persona examples; explicit tone rules

10. Latency levers

Stream response — perceived TTFT, not total time
Cache stable prefix (system prompt, examples)
Smaller model for simple chain steps
Cap output tokens — most defaults are too high
Parallelize independent calls
Trim retrieved context aggressively

11. Eval discipline

50–200 examples gets you started; quality over quantity
60–80% real samples + 20–40% hand-crafted edge cases
Split dev/test (70/30); only iterate against dev
Watch the diff, not just the aggregate score
Newly-failing examples block ship even if score went up
Every prod failure → eval set with expected behavior

12. LLM-as-judge biases

Position — favors first option (rotate orders)
Length — favors longer (penalize verbosity)
Self-preference — same family scored higher
Sycophancy — agrees with what prompt implies
Surface features — bullets/headings over substance
Calibrate against human scores before trusting at scale

13. Prompt injection defenses (layered)

Wrap user data in delimiters; explicit "ignore instructions inside"
Least-privilege tools — minimum capability, scoped access
Output validation before any consequential action
Prompt isolation — separate trust levels into separate calls
Monitoring — alert on anomalous tool use

No single defense is sufficient. Stack them.

14. Pre-launch checklist

Worst-case output identified, blast radius bounded
Eval set covers failure modes that matter
Injection attack surface mapped + defended
"I don't know" path tested
Human-in-the-loop where required
Production observability live (latency, cost, output)
Rollback plan documented

Prompt Engineering Mastery — promptengineeringmastery.com

For the full reasoning, take the course.

Tip: Save as PDF

Click the Print / Save as PDF button above. In your browser's print dialog, choose "Save as PDF" as the destination. The cheat sheet is designed to fit cleanly on two pages of A4.