Safety & Responsible Deployment
Guardrails, evaluation, prompt-injection defense, cost/latency control, and production monitoring.
Before you start
- 019 multiple-choice questions, one correct answer each.
- 02Suggested time 12 minutes. The timer is a guide, not a cutoff.
- 03Use keys 1–4 to answer, arrows to move.
- 04You get a full explanation for every question at the end.
Study guide
Every question in the Safety & Responsible Deployment test, with the correct answer and a full explanation. Guardrails, evaluation, prompt-injection defense, cost/latency control, and production monitoring. Use it to review before or after taking the timed quiz above — the answers are revealed here, so take the quiz first if you want an honest score.
Show all 9questions, answers & explanations
- SAFE-01 · Question 1 of 9
An untrusted document instructs the model to ignore its system prompt. What is this, and a key mitigation?
- AA rate-limit error; retry with backoff
- BA prompt-injection attack; isolate and clearly label untrusted content and constrain tool permissions Correct answer
- CA streaming bug; disable streaming
- DExpected behavior; no action needed
Why: Malicious instructions embedded in retrieved/user content are prompt injection. Mitigations include clearly delimiting and labeling untrusted input, instructing the model not to follow instructions found in data, and least-privilege tool access.
- SAFE-02 · Question 2 of 9
Before shipping an LLM feature, what is the most important practice for measuring quality?
- AManual spot-checks only
- BAn evaluation set with representative cases and clear success criteria, run automatically Correct answer
- CTrusting vibes from a few prompts
- DMaximizing `max_tokens`
Why: Systematic evals — a curated set of representative inputs scored against defined criteria — let you measure quality, catch regressions, and compare prompts/models objectively before and after deployment.
- SAFE-03 · Question 3 of 9
Which technique most directly reduces cost and latency for repeated, large, stable prompt prefixes?
- APrompt caching Correct answer
- BRaising temperature
- CAdding more few-shot examples
- DUsing a larger model
Why: Prompt caching reuses a previously processed, stable prefix (e.g. a large system prompt or document set) across requests, cutting both cost and time-to-first-token on the cached portion.
- SAFE-04 · Question 4 of 9
For high-volume, latency-tolerant background jobs, which approach typically lowers cost the most?
- AAlways use the largest model synchronously
- BUse batch processing and/or a smaller, faster model where quality allows Correct answer
- CIncrease `max_tokens` for every request
- DDisable streaming
Why: Non-interactive workloads can use batch APIs (discounted, asynchronous) and right-sized smaller models. Match the model and execution mode to the task's real quality and latency needs rather than defaulting to the biggest model.
- SAFE-05 · Question 5 of 9
What should you monitor in production beyond raw error rates?
- AOnly the number of requests
- BOutput quality, latency, token/cost usage, refusal rates, and user feedback Correct answer
- COnly the model version string
- DNothing; LLM apps are self-correcting
Why: Observability for LLM apps spans quality (via evals/feedback), latency, token and cost usage, refusal/safety signals, and tool success rates — not just HTTP errors. This is what lets you catch drift and regressions.
- SAFE-06 · Question 6 of 9
A model-driven action could be destructive (e.g. deleting data). What is the safest design?
- ALet the agent act fully autonomously to save time
- BRequire a human-in-the-loop confirmation or scoped, reversible permissions for high-impact actions Correct answer
- CRaise temperature for better judgment
- DRemove logging to reduce overhead
Why: High-impact or irreversible actions should require human confirmation or be constrained to least-privilege, reversible, well-logged operations. Autonomy is earned for low-risk steps, not granted blanket access to destructive ones.
- SAFE-07 · Question 7 of 9
You compare two prompts on your eval set. Prompt A scores higher overall but fails a small set of safety-critical cases that Prompt B passes. What is the sound decision?
- AAlways pick the higher aggregate score
- BTreat safety-critical cases as gating: do not ship a prompt that regresses them, regardless of aggregate score Correct answer
- CAverage the two prompts together
- DIgnore the eval and choose by intuition
Why: Not all eval cases carry equal weight. Safety-critical or must-not-fail cases should act as hard gates, so a higher overall average does not justify shipping a regression on them. Segment evals by severity rather than optimizing a single aggregate.
- SAFE-08 · Question 8 of 9
Why use an LLM-as-judge (model-graded) evaluation instead of exact string matching?
- AIt is always cheaper than string matching
- BIt can score open-ended outputs against criteria like correctness, tone, or completeness where many valid wordings exist Correct answer
- CIt removes the need for any test cases
- DIt guarantees deterministic scores
Why: Open-ended generations rarely match a fixed string, so a model graded against a clear rubric can assess qualities like factual correctness, tone, and completeness. It should be validated against human judgments, since it adds cost and is not perfectly deterministic.
- SAFE-09 · Question 9 of 9
Users can submit free-text that becomes part of the prompt to a tool-enabled agent. Which combination best limits prompt-injection blast radius?
- ATrust the model to ignore malicious instructions on its own
- BLeast-privilege tool scopes, human confirmation for high-impact actions, and clearly separating untrusted input from instructions Correct answer
- CRaising `max_tokens` so the model can reason more
- DDisabling evals to ship faster
Why: Defense in depth limits damage when injection succeeds: scope tools to least privilege, gate high-impact or irreversible actions behind human approval, and structurally separate and label untrusted input. No single prompt instruction is a reliable sole defense.