Claude API Fundamentals
Messages format, core request parameters, streaming, and how to reason about model selection.
Before you start
- 019 multiple-choice questions, one correct answer each.
- 02Suggested time 12 minutes. The timer is a guide, not a cutoff.
- 03Use keys 1–4 to answer, arrows to move.
- 04You get a full explanation for every question at the end.
Study guide
Every question in the Claude API Fundamentals test, with the correct answer and a full explanation. Messages format, core request parameters, streaming, and how to reason about model selection. Use it to review before or after taking the timed quiz above — the answers are revealed here, so take the quiz first if you want an honest score.
Show all 9questions, answers & explanations
- API-01 · Question 1 of 9
In the Messages API, what determines the maximum length of the model's response?
- AThe `temperature` parameter
- BThe `max_tokens` parameter Correct answer
- CThe size of the system prompt
- DThe number of messages in the conversation
Why: `max_tokens` caps how many tokens the model may generate in its response. It is a hard upper bound; the model may stop earlier (e.g. an `end_turn` stop reason). It does not control input length.
- API-02 · Question 2 of 9
Which field carries instructions that set the model's role and behavior for the whole conversation, separate from the turn-by-turn dialogue?
- AThe first `user` message
- BThe `system` parameter Correct answer
- CA `developer` role message inside `messages`
- DThe `metadata` field
Why: The top-level `system` parameter holds the system prompt: persistent role, tone, and task framing. The `messages` array carries the alternating user/assistant turns.
- API-03 · Question 3 of 9
What is the required structure of the `messages` array?
- AAny order of roles is accepted
- BIt must start with a `system` message
- CRoles must alternate, starting with `user` Correct answer
- DAll messages must use the `assistant` role
Why: Conversational turns alternate between `user` and `assistant`, and the array must begin with a `user` turn. The system prompt is passed separately, not as a message role.
- API-04 · Question 4 of 9
You set `temperature: 0`. What behavior should you expect?
- AThe model refuses to answer
- BMaximum creativity and variation
- CMore deterministic, focused output Correct answer
- DThe response is truncated
Why: Lower temperature reduces randomness, producing more deterministic and focused responses. Higher values increase diversity. Use low temperature for extraction, classification, and other tasks that reward consistency.
- API-05 · Question 5 of 9
A request returns `stop_reason: "max_tokens"`. What does this indicate?
- AThe model finished its turn naturally
- BThe output hit the `max_tokens` limit and was cut off Correct answer
- CA stop sequence was matched
- DThe input exceeded the context window
Why: `max_tokens` as a stop reason means generation was truncated at the token cap, not that the model completed naturally (`end_turn`). Increase `max_tokens` or design for continuation if you see this on complete-looking tasks.
- API-06 · Question 6 of 9
When you stream a response, how is the content delivered?
- AAs one final JSON object only
- BAs a sequence of server-sent events with incremental deltas Correct answer
- CAs a WebSocket binary frame
- DStreaming returns the same payload as non-streaming, just slower
Why: Streaming uses server-sent events: the response arrives as incremental delta events you accumulate client-side, which lowers time-to-first-token for interactive UIs.
- API-07 · Question 7 of 9
Which factor counts against the model's context window?
- AOnly the generated output tokens
- BOnly the system prompt
- CAll input tokens plus the tokens the model generates Correct answer
- DOnly the number of messages, not their length
Why: The context window bounds the total of input tokens (system prompt, full message history, tool definitions, documents) plus the output tokens generated. As a conversation grows, input usage rises and leaves less room for output.
- API-08 · Question 8 of 9
You need both diverse, creative output and a hard limit on response length. Which two parameters address those separately?
- A`max_tokens` for both
- B`temperature` for diversity, `max_tokens` for length Correct answer
- C`system` for diversity, `temperature` for length
- D`stop_sequences` for both
Why: These are independent levers: `temperature` (and top_p) controls randomness/diversity of sampling, while `max_tokens` caps how long the response may run. Raising one does not affect the other.
- API-09 · Question 9 of 9
What is the role of a `stop_sequences` value in a request?
- AIt sets the minimum output length
- BIt lists strings that, when generated, end the response with a `stop_sequence` stop reason Correct answer
- CIt filters unsafe content from the output
- DIt defines which tools the model may call
Why: Stop sequences are custom strings that halt generation as soon as the model produces one. The response ends with a `stop_sequence` stop reason, which is useful for delimiting structured output or cutting off at a known boundary.