Skip to content
TestsAPI & Models
Foundational9 questions · ~12 min

Claude API Fundamentals

Messages format, core request parameters, streaming, and how to reason about model selection.

Before you start

  • 019 multiple-choice questions, one correct answer each.
  • 02Suggested time 12 minutes. The timer is a guide, not a cutoff.
  • 03Use keys 1–4 to answer, arrows to move.
  • 04You get a full explanation for every question at the end.

Study guide

Every question in the Claude API Fundamentals test, with the correct answer and a full explanation. Messages format, core request parameters, streaming, and how to reason about model selection. Use it to review before or after taking the timed quiz above — the answers are revealed here, so take the quiz first if you want an honest score.

Show all 9questions, answers & explanations
  1. API-01 · Question 1 of 9

    In the Messages API, what determines the maximum length of the model's response?

    • AThe `temperature` parameter
    • BThe `max_tokens` parameter Correct answer
    • CThe size of the system prompt
    • DThe number of messages in the conversation

    Why: `max_tokens` caps how many tokens the model may generate in its response. It is a hard upper bound; the model may stop earlier (e.g. an `end_turn` stop reason). It does not control input length.

  2. API-02 · Question 2 of 9

    Which field carries instructions that set the model's role and behavior for the whole conversation, separate from the turn-by-turn dialogue?

    • AThe first `user` message
    • BThe `system` parameter Correct answer
    • CA `developer` role message inside `messages`
    • DThe `metadata` field

    Why: The top-level `system` parameter holds the system prompt: persistent role, tone, and task framing. The `messages` array carries the alternating user/assistant turns.

  3. API-03 · Question 3 of 9

    What is the required structure of the `messages` array?

    • AAny order of roles is accepted
    • BIt must start with a `system` message
    • CRoles must alternate, starting with `user` Correct answer
    • DAll messages must use the `assistant` role

    Why: Conversational turns alternate between `user` and `assistant`, and the array must begin with a `user` turn. The system prompt is passed separately, not as a message role.

  4. API-04 · Question 4 of 9

    You set `temperature: 0`. What behavior should you expect?

    • AThe model refuses to answer
    • BMaximum creativity and variation
    • CMore deterministic, focused output Correct answer
    • DThe response is truncated

    Why: Lower temperature reduces randomness, producing more deterministic and focused responses. Higher values increase diversity. Use low temperature for extraction, classification, and other tasks that reward consistency.

  5. API-05 · Question 5 of 9

    A request returns `stop_reason: "max_tokens"`. What does this indicate?

    • AThe model finished its turn naturally
    • BThe output hit the `max_tokens` limit and was cut off Correct answer
    • CA stop sequence was matched
    • DThe input exceeded the context window

    Why: `max_tokens` as a stop reason means generation was truncated at the token cap, not that the model completed naturally (`end_turn`). Increase `max_tokens` or design for continuation if you see this on complete-looking tasks.

  6. API-06 · Question 6 of 9

    When you stream a response, how is the content delivered?

    • AAs one final JSON object only
    • BAs a sequence of server-sent events with incremental deltas Correct answer
    • CAs a WebSocket binary frame
    • DStreaming returns the same payload as non-streaming, just slower

    Why: Streaming uses server-sent events: the response arrives as incremental delta events you accumulate client-side, which lowers time-to-first-token for interactive UIs.

  7. API-07 · Question 7 of 9

    Which factor counts against the model's context window?

    • AOnly the generated output tokens
    • BOnly the system prompt
    • CAll input tokens plus the tokens the model generates Correct answer
    • DOnly the number of messages, not their length

    Why: The context window bounds the total of input tokens (system prompt, full message history, tool definitions, documents) plus the output tokens generated. As a conversation grows, input usage rises and leaves less room for output.

  8. API-08 · Question 8 of 9

    You need both diverse, creative output and a hard limit on response length. Which two parameters address those separately?

    • A`max_tokens` for both
    • B`temperature` for diversity, `max_tokens` for length Correct answer
    • C`system` for diversity, `temperature` for length
    • D`stop_sequences` for both

    Why: These are independent levers: `temperature` (and top_p) controls randomness/diversity of sampling, while `max_tokens` caps how long the response may run. Raising one does not affect the other.

  9. API-09 · Question 9 of 9

    What is the role of a `stop_sequences` value in a request?

    • AIt sets the minimum output length
    • BIt lists strings that, when generated, end the response with a `stop_sequence` stop reason Correct answer
    • CIt filters unsafe content from the output
    • DIt defines which tools the model may call

    Why: Stop sequences are custom strings that halt generation as soon as the model produces one. The response ends with a `stop_sequence` stop reason, which is useful for delimiting structured output or cutting off at a known boundary.