AIStackWatch

Function-calling API

Function-calling is the provider feature that lets you declare a typed schema up front and get the model to respond with a JSON object that matches it. It is the primary way modern apps extract structured data from an LLM and the foundation for tool-use.

Two distinct uses

  • Tool-use. The model decides which function to call and with what arguments; your code executes it and sends the result back. See the tool-use wiki page for the full loop.
  • Structured output. You force the model to return JSON matching a schema — for classification, form extraction, or any downstream system that expects typed data.

Most providers expose both under different names. OpenAI's response_format: {type: "json_schema"}, Anthropic's tool definitions, and Gemini's structured output mode all implement the same idea.
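To make the tool-use case concrete, here is a minimal sketch of the part your code owns: declaring a tool and executing a call the model has issued. The tool name get_weather, the registry TOOLS, the helper run_tool_call, and the shape of the model_call dict are all hypothetical; real providers wrap the call in their own envelope, so treat this as an illustration rather than any provider's API.

```python
import json

# Hypothetical tool declaration: a name, a description, and a JSON Schema
# describing the arguments the model is allowed to pass.
GET_WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Look up the current temperature for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string", "description": "City name"}},
        "required": ["city"],
    },
}


def get_weather(city: str) -> dict:
    # Stand-in implementation; a real one would call a weather service.
    return {"city": city, "temp_c": 21}


# Registry mapping tool names to the local functions that implement them.
TOOLS = {"get_weather": get_weather}


def run_tool_call(call: dict) -> str:
    """Execute one model-issued tool call and return the result as a JSON string."""
    func = TOOLS.get(call["name"])
    if func is None:
        return json.dumps({"error": f"unknown tool: {call['name']}"})
    # In this sketch the arguments arrive as a JSON-encoded string (an assumption).
    args = json.loads(call["arguments"])
    return json.dumps(func(**args))


# Simulated model output for the tool-use case.
model_call = {"name": "get_weather", "arguments": '{"city": "Oslo"}'}
tool_result = run_tool_call(model_call)
```

In a real loop, tool_result would be sent back to the model as the tool's response so it can produce its final answer.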

Why use it instead of prompting for JSON

Prompted JSON ("return a JSON object with these fields") works roughly 95% of the time. The remaining 5% or so fails on trailing commas, output wrapped in markdown fences, invented fields, or truncated output. For anything that writes to a database, a failure rate of that size turns into a steady stream of bug reports.
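The failure modes named above are easy to reproduce with a strict parser. The sketch below feeds made-up replies (not real model output) to Python's json.loads and records which ones parse. The helper name parses and the sample strings are illustrative assumptions.

```python
import json

# Three backticks built at runtime, standing in for a markdown code fence.
FENCE = "`" * 3

# Made-up replies of the kind a prompt-only approach can produce.
replies = {
    "valid": '{"status": "done", "priority": 2}',
    "trailing_comma": '{"status": "done", "priority": 2,}',
    "markdown_wrapped": FENCE + 'json\n{"status": "done", "priority": 2}\n' + FENCE,
    "truncated": '{"status": "done", "prio',
    "invented_field": '{"status": "done", "priority": 2, "mood": "great"}',
}


def parses(text: str) -> bool:
    """Return True if the text is syntactically valid JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


results = {name: parses(text) for name, text in replies.items()}
```

Note that the invented-field reply parses cleanly: it is valid JSON that still violates the intended shape, which a parser alone cannot catch.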

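Structured output enforces the schema on the provider side at decoding time: where strict enforcement is supported, the model can only emit tokens that keep the output valid against the schema. You should still validate with Zod or Pydantic for defense in depth, but parser errors drop to near zero.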
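A defense-in-depth check can be small. The sketch below uses only the Python standard library as a stand-in for Pydantic, so that it runs without extra dependencies; in practice a Pydantic model would replace most of it. The Ticket type, its two fields, and the parse_ticket helper are hypothetical.

```python
import json
from dataclasses import dataclass

ALLOWED_STATUS = {"pending", "done", "blocked"}
EXPECTED_FIELDS = {"customer_email", "status"}


@dataclass(frozen=True)
class Ticket:
    customer_email: str
    status: str


def parse_ticket(raw: str) -> Ticket:
    """Parse and validate a structured reply; raise ValueError on any mismatch."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    # Reject fields the schema does not declare.
    extra = set(data) - EXPECTED_FIELDS
    if extra:
        raise ValueError(f"unexpected fields: {sorted(extra)}")
    email = data.get("customer_email")
    status = data.get("status")
    # Deliberately loose email check; it only guards against obvious garbage.
    if not isinstance(email, str) or "@" not in email:
        raise ValueError("customer_email must be a string containing '@'")
    if not isinstance(status, str) or status not in ALLOWED_STATUS:
        raise ValueError(f"status must be one of {sorted(ALLOWED_STATUS)}")
    return Ticket(customer_email=email, status=status)


ticket = parse_ticket('{"customer_email": "a@example.com", "status": "done"}')
```

Raising a single exception type keeps the calling code simple: one except clause can decide whether to retry the request or route the item to manual review.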

Schema best practices

  • Generate the JSON Schema from typed definitions (zod-to-json-schema, Pydantic) rather than hand-writing it.
  • Keep field names concrete and matched to the task. customer_email beats email.
  • Include descriptions — they are part of the prompt the model sees.
  • Enum fields are your friend. status: "pending" | "done" | "blocked" is far more reliable than a free-form string.
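The practices above can be combined in one small schema. The sketch below is written by hand with the standard library only so it stays self-contained; in a real project the same structure would normally be generated from a Pydantic model or Zod schema, as recommended above. The Status enum, the field names, and the descriptions are illustrative assumptions.

```python
import json
from enum import Enum


class Status(str, Enum):
    PENDING = "pending"
    DONE = "done"
    BLOCKED = "blocked"


# Concrete field names, a description on every field, and an enum in place
# of a free-form string. The enum values come from the Status class, so the
# schema and the application code share one source of truth.
TICKET_SCHEMA = {
    "type": "object",
    "properties": {
        "customer_email": {
            "type": "string",
            "description": "Email address of the customer who opened the ticket.",
        },
        "status": {
            "type": "string",
            "enum": [s.value for s in Status],
            "description": "Current state of the ticket.",
        },
    },
    "required": ["customer_email", "status"],
    "additionalProperties": False,
}

# Serialized form, ready to pass to whichever schema parameter a provider expects.
schema_json = json.dumps(TICKET_SCHEMA, indent=2)
```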

Failure modes

  • Over-nested schemas. Models degrade past 3-4 levels of nesting. Flatten where you can.
  • Required fields the model can't infer. It will hallucinate values rather than violate the schema. Mark unknowns optional.
  • Large enums. A 500-label enum produces wrong labels. Use retrieval to narrow before the LLM picks.
  • Silent truncation. If the response hits max_tokens, the JSON is cut off mid-object and the parser fails. Set generous limits for structured responses and check the stop reason before parsing.
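Two of the mitigations above lend themselves to a short sketch: narrowing a large enum before the model sees it, and refusing to parse output that was cut off. The retrieval step here is a crude lexical match from difflib, standing in for whatever search or embedding lookup a real system would use. The function names, the label list, and the "length" stop-reason value are assumptions; providers report truncation under different field names and values.

```python
import difflib
import json


def narrow_labels(query: str, labels: list[str], k: int = 5) -> list[str]:
    """Return up to k labels most similar to the query (crude lexical retrieval)."""
    # cutoff=0.0 disables the similarity threshold so k candidates always come back.
    return difflib.get_close_matches(query, labels, n=k, cutoff=0.0)


def parse_complete_json(text: str, stop_reason: str) -> dict:
    """Parse a structured reply, refusing output reported as cut off."""
    # "length" is a placeholder; check your provider's documented truncation signal.
    if stop_reason == "length":
        raise ValueError("response hit the token limit; raise max_tokens and retry")
    return json.loads(text)


# Hypothetical label set; imagine hundreds of entries instead of five.
ALL_LABELS = [
    "billing_refund",
    "billing_invoice",
    "shipping_delay",
    "shipping_damage",
    "account_login",
]

# Only the shortlisted labels would go into the enum of the schema sent to the model.
shortlist = narrow_labels("billing_refund_request", ALL_LABELS, k=2)
```

The shortlist replaces the full label set in the schema's enum, so the model chooses among a handful of plausible options instead of hundreds.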

When NOT to use it

If the output is pure natural language (a chat reply, a summary, marketing copy), forcing JSON structure is a downgrade. Keep text as text.