Structured Outputs — JSON Mode and Response Schemas
JSON mode gives you valid JSON. Structured outputs with strict schema enforcement give you exactly the fields you defined — no extras, no missing keys, no hallucinated properties. These are not the same thing.
Language models produce text. Your application needs structured data. Bridging that gap reliably — at production scale — is the challenge that structured outputs solve.
Before structured outputs existed, teams wrote regex parsers, custom JSON extractors, and retry loops to coerce model output into machine-readable format. All of that complexity is now unnecessary if you use the right API parameter.
JSON Mode vs Structured Outputs
These are frequently confused. They are different.
JSON mode (response_format: { type: "json_object" }) tells the model to produce valid JSON. The model will always return parseable JSON. But the shape of that JSON is unconstrained — the model decides what fields to include. You still need to validate that the fields you expect are actually present.
Structured outputs (response_format: { type: "json_schema", json_schema: {...}, strict: true }) enforces your exact schema. The model is constrained at inference time to only produce tokens that conform to the schema. Fields not in your schema cannot appear. Fields in your required list will always be present. Types are enforced.
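To see the practical difference, here is the kind of shape-checking you still need with JSON mode. This is a minimal sketch; the raw string below is a hypothetical JSON-mode response, not real model output:

```python
import json

# Hypothetical JSON-mode output: valid JSON, but the model chose the shape.
raw = '{"revenue": 124.3, "note": "an extra field the model invented"}'
data = json.loads(raw)

expected = {"revenue_billions", "eps", "yoy_growth_pct", "beat_estimate"}
missing = expected - data.keys()
if missing:
    # With json_object you must handle this case yourself; with strict
    # json_schema, required fields are always present and extras cannot appear.
    print(f"missing fields: {sorted(missing)}")
```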
Defining a Response Schema
import json

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": "Extract financial data from the provided earnings text."
        },
        {
            "role": "user",
            "content": "Apple Q4 revenue was $124.3B, up 6% YoY. EPS hit $2.18, beating estimates of $2.10."
        }
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "earnings_data",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "revenue_billions": {
                        "type": "number",
                        "description": "Revenue in billions USD"
                    },
                    "eps": {
                        "type": "number",
                        "description": "Earnings per share"
                    },
                    "yoy_growth_pct": {
                        "type": "number",
                        "description": "Year-over-year growth percentage"
                    },
                    "beat_estimate": {
                        "type": "boolean",
                        "description": "Whether EPS beat analyst estimates"
                    }
                },
                "required": ["revenue_billions", "eps", "yoy_growth_pct", "beat_estimate"],
                "additionalProperties": False
            }
        }
    }
)

data = json.loads(response.choices[0].message.content)
# data is now guaranteed to have all four fields with correct types
Nested Objects and Arrays
Structured outputs support a documented subset of the JSON Schema spec, including nested objects and arrays:
"schema": {
"type": "object",
"properties": {
"company": {"type": "string"},
"quarters": {
"type": "array",
"items": {
"type": "object",
"properties": {
"q": {"type": "string"},
"revenue": {"type": "number"},
"yoy_pct": {"type": "number"}
},
"required": ["q", "revenue", "yoy_pct"],
"additionalProperties": False
}
}
},
"required": ["company", "quarters"],
"additionalProperties": False
}
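Assuming a response that conforms to this nested schema, the data parses directly with no defensive checks. The quarterly figures below are illustrative, with Q4 taken from the earlier example:

```python
import json

# Hypothetical response content conforming to the nested schema above.
content = '''
{
  "company": "Apple",
  "quarters": [
    {"q": "Q3", "revenue": 85.8, "yoy_pct": 4.9},
    {"q": "Q4", "revenue": 124.3, "yoy_pct": 6.0}
  ]
}
'''
report = json.loads(content)
for quarter in report["quarters"]:
    print(f'{report["company"]} {quarter["q"]}: '
          f'${quarter["revenue"]}B ({quarter["yoy_pct"]}% YoY)')
```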
Using Pydantic for Schema Generation
The Python SDK integrates with Pydantic — define your output shape as a class and the SDK generates the schema automatically:
from pydantic import BaseModel
from openai import OpenAI

class EarningsData(BaseModel):
    revenue_billions: float
    eps: float
    yoy_growth_pct: float
    beat_estimate: bool

client = OpenAI()

response = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "Extract financial data from earnings text."},
        {"role": "user", "content": "Apple Q4 revenue was $124.3B, up 6% YoY. EPS $2.18, beat $2.10 estimate."}
    ],
    response_format=EarningsData
)

data = response.choices[0].message.parsed
print(data.revenue_billions)  # 124.3 (typed float, not string)
print(data.beat_estimate)     # True (typed bool)
The .parse() method returns a typed Pydantic object, not raw JSON. Your IDE gets type hints, mypy passes, and you catch schema mismatches at development time rather than production runtime.
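You can also inspect the schema the SDK derives from your class. Assuming Pydantic v2, `model_json_schema()` shows the JSON Schema generated from the model (the SDK additionally applies strict-mode adjustments before sending it):

```python
from pydantic import BaseModel

class EarningsData(BaseModel):
    revenue_billions: float
    eps: float
    yoy_growth_pct: float
    beat_estimate: bool

# Pydantic v2: emit the JSON Schema for this model.
schema = EarningsData.model_json_schema()
print(sorted(schema["required"]))
print(schema["properties"]["beat_estimate"]["type"])  # boolean
```

Python type annotations map to JSON Schema types: `float` becomes `number`, `bool` becomes `boolean`, and every field without a default lands in `required`.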
Building an Extraction Pipeline
Structured outputs make extraction pipelines trivial to build and reliable to operate:
from pydantic import BaseModel

# Entity shapes here are assumptions for the example; adjust to your domain.
class EntityRecord(BaseModel):
    name: str
    type: str
    confidence: float

class EntityList(BaseModel):
    entities: list[EntityRecord]

def extract_entities(text: str) -> list[EntityRecord]:
    response = client.beta.chat.completions.parse(
        model="gpt-4o-mini",  # mini is sufficient for extraction
        messages=[
            {
                "role": "system",
                "content": (
                    "Extract all named entities from the text. "
                    "Return each entity with its type and confidence."
                )
            },
            {"role": "user", "content": text}
        ],
        response_format=EntityList
    )
    return response.choices[0].message.parsed.entities
When to Use Which Approach
Use structured outputs (json_schema strict) when:
- Output feeds into code that expects specific fields
- You are building a data pipeline where downstream systems ingest AI output
- Schema correctness is critical and retry logic for malformed output is unacceptable
Use JSON mode (json_object) when:
- The schema is dynamic or user-defined and cannot be predetermined
- You want valid JSON but are flexible about shape
Use plain text when:
- The output is for human consumption only
- You are building a conversational interface where structure adds no value
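When the schema truly is runtime-defined, you can still steer JSON mode through the prompt. A hedged sketch, where the helper name and prompt wording are illustrative and the payload is built but never sent:

```python
def build_json_mode_request(user_fields: list[str], text: str) -> dict:
    """Build a JSON-mode request for fields chosen at runtime."""
    field_list = ", ".join(user_fields)
    return {
        "model": "gpt-4o-mini",
        "response_format": {"type": "json_object"},
        "messages": [
            {
                "role": "system",
                "content": f"Return a JSON object with exactly these keys: {field_list}.",
            },
            {"role": "user", "content": text},
        ],
    }

req = build_json_mode_request(["sentiment", "summary"], "Great quarter for Apple.")
print(req["response_format"]["type"])  # json_object
```

Prompt-level steering is best-effort, which is exactly why JSON mode still requires validating the parsed shape downstream.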
Refusal Handling
When strict mode is enabled, the model may refuse to produce output if the input violates content policy. In this case, message.parsed is None and message.refusal contains an explanation. Always check for refusal in production code:
message = response.choices[0].message
if message.refusal:
    # Handle content policy refusal
    log_refusal(message.refusal)
    return None
data = message.parsed
Bottom Line
JSON mode = valid JSON, any shape. Structured outputs with strict schemas = exact conformance guaranteed. For any production pipeline where AI output feeds into code, use strict structured outputs. You eliminate an entire class of parsing, validation, and retry complexity.
The next lesson covers Custom GPTs — the no-code path to building specialized AI apps for specific audiences without writing any backend code.