Lesson 0003 · Serve + Retrieve stages

Structured Data

Stop making machines parse your prose. Hand them the fact, already labelled.

Recap from Lesson 0002: your page is now crawlable and indexable. That gets it into the system. Structured data is the first lever that makes it stand out — and it’s the first thing that serves both pipelines at once.

Structured data is machine-readable markup you embed in a page that says, explicitly, “this is a Product, its price is 29 USD, it’s in stock.” No NLP guesswork. Google supports three formats (JSON-LD, Microdata, RDFa) and recommends JSON-LD: a <script type="application/ld+json"> block of JSON, decoupled from your visible HTML.^[1]

Your win: generate valid JSON-LD for a page and lint it against Google’s required-property rules — catching “this won’t earn a rich result” before you ship, not weeks later in Search Console.

Prose vs. labelled fact

What a crawler sees in prose: “Our Acme Widget Pro is available now for just twenty-nine dollars…” — an engine must infer: is that a product? a price? which currency? Is it in stock? Maybe right, maybe not. The same fact as JSON-LD leaves nothing to guess:

{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Acme Widget Pro",
  "offers": {
    "@type": "Offer",
    "price": "29.00",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock"
  }
}
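To land on a page, that JSON has to sit inside a <script type="application/ld+json"> block. Here is a minimal Python sketch of one way to produce it. The helper name to_script_tag is hypothetical (it is not part of tools/schema_tool.py), and the dict carries the "@context" and "@type": "Offer" keys that complete JSON-LD normally includes.

```python
import json


def to_script_tag(data: dict) -> str:
    """Serialize a JSON-LD dict into a <script type="application/ld+json"> block.

    Hypothetical helper for illustration; not the workspace tool's API.
    """
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "<" so a string value containing "</script>" cannot close the tag early.
    # JSON parsers decode \u003c back to "<", so the data is unchanged.
    payload = payload.replace("<", "\\u003c")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Acme Widget Pro",
    "offers": {
        "@type": "Offer",
        "price": "29.00",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",
    },
}

if __name__ == "__main__":
    print(to_script_tag(product))
```

Building the markup from a dict and serializing it with json.dumps keeps quoting and commas correct by construction, which hand-edited JSON tends not to.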

Why it serves both pipelines

Serve · classic search

Eligible for rich results — star ratings, prices, FAQ drop-downs right on the SERP. More space, more clicks.

Retrieve · answer engines

An LLM retrieving your page gets a pre-extracted, unambiguous fact to quote and cite — instead of risking a misread of your prose.^[2]

Reality check: Google says structured data is not a magic AEO trick and there’s no special “AI markup” — clean structured data simply makes your existing content easier for any machine to use.^[3] That’s the honest version, and it’s enough.
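To see why a labelled fact is easier to retrieve than prose, here is a rough sketch of how any program could pull JSON-LD out of a page using only Python's standard library. It illustrates the idea and makes no claim about how a particular search or answer engine works; the names JsonLdExtractor and extract_jsonld and the sample page are made up for this example.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of every application/ld+json script block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # malformed block: skip it rather than guess


def extract_jsonld(html: str) -> list:
    """Return every JSON-LD object found in the given HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


# Hypothetical sample page: the same fact appears once as markup, once as prose.
page = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Acme Widget Pro",
 "offers": {"@type": "Offer", "price": "29.00", "priceCurrency": "USD"}}
</script>
</head><body>
<p>Our Acme Widget Pro is available now for just twenty-nine dollars.</p>
</body></html>"""

if __name__ == "__main__":
    for item in extract_jsonld(page):
        offer = item.get("offers", {})
        print(item.get("@type"), item.get("name"), offer.get("price"), offer.get("priceCurrency"))
```

The type, name, price, and currency come out as plain dictionary lookups. Getting the same four values from the paragraph in the body would take language understanding, and could go wrong.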

Required vs. recommended

Every type has properties Google marks REQUIRED (miss one → no rich result at all) and RECOMMENDED (skip → still valid, just less rich). Knowing which is which is the whole game — and exactly what a linter automates.

Type	What it needs
`FAQPage`	REQUIRED `mainEntity` → each `Question` needs `name` + `acceptedAnswer.text`
`Product`	REQUIRED `name`; when `offers` is present, it needs `price` + `priceCurrency`
`Article`	RECOMMENDED `headline`, `image`, `datePublished`, `author`
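The table above translates almost directly into data plus a loop. What follows is a minimal sketch of that idea, not the source of tools/schema_tool.py: the RULES layout and the lint function are assumptions made for illustration, the recommended Product properties (image, offers) are taken from this lesson's lint transcript, and "@type" is assumed to be a single string.

```python
# Rule table mirroring the lesson's three types (illustrative layout).
RULES = {
    "FAQPage": {"required": ["mainEntity"], "recommended": []},
    "Product": {"required": ["name"], "recommended": ["image", "offers"]},
    "Article": {
        "required": [],
        "recommended": ["headline", "image", "datePublished", "author"],
    },
}


def lint(doc: dict) -> list:
    """Return (level, message) tuples, where level is "FAIL" or "WARN"."""
    kind = doc.get("@type")
    rules = RULES.get(kind) if isinstance(kind, str) else None
    if rules is None:
        return [("FAIL", f"unsupported or missing @type: {kind!r}")]

    findings = []
    for prop in rules["required"]:
        if not doc.get(prop):
            findings.append(("FAIL", f"{kind} missing required property: {prop}"))
    for prop in rules["recommended"]:
        if not doc.get(prop):
            findings.append(("WARN", f"{kind} missing recommended property: {prop}"))

    # Nested rule: an offers object, when present, needs price and priceCurrency.
    if kind == "Product" and isinstance(doc.get("offers"), dict):
        for prop in ("price", "priceCurrency"):
            if not doc["offers"].get(prop):
                findings.append(("FAIL", f"Product offers missing required property: {prop}"))

    # Nested rule: each Question needs name and acceptedAnswer.text.
    if kind == "FAQPage":
        questions = doc.get("mainEntity") or []
        if isinstance(questions, dict):
            questions = [questions]
        for i, q in enumerate(questions, start=1):
            if not isinstance(q, dict):
                findings.append(("FAIL", f"Question {i} is not an object"))
                continue
            if not q.get("name"):
                findings.append(("FAIL", f"Question {i} missing required property: name"))
            answer = q.get("acceptedAnswer")
            if not (isinstance(answer, dict) and answer.get("text")):
                findings.append(("FAIL", f"Question {i} missing required property: acceptedAnswer.text"))
    return findings


if __name__ == "__main__":
    bad = {"@context": "https://schema.org", "@type": "Product"}
    for level, message in lint(bad):
        print(f"[{level}] {message}")
```

Keeping the rules in a plain dict means that supporting a fourth type is mostly a data change. Only nested requirements, like the ones for offers and Question, need their own code.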

The skill: generate & lint with a script

In this workspace: tools/schema_tool.py — stdlib only. It emits a copy-paste sample for a type and lints any JSON-LD file against the rules above.

Do this now:

Self-check: python3 tools/schema_tool.py --demo
Get markup you can paste into a page <head>: python3 tools/schema_tool.py sample FAQPage
Break it on purpose: save a Product JSON-LD with no name as bad.json, then run python3 tools/schema_tool.py lint bad.json and watch it flag the missing property as [FAIL].

$ python3 tools/schema_tool.py lint bad.json

Linting JSON-LD: @type = Product
──────────────────────────────────────────────
[FAIL] Product missing required property: name
[WARN] Product missing recommended property: image
[WARN] Product missing recommended property: offers
──────────────────────────────────────────────
VERDICT: INVALID — 1 error(s) block the rich result.
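If you would rather script the broken file than type it by hand, a sketch like the following writes one. It assumes the linter accepts a file holding a bare JSON object (the lesson only says it lints "any JSON-LD file"); write_jsonld and the description text are made up for this example.

```python
import json
from pathlib import Path

# A Product with no "name", "image", or "offers": the omissions are deliberate.
bad_product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "description": "A widget with no name.",
}


def write_jsonld(doc: dict, path: str) -> Path:
    """Write a JSON-LD dict to disk as pretty-printed UTF-8 JSON (hypothetical helper)."""
    target = Path(path)
    target.write_text(json.dumps(doc, indent=2) + "\n", encoding="utf-8")
    return target


if __name__ == "__main__":
    print(write_jsonld(bad_product, "bad.json"))
```

Run it, then point the lint command at bad.json. Add "name" back to the dict and rerun to see whether the failure clears.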

Ceiling: this linter encodes Google’s rules for 3 types, not the full schema.org vocabulary. Before shipping production markup, confirm with Google’s official Rich Results Test and the schema.org validator. Your script is the fast inner loop; theirs is the source of truth.

Retrieval practice · no peeking

Check yourself

Answer from memory — that effort is what makes it stick. One try each; pick before you read the others.

Question 1 / 4

Which structured-data format does Google recommend?

Question 2 / 4

Structured data most directly helps which two pipeline stages?

Question 3 / 4

A required property is missing from your Product markup. Consequence?

Question 4 / 4

Why are FAQPage question/answer pairs strong for AEO?

Primary source — read this next (≈12 min)

"Intro to structured data markup in Google Search" — Google Search Central

Formats, where to put it, and the per-type requirement lists your linter mirrors. Pair with schema.org Getting Started (https://schema.org/docs/gs.html). Full list in RESOURCES.md (/resources/).

Stuck or curious? This agent is your teacher. Ask it anything — “show me a real robots.txt”, “do Claude and Perplexity retrieve differently?” — followups are the fastest way to learn.