Lesson 0003 · Serve + Retrieve stages
Structured Data
Stop making machines parse your prose. Hand them the fact, already labelled.
Recap from Lesson 0002: your page is now crawlable and indexable. That gets it into the system. Structured data is the first lever that makes it stand out — and it’s the first thing that serves both pipelines at once.
Structured data is machine-readable markup you embed in a page that says, explicitly, “this is a Product, its price is 29 USD, it’s in stock.” No NLP guesswork. Google supports several formats but recommends one: JSON-LD, a <script type="application/ld+json"> block of JSON kept separate from your visible HTML.[1]
Prose vs. labelled fact
What a crawler sees in prose: “Our Acme Widget Pro is available now for just twenty-nine dollars…” — an engine must infer: is that a product? a price? which currency? Is it in stock? Maybe right, maybe not. The same fact as JSON-LD leaves nothing to guess:
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Acme Widget Pro",
  "offers": {
    "@type": "Offer",
    "price": "29.00",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock"
  }
}
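To show where that JSON ends up, here is a minimal sketch that serializes the fact and wraps it in the script element a page would carry. The dict mirrors the Acme Widget Pro example, with the `@context` and `Offer` type that a complete JSON-LD document normally includes. The function name `to_jsonld_script` is hypothetical, chosen for illustration only.

```python
import json

# Product data from the Acme Widget Pro example.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Acme Widget Pro",
    "offers": {
        "@type": "Offer",
        "price": "29.00",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",
    },
}


def to_jsonld_script(data: dict) -> str:
    """Serialize a dict as a JSON-LD <script> block (hypothetical helper)."""
    body = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a string value can never close the script element early.
    # "\/" is a valid JSON escape for "/", so the parsed data is unchanged.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


print(to_jsonld_script(product))
```

The printed block can be pasted into the page as-is. Because it lives in its own element, the visible copy can be reworded freely without touching the labelled fact.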
Why it serves both pipelines
- Search results: valid markup makes the page eligible for rich results (star ratings, prices, FAQ drop-downs) right on the SERP. More space, more clicks.
- Answer engines: an LLM retrieving your page gets a pre-extracted, unambiguous fact to quote and cite, instead of risking a misread of your prose.[2]
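To make the retrieval side concrete, here is a rough sketch of how any machine consumer could pull the labelled fact out of a page using only the Python standard library. The class name `JsonLdExtractor` and the sample page are illustrative assumptions; real crawlers and retrieval systems are more involved and do not necessarily work this way.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of every JSON-LD <script> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content arrives here as raw text.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.items.append(json.loads("".join(self._buffer)))


# Illustrative page: the same fact appears once as markup, once as prose.
page = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product",
 "name": "Acme Widget Pro",
 "offers": {"@type": "Offer", "price": "29.00", "priceCurrency": "USD"}}
</script>
</head><body>
<p>Our Acme Widget Pro is available now for just twenty-nine dollars.</p>
</body></html>
"""

extractor = JsonLdExtractor()
extractor.feed(page)
offer = extractor.items[0]["offers"]
print(offer["price"], offer["priceCurrency"])  # prints: 29.00 USD
```

Reading the price from the markup is a dictionary lookup. Reading it from the paragraph would mean interpreting "twenty-nine dollars" and guessing the currency.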
Reality check: Google says structured data is not a magic AEO trick and there’s no special “AI markup” — clean structured data simply makes your existing content easier for any machine to use.[3] That’s the honest version, and it’s enough.
Required vs. recommended
Every type has properties Google marks REQUIRED (miss one → no rich result at all) and RECOMMENDED (skip → still valid, just less rich). Knowing which is which is the whole game — and exactly what a linter automates.
| Type | What it needs |
|---|---|
| FAQPage | REQUIRED mainEntity → each Question needs name + acceptedAnswer.text |
| Product | REQUIRED name, and in offers: price + priceCurrency |
| Article | RECOMMENDED headline, image, datePublished, author |
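The table translates almost directly into a rule lookup. The sketch below is a hypothetical, minimal linter written for illustration. It is not the source of tools/schema_tool.py, and the names `RULES` and `lint` are assumptions. The recommended properties listed for Product (image, offers) are taken from the sample lint output in this lesson rather than from the table.

```python
# Hypothetical rule table mirroring the three rows above.
RULES = {
    "FAQPage": {"required": ["mainEntity"], "recommended": []},
    "Product": {"required": ["name"], "recommended": ["image", "offers"]},
    "Article": {
        "required": [],
        "recommended": ["headline", "image", "datePublished", "author"],
    },
}


def lint(doc):
    """Return a list of (level, message) pairs; level is "FAIL" or "WARN"."""
    schema_type = doc.get("@type")
    rules = RULES.get(schema_type)
    if rules is None:
        return [("FAIL", f"unsupported or missing @type: {schema_type!r}")]

    findings = []
    for prop in rules["required"]:
        if not doc.get(prop):
            findings.append(("FAIL", f"{schema_type} missing required property: {prop}"))
    for prop in rules["recommended"]:
        if not doc.get(prop):
            findings.append(("WARN", f"{schema_type} missing recommended property: {prop}"))

    # Nested rule: an offers object, when present, needs price + priceCurrency.
    offers = doc.get("offers")
    if schema_type == "Product" and isinstance(offers, dict):
        for prop in ("price", "priceCurrency"):
            if not offers.get(prop):
                findings.append(("FAIL", f"Product offers missing required property: {prop}"))

    # Nested rule: each Question needs name + acceptedAnswer.text.
    if schema_type == "FAQPage":
        questions = doc.get("mainEntity") or []
        if isinstance(questions, dict):  # tolerate a single Question object
            questions = [questions]
        for i, q in enumerate(questions):
            if not isinstance(q, dict) or not q.get("name"):
                findings.append(("FAIL", f"Question {i} missing required property: name"))
            answer = q.get("acceptedAnswer") if isinstance(q, dict) else None
            if not isinstance(answer, dict) or not answer.get("text"):
                findings.append(("FAIL", f"Question {i} missing required property: acceptedAnswer.text"))
    return findings


for level, message in lint({"@type": "Product"}):
    print(f"[{level}] {message}")
```

Any FAIL means the markup is not eligible for the rich result. WARN findings leave it valid but less rich, which is the same split the table draws.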
The skill: generate & lint with a script
In this workspace: tools/schema_tool.py — stdlib only. It emits a copy-paste sample for a type and lints any JSON-LD file against the rules above.
- Self-check: `python3 tools/schema_tool.py --demo`
- Get markup you can paste into a page's <head>: `python3 tools/schema_tool.py sample FAQPage`
- Break it on purpose: save a Product JSON-LD with no name, then run `python3 tools/schema_tool.py lint bad.json` and watch it flag the error.
$ python3 tools/schema_tool.py lint bad.json
Linting JSON-LD: @type = Product
──────────────────────────────────────────────
[FAIL] Product missing required property: name
[WARN] Product missing recommended property: image
[WARN] Product missing recommended property: offers
──────────────────────────────────────────────
VERDICT: INVALID — 1 error(s) block the rich result.
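If you would rather not hand-write the broken file, a few lines of Python can produce one. This is a sketch under assumptions: the helper name `write_bad_product` and the description text are made up for illustration. The file omits name, image, and offers to match the three findings in the lint output shown here, but whether the workspace linter prints exactly that output depends on its implementation.

```python
import json
from pathlib import Path


def write_bad_product(path="bad.json"):
    """Write a Product JSON-LD file that deliberately omits `name`."""
    doc = {
        "@context": "https://schema.org",
        "@type": "Product",
        # No "name" (required), and no "image" or "offers" (recommended).
        "description": "Deliberately incomplete markup for testing the linter.",
    }
    target = Path(path)
    target.write_text(json.dumps(doc, indent=2), encoding="utf-8")
    return target


if __name__ == "__main__":
    print(f"wrote {write_bad_product()}")
```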
Ceiling: this linter encodes Google’s rules for 3 types, not the full schema.org vocabulary. Before shipping production markup, confirm with Google’s official Rich Results Test and the schema.org validator. Your script is the fast inner loop; theirs is the source of truth.
Retrieval practice · no peeking
Check yourself
Answer from memory — that effort is what makes it stick. One try each; pick before you read the others.
FAQPage question/answer pairs strong for AEO?
Formats, where to put it, and the per-type requirement lists your linter mirrors. Pair with <a href="https://schema.org/docs/gs.html">schema.org Getting Started</a>. Full list in <a href="/resources/"><code>RESOURCES.md</code></a>.