Lesson 0007 · Telling engines you changed
Sitemaps + IndexNow
Publishing a page is half the job. The other half is the ping: one mechanism that every engine pulls on its own slow schedule, and one that pushes instantly to everyone except Google.
Recap from Lesson 0006: we made sure your content is actually in the HTML the server sends. Now the page exists and renders. But an engine still has to find out that it exists, or that it changed. That discovery step is its own pipeline stage, and it's the most automatable one in the whole course.
Two notification mechanisms, and the asymmetry is the lesson. A sitemap is passive: a list of your URLs the engine pulls when it feels like it. Slow, but universal, Google included. IndexNow is active: you POST changed URLs and participating engines fetch within minutes, then share the ping with each other.[2] The catch: Google doesn't participate in IndexNow; it relies on its own crawl scheduling plus your sitemap.[3] So a real publishing pipeline fires both.
sitemap_ping.py generates a spec-valid sitemap.xml from a URL list, validates an existing one against the real sitemaps.org limits, and builds a correct IndexNow payload (key + keyLocation + urlList): the exact two calls a publish hook should make.
Who listens to what
| Mechanism | Google | Bing / Yandex / Naver / Seznam | Speed |
|---|---|---|---|
| sitemap.xml (passive pull) | yes | yes | slow, on the engine's schedule |
| IndexNow (active push) | no, ignores it[3] | yes, shared across all[2] | minutes |
Why a builder cares about the engines Google snubs: several of them feed AI answers (Bing's index sits behind Copilot and others), so an instant IndexNow push is partly an AEO discovery move, not just classic SEO.
- sitemap.xml (passive pull): one <loc> per URL. Optional <lastmod> (W3C date).[1] All URLs must be on one host. Submit once via Search Console or robots.txt; re-fetched on Google's schedule. Pro: universal · Con: slow, no "it changed!" signal
- IndexNow (active push): POST {host, key, keyLocation, urlList} to the endpoint, ≤ 10,000 URLs per call.[2] Key = 8–128 chars [a-zA-Z0-9-], hosted in a text file at your root so the engine can verify you own the site. Pro: instant + shared · Con: Google not included
The tool: build, validate, push
Both halves are pure data transforms: perfect for a publish hook, and offline-testable. The IndexNow side validates everything before it would ever hit the network:
# IndexNow payload is built + checked offline; --send is the only network call
key matches ^[A-Za-z0-9-]{8,128}$ # or it's rejected
all URLs share one host # keyLocation must be on it too
<= 10,000 URLs per post
# sitemap side: well-formed, <=50k URLs, <=50MB, one host, valid lastmod
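The IndexNow checks listed above can be sketched as a small offline function. This is a minimal illustration, not the actual code in sitemap_ping.py: the name `validate_indexnow` and the wording of the messages are assumptions, and only the three rules stated above (key pattern, single host including keyLocation, at most 10,000 URLs) are enforced.

```python
import re
from urllib.parse import urlsplit

# Rules taken from the checklist above; names here are hypothetical.
KEY_RE = re.compile(r"^[A-Za-z0-9-]{8,128}$")
MAX_URLS_PER_POST = 10_000


def validate_indexnow(key, key_location, urls):
    """Return a list of problems; an empty list means the payload looks sendable."""
    problems = []
    if not KEY_RE.fullmatch(key):
        problems.append("key must be 8-128 chars of [A-Za-z0-9-]")
    if not urls:
        problems.append("urlList is empty")
    if len(urls) > MAX_URLS_PER_POST:
        problems.append(f"{len(urls)} URLs exceeds {MAX_URLS_PER_POST} per post")
    # Every URL, and the key file, must live on the same host.
    hosts = {urlsplit(u).hostname or "" for u in urls}
    if len(hosts) > 1:
        problems.append(f"URLs span multiple hosts: {sorted(hosts)}")
    key_host = urlsplit(key_location).hostname or ""
    if hosts and key_host not in hosts:
        problems.append("keyLocation is not on the same host as the URLs")
    return problems
```

Returning a list of problems rather than raising on the first one lets a publish hook report everything wrong in a single pass, and nothing here touches the network.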
The generated sitemap.xml is plain, spec-valid XML: one <loc> per URL, optional <lastmod>, all on one host:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://yoursite.com/new-page</loc>
<lastmod>2026-06-22</lastmod>
</url>
</urlset>
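A file of that shape can be produced with the standard library alone. The sketch below is one plausible way to do it, assuming Python 3.9+ (for `ET.indent`); the function name `build_sitemap` and the `(url, lastmod)` tuple input are illustrative choices, not the tool's actual interface.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """entries: iterable of (url, lastmod or None) -> sitemap XML as a string."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, lastmod in entries:
        node = ET.SubElement(urlset, "url")
        # ElementTree escapes &, < and > in text, so query strings stay valid XML.
        ET.SubElement(node, "loc").text = url
        if lastmod:
            ET.SubElement(node, "lastmod").text = lastmod
    ET.indent(urlset)  # pretty-print; requires Python 3.9+
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body + "\n"


if __name__ == "__main__":
    print(build_sitemap([("https://yoursite.com/new-page", "2026-06-22")]))
```

Building the tree with ElementTree instead of string formatting means URL escaping is handled for you, which is the most common way hand-rolled sitemaps end up malformed.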
The IndexNow dry run prints the exact POST body before any --send:
{
"host": "yoursite.com",
"key": "a1b2c3d4e5f6a7b8",
"keyLocation": "https://yoursite.com/a1b2c3d4e5f6a7b8.txt",
"urlList": ["https://yoursite.com/new-page"]
}
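Building that body and keeping the network call behind an explicit opt-in might look like the following. This is a hedged sketch: `build_payload`, `dry_run`, and `send` are hypothetical names, the `https://<host>/<key>.txt` key location mirrors the example above, and the endpoint URL is an assumption based on the public IndexNow documentation, so confirm it there before relying on it.

```python
import json
import urllib.request
from urllib.parse import urlsplit

# Assumption: shared endpoint as described in the public IndexNow docs.
INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_payload(urls, key):
    """Build the POST body; the key file is assumed to sit at the site root."""
    host = urlsplit(urls[0]).hostname
    return {
        "host": host,
        "key": key,
        "keyLocation": f"https://{host}/{key}.txt",
        "urlList": list(urls),
    }


def dry_run(payload):
    """Return the exact JSON that would be sent, without touching the network."""
    return json.dumps(payload, indent=2)


def send(payload):
    """The only function that hits the network; returns the HTTP status code."""
    req = urllib.request.Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status


if __name__ == "__main__":
    print(dry_run(build_payload(["https://yoursite.com/new-page"], "a1b2c3d4e5f6a7b8")))
```

Keeping `send` separate from `build_payload` and `dry_run` is what makes the pipeline offline-testable: a test suite can exercise everything except the one function that performs the POST.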
- Self-check (offline):
  python3 tools/sitemap_ping.py --demo
- Make a urls.txt (one URL per line, optional tab + lastmod), generate + validate:
  python3 tools/sitemap_ping.py gen urls.txt > sitemap.xml
  then
  python3 tools/sitemap_ping.py check sitemap.xml
- Build an IndexNow push (dry run: prints the key-file step + the exact POST body):
  python3 tools/sitemap_ping.py indexnow https://yoursite.com/new-page --key <your-key>
$ python3 tools/sitemap_ping.py check sitemap.xml
Sitemap check
──────────────────────────────────────────────
[PASS] well-formed XML
[PASS] root is <urlset>
[PASS] every <url> has a <loc> (412 urls)
[PASS] <= 50,000 URLs (412)
[PASS] <= 50 MB uncompressed
[FAIL] all URLs share one host (site.com, cdn.site.com)
[FAIL] lastmod dates valid (W3C) (2 bad: ['19/06/2026'])
──────────────────────────────────────────────
VERDICT: 2 problem(s) - engines may reject it.
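The checks in that report can be approximated in a few dozen lines. The sketch below is an illustration under stated assumptions, not the tool's source: `check_sitemap` and `valid_lastmod` are hypothetical names, and the lastmod check is deliberately simplified to accept only `YYYY-MM-DD` or a full timestamp with a timezone (the W3C format also allows year-only and year-month forms, which this version would flag).

```python
import re
import xml.etree.ElementTree as ET
from datetime import date
from urllib.parse import urlsplit

NS_URI = "http://www.sitemaps.org/schemas/sitemap/0.9"
NS = {"sm": NS_URI}
# Simplified W3C datetime: a date, optionally followed by a time with timezone.
W3C_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?$"
)


def valid_lastmod(value):
    """True if value matches the simplified pattern and is a real calendar date."""
    if not W3C_RE.fullmatch(value):
        return False
    try:
        date.fromisoformat(value[:10])
    except ValueError:
        return False
    return True


def check_sitemap(xml_bytes):
    """Return a list of (passed, label) pairs mirroring the report above."""
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError:
        return [(False, "well-formed XML")]
    results = [(True, "well-formed XML")]
    results.append((root.tag == f"{{{NS_URI}}}urlset", "root is <urlset>"))
    urls = root.findall("sm:url", NS)
    locs = []
    for u in urls:
        loc = u.find("sm:loc", NS)
        locs.append((loc.text or "").strip() if loc is not None else "")
    results.append((all(locs), f"every <url> has a <loc> ({len(urls)} urls)"))
    results.append((len(urls) <= 50_000, f"<= 50,000 URLs ({len(urls)})"))
    results.append((len(xml_bytes) <= 50 * 1024 * 1024, "<= 50 MB uncompressed"))
    hosts = sorted({urlsplit(u).hostname or "" for u in locs if u})
    results.append((len(hosts) <= 1, f"all URLs share one host ({', '.join(hosts)})"))
    bad = [
        (el.text or "").strip()
        for el in root.iterfind("sm:url/sm:lastmod", NS)
        if not valid_lastmod((el.text or "").strip())
    ]
    results.append((not bad, f"lastmod dates valid (W3C) ({len(bad)} bad)"))
    return results
```

Feeding it the raw bytes of the file (for example `open("sitemap.xml", "rb").read()`) lets the size check measure the uncompressed file as stored, and a `19/06/2026` date fails the pattern check just as in the report.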
(1) Google's Indexing API officially covers only JobPosting and livestream pages; for everything else it's sitemap + crawl schedule, with a manual "Request indexing" in Search Console.
(2) The key file must actually be reachable at keyLocation and contain the key, or every push is rejected (same trust model as robots.txt).
(3) A sitemap is discovery, not a ranking lever. Listing a URL gets it found; it does nothing for whether it ranks or gets cited. Don't expect traffic from a sitemap alone.
Ceiling to know: sitemap_ping.py writes a single sitemap; past 50k URLs you need a sitemap index (a sitemap of sitemaps), noted in the tool as the next upgrade. It defaults IndexNow to a dry run; --send is the one line that touches the network.
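For the 50k ceiling, the upgrade amounts to splitting the URL list into chunks and writing one index file that points at each chunk's sitemap. A minimal sketch, assuming Python 3.9+ and a hypothetical `sitemap-<n>.xml` naming scheme (the tool does not do this yet, so every name below is illustrative):

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000


def chunk(urls, size=MAX_URLS_PER_SITEMAP):
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_sitemap_index(base_url, n_files):
    """Return a <sitemapindex> document listing sitemap-1.xml .. sitemap-N.xml."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for i in range(1, n_files + 1):
        node = ET.SubElement(index, "sitemap")
        # Hypothetical file naming; any stable scheme on the same host works.
        ET.SubElement(node, "loc").text = f"{base_url}/sitemap-{i}.xml"
    ET.indent(index)  # requires Python 3.9+
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body + "\n"
```

Each chunk would then be written as its own sitemap file, and the index is what gets submitted in place of a single sitemap.xml.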
Retrieval practice · no peeking
Ping check
Answer from memory: that effort is what makes it stick. One try each; pick before you read the others.
The authoritative spec: tags, the 50,000-URL / 50 MB limits, W3C date format, single-host rule. Pair with Google's "Build and submit a sitemap" (https://developers.google.com/search/docs/crawling-indexing/sitemaps/build-sitemap) for submission, and the IndexNow protocol docs (https://www.indexnow.org/documentation) for the key + payload. Google's non-participation is tracked in RESOURCES (/resources/).