What's the difference between AEO and SEO for a Sanity site?

Same plumbing, different consumers. SEO targets traditional search engines (Google, Bing) and AEO targets answer engines (ChatGPT, Claude, Perplexity, Google AI Overviews). Both rely on the same fundamentals: clear titles, accurate metadata, structured data, real backlinks, and fast pages. AEO adds two extras — content negotiation to serve agents clean markdown, and discovery surfaces like llms.txt and sitemap.md.

How do I get cited by ChatGPT, Claude, and Perplexity from a Sanity site?

Three things in priority order. First, derive accurate JSON-LD from your Sanity fields so structured data matches the rendered content. Second, ship content negotiation so agents requesting text/markdown get clean markdown instead of HTML noise. Third, publish llms.txt and sitemap.md as curated, agent-friendly indexes. After that, the usual SEO fundamentals do the heavy lifting: clear authors, fresh updatedAt dates, semantic headings.

Does llms.txt actually work — do AI tools read it?

Yes. Anthropic's web fetcher, Perplexity's crawler, and most agent tooling consume llms.txt and sitemap.md. ChatGPT's browse tool uses them when you point it at a URL. Every major coding agent pulls them when pointed at docs. Anyone telling you it's made up is behind on what's actually shipping.

Do I need a separate SEO plugin for Sanity?

No. Sanity ships unopinionated on purpose. You get further faster by defining your own SEO fields object and reusing it across every document type. Plugins add a layer you don't need — the patterns in this post are around 100 lines of schema.

Should I let editors write JSON-LD by hand?

No. JSON-LD should be derived from the fields editors already fill in — title, description, author, published date, image. Generate it at render time from a shared helper. If editors are pasting structured data into a textarea, you've built the wrong thing. This matters even more for AEO than SEO — answer engines weight structured data heavily when picking sources to cite.

How do I make a Sanity blog post rank for freshness in AI Overviews?

Show both publishedAt and _updatedAt visibly on the post, wire them into the BlogPosting JSON-LD as datePublished and dateModified, and actually update old posts when the content changes. ChatGPT, Perplexity, and Google AI Overviews all weight recency heavily — between two posts answering the same question, the one with a recent updatedAt usually wins the citation.

How often should I run an AEO/SEO audit on a Sanity site?

Weekly with an automated crawl (Screaming Frog, Sitebulb) feeding into Linear. Quarterly with a manual review. Lighthouse CI on every PR, plus a Vercel AI SDK agent on a cron checking AEO surfaces (llms.txt, sitemap.md, content negotiation). The goal is no surprises — if your audit catches things you didn't know about, your feedback loop is broken.

AEO/SEO best practices for Sanity: the Roboto default

This entire blog post started out as a response to somebody on Reddit. Like any well-intentioned, concise response, it started to spiral, and I ended up writing a thesis about Sanity AEO/SEO. This is that guide: how to provide the best possible AEO/SEO for your websites with Sanity.


This post is the opposite of unopinionated. It's the AEO/SEO default we ship on every Sanity project at Roboto, distilled into one place. AEO (answer engine optimisation) and SEO live on the same plumbing, so we treat them as one job: same fields, same JSON-LD, same sitemap, with a couple of extra agent-facing surfaces layered on top. If you want the canonical reference, go and look at turbo-start-sanity on GitHub. Three files do most of the work: packages/sanity/src/query.ts, apps/studio/utils/seo-fields.ts, and apps/web/src/lib/seo.ts. Copy from those and you'll have a stronger AEO/SEO baseline than most headless sites.

What follows is the why behind those files, plus the bits that go beyond what fits in a starter repo: content negotiation for agents, llms.txt and sitemap.md, accessibility as a ranking and citation signal, and the feedback loop that stops the whole thing rotting six months after launch. This is the successor to our 2023 post on SEO tips for Sanity and Next.js and complements our wider takes on headless CMS SEO and why AEO is mostly SEO with content negotiation.

Start from turbo-start-sanity

Our open-source Sanity + Next.js starter with every AEO/SEO pattern in this post wired in by default.


Why Sanity AEO/SEO feels like a mystery

Sanity will let you model anything. A blog, a 50,000-SKU e-commerce catalogue, a directory of train times for a single terminal at Paddington. The flexibility is brilliant, and it's also why every new team building on Sanity asks the same question on day one: where do I put the meta title — and how do I make sure ChatGPT and Perplexity actually cite this thing?

There's a Sanity Learn article on SEO (and a cute guy who wrote it) that covers most of the basics. It's worth a read if you've got the time. Most teams don't. What follows is the shorter, more opinionated version: the one we hand to clients and developers who want to ship the AEO/SEO bit and get on with the actual product.

From this point onwards, just trust me bro, because we've built too many of these.

The baseline: title and description on every document

Every document type (page, blog, case study, service, author, whatever) gets two fields at the top: title and description. These are the editorial defaults, and they do more than feed SEO:

The blog title is the H1 on the post and the heading on the listing card
The blog description is the meta description fallback and the card blurb on the index page
The page title and description follow the same pattern across the page builder

You can layer validation on top of these: minimum length, maximum character count, the little counter underneath the input. We tend not to. Editors find prescriptive character counts annoying, and Google's truncation behaviour shifts often enough that hard limits feel out of date within a year. The fields exist; the editor's responsible for using them sensibly.

The SEO tab: meta overrides with fallbacks

Beneath the baseline, every document gets an SEO tab. Inside it:

seoTitle — overrides the document's title in the <title> tag
seoDescription — overrides the description in the meta description
seoImage — overrides the social/Open Graph image
seoNoIndex — boolean, adds the noindex meta tag
seoHideFromLists — boolean, excludes the document from internal listings

The crucial bit: the SEO tab fields are overrides, not requirements. If seoTitle is empty, the renderer falls back to the document's title. If seoDescription is empty, it falls back to description. Editors get a consistent surface across every document type, and they only fill in the SEO tab when they need to differentiate the metadata from the visible content.

// apps/studio/utils/seo-fields.ts (simplified)
import { defineField } from "sanity";

export const seoFields = [
  defineField({
    name: "seoTitle",
    title: "SEO title",
    type: "string",
    description: "Overrides the page title in the <title> tag. Falls back to the document title.",
  }),
  defineField({
    name: "seoDescription",
    title: "SEO description",
    type: "text",
    rows: 3,
    description: "Overrides the page description in the meta description. Falls back to the document description.",
  }),
  defineField({
    name: "seoImage",
    title: "SEO image",
    type: "image",
    description: "Overrides the social sharing image. Falls back to the document's primary image.",
  }),
  defineField({
    name: "seoNoIndex",
    title: "No index",
    type: "boolean",
    description: "Hides this page from search engines.",
    initialValue: false,
  }),
  defineField({
    name: "seoHideFromLists",
    title: "Hide from lists",
    type: "boolean",
    description: "Hides this page from internal listings (blog index, related posts, etc).",
    initialValue: false,
  }),
];

That's it. Spread ...seoFields into any document schema and you've got a consistent SEO surface across the whole studio.

Open Graph (and per-network if you really need it)

Open Graph follows the same fallback pattern. ogTitle, ogDescription, ogImage. If they're not set, use the SEO fields. If those aren't set, use the baseline title and description.

If you've got a specific need to differentiate per network (a punchier headline for Twitter, a longer description for LinkedIn), split it into twitterTitle, linkedinTitle, and so on. Most projects don't need it. Default to one Open Graph block that covers every network, and break it out only when a client specifically asks.
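The fallback chain described above (Open Graph, then SEO tab, then baseline) can be sketched as one small resolver. This is a minimal sketch, not the code from turbo-start-sanity: `SeoDoc`, `firstFilled` and `resolveMeta` are hypothetical names, and it assumes an empty or whitespace-only override should count as "not set".

```typescript
// Hypothetical document shape: only the fields named in this post.
type SeoDoc = {
  title: string;
  description: string;
  seoTitle?: string | null;
  seoDescription?: string | null;
  ogTitle?: string | null;
  ogDescription?: string | null;
};

// Return the first candidate that is a non-empty string after trimming.
function firstFilled(...candidates: Array<string | null | undefined>): string | undefined {
  return candidates.find(
    (c): c is string => typeof c === "string" && c.trim().length > 0,
  );
}

function resolveMeta(doc: SeoDoc) {
  // SEO tab overrides the baseline; the baseline is the last resort.
  const title = firstFilled(doc.seoTitle, doc.title) ?? "";
  const description = firstFilled(doc.seoDescription, doc.description) ?? "";
  return {
    title,
    description,
    // Open Graph falls back to the SEO values, which already fell back to the baseline.
    ogTitle: firstFilled(doc.ogTitle) ?? title,
    ogDescription: firstFilled(doc.ogDescription) ?? description,
  };
}
```

One reason to prefer a helper like this over a bare `??`: an editor who clears a field can leave an empty string behind, and `??` treats that as a real value.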

JSON-LD: set and forget

If there's one thing I can't recommend enough, it's this: set and forget as much as you can. As soon as you put JSON-LD into a Sanity Studio, you'd better hope to god you put a validator in there, because I guarantee it's going to be broken within a week flat.

The mistake we see most often: teams add structured data as a separate editor surface. A textarea for JSON-LD, or a schema picker, or a dedicated tab in the studio. Editors are now responsible for filling in a BlogPosting schema with author, date, image, headline, duplicating what they've already put into the document. It drifts within a sprint.

The Roboto default: derive JSON-LD from existing fields, at render time, in code. Editors enter the data once, in the natural place. The schema generator pulls from those fields and emits the structured data.

// apps/web/src/lib/seo.ts (simplified)
export function generateBlogJsonLd(post: BlogPost) {
  return {
    "@context": "https://schema.org",
    "@type": "BlogPosting",
    headline: post.seoTitle ?? post.title,
    description: post.seoDescription ?? post.description,
    image: post.seoImage ?? post.image,
    datePublished: post.publishedAt,
    dateModified: post._updatedAt,
    author: post.authors.map((a) => ({
      "@type": "Person",
      name: a.name,
      url: `https://robotostudio.com/author/${a.slug}`,
      sameAs: [a.linkedin, a.twitter].filter(Boolean),
    })),
    publisher: {
      "@type": "Organization",
      name: "Roboto Studio",
      logo: { "@type": "ImageObject", url: "https://robotostudio.com/logo.png" },
    },
  };
}

The whole thing is one helper function. Parity with what's rendered, zero editor work, no drift when content changes. And it's not difficult to scaffold these days. Point any half-decent AI assistant at your document schema and it'll write the generator in one prompt.

The other side of set-and-forget: pull through everything you've already modelled. If you've got author documents with bios and social profiles, push them into the Person schema. If you've got categories, emit BlogPosting.about. Don't make editors think about it. They've already given you the data.
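Once the object is generated, it still has to land in a script tag without breaking the page. A minimal sketch of that last step, assuming the JSON-LD is embedded as a raw string; `pruneEmpty` and `serializeJsonLd` are hypothetical helper names, not functions from the starter.

```typescript
// Drop null, undefined and empty-array properties so optional fields
// (an author with no social profiles, say) don't emit empty values.
function pruneEmpty(obj: Record<string, unknown>): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const key of Object.keys(obj)) {
    const value = obj[key];
    const isEmptyArray = Array.isArray(value) && value.length === 0;
    if (value != null && !isEmptyArray) result[key] = value;
  }
  return result;
}

// Serialise for <script type="application/ld+json">.
// Escaping "<" stops a title containing "</script>" from closing the tag early.
function serializeJsonLd(data: Record<string, unknown>): string {
  return JSON.stringify(data).replace(/</g, "\\u003c");
}
```

The escaped output is still valid JSON: a parser reads `\u003c` back as `<`, so consumers see the original headline.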

`seoNoIndex` and `seoHideFromLists`: why we keep both

These two fields look similar. They do different things.

seoNoIndex adds <meta name="robots" content="noindex">. The page stays live and is still linked from internal navigation, but search engines are told not to index it.
seoHideFromLists is consumed by your GROQ queries. The page is excluded from the blog index, category pages, related-posts blocks, and anything else that lists documents.

Common scenarios where you need both:

PPC landing page — seoNoIndex: true so Google doesn't rank it organically, seoHideFromLists: true so it doesn't appear on the main blog index. Only people clicking the ad ever see it.
Archived blog post — seoHideFromLists: true so it's quietly demoted off the listings, but seoNoIndex: false so existing backlinks still hit a valid indexed page.
Author-only draft — both true while the post is being reviewed, both flipped when it's ready to ship.

The GROQ pattern looks like this:

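// packages/sanity/src/query.ts
import groq from "groq";

// Exclude anything an editor has flagged as hidden from internal listings.
export const getAllBlogPostsQuery = groq`
  *[_type == "blog" && (seoHideFromLists != true)] | order(publishedAt desc) {
    ...,
  }
`;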

One line. Editors get a toggle, GROQ does the filtering, the renderer respects the meta. Nothing fancy.

Sitemaps: cheap wins

The sitemap is the boring infrastructure that most teams set up once and never touch again. A few rules worth getting right:

Use _updatedAt as <lastmod>, not _createdAt or the publish date. Sanity gives you _updatedAt for free on every document. Pipe it straight through. This is the freshness signal Googlebot actually cares about in 2026.
Generate the sitemap from the same GROQ query that powers the site, so there's only one source of truth. No drift between what's indexable and what's visible.
Respect seoNoIndex and seoHideFromLists. Filter them out of the sitemap too. A no-indexed page in your sitemap is a mixed signal.
Optional belt-and-braces: add a sitemapPriority field on the document, default 0.7, bump to 0.9 for cornerstone content. Modern Google largely ignores <priority> and <changefreq> (they've said so publicly), but it costs nothing to ship and gives editors a lever if they want one.

The honest take: <lastmod> is the only sitemap field that genuinely moves the needle today. Get it right and the rest is hygiene.
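The sitemap rules above fit in one pure function. This is a sketch under assumptions: `SitemapDoc` and `buildSitemapXml` are hypothetical names, and `slug` is assumed to be a full path starting with a slash. In a Next.js project you'd more likely return these entries from a sitemap route than build the XML by hand.

```typescript
type SitemapDoc = {
  slug: string; // assumed to be a path such as "/blog/my-post"
  _updatedAt: string; // ISO timestamp that Sanity sets on every document
  seoNoIndex?: boolean;
  seoHideFromLists?: boolean;
};

function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

function buildSitemapXml(baseUrl: string, docs: SitemapDoc[]): string {
  const urls = docs
    // Same filters as the listing queries: no-indexed and hidden pages stay out.
    .filter((doc) => doc.seoNoIndex !== true && doc.seoHideFromLists !== true)
    // _updatedAt goes straight through as <lastmod>.
    .map(
      (doc) =>
        `  <url><loc>${escapeXml(baseUrl + doc.slug)}</loc>` +
        `<lastmod>${escapeXml(doc._updatedAt)}</lastmod></url>`,
    );
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urls,
    "</urlset>",
  ].join("\n");
}
```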

Last updated and last published: freshness signals for AEO

Show both dates on every blog post, visibly, at the top:

Published: 12 March 2024
Updated: 18 May 2026

Sanity gives you both for free: _createdAt (or your custom publishedAt) and _updatedAt. Pull them into the byline, render them in the layout, and wire them into the JSON-LD datePublished and dateModified properties. One source of truth, multiple surfaces, no editor work to keep them in sync.
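"One source of truth, multiple surfaces" can be as small as this. A rough sketch, assuming a custom `publishedAt` field alongside Sanity's `_updatedAt`; `DatedPost` and `freshness` are hypothetical names, and the `en-GB` locale is just one choice for the visible byline.

```typescript
type DatedPost = { publishedAt: string; _updatedAt: string };

// Fixed locale and time zone so server and client render the same string.
const bylineFormatter = new Intl.DateTimeFormat("en-GB", {
  day: "numeric",
  month: "long",
  year: "numeric",
  timeZone: "UTC",
});

function freshness(post: DatedPost) {
  const published = new Date(post.publishedAt);
  const updated = new Date(post._updatedAt);
  return {
    // Visible surface: the two lines shown at the top of the post.
    byline: [
      `Published: ${bylineFormatter.format(published)}`,
      `Updated: ${bylineFormatter.format(updated)}`,
    ],
    // Structured surface: the same two values for the BlogPosting JSON-LD.
    jsonLd: {
      datePublished: published.toISOString(),
      dateModified: updated.toISOString(),
    },
  };
}
```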

Why it matters more than it used to:

Traditional SEO. Google's freshness algorithm rewards recently-updated content for query-deserves-freshness topics. Visible dates reinforce what's in the schema.
AEO. ChatGPT, Perplexity, Claude, and Google's AI Overviews all weight recency heavily when picking sources to cite. Given two posts answering the same question, the one that says "updated last month" usually wins the citation.
User trust. A 2019 post with no update date reads as stale. A 2019 post updated last month reads as maintained.

The discipline that goes with this: actually update old posts. Refresh the stats, swap stale screenshots, fix broken links, then let _updatedAt do its job. Don't fake the date. LLMs are getting better at detecting stale content dressed up as fresh, and Google's always been able to.

`details` and `summary`: free AEO/SEO and accessibility wins

The native HTML disclosure widget is one of the most underused tools in the SEO toolkit. No JavaScript, no library, works in every browser back to 2020.

Why it matters for SEO: content inside <details> is fully crawlable and indexable, even when collapsed. Google reads it. So you can hide long FAQ answers, technical specifications, or "read more" sections without losing keyword surface area.

Why it matters for UX: shorter perceived page length, lower bounce on long-form content, and it's a real focusable, keyboard-accessible element out of the box. No aria-expanded plumbing required.

Where to use it in a Sanity-driven site:

FAQ blocks. Pair with FAQ JSON-LD and you get rich results and a clean UI.
Long product spec tables
"What we did" sections on case studies
Footnotes and references on blog posts

The trap: don't hide your H1, primary value proposition, or first paragraph inside a <details>. Google indexes the content, but ranks visible-by-default content higher. Use disclosures for supporting material, not core material.

Implementation note: in your Portable Text serializer or MDX components, expose it as a "collapsible section" block. Editors get a button in the studio; developers get semantic HTML in the output.
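The serializer side of that block might look something like this. It's a simplified sketch: `CollapsibleBlock` is a hypothetical block shape, and the body is treated as plain text where a real build would render nested Portable Text.

```typescript
// Hypothetical shape for a "collapsible section" block.
type CollapsibleBlock = { _type: "collapsible"; summary: string; body: string };

function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Emits the native disclosure widget: no JavaScript, content stays in the HTML.
function renderCollapsible(block: CollapsibleBlock): string {
  return (
    `<details><summary>${escapeHtml(block.summary)}</summary>` +
    `<p>${escapeHtml(block.body)}</p></details>`
  );
}
```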

Content negotiation: serve markdown to agents by default

Vercel published a pattern recently that's worth shipping on every Sanity build: when an agent sends Accept: text/markdown, text/html, */*, return clean markdown instead of the HTML page. Same URL, dramatically smaller payload. Their example: 500KB of HTML compressed to 3KB of markdown. A 99% reduction. We've written about why this matters in more depth in AEO is just SEO with content negotiation.

Why this matters for AEO right now: agents burn tokens on HTML noise. Nav, footers, scripts, classNames, hydration payloads. Hand them markdown and they ingest the actual content. More content per request, cleaner citations, better odds of being picked as a source.

How it works in Next.js:

A rewrite rule in next.config.ts detects the markdown Accept header and routes the request to a dedicated markdown endpoint
The route handler returns the body with Content-Type: text/markdown
Both endpoints pull from the same Sanity GROQ query, so there's no drift between what HTML readers see and what agents consume

Two discovery aids to ship alongside it:

A <link rel="alternate" type="text/markdown" href="/llms.txt" /> tag in your HTML head for agents that don't send the Accept header
A /sitemap.md alongside /sitemap.xml, giving agents a structured markdown index they can navigate

Wire this into every project from day one. It's a one-time setup that future-proofs the site as agent traffic keeps growing. Cheap to add now, expensive to retrofit later when half your traffic is bots that can't see your content.
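The detection step is the only fiddly part, so here's a sketch of it as a pure function. `parseAccept` and `prefersMarkdown` are hypothetical names, and the policy (markdown only when it's listed explicitly and ranked at least as high as HTML) is an assumption about how strict you want to be, not Vercel's implementation.

```typescript
type AcceptEntry = { type: string; q: number; index: number };

// Parse an Accept header into media types with q-values (default q = 1).
function parseAccept(header: string): AcceptEntry[] {
  return header
    .split(",")
    .map((part, index) => {
      const [type, ...params] = part.trim().split(";").map((s) => s.trim());
      const qParam = params.find((p) => p.startsWith("q="));
      const q = qParam ? Number.parseFloat(qParam.slice(2)) : 1;
      return { type: type.toLowerCase(), q: Number.isNaN(q) ? 0 : q, index };
    })
    .filter((entry) => entry.type.length > 0);
}

// True only when text/markdown is listed explicitly and is at least as
// preferred as text/html (ties go to whichever appears first).
function prefersMarkdown(acceptHeader: string | null): boolean {
  if (!acceptHeader) return false;
  const entries = parseAccept(acceptHeader);
  const markdown = entries.find((e) => e.type === "text/markdown");
  if (!markdown || markdown.q === 0) return false;
  const html = entries.find((e) => e.type === "text/html");
  if (!html) return true;
  return markdown.q > html.q || (markdown.q === html.q && markdown.index < html.index);
}
```

Browsers don't list `text/markdown`, so they keep getting HTML. If both representations live at one URL, sending `Vary: Accept` on the response is the usual way to stop a shared cache handing markdown to a browser.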

`llms.txt` and `sitemap.md`: yes, agents read them

There's a tedious crowd online insisting llms.txt is "made up" or that "no AI reads it." They're wrong.

What llms.txt is. A markdown file at the root of your site (/llms.txt) that gives agents a curated, structured index of your most important content. Titles, descriptions, links. Think of it as a hand-picked sitemap optimised for LLM consumption rather than crawler discovery.
What sitemap.md is. The same idea but mirroring your full sitemap.xml. A markdown version of every indexable URL with titles and hierarchy. Agents traverse it instead of parsing XML.
Who reads them. Anthropic's web fetcher consumes them. Perplexity's crawler does. Agents built on the Vercel AI SDK do. ChatGPT's browse tool uses them when you point it at a URL. Every coding agent (Cursor, Claude Code, Codex) pulls them when you point it at docs. The pattern is spreading fast. Stripe, Anthropic, Vercel, Cloudflare, and Sanity itself all ship them.

The cost to ship is one route handler and a GROQ query. The cost of not shipping is invisibility in agent-mediated search at exactly the moment that traffic source is growing fastest. Hell, if it turns out I'm totally wrong, I can delete this section and pretend I never said it.

How to wire it up in Sanity:

/llms.txt — hand-pick your top docs, pages, and guides. Group by section. Render from a dedicated GROQ query so editors can flag content for inclusion with a featuredForLLM boolean.
/sitemap.md — derive from the same query as sitemap.xml. Same filters, different output format.
Both are static markdown responses. Cheap to serve, cacheable forever, easy to regenerate.

Don't overthink it. A 100-line file beats a 0-line file every time. Ship something, iterate.
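For a sense of how little there is to it, here's a sketch of the renderer. It follows the layout in the llms.txt proposal (an H1, a blockquote summary, then H2 sections of links); `LlmsSection` and `renderLlmsTxt` are hypothetical names, and the sections would come from whatever query picks up your flagged documents.

```typescript
type LlmsEntry = { title: string; url: string; description?: string };
type LlmsSection = { heading: string; entries: LlmsEntry[] };

function renderLlmsTxt(siteName: string, summary: string, sections: LlmsSection[]): string {
  const lines: string[] = [`# ${siteName}`, "", `> ${summary}`];
  for (const section of sections) {
    // Skip sections with nothing flagged for inclusion.
    if (section.entries.length === 0) continue;
    lines.push("", `## ${section.heading}`, "");
    for (const entry of section.entries) {
      const suffix = entry.description ? `: ${entry.description}` : "";
      lines.push(`- [${entry.title}](${entry.url})${suffix}`);
    }
  }
  return lines.join("\n") + "\n";
}
```

Swap the curated sections for the full sitemap query and the same function gets you most of the way to `/sitemap.md`.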

Image SEO from the Sanity pipeline

Sanity's image handling is one of the best parts of the stack, and most teams don't use it properly.

Alt text lives on the asset, not the usage. Install sanity-plugin-media. It gives you a proper asset library inside the studio where editors can set alt text, title, tags, and credits on the image itself. Set it once, every usage across every document inherits it. No more wondering whether the editor remembered to write alt text on the card variant. It's hoisted onto the asset.

The compounding win: when you swap a hero image six months later, the new asset already has alt text from the upload. Editors fill it in once, at the source of truth.

A few more rules:

Require alt text at upload. Wire it as a required field in the plugin config so editors can't physically upload without it.
Filename hygiene at upload. hero-blog-seo-best-practices.png, not IMG_3421.png. Google reads filenames as a weak ranking signal. LLMs absolutely do.

The banger combo: Sanity's image pipeline plus next/image. This is the move:

Sanity's CDN handles transforms. ?w=800&fm=webp&q=80 gives you any size, any format, on demand.
next/image consumes those URLs and emits a proper srcset for every breakpoint.
You get responsive images, automatic AVIF/WebP, lazy loading, blur placeholders (from Sanity's LQIP), and zero CLS, all from a single <SanityImage> component.

Compared to hand-rolling <picture> elements with <source> tags for every breakpoint? Not even close. Write the component once, ship it everywhere, the pipeline does the rest.

priority on the LCP image (hero, above the fold). Lazy on everything else. Next handles both with a single prop.
Explicit width and height. Sanity gives you the asset dimensions for free via asset->metadata.dimensions. Pipe them straight into next/image to lock the aspect ratio and prevent CLS.
Preload the LCP image in <head> when it's an above-the-fold hero. Next does this automatically when you set priority.
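To make the pipeline concrete, here's roughly what a custom loader would produce. A sketch only: `sanityImageUrl` and `buildSrcSet` are hypothetical helpers using the `w`, `fm` and `q` parameters mentioned above, and the intrinsic width is assumed to come from `asset->metadata.dimensions`.

```typescript
// Build a Sanity CDN URL with the transform parameters mentioned above.
function sanityImageUrl(assetUrl: string, width: number, format = "webp", quality = 80): string {
  return `${assetUrl}?w=${width}&fm=${format}&q=${quality}`;
}

// One srcset candidate per breakpoint, never wider than the original asset.
function buildSrcSet(assetUrl: string, widths: number[], intrinsicWidth: number): string {
  const usable = widths.filter((w) => w <= intrinsicWidth);
  const finalWidths = usable.length > 0 ? usable : [intrinsicWidth];
  return finalWidths.map((w) => `${sanityImageUrl(assetUrl, w)} ${w}w`).join(", ");
}

// Height for a given render width, from the asset's own dimensions,
// so the aspect ratio is locked before the image loads.
function heightFor(width: number, dimensions: { width: number; height: number }): number {
  return Math.round((width * dimensions.height) / dimensions.width);
}
```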

Accessibility is AEO/SEO

This deserves its own section because most teams treat it as an afterthought. It isn't.

Accessibility work overlaps heavily with what Google measures: semantic markup, descriptive alt text, and stable, usable pages all feed the signals behind Core Web Vitals and ranking. Screen-reader-friendly markup is also LLM-friendly markup, since agents parse semantic HTML the same way assistive tech does. So when you optimise for accessibility, you're optimising for traditional SEO and AEO at the same time.

The non-negotiables on every Sanity build:

Required alt text at the asset level (covered above). Make it physically impossible to upload without one.
Heading hierarchy is sacrosanct. One H1 per page (the document title). H2 for sections. No skipping levels, no decorative H1s in the page builder. Validate it in Portable Text serializers: strip or downgrade any H1s authored inside rich text.
Semantic landmarks. <main>, <nav>, <article>, <aside>, <footer>. Not <div>s with role attributes.
Focus states: visible, high-contrast. Tailwind's default focus-visible:ring is fine. Don't disable it for aesthetics.
Colour contrast: 4.5:1 for body text, 3:1 for large text. Build it into the design tokens.
Skip links at the top of every layout. Keyboard users and screen readers both benefit.
Form labels: every input wired to a <label>. Placeholder text is not a label.
prefers-reduced-motion: respect it in your Motion for React components. Wrap big animated sections in a conditional that disables motion when the OS preference is set.

The payoff: Lighthouse accessibility scores correlate with rankings, AEO tools weight clean semantic structure when extracting answers for citations, and you stop excluding users who can't navigate poorly-built sites. Every box gets ticked at once.
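The heading rule is the easiest of these to enforce in code. A minimal sketch, assuming Portable Text blocks carry their heading level in `style` ("h1" to "h6"); `downgradeH1` and `findSkippedLevels` are hypothetical names.

```typescript
type RichBlock = { _type: string; style?: string };

// Downgrade any h1 authored inside rich text to h2,
// so the document title stays the only H1 on the page.
function downgradeH1(blocks: RichBlock[]): RichBlock[] {
  return blocks.map((block) =>
    block._type === "block" && block.style === "h1" ? { ...block, style: "h2" } : block,
  );
}

// Report places where the outline jumps more than one level deeper (h2 -> h4).
function findSkippedLevels(blocks: RichBlock[]): string[] {
  const issues: string[] = [];
  let previous = 1; // the page title is the implicit H1
  for (const block of blocks) {
    const match = /^h([1-6])$/.exec(block.style ?? "");
    if (block._type !== "block" || !match) continue;
    const level = Number(match[1]);
    if (level > previous + 1) issues.push(`h${previous} -> h${level}`);
    previous = level;
  }
  return issues;
}
```

Run the first at render time and the second in a test or a Studio validation rule, so skipped levels get flagged rather than silently rewritten.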


Internal linking discipline

Internal links are the cheapest ranking lever you have. The Roboto default:

Custom link annotations in Portable Text. Internal links are references, not strings. When a slug changes, the link follows. No 404 link rot, ever.
Programmatic related-posts from shared categories or tags. Don't ask editors to hand-pick three related links on every post; derive them from the data they've already entered.
Breadcrumbs derived from URL structure, with BreadcrumbList JSON-LD attached.
Three internal links per blog post is a sensible baseline. Editors should be linking inside prose, not just on related-content cards. If a post has zero internal links in the body, something's wrong with either the post or the rest of the site.
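Breadcrumbs derived from the URL are a good example of how little code this takes. A sketch, assuming clean path segments; `breadcrumbJsonLd` is a hypothetical helper and the optional `labels` map is an assumption for cases where a slug makes a poor display name.

```typescript
function breadcrumbJsonLd(baseUrl: string, path: string, labels: Record<string, string> = {}) {
  const segments = path.split("/").filter(Boolean);
  return {
    "@context": "https://schema.org",
    "@type": "BreadcrumbList",
    itemListElement: segments.map((segment, i) => {
      // Cumulative path for this crumb, e.g. "/blog" then "/blog/my-post".
      const href = "/" + segments.slice(0, i + 1).join("/");
      return {
        "@type": "ListItem",
        position: i + 1,
        // Prefer an explicit label; otherwise make the slug readable.
        name: labels[href] ?? segment.replace(/-/g, " "),
        item: baseUrl + href,
      };
    }),
  };
}
```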

Canonical URLs

Self-referencing canonical on every page by default
An seoCanonical override field for syndicated content (cross-posted to Medium? Point canonical back to your domain)
Strip query params from canonicals — UTM tags should not dilute your canonical signal

This is a ten-line helper in apps/web/src/lib/seo.ts. Most projects never need to touch the override; it's there for the day a client cross-posts a thought leadership piece.
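For illustration, the three rules above come out something like this. It's a sketch, not the helper from the repo: `canonicalUrl` is a hypothetical name and it assumes `seoCanonical` holds an absolute URL when set.

```typescript
function canonicalUrl(baseUrl: string, path: string, seoCanonical?: string | null): string {
  // The editor override wins when it is actually filled in.
  if (seoCanonical && seoCanonical.trim().length > 0) return seoCanonical.trim();
  // Otherwise self-reference, with query params (UTM tags and the like) and fragments stripped.
  const url = new URL(path, baseUrl);
  url.search = "";
  url.hash = "";
  return url.toString();
}
```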

Redirects as a Sanity document type

When editors change a slug, never leave the old URL dead. The Roboto stack handles this with a redirect document type. Old slug, new slug, automatic.

The redirect schema is surfaced in the studio sidebar, so editors own it without dev intervention
The list of redirects is queried at build time and piped into next.config.ts (or middleware for high-volume sites)
For projects with hundreds of historical redirects, batch them. Don't expand the middleware on every request.

The wider point: SEO equity is fragile. Years of backlinks evaporate the moment a URL 404s. A redirect document type costs about an hour to build and protects the site indefinitely.
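Here's a sketch of the build-time step that turns redirect documents into the `{ source, destination, permanent }` objects Next.js expects from `redirects()`. `RedirectDoc` and `toNextRedirects` are hypothetical names, and flattening chains is an extra I've assumed is worth having, since editors rename slugs more than once.

```typescript
type RedirectDoc = { source: string; destination: string; permanent?: boolean };
type NextRedirect = { source: string; destination: string; permanent: boolean };

function toNextRedirects(docs: RedirectDoc[]): NextRedirect[] {
  const bySource = new Map<string, RedirectDoc>();
  for (const doc of docs) {
    // Ignore self-redirects; a later document for the same source wins.
    if (doc.source !== doc.destination) bySource.set(doc.source, doc);
  }

  const result: NextRedirect[] = [];
  bySource.forEach((doc, source) => {
    // Follow chains (a -> b -> c) so every old URL redirects in one hop.
    const seen = new Set<string>([source]);
    let destination = doc.destination;
    let looped = false;
    while (bySource.has(destination)) {
      if (seen.has(destination)) {
        looped = true; // a -> b -> a: drop it rather than ship a redirect loop
        break;
      }
      seen.add(destination);
      destination = bySource.get(destination)!.destination;
    }
    if (!looped) result.push({ source, destination, permanent: doc.permanent ?? true });
  });
  return result;
}
```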

Author entities and E-E-A-T

Every author gets a proper Author document type with:

A bio (used in the byline and on the author archive page)
A photo (used in the byline and on the author card)
Social profiles (LinkedIn, Twitter, GitHub, personal site)
Position and credentials

Why bother:

Author bylines render a Person JSON-LD that Google reads for expertise signals. E-E-A-T is real, and identified authors carry more weight than anonymous content.
Author archive pages (/author/jono-alford) collect every post, building topical authority and giving Google a clear signal about who writes about what.
LLMs cite authors when they can identify them. Anonymous content gets cited less, especially for opinion-led topics where attribution matters.

This is the same set-and-forget principle from JSON-LD. Editors fill in the author document once. Every post by that author inherits the bio, the photo, the schema, the social links.

Schema beyond `BlogPosting`

BlogPosting is the obvious one. Don't stop there. On every Sanity build we tend to wire up:

Organization once, in the root layout
BreadcrumbList on every nested page
FAQPage when a block has FAQs (this post has one, look at the source)
HowTo for tutorials with clear steps
Product or Service on service pages
Person for author bylines

All derived from existing Sanity content. Editors never touch schema markup. It's just generated.
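FAQPage is a fair example of "just generated". A minimal sketch, assuming the FAQ block stores plain-text questions and answers; `Faq` and `faqJsonLd` are hypothetical names.

```typescript
type Faq = { question: string; answer: string };

// Returns null when there is nothing valid to emit, so the caller can skip the script tag.
function faqJsonLd(faqs: Faq[]) {
  const valid = faqs.filter(
    (faq) => faq.question.trim().length > 0 && faq.answer.trim().length > 0,
  );
  if (valid.length === 0) return null;
  return {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    mainEntity: valid.map((faq) => ({
      "@type": "Question",
      name: faq.question,
      acceptedAnswer: { "@type": "Answer", text: faq.answer },
    })),
  };
}
```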

Robots and the boring infrastructure

The unglamorous stuff that bites you when it breaks:

robots.txt generated from Sanity site config, not hardcoded. That way you can toggle staging vs prod indexing without a deploy: flip a boolean in the studio, redeploy if needed, done.
404 pages with real internal links (popular posts, search, primary nav). Not a dead end. Genuinely useful 404 pages reduce bounce and recover SEO equity from broken inbound links.
Never serve 200 OK on "not found" content. Use Next's notFound() properly. Soft 404s are one of the worst ranking signals you can send and Google is increasingly aggressive about flagging them.
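The robots.txt toggle is tiny once the flag exists. A sketch, assuming a site config document with a boolean on it; `SiteConfig`, `allowIndexing` and `renderRobotsTxt` are hypothetical names.

```typescript
// Hypothetical site config pulled from Sanity.
type SiteConfig = { allowIndexing: boolean; baseUrl: string };

function renderRobotsTxt(config: SiteConfig): string {
  if (!config.allowIndexing) {
    // Staging, previews, anything that should stay out of the index.
    return ["User-agent: *", "Disallow: /", ""].join("\n");
  }
  return ["User-agent: *", "Allow: /", "", `Sitemap: ${config.baseUrl}/sitemap.xml`, ""].join("\n");
}
```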

Open Graph image fallback generator

Editors are inconsistent about uploading social images. The fix: give them a default the moment the document is created, and generate one at render time when the slot is empty.

Default at document creation. Sanity's initialValue on the schema lets you pre-populate the ogImage field with a templated asset when a new document is created. Editors open a fresh blog post and there's already a usable OG image sitting in the slot. Most won't touch it; the few who care will override it. Either way you've removed the empty state.

// apps/studio/schemaTypes/documents/blog.ts (simplified)
defineField({
  name: "ogImage",
  title: "OG image",
  type: "image",
  description: "Social sharing image. Defaults to a branded template; override when you want something custom.",
  initialValue: {
    _type: "image",
    asset: {
      _type: "reference",
      _ref: "image-default-og-blog-1200x630-png", // pre-uploaded brand template
    },
  },
}),

Render-time fallback. Use next/og to auto-generate a branded OG image from the document title when ogImage is still empty (or when you want a per-post variant). Brand-consistent template, always fresh, zero editor burden. Cache it aggressively: the URL is deterministic from the slug, so it only needs to regenerate when the title or the template changes.

The same set-and-forget principle, applied to social previews. Editors who care can override; editors who don't get something usable by default, both at creation time and at render time.

AEO/SEO sanitisation: catch the drift before it eats you

Here's what happens on every codebase, including ones built by people who know better. You ship fast. You vibe-code a few sections. You let a junior developer wire up a new page builder block. You copy-paste a layout. Six months later your site is quietly haemorrhaging SEO equity. Heading hierarchy is broken on three templates. Half your pages have duplicate H1s. The new "featured grid" component renders product titles as <div> because someone forgot. Meta descriptions are getting truncated. Four hundred pages share the same og:image.

You won't catch this by eyeballing the site. You need a feedback loop.

Run a crawl weekly. Screaming Frog or Sitebulb against prod. Export the issues, triage the top ten. Every Roboto client site has one of these running, and it surfaces things humans miss every single time: orphan pages, broken internal links, missing alt text, oversized HTML, redirect chains, duplicate titles. The licence pays for itself in one engagement.

Wire it into Linear. When the crawl finds issues, dump them into tickets with an seo-debt label. Treat them like bugs, not "nice to have."

CI-level guards for the cheap ones:

Lint rule: no <h1> inside Portable Text serializers (only the page title gets H1)
Test: every page returns a <title>, <meta name="description">, <link rel="canonical">, and an og:image
Test: no console.log in production, no stray noindex in prod headers (you'd be amazed how often this ships)
Test: sitemap.xml returns 200 and contains every published document

Lighthouse CI on every PR. Fail the build if accessibility drops below 95 or SEO drops below 100. Don't let regressions land. Catch them at the PR, not after deploy.
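The cheap guards can start as one function you run against rendered HTML in a test. This is a deliberately crude sketch: `auditHead` is a hypothetical name and the regex checks assume attribute order as written (name before content), so a proper HTML parser would be the sturdier choice for anything beyond a smoke test.

```typescript
// Returns a list of problems; an empty list means the page passes.
function auditHead(html: string): string[] {
  const issues: string[] = [];
  const has = (pattern: RegExp) => pattern.test(html);

  if (!has(/<title>[^<]+<\/title>/i)) issues.push("missing <title>");
  if (!has(/<meta[^>]+name=["']description["'][^>]*>/i)) issues.push("missing meta description");
  if (!has(/<link[^>]+rel=["']canonical["'][^>]*>/i)) issues.push("missing canonical");
  if (!has(/<meta[^>]+property=["']og:image["'][^>]*>/i)) issues.push("missing og:image");

  // A robots meta containing noindex should never reach production by accident.
  if (has(/<meta[^>]+name=["']robots["'][^>]+content=["'][^"']*noindex/i)) {
    issues.push("stray noindex");
  }

  // Exactly one H1: the document title.
  const h1Count = (html.match(/<h1[\s>]/gi) ?? []).length;
  if (h1Count !== 1) issues.push(`expected exactly one <h1>, found ${h1Count}`);

  return issues;
}
```

Pages that set seoNoIndex on purpose need an allowlist, otherwise the noindex check fires on every PPC landing page.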

Vibe coding is fine. Vibe coding without guardrails is what kills you.

Shipping fast is genuinely good. Half the SEO patterns in this post exist because we vibed our way through a problem on a client project and codified what worked. The fix isn't to slow down, it's to set up validation that runs at the speed you're shipping.

The pattern we run on most projects now:

Wire up an agent that audits your site continuously. Vercel AI SDK makes this trivially cheap to build — a scheduled function, a couple of tool calls (fetch the page, parse the head, check the schema), and a structured output that flags regressions. We've got one running against robotostudio.com on a cron, and it catches things humans miss between deploys.
Validation at the schema level. If a page builder block requires a heading, a Zod schema or a Sanity validation rule should reject it without one. Don't let bad data ship and then fix it downstream.
Validation at the PR level. Lighthouse CI, a Screaming Frog programmatic crawl, or an AI SDK script that diffs the rendered HTML against last week's. Anything that fails the build before merge.
Validation at the runtime level. Log SEO-critical fields to PostHog or Sentry. If a page renders without a <title> or with a duplicate H1, you want to know about it the moment it happens, not next quarter.

The honest take: every shortcut you take to ship (the inline component, the hardcoded string, the "I'll fix the schema later") compounds. SEO debt looks invisible until you check Search Console and realise you've been bleeding impressions for two months. AI-driven validation is the cheapest way to keep moving fast without paying for it later.

What "good" looks like: zero critical Screaming Frog issues, Lighthouse SEO 100 on every template, an AI audit agent that catches drift before you do. If your manual audit surfaces surprises, your validation is broken. Fix the validation first, not just the surprises.

Where to copy this from

The whole pattern is open source. Go to turbo-start-sanity on GitHub. Open these three files:

packages/sanity/src/query.ts — the GROQ queries with SEO filtering baked in
apps/studio/utils/seo-fields.ts — the reusable SEO fields object
apps/web/src/lib/seo.ts — the metadata generator and JSON-LD helpers

Copy them, adapt them, ship them. That's the entire AEO/SEO baseline for a Sanity project. Everything else in this post (content negotiation, llms.txt, accessibility, the sanitisation loop) is incremental. The three files above are the foundation.

If you're starting a new Sanity build, start there. If you've inherited an existing build and the AEO/SEO feels patchy, those three files are your refactor target.

Stuck on AEO/SEO for your Sanity build?

We help teams audit and rebuild their Sanity AEO/SEO so they stop bleeding impressions and start showing up in AI answers. Drop us a line.


