Publishing & discoverability

Your site is live. These finishing files make it look right when shared, help search engines and AI assistants find it, and give it a proper icon in the browser tab. Small files, big difference.

↳ Hands-on in Academy: Edit & publish — the daily rhythm — the change → push → live loop these files finish off.

Favicon — the tab icon

The little icon in the browser tab and bookmarks. Without it, the browser shows a generic blank-page symbol, which looks unfinished. You need a few sizes; the modern minimum is:

<link rel="icon" href="/favicon.svg" type="image/svg+xml">
<link rel="icon" href="/favicon.ico" sizes="any">
<link rel="apple-touch-icon" href="/apple-touch-icon.png">

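Give the agent your logo (SVG is best) and ask it to generate the favicon set and the tags. Whenever you change the icon, add a version suffix such as ?v=2 to each icon URL (for example /favicon.svg?v=2) and bump the number on every later change, so browsers drop the old cached copy.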
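
A minimal sketch of the cache-busting step, assuming the three icon paths from the tags above. The helper name `favicon_tags` is hypothetical, and the version number is simply whatever you bump on each icon change.

```python
from html import escape

# Icon files taken from the tags above: (rel, path, extra attributes).
ICONS = [
    ("icon", "/favicon.svg", {"type": "image/svg+xml"}),
    ("icon", "/favicon.ico", {"sizes": "any"}),
    ("apple-touch-icon", "/apple-touch-icon.png", {}),
]


def favicon_tags(version: int) -> str:
    """Return the <link> tags with ?v=<version> appended to every icon URL."""
    lines = []
    for rel, path, extra in ICONS:
        attrs = {"rel": rel, "href": f"{path}?v={version}", **extra}
        rendered = " ".join(
            f'{key}="{escape(value, quote=True)}"' for key, value in attrs.items()
        )
        lines.append(f"<link {rendered}>")
    return "\n".join(lines)


print(favicon_tags(2))
```

Paste the printed tags into the page head; the icon files themselves keep their names, only the query string changes.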

Open Graph — the share preview

When someone pastes your link into a chat or social post, these tags decide the title, description and image that "unfurl". Without them the link looks bare.

<meta property="og:title" content="Your page title">
<meta property="og:description" content="One clear sentence.">
<meta property="og:image" content="https://yoursite/og-cover.png">
<meta property="og:url" content="https://yoursite/">
<meta name="twitter:card" content="summary_large_image">

The image should be about 1200×630 px. Test how it looks by pasting the link into a private chat with yourself before you share it widely.
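
If the tags are generated rather than typed, the values need HTML escaping so that a quote or ampersand in a title does not break the markup. A small sketch of that, with a hypothetical helper name `open_graph_tags` and the same five tags as above:

```python
from html import escape


def open_graph_tags(title: str, description: str, image_url: str, page_url: str) -> str:
    """Build the share-preview tags, escaping values so quotes or & cannot break the HTML."""
    properties = [
        ("property", "og:title", title),
        ("property", "og:description", description),
        ("property", "og:image", image_url),
        ("property", "og:url", page_url),
        ("name", "twitter:card", "summary_large_image"),
    ]
    return "\n".join(
        f'<meta {attr}="{key}" content="{escape(value, quote=True)}">'
        for attr, key, value in properties
    )


print(
    open_graph_tags(
        "Your page title",
        "One clear sentence.",
        "https://yoursite/og-cover.png",
        "https://yoursite/",
    )
)
```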

robots.txt — who may crawl

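A plain text file at the site root that tells crawlers what they may visit. Today it is also where you welcome (or refuse) AI crawlers. The * group below already allows everyone; naming the AI crawlers as well makes the intent explicit and gives you one place to switch any of them to Disallow later: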

User-agent: *
Allow: /

# AI assistants — allow them to find you
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

Sitemap: https://yoursite/sitemap.xml
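
One way to sanity-check the file before publishing is to parse it with Python's standard-library `urllib.robotparser` and ask whether each crawler may fetch the home page. This is only a sketch: the helper name `crawler_access` is hypothetical, and real crawlers may interpret edge cases differently from Python's parser.

```python
from urllib.robotparser import RobotFileParser


def crawler_access(robots_txt: str, agents: list[str], url: str) -> dict[str, bool]:
    """Parse robots.txt text and report, per user agent, whether url may be fetched."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# The same rules as the example file above.
ROBOTS_TXT = """\
User-agent: *
Allow: /

User-agent: GPTBot
Allow: /
User-agent: ClaudeBot
Allow: /
User-agent: PerplexityBot
Allow: /

Sitemap: https://yoursite/sitemap.xml
"""

report = crawler_access(
    ROBOTS_TXT, ["GPTBot", "ClaudeBot", "PerplexityBot"], "https://yoursite/"
)
for agent, allowed in report.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Changing one group to `Disallow: /` and re-running shows that crawler as blocked while the others stay allowed.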

llms.txt — a map for AI

A newer convention: a simple Markdown file at /llms.txt that gives AI models a clean, curated overview of your site (what it is, the key links), which helps them describe and cite you correctly. It is still a proposal rather than a standard, so not every AI tool reads it yet. A fuller /llms-full.txt can include the actual content. Think of it as a readme written for machines.

# Your Site
> One-line description.

## Key pages
- [Home](https://yoursite/): what it is
- [Docs](https://yoursite/docs/): guides
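
Because the file is plain Markdown, it is easy to regenerate from a list of pages whenever the site changes. A minimal sketch, assuming the layout shown above (title, one-line summary, one "Key pages" list); the helper name `build_llms_txt` is hypothetical.

```python
def build_llms_txt(site_name: str, summary: str, pages: list[tuple[str, str, str]]) -> str:
    """Render an llms.txt body: H1 title, blockquote summary, then a list of key links."""
    lines = [f"# {site_name}", f"> {summary}", "", "## Key pages"]
    for title, url, note in pages:
        lines.append(f"- [{title}]({url}): {note}")
    return "\n".join(lines) + "\n"


print(
    build_llms_txt(
        "Your Site",
        "One-line description.",
        [
            ("Home", "https://yoursite/", "what it is"),
            ("Docs", "https://yoursite/docs/", "guides"),
        ],
    )
)
```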

sitemap.xml — the list of pages

A machine-readable list of every page, so search engines find them all. The agent can generate and keep it updated; you reference it from robots.txt (above).

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://yoursite/</loc></url>
  <url><loc>https://yoursite/about/</loc></url>
</urlset>
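
Generating the file from a list of URLs could look roughly like this. It is a sketch using Python's standard `xml.etree.ElementTree`; the helper name `build_sitemap` is hypothetical, and it emits only the `loc` element shown above.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls: list[str]) -> str:
    """Return sitemap.xml text with one <url><loc> entry per absolute URL."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # Special characters such as & are escaped when the tree is serialized.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body + "\n"


print(build_sitemap(["https://yoursite/", "https://yoursite/about/"]))
```

The output is a single line of XML after the declaration, which crawlers read the same way as the indented version above.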

Canonical & hreflang

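Canonical tells search engines the one true URL for a page, so that copies reachable several ways (with or without a trailing slash, with tracking parameters) are treated as one page instead of competing duplicates. hreflang links the language versions of a page so the right one is shown. Each language version should list the full set of alternates, including itself: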

<link rel="canonical" href="https://yoursite/page/">
<link rel="alternate" hreflang="en" href="https://yoursite/page/">
<link rel="alternate" hreflang="cs" href="https://yoursite/cs/page/">
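
The same tags can be produced for every language version from one mapping of language code to URL. A sketch under one assumption that is not stated above: each language version is treated as canonical for itself. The helper name `alternate_tags` is hypothetical.

```python
from html import escape


def alternate_tags(current_lang: str, versions: dict[str, str]) -> str:
    """Return the canonical tag for current_lang plus one hreflang tag per language version."""
    # Assumption: the page in current_lang is its own canonical URL.
    canonical = versions[current_lang]
    lines = [f'<link rel="canonical" href="{escape(canonical, quote=True)}">']
    for lang, url in versions.items():
        lines.append(
            f'<link rel="alternate" hreflang="{escape(lang, quote=True)}" '
            f'href="{escape(url, quote=True)}">'
        )
    return "\n".join(lines)


VERSIONS = {"en": "https://yoursite/page/", "cs": "https://yoursite/cs/page/"}
print(alternate_tags("cs", VERSIONS))
```

Calling it with "en" reproduces the three tags shown above; calling it with "cs" gives the matching set for the Czech page, with only the canonical line changed.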

All at once

You don't hand-write these. Tell your agent: "set up SEO and AI discoverability for this site — favicon, robots.txt with an AI-crawler allowlist, llms.txt, sitemap.xml, Open Graph and canonical/hreflang." That's exactly what our web-launch recipe does in one pass — and it's safe to re-run to fill gaps.

Rule of thumb: do this once when the site goes live, then again whenever you add pages or change the share image.
Putting a site live for the first time? Hands-on: Zero to a live website → · reference: Web on Cloudflare — cheat-sheet