# Is your website ready for AI? Free LLM discoverability checker - agentmarkup

> Check whether your website has llms.txt, JSON-LD structured data, AI crawler robots.txt rules, markdown mirrors, and a sitemap. Free audit tool for AI discoverability - works for e-commerce and brand websites.

Source: https://agentmarkup.dev/blog/website-checker/

By [Sebastian Cochinescu](/authors/sebastian-cochinescu/) · March 20, 2026 · 8 min read

# Is your website ready for AI? Use the free agentmarkup checker to find out

Most websites are invisible to AI systems. Not because the content is bad, but because the metadata is missing, broken, or incomplete. The [agentmarkup checker](/checker/) audits your website in seconds and tells you exactly what to fix.

## Why you need to check your website

When ChatGPT, Claude, or Perplexity answers a question about your industry, does your website show up? In most cases, the answer is no. Not because your content is irrelevant, but because AI systems cannot understand your site.

The difference between a website that gets cited by AI and one that does not often comes down to a few missing files and metadata tags. A robots.txt that accidentally blocks AI crawlers. Missing JSON-LD structured data. No llms.txt file. No readable fallback when the raw HTML is thin or heavily client-rendered.

These are not complex problems. They are configuration gaps that take minutes to fix once you know they exist. The hard part is knowing they exist.

## What the checker audits

Enter any public URL at [agentmarkup.dev/checker](/checker/) and the tool fetches your homepage, llms.txt, robots.txt, sitemap, markdown mirrors, and a sample internal page. It runs 20+ deterministic checks and categorizes each as a pass, warning, or error.

### Homepage structure

- Is your homepage publicly reachable over HTTPS?
- Does it have a canonical URL tag?
- Is there a meta description?
- Is the HTML lang attribute set?
- Does it have an H1 heading?
- Is the page free of an X-Robots-Tag header that blocks indexing?
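
The post does not show how the checker implements these checks. As a rough sketch of what the HTML-level ones could look like, the hypothetical `inspectHomepage` function below scans a raw HTML string for a canonical link, a meta description, a `lang` attribute, and an H1. The regex matching is an illustrative shortcut, and the HTTPS and X-Robots-Tag checks are left out because they depend on the HTTP response rather than the markup.

```typescript
// Hypothetical sketch: not the agentmarkup checker's actual code.
interface HomepageSignals {
  hasCanonical: boolean;
  hasMetaDescription: boolean;
  hasLang: boolean;
  hasH1: boolean;
}

function inspectHomepage(html: string): HomepageSignals {
  return {
    // <link rel="canonical" href="...">
    hasCanonical: /<link\b[^>]*\brel=["']canonical["'][^>]*>/i.test(html),
    // <meta name="description" content="...">
    hasMetaDescription: /<meta\b[^>]*\bname=["']description["'][^>]*>/i.test(html),
    // <html lang="en">
    hasLang: /<html\b[^>]*\blang=["'][^"']+["']/i.test(html),
    // any <h1>...</h1> element
    hasH1: /<h1\b[^>]*>[\s\S]*?<\/h1>/i.test(html),
  };
}

// Made-up sample page used only for illustration.
const sampleHomepage = `<!doctype html>
<html lang="en">
<head>
  <link rel="canonical" href="https://example.com/">
  <meta name="description" content="Example store selling running shoes.">
</head>
<body><h1>Example Store</h1></body>
</html>`;

console.log(inspectHomepage(sampleHomepage));
```

A production audit would more likely use a real HTML parser, since regexes miss edge cases such as unquoted attributes.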

### JSON-LD structured data

- Are there any JSON-LD blocks in the page?
- Is the JSON-LD syntactically valid?
- Is there a WebSite schema identifying your site?
- Is there an Organization schema with your brand name and logo?
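
To make these checks concrete, here is a minimal sketch (again hypothetical, not the checker's own code) that pulls `application/ld+json` script blocks out of an HTML string, counts the ones that fail to parse, and collects the `@type` values it finds. The function names and the report shape are assumptions made for illustration.

```typescript
// Hypothetical sketch: not the agentmarkup checker's actual code.
interface JsonLdReport {
  blockCount: number;   // number of <script type="application/ld+json"> blocks
  invalidCount: number; // blocks whose body is not valid JSON
  types: string[];      // @type values from top-level nodes, arrays, and @graph
}

function inspectJsonLd(html: string): JsonLdReport {
  const scriptPattern =
    /<script\b[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const report: JsonLdReport = { blockCount: 0, invalidCount: 0, types: [] };
  let match: RegExpExecArray | null;
  while ((match = scriptPattern.exec(html)) !== null) {
    report.blockCount += 1;
    try {
      const parsed: unknown = JSON.parse(match[1] ?? "");
      collectTypes(parsed, report.types);
    } catch {
      report.invalidCount += 1; // syntactically invalid JSON-LD
    }
  }
  return report;
}

function collectTypes(node: unknown, found: string[]): void {
  if (Array.isArray(node)) {
    node.forEach((item) => collectTypes(item, found));
    return;
  }
  if (typeof node !== "object" || node === null) return;
  const record = node as Record<string, unknown>;
  const typeValue = record["@type"];
  if (typeof typeValue === "string") found.push(typeValue);
  if (Array.isArray(typeValue)) {
    typeValue.forEach((t) => {
      if (typeof t === "string") found.push(t);
    });
  }
  collectTypes(record["@graph"], found);
}

// Made-up sample markup: two valid blocks and one broken block.
const samplePage = `
<script type="application/ld+json">
{"@context":"https://schema.org","@type":"WebSite","name":"Example","url":"https://example.com/"}
</script>
<script type="application/ld+json">
{"@context":"https://schema.org","@type":"Organization","name":"Example","logo":"https://example.com/logo.png"}
</script>
<script type="application/ld+json">{ not valid json }</script>`;

const jsonLdReport = inspectJsonLd(samplePage);
console.log(jsonLdReport);
```

With a report like this, "is there a WebSite schema" and "is there an Organization schema" reduce to membership tests on `types`.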

### llms.txt

- Does `/llms.txt` exist and is it accessible?
- Is the file structurally valid?
- Does your homepage advertise it via a link tag?
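
The post does not define what "structurally valid" means for the checker. As one plausible reading, the sketch below follows the public llms.txt proposal, where the file opens with an H1 title, may carry a blockquote summary, and groups markdown links under H2 sections. The `outlineLlmsTxt` name and the returned shape are hypothetical.

```typescript
// Hypothetical sketch based on the public llms.txt proposal; not the checker's actual rules.
interface LlmsTxtOutline {
  title: string | null;   // text of the H1 if the file starts with one
  summary: string | null; // first blockquote line, if any
  sections: string[];     // H2 section names
  linkCount: number;      // list items of the form "- [name](url)"
}

function outlineLlmsTxt(text: string): LlmsTxtOutline {
  const lines = text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  const outline: LlmsTxtOutline = { title: null, summary: null, sections: [], linkCount: 0 };

  const first = lines[0] ?? "";
  if (/^# \S/.test(first)) outline.title = first.slice(2).trim();

  for (const line of lines) {
    if (outline.summary === null && line.startsWith("> ")) {
      outline.summary = line.slice(2).trim();
    } else if (/^## \S/.test(line)) {
      outline.sections.push(line.slice(3).trim());
    } else if (/^- \[[^\]]+\]\([^)\s]+\)/.test(line)) {
      outline.linkCount += 1;
    }
  }
  return outline;
}

// Made-up sample file used only for illustration.
const sampleLlmsTxt = `# Example Store

> Example Store sells running shoes online.

## Docs

- [Shipping policy](https://example.com/shipping.md): delivery times and costs
- [Returns](https://example.com/returns.md)
`;

console.log(outlineLlmsTxt(sampleLlmsTxt));
```

A file whose outline has no `title` would be a reasonable candidate for a structural error, since the H1 is the one element the proposal requires.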

### Markdown mirrors

- If the raw HTML is thin, does your homepage have a markdown alternate link?
- Is the markdown file accessible and substantial (not empty or raw HTML)?
- If a linked page also serves thin HTML, is there a useful markdown fallback there too?
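
As an illustration of the mirror checks, the sketch below assumes the alternate link takes the form `<link rel="alternate" type="text/markdown" href="...">` and treats a mirror as substantial when it is at least 200 characters long and does not start like an HTML document. Both the link form and the threshold are assumptions, not documented checker behavior.

```typescript
// Hypothetical sketch; the link form and the 200-character threshold are assumptions.
function findMarkdownAlternate(html: string): string | null {
  const linkPattern = /<link\b[^>]*>/gi;
  let match: RegExpExecArray | null;
  while ((match = linkPattern.exec(html)) !== null) {
    const tag = match[0];
    const isAlternate = /\brel=["']alternate["']/i.test(tag);
    const isMarkdown = /\btype=["']text\/markdown["']/i.test(tag);
    const href = /\bhref=["']([^"']+)["']/i.exec(tag);
    if (isAlternate && isMarkdown && href !== null) return href[1] ?? null;
  }
  return null; // no markdown alternate advertised
}

function looksLikeUsefulMarkdown(body: string, minLength = 200): boolean {
  const trimmed = body.trim();
  if (trimmed.length < minLength) return false;      // empty or near-empty file
  return !/^<!doctype html|^<html\b/i.test(trimmed); // raw HTML served by mistake
}

// Made-up client-rendered page with an empty root element.
const thinPage =
  '<html><head><link rel="alternate" type="text/markdown" href="/index.md"></head>' +
  '<body><div id="root"></div></body></html>';

console.log(findMarkdownAlternate(thinPage));
```

The second function would run on the body fetched from the returned `href`, which is why the two steps are kept separate.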

### Robots.txt and AI crawlers

- Does robots.txt exist?
- Are there explicit rules for GPTBot, ClaudeBot, PerplexityBot, Google-Extended, and CCBot?
- Does robots.txt reference your sitemap?
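
A minimal sketch of the robots.txt checks, assuming "explicit rules" means the crawler is named in its own `User-agent` line rather than covered only by `*`. The `summarizeRobots` function and its return shape are hypothetical; the crawler names come from the list above.

```typescript
// Hypothetical sketch: not the agentmarkup checker's actual code.
const AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended", "CCBot"];

interface RobotsSummary {
  explicitCrawlers: string[]; // AI crawlers named in a User-agent line
  missingCrawlers: string[];  // AI crawlers covered only by "*" or not at all
  sitemaps: string[];         // URLs from Sitemap: lines
}

function summarizeRobots(robotsTxt: string): RobotsSummary {
  const namedAgents: string[] = [];
  const sitemaps: string[] = [];
  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, "").trim(); // strip comments
    const separator = line.indexOf(":");
    if (separator === -1) continue;
    const field = line.slice(0, separator).trim().toLowerCase();
    const value = line.slice(separator + 1).trim();
    if (field === "user-agent") namedAgents.push(value.toLowerCase());
    if (field === "sitemap") sitemaps.push(value);
  }
  const explicitCrawlers = AI_CRAWLERS.filter((bot) => namedAgents.includes(bot.toLowerCase()));
  const missingCrawlers = AI_CRAWLERS.filter((bot) => !explicitCrawlers.includes(bot));
  return { explicitCrawlers, missingCrawlers, sitemaps };
}

// Made-up robots.txt used only for illustration.
const sampleRobots = `User-agent: *
Disallow: /cart/

User-agent: GPTBot
User-agent: ClaudeBot
Allow: /

Sitemap: https://example.com/sitemap.xml`;

console.log(summarizeRobots(sampleRobots));
```

This sketch only reports which crawlers are named. Deciding whether a named crawler is allowed or blocked would also require reading the `Allow` and `Disallow` lines in its group.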

### Sitemap

- Is a sitemap available at `/sitemap.xml` or referenced in robots.txt?
- Is the sitemap valid XML?

## How to read the results

Results use three levels. There are no scores, no percentages, no arbitrary numbers. Just deterministic checks with clear outcomes:

- **Error** - something is blocking AI access, such as an unreachable homepage, a noindex header, or invalid JSON-LD.
- **Warning** - something important is missing, such as an Organization schema, explicit AI crawler rules, or adequate markdown coverage for thin HTML.
- **Pass** - a best practice is met, such as a reachable homepage, a canonical URL, or a valid llms.txt.

Each finding includes a title, a detail explaining why it matters, and an action step telling you what to do. Where relevant, findings link to the documentation guides on this site.
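
If you want to process findings like these in your own scripts, one way to model them is sketched below. The field names are assumptions derived from the description above, not a documented output format, and the sample detail and action strings are invented for illustration.

```typescript
// Hypothetical shape; field names are assumptions, not a documented format.
type FindingLevel = "error" | "warning" | "pass";

interface Finding {
  level: FindingLevel;
  title: string;
  detail: string; // why it matters
  action: string; // what to do next
}

function groupByLevel(findings: Finding[]): Record<FindingLevel, Finding[]> {
  const grouped: Record<FindingLevel, Finding[]> = { error: [], warning: [], pass: [] };
  for (const finding of findings) grouped[finding.level].push(finding);
  return grouped;
}

// Invented sample findings used only for illustration.
const sampleFindings: Finding[] = [
  {
    level: "error",
    title: "Invalid JSON-LD",
    detail: "A structured data block could not be parsed.",
    action: "Fix the JSON syntax in the script block.",
  },
  {
    level: "warning",
    title: "No Organization schema",
    detail: "AI systems cannot tie the site to a brand name and logo.",
    action: "Add an Organization JSON-LD block.",
  },
  {
    level: "pass",
    title: "Canonical URL present",
    detail: "The homepage declares a canonical URL.",
    action: "No action needed.",
  },
];

const groupedFindings = groupByLevel(sampleFindings);
console.log(groupedFindings.error.map((finding) => finding.title));
```

Grouping by level instead of computing a score keeps the output aligned with the three-level model: fix errors first, then warnings.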

## For e-commerce websites

E-commerce sites have the most to gain from AI discoverability. When someone asks an AI "best running shoes under $150," the AI recommends products from stores it can understand. The checker tells you whether your store is one of them.

Key checks for e-commerce:

- **Product JSON-LD** - does your product page have structured data with name, price, and availability? Without it, AI cannot recommend your products accurately.
- **Organization schema** - does AI know your brand name and logo? This is how AI associates products with your store in its answers.
- **AI crawler access** - is your robots.txt accidentally blocking GPTBot or PerplexityBot? Many e-commerce platforms ship with broad disallow rules that block AI crawlers along with everything else.
- **Markdown mirrors** - can AI agents read clean content from your product pages, or do they get an empty JavaScript shell?

Run your store through the checker. If you see warnings for missing Organization schema or no AI crawler rules, those are the first things to fix.
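
For reference, the Product JSON-LD check looks for markup along the lines of the sketch below. It uses the schema.org `Product` and `Offer` types. The product, price, and helper name `toJsonLdScript` are made up for illustration, and a real store would generate the values from its catalog.

```typescript
// Illustrative values only; the product and price are made up.
const productJsonLd = {
  "@context": "https://schema.org",
  "@type": "Product",
  name: "Example Trail Runner",
  offers: {
    "@type": "Offer",
    price: "129.00",
    priceCurrency: "USD",
    availability: "https://schema.org/InStock",
  },
};

// Hypothetical helper: serializes the data into a JSON-LD script element.
// "<" is escaped so the JSON cannot close the script element early.
function toJsonLdScript(data: object): string {
  const json = JSON.stringify(data).replace(/</g, "\\u003c");
  return `<script type="application/ld+json">${json}</script>`;
}

console.log(toJsonLdScript(productJsonLd));
```

The resulting element belongs in the HTML the server sends for the product page, so a crawler can read it without executing JavaScript.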

## For brand and content websites

If your business depends on being known - a consultancy, an agency, a SaaS product - the checker shows whether AI can accurately describe what you do.

- **Organization schema** tells AI your exact name, description, and social profiles. Without it, AI might confuse you with another company.
- **llms.txt** gives AI a structured overview of your services and pages. Instead of crawling every page, AI reads one file and understands your site.
- **Meta description and H1** are the first things AI reads. If they are generic ("Welcome to our website"), AI has nothing useful to work with.

## What makes this different from SEO auditors

Traditional SEO tools check whether Google can index your pages. The agentmarkup checker checks whether AI systems can understand your content. These overlap but are not the same.

- **llms.txt** is AI-specific. It is not part of a traditional SEO audit.
- **Markdown mirrors** are irrelevant to Google but useful when the raw HTML is thin, heavily client-rendered, or cluttered with layout noise.
- **AI crawler rules** (GPTBot, ClaudeBot) are separate from Googlebot rules. You might have perfect Google indexing while being completely invisible to ChatGPT.
- **No scores.** SEO tools love to give you a number out of 100. The checker gives you specific, actionable findings. "Your robots.txt does not include explicit rules for GPTBot" is more useful than "Your AI readiness score is 47."

## Try it now

Go to [agentmarkup.dev/checker](/checker/), enter your website URL, and see the results in seconds. It is free and requires no signup. The deployed checker may retain normalized check records briefly for caching, rate limiting, and recent-history views, but it is not a lead form.

If the checker finds issues, the documentation guides on this site explain how to fix each one. Alternatively, install [agentmarkup](https://github.com/agentmarkup/agentmarkup) for Vite, Astro, or Next.js. It handles llms.txt, JSON-LD, robots.txt, markdown mirrors, optional headers, and validation automatically at build time.

## Make your website machine-readable

agentmarkup is an open-source build-time toolkit for Vite, Astro, and Next.js that generates llms.txt, injects JSON-LD structured data, creates optional markdown mirrors from final HTML when raw pages need a cleaner agent-facing fetch path, manages AI crawler robots.txt rules, patches optional Content-Signal and canonical mirror headers, and validates everything at build time. Zero runtime cost.

```
pnpm add -D @agentmarkup/vite # or @agentmarkup/astro or @agentmarkup/next
```

Written by

[Sebastian Cochinescu](/authors/sebastian-cochinescu/) · Developer of agentmarkup

Builder of developer tools for machine-readable websites. Developer of agentmarkup. Founder of Anima Felix.
