# How to add llms.txt, JSON-LD, and AI crawler controls to Next.js - agentmarkup

> Use @agentmarkup/next to generate llms.txt, inject JSON-LD, manage AI crawler rules, and understand the fully dynamic SSR boundary in Next.js.

Source: https://agentmarkup.dev/blog/nextjs-llms-txt-json-ld/

By [Sebastian Cochinescu](/authors/sebastian-cochinescu/) · March 23, 2026 · 8 min read

Next.js sites need the same machine-readable surface as any other modern website: `llms.txt`, structured data, crawler rules, and validation. The tricky part is choosing the right integration point so those artifacts reflect your final output instead of an earlier build step.

## Why Next.js is slightly different

With plain Vite or Astro, the final HTML output is usually obvious. Next.js can mix static export, prerendered pages, server deployments, and fully dynamic SSR routes in the same app. That means a useful Next integration cannot just be a generic bundler plugin. It has to respect what Next actually emits at build time.

That is what [@agentmarkup/next](https://www.npmjs.com/package/@agentmarkup/next) is for. It is a final-output-first adapter built around Next's config and build hooks rather than a Vite-style HTML transform.

## What the Next.js adapter gives you

- **`llms.txt`** generation from your config, with the homepage discovery link injected automatically
- **Optional `llms-full.txt`** with inlined same-site markdown context when mirrors exist
- **JSON-LD injection** into emitted HTML plus validation of existing schema blocks
- **Optional markdown mirrors** for thin or noisy built pages that need a cleaner fetch target for agents
- **`robots.txt` patching** for AI crawler directives like GPTBot, ClaudeBot, and Google-Extended
- **Header support** for Content-Signal and markdown canonicals through static `_headers` output or merged Next header rules
- **Build-time validation** for schema mistakes, crawler conflicts, thin HTML, and markdown drift
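
To make the `robots.txt` bullet concrete, here is a minimal sketch of how a per-crawler allow/disallow map could be rendered as directives. `CrawlerPolicy` and `renderCrawlerRules` are hypothetical names used only for illustration; they are not the adapter's API, and the adapter's real output may differ in ordering and formatting.

```typescript
// Hypothetical illustration: not part of @agentmarkup/next.
type CrawlerPolicy = 'allow' | 'disallow'

// Render one robots.txt group per crawler user agent.
function renderCrawlerRules(rules: Record<string, CrawlerPolicy>): string {
  return Object.entries(rules)
    .map(([userAgent, policy]) => {
      const directive = policy === 'allow' ? 'Allow: /' : 'Disallow: /'
      return `User-agent: ${userAgent}\n${directive}`
    })
    .join('\n\n')
}

// Groups come out in the order the keys were written.
const crawlerBlock = renderCrawlerRules({
  GPTBot: 'allow',
  ClaudeBot: 'allow',
  'Google-Extended': 'allow',
  CCBot: 'disallow',
})
```

Each crawler gets its own `User-agent` group, so one bot's policy can change without touching the others.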

## Basic setup

Install the package:

```
pnpm add -D @agentmarkup/next
```

Then wrap your Next config:

```
// next.config.ts
import type { NextConfig } from 'next'
import { withAgentmarkup } from '@agentmarkup/next'

const nextConfig: NextConfig = {
  output: 'export',
}

export default withAgentmarkup(
  {
    site: 'https://example.com',
    name: 'Example Docs',
    description: 'Technical docs and product pages.',
    globalSchemas: [
      { preset: 'webSite', name: 'Example Docs', url: 'https://example.com' },
      { preset: 'organization', name: 'Example Inc.', url: 'https://example.com' },
    ],
    llmsTxt: {
      sections: [
        {
          title: 'Documentation',
          entries: [
            {
              title: 'Getting Started',
              url: '/docs/getting-started',
              description: 'Setup guide and first steps',
            },
          ],
        },
      ],
    },
    llmsFullTxt: {
      enabled: true,
    },
    markdownPages: {
      enabled: true,
    },
    contentSignalHeaders: {
      enabled: true,
    },
    aiCrawlers: {
      GPTBot: 'allow',
      ClaudeBot: 'allow',
      PerplexityBot: 'allow',
      'Google-Extended': 'allow',
      CCBot: 'disallow',
    },
  },
  nextConfig,
)
```

The config shape is shared across the first-party adapters: `AgentMarkupConfig` stays framework-agnostic, and only the wrapper changes.
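
For a sense of what the `llmsTxt` section of that config turns into, the sketch below renders the same data in the conventional `llms.txt` layout: an H1 title, a blockquote summary, then H2 sections of links. The `LlmsSource` types and `renderLlmsTxt` are hypothetical stand-ins, not the adapter's implementation, and resolving root-relative URLs against `site` is an assumption about the output.

```typescript
// Hypothetical types mirroring the llmsTxt shape used in the config above.
interface LlmsEntry { title: string; url: string; description?: string }
interface LlmsSection { title: string; entries: LlmsEntry[] }
interface LlmsSource {
  site: string
  name: string
  description: string
  sections: LlmsSection[]
}

// Join a root-relative path onto the site origin (assumed behaviour).
function absoluteUrl(site: string, path: string): string {
  return site.replace(/\/+$/, '') + '/' + path.replace(/^\/+/, '')
}

// Render title, summary, and one link list per section.
function renderLlmsTxt(source: LlmsSource): string {
  const lines: string[] = [`# ${source.name}`, '', `> ${source.description}`]
  for (const section of source.sections) {
    lines.push('', `## ${section.title}`, '')
    for (const entry of section.entries) {
      const link = `- [${entry.title}](${absoluteUrl(source.site, entry.url)})`
      lines.push(entry.description ? `${link}: ${entry.description}` : link)
    }
  }
  return lines.join('\n') + '\n'
}

const llmsTxt = renderLlmsTxt({
  site: 'https://example.com',
  name: 'Example Docs',
  description: 'Technical docs and product pages.',
  sections: [
    {
      title: 'Documentation',
      entries: [
        {
          title: 'Getting Started',
          url: '/docs/getting-started',
          description: 'Setup guide and first steps',
        },
      ],
    },
  ],
})
```

With the sample config, the single entry becomes one list item under a `## Documentation` heading.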

## Where it works best

The strongest fit is static export and any route where Next emits build-time HTML that can be patched or post-processed. That includes a lot of real App Router sites: docs, marketing pages, blog pages, changelogs, and mixed apps with a meaningful prerendered surface.

On those builds, you get the full output flow:

```
out/
 llms.txt
 llms-full.txt
 robots.txt
 _headers
 docs/getting-started/index.html
 docs/getting-started.md
```
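
The tree above implies a simple naming rule for mirrors: a route's `index.html` maps to a sibling `.md` file named after the route. A minimal sketch of that mapping follows. `mirrorPathFor` is a hypothetical helper, and its handling of the root `index.html` and of flat `.html` files is an assumption rather than documented adapter behaviour.

```typescript
// Hypothetical helper: maps an emitted HTML file to its markdown mirror path,
// following the layout shown in the output tree.
function mirrorPathFor(htmlPath: string): string {
  const indexSuffix = '/index.html'
  // Assumption: the homepage mirror sits next to the root index.html.
  if (htmlPath === 'index.html') return 'index.md'
  // docs/getting-started/index.html -> docs/getting-started.md
  if (htmlPath.endsWith(indexSuffix)) {
    return htmlPath.slice(0, htmlPath.length - indexSuffix.length) + '.md'
  }
  // Assumption: flat files such as about.html become about.md.
  return htmlPath.replace(/\.html$/, '.md')
}

const mirrorExample = mirrorPathFor('docs/getting-started/index.html')
```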

Server deployments still benefit. You keep the generated root artifacts and header integration even when the deployment is not a pure static export.

## The one caveat that matters

Fully dynamic SSR routes are the boundary. If Next never emits an HTML file for a route at build time, there is no final file for the adapter to patch afterward.

That does **not** make the package useless for Next apps. It just means you should be precise about ownership:

- **Use `@agentmarkup/next`** for static export, prerendered pages, generated root artifacts, and header integration
- **Use the re-exported `@agentmarkup/core` helpers** inside app code for truly dynamic routes that have no build-time HTML file

That is the honest model for Next. The package is strongest where Next has a final-output artifact. For routes without one, route-level core helpers are the right tool.
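
This article does not list the core helper names, so the sketch below is a hand-rolled stand-in rather than the `@agentmarkup/core` API. It shows the work a fully dynamic route has to do itself: build the schema object per request and serialize it safely into a script tag. `ArticleInput`, `articleJsonLd`, and `jsonLdScriptTag` are hypothetical names.

```typescript
// Hand-rolled stand-in for illustration; it does not use or mirror the
// actual @agentmarkup/core helper names.
interface ArticleInput {
  headline: string
  url: string
  datePublished: string // ISO 8601 date
}

// Build a schema.org Article object from per-request data.
function articleJsonLd(input: ArticleInput): Record<string, unknown> {
  return {
    '@context': 'https://schema.org',
    '@type': 'Article',
    headline: input.headline,
    url: input.url,
    datePublished: input.datePublished,
  }
}

// Escape "<" so a value such as "</script>" cannot close the tag early.
function jsonLdScriptTag(schema: Record<string, unknown>): string {
  const json = JSON.stringify(schema).replace(/</g, '\\u003c')
  return `<script type="application/ld+json">${json}</script>`
}

const dynamicTag = jsonLdScriptTag(
  articleJsonLd({
    headline: 'Release notes </script> edge case',
    url: 'https://example.com/releases/42',
    datePublished: '2026-03-23',
  }),
)
```

In an App Router page, the same escaped JSON string would typically be rendered into a `<script type="application/ld+json">` element from the server component for that route.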

## Should you enable markdown mirrors?

Only when they add signal. If your emitted HTML is already substantial, keep HTML as the primary fetch target. If the built page is thin, noisy, or heavily shell-like, generated markdown mirrors can give fetch-based agents a cleaner path.

agentmarkup keeps that feature disciplined by generating mirrors from final HTML, keeping them directly fetchable, and adding canonical headers back to the HTML route so search engines keep the page itself as the preferred URL.
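
As a rough picture of the canonical part, the sketch below builds one header rule in the shape that Next's `headers()` config function returns, attaching a `Link` header with `rel="canonical"` to a mirror path. `canonicalRuleForMirror` is a hypothetical helper, and the exact header name, value, and trailing-slash handling the adapter emits are assumptions.

```typescript
// Shape of one entry returned by the `headers()` function in next.config.
interface HeaderRule {
  source: string
  headers: { key: string; value: string }[]
}

// Hypothetical helper: point a markdown mirror back at its HTML route.
function canonicalRuleForMirror(site: string, mirrorPath: string): HeaderRule {
  // /docs/getting-started.md -> /docs/getting-started
  const htmlRoute = mirrorPath.replace(/\.md$/, '')
  const canonical = site.replace(/\/+$/, '') + htmlRoute
  return {
    source: mirrorPath,
    headers: [{ key: 'Link', value: `<${canonical}>; rel="canonical"` }],
  }
}

const mirrorRule = canonicalRuleForMirror(
  'https://example.com',
  '/docs/getting-started.md',
)
```

On a static export the same pairing of path and header would go into the `_headers` file instead of a Next header rule.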

## Why this is useful for the Next.js community

A lot of Next.js teams already care about structured metadata, crawlability, and build output quality. They just do not want five separate solutions for `llms.txt`, JSON-LD, crawler policy, markdown mirrors, and validation.

The practical value of `@agentmarkup/next` is that it keeps those concerns in one build step, on the same config surface, with the same rules the public checker looks for on deployed sites.

## The bottom line

If your Next.js app has a real static or prerendered surface, [@agentmarkup/next](https://www.npmjs.com/package/@agentmarkup/next) is the natural adapter. It gives you build-time machine-readable output without making you stitch the pieces together manually.

Start with the adapter for the routes Next emits, keep markdown mirrors optional, and use [@agentmarkup/core](https://www.npmjs.com/package/@agentmarkup/core) directly only where fully dynamic SSR makes that necessary. That is the cleanest model for shipping a machine-readable Next.js website today.

If you need the underlying pieces in more detail, read the [llms.txt guide](/docs/llms-txt/), the [JSON-LD guide](/docs/json-ld/), and the [AI crawlers guide](/docs/ai-crawlers/).

## Make your website machine-readable

agentmarkup is an open-source build-time toolkit for Vite, Astro, and Next.js that generates llms.txt, injects JSON-LD structured data, creates optional markdown mirrors from final HTML when raw pages need a cleaner agent-facing fetch path, manages AI crawler robots.txt rules, patches optional Content-Signal and canonical mirror headers, and validates everything at build time. Zero runtime cost.

```
pnpm add -D @agentmarkup/vite # or @agentmarkup/astro or @agentmarkup/next
```

Written by

[Sebastian Cochinescu](/authors/sebastian-cochinescu/) · Developer of agentmarkup

Builder of developer tools for machine-readable websites. Developer of agentmarkup. Founder of Anima Felix.

