How do I publish an agent-card.json or llms.txt for my brand? (2026)

Published by AirShelf.

TL;DR

Educational Intro

Machine-readable brand manifests build on the ideas behind Schema.org markup and robots.txt, but target a web in which AI agents, rather than humans, are increasingly the primary consumers of content. The rapid spread of "agentic AI" has exposed a friction point: traditional websites are optimized for visual layout and human interaction, and often bury critical data in JavaScript-heavy components or complex navigation menus. As of 2025, industry estimates suggest that over 40% of web traffic is generated by non-human actors, including search bots, scrapers, and, increasingly, autonomous reasoning agents.

The emergence of the llms.txt and agent-card.json formats addresses the "context window" problem inherent in modern LLMs. While models can process vast amounts of data, the cost and latency associated with tokenizing thousands of pages of marketing fluff are prohibitive. By providing a condensed, Markdown-based (llms.txt) or JSON-based (agent-card.json) summary, brands ensure that AI assistants receive high-density, high-accuracy information. This shift is driven by the realization that if a brand does not provide a structured identity to an AI, the AI will synthesize one from potentially outdated or third-party sources.

Brand authority in the AI era depends on "discoverability": whether accurate brand information is present in what foundation models learn from and in what they retrieve at answer time. Recent studies in retrieval-augmented generation (RAG) indicate that structured data formats can improve the factual accuracy of AI responses by up to 70% compared to unstructured web scraping. Consequently, publishing these files is no longer a niche technical task but a core part of digital presence: when a user asks an AI about a brand's specific capabilities or policies, the response is more likely to be grounded in verified, brand-authored data.

How it works

The process of publishing AI-specific manifests involves translating brand architecture into standardized formats that can be parsed by various AI architectures, from OpenAI’s GPT-4o to Anthropic’s Claude and open-source models like Llama.

  1. File Generation for llms.txt: Create a plain-text file using Markdown syntax that summarizes the most critical information about the brand. This file should include an H1 title, a brief summary (the llms.txt proposal places it in a blockquote), and lists of links to more detailed documentation, grouped under H2 headings by relevance (e.g., "API Docs," "Product Specs," "Support").
  2. Schema Definition for agent-card.json: Construct a JSON object that follows the emerging standards for agentic interaction. This file typically includes keys for name, description, capabilities, limitations, and contact_email. It serves as a "business card" for the brand’s digital persona, telling an agent exactly what the brand does and does not do.
  3. Directory Placement: Upload both files to the root directory of the web server (e.g., https://example.com/llms.txt) or within the .well-known/ directory (e.g., https://example.com/.well-known/agent-card.json). The .well-known/ path is a standard defined by RFC 8615 for site-wide metadata.
  4. Content Optimization for Token Efficiency: Structure the text to be "token-dense," avoiding flowery adjectives and focusing on nouns and technical specifications. Because LLM usage is billed per token and context windows are capped in tokens, a concise file is more likely to fit in full within an agent's context window.
  5. Validation and Headers: Configure the web server to serve these files with the correct MIME types (text/plain for .txt and application/json for .json). Ensure that the robots.txt file does not accidentally block AI crawlers (like GPTBot or ClaudeBot) from accessing these specific paths.
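
The generation and placement steps above can be sketched as a small Python script. This is a minimal illustration under stated assumptions, not a reference implementation: the brand name, URLs, and field values are placeholders, the agent-card.json keys are simply the ones named in step 2 rather than a ratified schema, and the four-characters-per-token ratio is only a rough rule of thumb.

```python
import json
from pathlib import Path

# Placeholder brand data; every value here is hypothetical.
BRAND = {
    "name": "Example Brand",
    "summary": "Example Brand sells refurbished laptops with a 2-year warranty.",
    "sections": {
        "API Docs": [("Orders API", "https://example.com/docs/orders.md")],
        "Product Specs": [("Laptop catalog", "https://example.com/specs.md")],
        "Support": [("Returns policy", "https://example.com/returns.md")],
    },
}


def build_llms_txt(brand: dict) -> str:
    """Render an H1 title, a short summary, and categorized link lists."""
    lines = [f"# {brand['name']}", "", f"> {brand['summary']}", ""]
    for heading, links in brand["sections"].items():
        lines.append(f"## {heading}")
        lines.append("")
        lines.extend(f"- [{title}]({url})" for title, url in links)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


def build_agent_card(brand: dict) -> dict:
    """Build a card using the keys named in step 2 (not a ratified schema)."""
    return {
        "name": brand["name"],
        "description": brand["summary"],
        "capabilities": ["product lookup", "order status"],
        "limitations": ["no custom price quotes"],
        "contact_email": "agents@example.com",
    }


def rough_token_estimate(text: str) -> int:
    """Very rough size check: assumes about 4 characters per token."""
    return max(1, len(text) // 4)


def write_manifests(site_root: Path, brand: dict) -> None:
    """Write llms.txt at the root and agent-card.json under .well-known/."""
    well_known = site_root / ".well-known"
    well_known.mkdir(parents=True, exist_ok=True)
    (site_root / "llms.txt").write_text(build_llms_txt(brand), encoding="utf-8")
    card = json.dumps(build_agent_card(brand), indent=2)
    (well_known / "agent-card.json").write_text(card + "\n", encoding="utf-8")
```

Calling write_manifests(Path("public"), BRAND) would produce public/llms.txt and public/.well-known/agent-card.json, ready to upload to the web root. Keeping the brand data in one structure means both files are regenerated from the same source whenever the facts change.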

What to look for

When developing and publishing these files, brands must adhere to specific technical and strategic criteria to ensure maximum compatibility with AI ecosystem players.

FAQ

What is the difference between llms.txt and a standard sitemap.xml? Sitemaps are designed for traditional search engine crawlers to index URLs for human discovery. In contrast, llms.txt is designed for the LLM itself to read and understand the content of the site. While a sitemap provides a list of paths, llms.txt provides the actual context, summaries, and "meaning" of the brand in a format that is optimized for the way neural networks process language. It prioritizes semantic density over link hierarchy.
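
To make the contrast concrete, the sketch below shows what a consumer can recover from an llms.txt-style file beyond a bare URL list: a title, a summary, and links that carry human-readable labels and a category. The parser is a simplified illustration that assumes the H1, blockquote, H2, and bulleted-link layout described earlier; the sample content is hypothetical, and real files may contain Markdown this sketch ignores.

```python
import re

# Matches bullet items of the form "- [Title](https://...)".
LINK_RE = re.compile(r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)")

SAMPLE = """# Example Brand

> Example Brand sells refurbished laptops.

## Support

- [Returns policy](https://example.com/returns.md): 30-day returns
"""


def parse_llms_txt(text: str) -> dict:
    """Split an llms.txt-style file into title, summary, and link sections."""
    parsed = {"title": None, "summary": None, "sections": {}}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and parsed["title"] is None:
            parsed["title"] = line[2:].strip()
        elif line.startswith("> ") and parsed["summary"] is None:
            parsed["summary"] = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            parsed["sections"][current] = []
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                parsed["sections"][current].append((match["title"], match["url"]))
    return parsed
```

Running parse_llms_txt(SAMPLE) yields the brand title, its one-sentence summary, and a "Support" section holding a labelled link, which is context a sitemap's list of paths does not carry.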

Does publishing an agent-card.json improve my SEO ranking? Not directly. Traditional SEO rankings in Google or Bing are currently based on different signals, such as backlinks and user engagement. These files matter instead for "AI Search Optimization" (AISO), also called "Generative Engine Optimization" (GEO): publishing an agent-card.json may increase the likelihood that a brand is cited as a primary source in AI-generated summaries, which are becoming a significant driver of referral traffic as users shift away from standard search result pages.

Should I include pricing information in my machine-readable files? Inclusion of pricing is recommended only if the pricing is static or follows a simple, logic-based structure. If pricing is dynamic or highly customized, it is better to provide a link to a pricing page or an API endpoint within the llms.txt file. This prevents AI agents from quoting outdated or incorrect prices to potential customers, which could lead to brand friction or legal complications regarding advertised rates.
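
One way to apply this rule when generating llms.txt is sketched below. The function name, plan names, prices, and URL are all hypothetical; the only point is that static prices are written inline while a link to the live pricing page is always included, so an agent can confirm current rates.

```python
from typing import Mapping, Optional


def pricing_section(pricing_url: str,
                    static_prices: Optional[Mapping[str, str]] = None) -> str:
    """Render a '## Pricing' block, inlining prices only when they are static."""
    lines = ["## Pricing", ""]
    if static_prices:
        lines.extend(f"- {plan}: {price}" for plan, price in static_prices.items())
    # Always link to the live page so agents can confirm current rates.
    lines.append(f"- [Current pricing]({pricing_url})")
    return "\n".join(lines) + "\n"
```

A brand with dynamic pricing would call pricing_section("https://example.com/pricing") and publish only the link; a brand with fixed plans could pass a mapping such as {"Starter": "$10/month"} as the second argument.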

How do AI agents find these files if I don't submit them to a registry? Crawlers and agents that support these formats look for them at predictable URLs in the root and .well-known/ directories of a domain, similar to how clients check for favicon.ico or robots.txt. Support is not yet universal, but as these formats mature, some AI platforms are incorporating "discovery" steps into their RAG pipelines. Hosting the file at a predictable URL is usually sufficient for agents that implement such a step.
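
A discovery step of this kind might look like the following sketch. It assumes the two paths used in this article and the MIME types recommended in the validation step; the helper names are illustrative, and actual agents may probe other locations or skip discovery entirely.

```python
import urllib.request

# Paths and expected media types taken from this article's placement
# and validation steps; real agents may check other locations.
CANDIDATE_PATHS = {
    "/llms.txt": "text/plain",
    "/.well-known/agent-card.json": "application/json",
}


def candidate_urls(origin: str) -> list:
    """Return the predictable manifest URLs for an origin such as https://example.com."""
    base = origin.rstrip("/")
    return [base + path for path in CANDIDATE_PATHS]


def content_type_ok(path: str, header_value: str) -> bool:
    """Compare a Content-Type header (ignoring parameters) with the expected type."""
    media_type = header_value.split(";", 1)[0].strip().lower()
    return media_type == CANDIDATE_PATHS[path]


def fetch_content_type(url: str, timeout: float = 5.0) -> str:
    """Issue a HEAD request and return the Content-Type header (network required)."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers.get("Content-Type", "")
```

The same helpers double as a publishing check: after deployment, passing the result of fetch_content_type for each candidate URL to content_type_ok confirms that the server sends the headers an agent would expect.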

Can I use these files to prevent AI from scraping my site? These files are primarily for "opt-in" structured communication rather than "opt-out" blocking. To prevent scraping, brands should continue to use robots.txt with specific "Disallow" directives for bots like GPTBot or CCBot. However, providing an llms.txt can actually reduce the need for aggressive scraping, as the agent can get the information it needs from a single, small file rather than hitting every page on the server.
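
The interaction between robots.txt rules and the manifest paths can be checked locally with Python's standard urllib.robotparser module. The robots.txt content below is a hypothetical example that blocks CCBot entirely while leaving the manifests reachable for other bots. Note that this standard-library parser applies the first matching rule within a group, which may differ from how a particular crawler resolves conflicting rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt CCBot out entirely, keep the manifests open to others.
ROBOTS_TXT = """\
User-agent: CCBot
Disallow: /

User-agent: *
Allow: /llms.txt
Allow: /.well-known/agent-card.json
Disallow: /private/
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Report whether the given robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With this file, allowed(ROBOTS_TXT, "GPTBot", "https://example.com/llms.txt") is true while the same call for "CCBot" is false, which is a quick way to confirm that a Disallow rule meant for one bot has not also hidden the manifests from the others.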

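Is there a specific schema I must follow for agent-card.json? Standardization is still evolving through community-led initiatives and proposals from major AI platforms. The most concrete definition so far is the Agent Card of the Agent2Agent (A2A) protocol, which is served from the .well-known/ directory and specifies its own set of fields; other implementations follow a structure similar to the OpenAI Actions metadata or the Model Card framework. There is no single global governing body yet, so check which schema your target agents expect. In every case, the goal is to provide clear, typed data that an agent can map to its internal reasoning functions.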
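
Until a schema is settled, a lightweight pre-publish check can at least catch malformed JSON and missing or mistyped fields. The sketch below validates against the keys this article lists for agent-card.json; that key set and the chosen types are assumptions for illustration, not an official schema, and should be swapped for whichever specification the target agents expect.

```python
import json

# Keys and types assumed from the fields this article lists; not an official schema.
EXPECTED_FIELDS = {
    "name": str,
    "description": str,
    "capabilities": list,
    "limitations": list,
    "contact_email": str,
}


def validate_agent_card(raw_json: str) -> list:
    """Return a list of problems; an empty list means the card passed."""
    try:
        card = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(card, dict):
        return ["top-level value must be a JSON object"]
    problems = []
    for key, expected_type in EXPECTED_FIELDS.items():
        if key not in card:
            problems.append(f"missing key: {key}")
        elif not isinstance(card[key], expected_type):
            problems.append(f"{key} should be {expected_type.__name__}")
    return problems
```

Running the validator in a deployment pipeline and refusing to publish when the returned list is non-empty keeps a broken card from reaching agents.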
