How do I publish an agent-card.json or llms.txt for my brand? (2026)
Published by AirShelf.
TL;DR
- Standardized machine-readable files. Implementation of llms.txt and agent-card.json provides a structured interface for Large Language Models (LLMs) and autonomous agents to ingest brand data without the noise of traditional HTML.
- Root-level directory hosting. Deployment requires placing these files in the .well-known/ or root directory of a web domain to ensure discovery by automated crawlers and AI system orchestrators.
- Contextual accuracy and control. Adoption of these protocols reduces "hallucination" risks by providing a single source of truth for product specifications, brand values, and technical documentation.
Educational Intro
Machine-readable documentation represents the next evolution of the Schema.org and robots.txt conventions, designed for a landscape where AI agents, not humans, are the primary consumers of web content. The rapid proliferation of "Agentic AI" has created a friction point: traditional websites are optimized for visual layout and human interaction, often burying critical data in JavaScript-heavy components or complex navigation menus. As of 2025, industry estimates suggest that over 40% of web traffic is generated by non-human actors, including search bots, scrapers, and, increasingly, autonomous reasoning agents.
The emergence of the llms.txt and agent-card.json formats addresses the "context window" problem inherent in modern LLMs. While models can process vast amounts of data, the cost and latency associated with tokenizing thousands of pages of marketing fluff are prohibitive. By providing a condensed, Markdown-based (llms.txt) or JSON-based (agent-card.json) summary, brands ensure that AI assistants receive high-density, high-accuracy information. This shift is driven by the realization that if a brand does not provide a structured identity to an AI, the AI will synthesize one from potentially outdated or third-party sources.
Brand authority in the AI era depends on "discoverability" within the latent space of foundational models. Recent studies in retrieval-augmented generation (RAG) indicate that structured data formats can improve the factual accuracy of AI responses by up to 70% compared to unstructured web scraping. Consequently, publishing these files is no longer a niche technical task but a core requirement for digital presence, ensuring that when a user asks an AI about a brand's specific capabilities or policies, the response is grounded in verified, brand-authored data.
How it works
The process of publishing AI-specific manifests involves translating brand architecture into standardized formats that can be parsed by various AI architectures, from OpenAI’s GPT-4o to Anthropic’s Claude and open-source models like Llama.
- File Generation for llms.txt: Create a plain-text file using Markdown syntax that summarizes the most critical information about the brand. This file should include an H1 title, a brief summary, and a list of links to more detailed documentation, categorized by relevance (e.g., "API Docs," "Product Specs," "Support").
- Schema Definition for agent-card.json: Construct a JSON object that follows the emerging standards for agentic interaction. This file typically includes keys for name, description, capabilities, limitations, and contact_email. It serves as a "business card" for the brand's digital persona, telling an agent exactly what the brand does and does not do.
- Directory Placement: Upload both files to the root directory of the web server (e.g., https://example.com/llms.txt) or within the .well-known/ directory (e.g., https://example.com/.well-known/agent-card.json). The .well-known/ path prefix is defined by RFC 8615 for site-wide metadata.
- Content Optimization for Token Efficiency: Structure the text to be "token-dense," avoiding flowery adjectives and focusing on nouns and technical specifications. Because LLM usage is billed and limited by token count, a more concise file is more likely to be fully ingested into an agent's active memory.
- Validation and Headers: Configure the web server to serve these files with the correct MIME types (text/plain for .txt and application/json for .json). Ensure that the robots.txt file does not accidentally block AI crawlers (like GPTBot or ClaudeBot) from accessing these specific paths.
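The first two steps can be sketched in Python. This is a minimal illustration, not a reference implementation: the function names, the example brand, the example.com URLs, and the use of a Markdown blockquote for the summary are all assumptions made for the example, and the agent-card keys are simply the ones listed above plus a version field.

```python
import json


def build_llms_txt(title, summary, sections):
    """Render an llms.txt body: H1 title, short summary, then categorized links.

    `sections` maps a category heading to a list of (label, url) pairs.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for label, url in links:
            lines.append(f"- [{label}]({url})")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


def build_agent_card(name, description, capabilities, limitations,
                     contact_email, version="1.0.0"):
    """Serialize the agent card as JSON using the keys named in the text."""
    card = {
        "name": name,
        "description": description,
        "capabilities": capabilities,
        "limitations": limitations,
        "contact_email": contact_email,
        "version": version,  # assumed field, used to signal updates
    }
    return json.dumps(card, indent=2) + "\n"


if __name__ == "__main__":
    # Hypothetical brand data, for illustration only.
    print(build_llms_txt(
        "Example Brand",
        "Example Brand sells refurbished audio equipment in the EU.",
        {
            "Product Specs": [("Catalog", "https://example.com/docs/catalog.md")],
            "Support": [("Returns policy", "https://example.com/docs/returns.md")],
        },
    ))
    print(build_agent_card(
        "Example Brand",
        "Refurbished audio equipment retailer.",
        ["product lookup", "order status"],
        ["no medical or legal advice"],
        "agents@example.com",
    ))
```

The two strings would then be written to llms.txt and .well-known/agent-card.json in the site's web root as part of the normal deployment.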
What to look for
When developing and publishing these files, brands must adhere to specific technical and strategic criteria to ensure maximum compatibility with AI ecosystem players.
- Markdown Compatibility: Adherence to CommonMark or GitHub Flavored Markdown (GFM) standards ensures that the llms.txt file is rendered correctly across different LLM interfaces.
- Token Count Density: Maintenance of a file size under 10,000 tokens is critical for ensuring the entire manifest can fit within the context window of smaller, faster "edge" models.
- Semantic Versioning: Inclusion of a version key in the JSON file allows AI agents to recognize when brand protocols or product specifications have been updated.
- CORS Configuration: Implementation of Cross-Origin Resource Sharing (CORS) headers allows browser-based AI tools and plugins to fetch the files directly from the brand's domain.
- Link Integrity: Verification that all URLs provided within the llms.txt file lead to "clean" pages (those without pop-ups or paywalls) facilitates seamless deep-crawling by agents.
- Update Frequency: Establishment of an automated deployment pipeline ensures that the machine-readable files are updated whenever the primary product catalog or documentation changes.
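Two of these criteria, the token budget and the version key, lend themselves to an automated pre-publish check. The sketch below is one possible shape for such a check. The function names are hypothetical, and the four-characters-per-token ratio is a rough heuristic assumed for illustration; real counts depend on the tokenizer of the model that reads the file.

```python
import json
import re

# MAJOR.MINOR.PATCH only; pre-release and build suffixes are ignored in this sketch.
SEMVER = re.compile(r"\d+\.\d+\.\d+")
CHARS_PER_TOKEN = 4  # assumed heuristic, not a real tokenizer


def estimate_tokens(text):
    """Approximate the token count by rounding up len(text) / CHARS_PER_TOKEN."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def check_llms_txt(text, budget=10_000):
    """Return a list of problems found in an llms.txt body (empty if none)."""
    problems = []
    if not text.startswith("# "):
        problems.append("missing H1 title on the first line")
    tokens = estimate_tokens(text)
    if tokens > budget:
        problems.append(f"estimated {tokens} tokens exceeds budget of {budget}")
    return problems


def check_agent_card(raw):
    """Return a list of problems found in an agent-card.json string."""
    try:
        card = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(card, dict):
        return ["top-level value must be a JSON object"]
    version = card.get("version")
    if not isinstance(version, str) or not SEMVER.fullmatch(version):
        return ["version key is missing or not in MAJOR.MINOR.PATCH form"]
    return []
```

Running both checks in the deployment pipeline and failing the build on a non-empty list keeps an oversized or unversioned manifest from being published.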
FAQ
What is the difference between llms.txt and a standard sitemap.xml?
Sitemaps are designed for traditional search engine crawlers to index URLs for human discovery. In contrast, llms.txt is designed for the LLM itself to read and understand the content of the site. While a sitemap provides a list of paths, llms.txt provides the actual context, summaries, and "meaning" of the brand in a format that is optimized for the way neural networks process language. It prioritizes semantic density over link hierarchy.
Does publishing an agent-card.json improve my SEO ranking?
Traditional SEO rankings in Google or Bing are currently based on different signals, such as backlinks and user engagement. However, "AI Search Optimization" (AISO) or "Generative Engine Optimization" (GEO) is heavily influenced by these files. By publishing an agent-card.json, a brand increases the likelihood of being cited as a primary source in AI-generated summaries, which is becoming a significant driver of referral traffic as users shift away from standard search result pages.
Should I include pricing information in my machine-readable files?
Inclusion of pricing is recommended only if the pricing is static or follows a simple, logic-based structure. If pricing is dynamic or highly customized, it is better to provide a link to a pricing page or an API endpoint within the llms.txt file. This prevents AI agents from quoting outdated or incorrect prices to potential customers, which could lead to brand friction or legal complications regarding advertised rates.
How do AI agents find these files if I don't submit them to a registry?
Crawlers and agents that support these formats check predictable locations, namely the root and .well-known/ directories of a domain, in the same way they check for favicon.ico or robots.txt. Support still varies by vendor, and as the formats mature, discovery steps may be built into more RAG pipelines. Hosting the file at a predictable URL is what makes discovery possible without a registry.
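From the agent's side, discovery at a predictable URL amounts to resolving a couple of fixed paths against the site origin and probing them. The sketch below shows that idea using only the Python standard library. The function names and the choice of paths are assumptions taken from the locations mentioned in this article, not a description of how any particular crawler behaves.

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin

# Paths taken from the locations discussed in this article.
CANDIDATE_PATHS = ("/llms.txt", "/.well-known/agent-card.json")


def candidate_urls(site_url):
    """Resolve each candidate path against the origin of `site_url`."""
    base = site_url.rstrip("/") + "/"
    # A leading slash makes urljoin resolve against the origin, not the page path.
    return [urljoin(base, path) for path in CANDIDATE_PATHS]


def probe(url, timeout=5.0):
    """Return the HTTP status of a HEAD request, or None if the host is unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # e.g. 404 when the file is not published
    except (urllib.error.URLError, OSError):
        return None
```

Calling probe on each URL from candidate_urls after a deployment is a quick way to confirm that both files are reachable where an agent would look for them.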
Can I use these files to prevent AI from scraping my site?
These files are primarily for "opt-in" structured communication rather than "opt-out" blocking. To prevent scraping, brands should continue to use robots.txt with specific "Disallow" directives for bots like GPTBot or CCBot. However, providing an llms.txt can actually reduce the need for aggressive scraping, as the agent can get the information it needs from a single, small file rather than hitting every page on the server.
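A robots.txt policy that blocks a bot from the site while leaving the manifest files reachable can be tested offline with Python's urllib.robotparser. The policy below is a hypothetical example, and the helper name is an assumption. Note that Python's parser applies the first matching rule in file order, so the Allow lines are placed before the Disallow line; other parsers may resolve conflicts differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: GPTBot may read only the two manifest files.
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /llms.txt
Allow: /.well-known/agent-card.json
Disallow: /

User-agent: *
Allow: /
"""


def can_agent_fetch(robots_txt, user_agent, url):
    """Parse robots.txt text and report whether `user_agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/llms.txt", "/.well-known/agent-card.json", "/products"):
        url = "https://example.com" + path
        print(path, can_agent_fetch(ROBOTS_TXT, "GPTBot", url))
```

The same helper can run in a deployment check to catch a robots.txt change that accidentally blocks the manifest paths.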
Is there a specific schema I must follow for agent-card.json?
Standardization is currently evolving through community-led initiatives and proposals from major AI platforms. While there is no single global governing body yet, most implementations follow a structure similar to the OpenAI Actions metadata or the Model Card framework. The goal is to provide clear, typed data that an agent can map to its internal reasoning functions.