The Web Is Forking - Read By Agents Research

_AI agent traffic surged 7,851% in 2025. One internet is for humans, another is forming for machines. Most newsletter publishers aren't building for the side that's growing. And the ones that are still need to keep attribution and referrals._

Two internets now exist. You just can't see the second one.

The first is the one you're reading right now. HTML pages designed for human eyes, styled with CSS, loaded with JavaScript, wrapped in cookie banners and newsletter popups. The web we've been building for three decades.

The second is being built underneath it. For machines.

The numbers

HUMAN Security's 2025 report measured a 7,851% year-over-year surge in AI agent traffic. Automated traffic is growing 8x faster than human visits. OpenAI accounts for 69% of observed AI-driven traffic, Meta 16.3%, Anthropic 10.8%.

Adobe measured a 4,700% year-over-year increase in AI agent visits to US retail sites alone.

In early 2025, AI bots accounted for roughly 1 in every 200 website visits. By late 2025, that ratio was 1 in 31. At current growth rates, agents will likely generate more internet traffic than humans by the end of this year.

The infrastructure already shipped

The supply side moved faster than most publishers noticed.

Cloudflare launched Markdown for Agents on February 12, 2026. Any site on its paid plans can now serve markdown to AI agents automatically through HTTP content negotiation. An agent sends an Accept: text/markdown header, and Cloudflare converts the page before it arrives. Their benchmarks: 16,180 HTML tokens down to 3,150 in markdown. An 80% reduction. Claude Code and OpenCode already send this header by default.

From Cloudflare's announcement: "Now is the time to consider not just human visitors, but start to treat agents as first-class citizens."

Jeremy Howard's LLMs.txt proposal, a simple markdown file at your domain root that points AI models toward your best pages, hit 844,000 implementations in under 18 months. Anthropic, Hugging Face, Perplexity, and Zapier adopted early. The companion format, llms-full.txt, which concatenates an entire site into one file, gets visited twice as often as the index.

Anthropic open-sourced the Model Context Protocol in November 2024. One year later: 10,000 active servers, 97 million monthly SDK downloads, client support in Claude, ChatGPT, Cursor, Gemini, Copilot, and VS Code. In December 2025, it was donated to the Linux Foundation, co-stewarded by OpenAI, Block, Microsoft, AWS, and Google.

None of this is theoretical. It all shipped.

The gap nobody's filling

All of this works for the public web. Blogs, docs, marketing pages, product listings. Content at a URL any crawler can reach.

But a large share of the internet's most useful content doesn't live on the public web. It lives in email.

Newsletters. Research reports. Market analysis. Investment memos. Industry briefings. All of it distributed to subscriber lists by platforms like Beehiiv, Substack, Ghost, ConvertKit, and Mailchimp. That content sits behind subscriber walls, wrapped in tracking pixels, stuffed with merge tags, rendered in email HTML that no agent framework was designed to parse.

An AI agent uses roughly 5x more tokens trying to make sense of newsletter HTML than it would on clean markdown. Most agents skip email content entirely.

Cloudflare can convert the public web. LLMs.txt can index your docs. MCP can wire up your API. But email inboxes? None of these tools reach them. No agent can subscribe to a newsletter, strip Beehiiv tracking links, clean ConvertKit merge tags, or store subscriber-only issues at permanent URLs.

We built Read By Agents to close this gap.

What we learned in the first weeks

The first cut of this post, shipped in April 2026, framed Read By Agents as an "open directory any agent can query." That framing was wrong. Within weeks, publishers pushed back on three things and we rebuilt around the feedback.

The first: publishers should not be listed publicly by default. Being discoverable to agents is not the same as being advertised to every casual web visitor. So the public directory is gone. New archives default to T2, agent-accessible only. The listing surfaces publicly only when the operator verifies ownership and opts in. The retirement note + three-tier model is at /trust.

The second: publishers should keep attribution, not hand it over. If an AI agent cites the Read By Agents mirror instead of the operator's original archive, that's an extractive outcome. So every issue now carries canonical_url in frontmatter, a visible "Source:" banner at the top of the markdown body, an HTTP Link: ; rel="canonical" header, and a in the HTML render. Agents that strip HTTP headers still see the upstream attribution. We are a mirror with permission, not a source of truth.

The third: publishers should see the economic upside. When Claude, ChatGPT, Perplexity, or the next wave of AI products send readers to a newsletter, the operator should know it. So every human click on a Read By Agents mirror routes through an opaque rbaref token that 302s to the operator's canonical URL and credits the visit. We aggregate the counts per newsletter per month. Anonymous aggregate is free. Per-agent granularity ships on the paid tier.

What this means for newsletter publishers

The web is splitting along an economic fault line.

One side optimizes for human attention: impressions, clicks, scroll depth, time on page. The advertising model that's powered the web since the late '90s. That model stops working when your visitor is a machine. CPMs are meaningless when agents don't render banner ads. Click-through rates don't apply when a language model extracts the answer without loading the page.

The other side optimizes for machine comprehension: clean structure, semantic markup, token efficiency, stable URLs. Different economics. Agents recommend, cite, and route traffic to content they can reliably parse. The publishers who show up in that layer accumulate visibility over time. The ones who don't are simply absent.

Newsletter publishers have an advantage they mostly haven't noticed. The content is original, it's fresh, and subscribers already trust it. That's what agents want to surface. The only barriers are format and attribution. Email HTML makes newsletters invisible to the entire agent layer of the internet. And the publishers who do get surfaced often lose the attribution to the mirror itself.

The fix is straightforward: make the archive agent-readable, keep the attribution pointing at the publisher, and make the referral traffic visible. The head start for publishers who move early is not.

_Read By Agents converts newsletters from 13 ESP platforms into clean, agent-readable markdown, hosts them at permanent URLs with canonical upstream attribution, and makes every agent-origin click countable. Join at readbyagents.com/for-operators._

The numbers

The infrastructure already shipped

The gap nobody's filling

Join Read By Agents

What we learned in the first weeks

What this means for newsletter publishers

The RBA Weekly