The Eight Principles of Read by Agents

We built Read by Agents to solve a specific problem: newsletters are invisible to AI agents. Some of the best writing on the internet lives inside email, but it's trapped in HTML designed for human inboxes, wrapped in tracking pixels and table-based layouts from 2007.

RBA converts that content into clean, structured markdown that any agent can parse and act on. We host it at a permanent URL and list it in a public directory.

The conversion itself is the easy part. The hard part is deciding how to build it. Your product sits between publishers, their subscribers' privacy, their sponsors' revenue, and an AI agent ecosystem that changes weekly. The decisions stack up fast.

These eight principles guide ours. They're ordered by priority: when two conflict, the lower number wins.

1. Agent Accessibility First

Every newsletter should be readable by any AI agent, on any platform, without barriers.

This is why the product exists. We structure output for RAG chunking through heading hierarchy, add machine-readable YAML frontmatter, and make sure each file is self-contained. An agent reading one of our files never needs a second API call to understand what it contains.

One question tests all structural decisions: can an agent reliably parse this?

2. Privacy by Default

Subscriber PII is scrubbed before storage. Always. No exceptions.

This sits at number two because a privacy failure is catastrophic in a way other failures are not. Newsletter emails contain subscriber email addresses in WordPress links, tracking hashes in Mailchimp URLs, UUID identifiers in Ghost redirects. None of that was meant to leave the ESP's system.

We strip all of it before the markdown file touches storage. The spec marks these as MUST-strip patterns. There is no configuration toggle to leave them in.

When we reviewed the principle ordering with external models (Gemini 2.5 Pro and DeepSeek R1), both independently recommended promoting Privacy above the Free Web and Platform Agnosticism principles. Their reasoning: if an ESP required embedding non-scrubbable tracking data, Privacy forces us to reject that content entirely. The constraint is absolute.

3. The Free and Open Web

The email knowledge economy should be as open to agents as the public web. RBA bridges that gap, for free.

Cloudflare made the public web agent-readable with their Markdown for Agents feature. We do the same for the email web, the subscriber-only content that agents cannot crawl. The basic conversion runs on Cloudflare's free tier: Workers AI for conversion, R2 for storage, Email Routing for ingestion.

A newsletter that costs money to make agent-readable is a newsletter that stays locked. We will build premium features, but never paywall basic conversion or the specification itself.

4. Platform Agnostic

We support every ESP. Your newsletter platform is your choice, not ours.

RBA ships with handlers for eleven email service providers: Beehiiv, Substack, Ghost, Kit, Mailchimp, Klaviyo, ActiveCampaign, Buttondown, MailerLite, Brevo, and WordPress/Jetpack. A generic fallback catches everything else.

Optional fields in our specification are optional because some ESPs cannot provide the data. The source_url field, for instance, is optional because Beehiiv direct-send emails have no public web archive. We made that field optional rather than penalizing Beehiiv publishers with an empty required field.

5. Transparent Sponsorship

Sponsored content is clearly labeled, never hidden, never stripped. Agents and readers deserve to know what is editorial and what is paid.

We debated this one. Stripping ads entirely would produce the cleanest output. But newsletter operators depend on sponsor revenue, and stripping their sponsors would remove their incentive to participate. The other extreme, mixing ads into editorial content, would deceive agents about what they're reading.

We isolate sponsored content under a dedicated ## Sponsored heading and flag it in the frontmatter with has_sponsored_content: true. Agents can skip the section at zero token cost, or read it for contextually relevant brand information. We also added a structure protection clause: sponsors cannot demand formatting that breaks agent parsing or reading flow.

For publishers, this creates a new pitch to their advertisers: your brand is visible on the Agent Web, clearly separated from editorial, discoverable by AI agents filtering by topic.

6. Permanent URLs, Permanent Content

Every issue gets one URL that works forever. No link rot, no broken archives.

Tracking URLs expire in 24 to 48 hours. A "permanent" markdown file with dead links contradicts everything we're building. We resolve tracking URLs to their real destinations at conversion time, running parallel HTTP requests with a two-second timeout per link. An async process retries anything that fails.

The file URL itself uses the newsletter's email send date and a deterministic slug. It never changes. The content at that URL may be updated (corrections, link resolution), but the URL is forever. We learned this the hard way: an early version used the processing timestamp instead of the email date, which meant re-forwarding the same issue created a different URL. That broke the product promise and we fixed it within a day.

7. Content Fidelity

The conversion process preserves the original author's intended meaning. Editorial content is never summarized, interpreted, or altered beyond structural formatting.

RBA is a conduit, not an editor. Our pipeline converts HTML structure to markdown structure. It does not rewrite, paraphrase, or add commentary. Images with meaningful alt text keep their URLs and descriptions in the output. Decorative images (logos, spacers, tracking pixels) get stripped.

An agent reading our output encounters the same facts, quotes, and arguments as a human reading the original email. We added this principle after an external review flagged the growing concern about AI tools that "improve" content during conversion. We convert. We clean. We do not edit.

8. Open Standard, Open Contribution

The RAD Markdown Specification is MIT-licensed. Build on it, extend it, improve it.

A proprietary format would limit the ecosystem to what we alone can build. An open standard lets anyone implement it, and each implementation makes the whole network more useful.

The spec defines what agent-readable newsletter markdown should look like. The conversion infrastructure that produces it is our product. The format is the commons. The pipeline is what we sell.

Why document this?

We publish these principles and the engineering decisions behind them because building in the open keeps us honest. The trade-offs, the external reviews, the patterns we discover: all of it lives on this research blog.

If you run a newsletter and want agents to read it, try it. If you build agents and want clean newsletter feeds, check the directory. If you want to watch how this gets built, you're in the right place.