v0.1 / public beta

Technical SEO tools that return raw data.

Generate llms.txt manifests for AI crawlers, audit on-page markup, validate schema, and inspect any URL. Every tool returns JSON or plain text in under 30 seconds and respects robots.txt.

Open llms.txt Generator

Explore tools

Toolkit

Individual tools, no suite required.

Each tool solves one problem and returns one output. Pick what fits your stack.

llms.txt Generator

Crawls a site, extracts documentation URLs and titles, and outputs a Markdown manifest that AI agents (ChatGPT, Claude, Perplexity) fetch from /llms.txt to find canonical content.
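As a rough illustration of the output shape, the sketch below assembles a manifest from already-extracted titles and URLs. It assumes the layout described in the public llms.txt proposal (an H1, a blockquote summary, then H2 sections of links); the names Page and build_manifest are hypothetical and are not the generator's actual code.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """One documentation page found by a crawl (hypothetical structure)."""
    title: str
    url: str
    note: str = ""  # optional one-line description


def build_manifest(site_name: str, summary: str, sections: dict[str, list[Page]]) -> str:
    """Render pages as a Markdown manifest in the assumed llms.txt layout."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, pages in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for page in pages:
            entry = f"- [{page.title}]({page.url})"
            if page.note:
                entry += f": {page.note}"
            lines.append(entry)
        lines.append("")
    # End with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    docs = [Page("Quickstart", "https://example.com/docs/quickstart", "Install and run")]
    print(build_manifest("Example", "Docs for Example.", {"Docs": docs}), end="")
```

The result is a plain string, so it can be written straight to a file named llms.txt at the site root.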

Open tool

Site Audit

Technical SEO scan. Broken links, render-blocking assets, missing meta, Core Web Vitals.

Coming soon

Keyword Research

Volume, difficulty, and SERP context for any term. CSV export, no quota theatre.

Coming soon

Page Inspector

On-page diagnostics for a single URL. Titles, headings, links, OG tags, render diff.

Coming soon

Schema Validator

Validate JSON-LD against schema.org. Find conflicts, missing required fields, and rich-result eligibility.

Coming soon

Sitemap Builder

Generate and validate sitemap.xml. Priority hints, lastmod, hreflang, and a diff against your current one.

Coming soon

How it works

Built for engineers who own the site.

Free, no account

Each tool runs one URL at a time without registration. Paid tiers cover bulk crawls, scheduled audits, and HTTP API access.

Machine-readable output

Results return as JSON or plain text. Pipe into your build, commit to git, diff between runs. No dashboard required.
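One way to diff between runs, sketched with the Python standard library. The field names in the sample data are made up for illustration and do not describe the tools' real output schema.

```python
import difflib
import json


def diff_runs(previous: dict, current: dict) -> list[str]:
    """Return a unified diff between two JSON results.

    Keys are sorted before comparison so that key order alone never
    shows up as a change.
    """
    old = json.dumps(previous, indent=2, sort_keys=True).splitlines()
    new = json.dumps(current, indent=2, sort_keys=True).splitlines()
    return list(difflib.unified_diff(old, new, "previous", "current", lineterm=""))


if __name__ == "__main__":
    # Hypothetical results from two runs against the same URL.
    before = {"title": "Docs", "h1_count": 1}
    after = {"title": "Documentation", "h1_count": 1}
    print("\n".join(diff_runs(before, after)))
```

An empty list means nothing changed, which makes the function easy to use as a gate in a build script or commit hook.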

Polite by default

Crawlers read robots.txt and sitemap.xml first, throttle to 1 request per second per host, and identify with a clear user-agent. No JavaScript execution unless requested.

Open formats

Generated files use public specifications: llms.txt, sitemap.xml, and schema.org JSON-LD. Every output can be self-hosted. There is no proprietary format, and no API key is required to read the result.

FAQ

Frequently Asked Questions.

What is llms.txt and why does it matter?

llms.txt is a plain-text manifest placed at the root of a domain (example.com/llms.txt) that lists the URLs an AI agent should read to understand the site. It does for large language models what robots.txt does for crawlers and sitemap.xml does for search engines. When ChatGPT, Claude, or Perplexity fetch live content during a search, the manifest tells them which pages contain canonical documentation, API references, and product information.

Are the tools free to use?

Yes. Each tool runs one URL at a time without an account. Paid tiers cover scheduled audits, multi-domain crawls, HTTP API access, and bulk processing. Single-shot use of any tool remains free permanently.

Do I need to create an account?

No account is required for the free tools. Paste a URL, receive output, download. An account is only needed to save run history, schedule recurring scans, or use the upcoming HTTP API.

How are these tools different from Semrush, Ahrefs, or Screaming Frog?

Semrush and Ahrefs are priced for marketing teams and built around dashboards, keyword data, and backlink graphs. Screaming Frog is a desktop crawler with a GUI. This toolkit is built for engineers who own a site: it returns raw JSON or plain text that can be piped into a build script or commit hook. Output is machine-readable first, human-readable second.

What does the crawler actually read on a page?

The crawler fetches each URL with an HTTP GET, parses the response HTML, and extracts the title tag, meta description, Open Graph tags, structured data (JSON-LD blocks), heading hierarchy (h1 to h6), canonical link, and the destination of every internal anchor. JavaScript execution is disabled by default, so single-page applications without server-side rendering return little or no extractable content.
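The extraction step can be approximated with Python's built-in HTML parser. This is a minimal sketch under assumptions, not the crawler's implementation: the class name PageExtractor and its attribute names are hypothetical, and "internal" is taken to mean "same host as the page URL".

```python
import json
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class PageExtractor(HTMLParser):
    """Collects the on-page fields listed above from one HTML document."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.title = ""
        self.description = None
        self.canonical = None
        self.og = {}
        self.json_ld = []
        self.headings = []        # (level, text) in document order
        self.internal_links = []
        self._capture = None      # tag whose text is being buffered
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "title" or tag in HEADINGS:
            self._capture, self._buffer = tag, []
        elif tag == "script" and a.get("type") == "application/ld+json":
            self._capture, self._buffer = tag, []
        elif tag == "meta":
            if (a.get("name") or "").lower() == "description":
                self.description = a.get("content")
            elif (a.get("property") or "").startswith("og:"):
                self.og[a["property"]] = a.get("content")
        elif tag == "link" and "canonical" in (a.get("rel") or "").lower().split():
            self.canonical = a.get("href")
        elif tag == "a" and a.get("href"):
            target = urljoin(self.base_url, a["href"])
            # Assumption: a link is internal when its host matches the page's host.
            if urlparse(target).netloc == urlparse(self.base_url).netloc:
                self.internal_links.append(target)

    def handle_data(self, data):
        if self._capture:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._capture:
            return
        text = "".join(self._buffer)
        if tag == "title":
            self.title = " ".join(text.split())
        elif tag == "script":
            try:
                self.json_ld.append(json.loads(text))
            except json.JSONDecodeError:
                pass  # malformed JSON-LD block: skip it rather than fail the page
        else:
            self.headings.append((int(tag[1]), " ".join(text.split())))
        self._capture = None
```

Feeding it the raw response body, for example PageExtractor("https://example.com/").feed(html), fills the attributes. Because nothing executes JavaScript, a page that renders its content client-side leaves most of them empty.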

Does the crawler respect robots.txt?

Yes. Every crawl fetches /robots.txt first and skips any URL disallowed for our user-agent. Requests are throttled to one per second per host. The user-agent identifies the tool clearly so server logs reflect actual traffic and you can rate-limit further if needed.
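The same two rules, a robots.txt check and a one-second gap between requests, can be sketched with the standard library. The class name PoliteGate and the user-agent string are placeholders, not the tool's real identifiers; one gate instance is assumed per host.

```python
import time
from urllib.robotparser import RobotFileParser

USER_AGENT = "example-seo-bot"  # placeholder, not the tool's real user-agent


class PoliteGate:
    """Per-host gate: checks robots.txt rules and spaces out requests."""

    def __init__(self, robots_txt: str, user_agent: str = USER_AGENT, min_interval: float = 1.0):
        self.rules = RobotFileParser()
        self.rules.parse(robots_txt.splitlines())
        self.user_agent = user_agent
        self.min_interval = min_interval  # seconds between requests to this host
        self._last_request = None

    def allowed(self, url: str) -> bool:
        """True if robots.txt permits this user-agent to fetch the URL."""
        return self.rules.can_fetch(self.user_agent, url)

    def wait_turn(self) -> None:
        """Sleep just long enough to keep min_interval between requests."""
        if self._last_request is not None:
            remaining = self.min_interval - (time.monotonic() - self._last_request)
            if remaining > 0:
                time.sleep(remaining)
        self._last_request = time.monotonic()


if __name__ == "__main__":
    gate = PoliteGate("User-agent: *\nDisallow: /private/\n")
    for url in ("https://example.com/docs", "https://example.com/private/page"):
        if gate.allowed(url):
            gate.wait_turn()
            print("would fetch", url)
        else:
            print("skipped by robots.txt:", url)
```

Site owners who want a slower crawl can state it in server-side rate limits; the gate above only models the client side.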

Where should an llms.txt file be placed?

Serve the file at https://yourdomain.com/llms.txt with Content-Type text/plain. AI agents that support the convention fetch this exact path. The file should be a Markdown document served as plain text, reachable without authentication, and returned with an HTTP 200 status. Some implementations also support /llms-full.txt for an expanded version that includes page contents.
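A quick way to confirm a deployment meets these conditions is a small check script. The sketch below uses only the standard library; the function names and the user-agent string are hypothetical, and the check that the body opens with a Markdown H1 is an assumption based on the public llms.txt proposal rather than something this page requires.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def check_llms_response(status: int, content_type: str, body: str) -> list[str]:
    """Return a list of problems; an empty list means the response looks fine."""
    problems = []
    if status != 200:
        problems.append(f"expected HTTP 200, got {status}")
    # Ignore parameters such as "; charset=utf-8".
    media_type = content_type.split(";")[0].strip().lower()
    if media_type != "text/plain":
        problems.append(f"expected text/plain, got {media_type or 'no Content-Type'}")
    # Assumption: a valid manifest starts with a Markdown H1.
    if not body.lstrip().startswith("# "):
        problems.append("body does not open with a Markdown H1")
    return problems


def check_llms_txt(domain: str) -> list[str]:
    """Fetch https://<domain>/llms.txt and run the checks above."""
    request = Request(f"https://{domain}/llms.txt", headers={"User-Agent": "llms-txt-check"})
    try:
        with urlopen(request, timeout=10) as response:
            body = response.read().decode("utf-8", "replace")
            return check_llms_response(response.status, response.headers.get("Content-Type", ""), body)
    except HTTPError as error:
        # 4xx and 5xx responses arrive as exceptions; report them as problems.
        return check_llms_response(error.code, error.headers.get("Content-Type", ""), "")
```

Calling check_llms_txt("yourdomain.com") returns an empty list when the file is served as described, or one message per failed condition.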

Will an llms.txt file improve my Google rankings?

Not directly. llms.txt affects how AI assistants discover and cite your content, not how Google ranks pages. Traditional search engines still rely on sitemap.xml, internal links, and crawl signals. A well-structured llms.txt can improve your visibility in answers from ChatGPT, Perplexity, Claude, and Gemini when they fetch live web data, which is increasingly a discovery channel that runs in parallel with Google.

Is there an HTTP API or CLI?

An HTTP API and a CLI are on the roadmap for the paid tier. Every tool's output is already in a clean machine-readable format (JSON or plain text), so calling the web interface and parsing the result works as an interim integration. API authentication will use bearer tokens scoped per project.

What data is stored when I run a tool?

For anonymous runs, the input URL and aggregate run metadata are logged for abuse prevention; the output is discarded after delivery. No personal data is collected. For signed-in users, run history is saved to the account and can be deleted at any time. Crawled site content is never persisted beyond the response delivered to the user.