Skip to content

Generate a manual from screenshots

A common AI-agent workflow: walk a feature, capture screenshots at each step, attach annotations + redactions, ship the bundle as a how-to document. With annot-mcp@0.2.0’s encode option, the bytes shrink ~3–5× per screenshot — practical for 30–50 slide manuals that would otherwise be too heavy.

For the kept-in-sync flavour of this same flow — where the manual stays current with the live UI because a Playwright tour re-captures every screen on every PR — see the living-product-docs recipe.

For manual generation, the recommended defaults:

"encode": {
"format": "smart", // PNG-8 for UI, JPEG for photos
"saveSizePreset": "standard" // 1920px max (mobile-friendly + print-friendly)
}

For mobile-first manuals (smaller files acceptable):

"encode": {
"format": "smart",
"saveSizePreset": "light" // 1280px max — ~30–40% additional savings
}

For archival / reference manuals where lossless matters:

"encode": {
"format": "png", // PNG-32 only, no PNG-8 quantization
"saveSizePreset": "highQuality" // 2560px max, preserves UI fidelity
}

Agent transcript: 5-step onboarding manual

Section titled “Agent transcript: 5-step onboarding manual”
USER: Walk the signup → first-project onboarding flow on
staging.example.com. Capture each step with a callout
explaining the action, and produce a manual.
AGENT → annot.annot_annotate_url({
url: "https://staging.example.com/signup",
annotations: [
{ type: "rect", locator: "input[name=email]", intent: "info" },
{ type: "callout",
atLocator: "h1",
targetLocator: "input[name=email]",
content: "Step 1: Enter your email" }
],
encode: { format: "smart", saveSizePreset: "standard" }
})
← 180 KB PNG-8 (~60% smaller than raw PNG-32)
[ playwright.browser_navigate + fill + click to advance ]
AGENT → annot.annot_annotate_url({
url: "https://staging.example.com/signup/verify",
annotations: [
{ type: "rect", locator: "[data-testid=otp-input]", intent: "info" },
{ type: "callout",
at: { x: 40, y: 40 },
targetLocator: "[data-testid=otp-input]",
content: "Step 2: Enter the code from your inbox" }
],
encode: { format: "smart", saveSizePreset: "standard" }
})
← 95 KB PNG-8 (sparse form, even smaller)
[ ... 3 more steps ... ]

After 5 annotated steps:

EncodingTotal bundle size
PNG-32 raw (default 0.1.x)~2.5 MB
encode: smart + standard~750 KB

That’s the difference between “this manual is too heavy for the LMS” and “this manual loads instantly on a phone.”

For manuals that walk through authenticated flows, redact sensitive fields per-step:

AGENT → annot.annot_redact_url({
url: "https://staging.example.com/profile",
regions: [
{ locator: "[data-testid=ssn]", style: "blur" },
{ locator: "[data-testid=card-last4]", style: "solid", color: "#000" }
],
encode: { format: "smart", saveSizePreset: "standard" }
})
← redacted + encoded in one MCP call

The redactions burn destructively into the bitmap before the encode pipeline, so the encoded output (whether PNG-8 or JPEG) carries no recoverable original pixels.

Manual screenshots are typically:

  • UI-heavy — solid backgrounds, sharp text, limited palette. Smart picks PNG-8 → ~60% file-size reduction.
  • Occasionally photo-mixed (e.g. a step shows a product carousel). Smart’s photo-heavy fallback picks JPEG → ~80% reduction on those specific images, no quality loss the reader notices.

The agent doesn’t have to think about which encoding to pick per screenshot — format: "smart" handles the decision tree based on actual pixel content.

@ingcreators/annot-annotator is Apache-2.0 end to end. The bundled Median Cut quantizer powering the PNG-8 path is pure TypeScript; no GPL inclusion, no separate install step.