Generate a manual from screenshots
A common AI-agent workflow: walk a feature, capture a screenshot
at each step, attach annotations and redactions, and ship the
bundle as a how-to document. With the encode option in
annot-mcp@0.2.0, each screenshot shrinks roughly 3–5×, which
keeps 30–50-slide manuals from becoming too heavy to ship.
For the kept-in-sync flavour of this same flow — where the manual stays current with the live UI because a Playwright tour re-captures every screen on every PR — see the living-product-docs recipe.
Encode preset cheat sheet
For manual generation, the recommended defaults:

    "encode": {
      "format": "smart",            // PNG-8 for UI, JPEG for photos
      "saveSizePreset": "standard"  // 1920px max (mobile-friendly + print-friendly)
    }

For mobile-first manuals (smaller files acceptable):

    "encode": {
      "format": "smart",
      "saveSizePreset": "light"     // 1280px max, ~30–40% additional savings
    }

For archival / reference manuals where lossless matters:

    "encode": {
      "format": "png",                 // PNG-32 only, no PNG-8 quantization
      "saveSizePreset": "highQuality"  // 2560px max, preserves UI fidelity
    }

Agent transcript: 5-step onboarding manual
    USER: Walk the signup → first-project onboarding flow on
    staging.example.com. Capture each step with a callout explaining
    the action, and produce a manual.

    AGENT → annot.annot_annotate_url({
      url: "https://staging.example.com/signup",
      annotations: [
        { type: "rect", locator: "input[name=email]", intent: "info" },
        { type: "callout", atLocator: "h1", targetLocator: "input[name=email]", content: "Step 1: Enter your email" }
      ],
      encode: { format: "smart", saveSizePreset: "standard" }
    }) ← 180 KB PNG-8 (~60% smaller than raw PNG-32)

    [ playwright.browser_navigate + fill + click to advance ]

    AGENT → annot.annot_annotate_url({
      url: "https://staging.example.com/signup/verify",
      annotations: [
        { type: "rect", locator: "[data-testid=otp-input]", intent: "info" },
        { type: "callout", at: { x: 40, y: 40 }, targetLocator: "[data-testid=otp-input]", content: "Step 2: Enter the code from your inbox" }
      ],
      encode: { format: "smart", saveSizePreset: "standard" }
    }) ← 95 KB PNG-8 (sparse form, even smaller)

    [ ... 3 more steps ... ]

After 5 annotated steps:
| Encoding | Total bundle size |
|---|---|
| PNG-32 raw (default 0.1.x) | ~2.5 MB |
| encode: smart + standard | ~750 KB |
That’s the difference between “this manual is too heavy for the LMS” and “this manual loads instantly on a phone.”
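The per-step calls in the transcript repeat the same shape, so an agent harness could factor them out. The following is a minimal TypeScript sketch, assuming only the argument shape visible in the calls above. The names `ManualStep`, `buildAnnotateArgs`, and `reductionPercent` are hypothetical and not part of annot-mcp, and the real tool schema may accept more fields. It also reproduces the bundle-size arithmetic from the table.

```typescript
// Hypothetical types mirroring the argument shape seen in the transcript.
type EncodeOptions = {
  format: "smart" | "png";
  saveSizePreset: "light" | "standard" | "highQuality";
};

type ManualStep = {
  url: string;
  targetLocator: string; // element the step is about
  caption: string;       // callout text shown to the reader
};

// Recommended default for manuals, per the cheat sheet.
const MANUAL_ENCODE: EncodeOptions = { format: "smart", saveSizePreset: "standard" };

// Build the arguments for one annot_annotate_url call (sketch only).
function buildAnnotateArgs(step: ManualStep, encode: EncodeOptions = MANUAL_ENCODE) {
  return {
    url: step.url,
    annotations: [
      { type: "rect", locator: step.targetLocator, intent: "info" },
      {
        type: "callout",
        at: { x: 40, y: 40 },
        targetLocator: step.targetLocator,
        content: step.caption,
      },
    ],
    encode,
  };
}

// Percentage saved when going from `beforeKB` to `afterKB`.
function reductionPercent(beforeKB: number, afterKB: number): number {
  return Math.round((1 - afterKB / beforeKB) * 100);
}

const steps: ManualStep[] = [
  {
    url: "https://staging.example.com/signup",
    targetLocator: "input[name=email]",
    caption: "Step 1: Enter your email",
  },
  {
    url: "https://staging.example.com/signup/verify",
    targetLocator: "[data-testid=otp-input]",
    caption: "Step 2: Enter the code from your inbox",
  },
];

// One argument object per step, all sharing the same encode preset.
const calls = steps.map((step) => buildAnnotateArgs(step));

// Table figures: ~2.5 MB raw vs ~750 KB encoded (taking 1 MB = 1000 KB).
const saved = reductionPercent(2500, 750); // 70
```

Keeping the encode preset in one constant means a switch from "standard" to "light" for a mobile-first manual is a one-line change rather than an edit to every step.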
Composing with redaction
For manuals that walk through authenticated flows, redact sensitive fields per step:

    AGENT → annot.annot_redact_url({
      url: "https://staging.example.com/profile",
      regions: [
        { locator: "[data-testid=ssn]", style: "blur" },
        { locator: "[data-testid=card-last4]", style: "solid", color: "#000" }
      ],
      encode: { format: "smart", saveSizePreset: "standard" }
    }) ← redacted + encoded in one MCP call

The redactions burn destructively into the bitmap before the encode pipeline, so the encoded output (whether PNG-8 or JPEG) carries no recoverable original pixels.
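To illustrate why redacting before encoding matters, here is a minimal TypeScript sketch of a destructive solid fill on a raw RGBA buffer. It is an illustration under stated assumptions, not the library's implementation: `Region` and `redactSolid` are hypothetical names, and only the "solid" style is modelled.

```typescript
// A region in pixel coordinates (hypothetical shape for this sketch).
type Region = { x: number; y: number; width: number; height: number };

// Overwrite every pixel in `region` with a solid RGBA color, in place.
// After this runs, the original values in the region no longer exist in
// the buffer, so no later encoder can carry them into the output file.
function redactSolid(
  rgba: Uint8ClampedArray,
  imageWidth: number,
  region: Region,
  color: [number, number, number, number] = [0, 0, 0, 255],
): void {
  const imageHeight = rgba.length / (imageWidth * 4);
  // Clamp the region to the image bounds.
  const x0 = Math.max(0, region.x);
  const y0 = Math.max(0, region.y);
  const x1 = Math.min(imageWidth, region.x + region.width);
  const y1 = Math.min(imageHeight, region.y + region.height);
  for (let y = y0; y < y1; y++) {
    for (let x = x0; x < x1; x++) {
      rgba.set(color, (y * imageWidth + x) * 4);
    }
  }
}

// 4x2 image where every channel holds a "sensitive" mid-grey value.
const imgWidth = 4;
const pixels = new Uint8ClampedArray(imgWidth * 2 * 4).fill(128);

// Redact the left 2x2 block first; encoding would happen only afterwards.
redactSolid(pixels, imgWidth, { x: 0, y: 0, width: 2, height: 2 });
```

Because the fill mutates the buffer itself rather than drawing an overlay layer, whatever encoder runs next only ever sees the redacted values.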
Why the smart heuristic helps here
Manual screenshots are typically:
- UI-heavy — solid backgrounds, sharp text, limited palette. Smart picks PNG-8 → ~60% file-size reduction.
- Occasionally photo-mixed (e.g. a step shows a product carousel). Smart’s photo-heavy fallback picks JPEG → ~80% reduction on those specific images, no quality loss the reader notices.
The agent doesn’t have to think about which encoding to pick
per screenshot — format: "smart" handles the decision tree
based on actual pixel content.
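This page does not document the actual decision tree, so the following TypeScript sketch only shows one plausible way a content-based choice could work. It assumes a simple rule: an image whose distinct colors fit in a 256-entry palette goes to PNG-8, anything richer goes to JPEG. The names `countColors` and `pickFormat` and the threshold are hypothetical and are not the library's API.

```typescript
type ChosenFormat = "png8" | "jpeg";

// Count distinct RGB colors, stopping early once `limit` is exceeded.
function countColors(rgba: Uint8ClampedArray, limit: number): number {
  const seen = new Set<number>();
  for (let i = 0; i < rgba.length; i += 4) {
    // Pack R, G, B into one integer key; alpha is ignored in this sketch.
    seen.add((rgba[i] << 16) | (rgba[i + 1] << 8) | rgba[i + 2]);
    if (seen.size > limit) break;
  }
  return seen.size;
}

// Assumed rule for illustration: few distinct colors means flat UI, which
// fits in a 256-entry palette; many colors suggests photographic content.
function pickFormat(rgba: Uint8ClampedArray, paletteLimit = 256): ChosenFormat {
  return countColors(rgba, paletteLimit) <= paletteLimit ? "png8" : "jpeg";
}

// Flat "UI" image: 32x32 pixels of a single color.
const flatUi = new Uint8ClampedArray(32 * 32 * 4).fill(255);

// Synthetic "photo": 32x32 pixels, each with a different color.
const photo = new Uint8ClampedArray(32 * 32 * 4);
for (let p = 0; p < 32 * 32; p++) {
  photo.set([p & 0xff, p >> 8, (p * 7) & 0xff, 255], p * 4);
}

const uiChoice = pickFormat(flatUi);
const photoChoice = pickFormat(photo);
```

A production heuristic would likely weigh more signals than a raw color count (for example gradients or anti-aliased text), so treat the threshold here as a placeholder.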
License posture
@ingcreators/annot-annotator is Apache-2.0 end to end. The
bundled Median Cut quantizer powering the PNG-8 path is pure
TypeScript; no GPL inclusion, no separate install step.
See also
- Encode pipeline reference: full options + defaults + tradeoffs.
- Living product docs: same MCP composition pattern, but the resulting docs site re-syncs against the live UI on every Playwright run.
- Agent bug-report autopilot: same MCP composition pattern for a different goal.
- annot_annotate_url schema.