[{"data":1,"prerenderedAt":435},["ShallowReactive",2],{"eng-shopify-agent":3},{"id":4,"title":5,"body":6,"description":418,"extension":419,"meta":420,"navigation":421,"path":422,"seo":423,"stem":424,"subtitle":425,"tags":426,"year":433,"__hash__":434},"engineering/engineering/shopify-agent.md","shopify-agent",{"type":7,"value":8,"toc":405},"minimark",[9,14,18,21,25,37,60,63,65,69,88,91,133,159,165,167,171,174,192,210,230,233,236,238,242,245,251,257,277,280,283,285,289,292,295,306,308,312,319,322,324,328,396,398,402],[10,11,13],"h2",{"id":12},"the-problem","The Problem",[15,16,17],"p",{},"E-commerce store operations involve three categories of work: data retrieval (what's my revenue, what's low on stock), code quality enforcement (accessibility compliance, performance budgets, SEO), and decision-making (should I reorder this SKU, which pages need fixes). Traditional tooling addresses these in isolation — a separate dashboard for analytics, a separate CI step for linting, a separate spreadsheet for purchasing. The question I wanted to answer: can a single autonomous agent system handle all three, with appropriate guardrails for each?",[19,20],"hr",{},[10,22,24],{"id":23},"system-design","System Design",[15,26,27,28,32,33,36],{},"The system is a CLI-driven orchestrator built on the Claude Agent SDK's ",[29,30,31],"code",{},"query()"," function. 
A main agent receives natural language commands and delegates to three specialized subagents via the SDK's ",[29,34,35],{},"Task"," tool, each operating on a different model tier based on the complexity/cost tradeoff of their domain:",[38,39,40,48,54],"ul",{},[41,42,43,47],"li",{},[44,45,46],"strong",{},"theme-auditor"," runs on Haiku — fast, cost-efficient, handles high-volume file scanning where the task is pattern recognition rather than deep reasoning",[41,49,50,53],{},[44,51,52],{},"store-analytics"," runs on Sonnet — needs multi-step reasoning to translate natural language business questions into SQL, interpret results, and generate actionable recommendations",[41,55,56,59],{},[44,57,58],{},"seo-optimizer"," runs on Sonnet — requires judgment calls about content quality and search intent that benefit from stronger reasoning",[15,61,62],{},"This isn't an arbitrary split. I profiled the token economics: a full theme audit touches 40-80 Liquid files. Running that on Sonnet would cost ~$2-4 per audit. On Haiku, it's ~$0.15. The analytics agent, by contrast, rarely processes more than a few thousand tokens of query results but needs to reason about what those numbers mean. Model selection per subagent is a cost/quality optimization, not a preference.",[19,64],{},[10,66,68],{"id":67},"the-hard-problem-dom-to-source-mapping","The Hard Problem: DOM-to-Source Mapping",[15,70,71,72,75,76,79,80,83,84,87],{},"The most technically interesting piece is the audit pipeline's auto-fix loop. 
The challenge: axe-core reports accessibility violations against rendered DOM elements, but the source code that generated those elements is Liquid templates — a server-side templating language where an ",[29,73,74],{},"\u003Cimg>"," missing ",[29,77,78],{},"alt"," text might originate from a ",[29,81,82],{},"{% for image in product.images %}"," loop in ",[29,85,86],{},"snippets/product-media-gallery.liquid",".",[15,89,90],{},"Mapping a browser DOM violation back to the correct Liquid template file requires understanding Shopify's rendering pipeline. The system uses a four-strategy resolution chain:",[92,93,94,111,121,127],"ol",{},[41,95,96,99,100,103,104,107,108],{},[44,97,98],{},"Section ID extraction"," — Shopify wraps each section's output in ",[29,101,102],{},"#shopify-section-{name}",", so ",[29,105,106],{},"#shopify-section-featured-collection"," maps directly to ",[29,109,110],{},"sections/featured-collection.liquid",[41,112,113,116,117,120],{},[44,114,115],{},"Structural knowledge"," — a product page violation likely lives in ",[29,118,119],{},"sections/main-product.liquid"," or its known snippet dependencies",[41,122,123,126],{},[44,124,125],{},"HTML fragment grep"," — extract a unique class or attribute from the violating HTML and search the Liquid codebase",[41,128,129,132],{},[44,130,131],{},"Page-type heuristics"," — fall back to the most probable template for the current page type",[15,134,135,136,139,140,142,143,146,147,150,151,154,155,158],{},"Each violation is then classified into a confidence tier: ",[44,137,138],{},"auto-fix"," (well-understood patterns like missing ",[29,141,78],{}," attributes), ",[44,144,145],{},"suggest-and-apply"," (requires human review, like color contrast adjustments), or ",[44,148,149],{},"manual-only"," (too risky to automate, like heading hierarchy restructuring). 
The subagent only acts on fixable items, applies patches using Liquid-aware context (e.g., ",[29,152,153],{},"alt=\"{{ image.alt | default: product.title }}\""," instead of hardcoded strings), then verifies each fix by running ",[29,156,157],{},"shopify theme check",". Failed verifications trigger a retry loop, capped at 3 attempts per violation.",[15,160,161,162],{},"This is the design principle I'd bring to any AI system that modifies production assets: ",[44,163,164],{},"classify confidence before acting, verify after acting, and know when to stop.",[19,166],{},[10,168,170],{"id":169},"permission-architecture","Permission Architecture",[15,172,173],{},"The agent operates on a defense-in-depth model with three independent layers:",[15,175,176,179,180,183,184,187,188,191],{},[44,177,178],{},"Layer 1 — CLI flags."," The user explicitly opts into risk levels: ",[29,181,182],{},"--allow-writes"," enables file modifications, ",[29,185,186],{},"--allow-mutations"," enables Shopify API state changes, ",[29,189,190],{},"--dangerous"," enables destructive operations. Default is read-only.",[15,193,194,201,202,205,206,209],{},[44,195,196,197,200],{},"Layer 2 — ",[29,198,199],{},"canUseTool"," callback."," Every tool invocation passes through a runtime permission check before execution. This gates MCP tool access (mutation tools require the flag), file writes (critical files are always blocked), and bash commands (destructive patterns like ",[29,203,204],{},"rm -rf"," and ",[29,207,208],{},"shopify theme publish"," require explicit opt-in).",[15,211,212,215,216,218,219,222,223,222,226,229],{},[44,213,214],{},"Layer 3 — PreToolUse hooks."," Even if ",[29,217,199],{}," somehow passes, hooks provide a second check specifically for critical file protection (",[29,220,221],{},".env",", ",[29,224,225],{},"shopify.app.toml",[29,227,228],{},"config/settings_data.json","). 
This catches subagent attempts that bypass the main orchestrator's permission check — defense-in-depth means no single layer is trusted alone.",[15,231,232],{},"Additionally, the BigQuery tool enforces SELECT-only queries with a 1GB scan cost cap and dry-run support, and GraphQL mutations are restricted to a 6-mutation allowlist regardless of permission flags.",[15,234,235],{},"Every tool invocation is logged to a JSONL audit trail via PostToolUse hooks. SubagentStop hooks capture each subagent's final output for a consolidated session report.",[19,237],{},[10,239,241],{"id":240},"data-architecture","Data Architecture",[15,243,244],{},"The system interfaces with two data paths depending on query characteristics:",[15,246,247,250],{},[44,248,249],{},"Real-time path"," — Shopify Admin GraphQL API via 7 custom MCP tools. Used for single-entity lookups (specific order, current inventory level) where freshness matters.",[15,252,253,256],{},[44,254,255],{},"Warehouse path"," — BigQuery with three datasets:",[38,258,259,265,271],{},[41,260,261,264],{},[29,262,263],{},"shopify_warehouse"," (custom pipeline): 8 tables with partition/cluster optimization, 5 analytics views",[41,266,267,270],{},[29,268,269],{},"marketing_warehouse"," (Fivetran-managed): GA4, Klaviyo, Meta Ads, Google Ads, Search Console",[41,272,273,276],{},[29,274,275],{},"unified_analytics"," (cross-source joins): 6 views that combine store operations with marketing attribution",[15,278,279],{},"The analytics subagent's system prompt includes explicit routing guidance: \"What's my revenue?\" routes to the custom warehouse, \"What's my ROAS?\" routes to unified views that use GA4 as the attribution source rather than platform-reported vanity metrics. 
This encoding of domain expertise into the prompt is deliberate — the agent shouldn't need to reason about data architecture on every query.",[15,281,282],{},"The ingestion pipeline is AWS serverless (CDK-defined): Shopify webhooks → API Gateway (HMAC verification) → Lambda (DynamoDB deduplication) → SQS FIFO → Lambda (BigQuery batch insert). A daily EventBridge-triggered full sync catches any webhook gaps.",[19,284],{},[10,286,288],{"id":287},"the-restock-agent","The Restock Agent",[15,290,291],{},"The most autonomous component: a daily Lambda that queries BigQuery for 30-day sales velocity and current inventory, groups SKUs by supplier, fetches reorder configuration from Shopify product metafields, then sends each supplier group to Claude Haiku for evaluation.",[15,293,294],{},"Haiku doesn't just check \"is stock below threshold.\" It evaluates seasonality, minimum order quantities, velocity trends, and lead times to decide whether to reorder, how much, and with what confidence. The output is a structured purchase order with per-SKU reasoning and a confidence score. In production mode, POs are emailed to suppliers via SES.",[15,296,297,298,301,302,305],{},"The key design choice: ",[44,299,300],{},"the AI makes the decision, but the system enforces idempotency."," A ",[29,303,304],{},"last_po_date"," metafield on each product prevents duplicate orders. The entire pipeline costs approximately $0.27/month in infrastructure and AI tokens.",[19,307],{},[10,309,311],{"id":310},"what-id-do-differently","What I'd Do Differently",[15,313,314,315,318],{},"The DOM-to-Liquid mapper uses synchronous ",[29,316,317],{},"execSync"," for grep operations. In a production system I'd move to async workers with a file index cache. The fix classification is rule-based — a learned model trained on historical fix success/failure would be more accurate. 
And the subagent prompt engineering, while effective, is fragile: changes to Shopify's theme architecture would require prompt updates rather than being discovered dynamically.",[15,320,321],{},"These are the tradeoffs of building a working system quickly versus building a system that scales to thousands of stores. Both are valid engineering contexts.",[19,323],{},[10,325,327],{"id":326},"numbers","Numbers",[329,330,331,344],"table",{},[332,333,334],"thead",{},[335,336,337,341],"tr",{},[338,339,340],"th",{},"Metric",[338,342,343],{},"Value",[345,346,347,356,364,372,380,388],"tbody",{},[335,348,349,353],{},[350,351,352],"td",{},"Application code",[350,354,355],{},"2,362 lines TypeScript",[335,357,358,361],{},[350,359,360],{},"Test suite",[350,362,363],{},"115 tests across 7 test files",[335,365,366,369],{},[350,367,368],{},"MCP tools",[350,370,371],{},"8 (7 Shopify + 1 BigQuery)",[335,373,374,377],{},[350,375,376],{},"Subagents",[350,378,379],{},"3 (theme on Haiku, analytics + SEO on Sonnet)",[335,381,382,385],{},[350,383,384],{},"Infrastructure",[350,386,387],{},"API Gateway, 4 Lambdas, SQS FIFO, DynamoDB, S3, EventBridge, SES",[335,389,390,393],{},[350,391,392],{},"Data warehouse",[350,394,395],{},"8 tables, 11 views (5 analytics, 6 cross-source)",[19,397],{},[10,399,401],{"id":400},"stack","Stack",[15,403,404],{},"Claude Agent SDK · Anthropic API · Shopify Plus Admin GraphQL · Playwright · axe-core · web-vitals · Google BigQuery · Fivetran · AWS CDK (Lambda, API Gateway, SQS, DynamoDB, S3, EventBridge, SES, Secrets Manager) · MCP SDK · Commander.js · Zod · Node.js native test 
runner",{"title":406,"searchDepth":407,"depth":407,"links":408},"",2,[409,410,411,412,413,414,415,416,417],{"id":12,"depth":407,"text":13},{"id":23,"depth":407,"text":24},{"id":67,"depth":407,"text":68},{"id":169,"depth":407,"text":170},{"id":240,"depth":407,"text":241},{"id":287,"depth":407,"text":288},{"id":310,"depth":407,"text":311},{"id":326,"depth":407,"text":327},{"id":400,"depth":407,"text":401},"Architecture narrative for an autonomous multi-agent system built with the Claude Agent SDK, Playwright auto-fix loops, and a BigQuery data pipeline. Explores model-tier cost optimization, DOM-to-Liquid source mapping, and defense-in-depth permission architecture.","md",{},true,"/engineering/shopify-agent",{"title":5,"description":418},"engineering/shopify-agent","Autonomous store management through multi-agent orchestration, domain-specific evaluation loops, and tiered permission controls.",[427,428,429,430,431,432],"Claude Agent SDK","Playwright","BigQuery","AWS CDK","MCP","TypeScript","2026","WxlCNgx57xSvgs2q2Rb8BIJ9zAQgLLdchJsSCHOSRls",1773700380809]