Per-product, not per-site: why every item in your catalogue needs its own agent score

Catalogue tools were built for search engine optimisation. They average everything. AI shopping agents make decisions one product at a time, and the averages hide the products that will actually fail at checkout.

If you ask a typical catalogue audit tool how AI-ready your store is, it returns a single number. Maybe a few numbers. Schema coverage at 87 per cent. GTIN coverage at 64 per cent. Stock availability declared on 91 per cent of pages. The dashboard turns green and the team moves on.

Here is the problem with that. An AI shopping agent does not buy your catalogue. It buys one product. When a consumer says "find me an organic oat milk under four pounds", the agent picks three candidate products, ranks them, and tries to purchase the winner. If that winning product falls among the 13 per cent without schema, or the 36 per cent without a GTIN, the agent fails at checkout and the consumer gets nothing. The 87 per cent average looks healthy on the dashboard. The customer experience is broken.
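To put a rough number on that risk, the short Python sketch below combines the two coverage figures quoted above. The dashboard does not say how the schema and GTIN gaps overlap, so the independence estimate is an assumption. The best and worst cases bound it whatever the overlap is.

```python
# Coverage figures quoted in the text.
schema_coverage = 0.87
gtin_coverage = 0.64

missing_schema = 1 - schema_coverage
missing_gtin = 1 - gtin_coverage

# Assumption: the two gaps are statistically independent.
fully_covered = schema_coverage * gtin_coverage
at_risk = 1 - fully_covered

# Bounds that hold whatever the overlap between the two gaps is.
best_case = max(missing_schema, missing_gtin)          # gaps overlap completely
worst_case = min(1.0, missing_schema + missing_gtin)   # gaps never overlap

print(f"at risk if independent: {at_risk:.1%}")
print(f"at risk, best to worst case: {best_case:.0%} to {worst_case:.0%}")
```

Under these assumptions, somewhere between roughly a third and a half of the catalogue is missing at least one of the two signals, even though both headline averages look respectable.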

This is the fundamental shift agentic commerce forces on catalogue measurement. Per-site averages were a useful proxy when humans were the consumers and Google was the intermediary, because humans tolerate noise. They scroll past the broken results. Agents do not scroll. An agent that picks a product that fails at checkout simply abandons the purchase and reports back that the merchant could not fulfil. The damage is silent and total.

So we built something different. Aidō Lighthouse now scores every individual product in your catalogue against a single question: can an AI agent actually buy this?

The five axes

Each product gets a score from zero to one hundred across five dimensions. Each dimension answers a question an agent will ask before attempting to purchase.

Identifiable

Can the agent uniquely refer to this product across merchants? GTIN, SKU, brand. Without these the agent cannot compare your product against a competitor's.

Priceable

Can the agent get a current, machine-readable price? A numeric price field, a declared currency, no "call for quote". If the price needs scraping or interpretation, the agent will skip you.

Stockable

Can the agent tell whether the product is in stock? Schema.org availability declared, ideally at the variant level. "InStock" with no variant breakdown means the agent has to guess for size or colour.

Purchasable

Are there hidden constraints that will block agent purchase? B2B-only, prescription required, age-restricted, made-to-order. These look like normal products until checkout fails.

Returnable

Is there a per-product return policy declared? Schema.org MerchantReturnPolicy with a return window, fees, and method. Site-wide policies do not satisfy mandate-bound agents.
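As an illustration only, four of these questions can be asked of a product's structured data with a few lines of Python. The sketch assumes the product has already been parsed into a dictionary shaped like Schema.org Product markup with a single offer. The function name and the pass/fail simplification are ours for the example and do not describe how Lighthouse scores. Purchasable is left out because it depends on reading the product copy, not on a schema field.

```python
from typing import Any


def _is_number(value: Any) -> bool:
    """True when the value parses as a number, e.g. 3.5 or "3.50"."""
    try:
        float(value)
    except (TypeError, ValueError):
        return False
    return True


def axis_signals(product: dict[str, Any]) -> dict[str, bool]:
    """Pass/fail signals for four axes (hypothetical helper, not the real scorer)."""
    offer = product.get("offers") or {}  # assumption: one offer, not a list
    policy = offer.get("hasMerchantReturnPolicy") or {}
    return {
        "identifiable": all(product.get(k) for k in ("gtin", "sku", "brand")),
        "priceable": _is_number(offer.get("price")) and bool(offer.get("priceCurrency")),
        "stockable": bool(offer.get("availability")),
        "returnable": all(
            k in policy for k in ("merchantReturnDays", "returnFees", "returnMethod")
        ),
    }


# Made-up example product; the identifiers are not real.
oat_milk = {
    "gtin": "05012345678900",
    "sku": "OAT-1L",
    "brand": {"name": "Example Oats"},
    "offers": {
        "price": "3.50",
        "priceCurrency": "GBP",
        "availability": "https://schema.org/InStock",
        "hasMerchantReturnPolicy": {
            "merchantReturnDays": 30,
            "returnFees": "https://schema.org/FreeReturn",
            "returnMethod": "https://schema.org/ReturnByMail",
        },
    },
}
quote_only = {"sku": "SOFA-1", "offers": {"price": "Call for quote"}}

print(axis_signals(oat_milk))
print(axis_signals(quote_only))
```

A real scorer would grade each axis from zero to one hundred and look at variant-level availability. The booleans here only show which fields each question depends on.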

The five scores are weighted into a single Agent Purchasability number per product. We chose the weights based on what actually breaks agent commerce in the wild. Priceable and purchasable carry the most weight because they are the immediate blockers. Identifiable matters because it gates comparison, and stockable because it determines selection. Returnable carries the smallest weight because it affects what happens after the purchase rather than the purchase itself.
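A weighted combination of this kind can be sketched in Python as follows. The weight values are invented for the example: they respect the ordering described above but are not the weights Lighthouse uses, which are not published here.

```python
# Hypothetical weights in percentage points; only their ordering comes from the text.
WEIGHTS = {
    "priceable": 25,
    "purchasable": 25,
    "identifiable": 20,
    "stockable": 20,
    "returnable": 10,
}


def purchasability(axis_scores: dict[str, float]) -> float:
    """Combine five 0-100 axis scores into one 0-100 weighted average."""
    missing = WEIGHTS.keys() - axis_scores.keys()
    if missing:
        raise ValueError(f"missing axis scores: {sorted(missing)}")
    total = sum(WEIGHTS[axis] * axis_scores[axis] for axis in WEIGHTS)
    return total / sum(WEIGHTS.values())


# A product with a partial identifier set and no return policy declared.
example = {
    "priceable": 100,
    "purchasable": 100,
    "identifiable": 50,
    "stockable": 100,
    "returnable": 0,
}
print(purchasability(example))  # 80.0
```

A plain weighted average is the simplest choice. A scorer that treats priceable or purchasable as hard blockers might instead cap the total when either axis scores zero.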

You see the score on every row of your product catalogue table. Click into a row and you get the breakdown: which axis dragged the score down, what specifically was missing, and what the agent would do as a result.

The hidden constraints

The most useful thing we discovered while building this is how much of a typical catalogue is not actually agent-purchasable, even when the schema looks fine. We detect six patterns that silently break agent commerce.

  • B2B / wholesale. Products with names like "Case of 24" or "Wholesale only" or "Trade pack" sit alongside retail SKUs in the same feed. An agent buying for a consumer will check out a single unit and the merchant will reject the order because of minimum quantity rules.
  • Prescription / regulated. Anything requiring a verified prescription, a healthcare practitioner identifier, or controlled substance handling. The agent has no way to provide these, and typical product markup has no field that declares the requirement.
  • Call for price. Products that require a quote, a phone call, or a sales conversation. Common in furniture, B2B services, and luxury items. From an agent's perspective these may as well not be on sale.
  • Age-restricted. Alcohol, tobacco, vaping, firearms, certain cosmetics. The checkout will demand identity verification that an autonomous agent cannot provide. The product looks purchasable on the page and is not.
  • Made-to-order. Bespoke jewellery, custom furniture, personalised items. Variable lead time and bespoke pricing make these incompatible with agents acting under a fixed spending mandate.
  • Subscription-only. Some products only ship as part of a recurring subscription. An agent attempting a one-time purchase will be redirected into a subscription flow it has no authority to complete.

We detect these from the product copy itself: the title, the description, the categories, the attributes. Heuristic detection is imperfect, so we mark the flags as suggestions and let you review them. But the typical scan flags a surprising fraction of the catalogue, and those products are exactly the ones a merchant would otherwise spend agent traffic on without realising the conversion is structurally impossible.
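Copy-based detection of this kind can be approximated with keyword patterns. The Python sketch below is a deliberately small illustration, not the detector described above: the pattern lists, flag names, and the function suggest_flags are all made up for the example, and a real detector would need far broader vocabulary plus the categories and attributes.

```python
import re

# Hypothetical keyword patterns, one per constraint type.
CONSTRAINT_PATTERNS = {
    "b2b_wholesale": r"\b(case of \d+|wholesale|trade pack)\b",
    "prescription": r"\b(prescription|controlled substance)\b",
    "call_for_price": r"\b(call for (price|quote)|price on request)\b",
    "age_restricted": r"\b(over 18|adults only|age verification|proof of age)\b",
    "made_to_order": r"\b(made[- ]to[- ]order|bespoke|personalised|custom[- ]made)\b",
    "subscription_only": r"\b(subscription only|subscribe to buy)\b",
}
COMPILED = {name: re.compile(pattern) for name, pattern in CONSTRAINT_PATTERNS.items()}


def suggest_flags(title: str, description: str = "") -> list[str]:
    """Return constraint flags suggested by the product copy, for human review."""
    text = f"{title} {description}".lower()
    return sorted(name for name, pattern in COMPILED.items() if pattern.search(text))


print(suggest_flags("Oat Milk Barista, Case of 24", "Wholesale only"))
print(suggest_flags("Bespoke engagement ring", "Call for quote"))
print(suggest_flags("Organic Oat Milk 1L"))
```

Keyword matching produces false positives, for example copy that says "no prescription needed", which is why the output is treated as a suggestion and not a verdict.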

What an industry's catalogue actually looks like

Because we scan many merchants, we can compute what a healthy catalogue looks like for a category and tell you where you sit against your peers. This is the part no individual merchant tool can offer, because no individual merchant has the comparison data.

The numbers are sometimes surprising. Premium beauty retailers in the dataset have catalogues with the deepest product content (descriptions averaging four hundred words, full ingredient declarations, comprehensive schema) but the lowest agent purchasability scores in the sector. The reason is a structural inheritance: global content teams produce excellent product pages, regional checkouts inherit all the friction of the local market, and the agent fails at the second step every time.

By contrast, simpler retailers on standard platforms (Shopify, VTEX, headless setups) often score worse on content depth but better on transactability. The platform handled the agent-friendly checkout for them. They did not have to think about it. This comparison is informational, but it is also strategic. If your competitor on Shopify scores higher than you despite your richer product content, the answer is not more content.

For each catalogue, we now show you how your GTIN coverage, brand coverage, return policy coverage, and average token cost sit relative to your category peers. You will not see a competitor's name. You will see whether you are above or below the median, and by how much.
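The comparison itself is simple arithmetic. As a sketch, assuming each metric is expressed in percentage points and the peer values are available as an anonymised list, the gap to the category median could be computed in Python like this. The figures and function names are invented for the example.

```python
from statistics import median


def gap_to_median(own: float, peers: list[float]) -> float:
    """Signed distance from the category median, in the metric's own units."""
    return own - median(peers)


def describe(metric: str, own: float, peers: list[float]) -> str:
    """One-line summary that names no competitor, only the position."""
    gap = gap_to_median(own, peers)
    if gap == 0:
        return f"{metric}: at the category median"
    side = "above" if gap > 0 else "below"
    return f"{metric}: {abs(gap):g} points {side} the category median"


# Made-up peer data for illustration.
peer_gtin_coverage = [55, 70, 72, 80, 90]
print(describe("GTIN coverage", 64, peer_gtin_coverage))
```

The median is used instead of the mean so that one unusually strong or weak peer does not move the benchmark.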

Token cost: the new dimension nobody is measuring

An LLM-based agent reads your catalogue at a literal cost. Every product description, every attribute, every category breadcrumb counts as input tokens, and the agent's principal pays for those tokens. A catalogue with verbose product descriptions and bloated metadata is a catalogue that costs more for the agent to comparison-shop. Multiply that across all of an agent's candidate products for a single query, and the cost difference between a lean catalogue and a verbose one is meaningful.

We compute an estimated token cost for each product, sum it for the catalogue, and compare it against your category. A merchant whose products average twice as many tokens as their peers is, in the agent's economy, twice as expensive to consider. Agents under cost pressure will deprioritise expensive catalogues.
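A rough version of that estimate can be sketched in Python. The four-characters-per-token ratio is a common rule of thumb for English text and is an assumption here: real tokenisers vary by model and language, and the text above does not say which estimator is used. The function names are hypothetical.

```python
import math

# Assumption: roughly four characters per token for English text.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Approximate the number of input tokens an agent spends reading this text."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def product_token_cost(product: dict[str, str]) -> int:
    """Sum the estimate over every text field the agent would read."""
    return sum(estimate_tokens(value) for value in product.values())


def relative_cost(own_products: list[dict[str, str]], peer_average_tokens: float) -> float:
    """Own average tokens per product divided by the category average."""
    own_average = sum(product_token_cost(p) for p in own_products) / len(own_products)
    return own_average / peer_average_tokens


# Made-up products: same item, lean versus verbose description.
lean = {"title": "Organic Oat Milk 1L", "description": "Unsweetened. Gluten free."}
verbose = {"title": "Organic Oat Milk 1L", "description": "Unsweetened. Gluten free. " * 10}
print(product_token_cost(lean), product_token_cost(verbose))
```

A result of 2.0 from relative_cost corresponds to the "twice as expensive to consider" case described above.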

The fix here is rarely "write less". It is usually "write less marketing copy in the structured fields". Marketing prose belongs in the description for human consumption. Schema fields should carry only the data an agent needs.

Where this is going

The Agent Purchasability Score is what we can measure today from a typical e-commerce site. None of the data we score against was designed for agents. We are inferring from schema that was built for search engines, from content meant for humans, from APIs designed for the merchant's own apps. The fidelity will only get so high.

What is needed is a feed format that declares per-product agent terms directly. Whether each product is purchasable by an agent. Which agent payment protocols it accepts. Whether it is eligible for spending mandates. What its replenishment terms are. What products may substitute for it. None of this exists in any current catalogue standard.

So we are proposing one. The Agent-Optimised Catalogue Feed (AOCF) is a working draft of an open extension to the Google Merchant Center feed format that adds these fields. It is published openly under CC BY 4.0. We want it shipped, broken, criticised, and improved by anyone willing to engage with it. The frame is identity-first, polite, and verifiable: the same principles as the agent identity work we have been doing on the scanner side.
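To give a feel for what per-product agent terms might look like to a consuming agent, here is a purely illustrative Python sketch. Every field name, the protocol string, and the function below are invented for this example and are not taken from the AOCF draft. Consult the draft itself for the actual field definitions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AgentTerms:
    """Hypothetical per-product agent terms; field names are not from any spec."""
    product_id: str
    agent_purchasable: bool
    payment_protocols: list[str] = field(default_factory=list)
    mandate_eligible: bool = False
    replenishment_terms: Optional[str] = None
    substitutes: list[str] = field(default_factory=list)


def agent_can_attempt(terms: AgentTerms, protocol: str, under_mandate: bool) -> bool:
    """Decide before checkout whether a purchase attempt is worth making."""
    if not terms.agent_purchasable:
        return False
    if protocol not in terms.payment_protocols:
        return False
    # An agent bound by a spending mandate needs the product to be eligible.
    return terms.mandate_eligible or not under_mandate


# Made-up record with a placeholder protocol name.
oat_milk_terms = AgentTerms(
    product_id="OAT-1L",
    agent_purchasable=True,
    payment_protocols=["example-protocol"],
    mandate_eligible=True,
    substitutes=["OAT-1L-BARISTA"],
)
print(agent_can_attempt(oat_milk_terms, "example-protocol", under_mandate=True))
```

The point of declaring terms like these in the feed is that the agent can rule a product in or out before it reaches checkout, instead of discovering the constraint when the purchase fails.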

If you operate an e-commerce catalogue and are thinking about agentic commerce seriously, the most useful thing you can do this quarter is run a scan, look at the products in the lower tiers, and pick three to investigate. Almost always at least one of them turns out to be a structural problem you did not know you had. The other two are quick fixes. None of this is dramatic engineering. It is just measurement, applied at the level the agents actually operate at.
