AR advertising is now the highest-volume consumer XR surface on the planet. A 3D billboard above a Tokyo crossing, a virtual lipstick layered over a TikTok selfie, a QR-anchored AR card in a newspaper insert: these are the formats that have actually reached scale, and they share an unglamorous engineering reality. The experience lives or dies in the first second on a cold device. Teams that author the experience like a film miss that budget and lose the audience before anything renders. Teams that design for the cold-start path keep the user long enough for the brand impression to land. That distinction is what separates AR advertising that drives measurable lift from AR advertising that drives novelty taps. We have seen this pattern repeatedly across consumer GPU pipelines, and the production stack behind it is more constrained than the marketing decks suggest.

What is an AR billboard actually doing?

The viral 3D billboard format (the wave crashing out of a Shinjuku screen, the cat leaping over a Seoul intersection) is not technically AR in the headset sense. It is forced-perspective (anamorphic) content played on a curved LED wall, rendered once and shipped as video. The “augmented” effect comes from the corner geometry of the display and the camera angle of whoever is filming it for social media. The clip then travels as user-generated content, which is where the real reach happens.

The actual AR layer sits one step closer to the phone. When a viewer scans a QR code on the billboard, or points their camera at a printed marker, the device pulls a WebAR or app-bound experience that overlays 3D content on the live camera feed. This is where engineering pressure concentrates. The 3D billboard itself is a rendering problem solved in post-production. The AR companion is a real-time problem solved on whatever GPU the user happens to be holding.

That split matters because the two pieces have different failure modes.
The billboard fails if the forced-perspective geometry is off, which is a fixable, one-time issue. The AR companion fails if cold-start time-to-first-frame exceeds the user’s patience, which on a mid-tier Android in 2025 is roughly two seconds before the thumb moves on. Authoring the experience without instrumenting that budget is the most common error in the format.

How the cosmetics try-on pipeline actually runs

Virtual makeup is the most engineered slice of AR advertising because it has a direct conversion signal: did the shopper add the shade to cart. The pipeline behind it is a chain of CV models that each have to clear a frame-time budget on consumer hardware. A representative stack:

| Stage | Typical model class | Frame budget on mid-tier mobile GPU |
|---|---|---|
| Face detection | Lightweight SSD or BlazeFace variant | 3–5 ms |
| Landmark regression | 468-point mesh (MediaPipe-style) | 4–7 ms |
| Segmentation (lips, eyes, hair) | U-Net derivative, quantised | 6–10 ms |
| Material shading and blend | Custom GLSL/Metal shader | 4–6 ms |
| Tracking smoothing | Kalman or one-euro filter | <1 ms |

These are observed ranges across the AR beauty stacks we have profiled, not benchmarked figures from a named vendor. The total budget needs to land under 33 ms to hit 30 fps, and under 16 ms to feel like a mirror. Drop a frame consistently and the shade swatch looks like it is sliding off the lip. Drop several and the user closes the experience.

The non-obvious constraint is asset streaming order. A try-on session loads model weights, shader programs, environment maps, and the catalogue of shades the brand wants to feature. If the loader pulls the catalogue before the inference graph is warm, the first interaction stutters. If it pulls the inference graph first and streams shades on demand, the first shade tap pauses. The right order is brand-specific and has to be measured per device class: a Pixel 8 and a Galaxy A14 produce different curves even on the same WebGL implementation.

Which AR ads actually drive ROI versus novelty?
The honest answer is that two formats have produced repeatable commercial outcomes, and the rest are still in the novelty band.

The first is virtual try-on integrated into the e-commerce funnel. When the try-on widget sits on the product detail page and the shade selector is the same control that adds to cart, the AR layer becomes a conversion tool rather than a marketing surface. Brands that have wired it this way (across cosmetics, eyewear, and watches) report conversion-rate lifts on the order of 20–40% for users who engage the AR, though attribution is messy because AR-engaging users self-select for higher purchase intent. This is an industry-reported observed pattern, not a controlled benchmark.

The second is social-platform AR effects with shoppable links. Snapchat Lenses, Instagram Spark AR (now sunset for third parties but still relevant historically), and TikTok Effect House have all produced campaigns where the AR experience itself is the discovery channel and the link out is the conversion. The structural advantage is that the platform handles the runtime, so the cold-start budget is amortised across a session the user is already in.

The novelty band (standalone branded AR apps, marker-based magazine inserts, one-off billboard tie-ins without a follow-through experience) generates impressions and PR but rarely produces a measurable funnel. The asset budget and engineering cost per impression are too high relative to the conversion path.

Device fragmentation and the cold-start path

A practitioner-grade AR ad stack assumes three fallback tiers, not one. The top tier is recent flagship devices (recent iPhones, top Android SoCs) that can run the full segmentation-and-shader pipeline at native frame rates. The middle tier, which is most of the actual audience, can run a reduced pipeline: lower-resolution segmentation, simplified shader, fewer simultaneous tracked features.
The bottom tier should not run the AR experience at all, and should instead fall through to a video preview of what the AR would look like, with a clear “view on a newer device” path.

The mistake we have seen most often is shipping the experience tuned for the top tier and letting the middle and bottom tiers degrade silently. The user does not know the experience is supposed to look better; they just see a janky lipstick and close the tab. A per-tier rendering fallback, decided at session start based on a quick GPU capability probe, retains far more users than a single-pipeline build with optimistic defaults.

The cold-start path needs the same tiering. On a warm device with cached assets, time-to-first-frame can be under a second. On a cold device on a 4G connection, the same pipeline can take five to eight seconds, well past the abandonment threshold. Pre-warming the inference graph with a placeholder texture while the real shade catalogue streams in is one of the few techniques that consistently moves this number.

Where this is evolving

Three directions are worth tracking, in order of how close they are to production.

Generative try-on is the most immediate. Instead of pre-modelling every shade as a shader parameter set, recent diffusion-based pipelines can render arbitrary makeup descriptions onto a face mesh in near-real-time. The frame budget is not there yet on mobile (current generative try-on still runs server-side with the result streamed back), but the gap is closing fast.

Personalised AR creative sits one step behind. The idea is that the AR experience adapts to the viewer’s prior signals: a returning shopper sees their saved shades pre-loaded, a first-time visitor sees the bestsellers. This is straightforward in principle and gated mostly by attribution plumbing rather than rendering.

Social-AR-native commerce, where the entire purchase happens inside the platform AR session without a redirect, is the structural endgame.
Snap and TikTok have both shipped pieces of this. Whether it consolidates into a dominant pattern depends on platform decisions outside any single brand’s control.

What remained imperfect

Several pieces of this stack are still uncomfortable in practice. Cross-device colour fidelity for cosmetics try-on remains poor: the same shade rendered on two phones with different display calibrations and ambient-light sensors looks like two different products, and there is no clean solution short of per-device colour profiles that no brand has built. Attribution from AR engagement to revenue is still proxy-heavy; the “users who engaged AR converted 30% better” framing conflates causation with self-selection, and we do not have a clean controlled-experiment design that survives contact with a real campaign launch. Generative try-on is impressive in demos but introduces a new failure mode: the model occasionally hallucinates makeup the brand does not sell, which is a legal problem the rendering team cannot fix alone. And the cold-start budget on bottom-tier devices is, frankly, never going to be solved within the AR layer; it has to be solved by deciding earlier in the funnel which users see the AR at all. These are not blockers, but they are real, and any team telling you the stack is solved is selling something.

The teams getting AR advertising right are the ones treating it as a GPU and pipeline engineering problem with a marketing surface on top, not the other way around. We see this gap most clearly when an experience is authored for the brand deck and then handed to engineering to “make it run on phones.” It rarely runs on phones. The experiences that do run start with the cold-start budget and work backwards. For the broader argument about why GPU-bound consumer experiences need that inversion, our GPU engineering practice page covers the audit pattern we use to instrument it.
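The session-start tier decision described under device fragmentation above can be sketched as one small function. This is a minimal illustration in Python, not a production implementation; a real WebAR build would make the same decision in its own runtime. The `GpuProbe` fields, the `select_tier` name, and every threshold here are hypothetical placeholders that would have to be tuned from per-device measurements.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    FULL = "full"        # full segmentation-and-shader pipeline
    REDUCED = "reduced"  # lower-resolution segmentation, simplified shader
    VIDEO = "video"      # no live AR: fall through to the video preview


@dataclass(frozen=True)
class GpuProbe:
    # Hypothetical probe results gathered once at session start.
    probe_frame_ms: float  # time to run one small warm-up inference and draw
    max_texture_size: int  # largest texture edge the graphics API reports


# Hypothetical thresholds, chosen only to make the sketch concrete.
FULL_MAX_PROBE_MS = 16.0
REDUCED_MAX_PROBE_MS = 33.0
MIN_TEXTURE_SIZE = 4096


def select_tier(probe: GpuProbe) -> Tier:
    """Pick a rendering tier once, instead of degrading silently later."""
    if probe.max_texture_size < MIN_TEXTURE_SIZE:
        return Tier.VIDEO
    if probe.probe_frame_ms <= FULL_MAX_PROBE_MS:
        return Tier.FULL
    if probe.probe_frame_ms <= REDUCED_MAX_PROBE_MS:
        return Tier.REDUCED
    return Tier.VIDEO


if __name__ == "__main__":
    for probe in (GpuProbe(12.0, 8192), GpuProbe(25.0, 4096), GpuProbe(60.0, 4096)):
        print(probe, "->", select_tier(probe).value)
```

The point of the design is that the bottom tier is an explicit outcome of the probe rather than an accident of a slow device, so the video preview can be served before any AR assets are requested.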
FAQ

What are the production patterns for AR advertising — billboards, social filters, native ads?

Three patterns dominate: forced-perspective 3D content on curved LED displays (the “billboard” format, which is a rendering problem solved in post), real-time camera-overlay AR delivered via WebAR or platform SDKs (Snap, TikTok, Meta), and QR/marker-anchored experiences that bridge print or out-of-home to a phone session. Each has different cold-start, fragmentation, and conversion-path profiles.

How does AR beauty try-on integrate measurably into a brand’s e-commerce funnel?

The integration that produces measurable lift is the one where the try-on widget sits directly on the product detail page and the shade selector is the same control that adds to cart. When AR is a separate destination, conversion attribution collapses; when it is wired into the cart action, the funnel data is clean enough to act on. Brands report conversion-rate lifts of 20–40% for users who engage the AR, though that figure includes self-selection bias.

Which AR advertising examples actually drive ROI versus novelty engagement?

The two formats with repeatable commercial outcomes are virtual try-on integrated into e-commerce funnels and social-platform AR effects with shoppable links. Standalone branded AR apps and one-off magazine or billboard tie-ins generate impressions but rarely produce a measurable funnel, because the engineering cost per impression is too high relative to the conversion path.

What CV pipeline runs behind virtual makeup, hair, and skincare try-on at scale?

A chained stack: face detection (lightweight SSD-class), 468-point landmark regression, segmentation of lips/eyes/hair via a quantised U-Net derivative, custom shader-based material blending, and a Kalman or one-euro smoothing filter. Total frame budget needs to land under 33 ms for 30 fps on mid-tier mobile GPUs, and asset streaming order matters as much as model latency.
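As a rough check on that budget, the per-stage ranges from the stage table earlier in the article can be summed and compared against the frame time available at a target frame rate. This is a minimal sketch in Python, assuming the stages run back to back within one frame with no overlap (real pipelines may overlap CPU and GPU work). The names `STAGE_RANGE_MS`, `total_range_ms`, and `frame_time_ms` are illustrative, and the "<1 ms" smoothing stage is treated as 0 to 1 ms.

```python
# Per-stage (best, worst) frame times in milliseconds, copied from the
# article's stage table. Assumption: "<1 ms" is modelled as 0.0 to 1.0 ms.
STAGE_RANGE_MS = {
    "face_detection": (3.0, 5.0),
    "landmark_regression": (4.0, 7.0),
    "segmentation": (6.0, 10.0),
    "shading_and_blend": (4.0, 6.0),
    "tracking_smoothing": (0.0, 1.0),
}


def total_range_ms(stages):
    """Sum best-case and worst-case stage times, assuming serial execution."""
    best = sum(lo for lo, _ in stages.values())
    worst = sum(hi for _, hi in stages.values())
    return best, worst


def frame_time_ms(fps):
    """Time available per frame at a given frame rate."""
    return 1000.0 / fps


def fits(total_ms, fps):
    """True if a total pipeline time fits inside one frame at this rate."""
    return total_ms < frame_time_ms(fps)


if __name__ == "__main__":
    best, worst = total_range_ms(STAGE_RANGE_MS)
    for fps in (30, 60):
        print(f"{fps} fps: frame time {frame_time_ms(fps):.1f} ms, "
              f"best case fits={fits(best, fps)}, worst case fits={fits(worst, fps)}")
```

Under these assumptions the worst case (29 ms) fits the 33 ms budget for 30 fps, while even the best case (17 ms) sits slightly over the roughly 16.7 ms available at 60 fps. That suggests the mirror-like target would need a reduced pipeline or overlapped stages rather than the full serial chain.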
How do AR newspaper and billboard ads handle device fragmentation and cold-start UX?

The robust pattern is three explicit rendering tiers (flagship, mid-tier, and bottom-tier) decided at session start via a GPU capability probe. The bottom tier should not run the AR experience at all; it should fall through to a video preview. Cold-start time-to-first-frame budgets need to be measured per tier on cold devices on slow connections, not on warm cached test phones.

Where are AR beauty and advertising applications evolving — generative try-on, personalization, social integration?

Three directions: generative try-on (currently server-side, frame budget not yet there for on-device), personalised AR creative gated more by attribution plumbing than rendering, and social-AR-native commerce where the purchase happens inside the platform session. The first is the closest to production; the third has the most structural upside but depends on platform decisions outside any single brand’s control.