AI-Enabled Medical Devices: The Computer Vision Layer Behind FDA-Cleared Tools

The FDA has cleared close to a thousand AI-enabled medical devices, and the overwhelming majority sit on a computer vision pipeline. Strip the marketing away and the same patterns recur: a classifier or segmentation network, a locked model version, a validation cohort that mirrors the intended population, and a post-market surveillance plan that catches drift before it harms patients. The technical novelty is real, but it lives inside a regulatory frame that consumer-grade CV teams rarely encounter.

That frame changes how the pipeline is designed. Medical-device CV is not “computer vision with extra paperwork.” It is computer vision where every architectural choice (augmentation strategy, calibration step, whether the model is allowed to learn on production data) gets read as a regulatory artefact. Teams that internalise this early ship cleared products faster. Teams that retrofit compliance at submission usually rebuild large parts of the system.

What “AI-enabled medical device” actually means under FDA SaMD rules

The FDA classifies software that performs a medical function as Software as a Medical Device (SaMD). When that software uses AI, typically a CV model on radiology, pathology, dermatology, or ophthalmology images, it falls under the agency’s evolving AI/ML framework. The clearance pathway is usually 510(k), occasionally De Novo for novel intended uses, and rarely PMA for high-risk applications.

The constraint that shapes engineering most is lock-and-key model versioning. The cleared device corresponds to one specific set of weights, validated against one specific dataset, for one specific intended use. Any meaningful change (retraining, an architecture swap, new training data, even some hyperparameter shifts) is a regulatory event that may require a new submission.
The FDA’s Predetermined Change Control Plan (PCCP) framework, formalised in 2024, lets sponsors pre-declare a bounded change envelope, but it does not remove the obligation; it just front-loads it.

This is why a medical-device CV team’s repository looks different from a consumer CV team’s. The model registry is the source of regulatory truth. Training data lineage, augmentation seeds, evaluation splits, and threshold selection are all artefacts that must be reproducible years later when a post-market issue surfaces.

Where CV is already FDA-cleared: the device categories

Across cleared devices, computer vision shows up in a small number of recurring patterns:

| Device category | Typical CV task | Representative cleared examples |
| --- | --- | --- |
| Radiology assist (CADe/CADx) | Detection + classification on CT, MRI, X-ray | Aidoc, Viz.ai (stroke), Zebra Medical Vision |
| Ophthalmology screening | Classification on retinal fundus images | IDx-DR (diabetic retinopathy), EyeArt |
| Dermatology triage | Classification on dermoscopic/clinical images | SkinVision, several 510(k)-cleared triage tools |
| Pathology | Segmentation + classification on whole-slide images | Paige Prostate, Ibex |
| Cardiology imaging | Segmentation + measurement on echocardiograms | Caption Health, Ultromics |
| Endoscopy | Real-time detection on video frames | GI Genius (polyp detection) |

These categories and examples are taken from the FDA’s public AI/ML-enabled device list; they are not estimates. The pattern that recurs is unsurprising once you see it: CV models are clearing fastest in domains where the input modality is already digital, the ground truth is well-defined by expert consensus, and the clinical decision is bounded (assist, screen, or triage, rarely autonomous diagnosis).

What validation evidence each device required

The validation evidence that supports clearance is not a single benchmark number. It is a structured argument that the model performs as intended across the population it will see in deployment. In practice this means three layers.
First, analytical validation: how the model behaves on a fixed, locked test set with known ground truth. This is the number that ends up in marketing materials: sensitivity, specificity, AUC. It is the easiest layer to produce and the least informative on its own.

Second, clinical validation: how the model performs against a clinical reference standard in a cohort that resembles the intended-use population. This is where most submissions either succeed or get pushed back. The cohort must reflect realistic distributions of age, sex, ethnicity, scanner manufacturer, acquisition protocol, and disease prevalence. A model trained mostly on Siemens CT scans from one academic hospital will not generalise to the GE and Philips scanners in community hospitals, and the FDA reviewers know this.

Third, clinical utility evidence: does the device improve a patient-relevant outcome such as reader accuracy, time-to-diagnosis, or missed-finding rate? This is observed-pattern evidence from our experience across medical-device engagements: programmes that design the clinical-utility study before training the model end up with cleaner submissions than programmes that train first and then look for a study that the model “wins.”

The operational constraints that distinguish medical-device CV

Several engineering decisions look optional in consumer CV and are non-negotiable in medical-device CV.

Deterministic inference. The same input must produce the same output, every time, on the cleared hardware target. This rules out non-deterministic GPU kernels, certain attention implementations, and some forms of mixed-precision arithmetic unless tested across the full input distribution. Frameworks like PyTorch and TensorRT support deterministic modes, but they cost throughput.

Bounded post-processing. The model’s raw output is rarely the device’s output. Threshold selection, lesion clustering, false-positive suppression: all of this is part of the cleared device and must be locked alongside the weights.
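
One way to make "locked alongside the weights" concrete is to fingerprint the weights and the post-processing parameters together, so that changing either one yields a different version identifier. The sketch below is a minimal illustration under stated assumptions, not a regulatory-grade implementation: `PostProcessingConfig`, `device_fingerprint`, and the three parameter fields are hypothetical names chosen for this example, and the weights are stand-in bytes rather than a real model file.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PostProcessingConfig:
    """Hypothetical post-processing parameters that ship with the weights."""
    score_threshold: float
    nms_iou_threshold: float
    min_lesion_area_mm2: float


def device_fingerprint(weights: bytes, config: PostProcessingConfig) -> str:
    """Return a SHA-256 hex digest covering both weights and post-processing."""
    # Canonical JSON (sorted keys, fixed separators) keeps the digest stable
    # regardless of how the config fields are ordered in memory.
    config_bytes = json.dumps(
        asdict(config), sort_keys=True, separators=(",", ":")
    ).encode("utf-8")
    digest = hashlib.sha256()
    # Hash each part separately first so the boundary between them is fixed.
    digest.update(hashlib.sha256(weights).digest())
    digest.update(hashlib.sha256(config_bytes).digest())
    return digest.hexdigest()


# Stand-in bytes (assumption); a real pipeline would read the serialized model.
weights = b"example-serialized-weights"
cleared = PostProcessingConfig(0.5, 0.3, 4.0)
modified = PostProcessingConfig(0.5, 0.4, 4.0)  # only the NMS IoU differs

cleared_id = device_fingerprint(weights, cleared)
modified_id = device_fingerprint(weights, modified)
print("same device version:", cleared_id == modified_id)
```

With identical weights and a different NMS IoU value, the two identifiers differ, which is the property a model registry needs if post-processing is treated as part of the device.
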
A change to the non-maximum suppression IoU threshold is a model change.

PACS and EHR integration as part of the device. A CV inference engine in isolation is not a medical device. The cleared device usually includes the DICOM intake path, the worklist integration, the rendering of findings back into the radiologist’s PACS viewer, and the audit trail. Integration patterns that work (DICOM secondary capture, structured reports, HL7 FHIR for findings) are well-known, but each integration surface is part of the regulatory submission.

Post-market surveillance. Drift is the silent failure mode. A CV model that performed at 0.94 AUC at clearance may drift to 0.88 over eighteen months as scanner populations shift, protocols update, or patient demographics change at deployed sites. The cleared device’s surveillance plan must detect this, and the team must have a path to mitigate it. That usually means a new submission, because retraining on production data is the kind of change the FDA expects to review formally.

The 6–12 month gap between compliance-first and accuracy-first programmes

Medical-device CV programmes that design for FDA validation evidence from day one (locked test sets defined before training, cohort design reviewed by regulatory counsel early, deterministic inference baked into the architecture) reach cleared-device status six to twelve months faster than programmes that design for accuracy first and bolt compliance on at submission. This is an observed pattern from our medical-device CV engagements, not a benchmarked rate; the exact gap depends on the device class, the novelty of the intended use, and the maturity of the sponsor’s quality management system.

The mechanism is straightforward. Accuracy-first programmes optimise on a dataset that is not the validation cohort, then discover at submission that they need a new validation study to bridge the gap. That study takes 6–12 months to design, run, and report.
Compliance-first programmes simply run the validation study they always knew they would need.

What remained imperfect

We are deliberate about the boundaries of what we have just described. The FDA-cleared count is a moving target: the agency’s public list updates monthly and the categorical breakdown shifts as new device classes (digital pathology, ambient AI in surgery) gain clearance. The 6–12 month gap is an observed pattern across our engagements; it has no published benchmark to cite against, and the variance across programmes is wide enough that some compliance-first programmes still struggle and some accuracy-first programmes manage to recover quickly. Lock-and-key versioning under PCCPs is still relatively new, and how the FDA reviews bounded change envelopes in practice will evolve over the next several review cycles. None of this changes the structural argument, but each is a place where a careful reader should expect the details to move.

How this maps to deep learning in medical CV

The deep-learning techniques themselves (convolutional architectures, vision transformers, U-Net variants for segmentation, multi-task heads for detection-plus-classification) are not the differentiator. The differentiator is how those techniques are wrapped in evidence. We develop this further in deep learning in medical computer vision, which covers the model-architecture side of the same pipeline, and in computer vision advancing modern clinical trials, which looks at the upstream evidence-generation side.

The pattern we keep returning to in medical-device CV work is that the model is the small part. The validation infrastructure, the locked-versioning discipline, the integration surface, and the post-market plan are the device.

FAQ

How many AI-enabled medical devices has the FDA cleared, and which CV patterns recur across them?
As of 2026, the FDA’s public AI/ML-enabled device list contains close to a thousand cleared devices, the majority of which are radiology assist tools, with secondary clusters in ophthalmology, pathology, cardiology imaging, and endoscopy. The recurring CV patterns are detection (CADe), classification (CADx), and segmentation, usually applied to a digital imaging modality with well-defined expert ground truth.

What are the production patterns behind FDA-cleared CV diagnostics (CADe, CADx, radiomics)?
CADe (computer-aided detection) typically uses object-detection architectures over CT, MRI, X-ray, or pathology slides to flag candidate findings. CADx (computer-aided diagnosis) adds a classification head that assigns a likelihood or category to each finding. Radiomics pipelines extract structured features from segmented regions and feed them into downstream models. All three patterns share a locked model version, a defined intended use, and a validation cohort designed to mirror the deployment population.

How does deep learning in medical CV (classification, segmentation, detection) translate into regulatory artefacts?
Every architectural and training choice generates a regulatory artefact: the model card, the training data lineage, the augmentation parameters, the evaluation splits, the threshold selection rationale, and the deterministic-inference configuration. The cleared device is the bundle, not just the weights.

Where do AI medical-device pipelines need to handle generalisability, drift, and population shift?
Generalisability is handled at cohort design: the validation set must span the scanner manufacturers, protocols, demographics, and disease prevalence the device will encounter. Drift and population shift are handled post-market through a surveillance plan that monitors input distributions and output statistics over time, with pre-declared thresholds that trigger investigation or resubmission.
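
As a rough illustration of monitoring output statistics against pre-declared thresholds, the sketch below compares a reference window of model scores with a later window using the population stability index (PSI). PSI is our choice for the example; the text above does not prescribe a statistic. The function names and the 0.10 / 0.25 cut-offs are illustrative assumptions, not FDA-specified values, and the score samples are synthetic.

```python
import math
import random
from typing import List, Sequence


def population_stability_index(
    reference: Sequence[float],
    current: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI between two samples of model output scores assumed to lie in [0, 1]."""

    def proportions(scores: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for s in scores:
            # Clamp so 1.0 lands in the last bin and stray values stay in range.
            idx = min(max(int(s * bins), 0), bins - 1)
            counts[idx] += 1
        # eps avoids log(0) when a bin is empty.
        return [max(c / len(scores), eps) for c in counts]

    ref = proportions(reference)
    cur = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


def surveillance_action(
    psi: float, investigate_at: float = 0.10, escalate_at: float = 0.25
) -> str:
    """Map a PSI value to a pre-declared action (thresholds are assumptions)."""
    if psi >= escalate_at:
        return "escalate"
    if psi >= investigate_at:
        return "investigate"
    return "ok"


# Synthetic score distributions standing in for real inference logs.
rng = random.Random(0)
reference_scores = [rng.betavariate(2, 5) for _ in range(5000)]
stable_window = [rng.betavariate(2, 5) for _ in range(5000)]
shifted_window = [rng.betavariate(5, 2) for _ in range(5000)]

stable_psi = population_stability_index(reference_scores, stable_window)
shifted_psi = population_stability_index(reference_scores, shifted_window)
print("stable window:", surveillance_action(stable_psi))
print("shifted window:", surveillance_action(shifted_psi))
```

The same comparison can be run on input-side summaries (for example, scanner manufacturer mix or pixel-intensity statistics), which tends to surface population shift before output metrics move.
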

What integration patterns connect CV inference to PACS, EHR, and clinical workflow?
DICOM is the universal intake for imaging; the standard outputs are DICOM secondary capture (annotated images), DICOM structured reports, and HL7 FHIR for structured findings that flow into the EHR. Worklist integration, via DICOM Modality Worklist or HL7 v2 order (ORM) and result (ORU) messages, is the usual mechanism for tying inference results to the right study and routing them back into the radiologist’s reading workflow.

Which AI-enabled medical-device companies and products define the current state of practice in 2026?
Aidoc, Viz.ai, IDx-DR, Caption Health, Paige Prostate, GI Genius, and Ultromics are commonly cited examples spanning radiology, ophthalmology, cardiology, pathology, and endoscopy. The set is not exhaustive and shifts as new clearances land each quarter; the FDA’s public AI/ML-enabled device list is the authoritative source.

Medical-device CV programmes succeed when the validation infrastructure is treated as the primary engineering deliverable. The model is the easy part. What remains hard is the discipline of locking it, validating it against the right cohort, integrating it into the clinical workflow without breaking the audit trail, and watching it for drift across the years it will run in production.