How should you choose an edge device? Selecting an embedded device for CV deployment is not primarily a performance benchmarking exercise. The right device depends on the model you are running, the required throughput, the power budget, integration requirements, and what model optimisation your team can execute. A device with twice the raw TOPS may be the wrong choice if it requires a complex toolchain your team cannot maintain.

This article covers the four most commonly deployed embedded CV platforms — NVIDIA Jetson, Google Coral TPU, Hailo, and OAK-D — with honest assessments of where each works well and where it does not. For the broader edge deployment context, see how to deploy CV models on edge devices.

## Platform comparison overview

| Platform | Compute type | Peak performance | Power | Model format | Ecosystem maturity |
|---|---|---|---|---|---|
| NVIDIA Jetson Orin Nano | GPU + CPU + DLA | 40 TOPS | 7–15 W | TensorRT, ONNX, PyTorch | High |
| NVIDIA Jetson Orin NX | GPU + CPU + DLA | 100 TOPS | 10–25 W | TensorRT, ONNX, PyTorch | High |
| Google Coral Dev Board | Edge TPU | 4 TOPS (TPU) | 2–4 W | TFLite (quantised) | Moderate |
| Hailo-8 M.2 | Neural processor | 26 TOPS | 2.5 W | Hailo Dataflow Compiler | Moderate |
| Hailo-8L | Neural processor | 13 TOPS | 1.5 W | Hailo Dataflow Compiler | Moderate |
| OAK-D (Myriad X VPU) | VPU | 4 TOPS | 2–4 W | OpenVINO IR | Moderate |

TOPS (tera operations per second) figures are not directly comparable across hardware architectures — a GPU TOPS does not equal a TPU TOPS for the same model. Benchmark your specific model on the specific device.

## NVIDIA Jetson

Jetson is the most flexible embedded CV platform and has the most mature ecosystem. It runs standard PyTorch and TensorFlow, supports CUDA, and provides TensorRT for optimised deployment. Models developed in a standard GPU training environment deploy to Jetson with minimal code changes.
**Strengths:**

- Full CUDA support — standard GPU deep learning code runs with minimal modification
- TensorRT provides substantial throughput improvement (typically 3–5×) over native PyTorch
- JetPack SDK provides a complete environment (CUDA, cuDNN, TensorRT, DeepStream)
- Wide model support — virtually any architecture that runs on a desktop GPU runs on Jetson
- Camera and sensor integration well-supported (CSI cameras, USB cameras, RTSP streams)

**Weaknesses:**

- Higher power consumption than TPU/NPU alternatives (7–25 W vs 1.5–4 W)
- Higher cost: the Jetson Orin Nano module is ~$150–200, with the carrier board an additional cost
- Thermal management required — sustained inference loads need active cooling or will thermally throttle

**Where Jetson is the right choice:** when you need model flexibility, when power is not severely constrained, when the team develops in PyTorch, or when the application requires running multiple models simultaneously (detection + tracking + classification).

## Google Coral TPU

The Coral Edge TPU is a purpose-built inference accelerator optimised for quantised TFLite models. It is extremely power-efficient, drawing 2–4 W for the development board.
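Coral's INT8-only requirement means every model goes through full-integer post-training quantisation before the Edge TPU compiler step. A sketch of that conversion with the TFLite converter, assuming a toy Keras model and random calibration samples standing in for real deployment images:

```python
import numpy as np
import tensorflow as tf

# Toy stand-in model; a real deployment starts from a trained network.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(96, 96, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(2),
])

def representative_data():
    # Calibration samples used to pick quantisation ranges;
    # in practice, ~100-1000 real deployment images.
    for _ in range(10):
        yield [np.random.rand(1, 96, 96, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
# Enforce full-integer quantisation; the Edge TPU cannot fall back to float.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

tflite_model = converter.convert()
with open("model_int8.tflite", "wb") as f:
    f.write(tflite_model)

# Final step on the host: compile for the TPU with
#   edgetpu_compiler model_int8.tflite
```

If `convert()` fails here, it usually means an unquantisable or unsupported op in the graph — the same constraint that rules out many larger architectures on Coral.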
**Strengths:**

- Very low power consumption — the best option for battery-powered applications
- Fast inference for models that fit fully on-chip (sub-5 ms for MobileNet-class models)
- USB Accelerator form factor allows adding acceleration to existing compute (Raspberry Pi, laptop)
- Low cost: the USB accelerator is ~$60

**Weaknesses:**

- Hard requirement for INT8-quantised TFLite models — FP32 models are not supported
- Models must fit entirely in the TPU's 8 MB SRAM for maximum performance — larger models execute partially on the CPU, significantly degrading throughput
- Limited to TFLite model graph operations — custom operations are not supported
- The toolchain is more constrained than Jetson's — not all architectures are supported efficiently

**Where Coral is the right choice:** lightweight inference (MobileNet, EfficientDet-Lite, lightweight YOLO variants) on battery-powered or power-constrained hardware where INT8 quantisation is acceptable and the model fits within TPU memory.

## Hailo

Hailo produces dedicated neural processing units (NPUs) available as M.2 modules that add AI acceleration to existing compute. The Hailo-8 offers 26 TOPS at 2.5 W — competitive power efficiency with much higher throughput than Coral.

**Strengths:**

- Outstanding TOPS-per-watt — among the best available for embedded AI
- M.2 form factor integrates with standard SBCs (Raspberry Pi 5 with the M.2 HAT, industrial computers)
- Supports a wide range of architectures, including YOLOv5/v8, ResNet, and EfficientDet

**Weaknesses:**

- The Hailo Dataflow Compiler is more complex than TensorRT or TFLite — it requires graph compilation and optimisation specific to the Hailo architecture
- The ecosystem is less mature than Jetson's — fewer pre-compiled models and a smaller community
- Debugging deployment issues is harder than on Jetson because documentation is thinner

**Where Hailo is the right choice:** high-throughput, power-constrained deployments (solar-powered cameras, edge nodes with limited power infrastructure) where Coral's throughput is insufficient and Jetson's power consumption is prohibitive.
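The efficiency claim is easy to sanity-check from the comparison table's own figures. A quick calculation (using each device's upper-bound power figure from the table; peak TOPS are vendor numbers and only roughly comparable across architectures):

```python
# (peak TOPS, power in watts) from the comparison table above;
# upper-bound power is used where the table gives a range.
platforms = {
    "Jetson Orin Nano": (40, 15),
    "Jetson Orin NX": (100, 25),
    "Coral Edge TPU": (4, 4),
    "Hailo-8": (26, 2.5),
    "Hailo-8L": (13, 1.5),
    "OAK-D (Myriad X)": (4, 4),
}

# Rank by nominal TOPS-per-watt, best first.
for name, (tops, watts) in sorted(
    platforms.items(), key=lambda kv: -(kv[1][0] / kv[1][1])
):
    print(f"{name:20s} {tops / watts:5.1f} TOPS/W")
```

The Hailo parts lead this ranking by a wide margin (the Hailo-8 works out to roughly 10 TOPS/W), which is the basis for the claim above; actual throughput per watt on your model can differ substantially.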
Hailo is commonly used in smart camera applications and edge AI appliances.

## OAK-D (OpenCV AI Kit with Depth)

The OAK-D combines a Myriad X VPU with an RGB camera and stereo depth cameras in a single integrated unit. It is designed to simplify the hardware integration challenge for depth-aware CV.

**Strengths:**

- Integrated RGB + stereo depth — useful for 3D detection, obstacle avoidance, and spatial AI
- The DepthAI SDK simplifies model deployment
- OpenVINO-based inference with hardware acceleration
- USB connectivity — simple integration with host systems

**Weaknesses:**

- Fixed camera configuration — not suitable when a custom camera setup is required
- The Myriad X VPU has lower throughput than Hailo or Jetson for most workloads
- The OpenVINO IR model format adds a conversion step
- Platform development is less active as Intel restructures

**Where OAK-D is the right choice:** applications requiring integrated depth estimation alongside RGB detection — robotics, pick-and-place automation, obstacle detection. It is not the right choice for high-throughput video analytics.

## INT8 quantisation checklist

Required for Coral; recommended for Hailo and Jetson:

- Calibration dataset prepared (a representative subset of deployment images, typically 100–1,000 samples)
- Post-training quantisation (PTQ) applied and accuracy validated on a held-out test set
- Accuracy drop assessed — target <1–2% mAP drop for detection and <1% accuracy drop for classification
- If the PTQ accuracy drop is unacceptable: quantisation-aware training (QAT) applied
- Model exported to the target format (TFLite for Coral, ONNX/TensorRT for Hailo/Jetson)
- Inference outputs validated against the FP32 baseline on the same inputs

## Platform selection decision guide

In our experience, platform selection reduces to four questions:

- Is the team working in PyTorch and needs model flexibility? → Jetson
- Is power the primary constraint and the model lightweight? → Coral TPU
- Is power the primary constraint and the model requires higher throughput? → Hailo-8
- Does the application require integrated depth? → OAK-D

Across our deployments, Jetson is the most common choice for industrial and commercial CV applications where power and cost allow it, because its lower deployment friction and larger ecosystem reduce project risk more than the power and cost savings of the alternatives justify. Coral and Hailo are the right choices for deployments that genuinely require their power efficiency — high-volume camera networks, battery-powered systems, and scenarios where dozens or hundreds of devices must be deployed at low unit cost.
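The questions above can be encoded as a small first-pass filter. This helper is illustrative only — the flag names and returned platform strings are choices made for this sketch, and a real selection still weighs cost, toolchain skills, thermal limits, and per-model benchmarks:

```python
def pick_platform(needs_depth: bool, power_constrained: bool,
                  lightweight_model: bool) -> str:
    """First-pass platform suggestion from the decision guide above."""
    if needs_depth:
        # Integrated RGB + stereo depth in one unit.
        return "OAK-D"
    if power_constrained:
        # Lightweight INT8-friendly models fit Coral;
        # heavier workloads need Hailo-8's throughput.
        return "Coral TPU" if lightweight_model else "Hailo-8"
    # Power and cost allow it: the ecosystem advantage wins.
    return "Jetson"

print(pick_platform(needs_depth=False, power_constrained=True,
                    lightweight_model=False))  # -> Hailo-8
```

Treating Jetson as the default when no hard constraint fires mirrors the conclusion above: its lower deployment friction is the tiebreaker when power and cost allow it.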