Deploying high-performance object detectors on TinyML platforms poses significant challenges due to tight hardware constraints and the modular complexity of modern detection pipelines. Neural Architecture Search (NAS) offers a path toward automation, but existing methods either restrict optimization to individual modules, sacrificing cross-module synergy, or require global searches that are computationally intractable.
We propose ELASTIC (Efficient Once for All Iterative Search for Object Detection on Microcontrollers), a unified, hardware-aware NAS framework that alternates optimization across modules (e.g., backbone, neck, and head) in a cyclic fashion. ELASTIC introduces a novel Population Passthrough mechanism in evolutionary search that retains high-quality candidates between search stages, yielding faster convergence and up to an 8% gain in final mAP, and eliminating the search instability observed without it.
In a controlled comparison, ELASTIC achieves 4.75% higher mAP and 2× faster convergence than progressive NAS strategies on SVHN, and a 9.09% mAP improvement on PascalVOC under the same search budget. ELASTIC achieves 72.3% mAP on PascalVOC, outperforming MCUNet by 20.9 percentage points and TinyissimoYOLO by 16.3 percentage points. When deployed on MAX78000/MAX78002 microcontrollers, ELASTIC-derived models outperform Analog Devices’ TinySSD baselines, reducing energy by up to 71.6%, lowering latency by up to 2.4×, and improving mAP by up to 6.99 percentage points across multiple datasets.
Our method begins with a pretrained supernet and performs iterative neural architecture search by alternating optimization between the backbone and head. The Population Passthrough mechanism ensures continuity by retaining top-performing candidates across module alternations.
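The alternating loop described above can be sketched in a few lines. This is a minimal toy illustration, not the authors' implementation: the search space (`SLOTS`, `CHOICES`), the `toy_fitness` function standing in for supernet evaluation, and the refill rule used after each stage are all assumptions made for the example.

```python
import random

SLOTS = 4                 # hypothetical: decisions per module
CHOICES = [0, 1, 2, 3]    # hypothetical: options per decision
MODULES = ("backbone", "head")


def toy_fitness(arch):
    """Stand-in for a supernet evaluation such as validation mAP."""
    b, h = arch["backbone"], arch["head"]
    # The last term couples the modules, so neither can be tuned in isolation.
    return sum(b) + sum(h) + 2 * sum(x == y for x, y in zip(b, h))


def mutate(arch, module, rng):
    """Copy `arch` and resample one decision of the given module only."""
    child = {m: list(genes) for m, genes in arch.items()}
    child[module][rng.randrange(SLOTS)] = rng.choice(CHOICES)
    return child


def evolve(population, module, rng, generations=5, n_parents=4):
    """Evolutionary search over one module; the other module stays frozen."""
    for _ in range(generations):
        parents = sorted(population, key=toy_fitness, reverse=True)[:n_parents]
        children = [mutate(rng.choice(parents), module, rng)
                    for _ in range(len(population) - n_parents)]
        population = parents + children
    return sorted(population, key=toy_fitness, reverse=True)


def alternating_search(cycles=3, pop_size=12, passthrough=4, seed=0):
    rng = random.Random(seed)
    population = [{m: [rng.choice(CHOICES) for _ in range(SLOTS)] for m in MODULES}
                  for _ in range(pop_size)]
    history = []  # best fitness after each search stage
    for _ in range(cycles):
        for module in MODULES:
            population = evolve(population, module, rng)
            history.append(toy_fitness(population[0]))
            # Population passthrough: the top candidates survive the switch to
            # the next module; the remaining slots are refilled by mutating them
            # (the refill rule is an assumption of this sketch).
            elite = population[:passthrough]
            refill = [mutate(rng.choice(elite), module, rng)
                      for _ in range(pop_size - passthrough)]
            population = elite + refill
    return population[0], history


best, history = alternating_search()
```

Because the elite candidates are carried into every new stage, the best fitness recorded in `history` never decreases when the search switches modules, which is the continuity property the passthrough mechanism is meant to provide.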
| Dataset | Method | mAP | ↑mAP | Cost | MACs | Params | Latency |
|---|---|---|---|---|---|---|---|
| SVHN | Backbone-Only | 75.53% | +0.18% | 10.8 hrs | 137.6 M | 0.73 M | 2.62 ms |
| SVHN | Head-Only | 75.35% | +0.00% | 4.7 hrs | 475.3 M | 1.01 M | 2.69 ms |
| SVHN | Progressive | 79.62% | +4.27% | 30.8 hrs | 78.4 M | 0.52 M | 2.36 ms |
| **SVHN** | **ELASTIC (OURS)** | **80.09%** | **+4.74%** | **12.5 hrs** | **105.8 M** | **0.61 M** | **2.34 ms** |
| PascalVOC | Backbone-Only | 22.02% | +2.80% | 9.6 hrs | 436.1 M | 0.58 M | 2.11 ms |
| PascalVOC | Head-Only | 19.22% | +0.00% | 6.0 hrs | 240.0 M | 0.36 M | 2.48 ms |
| PascalVOC | Progressive | 21.74% | +2.52% | 14.8 hrs | 540.0 M | 0.41 M | 2.66 ms |
| **PascalVOC** | **ELASTIC (OURS)** | **30.83%** | **+11.61%** | **14.65 hrs** | **642.8 M** | **0.57 M** | **2.00 ms** |
Notes: “MACs” denotes multiply–accumulate operations; “Cost” is the total GPU hours required during search; “↑mAP” is the gain over the Head-Only result on the same dataset. Highlighted rows mark ELASTIC’s results for each dataset.
| Method | MACs | ↓MACs | Params | VOC mAP | ↑mAP |
|---|---|---|---|---|---|
| TY: 20-3-88 | 32M | 90.7% | 0.58M | 53% | +30% |
| TY: 20-7-88 | 44M | 87.2% | 0.58M | 47% | +24% |
| TY: 20-3-112 | 54M | 84.3% | 0.89M | 56% | +33% |
| TY: 20-7-112 | 70M | 79.6% | 0.91M | 53% | +30% |
| TY: 20-3-224 | 218M | 36.4% | 3.34M | 23% | +0% |
| MCUNet | 168M | 51.0% | 1.2M | 51.4% | +28.4% |
| MCUNetV2-M4 | 172M | 49.9% | 0.47M | 64.6% | +41.6% |
| MCUNetV2-H7 | 343M | 0% | 0.67M | 68.3% | +45.3% |
| ELASTIC (OURS) | 86M | 74.9% | 1.36M | 72.3% | +49.3% |
Notes: “MACs” denotes multiply–accumulate operations (lower is better). “↓MACs” is the reduction relative to the MCUNetV2-H7 reference (343M MACs). “↑mAP” is the gain in percentage points over TY: 20-3-224 (23% mAP), the row listed with +0%.
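The two relative columns can be reproduced from the raw values. The sketch below assumes, based on which rows show 0% and +0%, that ↓MACs is measured against MCUNetV2-H7 (343M MACs) and ↑mAP against TY: 20-3-224 (23% mAP); the helper names are illustrative.

```python
# (MACs in millions, VOC mAP in %) copied from the table above.
rows = {
    "TY: 20-3-88": (32, 53.0),
    "TY: 20-7-88": (44, 47.0),
    "TY: 20-3-112": (54, 56.0),
    "TY: 20-7-112": (70, 53.0),
    "TY: 20-3-224": (218, 23.0),
    "MCUNet": (168, 51.4),
    "MCUNetV2-M4": (172, 64.6),
    "MCUNetV2-H7": (343, 68.3),
    "ELASTIC (OURS)": (86, 72.3),
}

REF_MACS = rows["MCUNetV2-H7"][0]   # assumed reference for the MACs column
BASE_MAP = rows["TY: 20-3-224"][1]  # assumed reference for the mAP column


def macs_reduction(macs):
    """Percentage of MACs saved relative to the reference model."""
    return round(100 * (1 - macs / REF_MACS), 1)


def map_gain(map_pct):
    """mAP gain in percentage points over the baseline row."""
    return round(map_pct - BASE_MAP, 1)


derived = {name: (macs_reduction(m), map_gain(a)) for name, (m, a) in rows.items()}
```

For example, ELASTIC's 86M MACs against 343M gives a 74.9% reduction, and 72.3% against 23% gives +49.3 points, matching the table.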
| Dataset | Search Space | Mean mAP | ↑mAP | Std. Dev. | Variance |
|---|---|---|---|---|---|
| SVHN | Joint Search | 44.61% | 0% | 1.39 | 1.93 |
| SVHN | Head Search | 67.38% | +22.77% | 0.37 | 0.136 |
| SVHN | ELASTIC (3) | 70.42% | +25.81% | 0.47 | 0.221 |
| SVHN | ELASTIC (5) | 70.68% | +26.07% | 0.45 | 0.199 |
| PascalVOC | Joint Search | 4.99% | 0% | 0.04 | 1.40 × 10⁻³ |
| PascalVOC | Head Search | 27.89% | +22.90% | 0.01 | 9.86 × 10⁻⁵ |
| PascalVOC | ELASTIC (3) | 28.28% | +23.29% | 0.01 | 7.04 × 10⁻⁵ |
| PascalVOC | ELASTIC (5) | 28.32% | +23.33% | 0.01 | 6.60 × 10⁻⁵ |
Notes: Smaller variance values indicate greater stability and consistency in discovered architectures.
| Model | Dataset | Device | Params | Energy (μJ) | Latency (ms) | Power (mW) | mAP (%) |
|---|---|---|---|---|---|---|---|
| ai87-fpndetector | PascalVOC | MAX78002 | 2.18M | 62001 | 122.6 | 445.76 | 50.66 |
| ELASTIC (OURS) | PascalVOC | MAX78002 | 1.32M | 17581 | 51.1 | 285.02 | 57.65 |
| ai85net-tinierssd-face | VGGFace2 | MAX78000 | 0.28M | 1712 | 43.4 | 29.90 | 84.72 |
| ELASTIC (OURS) | VGGFace2 | MAX78000 | 0.22M | 1368 | 45.6 | 20.90 | 87.10 |
| ai85net-tinierssd | SVHN | MAX78000 | 0.19M | 573 | 14.0 | 29.20 | 83.60 |
| ELASTIC (OURS) | SVHN | MAX78000 | 0.22M | 341 | 13.0 | 16.70 | 88.10 |
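The headline deployment figures (energy reduction, latency speedup, mAP gain) follow directly from the table. A short sketch of that arithmetic, with values copied from the rows above and illustrative field names:

```python
# (baseline, ELASTIC) pairs copied from the deployment table above.
results = {
    "PascalVOC": {"energy_uj": (62001, 17581), "latency_ms": (122.6, 51.1), "map": (50.66, 57.65)},
    "VGGFace2": {"energy_uj": (1712, 1368), "latency_ms": (43.4, 45.6), "map": (84.72, 87.10)},
    "SVHN": {"energy_uj": (573, 341), "latency_ms": (14.0, 13.0), "map": (83.60, 88.10)},
}


def compare(entry):
    """Summarise ELASTIC against the baseline for one dataset."""
    e_base, e_ours = entry["energy_uj"]
    l_base, l_ours = entry["latency_ms"]
    m_base, m_ours = entry["map"]
    return {
        "energy_reduction_pct": round(100 * (1 - e_ours / e_base), 1),
        "latency_speedup": round(l_base / l_ours, 2),  # > 1 means ELASTIC is faster
        "map_gain_pts": round(m_ours - m_base, 2),
    }


summary = {name: compare(entry) for name, entry in results.items()}
```

The PascalVOC row accounts for the "up to" figures: a 71.6% energy reduction, a 2.4× latency speedup, and a 6.99-point mAP gain. On VGGFace2 the speedup is below 1, since the ELASTIC model's latency there (45.6 ms) is slightly higher than the baseline's (43.4 ms).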
@misc{tran2025elasticefficientiterativesearch,
title={ELASTIC: Efficient Once For All Iterative Search for Object Detection on Microcontrollers},
author={Tony Tran and Qin Lin and Bin Hu},
year={2025},
eprint={2503.21999},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2503.21999},
}