ELASTIC: Efficient Once For All Iterative Search for Object Detection on Microcontrollers

Tony Tran 1, Qin Lin 2,3, Bin Hu *,2,3
1 Cullen College of Engineering Research Computing, University of Houston · 2 Department of Electrical and Computer Engineering, University of Houston · 3 Department of Engineering Technology, University of Houston

Abstract

Deploying high-performance object detectors on TinyML platforms poses significant challenges due to tight hardware constraints and the modular complexity of modern detection pipelines. Neural Architecture Search (NAS) offers a path toward automation, but existing methods either restrict optimization to individual modules, sacrificing cross-module synergy, or require global searches that are computationally intractable.

We propose ELASTIC (Efficient Once For All Iterative Search for Object Detection on Microcontrollers), a unified, hardware-aware NAS framework that alternates optimization across modules (e.g., backbone, neck, and head) in a cyclic fashion. ELASTIC introduces a novel Population Passthrough mechanism in evolutionary search that retains high-quality candidates between search stages, yielding faster convergence and a final mAP gain of up to 8%, and eliminating the search instability observed when passthrough is disabled.

In a controlled comparison, ELASTIC achieves +4.74% higher mAP and 2× faster convergence than progressive NAS strategies on SVHN, and delivers a +9.09% mAP improvement on PascalVOC given the same search budget. ELASTIC reaches 72.3% mAP on the full PascalVOC dataset, outperforming MCUNet by 20.9% and TinyissimoYOLO by 16.3%. When deployed on MAX78000/MAX78002 microcontrollers, ELASTIC-derived models outperform Analog Devices’ baseline detectors, reducing energy by up to 71.6%, lowering latency by up to 2.4×, and improving mAP by up to 6.99 percentage points across multiple datasets.

Overview of ELASTIC Framework


Our method begins with a pretrained supernet and performs iterative neural architecture search by alternating optimization between the backbone and head. The Population Passthrough mechanism ensures continuity by retaining top-performing candidates across module alternations.
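
To make the loop concrete, below is a minimal, runnable Python sketch of this alternating scheme. The search spaces and the evaluate() fitness are toy stand-ins of our own (ELASTIC would instead score subnets drawn from the pretrained supernet by proxy mAP); what the sketch illustrates is that each stage mutates only the active module, and that the top candidates pass through to seed the next stage rather than restarting the population from scratch.

import random

# Toy module search spaces; in the real framework these would index
# sub-architectures of the pretrained once-for-all supernet.
SPACES = {
    "backbone": [{"depth": d, "width": w} for d in (2, 3, 4) for w in (16, 24, 32)],
    "head":     [{"layers": l, "chan": c} for l in (1, 2) for c in (32, 64, 96)],
}

def evaluate(arch):
    # Toy fitness standing in for proxy mAP measured on a validation set.
    return (arch["backbone"]["depth"] * arch["backbone"]["width"]
            + arch["head"]["layers"] * arch["head"]["chan"])

def evolve_module(population, module, pop_size=24, generations=8, n_elite=6):
    """Evolve only `module`; the other module of each candidate stays frozen."""
    while len(population) < pop_size:              # top up via random mutations
        child = dict(random.choice(population))
        child[module] = random.choice(SPACES[module])
        population.append(child)
    for _ in range(generations):
        population.sort(key=evaluate, reverse=True)
        elite = population[:n_elite]
        children = [dict(random.choice(elite)) for _ in range(pop_size - n_elite)]
        for child in children:                     # mutate the active module only
            child[module] = random.choice(SPACES[module])
        population = elite + children
    return sorted(population, key=evaluate, reverse=True)

def elastic_search(iterations=5, passthrough=6):
    """Cycle backbone -> head -> backbone -> ...; Population Passthrough
    seeds each stage with the previous stage's top candidates."""
    population = [{m: random.choice(s) for m, s in SPACES.items()}]
    for it in range(iterations):
        module = ("backbone", "head")[it % 2]
        population = evolve_module(population, module)[:passthrough]
    return population[0]

print(elastic_search())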

Quantitative Comparison of Search Strategies

Quantitative comparison of search strategies on SVHN and a PascalVOC subset. ELASTIC consistently achieves the highest mAP on both datasets while reducing GPU hours by 59% on SVHN relative to progressive search (12.5 vs. 30.8 hrs).
Dataset     Method           mAP      ↑mAP      Cost        MACs      Params   Latency
SVHN        Backbone-Only    75.53%   +0.18%    10.8 hrs    137.6 M   0.73 M   2.62 ms
            Head-Only        75.35%   +0.00%     4.7 hrs    475.3 M   1.01 M   2.69 ms
            Progressive      79.62%   +4.27%    30.8 hrs     78.4 M   0.52 M   2.36 ms
            ELASTIC (OURS)   80.09%   +4.74%    12.5 hrs    105.8 M   0.61 M   2.34 ms
PascalVOC   Backbone-Only    22.02%   +2.80%     9.6 hrs    436.1 M   0.58 M   2.11 ms
            Head-Only        19.22%   +0.00%     6.0 hrs    240.0 M   0.36 M   2.48 ms
            Progressive      21.74%   +2.52%    14.8 hrs    540.0 M   0.41 M   2.66 ms
            ELASTIC (OURS)   30.83%   +11.61%   14.65 hrs   642.8 M   0.57 M   2.00 ms

Notes: “MACs” denotes multiply–accumulate operations; “Cost” is the total GPU hours spent during search; “↑mAP” is the gain relative to the Head-Only baseline (+0.00%) on each dataset.

Comparison of ELASTIC Framework

Comparison of ELASTIC with TinyissimoYOLO (TY) and MCUNet on the full PascalVOC dataset, considering all object classes and counts. ELASTIC achieves a 20.9% mAP boost over MCUNet, 4.0% over MCUNetV2, and 16.3% over TinyissimoYOLO’s best-performing model, while discovering a model with significantly fewer MACs, enabling faster inference on microcontrollers.
Method           MACs    ↓MACs   Params   VOC mAP   ↑mAP
TY: 20-3-88      32 M    90.7%   0.58 M   53%       +30%
TY: 20-7-88      44 M    87.2%   0.58 M   47%       +24%
TY: 20-3-112     54 M    84.3%   0.89 M   56%       +33%
TY: 20-7-112     70 M    79.6%   0.91 M   53%       +30%
TY: 20-3-224     218 M   36.4%   3.34 M   23%       +0%
MCUNet           168 M   51.0%   1.2 M    51.4%     +28.4%
MCUNetV2-M4      172 M   49.9%   0.47 M   64.6%     +41.6%
MCUNetV2-H7      343 M   0%      0.67 M   68.3%     +45.3%
ELASTIC (OURS)   86 M    74.9%   1.36 M   72.3%     +49.3%

Notes: “MACs” denotes multiply–accumulate operations (lower is better). “↓MACs” is the reduction relative to the MCUNetV2-H7 reference (343 M, shown as 0%); “↑mAP” is the gain relative to TY: 20-3-224 (23%, shown as +0%), the weakest TinyissimoYOLO variant.
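
For concreteness, both relative columns can be reproduced from the absolute ones; a quick Python check for the ELASTIC row:

# Reproducing Table 2's relative columns for the ELASTIC row.
macs_ref = 343e6          # MCUNetV2-H7 MACs (the 0% reference for ↓MACs)
map_ref = 23.0            # TY: 20-3-224 mAP (the +0% reference for ↑mAP)
macs, map_ours = 86e6, 72.3

down_macs = (1 - macs / macs_ref) * 100   # 74.9 -> "74.9% fewer MACs"
up_map = map_ours - map_ref               # 49.3 -> "+49.3 points"
print(f"{down_macs:.1f}% fewer MACs, +{up_map:.1f} mAP points")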

Search Space Refinement & Evolution

Search space refinement through ELASTIC iteration. Distributions of 100 randomly sampled architectures from three head search spaces (joint NAS, progressive head-only, and ELASTIC-refined) on SVHN (left) and PascalVOC (right). ELASTIC produces architectures with a mean mAP improvement of +4.87% on SVHN and +0.43% on PascalVOC over progressive search. Compared to a global joint search, ELASTIC improves mean mAP by 26.07% and 23.33% on SVHN and PascalVOC, respectively.
Search Space Evolution Over Iterations. Mean accuracy and distribution of randomly sampled architectures at iterations 0, 3, and 5. On SVHN, ELASTIC continues to improve mean mAP by +3.3% while reducing variance by approximately 89.7%. On PascalVOC, mean mAP improves from 27.89% to 28.32%, with variance dropping by approximately 33%.
Mean mAP and Variance across Search Spaces. Parenthesized numbers indicate the iteration of the iterative search process. A higher mean mAP indicates better average performance, while lower variance reflects more stable and consistent architectures discovered across runs. On SVHN, the mean mAP increases from 44.61% to 70.68%, accompanied by a reduction in variance from 1.93 to 0.199. On PascalVOC, mean mAP improves from 4.99% to 28.32%, while variance decreases by over 95%.
Dataset     Search Space   Mean mAP   ↑mAP      Std. Dev.   Variance
SVHN        Joint Search   44.61%     0%        1.39        1.93
            Head Search    67.38%     +22.77%   0.37        0.136
            ELASTIC (3)    70.42%     +25.81%   0.47        0.221
            ELASTIC (5)    70.68%     +26.07%   0.45        0.199
PascalVOC   Joint Search   4.99%      0%        0.04        1.40 × 10⁻³
            Head Search    27.89%     +22.90%   0.01        9.86 × 10⁻⁵
            ELASTIC (3)    28.28%     +23.29%   0.01        7.04 × 10⁻⁵
            ELASTIC (5)    28.32%     +23.33%   0.01        6.60 × 10⁻⁵

Notes: Smaller variance values indicate greater stability and consistency in discovered architectures.
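
The statistics above are straightforward to reproduce by Monte Carlo sampling. The sketch below mimics the protocol with a Gaussian toy sampler (an assumption on our part; the real sampler would draw a subnet from the space and measure its validation mAP), and it also makes visible why the Std. Dev. and Variance columns agree, e.g. 1.39² ≈ 1.93 for the SVHN joint space.

import random
import statistics

def sample_map(mean_quality, spread):
    # Toy stand-in for: draw a random architecture from the space and
    # measure its validation mAP.
    return random.gauss(mean_quality, spread)

def space_stats(mean_quality, spread, n=100):
    scores = [sample_map(mean_quality, spread) for _ in range(n)]
    return statistics.mean(scores), statistics.pvariance(scores)

# Parameters taken from the SVHN rows above (mean mAP, std. dev.):
print(space_stats(44.61, 1.39))   # ~ Joint Search: low mean, high spread
print(space_stats(70.68, 0.45))   # ~ ELASTIC (5): tight, high-quality space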

Performance & Efficiency on MAX78000/MAX78002

Comparison of performance and efficiency metrics on the MAX78000/MAX78002 platforms. ELASTIC models achieve higher mAP, with gains of up to 6.99 percentage points, while reducing energy consumption by up to 3.5× and power usage by up to 1.7× compared to baseline models. On PascalVOC, ELASTIC also reduces latency by 2.4×.
Model                    Dataset     Device     Params   Energy (μJ)   Latency (ms)   Power (mW)   mAP (%)
ai87-fpndetector         PascalVOC   MAX78002   2.18 M   62001         122.6          445.76       50.66
ELASTIC (OURS)           PascalVOC   MAX78002   1.32 M   17581         51.1           285.02       57.65
ai85net-tinierssd-face   VGGFace2    MAX78000   0.28 M   1712          43.4           29.90        84.72
ELASTIC (OURS)           VGGFace2    MAX78000   0.22 M   1368          45.6           20.90        87.10
ai85net-tinierssd        SVHN        MAX78000   0.19 M   573           14.0           29.20        83.60
ELASTIC (OURS)           SVHN        MAX78000   0.22 M   341           13.0           16.70        88.10
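
The headline efficiency ratios quoted in the caption and the abstract follow directly from the PascalVOC rows above; a quick Python check:

# Efficiency ratios from the PascalVOC rows (MAX78002).
base_energy, base_latency = 62001, 122.6   # ai87-fpndetector
ours_energy, ours_latency = 17581, 51.1    # ELASTIC (OURS)

print(f"energy reduction: {1 - ours_energy / base_energy:.1%}")   # 71.6%
print(f"energy ratio:     {base_energy / ours_energy:.1f}x")      # 3.5x
print(f"latency speedup:  {base_latency / ours_latency:.1f}x")    # 2.4x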
Comparison of ELASTIC-derived models vs. scaled ai85net-tinierssd baselines. Marker size reflects the number of FLOPs, which is positively correlated with energy. ELASTIC-derived models reduce FLOPs by up to 66% while gaining up to 5.7% accuracy.
ELASTIC deployment on the MAX78000 using the SVHN dataset. ELASTIC produces compact, high-performing architectures suitable for ultra-low-power deployment.

Detection Demo on MAX78000

SVHN Demo on MAX78000. On-device digit detection example.
VGGFace2 Demo on MAX78000. On-device face detection example.

BibTeX

@misc{tran2025elasticefficientiterativesearch,
      title={ELASTIC: Efficient Once For All Iterative Search for Object Detection on Microcontrollers}, 
      author={Tony Tran and Qin Lin and Bin Hu},
      year={2025},
      eprint={2503.21999},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2503.21999},
}