Eiraj Saqib successfully defended his doctoral thesis
On June 9, Eiraj Saqib successfully presented and defended his doctoral thesis "Bottleneck-Aware Optimization of Distributed CNN Inference for Edge-Cloud IoT Systems" and can now call himself a PhD in Electronics.
The opponent of the thesis was Professor Holger Frøning, Heidelberg University, Germany. The opponent and the grading committee, consisting of Professor Slawomir Nowaczyk, Halmstad University, Professor Giandomenico Licciardo, University of Salerno, and Associate Professor Qing He, Mid Sweden University, carefully reviewed and approved the work.
The thesis was supervised by Professor Mattias O'Nils and Dr. Irida Shallari at Mid Sweden University.
Abstract
The proliferation of the Internet of Things (IoT) necessitates deploying Deep Learning (DL) models, specifically Convolutional Neural Networks (CNNs), on resource-constrained edge devices. However, the high computational and memory demands of CNNs often exceed the capabilities of IoT nodes, while traditional cloud offloading suffers from latency and bandwidth limitations. This thesis proposes a comprehensive framework for Split Computing, enabling efficient distributed inference by partitioning CNNs between IoT nodes and edge servers.
The core contribution is a bottleneck-aware feature compression mechanism designed to minimize data traffic at the partition point. The research demonstrates that combining partitioning with extreme quantization (down to 1-bit) and compression reduces data transmission by over 99% with minimal accuracy loss. This approach is augmented by a novel hybrid structured pruning criterion, utilizing L2-norm magnitude and entropy, which selectively removes non-informative channels to achieve significant speed-ups and energy savings compared to baseline execution modes.
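To give a feel for the arithmetic behind a reduction of this size, the following is a minimal sketch and not the thesis's implementation. It assumes sign-based 1-bit quantization, a hypothetical 14x14 feature map, and a hypothetical bottleneck that keeps 64 of 256 channels; the function names `binarize_and_pack` and `unpack` are invented for illustration. Going from 32-bit floats to 1 bit alone saves about 96.9%, and the assumed fourfold channel reduction pushes the total past 99%.

```python
import random


def binarize_and_pack(features):
    """Quantize each activation to 1 bit (its sign) and pack 8 bits per byte."""
    packed = bytearray((len(features) + 7) // 8)
    for i, value in enumerate(features):
        if value > 0.0:
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)


def unpack(packed, count):
    """Recover the +1/-1 values that the server-side half of the network would read."""
    return [1.0 if (packed[i // 8] >> (i % 8)) & 1 else -1.0 for i in range(count)]


# Hypothetical shapes, chosen only to illustrate the arithmetic.
full_channels, kept_channels, height, width = 256, 64, 14, 14

random.seed(0)
features = [random.gauss(0.0, 1.0) for _ in range(kept_channels * height * width)]

payload = binarize_and_pack(features)
baseline_bytes = full_channels * height * width * 4  # uncompressed float32 feature map
reduction = 1.0 - len(payload) / baseline_bytes

print(f"{len(payload)} bytes sent instead of {baseline_bytes} ({reduction:.2%} less)")
```

The sketch says nothing about accuracy; keeping accuracy loss small under such coarse quantization is the part that requires the training techniques described in the abstract.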
To address quantization-induced accuracy degradation, the thesis introduces Time-dependent Clustering Loss (TCL), a regularization technique that clusters activations during training to ensure robustness against extreme quantization errors.
Furthermore, the complex selection of partition points, compression ratios, and quantization levels is automated via CO-NAS (Compression Optimization, and Neural Architecture Search), a differentiable architecture search framework that efficiently discovers Pareto-optimal configurations.
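As an illustration of what "Pareto-optimal" means here, the sketch below filters a handful of made-up configurations by accuracy and transmitted bytes. It is not CO-NAS: CO-NAS is described as a differentiable search, whereas this is a brute-force filter that only shows the selection criterion. The candidate values, the layer names, and the `pareto_front` helper are all hypothetical.

```python
def pareto_front(candidates):
    """Keep configurations that no other candidate beats on both accuracy and bytes."""
    front = []
    for c in candidates:
        dominated = any(
            o["accuracy"] >= c["accuracy"]
            and o["bytes"] <= c["bytes"]
            and (o["accuracy"] > c["accuracy"] or o["bytes"] < c["bytes"])
            for o in candidates
        )
        if not dominated:
            front.append(c)
    return front


# Made-up numbers for illustration only; they are not results from the thesis.
candidates = [
    {"split": "conv2", "bits": 1, "accuracy": 0.70, "bytes": 1568},
    {"split": "conv2", "bits": 4, "accuracy": 0.74, "bytes": 6272},
    {"split": "conv3", "bits": 8, "accuracy": 0.73, "bytes": 12544},
    {"split": "conv4", "bits": 2, "accuracy": 0.72, "bytes": 3136},
]

front = pareto_front(candidates)
for c in front:
    print(c["split"], c["bits"], c["accuracy"], c["bytes"])
```

In this toy set the 8-bit configuration drops out because the 4-bit one is both more accurate and cheaper to transmit; the remaining three each represent a different trade-off between accuracy and traffic.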
Validated on diverse hardware platforms (e.g., Raspberry Pi, NVIDIA Jetson) and standard datasets (CIFAR-100, Tiny-ImageNet), these methodologies establish a robust pathway for Edge Intelligence. By unifying partitioning, quantization, pruning, and automated search, this work provides a scalable solution for deploying high-performance vision models in resource-constrained IoT environments.