Public defense of doctoral thesis with Eiraj Saqib
Welcome to a public defense of doctoral thesis with Eiraj Saqib, who will present his thesis "Bottleneck-Aware Optimization of Distributed CNN Inference for Edge-Cloud IoT Systems".
Date: 9 June 2026
Time: 09:00
Location: Campus Sundsvall, M-building, room M102, and online via YouTube and Zoom.
Doctoral thesis: Bottleneck-Aware Optimization of Distributed CNN Inference for Edge-Cloud IoT Systems
Respondent: Eiraj Saqib
Supervisor and chairman: Professor Mattias O'Nils, Mid Sweden University and Lecturer Irida Shallari
Opponent: Professor Holger Frøing, Heidelberg University, Germany
Examination Board:
Professor Slawomir Nowaczyk, Halmstad University
Professor Giandomenico Licciardo, University of Salerno
Associate Professor Qing He, Mid Sweden University
Reserve: Associate Professor Johan Sidén, Mid Sweden University
Abstract
The proliferation of the Internet of Things (IoT) necessitates deploying Deep Learning (DL) models, specifically Convolutional Neural Networks (CNNs), on resource-constrained edge devices. However, the high computational and memory demands of CNNs often exceed the capabilities of IoT nodes, while traditional cloud offloading suffers from latency and bandwidth limitations. This thesis proposes a comprehensive framework for Split Computing, enabling efficient distributed inference by partitioning CNNs between IoT nodes and edge servers.
The core contribution is a bottleneck-aware feature compression mechanism designed to minimize data traffic at the partition point. The research demonstrates that combining partitioning with extreme quantization (down to 1-bit) and compression reduces data transmission by over 99% with minimal accuracy loss. This approach is augmented by a novel hybrid structured pruning criterion, utilizing L2-norm magnitude and entropy, which selectively removes non-informative channels to achieve significant speed-ups and energy savings compared to baseline execution modes.
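To give a sense of the scale involved, the following is a minimal sketch of 1-bit quantization at a partition point, not the mechanism developed in the thesis. It assumes 32-bit floating-point activations, a sign-based binarization rule, and a hypothetical feature-map shape of 64 x 16 x 16; the function names are illustrative only.

```python
import numpy as np


def binarize_and_pack(features):
    """Quantize a float feature map to 1 bit per value and pack 8 values per byte.

    Assumption: sign-based binarization (positive -> 1, otherwise -> 0).
    """
    bits = (features > 0).astype(np.uint8)
    packed = np.packbits(bits.ravel())  # 8 binary values stored in each byte
    return packed, features.shape


def unpack(packed, shape):
    """Restore a {-1, +1} feature map on the receiving side."""
    count = int(np.prod(shape))
    bits = np.unpackbits(packed)[:count]  # drop any padding bits
    return bits.reshape(shape).astype(np.float32) * 2.0 - 1.0


rng = np.random.default_rng(0)
# Hypothetical intermediate feature map: 64 channels of 16 x 16 activations.
features = rng.standard_normal((64, 16, 16)).astype(np.float32)

packed, shape = binarize_and_pack(features)
restored = unpack(packed, shape)
reduction = 1.0 - packed.nbytes / features.nbytes
print(f"{features.nbytes} bytes -> {packed.nbytes} bytes ({reduction:.2%} smaller)")
```

Under these assumptions, going from 32 bits to 1 bit per value removes 31/32 of the payload, which is 96.875%. Exceeding 99%, as reported in the abstract, would require the additional compression and pruning steps that this sketch does not implement.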
To address quantization-induced accuracy degradation, the thesis introduces Time-dependent Clustering Loss (TCL), a regularization technique that clusters activations during training to ensure robustness against extreme quantization errors.
Furthermore, the complex selection of partition points, compression ratios, and quantization levels is automated via CO-NAS (Compression Optimization, and Neural Architecture Search), a differentiable architecture search framework that efficiently discovers Pareto-optimal configurations.
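As an illustration of what "Pareto-optimal" means for such configurations, the sketch below filters a small list of candidates by exhaustive comparison. It is not CO-NAS, which is described as a differentiable search; the field names (`split`, `bits`, `accuracy`, `latency_ms`, `tx_bytes`) and all numeric values are placeholders invented for the example, not results from the thesis.

```python
def dominates(a, b):
    """Return True if config a is at least as good as b everywhere and strictly better somewhere.

    Higher accuracy is better; lower latency and fewer transmitted bytes are better.
    """
    no_worse = (
        a["accuracy"] >= b["accuracy"]
        and a["latency_ms"] <= b["latency_ms"]
        and a["tx_bytes"] <= b["tx_bytes"]
    )
    strictly_better = (
        a["accuracy"] > b["accuracy"]
        or a["latency_ms"] < b["latency_ms"]
        or a["tx_bytes"] < b["tx_bytes"]
    )
    return no_worse and strictly_better


def pareto_front(configs):
    """Keep only the configurations that no other configuration dominates."""
    return [c for c in configs if not any(dominates(o, c) for o in configs)]


# Placeholder candidates: partition point, quantization bits, and made-up metrics.
candidates = [
    {"name": "A", "split": 3, "bits": 1, "accuracy": 0.70, "latency_ms": 12, "tx_bytes": 2048},
    {"name": "B", "split": 3, "bits": 8, "accuracy": 0.75, "latency_ms": 20, "tx_bytes": 16384},
    {"name": "C", "split": 5, "bits": 8, "accuracy": 0.74, "latency_ms": 25, "tx_bytes": 16384},
    {"name": "D", "split": 5, "bits": 1, "accuracy": 0.68, "latency_ms": 15, "tx_bytes": 4096},
]

front = pareto_front(candidates)
print([c["name"] for c in front])
```

With these placeholder values, C is dominated by B and D is dominated by A, so only A and B remain: neither can be improved on one metric without giving up another.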
Validated on diverse hardware platforms (e.g., Raspberry Pi, NVIDIA Jetson) and standard datasets (CIFAR-100, Tiny ImageNet), these methodologies establish a robust pathway for Edge Intelligence. By unifying partitioning, quantization, pruning, and automated search, this work provides a scalable solution for deploying high-performance vision models in resource-constrained IoT environments.