Isaac Sánchez Leal successfully defended his doctoral thesis
On February 18, Isaac Sánchez Leal successfully defended his PhD thesis on partitioned deep neural network inference for resource-constrained IoT devices. His work presents a system-level methodology enabling efficient and accurate AI execution on battery-powered edge devices.
The opponent was Professor Johan Lilius from Åbo Akademi, Finland. The grading committee consisted of Professor Per Gunnar Kjeldsberg, Norges teknisk-naturvitenskapelige universitet (NTNU), Associate Professor Oscar Gustafsson, Linköpings universitet, and Professor Bonavolontà Francesco, Università degli Studi di Napoli Federico II (UniNa). Together, the opponent and the committee carefully reviewed and approved the work.
The thesis was supervised by Professor Mattias O'Nils and co-supervised by Dr. Irida Shallari at Mid Sweden University.
With this achievement, Isaac can now proudly call himself Doctor in Electronics. Congratulations on this milestone and best of luck in the next chapter!
Abstract
The proliferation of the Internet of Things (IoT) has driven the deployment of Deep Learning models on constrained edge devices. However, a fundamental conflict exists between the computational demands of Deep Neural Networks (DNNs) and the strict energy and processing limits of battery-operated nodes. While intelligence partitioning offers a potential solution by offloading computation to a server, practical deployment is hindered by the structural barrier of modern DNNs, which are characterized by intensive early-layer computation and intermediate data expansion, creating critical bottlenecks in distributed environments. This thesis presents a system-level methodology to bridge the gap between algorithmic demands and hardware constraints.
The research begins by identifying the governing parameters of system efficiency through a systematic analysis method and a Design Space Exploration (DSE) method. Based on these core determinants, a co-design strategy is introduced to overcome the structural barrier to partitioning. By synergistically combining model- and data-level transformations, this approach induces efficiency at potential partition points, significantly reducing node energy consumption and system latency. Finally, the thesis proposes an accuracy recovery method to effectively decouple node efficiency from application accuracy. By shifting the paradigm from loss mitigation to compensation, this reconstruction engine ensures that performance is maintained relative to the baseline accuracy even under extreme optimization actions.
In summary, this thesis establishes a system-level methodology for the efficient partitioning of DNNs. It demonstrates that by operationalizing the presented formal design workflow, it is possible to exploit the capabilities of resource-unconstrained servers to maximize node battery life and minimize system response time. This work lays the foundation for ubiquitous intelligence, enabling the deployment of advanced AI on resource-limited hardware by transforming the structural limitations of DNNs into opportunities for distributed efficiency.