Stylianos I. Venieris’ Post

Senior Research Scientist / Head of Distributed AI group @ Samsung AI, Cambridge, UK | AI systems, Deep Learning, FPGAs

The extended version of our unzipFPGA work has been accepted by ACM Transactions on Design Automation of Electronic Systems! unzipFPGA enhances CNN hardware accelerators with an on-the-fly weight-generation mechanism, drastically reducing the off-chip memory bandwidth requirements during inference. Key features to look out for include: 1) a detailed description of the on-the-fly weight-generation process, 2) a new automated hardware-aware methodology for selecting the per-layer compression ratios to achieve a balanced performance-accuracy trade-off, and 3) a comparison with optimised embedded GPU implementations, showing 2.3x higher energy efficiency (geometric mean) across diverse CNNs. unzipFPGA was co-led by Javier F. and developed in close collaboration with Nicholas Lane. Many thanks to both! Preprint: https://lnkd.in/ebgSdMRy #deeplearning #ai #accelerators
