FxHENN: FPGA-based acceleration framework for homomorphic encrypted CNN inference

Abstract

Fully homomorphic encryption (FHE) is a promising data privacy solution for machine learning because it allows inference to be performed on encrypted data. However, it typically incurs 5-6 orders of magnitude higher computation and storage overhead. This paper proposes the first full-fledged FPGA acceleration framework for FHE-based convolutional neural network (HE-CNN) inference. We design parameterized HE operation modules with intra- and inter-layer resource management for HE-CNN, based on the FPGA high-level synthesis (HLS) design flow. With sophisticated resource and performance modeling of the HE operation modules, the proposed FxHENN framework automatically performs design space exploration to determine the optimized resource provisioning and generates the accelerator circuit for a given HE-CNN model on a target FPGA device. Compared with the state-of-the-art CPU-based HE-CNN inference solution, FxHENN achieves up to 13.49X speedup in inference latency and up to 1187.12X higher energy efficiency. Given that this is the first attempt in the literature at FPGA acceleration of full-fledged non-interactive HE-CNN inference, our results obtained on low-power FPGA devices demonstrate that HE-CNN inference for edge and embedded computing is practical.

Publication
In 2023 IEEE International Symposium on High-Performance Computer Architecture
Lei Ju
Professor