Quantized Neural Networks for Edge Devices – Why do we still need smaller models in the age of large models?

Talk

What is this project about?

Neural network models have grown rapidly over the past decades, and larger models are increasingly difficult to deploy on edge devices. Recent research has shown that neural networks can also operate at much lower numerical precision, even down to 1 bit. This presentation therefore gives a brief introduction to the following three topics:

1. Why do we need ultra-low-precision quantization for neural networks on edge devices?
2. What are Quantized Neural Networks, and how do they behave at different precisions and in different model architectures?
3. How can we train Quantized Neural Networks and implement them as hardware accelerators on FPGAs/ASICs?
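To make the idea of low-precision quantization concrete, the following is a minimal sketch (not taken from the talk itself; the function names and the scaling scheme are illustrative assumptions): it maps weights onto a uniform signed integer grid for a given bit width, and, for the 1-bit case, keeps only the sign of each weight scaled by the mean magnitude.

```python
import numpy as np

def quantize_uniform(x, bits):
    """Uniformly quantize x to a signed grid with the given bit width,
    then dequantize so the error against the original is visible.
    (Illustrative max-abs scaling; real schemes vary.)"""
    qmax = 2 ** (bits - 1) - 1          # e.g. 127 for 8-bit
    scale = np.max(np.abs(x)) / qmax    # map the largest magnitude onto qmax
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale

def binarize(x):
    """1-bit quantization: keep only the sign of each weight,
    scaled by the mean absolute value."""
    return np.sign(x) * np.mean(np.abs(x))

w = np.array([0.42, -0.13, 0.88, -0.55])
print(quantize_uniform(w, bits=8))  # close to the original values
print(binarize(w))                  # only two distinct values remain
```

With 8 bits the dequantized weights stay close to the originals, while the 1-bit version collapses them to two values; this trade-off between precision and hardware cost is what makes binarized networks attractive on FPGAs, where a 1-bit multiply reduces to an XNOR operation.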


Authors / Speakers

Project participants

Chair / Institution

Chair of Processor Design