Oral 7: Neural Network Accelerators

Session Chairs: Chi-Chia Sun (National Formosa University) and Pei-Jun Lee (National Chi Nan University)
Date: Aug. 7 (Wed.), 2019
Time: 15:30 – 17:00
Room: 6F, Ming Hall (茗廳)


Order  Oral Presentation Papers
1 15:30 – 15:43 (S0208) (Best Paper Candidates)
A bio-potential and piezoelectric sensing system with intelligent real-time computing, compressing and communication
Ing-Jer Huang, Hsu-Kang Dow, and Shih-Jung Pao
National Sun Yat-sen University

Abstract:

We present a wearable bio-signal sensing system that monitors electromyography (EMG), electrocardiogram (ECG), vibration, and temperature for elderly care. The system contains analog front-end (AFE) circuits for bio-potential signal sampling, amplification, and digitization; a phase-locked loop (PLL) circuit; a digital signal controller for AFE control and signal calibration; and a 32-bit microprocessor for compression and communication. The prototype is implemented on a small 5 x 5 cm development board, which is attached to a smart T-shirt and arm bands to form a lightweight wearable system. A tablet-based game has been implemented to encourage elderly people to exercise by controlling the game character with their muscles while the system monitors their muscle fatigue and heart rate variation.

2 15:43 – 15:56 (S0177) (Best Paper Candidates)
An Electrocardiogram Classification System with Neural Network Hardware Implementation
Yu-Yi Liao, Peng-Wei Huang, and Shuenn-Yuh Lee
National Cheng Kung University

Abstract:

This paper presents a real-time electrocardiogram (ECG) classification system based on a neural network (NN) classifier. The identification flow of the proposed system has five steps: (1) collect the ECG lead II signal; (2) filter the original signal with a wavelet transform; (3) calculate twenty feature values and normalize them; (4) apply principal component analysis (PCA) to reduce the number of features; (5) classify each beat as normal, premature atrial complex (PAC), or premature ventricular contraction (PVC). The accuracy of the proposed method is evaluated using normal and abnormal ECG signals taken from the standard MIT-BIH arrhythmia database. The proposed system is verified in software. The software is written in Python and tested with Matlab, and the hardware is implemented as a chip fabricated in TSMC 0.18 µm CMOS technology. All machine learning processing blocks, including preprocessing, feature extraction, and the classifier, are implemented on the chip. The training data and testing data are independent of each other: no person included in the training data set appears in the testing data set, so the test is blind. The accuracy of the proposed system is about 95.45% in software verification, which indicates that the proposed architecture is effective for ECG classification.
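Steps 3 and 4 of the flow above (feature normalization followed by PCA) can be sketched as follows. This is a minimal pure-Python illustration, not the authors' implementation: the function names `normalize`, `leading_eigenpair`, and `pca_project` are hypothetical, z-score scaling is an assumed choice of normalization, and power iteration with deflation is just one simple way to obtain principal components. Wavelet filtering and the NN classifier are left out.

```python
import math

def normalize(features):
    """Scale every feature column to zero mean and unit variance (assumed z-score)."""
    n, dims = len(features), len(features[0])
    means = [sum(row[j] for row in features) / n for j in range(dims)]
    stds = []
    for j in range(dims):
        var = sum((row[j] - means[j]) ** 2 for row in features) / n
        stds.append(math.sqrt(var) or 1.0)  # avoid dividing by zero on constant columns
    return [[(row[j] - means[j]) / stds[j] for j in range(dims)] for row in features]

def matvec(m, v):
    """Matrix-vector product for lists of lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def leading_eigenpair(cov, iterations=200):
    """Largest eigenvalue and eigenvector of a symmetric matrix by power iteration."""
    dims = len(cov)
    v = [float(j + 1) for j in range(dims)]  # arbitrary non-uniform start vector
    norm = math.sqrt(sum(c * c for c in v))
    v = [c / norm for c in v]
    for _ in range(iterations):
        w = matvec(cov, v)
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    eigenvalue = sum(a * b for a, b in zip(v, matvec(cov, v)))
    return eigenvalue, v

def pca_project(rows, k):
    """Project zero-mean rows onto their k leading principal components."""
    n, dims = len(rows), len(rows[0])
    cov = [[sum(r[a] * r[b] for r in rows) / n for b in range(dims)]
           for a in range(dims)]
    components = []
    for _ in range(k):
        lam, v = leading_eigenpair(cov)
        components.append(v)
        # Deflate: remove the component just found before searching for the next one.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(dims)]
               for a in range(dims)]
    return [[sum(x * c for x, c in zip(r, comp)) for comp in components] for r in rows]
```

In such a sketch, the twenty normalized features of each beat would be passed through `pca_project` and the reduced vectors handed to the classifier; the number of retained components is not given in the abstract.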

3 15:56 – 16:09 (S0072) (Best Paper Candidates)
An Acoustic DSP Processor with CNN-FFT Accelerators for Speech Enhancement
Yu-Chi Lee (1), Tai-Shih Chi (2), and Chia-Hsiang Yang (1)
(1) National Taiwan University and (2) National Chiao Tung University

Abstract:

This paper proposes an acoustic DSP processor with a neural network core for speech enhancement. Accelerators for convolutional neural network (CNN) and fast Fourier transform (FFT) computations are embedded, and a CNN-based speech enhancement algorithm is adopted. An array of multiply-accumulate (MAC) and coordinate rotation digital computer (CORDIC) engines is deployed to efficiently compute linear and nonlinear functions. Hardware sharing is applied to reduce area by leveraging the high similarity between CNN and FFT computations. The proposed DSP processor chip is fabricated in a 40-nm CMOS technology with a core area of 4.3 mm^2. The chip's power dissipation is 2.17 mW at an operating frequency of 5 MHz. Speech intelligibility can be enhanced by up to 41% under low-SNR conditions.
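To illustrate how a CORDIC engine evaluates a nonlinear function with only shifts and adds, the sketch below runs rotation-mode CORDIC to obtain sine and cosine. It is a generic textbook formulation in floating point, not the chip's datapath: the name `cordic_sin_cos`, the iteration count, and the choice of sine/cosine as the target functions are assumptions, since the abstract does not say which functions the engines compute.

```python
import math

ITERATIONS = 16
# Elementary rotation angles atan(2^-i); hardware would store these in a small table.
ANGLES = [math.atan(2.0 ** -i) for i in range(ITERATIONS)]
# Each micro-rotation stretches the vector; pre-scaling by GAIN cancels that growth.
GAIN = 1.0
for i in range(ITERATIONS):
    GAIN /= math.sqrt(1.0 + 2.0 ** (-2 * i))

def cordic_sin_cos(theta):
    """Return (sin, cos) of theta, for theta within roughly [-pi/2, pi/2]."""
    x, y, z = GAIN, 0.0, theta
    for i in range(ITERATIONS):
        d = 1.0 if z >= 0.0 else -1.0   # rotate toward the remaining angle
        shift = 2.0 ** -i               # a right shift by i bits in fixed-point hardware
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * ANGLES[i]
    return y, x
```

With 16 iterations the result agrees with `math.sin` and `math.cos` to a few decimal places; more iterations trade latency for precision.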

4 16:09 – 16:22 (S0159)
Low-Complexity Neural Network-Based Digital Pre-Distortion of 5G Wideband Power Amplifiers with Hybrid Architecture Design
You-Cheng Lu, Ching-Chun Liao, Sin-Sheng Wong, and An-Yeu (Andy) Wu
National Taiwan University

Abstract:

Digital pre-distortion (DPD) is used for power amplifier (PA) linearization. Because the modulations employed in 5G communications require ultra-high linearity, the performance of PA linearization based on conventional polynomial-based (MP) DPD becomes limited. Moreover, despite its superior linearization performance, recently proposed deep learning-based DPD is too complex to be implemented efficiently in hardware. In this paper, a low-complexity hybrid neural network (NN)-based DPD for 5G wideband PAs is proposed. The linearization error is jointly compensated by a hybrid architecture that combines the proposed NN compensation model with a coarse-grained MP linearizer. Simulation results show that, with comparable linearization performance, the total number of parameters can be reduced by 80% compared with the state-of-the-art NN model.
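As a rough illustration of the polynomial side of such a hybrid design, the sketch below evaluates a memory polynomial, assuming that "MP" in the abstract refers to the common memory-polynomial model y[n] = sum over k and m of a[k][m] * x[n-m] * |x[n-m]|^k. The function name `memory_polynomial` and the coefficient layout are hypothetical, and the NN compensation model and the way the two parts are combined are not shown because the abstract does not describe them.

```python
def memory_polynomial(x, coeffs):
    """Evaluate a memory polynomial on complex baseband samples.

    x      : list of complex input samples
    coeffs : coeffs[k][m] weights the term x[n-m] * |x[n-m]|**k,
             where k indexes the nonlinearity order and m the memory tap
    """
    out = []
    for n in range(len(x)):
        acc = 0j
        for k, taps in enumerate(coeffs):
            for m, a in enumerate(taps):
                if n - m >= 0:  # samples before the start are treated as zero
                    s = x[n - m]
                    acc += a * s * abs(s) ** k
        out.append(acc)
    return out
```

A "coarse-grained" linearizer in this picture would simply use few orders and few taps, leaving the residual error to a small NN.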

5 16:22 – 16:35 (S0160)
Recurrent Neural Network-based Equalizer with Utilization of Coding Gain in Advance
Chieh-Fang Teng, Han-Mo Ou, and An-Yeu (Andy) Wu
National Taiwan University

Abstract:

Recently, deep learning has been exploited in many fields with revolutionary breakthroughs. In light of this, deep learning-assisted communication systems have also attracted much attention in recent years and have the potential to break the conventional design rules for communication systems. In this work, a recurrent neural network-based equalizer is proposed, which not only eliminates channel fading but also exploits the code structure by utilizing coding gain in advance. The equalizer in a conventional block-based design may destroy the code structure and reduce the coding gain available to the decoder. In contrast, our proposed approach increases the overall utilization of coding gain, yielding a gain of more than 1.5 dB.

6 16:35 – 16:48 (S0118)
A New and Efficient SVM Accelerator Design
Jian-Jhang Chen, Jer-Min Jou, and Ming-Han Shieh
National Cheng Kung University

Abstract:

Support vector machines (SVMs) are widely used in various artificial intelligence (AI) applications. Because AI applications have high computational complexity and real-time requirements, it is critical to speed up SVM operation efficiently. Most of the SVM computation is spent in the kernel functions, which dominate the overall SVM speed and need to be implemented in dedicated hardware. In this paper, we design a new SVM hardware accelerator that efficiently speeds up the calculation of kernel functions by changing the form of the decision function and by tiling its loops. We also design a new, efficient fixed-width multiplier with very low error for use in this SVM accelerator. As a result, our SVM accelerator achieves a significantly higher detection speed than other designs, and the fixed-width multiplier has lower error than other approximate multipliers.
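The loop-tiling idea can be illustrated on a kernel SVM decision function, f(x) = sum_i c_i K(sv_i, x) + b. The sketch below is only a software analogy under stated assumptions: the RBF kernel, the tile size, and the names `decision_plain` and `decision_tiled` are hypothetical, and the paper's actual reformulation of the decision function is not described in the abstract, so it is not reproduced here.

```python
import math

def rbf_kernel(u, v, gamma):
    """Gaussian (RBF) kernel, assumed here purely for illustration."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def decision_plain(x, svs, coefs, bias, gamma):
    """Straightforward decision function: one pass over all support vectors."""
    return sum(c * rbf_kernel(sv, x, gamma) for sv, c in zip(svs, coefs)) + bias

def decision_tiled(x, svs, coefs, bias, gamma, tile=4):
    """Same sum, visiting support vectors in fixed-size tiles.

    In hardware, each tile would correspond to a block of support vectors
    that fits in on-chip buffers and is processed by parallel kernel units.
    """
    total = bias
    for start in range(0, len(svs), tile):
        partial = 0.0
        for i in range(start, min(start + tile, len(svs))):
            partial += coefs[i] * rbf_kernel(svs[i], x, gamma)
        total += partial  # accumulate one tile's contribution at a time
    return total
```

Both functions return the same value up to floating-point rounding; tiling changes only the order in which the work is scheduled.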

7 16:48 – 17:01 (S0105)
CNN Training Acceleration Solution Based on the FloatSD Technique and Its Systolic Array FPGA Implementation
Mu-Kai Sun, Chu-King Kung, and Tzi-Dar Chiueh
National Taiwan University

Abstract:

In this paper, we propose an FPGA-based CNN training acceleration system built on the floating-point signed digit (FloatSD) number representation and update method [1]. The FloatSD technique exploits the imprecision-tolerant nature of neural network training and keeps only a couple of non-zero digits in each neural network weight, reducing each convolution multiplication to the addition of two shifted partial products. Furthermore, the mantissa and exponent fields of neuron activations and gradients are also quantized during training. In addition, we describe in detail how the overall system operates through cooperation between software and the acceleration hardware in the FPGA. Finally, we present the design of a systolic array-based PE Cube that executes convolution using FloatSD arithmetic. The measured power consumption results indicate that the FloatSD MAC is more than 20 times more energy-efficient than its FP32 counterpart.
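The claim that a weight with two non-zero signed digits turns a multiplication into the addition of two shifted partial products can be checked with a small integer sketch. This is not the FloatSD format itself (its digit grouping and exponent field are not described in the abstract); the `(sign, shift)` encoding and the names `two_digit_weight` and `shift_add_multiply` are assumptions made for illustration.

```python
def two_digit_weight(d1, d2):
    """Integer value of a weight given as two (sign, shift) signed-digit terms."""
    (s1, e1), (s2, e2) = d1, d2
    return s1 * (1 << e1) + s2 * (1 << e2)

def shift_add_multiply(activation, d1, d2):
    """Multiply an integer activation by the weight with two shifts and one add.

    Each term is the activation shifted left by the digit position and
    negated when the digit is -1; no general multiplier is needed.
    """
    (s1, e1), (s2, e2) = d1, d2
    partial1 = activation << e1
    partial2 = activation << e2
    return (partial1 if s1 > 0 else -partial1) + (partial2 if s2 > 0 else -partial2)
```

For example, the weight 7 can be written as +2^3 - 2^0, so multiplying by 7 becomes one shift by three bits minus the activation itself.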







