Introduction
CNNs are broadly employed in computer vision, NLP, and other areas. However, these networks contain a lot of redundancy; in other words, one could use fewer computing resources to finish the task without accuracy loss. Many publications address this kind of simplification, and one approach is model quantization.
Quantization maps activations or weights to a small discrete set of values, typically a fixed-point data type. In the literature, some works focus on quantizing only the weights, others on the activations, and some on both. Image classification is the most common application for validating such algorithms, but other scenarios such as detection, tracking, segmentation, and even super resolution also attract attention. Below is a summary of one of these papers.
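As a rough illustration of what "mapping to a discrete set of values" means (my own sketch, not taken from any of the papers), here is a minimal uniform k-bit quantizer in PyTorch; the function name and the min/max scaling scheme are assumptions for demonstration only.

```python
import torch

def uniform_quantize(x, num_bits=4):
    """Map a tensor onto 2**num_bits evenly spaced levels over its own range.
    A minimal illustration of fixed-point style quantization, not the scheme
    of any particular paper."""
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()) / (qmax - qmin)
    zero_point = x.min()
    q = torch.round((x - zero_point) / scale).clamp(qmin, qmax)  # discrete levels
    return q * scale + zero_point  # de-quantized back to float for simulation

w = torch.randn(4, 4)
w_q = uniform_quantize(w, num_bits=4)  # w_q takes at most 16 distinct values
```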
Other related papers can be found in model-compression-summary.
Learning to train a binary neural network
paper link
From HPI
Tricks
- Add a gradient clipping threshold (also read DoReFa-Net for deeper insight). It acts as a kind of normalization or histogram adjustment applied to the gradients; see the sketch after this list.
- Add skip connections.
- In their experience, bottleneck blocks are harmful for binarized networks and should be avoided in network structure design.
- The forward/backward scaling factors in XNOR-Net and DoReFa-Net do not seem useful (they did not employ PReLU either).
- Visualize the weight/gradient histograms when possible.
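To make the gradient clipping trick concrete, here is a minimal sketch of sign binarization with a straight-through estimator whose backward pass zeroes gradients outside a threshold, in the spirit of the trick above. The class name and the default threshold of 1.0 are illustrative assumptions, not values from the paper.

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    """Sign binarization with a straight-through estimator.
    The backward pass lets gradients through only where |input| <= clip_threshold,
    which limits the gradient range as the clipping trick suggests."""

    @staticmethod
    def forward(ctx, x, clip_threshold=1.0):
        ctx.save_for_backward(x)
        ctx.clip_threshold = clip_threshold
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # zero the gradient where the input falls outside the clipping window
        mask = (x.abs() <= ctx.clip_threshold).to(grad_output.dtype)
        return grad_output * mask, None

x = torch.randn(8, requires_grad=True)
y = BinarizeSTE.apply(x)   # values in {-1, +1}
y.sum().backward()         # gradients are clipped outside [-1, 1]
```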
Thinking
The gradients have a large dynamic range, which is harmful for quantization. It is not advisable to normalize them, as that might hurt convergence. However, we can limit their range by clipping.
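A minimal sketch of limiting the gradient range by value clipping before the optimizer step, assuming PyTorch; the toy model and the clip value of 1.0 are assumptions for illustration, not hyperparameters from the paper.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

inputs, targets = torch.randn(4, 16), torch.tensor([0, 1, 0, 1])

optimizer.zero_grad()
loss = criterion(model(inputs), targets)
loss.backward()
# limit the dynamic range of the gradients by clipping instead of normalizing them
torch.nn.utils.clip_grad_value_(model.parameters(), clip_value=1.0)
optimizer.step()
```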