Introduction

CNN are broadly employed in the computer vision, NLP and others areas. However, many redudance exists in the network. In other word, one could leverage less computing resources to finish the task without accuracy loss. Many publications come out for the simplification. One of them is model quantization.

Quantization means to regress the activation or weight to several discrete set of number. Generally, quantization implies to fix point data type. In the publications, some focus on quantization on the weights, some others focus on the quantization on activation and also focus on both. Image classification is the most usual application to verify the algorithm, other scenarios such as detection, tracking, segmentation and even super resolution also attract attentions. Here summarize one of the papers.

Other related papers could be found in model-compression-summary

PACT: PARAMETERIZED CLIPPING ACTIVATION FOR QUANTIZED NEURAL NETWORKS

paper link
Comment by reviewer

Highlight

The paper proposes to set a learnable parameter during training to estimate the scale (dynamic range) of the activation. It is similar with RELU6 in my thought. However the clip value is not 6, it’s a trainable value, updated by gradient.

TAGS: paper model-compression ai quantization

« Image segmentation « Homepage » Relaxed Quantization for Discretized Neural Networks »