17.4 Quantization and low-precision computation for efficient inference


Quantization in deep learning reduces model size and improves inference speed, crucial for deploying AI on resource-constrained devices. This technique transforms floating-point values to lower-precision formats, enabling efficient computation and lower power consumption.

Evaluating quantized models involves measuring accuracy, latency, throughput, and memory usage. Optimization strategies like pruning, knowledge distillation, and hardware-aware techniques further enhance performance on edge devices, making AI more accessible and efficient.

Quantization Fundamentals

Motivation for deep learning quantization

  • Reduced model size shrinks the memory footprint, enabling faster loading times (mobile apps)
  • Improved inference speed lowers computational complexity and uses hardware resources more efficiently (edge devices)
  • Energy efficiency decreases power consumption, extending battery life (smartphones, wearables)
  • Enabling deployment on resource-constrained devices facilitates IoT and edge computing scenarios (smart home sensors)

Post-training quantization techniques

  • Dynamic range quantization automatically adjusts the quantization range for weights and activations (adaptive precision)
  • Integer quantization uses fixed-point representation, often in 8-bit integer format (INT8)
  • TensorFlow implementation leverages the TensorFlow Lite converter and its optimization API (see the sketch after this list)
  • PyTorch implementation utilizes the torch.quantization module with static and dynamic quantization options
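As a concrete illustration of the TensorFlow path, here is a minimal post-training quantization sketch using the TensorFlow Lite converter; it assumes a trained Keras model bound to the hypothetical name `model` and applies the default optimization, which corresponds to dynamic range quantization.

```python
import tensorflow as tf

# `model` is a trained tf.keras model (hypothetical placeholder).
converter = tf.lite.TFLiteConverter.from_keras_model(model)

# The default optimization applies dynamic range quantization:
# weights are stored as INT8, activations are quantized on the fly at inference time.
converter.optimizations = [tf.lite.Optimize.DEFAULT]

tflite_model = converter.convert()

# Save the quantized flatbuffer for deployment on mobile or edge devices.
with open("model_quantized.tflite", "wb") as f:
    f.write(tflite_model)
```

For full integer quantization of activations as well as weights, the converter additionally needs a representative dataset for calibration (set via its representative_dataset attribute).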

Evaluation and Optimization

Impact of quantization on models

  • Top-1 accuracy measures the percentage of correct predictions, comparing full-precision and quantized models
  • Inference latency measures the time taken for a single forward pass, in milliseconds (see the timing sketch after this list)
  • Throughput assesses the number of inferences per second, impacting batch-processing capabilities
  • Memory usage evaluates the reduction in model size and RAM requirements during inference
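To make the latency and throughput metrics concrete, the sketch below shows one common way to time a model with a simple PyTorch loop; `model` and `example_input` are hypothetical placeholders, and a real benchmark would control device, batch size, and warm-up more carefully.

```python
import time
import torch

def measure_latency_throughput(model, example_input, warmup=10, iters=100):
    """Return (average latency in ms, throughput in samples/sec)."""
    model.eval()
    with torch.no_grad():
        for _ in range(warmup):      # warm-up runs are excluded from timing
            model(example_input)
        start = time.perf_counter()
        for _ in range(iters):
            model(example_input)
        elapsed = time.perf_counter() - start
    latency_ms = elapsed / iters * 1000.0
    throughput = iters * example_input.shape[0] / elapsed
    return latency_ms, throughput
```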

Optimization for resource-constrained devices

  • Model pruning removes redundant weights and neurons, enabling sparse matrix operations (see the pruning sketch after this list)
  • Knowledge distillation transfers knowledge from larger to smaller models using the student-teacher training paradigm
  • Hardware-aware optimization tailors quantization schemes to the target hardware, utilizing hardware-specific instructions
  • Compiler optimizations perform graph-level optimizations and operator fusion
  • On-device fine-tuning adapts quantized models to specific devices, personalizing them for improved accuracy
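As a sketch of the pruning point above, PyTorch provides magnitude-based pruning utilities in torch.nn.utils.prune; the example below uses a stand-in linear layer and removes 30% of its weights by L1 magnitude.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Hypothetical layer standing in for a layer of a real model.
layer = nn.Linear(256, 128)

# Zero out the 30% of weights with the smallest L1 magnitude.
prune.l1_unstructured(layer, name="weight", amount=0.3)

# Make the pruning permanent by removing the reparameterization hooks.
prune.remove(layer, "weight")

sparsity = (layer.weight == 0).float().mean().item()
print(f"Weight sparsity: {sparsity:.1%}")
```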

Key Terms to Review (19)

8-bit integer: An 8-bit integer is a data type that can represent integer values using 8 bits, allowing for a range of values from 0 to 255 for unsigned integers or -128 to 127 for signed integers. This low-precision format is important in quantization and low-precision computation, enabling efficient inference in deep learning models by reducing memory and computational requirements without significantly impacting performance.
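To make the int8 mapping concrete, the sketch below implements a common affine (scale and zero-point) quantization scheme in NumPy; the function names are illustrative rather than taken from any particular library.

```python
import numpy as np

def quantize_int8(x, qmin=-128, qmax=127):
    # Map the observed float range [x.min(), x.max()] onto [qmin, qmax].
    # Assumes a non-degenerate range (x.max() > x.min()).
    scale = (x.max() - x.min()) / (qmax - qmin)
    zero_point = int(round(qmin - x.min() / scale))
    q = np.round(x / scale) + zero_point
    return np.clip(q, qmin, qmax).astype(np.int8), scale, zero_point

def dequantize(q, scale, zero_point):
    # Approximate reconstruction of the original float values.
    return (q.astype(np.float32) - zero_point) * scale

x = np.array([-1.2, 0.0, 0.7, 3.1], dtype=np.float32)
q, scale, zp = quantize_int8(x)
print(q, dequantize(q, scale, zp))
```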
Accuracy drop: Accuracy drop refers to the decrease in performance metrics, specifically the accuracy of a machine learning model, when it undergoes changes such as quantization or low-precision computation. This phenomenon is particularly important as it directly affects how well a model can perform on unseen data after being optimized for efficiency. Understanding accuracy drop is critical when implementing techniques to reduce model size and improve inference speed without severely compromising performance.
Compiler optimizations: Compiler optimizations refer to techniques applied by compilers to improve the performance and efficiency of the generated code. These optimizations can reduce execution time, minimize resource usage, and enhance overall program performance by refining the code without changing its output or behavior. In the context of quantization and low-precision computation, compiler optimizations are crucial as they allow models to run faster and consume less power while maintaining accuracy.
Dynamic range quantization: Dynamic range quantization is a technique used in deep learning to reduce the precision of model parameters by representing them with fewer bits while preserving the model's performance. This method aims to optimize the efficiency of inference by balancing the trade-off between computational speed and accuracy, allowing models to run on devices with limited resources.
Edge Computing: Edge computing is a distributed computing paradigm that brings computation and data storage closer to the location where it is needed, reducing latency and bandwidth use. This approach enhances the performance of applications by allowing data processing to occur at or near the source of data generation, which is particularly important in scenarios requiring real-time processing and decision-making. By leveraging edge devices, such as IoT devices and local servers, it improves the efficiency of various processes, including efficient inference, model compression, and maintaining deployed models.
Hardware-aware optimization: Hardware-aware optimization refers to the process of tailoring machine learning models to run efficiently on specific hardware platforms by taking into account the architectural features and constraints of the hardware. This involves adjusting various parameters and aspects of the model, such as precision, memory usage, and computation requirements, to leverage the strengths of the underlying hardware and improve performance, particularly during inference. The goal is to maximize efficiency and reduce resource consumption without sacrificing model accuracy.
Integer Quantization: Integer quantization is a technique used in deep learning and machine learning that converts floating-point numbers into integers, enabling models to run more efficiently on hardware with limited precision. This process reduces the model size and speeds up computations while maintaining an acceptable level of accuracy, making it essential for deploying models on resource-constrained devices like mobile phones or embedded systems.
Knowledge distillation: Knowledge distillation is a model compression technique where a smaller, more efficient model (the student) is trained to replicate the behavior of a larger, more complex model (the teacher). This process involves transferring knowledge from the teacher to the student by using the teacher's outputs to guide the training of the student model. It’s a powerful approach that enables high performance in resource-constrained environments, making it relevant for various applications like speech recognition, image classification, and deployment on edge devices.
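The standard temperature-scaled formulation of this idea mixes soft teacher targets with the usual hard-label loss; the PyTorch sketch below is illustrative, with the temperature T and mixing weight alpha as assumed hyperparameters.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Soft targets: KL divergence between temperature-scaled distributions,
    # scaled by T^2 to keep gradient magnitudes comparable.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```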
Latency: Latency refers to the time delay between an input or request and the corresponding output or response in a system. In the context of deep learning, low latency is crucial for real-time applications where quick feedback is necessary, such as in inference tasks and interactive systems. It is influenced by various factors including hardware performance, network conditions, and software optimizations.
Memory usage: Memory usage refers to the amount of computer memory that a program or process consumes while it is running. In the context of quantization and low-precision computation, efficient memory usage becomes crucial as it directly impacts the performance and resource requirements of deep learning models, especially when deploying them on edge devices with limited resources.
Mobile AI applications: Mobile AI applications are software programs that leverage artificial intelligence to perform tasks on mobile devices, providing intelligent features and functionalities that enhance user experiences. These applications utilize machine learning, natural language processing, and computer vision to deliver personalized services, automate processes, and analyze data directly from users' devices. The efficient inference of these applications often relies on techniques like quantization and low-precision computation to ensure optimal performance without compromising accuracy.
Model pruning: Model pruning is a technique used to reduce the size of deep learning models by removing unnecessary parameters, thereby improving efficiency without significantly impacting performance. This process not only helps in minimizing memory usage and computational cost but also aids in accelerating inference times, making it an essential practice for deploying models in resource-constrained environments.
On-device fine-tuning: On-device fine-tuning is the process of adapting a pre-trained machine learning model directly on a user's device using local data, which allows for personalization and improved performance without requiring access to a centralized server. This method not only enhances the model's relevance to the specific user but also addresses privacy concerns by keeping sensitive data on the device. As devices become more powerful, this approach leverages quantization and low-precision computation to efficiently utilize limited resources while maintaining model effectiveness.
Post-training quantization: Post-training quantization is a technique used to reduce the size and increase the speed of deep learning models after they have been trained, by converting the model weights and activations from high precision (usually 32-bit floats) to lower precision (like 8-bit integers). This process helps in making models more efficient for inference, especially on edge devices where resources are limited. It effectively reduces memory usage and computational load while attempting to preserve the model's accuracy.
PyTorch quantization: PyTorch quantization is a technique that reduces the numerical precision of a model's weights and activations, allowing for more efficient inference on hardware with limited computational resources. By converting floating-point numbers to lower-precision formats, such as int8, this method decreases the model's memory footprint and increases its speed during execution, which is especially beneficial for deploying models on edge devices or mobile platforms.
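For example, post-training dynamic quantization of a model's linear layers is a single call; `model` here is a hypothetical trained torch.nn.Module.

```python
import torch

# Quantize nn.Linear weights to INT8; activations are quantized on the fly.
quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
```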
Quantization-aware training: Quantization-aware training is a technique used in deep learning to simulate the effects of low-precision representation during the training process. By incorporating quantization into the training phase, models can learn to maintain accuracy despite reduced precision, which is essential for efficient inference on resource-constrained devices. This approach not only helps in reducing model size and speeding up computations but also ensures that the model performs well even when its weights and activations are quantized.
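A minimal eager-mode sketch of this workflow in PyTorch is shown below; it assumes a model already prepared for quantization (e.g., with quant/dequant stubs where needed) and elides the fine-tuning loop.

```python
import torch

model.train()
# Attach a QAT configuration so fake-quantization observers are inserted.
model.qconfig = torch.quantization.get_default_qat_qconfig("fbgemm")
torch.quantization.prepare_qat(model, inplace=True)

# ... fine-tune here so the model adapts to simulated low-precision arithmetic ...

model.eval()
# Replace fake-quantized modules with real INT8 quantized modules.
quantized_model = torch.quantization.convert(model)
```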
TensorFlow Lite: TensorFlow Lite is a lightweight version of TensorFlow designed specifically for mobile and edge devices, enabling efficient model inference on resource-constrained environments. It provides tools to optimize machine learning models to run quickly and efficiently, making it ideal for applications that require low-latency processing, such as mobile apps and IoT devices. This framework supports various techniques like quantization to reduce the model size and improve performance without sacrificing accuracy.
Throughput: Throughput refers to the amount of data processed or transmitted in a given amount of time, typically measured in operations per second or data per second. It is a crucial performance metric in computing and networking that indicates how efficiently a system can handle tasks or operations. High throughput is essential for deep learning applications, where large amounts of data need to be processed quickly and efficiently.
Trade-off analysis: Trade-off analysis is a decision-making process that involves evaluating the balance between conflicting objectives or requirements in order to optimize overall performance. In the context of efficient inference, this process helps determine how to balance model accuracy with computational efficiency, particularly when applying techniques like quantization and low-precision computation to deep learning models.