Reintroducing Straight-Through Estimators as Principled Methods for Stochastic Binary Networks

Authors: Alexander Shekhovtsov, Viktor Yanush

DOI:

Keywords:

Abstract: Training neural networks with binary weights and activations is a challenging problem due to the lack of gradients and the difficulty of optimization over discrete weights. Many successful …

References (50)
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. International Conference on Computer Vision, pp. 1026-1034, 2015. DOI: 10.1109/ICCV.2015.123
Sergey Ioffe, Christian Szegedy. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. International Conference on Machine Learning, vol. 1, pp. 448-456, 2015.
Mark Horowitz. 1.1 Computing's energy problem (and what we can do about it). International Solid-State Circuits Conference, pp. 10-14, 2014. DOI: 10.1109/ISSCC.2014.6757323
Endre Boros, Peter L. Hammer. Pseudo-Boolean optimization. Discrete Applied Mathematics, vol. 123, pp. 155-225, 2002. DOI: 10.1016/S0166-218X(01)00341-9
Radford M. Neal. Connectionist learning of belief networks. Artificial Intelligence, vol. 56, pp. 71-113, 1992. DOI: 10.1016/0004-3702(92)90065-6
Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, Ruslan Salakhutdinov. Dropout: A Simple Way to Prevent Neural Networks from Overfitting. Journal of Machine Learning Research, vol. 15, pp. 1929-1958, 2014.
Alex Graves. Practical Variational Inference for Neural Networks. Neural Information Processing Systems, vol. 24, pp. 2348-2356, 2011.
Yichuan Tang, Ruslan Salakhutdinov. Learning Stochastic Feedforward Neural Networks. Neural Information Processing Systems, vol. 26, pp. 530-538, 2013.