Rethinking Batch Normalization in Transformers
ARXIV/NLP · 2020. 3. 25. 19:24
https://arxiv.org/abs/2003.07845v1

Abstract: The standard normalization method for neural network (NN) models used in Natural Language Processing (NLP) is layer normalization (LN). This is different than batch normalization (BN), which is widely adopted in Computer Vision. The preferred use of LN in ...
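The LN-vs-BN distinction in the abstract comes down to which axis the statistics are computed over. A minimal numpy sketch (not the paper's method, just the standard definitions, with made-up toy activations) contrasting the two:

```python
import numpy as np

# Toy activations: a batch of 4 tokens, each with 8 features.
x = np.random.default_rng(0).normal(size=(4, 8))

def layer_norm(x, eps=1e-5):
    # LN: each sample is normalized over its own feature dimension,
    # so statistics are independent of the batch.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def batch_norm(x, eps=1e-5):
    # BN: each feature is normalized over the batch dimension,
    # so statistics depend on which samples are batched together.
    mu = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

ln = layer_norm(x)   # each row now has ~zero mean, unit variance
bn = batch_norm(x)   # each column now has ~zero mean, unit variance
```

Because LN's statistics never mix samples, it behaves identically at train and test time, whereas BN must track running batch statistics, which is one reason LN became the default in NLP.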