Optimize Hierarchical Softmax with Word Similarity Knowledge

Zhixuan Yang, Chong Ruan, Caihua Li, Junfeng Hu
