Transformer 代码实现及应用

栏目: 数据库 · 发布时间: 6年前

内容简介:Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin(Submitted on 12 Jun 2017 (v1), last revised 6 Dec 2017 (this version, v5))

Transformer_implementation_and_application 该资源仅仅用300行代码(Tensorflow 2)完整复现了Transformer模型,并且应用在神经机器翻译任务和聊天机器人上。

Attention Is All You Need

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin

(Submitted on 12 Jun 2017 (v1), last revised 6 Dec 2017 (this version, v5))

The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.

Comments: 15 pages, 5 figures

Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)

Cite as: arXiv:1706.03762 [cs.CL]

(or arXiv:1706.03762v5 [cs.CL] for this version)


以上就是本文的全部内容,希望对大家的学习有所帮助,也希望大家多多支持 码农网

查看所有标签

本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们

微积分的历程

微积分的历程

William Dunham / 李伯民、汪军、张怀勇 / 人民邮电出版社 / 2010-8 / 29.00元

“微积分”这一名称最早出现在哪本书中?第一本微积分教科书又是谁人所写?微积分究竟是谁人发明的?著名的洛必达法则居然是伯努利的研究成果?谁被誉为“分析学的化身”?谁又被誉为“现代分析学之父”?哪些数学天才使微积分的创建过程终于画上完美的句号?……本书将带你一一探究上述问题。 本书宛如一座陈列室,汇聚了十多位数学大师的杰作,当你徜徉其中时会对人类的想象力惊叹不已,当你离去时必然满怀对天才们的钦佩......一起来看看 《微积分的历程》 这本书的介绍吧!

html转js在线工具
html转js在线工具

html转js在线工具

UNIX 时间戳转换
UNIX 时间戳转换

UNIX 时间戳转换

RGB HSV 转换
RGB HSV 转换

RGB HSV 互转工具