Myle Ott d6be0c7e00
Use FP32 for multi-head attention softmax
6 years ago
36e360d907
Use PyTorch LayerNorm and improve weight init
6 years ago
6641520612
fairseq-py goes distributed (#106)
6 years ago
d3795d6cd1
Merge internal changes (#136)
6 years ago
6641520612
fairseq-py goes distributed (#106)
6 years ago
d3795d6cd1
Merge internal changes (#136)
6 years ago
56f9ec3c38
Use ATen built-in conv_tbc method (#66)
6 years ago
d6be0c7e00
Use FP32 for multi-head attention softmax
6 years ago
81b47e7ee6
Fix buffers in sinusoidal positional embeddings
6 years ago
