view article Article Tensor Parallelism (TP) in Transformers: 5 Minutes to Understand 23 days ago • 62
view article Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM +2 Mar 12 • 479
view article Article Training and Finetuning Reranker Models with Sentence Transformers v4 Mar 26 • 177
DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published Mar 18 • 144
Slamming: Training a Speech Language Model on One GPU in a Day Paper • 2502.15814 • Published Feb 19 • 69
view article Article Train 400x faster Static Embedding Models with Sentence Transformers Jan 15 • 222
Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning Paper • 2402.06619 • Published Feb 9, 2024 • 56
MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 50 items • Updated 15 days ago • 136
view article Article Training and Finetuning Embedding Models with Sentence Transformers v3 May 28, 2024 • 262
Aya Datasets Collection The Aya Collection is a massive multilingual collection for over 100 languages consisting of 513 million instances of prompts and completions. • 5 items • Updated Jul 31 • 23
Cohere Labs Aya 23 Collection Aya 23 is an open weights research release of an instruction fine-tuned model with highly advanced multilingual capabilities. • 3 items • Updated Jul 31 • 56