Shortcut-connected Expert Parallelism for Accelerating Mixture-of-Experts Paper • 2404.05019 • Published Apr 7, 2024 • 2
R-Horizon: How Far Can Your Large Reasoning Model Really Go in Breadth and Depth? Paper • 2510.08189 • Published Oct 9, 2025 • 27
Arc Virtual Cell Challenge: A Primer Article • FL33TW00D-HF, abhinadduri • Jul 18, 2025 • 66
The Transformers Library: standardizing model definitions Article • lysandre, ArthurZ, pcuenq, julien-c +2 • May 15, 2025 • 121
You could have designed state of the art positional encoding Article • FL33TW00D-HF • Nov 25, 2024 • 478
Welcome Llama 4 Maverick & Scout on Hugging Face Article • burtenshaw, reach-vb, pcuenq, clem, rajatarya, jsulz, lysandre +5 • Apr 5, 2025 • 149
SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published Apr 7, 2025 • 207
LLM Inference on Edge: A Fun and Easy Guide to run LLMs via React Native on your Phone! Article • medmekk, marcsun13 • Mar 7, 2025 • 98
Open-source DeepResearch – Freeing our search agents Article • m-ric, albertvillanova, merve, thomwolf, clefourrier +3 • Feb 4, 2025 • 1.32k
Open-R1: a fully open reproduction of DeepSeek-R1 Article • eliebak, lvwerra, lewtun +1 • Jan 28, 2025 • 889
Domino: Eliminating Communication in LLM Training via Generic Tensor Slicing and Overlapping Paper • 2409.15241 • Published Sep 23, 2024 • 1
Scaling Laws for Floating Point Quantization Training Paper • 2501.02423 • Published Jan 5, 2025 • 26
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22, 2024 • 260
Small-scale proxies for large-scale Transformer training instabilities Paper • 2309.14322 • Published Sep 25, 2023 • 22
How NuminaMath Won the 1st AIMO Progress Prize Article • yfleureau, liyongsea, edbeeching, lewtun, benlipkin, romansoletskyi, vwxyzjn, kashif +6 • Jul 11, 2024 • 128
Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets Paper • 2201.02177 • Published Jan 6, 2022 • 2
A failed experiment: Infini-Attention, and why we should keep trying? Article • neuralink, lvwerra, thomwolf +1 • Aug 14, 2024 • 76