Sdeerk

AI & ML interests

None yet

Recent Activity

liked a Space about 2 months ago

HuggingFaceTB/smol-training-playbook

upvoted an article 4 months ago

Continuous batching from first principles

liked a Space 5 months ago

HuggingFaceFW/blogpost-fineweb-v1



liked a Space about 2 months ago

The Smol Training Playbook

📚

3.05k

The secrets to building world-class LLMs

upvoted an article 4 months ago

Article

Continuous batching from first principles

Nov 25, 2025 • 345

liked a Space 5 months ago

FineWeb: decanting the web for the finest text data at scale

🍷

1.31k

Read a detailed overview of the FineWeb web‑scale text dataset

liked a model 5 months ago

PaddlePaddle/PaddleOCR-VL

Image-Text-to-Text • 1.0B • Updated 5 days ago • 8.28k • 1.57k

liked a model 6 months ago

baidu/ERNIE-4.5-21B-A3B-Thinking

Text Generation • 22B • Updated Nov 26, 2025 • 1.47k • 776

upvoted an article 7 months ago

Article

Vision Language Models (Better, faster, stronger)

May 12, 2025 • 601

liked 2 datasets 7 months ago

Jofthomas/hermes-function-calling-thinking-V1

Viewer • Updated Feb 16, 2025 • 3.57k • 356 • 74

NousResearch/hermes-function-calling-v1

Viewer • Updated Jan 3 • 11.6k • 6.67k • 386

upvoted a paper 8 months ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24, 2025 • 320

liked a Space 8 months ago

Awesome O1 R1

💻

[Keep updating] Collect everything about o1 and r1!

upvoted an article 8 months ago

Article

Mixture of Experts Explained

Dec 11, 2023 • 1.1k

upvoted an article 9 months ago

Article

Vision Language Models Explained

Apr 11, 2024 • 527

updated a model 9 months ago

baidu/ERNIE-4.5-21B-A3B-Base-Paddle

Text Generation • 22B • Updated Aug 20, 2025 • 4 • 10

liked a Space 9 months ago

The Ultra-Scale Playbook

🌌

3.75k

The ultimate guide to training LLM on large GPU Clusters

liked a dataset 9 months ago

openai/gsm8k

Benchmark • Updated Dec 20, 2025 • 17.6k • 673k • 1.21k

upvoted a collection 9 months ago

ERNIE 4.5

Collection

collection of ERNIE 4.5 models. • 27 items • Updated Nov 11, 2025 • 184

updated a model 9 months ago

baidu/ERNIE-4.5-21B-A3B-Paddle

Text Generation • 22B • Updated Sep 9, 2025 • 27 • 13

liked a dataset 10 months ago

K-and-K/knights-and-knaves

Viewer • Updated Oct 31, 2024 • 6.9k • 451 • 35

upvoted 2 articles 11 months ago

Article

Pre-Train BERT with Hugging Face Transformers and Habana Gaudi

Aug 22, 2022

Article

How to generate text: using different decoding methods for language generation with Transformers

Mar 1, 2020 • 294
