Hugging Face

Team

company

Verified

https://huggingface.co

huggingface

Activity Feed

AI & ML interests

The AI community building the future.

Recent Activity

sayakpaul updated a dataset about 4 hours ago

huggingface/diffusers-metadata

alvarobartt updated a dataset about 5 hours ago

huggingface/DEH-image-scan-data

eustlb authored a paper 11 days ago

Open ASR Leaderboard: Towards Reproducible and Transparent Multilingual and Long-Form Speech Recognition Evaluation

Papers

FineVision: Open Data Is All You Need

SmolVLM: Redefining small and efficient multimodal models

sayakpaul

updated a dataset about 4 hours ago

huggingface/diffusers-metadata

Viewer • Updated about 4 hours ago • 79 • 1.11k • 12

alvarobartt

updated a dataset about 5 hours ago

huggingface/DEH-image-scan-data

Viewer • Updated about 5 hours ago • 3 • 1.41k • 1

nielsr

updated a Space about 8 hours ago

AI Deadlines

⚡

550

Generate project deadlines

lysandre

updated a dataset about 9 hours ago

huggingface/transformers-metadata

Viewer • Updated about 9 hours ago • 1.94k • 1.46k • 31

eustlb

authored a paper 11 days ago

Open ASR Leaderboard: Towards Reproducible and Transparent Multilingual and Long-Form Speech Recognition Evaluation

Paper • 2510.06961 • Published Oct 8 • 9

Steveeeeeeen

authored 2 papers 15 days ago

Treble10: A high-quality dataset for far-field speech recognition, dereverberation, and enhancement

Paper • 2510.23141 • Published Oct 27 • 4

Open ASR Leaderboard: Towards Reproducible and Transparent Multilingual and Long-Form Speech Recognition Evaluation

Paper • 2510.06961 • Published Oct 8 • 9

badaoui

posted an update 19 days ago

Post

355

Building high-performance, reproducible kernels for AMD ROCm just got a lot easier.

I've put together a guide on building, testing, and sharing ROCm-compatible kernels using the Hugging Face kernel-builder and kernels libraries, so you can focus on optimizing performance rather than spending time on setup.

Learn how to:

- Use Nix for reproducible builds
- Integrate kernels as native PyTorch operators
- Share your kernels on the Hub for anyone to use with kernels.get_kernel()

We use the 🏆 award-winning RadeonFlow GEMM kernel as a practical example.

📜 Check out the full guide here: https://huggingface.co/blog/build-rocm-kernels

evalstate

posted an update 21 days ago

Post

2181

Hugging Face MCP Server v0.2.46
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Add "discover" to Dynamic Space tool. Recommend deselecting "space_search" if using dynamic spaces.

evalstate

posted an update 23 days ago

Post

2887

Hugging Face MCP Server v0.2.45
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- New! Experimental dynamic_space tool.
- Default Image Generator changed to Qwen-Image-Fast

evalstate

posted an update 29 days ago

Post

2147

Hugging Face MCP Server v0.2.40
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Improved progressive disclosure and descriptions for Jobs tool.

AdinaY

posted an update 30 days ago

Post

3194

Kimi K2 Thinking is now live on the hub 🔥

moonshotai/Kimi-K2-Thinking

✨ 1T MoE for deep reasoning & tool use
✨ Native INT4 quantization = 2× faster inference
✨ 256K context window
✨ Modified MIT license
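
One way to see why weight precision matters at this scale is to estimate the storage needed for the weights alone. The sketch below is back-of-the-envelope arithmetic, not the model authors' method: it assumes "1T" means exactly 10^12 parameters, it ignores quantization scales, activations, and the KV cache, and the helper name `weight_gigabytes` is hypothetical.

```python
def weight_gigabytes(num_params: int, bits_per_param: int) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: "1T" is taken as exactly one trillion parameters.
NUM_PARAMS = 10**12

# Compare a 16-bit format, an 8-bit format, and INT4 (4 bits per weight).
for name, bits in [("BF16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{name}: {weight_gigabytes(NUM_PARAMS, bits):,.0f} GB")
```

Under these assumptions, INT4 weights take about 500 GB, half the size of an 8-bit format and a quarter of a 16-bit one. Moving fewer bytes per token is one plausible contributor to faster inference, though the post does not say which baseline the 2× figure is measured against.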

AdinaY

posted an update about 1 month ago

Post

627

Chinese open source AI in October wasn’t about bigger models; it was about real-world impact 🔥

https://huggingface.co/collections/zh-ai-community/october-2025-china-open-source-highlights

✨ Vision-Language & OCR wave 🌊
- DeepSeek-OCR: 3B
- PaddleOCR-VL: 0.9B
- Qwen3-VL: 2B / 4B / 8B / 32B / 30B-A3B
- Open-Bee: Bee-8B-RL
- Z.ai Glyph: 10B

OCR is industrializing; the real game now is understanding the (long-context) document, not just reading it.

✨ Text generation: scale or innovation?
- MiniMax-M2: 229B
- Antgroup Ling-1T & Ring-1T
- Moonshot Kimi-Linear: linear-attention challenger
- Kwaipilot KAT-Dev

Efficiency is the key.

✨ Any-to-Any & World-Model: one step forward to the real world
- BAAI Emu 3.5
- Antgroup Ming-flash-omni
- HunyuanWorld-Mirror: 3D

Aligning with the “world model” globally

✨ Audio & Speech + Video & Visual: released from entertainment labs to delivery platforms
- SoulX-Podcast TTS
- LongCat-Audio-Codec & LongCat-Video by Meituan delivery platform
- xiabs DreamOmni 2

Looking forward to what's next 🚀

AdinaY

authored a paper about 1 month ago

RoboChallenge: Large-scale Real-robot Evaluation of Embodied Policies

Paper • 2510.17950 • Published Oct 20 • 7

AdinaY

posted an update about 1 month ago

Post

504

Kimi Linear 🚀 Hybrid linear attention model from Moonshot AI

https://huggingface.co/collections/moonshotai/kimi-linear-a3b

✨ 48B total / 3B active - MIT license
✨ Up to 1M context
✨ 84.3 on RULER (128k) with 3.98× speedup
✨ Hybrid KDA + MLA architecture for peak throughput & quality

nouamanetazi

posted an update about 1 month ago

Post

3942

After training SmolLM3 on 384 H100s for nearly a month, I've come to realize something most people overlook: infrastructure is the make-or-break factor in LLM training. 🔥

Everyone talks about model architecture and data quality. And yes, those matter immensely. But here's what nobody tells you: when your training run fails at 2 AM because of mysterious NCCL errors, or when your expensive GPU cluster is running at 60% efficiency, the problem isn't your model. It's most probably a misuse of the hardware. 🛠️

Questions that seemed simple but had no clear answers: Why is MoE training slower than dense models? Which NCCL flags should we actually set? How often should we checkpoint without killing throughput?

That's why we built The Smol Training Playbook 📖: a complete guide covering everything from model architecture and data curation to the SmolLM3 training marathon, post-training techniques, and crucially, the infrastructure layer that most teams get wrong.

We validated real vs theoretical bandwidth across the entire stack: HBM3 hitting 3 TB/s, NVLink 4.0 reaching 786 GB/s, PCIe Gen4 at 14.2 GB/s. Then we ran collective operations across 128 GPUs (16 nodes, 8xH100s each) and measured how performance degrades at scale: all-reduce drops from 480 GB/s on a single node to 320-350 GB/s across 16 nodes.
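
As a rough illustration of the scaling figures above, the share of single-node all-reduce bandwidth retained at 16 nodes can be worked out directly. This is a minimal sketch that uses only the numbers quoted in the post; the function name `retained_fraction` is hypothetical and not part of any benchmark tooling.

```python
def retained_fraction(multi_node_gbps: float, single_node_gbps: float) -> float:
    """Return multi-node bandwidth as a fraction of single-node bandwidth."""
    if single_node_gbps <= 0:
        raise ValueError("single-node bandwidth must be positive")
    return multi_node_gbps / single_node_gbps


SINGLE_NODE = 480.0                 # GB/s, all-reduce on one node (from the post)
MULTI_NODE_RANGE = (320.0, 350.0)   # GB/s across 16 nodes (from the post)

# Evaluate both ends of the reported multi-node range.
low, high = (retained_fraction(bw, SINGLE_NODE) for bw in MULTI_NODE_RANGE)
print(f"retained: {low:.1%} to {high:.1%}")
```

On these figures, all-reduce keeps roughly 67% to 73% of its single-node bandwidth when spread across 16 nodes, a loss of about 27% to 33%.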

If you've ever wondered why your training runs are slower than they should be, or you're planning to scale up and want to avoid expensive mistakes, this guide might save you weeks of debugging.

The Smol Training Playbook: https://lnkd.in/e5MKXUHS

Shared with ❤️ by the Hugging Face team

meg

posted an update about 1 month ago

Post

3726

🤖 Did you know your voice might be cloned without your consent from just *one sentence* of audio?
That's not great. So with @frimelle, we brainstormed a new idea for developers who want to curb malicious use: ✨The Voice Consent Gate.✨
Details and code here: https://huggingface.co/blog/voice-consent-gate

3 replies

evalstate

posted an update about 1 month ago

Post

323

Hugging Face MCP Server v0.2.35
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

$HF_TOKEN is expanded in Jobs Secrets environment variables.

AdinaY

posted an update about 1 month ago

Post

1747

Ming-flash-omni Preview 🚀 Multimodal foundation model from AntGroup

inclusionAI/Ming-flash-omni-Preview

✨ Built on Ling-Flash-2.0: 10B total/6B active
✨ Generative segmentation-as-editing
✨ SOTA contextual & dialect ASR
✨ High-fidelity image generation

AdinaY

posted an update about 1 month ago

Post

1855

Glyph 🔥 a framework that scales context length by compressing text into images and processing them with vision–language models, released by Z.ai.

Paper: https://huggingface.co/papers/2510.17800
Model: https://huggingface.co/zai-org/Glyph

✨ Compresses long sequences visually to bypass token limits
✨ Reduces computational and memory costs
✨ Preserves meaning through multimodal encoding
✨ Built on GLM-4.1V-9B-Base

Articles

On the Shifting Global Compute Landscape

Announcing Hugging Face Fundamentals: A New Learning Track on DataCamp

Yay! Organizations can now publish blog Articles

Team members 191
