AI & ML interests

The AI community building the future.

Recent Activity

badaoui
posted an update 19 days ago
Building high-performance, reproducible kernels for AMD ROCm just got a lot easier.

I've put together a guide on building, testing, and sharing ROCm-compatible kernels using the Hugging Face kernel-builder and kernels libraries, so you can focus on optimizing performance rather than spending time on setup.

Learn how to:

- Use Nix for reproducible builds
- Integrate kernels as native PyTorch operators
- Share your kernels on the Hub for anyone to use with kernels.get_kernel()
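
As a rough sketch of that last step, loading a shared kernel from the Hub might look like the following. It assumes the `kernels` package is installed and that the machine has a supported GPU. `load_hub_kernel` is a hypothetical helper name, and the repo id in the usage comment is a placeholder rather than the actual location of the RadeonFlow kernel.

```python
def load_hub_kernel(repo_id: str):
    """Fetch a compiled kernel from the Hugging Face Hub.

    Hypothetical helper: it only wraps kernels.get_kernel(), the call
    named in the post. The import is lazy so that this module can still
    be imported on machines where the kernels package is not installed.
    """
    from kernels import get_kernel  # assumes `pip install kernels`

    return get_kernel(repo_id)


# Usage (placeholder repo id; needs network access and a supported GPU):
#   kernel = load_hub_kernel("<user-or-org>/<kernel-repo>")
```

The functions exposed by the returned object depend on the kernel repository itself, so check that repository's README before calling anything on it.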

We use the 🏆 award-winning RadeonFlow GEMM kernel as a practical example.

📜 Check out the full guide here: https://huggingface.co/blog/build-rocm-kernels
evalstate
posted an update 21 days ago
Hugging Face MCP Server v0.2.46
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Add "discover" to Dynamic Space tool. Recommend deselecting "space_search" if using dynamic spaces.
evalstate
posted an update 23 days ago
Hugging Face MCP Server v0.2.45
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- New! Experimental dynamic_space tool.
- Default Image Generator changed to Qwen-Image-Fast
evalstate
posted an update 29 days ago
Hugging Face MCP Server v0.2.40
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Improved progressive disclosure and descriptions for Jobs tool.
AdinaY
posted an update 30 days ago
Kimi K2 Thinking is now live on the hub 🔥

moonshotai/Kimi-K2-Thinking

✨ 1T MoE for deep reasoning & tool use
✨ Native INT4 quantization = 2× faster inference
✨ 256K context window
✨ Modified MIT license
AdinaY
posted an update about 1 month ago
Chinese open source AI in October wasn't about bigger models; it was about real-world impact 🔥

https://huggingface.co/collections/zh-ai-community/october-2025-china-open-source-highlights

✨ Vision-Language & OCR wave 🌊
- DeepSeek-OCR: 3B
- PaddleOCR-VL: 0.9B
- Qwen3-VL: 2B / 4B / 8B / 32B / 30B-A3B
- Open-Bee: Bee-8B-RL
- Z.ai Glyph: 10B

OCR is industrializing; the real game now is understanding the (long-context) document, not just reading it.

✨ Text generation: scale or innovation?
- MiniMax-M2: 229B
- Antgroup Ling-1T & Ring-1T
- Moonshot Kimi-Linear: linear-attention challenger
- Kwaipilot KAT-Dev

Efficiency is the key.

✨ Any-to-Any & World-Model: one step forward to the real world
- BAAI Emu 3.5
- Antgroup Ming-flash-omni
- HunyuanWorld-Mirror: 3D

Aligning with the "world model" globally

✨ Audio & Speech + Video & Visual: released from entertainment labs to delivery platforms
- SoulX-Podcast TTS
- LongCat-Audio-Codec & LongCat-Video by Meituan delivery platform
- xiabs DreamOmni 2

Looking forward to what's next 🚀
AdinaY
posted an update about 1 month ago
Kimi Linear 🚀 Hybrid linear attention model from Moonshot AI

https://huggingface.co/collections/moonshotai/kimi-linear-a3b

✨ 48B total / 3B active - MIT license
✨ Up to 1M context
✨ 84.3 on RULER (128k) with 3.98× speedup
✨ Hybrid KDA + MLA architecture for peak throughput & quality
nouamanetazi
posted an update about 1 month ago
After training SmolLM3 on 384 H100s for nearly a month, I've come to realize something most people overlook: infrastructure is the make-or-break factor in LLM training. 🔥

Everyone talks about model architecture and data quality. And yes, those matter immensely. But here's what nobody tells you: when your training run fails at 2 AM because of mysterious NCCL errors, or when your expensive GPU cluster is running at 60% efficiency, the problem isn't your model. It's most probably a misuse of the hardware. 🛠️

Questions that seemed simple but had no clear answers: Why is MoE training slower than dense models? Which NCCL flags should we actually set? How often should we checkpoint without killing throughput?

That's why we built The Smol Training Playbook 📖: a complete guide covering everything from model architecture and data curation to the SmolLM3 training marathon, post-training techniques, and crucially, the infrastructure layer that most teams get wrong.

We validated real vs theoretical bandwidth across the entire stack: HBM3 hitting 3 TB/s, NVLink 4.0 reaching 786 GB/s, PCIe Gen4 at 14.2 GB/s. Then we ran collective operations across 128 GPUs (16 nodes, 8xH100s each) and measured how performance degrades at scale: all-reduce drops from 480 GB/s on a single node to 320-350 GB/s across 16 nodes.
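
To put the all-reduce figures in proportion, here is a small back-of-the-envelope sketch in Python. The bandwidth numbers are the ones quoted in the post (480 GB/s on one node, 320-350 GB/s across 16 nodes); the function and variable names are illustrative assumptions and are not taken from the playbook.

```python
def retained_fraction(single_node_gbps: float, multi_node_gbps: float) -> float:
    """Share of single-node all-reduce bandwidth still available at scale."""
    if single_node_gbps <= 0:
        raise ValueError("single-node bandwidth must be positive")
    return multi_node_gbps / single_node_gbps


# Cluster shape from the post: 16 nodes with 8 H100s each.
NODES = 16
GPUS_PER_NODE = 8
total_gpus = NODES * GPUS_PER_NODE

# All-reduce bandwidth figures quoted in the post, in GB/s.
SINGLE_NODE_GBPS = 480.0
MULTI_NODE_LOW_GBPS = 320.0
MULTI_NODE_HIGH_GBPS = 350.0

low = retained_fraction(SINGLE_NODE_GBPS, MULTI_NODE_LOW_GBPS)
high = retained_fraction(SINGLE_NODE_GBPS, MULTI_NODE_HIGH_GBPS)

print(f"GPUs in the run: {total_gpus}")
print(f"Bandwidth retained across nodes: {low:.1%} to {high:.1%}")
print(f"Bandwidth lost across nodes: {1 - high:.1%} to {1 - low:.1%}")
```

On these numbers, going from one node to 16 keeps roughly two thirds to three quarters of the single-node all-reduce bandwidth.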

If you've ever wondered why your training runs are slower than they should be, or you're planning to scale up and want to avoid expensive mistakes, this guide might save you weeks of debugging.

The Smol Training Playbook: https://lnkd.in/e5MKXUHS

Shared with ❤️ by the Hugging Face team
meg
posted an update about 1 month ago
🤖 Did you know your voice might be cloned without your consent from just *one sentence* of audio?
That's not great. So with @frimelle, we brainstormed a new idea for developers who want to curb malicious use: ✨The Voice Consent Gate.✨
Details and code here: https://huggingface.co/blog/voice-consent-gate
evalstate
posted an update about 1 month ago
Hugging Face MCP Server v0.2.35
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

$HF_TOKEN is expanded in Jobs Secrets environment variables.
AdinaY
posted an update about 1 month ago
Ming-flash-omni Preview 🚀 Multimodal foundation model from AntGroup

inclusionAI/Ming-flash-omni-Preview

✨ Built on Ling-Flash-2.0: 100B total / 6B active
✨ Generative segmentation-as-editing
✨ SOTA contextual & dialect ASR
✨ High-fidelity image generation
AdinaY
posted an update about 1 month ago

Glyph 🔥 a framework that scales context length by compressing text into images and processing them with vision–language models, released by Z.ai.

Paper: https://huggingface.co/papers/2510.17800
Model: https://huggingface.co/zai-org/Glyph

✨ Compresses long sequences visually to bypass token limits
✨ Reduces computational and memory costs
✨ Preserves meaning through multimodal encoding
✨ Built on GLM-4.1V-9B-Base