🔹 True Single-GPU Extreme Speed ⚡️ No need to rely on traditional workarounds like KV-cache, quantization, sparse/linear attention, or TinyVAE. Helios hits an end-to-end 19.5 FPS on a single H100!
Training is also highly accessible: a single 80GB GPU can fit four 14B models.
🔹 Solving Long-Video "Drift" from the Core 🎥 Tired of visual drift and repetitive loops? We ditched traditional hacks (like error banks, self-forcing, or keyframe sampling).
Instead, our innovative training strategy simulates & eliminates drift directly, keeping minute-long videos incredibly coherent with stunning quality. ✨
🔹 3 Model Variants for Full Coverage 🛠️ With a unified architecture natively supporting T2V, I2V, and V2V, we are open-sourcing 3 flavors:
1️⃣ Base: Single-stage denoising for extreme high fidelity.
2️⃣ Mid: Pyramid denoising + CFG-Zero for the perfect balance of quality & throughput.
3️⃣ Distilled: Adversarial distillation (DMD) for ultra-fast, few-step generation.
🔹 Day-0 Ecosystem Ready 🌍 We wanted deployment to be a breeze from the second we launched. Helios drops with comprehensive Day-0 hardware and framework support:
Let's keep the momentum for small models. I just published dot. It's the first pretrained causal model trained on math/symbols rather than English. The goal is an agnostic few-shot meta-learner that learns from reality itself instead of language.
It's already decent at some tasks, with next version coming in a few weeks.
Nvidia is on a roll lately. Nemotron 3 Nano is my new fav local model, but here's the real flex: they published the entire evaluation setup. Configs, prompts, logs, all of it. This is how you do open models 🔥
Muon has gone from an experiment to a mainstream optimizer, but does it hold up for fine-tuning? We ran head-to-head tests on Qwen3-4B (10k+ high-quality instruction rows) to find out.
Short story: Pure Muon converged fastest at the start, but its gradient-norm spikes made training unstable. MuonClip (Kimi K2's clipping) stabilizes long pretraining runs, yet in our small-scale fine-tune it underperformed: lower token accuracy and slower convergence. The winner was the hybrid: Muon for 2D layers + AdamW for 1D layers. It delivered the best balance of stability and final performance and even beat vanilla AdamW.
Takeaway: for small-scale fine-tuning, hybrid = practical and reliable.
Next Step: scale to larger models/datasets to see if Muon’s spikes become catastrophic or if clipping wins out.
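The hybrid setup above boils down to partitioning parameters by tensor rank. A minimal sketch of that split, assuming a `named_shapes` mapping as a hypothetical stand-in for `model.named_parameters()` (the name-based exclusions are common practice, not necessarily the exact rule we used):

```python
def split_params_for_hybrid(named_shapes):
    """Partition parameter names by tensor rank: 2D weights -> Muon, rest -> AdamW."""
    muon_group, adamw_group = [], []
    for name, shape in named_shapes.items():
        # Muon's orthogonalized update only applies to weight matrices;
        # embeddings and the LM head are typically kept on AdamW as well.
        if len(shape) == 2 and "embed" not in name and "lm_head" not in name:
            muon_group.append(name)
        else:
            adamw_group.append(name)  # biases, norms, embeddings, lm_head
    return muon_group, adamw_group

# Hypothetical parameter shapes for illustration:
params = {
    "model.layers.0.mlp.up_proj.weight": (11008, 4096),
    "model.layers.0.input_layernorm.weight": (4096,),
    "model.embed_tokens.weight": (151936, 4096),
    "lm_head.weight": (151936, 4096),
}
muon, adamw = split_params_for_hybrid(params)
```

Each group then gets its own optimizer instance (or its own param group), so the 2D projection matrices see Muon updates while everything else stays on AdamW.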
now you can use google's magenta-realtime model to generate 48k samples based on your input audio (or other model outputs...there's 4 to play with now).
just duplicate my hf space, turn on an L4/L40s and throw the url into the plugin.
i've got a few finetunes you can switch to as well. or you can push your finetune to the hub and play around.
the space: thecollabagepatch/magenta-retry (you can also use the html web tester to play around with realtime generation on the L40s)
Multilingual Tokenization Showdown: Analyzing 12 LLM Tokenizers Across 204 Languages.
First, I've created a dataset with Wikipedia's "Cat" article text in 272 languages: Norod78/WikiCat-Multilingual
For each language entry with at least 100 words, I tokenized the text with each of the 12 tokenizers and calculated the "characters per token" and "words per token" ratios. The higher these ratios, the more information each token represents on average for that language (which may let an LLM learn more per parameter when trained on a dataset in that language).
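The two ratios are simple to compute. A toy illustration, using a naive whitespace split as a stand-in tokenizer (the real analysis uses 12 actual LLM tokenizers; the function names here are made up for the example):

```python
def tokenizer_ratios(text, tokenize):
    """Return (characters per token, words per token) for the given text."""
    tokens = tokenize(text)
    n_tokens = len(tokens)
    chars_per_token = len(text) / n_tokens
    words_per_token = len(text.split()) / n_tokens
    return chars_per_token, words_per_token

# Stand-in tokenizer: splits on whitespace, so words-per-token is exactly 1.0.
# A real subword tokenizer usually yields a value below 1.0 for most languages.
naive_tokenize = str.split

cpt, wpt = tokenizer_ratios("the cat sat on the mat", naive_tokenize)
```

With a real tokenizer you would swap `naive_tokenize` for something like the tokenizer's `encode` method and average the ratios over each language's article text.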
I hope I interpreted the results correctly. I've made the code available on GitHub, so you can re-create the raw results JSONL with this repo: https://github.com/Norod/wikicat-tokenizer-eval
The first project, as far as I know, that focuses purely on few-shot prompting results rather than zero-shot, as is usually done with decoder-only transformer models. This model excels at few-shot tasks compared to most 0.6B and even larger models. It also outperforms the base model on some popular language-modeling benchmarks.
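For readers unfamiliar with the distinction: few-shot evaluation prepends k solved demonstrations to the query instead of asking the model cold. A minimal sketch of building such a prompt; the `Input:`/`Output:` template is a generic assumption, not this project's exact format:

```python
def build_few_shot_prompt(examples, query, sep="\n"):
    """Concatenate k input->output demonstrations, then the unanswered query."""
    shots = [f"Input: {x}{sep}Output: {y}" for x, y in examples]
    # The final entry leaves "Output:" blank for the model to complete.
    shots.append(f"Input: {query}{sep}Output:")
    return (sep * 2).join(shots)

prompt = build_few_shot_prompt(
    [("2 + 2", "4"), ("3 + 5", "8")],  # two demonstrations (2-shot)
    "7 + 6",                           # query the model must complete
)
```

Zero-shot evaluation is the degenerate case with an empty `examples` list; the claim above is that this model's strength shows up specifically when k > 0.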