Models (71,953)

Active filters: reinforcement-learning

Adilbai/stock-trading-rl-agent

Reinforcement Learning • Updated Jan 8 • 115 • 133

ValueFX9507/Tifa-DeepsexV2-7b-MGRPO-GGUF-Q8

Reinforcement Learning • 8B • Updated Mar 28, 2025 • 17.6k • 200

zai-org/GLM-TTS

Text-to-Speech • Updated Jan 12 • 1.91k • 332

exla-ai/openpie-0.6

Robotics • Updated Feb 4 • 56 • 16

XunmeiLiu/VFIG-4B

Reinforcement Learning • 4B • Updated 13 days ago • 329 • 5

wanglab/bioreason-pro-rl

Reinforcement Learning • 4B • Updated 12 days ago • 199 • 7

zlab-princeton/Vero-Qwen3T-8B

Image-Text-to-Text • 9B • Updated 1 day ago • 11 • 2

batteryphil/mamba-2.8b-latent

Reinforcement Learning • 3B • Updated 3 days ago • 1.16k • 2

IntelligenceLab/COS-PLAY

Reinforcement Learning • Updated 1 day ago • 2

ValueFX9507/Tifa-DeepsexV2-7b-MGRPO-GGUF-Q4

Reinforcement Learning • 8B • Updated Mar 26, 2025 • 720 • 224

Open-Reasoner-Zero/Open-Reasoner-Zero-1.5B

Reinforcement Learning • 2B • Updated Apr 6, 2025 • 67 • 1

TianheWu/VisualQuality-R1-7B

Reinforcement Learning • 8B • Updated Sep 19, 2025 • 1.65k • 11

JonusNattapong/AI-XAUUSD-Trading

Reinforcement Learning • Updated Oct 10, 2025 • 29

dunnolab/VintixII

Reinforcement Learning • Updated about 8 hours ago • 6 • 1

Intel/deepmath-v1

Text Generation • 4B • Updated Dec 8, 2025 • 14 • 12

HiThink-Research/CCPO-7B-3AO-AITW

Image-Text-to-Text • 8B • Updated Jan 13 • 6 • 1

PrimeIntellect/INTELLECT-3.1

Text Generation • 107B • Updated Feb 18 • 352 • 41

Shion1124/sapo-gdpo-dora-qwen-struct

Text Generation • 4B • Updated Feb 12 • 8 • 2

Saminx22/qwen3-0.6B-rlvr-grpo

Reinforcement Learning • Updated Feb 21 • 1

y-ohtani/qwen3-4b-grpo-tcr-agent

Text Generation • 4B • Updated Mar 1 • 16 • 2

OpenHands/CodeScout-4B

Text Generation • 4B • Updated 21 days ago • 130 • 1

Camais03/camie-crafter

Reinforcement Learning • Updated 10 days ago • 204 • 4

xxwu/Agent-STAR-RL-7B

Text Generation • 8B • Updated 14 days ago • 277 • 1

mradermacher/Agent-STAR-RL-7B-GGUF

Reinforcement Learning • 8B • Updated 13 days ago • 746 • 1

mradermacher/Agent-STAR-RL-7B-i1-GGUF

Reinforcement Learning • 8B • Updated 12 days ago • 5.36k • 1

zlab-princeton/Vero-Qwen3I-8B

Image-Text-to-Text • 9B • Updated 1 day ago • 15 • 1

chenh0001/ppo-LunarLander-v2

Reinforcement Learning • Updated 9 days ago • 1 • 2

Dynamical-Systems/Dynamical-30B-A3B

Text Generation • 31B • Updated 6 days ago • 256 • 1

Abc8264/TutorAI-Chemistry-Phi4

Reinforcement Learning • 4B • Updated 4 days ago • 356 • 1

BJyotibrat/cartpole-dqn-converged

Reinforcement Learning • Updated 3 days ago • 47 • 1