JetLM/SDAR-1.7B-Chat

dav1dliu committed on Aug 18, 2025

Commit 3588ef6 · verified · 1 Parent(s): 5f8f25c

Update README.md

Files changed (1)

README.md +5 -4


@@ -6,7 +6,8 @@ library_name: transformers
 # SDAR
 <div align="center">
-<img src="https://raw.githubusercontent.com/JetAstra/SDAR/refs/heads/main/assets/SDAR_doc_head.png?token=GHSAT0AAAAAADFKOPNMXTQ6UR7GA34QYJJ22FDIWHQ" />
 <div>&nbsp;</div>
@@ -34,7 +35,7 @@ For **SDAR** models, inference hyperparameters are set to: `block_length = 4`, `
 For **Qwen3-1.7B-AR-SFT** and **Qwen3-30B-AR-SFT**, we use *greedy decoding*, and the base models **Qwen3-1.7B-Base** and **Qwen3-30B-Base** are derived from the [Qwen3 Technical Report](https://arxiv.org/abs/2505.09388).
 <p align="center">
-  <img src="https://raw.githubusercontent.com/JetAstra/SDAR/refs/heads/main/assets/table1.png?token=GHSAT0AAAAAADFKOPNMBLKTAWINPIVDQHKC2FDI34A" style="max-width:95%; height:auto;">
 <p align="center">
 ### SDAR-Sci v.s. AR Baseline
@@ -43,7 +44,7 @@ This table presents a **controlled comparison** between AR and SDAR under the sa
 The results are averaged over 8 runs for GPQA, and over 32 runs each for AIME 2024, AIME 2025, and LiveMathBench.
 <p align="center">
-  <img src="https://raw.githubusercontent.com/JetAstra/SDAR/refs/heads/main/assets/table2.png?token=GHSAT0AAAAAADFKOPNMDZXQX3RWFLUXAZQU2FDI4KA" style="max-width:95%; height:auto;">
 <p align="center">
 #### SDAR-Sci v.s. Other Models
@@ -52,5 +53,5 @@ This table positions **SDAR-30B-A3B-Sci(sample)** against leading open-source an
 Scores for external models are sourced from the [InternLM/Intern-S1](https://github.com/InternLM/Intern-S1) repository.
 <p align="center">
-  <img src="https://raw.githubusercontent.com/JetAstra/SDAR/refs/heads/main/assets/table3.png?token=GHSAT0AAAAAADFKOPNNMHMGMDWZ37WFK2MW2FDI4UQ" style="max-width:95%; height:auto;">
 <p align="center">

 # SDAR
 <div align="center">
+<img src="https://raw.githubusercontent.com/JetAstra/SDAR/main/assets/SDAR_doc_head.png">
 <div>&nbsp;</div>
 For **Qwen3-1.7B-AR-SFT** and **Qwen3-30B-AR-SFT**, we use *greedy decoding*, and the base models **Qwen3-1.7B-Base** and **Qwen3-30B-Base** are derived from the [Qwen3 Technical Report](https://arxiv.org/abs/2505.09388).
 <p align="center">
+  <img src="https://raw.githubusercontent.com/JetAstra/SDAR/main/assets/table1.png" style="max-width:80%; height:auto;">
 <p align="center">
 ### SDAR-Sci v.s. AR Baseline
 The results are averaged over 8 runs for GPQA, and over 32 runs each for AIME 2024, AIME 2025, and LiveMathBench.
 <p align="center">
+  <img src="https://raw.githubusercontent.com/JetAstra/SDAR/main/assets/table2.png" style="max-width:80%; height:auto;">
 <p align="center">
 #### SDAR-Sci v.s. Other Models
 Scores for external models are sourced from the [InternLM/Intern-S1](https://github.com/InternLM/Intern-S1) repository.
 <p align="center">
+  <img src="https://raw.githubusercontent.com/JetAstra/SDAR/main/assets/table3.png" style="max-width:80%; height:auto;">
 <p align="center">