TuanNguyen2003 commited on
Commit
515e131
·
verified ·
1 Parent(s): 907e87b

Create README.md

Browse files
Files changed (1) hide show
  1. README.md +84 -0
README.md ADDED
@@ -0,0 +1,84 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ license: apache-2.0
3
+ language:
4
+ - en
5
+ base_model: Qwen/Qwen3-1.7B
6
+ pipeline_tag: text-generation
7
+ library_name: transformers
8
+ tags:
9
+ - qwen3
10
+ - sft
11
+ - general-knowledge
12
+ - multiple-choice
13
+ - cs-552
14
+ datasets:
15
+ - cais/mmlu
16
+ metrics:
17
+ - accuracy
18
+ ---
19
+
20
+ # General Knowledge Model
21
+
22
+ This model is a fine-tuned version of [`Qwen/Qwen3-1.7B`](https://huggingface.co/Qwen/Qwen3-1.7B) for the CS-552 Modern NLP course project.
23
+
24
+ The model targets the **General Knowledge** benchmark, where it answers closed-book multiple-choice factual and reasoning questions. It was trained to return the final answer as a single option letter inside a LaTeX `\boxed{}` expression.
25
+
26
+ ## Intended output format
27
+
28
+ The model should produce answers in the following format:
29
+
30
+ ```text
31
+ \boxed{C}
32
+ ```
33
+
34
+ Anything outside `\boxed{}` is treated as reasoning and is not used for scoring by the evaluation pipeline.
35
+
36
+ ## Training procedure
37
+
38
+ This checkpoint was trained using **Supervised Fine-Tuning (SFT)** with LoRA on top of `Qwen/Qwen3-1.7B`.
39
+
40
+ The SFT data was formatted as instruction-style multiple-choice examples:
41
+
42
+ ```text
43
+ Q: ...
44
+ A) ...
45
+ B) ...
46
+ C) ...
47
+ D) ...
48
+
49
+ Answer: \boxed{C}
50
+ ```
51
+
52
+ The current checkpoint was trained on a processed General Knowledge dataset derived from MMLU-style multiple-choice examples.
53
+
54
+ ## Model behavior
55
+
56
+ The model is optimized for:
57
+
58
+ - closed-book factual question answering
59
+ - multiple-choice reasoning
60
+ - final-answer extraction through `\boxed{}`
61
+ - concise option-letter responses
62
+
63
+ The tokenizer chat template was configured with non-thinking mode to encourage concise answers.
64
+
65
+ ## Local validation
66
+
67
+ On the provided General Knowledge validation snapshot from the course starter repository, this checkpoint achieved:
68
+
69
+ - Extraction rate: `10/10`
70
+ - Accuracy: `6/10`
71
+
72
+ These validation samples are only a small sanity-check set and are not the hidden evaluation benchmark.
73
+
74
+ ## Framework versions
75
+
76
+ - Transformers
77
+ - PEFT
78
+ - PyTorch
79
+ - Datasets
80
+ - Hugging Face Hub
81
+
82
+ ## Limitations
83
+
84
+ This is an intermediate SFT baseline, not the final model. It was trained mainly to establish a working General Knowledge pipeline and verify that the model can produce extractable boxed answers. Performance may vary on broader or harder factual reasoning tasks.