# Qwen3-4B-Text-Summarizer-finetuning
## Model Description
Qwen3-4B-Text-Summarizer-finetuning is a fine-tuned version of the Qwen/Qwen3-4B (4-billion-parameter) large language model, built specifically for abstractive dialogue summarization.
Unlike general-purpose summarizers that struggle with informal language, slang, and speaker turns, this model combines the reasoning strengths of the Qwen3 architecture with Rank-Stabilized LoRA (RSLoRA) adaptation to produce concise, factual, and coherent summaries of conversations.
## Key Features
- Advanced Base: Built on Qwen3-4B, utilizing its "Thinking" capabilities for better context understanding.
- SOTA Adaptation: Trained with rank-64 (r=64), rank-stabilized LoRA adapters targeting all linear projection layers (q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj) for high adaptation capacity; the effect of rank stabilization on adapter scaling is illustrated after this list.
- Dialogue Specialized: Fine-tuned on the SAMSum dataset, making it an expert at summarizing real-world, messy human conversations.
- Efficient: 4-bit quantized (QLoRA) for low-VRAM inference while staying close to 16-bit output quality.
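Rank stabilization changes how the low-rank update is scaled before it is added to the frozen weights: standard LoRA scales the update by alpha/r, while RSLoRA scales it by alpha/sqrt(r), which keeps the update from shrinking as the rank grows. The snippet below is purely illustrative arithmetic (not part of the training or inference code), showing the effective scales for the r=64, alpha=32 configuration listed in the specifications below.

```python
import math

r, alpha = 64, 32  # adapter rank and alpha from the specifications table

standard_scale = alpha / r           # classic LoRA scaling      -> 0.5
rslora_scale = alpha / math.sqrt(r)  # rank-stabilized scaling   -> 4.0

print(f"standard LoRA scale: {standard_scale}")
print(f"RSLoRA scale:        {rslora_scale}")
```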
## Technical Specifications
| Feature | Specification |
|---|---|
| Base Model | Qwen/Qwen3-4B-Instruct |
| Architecture | Causal Decoder-Only Transformer |
| Quantization | 4-bit (BitsAndBytes) |
| Adapter Rank | 64 (High Capacity) |
| Alpha | 32 (Rank Stabilized) |
| Target Modules | All Linear Layers |
| Training Framework | Unsloth + TRL + PEFT |
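For reference, the specifications above map roughly onto the following Unsloth/PEFT setup. This is a hedged reconstruction rather than the exact training script: the base checkpoint name, dropout, gradient checkpointing mode, and other hyperparameters not listed in the table are assumptions.

```python
from unsloth import FastLanguageModel

# Load the base model in 4-bit (BitsAndBytes NF4) for QLoRA-style training.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "Qwen/Qwen3-4B",  # assumption: swap in the Instruct variant if that matches your setup
    max_seq_length = 2048,
    load_in_4bit = True,
)

# Attach rank-64, rank-stabilized LoRA adapters to every linear projection layer.
model = FastLanguageModel.get_peft_model(
    model,
    r = 64,
    lora_alpha = 32,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_dropout = 0,        # assumption: not listed in the table
    bias = "none",           # assumption
    use_rslora = True,       # rank-stabilized scaling (alpha / sqrt(r))
    use_gradient_checkpointing = "unsloth",  # assumption: typical Unsloth default
)
```

The resulting PEFT model can then be trained with TRL's `SFTTrainer` on the SAMSum dialogues, in line with the framework and dataset noted above.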
## How to Use
### 1. Fast Inference (Recommended with Unsloth)
For up to 2x faster inference, use the `unsloth` library.
```python
from unsloth import FastLanguageModel
import torch

repo_id = "riturajpandey739/Qwen3-4B-Text-Summarizer-finetuning"

# Load the 4-bit quantized model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = repo_id,
    max_seq_length = 2048,
    dtype = None,          # None = auto-detect (bfloat16 on Ampere+, float16 otherwise)
    load_in_4bit = True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's fast inference mode

# Define prompt template
prompt_template = """Below is a conversation between people. Write a concise summary of the conversation.
### Dialogue:
{}
### Summary:
"""

# Inference
dialogue = """
Rituraj: The model is working perfectly on Colab.
Scientist: That is great news. Did you push it to the hub?
Rituraj: Yes, I just updated the README as well.
"""
inputs = tokenizer([prompt_template.format(dialogue)], return_tensors = "pt").to("cuda")
outputs = model.generate(
    **inputs,
    max_new_tokens = 128,
    do_sample = True,      # temperature only takes effect when sampling is enabled
    temperature = 0.1,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
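### 2. Standard Transformers (Alternative)
If you prefer not to install Unsloth, the model can also be loaded with plain `transformers` and `bitsandbytes`. This is a minimal sketch, assuming the repository hosts a standalone checkpoint; if it contains only LoRA adapter weights, load the base model first and attach the adapters with `peft.PeftModel.from_pretrained` instead.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

repo_id = "riturajpandey739/Qwen3-4B-Text-Summarizer-finetuning"

# 4-bit NF4 quantization for low-VRAM inference.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Same prompt format as in section 1.
prompt = (
    "Below is a conversation between people. Write a concise summary of the conversation.\n"
    "### Dialogue:\n"
    "Rituraj: The model is working perfectly on Colab.\n"
    "Scientist: That is great news. Did you push it to the hub?\n"
    "### Summary:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)  # greedy decoding
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```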