Qwen3-4B-Text-Summarizer-finetuning

A State-of-the-Art Dialogue Summarization Model Fine-tuned by Rituraj Pandey

Model Description

Qwen3-4B-Text-Summarizer-finetuning is a fine-tuned version of Qwen/Qwen3-4B, a 4-billion-parameter large language model. It is specialized for abstractive dialogue summarization.

Unlike standard summarizers that struggle with informal language, slang, and speaker transitions, this model leverages the advanced reasoning capabilities of the Qwen3 architecture, adapted via Rank-Stabilized LoRA (RSLoRA) to produce concise, factual, and coherent summaries of conversations.

Key Features

  • Advanced Base: Built on Qwen3-4B, utilizing its "Thinking" capabilities for better context understanding.
  • LoRA Adaptation: Trained with rank-64 (r=64) LoRA adapters on all linear projection layers (q, k, v, o, gate, up, down), so both the attention and MLP blocks are adapted.
  • Dialogue Specialized: Fine-tuned on the SAMSum dataset, making it an expert at summarizing real-world, messy human conversations.
  • Efficient: Loaded in 4-bit (QLoRA-style quantization) for low-VRAM inference, with the aim of staying close to 16-bit output quality.
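
To give a feel for what "rank 64 on all linear layers" means in adapter size, here is a rough sketch that counts LoRA parameters. Each adapted layer with shape (d_in, d_out) gets two low-rank matrices, contributing r * (d_in + d_out) trainable parameters. The layer shapes and layer count below are assumptions about Qwen3-4B and are not taken from this model card; check them against the base model's config.json before relying on the number.

```python
# Assumed (in_features, out_features) of each adapted projection in one
# decoder layer. These shapes are assumptions, not values from this card.
ASSUMED_QWEN3_4B_SHAPES = {
    "q_proj": (2560, 4096),
    "k_proj": (2560, 1024),
    "v_proj": (2560, 1024),
    "o_proj": (4096, 2560),
    "gate_proj": (2560, 9728),
    "up_proj": (2560, 9728),
    "down_proj": (9728, 2560),
}
ASSUMED_NUM_LAYERS = 36  # assumption, verify against config.json


def lora_param_count(shapes, rank, num_layers):
    """Count LoRA parameters: A is (rank x d_in), B is (d_out x rank)."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * num_layers


total = lora_param_count(ASSUMED_QWEN3_4B_SHAPES, rank=64, num_layers=ASSUMED_NUM_LAYERS)
print(f"{total:,} trainable adapter parameters (under the assumed shapes)")
```

Under these assumptions the adapter is a small fraction of the 4B base weights, which is why it can be trained on top of a 4-bit quantized base.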

Technical Specifications

Feature             Specification
Base Model          Qwen/Qwen3-4B
Architecture        Causal Decoder-Only Transformer
Quantization        4-bit (BitsAndBytes)
Adapter Rank        64 (High Capacity)
Alpha               32 (Rank Stabilized)
Target Modules      All Linear Layers
Training Framework  Unsloth + TRL + PEFT
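
The rank and alpha values above interact through the adapter scaling factor. As a small worked example: standard LoRA scales the adapter update by alpha / r, while rank-stabilized LoRA scales it by alpha / sqrt(r). The helper name `lora_scale` is hypothetical and only illustrates the arithmetic; it is not part of any library.

```python
import math


def lora_scale(alpha, r, rank_stabilized=False):
    """Return the multiplier applied to the low-rank update B @ A."""
    if rank_stabilized:
        return alpha / math.sqrt(r)  # rsLoRA scaling
    return alpha / r  # standard LoRA scaling


standard = lora_scale(alpha=32, r=64)                    # 32 / 64
stabilized = lora_scale(alpha=32, r=64, rank_stabilized=True)  # 32 / 8
print(f"standard LoRA scale: {standard}, rsLoRA scale: {stabilized}")
```

With r=64 and alpha=32 the standard factor is 0.5 and the rank-stabilized factor is 4.0, so the rank-stabilized setting keeps the adapter's contribution from shrinking as the rank grows.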

How to Use

1. Fast Inference (Recommended with Unsloth)

For faster inference, load the model with the unsloth library (Unsloth advertises roughly 2x faster generation).

from unsloth import FastLanguageModel
import torch

repo_id = "riturajpandey739/Qwen3-4B-Text-Summarizer-finetuning"

# Load model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = repo_id,
    max_seq_length = 2048,
    dtype = None,
    load_in_4bit = True,
)
FastLanguageModel.for_inference(model)

# Define Prompt Template
prompt_template = """Below is a conversation between people. Write a concise summary of the conversation.

### Dialogue:
{}

### Summary:
"""

# Inference
dialogue = """
Rituraj: The model is working perfectly on Colab.
Scientist: That is great news. Did you push it to the hub?
Rituraj: Yes, I just updated the README as well.
"""

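# Tokenize the formatted prompt and move the tensors to the GPU
inputs = tokenizer([prompt_template.format(dialogue)], return_tensors = "pt").to("cuda")

# temperature only takes effect when sampling is enabled, so do_sample is set explicitly
outputs = model.generate(**inputs, max_new_tokens = 128, do_sample = True, temperature = 0.1)
# The decoded text contains the prompt followed by the generated summary
print(tokenizer.decode(outputs[0], skip_special_tokens=True))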
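
Because decoding the full output sequence echoes the prompt, it can help to wrap prompt construction and summary extraction in two small helpers. The sketch below is pure string handling and needs no GPU. The names `build_prompt` and `extract_summary` are hypothetical, and stripping whitespace around the dialogue is an assumption about how the training prompts were formatted.

```python
PROMPT_TEMPLATE = """Below is a conversation between people. Write a concise summary of the conversation.

### Dialogue:
{}

### Summary:
"""
SUMMARY_MARKER = "### Summary:"


def build_prompt(dialogue: str) -> str:
    """Insert a dialogue into the template, trimming stray blank lines."""
    return PROMPT_TEMPLATE.format(dialogue.strip())


def extract_summary(decoded: str) -> str:
    """Return the text after the last summary marker.

    If the marker is absent (for example, only new tokens were decoded),
    the whole string is returned with surrounding whitespace removed.
    """
    return decoded.rpartition(SUMMARY_MARKER)[2].strip()


# Simulated model output: the echoed prompt followed by a generated summary.
decoded_text = build_prompt("A: Lunch at noon?\nB: Sure.") + "A and B will have lunch at noon."
print(extract_summary(decoded_text))
```

In the inference snippet above, `extract_summary` would be applied to the string returned by `tokenizer.decode`.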