LLaMA 3.2 1B Instruct - XSum Summarization (LoRA)

This model is a LoRA fine-tuned version of meta-llama/Llama-3.2-1B-Instruct for extreme summarization on the XSum dataset.

Model Details

  • Base Model: meta-llama/Llama-3.2-1B-Instruct
  • Method: LoRA (Low-Rank Adaptation)
  • Task: Instruction-based summarization
  • Dataset: XSum (extreme summarization)
  • Training Samples: 5,000
  • Validation Samples: 500
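
For reference, a minimal sketch of how train/validation subsets of this size can be drawn with the datasets library; the dataset ID and shuffle seed are assumptions, not the original preprocessing script.

from datasets import load_dataset

# Hypothetical subsampling: dataset ID and seed are assumptions, chosen for illustration.
dataset = load_dataset("EdinburghNLP/xsum")
train_subset = dataset["train"].shuffle(seed=42).select(range(5000))
val_subset = dataset["validation"].shuffle(seed=42).select(range(500))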

Training Configuration

LoRA Parameters

  • Rank (r): 16
  • Alpha: 32
  • Dropout: 0.05
  • Target Modules: q_proj, k_proj, v_proj...
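
These values map onto a peft LoraConfig roughly as sketched below. The target-module list is truncated in this card, so only the explicitly named projections appear, and task_type is an assumption for a causal LM.

from peft import LoraConfig

# Minimal sketch of the adapter configuration described above.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj"],  # remaining modules elided in this card
    task_type="CAUSAL_LM",
)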

Training Hyperparameters

  • Epochs: 3
  • Batch Size: 4
  • Gradient Accumulation: 4
  • Learning Rate: 0.0002
  • Optimizer: paged_adamw_8bit
  • Scheduler: cosine
  • Quantization: 4-bit (nf4)
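
In terms of transformers TrainingArguments, these settings correspond roughly to the sketch below; output_dir is a placeholder and anything not listed above is an assumption.

from transformers import TrainingArguments

# Illustrative mapping of the hyperparameters above; not the original training script.
training_args = TrainingArguments(
    output_dir="xsum-llama1b-instruct-lora",  # placeholder
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    optim="paged_adamw_8bit",
    lr_scheduler_type="cosine",
    fp16=True,
    gradient_checkpointing=True,
)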

Performance

Evaluated on 200 validation samples:

  • ROUGE-1: 0.1912
  • ROUGE-2: 0.0548
  • ROUGE-L: 0.1374
  • ROUGE-Lsum: 0.1415
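
Scores of this kind can be computed with the evaluate library's ROUGE metric; the snippet below is illustrative, with placeholder predictions and references rather than the actual evaluation script.

import evaluate

# Compute ROUGE over generated vs. reference summaries (placeholders shown).
rouge = evaluate.load("rouge")
predictions = ["Generated summary for article one."]
references = ["Reference summary for article one."]
scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # dict with rouge1, rouge2, rougeL, rougeLsum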

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load base model
model_name = "meta-llama/Llama-3.2-1B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto"
)

# Load LoRA adapters
model = PeftModel.from_pretrained(model, "Deepu1965/xsum-llama1b-instruct-lora")

# Prepare input
document = "Your news article here..."
prompt = (
    "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
    "You are a helpful assistant that summarizes news articles into one concise sentence.\n"
    "<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
    f"Summarize this article in one sentence:\n\n{document}\n"
    "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)

# Decode only the newly generated tokens; for causal LMs, generate() returns
# the prompt followed by the completion.
summary = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(summary)
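
Optionally, the adapters can be merged into the base weights so the model can be saved and served without PEFT at inference time; the output path below is a placeholder.

# Merge the LoRA weights into the base model and save a standalone checkpoint.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("xsum-llama1b-merged")  # placeholder path
tokenizer.save_pretrained("xsum-llama1b-merged")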

Training Details

  • Framework: HuggingFace Transformers + PEFT
  • Quantization: bitsandbytes 4-bit
  • Gradient Checkpointing: Enabled
  • Mixed Precision: FP16
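
Put together, the 4-bit loading and checkpointing setup above corresponds roughly to the following sketch; the compute dtype and the use of prepare_model_for_kbit_training are assumptions based on common QLoRA practice, not the original training script.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import prepare_model_for_kbit_training

# 4-bit NF4 quantization via bitsandbytes; compute dtype is an assumption.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)

# Prepare the quantized model for PEFT training (enables gradient checkpointing).
base_model = prepare_model_for_kbit_training(base_model, use_gradient_checkpointing=True)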

Limitations

  • Trained on English news articles only
  • Optimized for single-sentence summaries
  • May not generalize well to other domains
  • Requires LoRA adapters loaded on top of base model

Citation

@misc{llama32-xsum-lora,
  author = {Your Name},
  title = {LLaMA 3.2 1B XSum LoRA},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/Deepu1965/xsum-llama1b-instruct-lora}
}