T5-Base News Summarizer (Multi-Style)

This model is a fine-tuned version of google/flan-t5-base, trained on samples from the CNN/DailyMail and XSum datasets.
It generates news summaries in three styles: Harsh (concise), Standard, and Detailed.


Model Description

  • Model Type: Sequence-to-Sequence Transformer (T5)
  • Language: English
  • Base Model: google/flan-t5-base
  • Training Data: ~9k mixed samples from CNN/DailyMail & XSum

Key Features

This model supports a style prompt prefix that determines the length and density of the summary:

  1. Harsh
    • Very concise
    • Headline-like
    • Trained mostly on XSum
  2. Standard
    • Balanced, general-purpose summarization
  3. Detailed
    • Longer, more contextual summaries
    • Trained mostly on CNN/DailyMail

Usage

from transformers import pipeline

summarizer = pipeline("summarization", model="Hiratax/t5-news-summarizer")

text = """
The James Webb Space Telescope (JWST) has captured a lush landscape of stellar birth. 
The new image shows the Cosmic Cliffs, which are the edge of a giant gaseous cavity within the star-forming region NGC 3324.
"""

# 1. Standard
print(summarizer("summarize standard: " + text))

# 2. Harsh (Headline)
print(summarizer("summarize harsh: " + text))

# 3. Detailed
print(summarizer("summarize detailed: " + text))

Recommended Inference Parameters

Style    | Min Length   | Max Length    | Length Penalty | Repetition Penalty | N-Gram Block
Harsh    | 10           | 35% of input  | 1.0            | 2.0                | 3
Standard | 60           | 150           | 2.0            | 1.5                | 3
Detailed | 50% of input | 150% of input | 1.5            | 1.2                | 4

Tip:
"Detailed" style benefits from no_repeat_ngram_size=4 to avoid repeated openings.


Training Procedure

Hyperparameters

  • Epochs: 8
  • Learning Rate: 1e-4
  • Batch Size: 4
  • Gradient Accumulation: 2
  • Weight Decay: 0.01
  • Optimizer: AdamW
  • Precision: FP16
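
As a rough reconstruction, these settings map onto transformers Seq2SeqTrainingArguments as shown below; output_dir and anything not listed in this card are placeholders, not values from the actual training script.

from transformers import Seq2SeqTrainingArguments

# Approximate mapping of the hyperparameters above; unspecified details
# (output_dir, logging, evaluation) are placeholders.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-news-summarizer",  # placeholder
    num_train_epochs=8,
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,
    weight_decay=0.01,
    optim="adamw_torch",              # AdamW
    fp16=True,
    predict_with_generate=True,
)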

Data Strategy

  • Harsh → XSum (abstractive, short)
  • Detailed → CNN/DailyMail (longer, higher detail)
  • Safety: Removed cases where summary > article length to reduce hallucinations
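
A minimal sketch of that safety filter, assuming the public CNN/DailyMail column names (article, highlights); the split slice is illustrative and this is not the exact preprocessing code.

from datasets import load_dataset

# Keep only pairs whose reference summary is shorter than the source article,
# which reduces the tendency to hallucinate content during fine-tuning.
cnn = load_dataset("cnn_dailymail", "3.0.0", split="train[:5000]")
cnn = cnn.filter(lambda ex: len(ex["highlights"]) < len(ex["article"]))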

Limitations

  • May occasionally output the typo "occupys" (training noise).
  • Max input length: 512 tokens (longer text is truncated).
  • Model performance decreases on extremely long or highly technical articles.
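
For inputs longer than 512 tokens, truncation can also be requested explicitly at call time; the snippet below is illustrative (it reuses summarizer and text from the Usage section and assumes the standard transformers pipeline truncation flag).

# Over-long input is cut at the 512-token encoder limit instead of raising an error
long_text = text * 20  # illustrative over-long article
print(summarizer("summarize standard: " + long_text, truncation=True))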

License

Apache 2.0
