# Model Card for gulupgulup/distilbert_nli
This is a Natural Language Inference (NLI) model built by fine-tuning DistilBERT-base-uncased on the GPT-3 NLI dataset. The model performs textual entailment classification - given two pieces of text (a premise and a hypothesis), it determines the logical relationship between them.
## Model Details

### Model Description
**What it does:**
- Takes two text inputs: a premise (`text_a`) and a hypothesis (`text_b`)
- Classifies their relationship into one of three categories (illustrated in the sketch below):
  - **Entailment**: The hypothesis logically follows from the premise
  - **Neutral**: The hypothesis is neither supported nor contradicted by the premise
  - **Contradiction**: The hypothesis contradicts the premise
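To make the three categories concrete, here are a few invented premise-hypothesis pairs (illustrative examples only, not drawn from the training data):

```python
# Invented premise-hypothesis pairs illustrating each label
# (not taken from the GPT-3 NLI dataset).
examples = [
    ("A man is playing a guitar on stage.", "A person is making music.", "entailment"),
    ("A man is playing a guitar on stage.", "The concert is sold out.", "neutral"),
    ("A man is playing a guitar on stage.", "Nobody is playing an instrument.", "contradiction"),
]
```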
**Use Cases:**
- Reading comprehension tasks
- Logical reasoning applications
- Question-answering systems
- Text coherence analysis
- Information verification tasks
**Architecture:** DistilBERT-based sequence classification model with 3 output classes, optimized for efficiency while maintaining strong performance on natural language understanding tasks.
This type of model is fundamental for applications requiring understanding of logical relationships between text passages, such as fact-checking, automated reasoning, and reading comprehension systems.
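As a quick sanity check, the classification head can be inspected from the checkpoint's configuration. This is a minimal sketch; the exact `id2label` mapping stored in the config may differ from the illustrative one shown in the comments.

```python
from transformers import AutoConfig

# Inspect the classification head of the fine-tuned checkpoint.
config = AutoConfig.from_pretrained("gulupgulup/distilbert_nli")

print(config.model_type)   # "distilbert"
print(config.num_labels)   # 3 (entailment / neutral / contradiction)
print(config.id2label)     # mapping stored in the checkpoint (may be generic LABEL_0..LABEL_2)
```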
## How to Get Started with the Model
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the model and tokenizer
model_name = "gulupgulup/distilbert_nli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
```
### Usage Example
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the model and tokenizer
model_name = "gulupgulup/distilbert_nli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Example premise and hypothesis
premise = "A person is riding a bicycle in the park."
hypothesis = "Someone is exercising outdoors."

# Tokenize the input pair jointly
inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True, padding=True)

# Make prediction
with torch.no_grad():
    outputs = model(**inputs)

predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
predicted_class = torch.argmax(predictions, dim=-1)

# Get the predicted label
id2label = {0: "entailment", 1: "neutral", 2: "contradiction"}
predicted_label = id2label[predicted_class.item()]

print(f"Premise: {premise}")
print(f"Hypothesis: {hypothesis}")
print(f"Predicted relationship: {predicted_label}")
print(f"Confidence scores: {predictions.squeeze().tolist()}")
```
## Training Details

### Training Data
**Dataset:** [pietrolesci/gpt3_nli](https://huggingface.co/datasets/pietrolesci/gpt3_nli), a natural language inference dataset containing premise-hypothesis pairs with three-class labels (entailment, neutral, contradiction). The dataset consists of text pairs (`text_a` and `text_b`) where the model learns to determine the logical relationship between the premise and hypothesis.
### Training Procedure
**Base Model:** DistilBERT-base-uncased fine-tuned for sequence classification with 3 output labels for natural language inference.

**Training Framework:** Hugging Face Transformers Trainer with Weights & Biases (wandb) integration for experiment tracking.

**Data Split:** The original training set was split into train (81%), validation (9%), and test (10%) sets using stratified sampling to maintain label distribution balance across splits.
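A minimal sketch of one way to produce such a split with the `datasets` library; the label column name, its raw format, and the seed are assumptions, and `stratify_by_column` requires the labels to be a `ClassLabel` feature.

```python
from datasets import ClassLabel, load_dataset

dataset = load_dataset("pietrolesci/gpt3_nli", split="train")

# Cast the labels to ClassLabel so stratified splitting is possible.
labels = ClassLabel(names=["entailment", "neutral", "contradiction"])
dataset = dataset.cast_column("label", labels)

# 10% test, then 10% of the remainder (9% overall) for validation, leaving 81% for training.
split = dataset.train_test_split(test_size=0.10, stratify_by_column="label", seed=42)
train_val, test_ds = split["train"], split["test"]
split = train_val.train_test_split(test_size=0.10, stratify_by_column="label", seed=42)
train_ds, val_ds = split["train"], split["test"]
```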
#### Preprocessing
Text pairs are tokenized using DistilBERT's tokenizer with truncation and padding applied. The label column is cast to `ClassLabel` format with three categories: entailment, neutral, and contradiction.

**Data Handling:** Uses `DataCollatorWithPadding` for dynamic padding during training and tokenizes premise-hypothesis pairs jointly.
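Continuing from the split sketch above, the preprocessing step might look roughly like this (function and variable names are illustrative):

```python
from transformers import AutoTokenizer, DataCollatorWithPadding

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize_pairs(batch):
    # Tokenize premise (text_a) and hypothesis (text_b) jointly; padding is
    # deferred to the data collator so each batch is padded dynamically.
    return tokenizer(batch["text_a"], batch["text_b"], truncation=True)

tokenized_train = train_ds.map(tokenize_pairs, batched=True)
tokenized_val = val_ds.map(tokenize_pairs, batched=True)

data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
```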
#### Training Hyperparameters
- **Learning Rate:** 1e-5
- **Batch Size:** 64 (both training and evaluation)
- **Number of Epochs:** 5
- **Weight Decay:** 0.01
- **Max Gradient Norm:** 1.0
- **Optimizer:** AdamW (default)
- **Evaluation Strategy:** Every epoch
- **Save Strategy:** Every epoch
- **Logging Steps:** 100
- **Best Model Selection:** Based on validation accuracy (higher is better)
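A sketch of how these hyperparameters might map onto `TrainingArguments`; the output directory is an assumption, and older transformers versions use `evaluation_strategy` instead of `eval_strategy`.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="distilbert_nli",          # assumed output directory
    learning_rate=1e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    num_train_epochs=5,
    weight_decay=0.01,
    max_grad_norm=1.0,
    eval_strategy="epoch",                # evaluate at the end of every epoch
    save_strategy="epoch",
    logging_steps=100,
    load_best_model_at_end=True,
    metric_for_best_model="accuracy",     # best model selected by validation accuracy
    greater_is_better=True,
    report_to="wandb",                    # Weights & Biases experiment tracking
)
```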
## Evaluation

### Metrics
**Accuracy:** Primary evaluation metric measuring the percentage of correctly classified premise-hypothesis pairs across all three NLI categories.

**Precision (Macro-averaged):** Secondary metric calculating the average precision across all three classes (entailment, neutral, contradiction), giving equal weight to each class regardless of support. This metric is useful for understanding performance on each NLI relationship type, which is especially important when class distributions are potentially imbalanced.

Both metrics are computed using the `evaluate` library and rounded to 3 decimal places for reporting.
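A sketch of a `compute_metrics` function matching this description, using the `evaluate` library (key names and rounding details are assumptions):

```python
import numpy as np
import evaluate

accuracy_metric = evaluate.load("accuracy")
precision_metric = evaluate.load("precision")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    acc = accuracy_metric.compute(predictions=preds, references=labels)["accuracy"]
    prec = precision_metric.compute(predictions=preds, references=labels, average="macro")["precision"]
    # Rounded to 3 decimal places for reporting, as described above.
    return {"accuracy": round(acc, 3), "precision": round(prec, 3)}
```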