Dataset: eltorio/ROCOv2-radiology
How to use WafaaFraih/blip-roco-radiology-captioning with Transformers:
# Use a pipeline as a high-level helper
# Warning: Pipeline type "image-to-text" is no longer supported in transformers v5.
# You must load the model directly (see below) or downgrade to v4.x with:
# pip install "transformers<5.0.0"
from transformers import pipeline
pipe = pipeline("image-to-text", model="WafaaFraih/blip-roco-radiology-captioning")
# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText
processor = AutoProcessor.from_pretrained("WafaaFraih/blip-roco-radiology-captioning")
model = AutoModelForImageTextToText.from_pretrained("WafaaFraih/blip-roco-radiology-captioning")

This model is a fine-tuned version of BLIP, trained on the ROCOv2 radiology dataset to generate captions for medical radiology images.
from transformers import BlipForConditionalGeneration, AutoProcessor
from PIL import Image
# Load model and processor
processor = AutoProcessor.from_pretrained("WafaaFraih/blip-roco-radiology-captioning")
model = BlipForConditionalGeneration.from_pretrained("WafaaFraih/blip-roco-radiology-captioning")
# Process image
image = Image.open("radiology_image.jpg")
inputs = processor(images=image, return_tensors="pt")
# Generate caption
generated_ids = model.generate(
    pixel_values=inputs["pixel_values"],
    max_new_tokens=64,
    num_beams=5,
    length_penalty=0.8,
)
caption = processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip()
print(caption)
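
To caption several images at once, the single-image steps above can be wrapped in a small batching helper. The sketch below is illustrative only: `caption_batch` and its `batch_size` parameter are hypothetical names that are not part of this model card or of Transformers. It assumes the `processor` and `model` objects are loaded as shown above and that `images` is a list of already-opened PIL images. The generation defaults simply mirror the values used in the example above.

```python
from typing import Any, List, Sequence


def caption_batch(
    images: Sequence[Any],
    processor: Any,
    model: Any,
    batch_size: int = 8,  # hypothetical default; tune to available memory
    **generate_kwargs: Any,
) -> List[str]:
    """Caption images in chunks, returning one stripped caption per image."""
    # Reuse the generation settings from the single-image example
    # unless the caller overrides them.
    generate_kwargs.setdefault("max_new_tokens", 64)
    generate_kwargs.setdefault("num_beams", 5)
    generate_kwargs.setdefault("length_penalty", 0.8)

    captions: List[str] = []
    for start in range(0, len(images), batch_size):
        chunk = list(images[start:start + batch_size])
        # The processor accepts a list of images and stacks them into one batch.
        inputs = processor(images=chunk, return_tensors="pt")
        generated_ids = model.generate(
            pixel_values=inputs["pixel_values"],
            **generate_kwargs,
        )
        decoded = processor.batch_decode(generated_ids, skip_special_tokens=True)
        captions.extend(text.strip() for text in decoded)
    return captions
```

With `processor` and `model` loaded as above, `caption_batch([Image.open(p) for p in paths], processor, model)` would return the captions in input order, where `paths` is a placeholder list of image files.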