SDXL ControlNet - Brightness Control

This is a ControlNet model for Stable Diffusion XL that conditions image generation on brightness/grayscale information. It enables precise control over lighting, tonal values, and overall brightness structure in generated images.

Model Description

This ControlNet allows you to condition SDXL image generation using grayscale/brightness maps. By providing a grayscale image as input, you can control the brightness distribution and lighting structure of the generated image while maintaining creative freedom through text prompts.

Key features:

  • 🎨 Control brightness and lighting in generated images
  • 🖼️ Works with SDXL at 512x512 resolution
  • 🔄 Compatible with standard SDXL pipelines
  • 💡 Trained on high-quality aesthetic images

Intended uses:

  • Image recoloring and colorization
  • Lighting control in text-to-image generation
  • Artistic image manipulation
  • Photo enhancement and stylization
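
As a concrete example of the conditioning input this model expects, here is one way to derive a brightness map from an ordinary RGB photo (a minimal sketch; file names are placeholders):

from PIL import Image

# Collapse an RGB photo to its luminance channel, then back to 3 channels
# at the model's 512x512 training resolution.
src = Image.open("photo.png").convert("L")
control_image = src.convert("RGB").resize((512, 512))
control_image.save("grayscale_image.png")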

Training Details

Training Data

The model was trained on 100,000 samples from the latentcat/grayscale_image_aesthetic_3M dataset, which consists of:

  • High-quality aesthetic images
  • Paired with their grayscale/brightness versions
  • Images resized to 512x512 resolution
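
For reference, a minimal sketch of streaming a 100k subset from the Hub, assuming the dataset exposes the same image/conditioning_image/text columns the training command below expects (the actual run used a preprocessed local copy):

from datasets import load_dataset

# Assumption: column names mirror those passed to train_controlnet_sdxl.py.
ds = load_dataset("latentcat/grayscale_image_aesthetic_3M", split="train", streaming=True)
subset = ds.take(100_000)  # the 100k samples used for this run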

Training Configuration

  • Base Model: stabilityai/stable-diffusion-xl-base-1.0
  • ControlNet Architecture: SDXL-based (weights initialized from the base model's UNet)
  • Training Resolution: 512x512
  • Training Steps: 782 (1 epoch)
  • Batch Size: 32 per device
  • Gradient Accumulation Steps: 4 (effective batch size: 128)
  • Learning Rate: 1e-5
  • Mixed Precision: FP16
  • Hardware: NVIDIA H100 80GB
  • Training Time: ~49 minutes
  • Optimizer: 8-bit Adam
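
The step count follows directly from the effective batch size (a worked check of the numbers above):

samples = 100_000
per_device_batch = 32
grad_accum = 4

effective_batch = per_device_batch * grad_accum   # 32 * 4 = 128
steps_per_epoch = -(-samples // effective_batch)  # ceil(100000 / 128) = 782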

Training Script

Training was performed using the diffusers ControlNet training script with the following key parameters:

accelerate launch --mixed_precision="fp16" train_controlnet_sdxl.py \
  --pretrained_model_name_or_path="stabilityai/stable-diffusion-xl-base-1.0" \
  --dataset_name="<local_dataset_path>" \
  --conditioning_image_column="conditioning_image" \
  --image_column="image" \
  --caption_column="text" \
  --mixed_precision="fp16" \
  --resolution=512 \
  --learning_rate=1e-5 \
  --train_batch_size=32 \
  --gradient_accumulation_steps=4 \
  --num_train_epochs=1 \
  --checkpointing_steps=390 \
  --enable_xformers_memory_efficient_attention \
  --use_8bit_adam

Usage

Installation

pip install diffusers transformers accelerate torch xformers

Basic Usage

from diffusers import StableDiffusionXLControlNetPipeline, ControlNetModel
import torch
from PIL import Image

# Load ControlNet
controlnet = ControlNetModel.from_pretrained(
    "Oysiyl/controlnet-brightness-sdxl-100k",
    torch_dtype=torch.float16
)

# Load SDXL pipeline
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    torch_dtype=torch.float16
)
pipe.enable_xformers_memory_efficient_attention()
pipe.to("cuda")

# Load your grayscale/brightness control image as a 3-channel image
# at the 512x512 training resolution
control_image = Image.open("path/to/grayscale_image.png").convert("RGB").resize((512, 512))

# Generate image
prompt = "a beautiful landscape, highly detailed, vibrant colors"
negative_prompt = "blurry, low quality, distorted"

image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    image=control_image,
    num_inference_steps=30,
    controlnet_conditioning_scale=0.8,  # Adjust between 0.0-2.0
    guidance_scale=7.5,
).images[0]

image.save("output.png")

ControlNet Conditioning Scale

The controlnet_conditioning_scale parameter controls how strongly the brightness map influences the generation:

  • 0.3-0.5: Weak control, more creative freedom
  • 0.6-0.8: Balanced control (recommended)
  • 0.9-1.2: Strong control, closely follows brightness structure
  • 1.3-2.0: Very strong control, minimal deviation
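
To find a good value for your image empirically, you can sweep the ranges above with a fixed seed using the pipeline loaded earlier (a quick sketch; the values are illustrative):

import torch

for scale in (0.4, 0.8, 1.2):
    # Re-seed each run so only the conditioning scale changes.
    generator = torch.Generator("cuda").manual_seed(42)
    image = pipe(
        prompt=prompt,
        negative_prompt=negative_prompt,
        image=control_image,
        num_inference_steps=30,
        controlnet_conditioning_scale=scale,
        guidance_scale=7.5,
        generator=generator,
    ).images[0]
    image.save(f"output_scale_{scale}.png")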

Limitations and Bias

Current Limitations

โš ๏ธ Important: This model (trained for 1 epoch) has limited capability in preserving sharp geometric patterns:

  • โŒ Not recommended for QR codes - The model does not preserve QR code structure reliably
  • โŒ Limited geometric pattern preservation - Sharp edges and precise patterns may not be maintained
  • โš ๏ธ Natural images only - Best suited for natural scenes, landscapes, portraits, etc.

For geometric patterns and QR codes, consider using:

  • Canny ControlNet (edge detection)
  • Tile ControlNet (pattern coherence)
  • Purpose-built QR ControlNet models
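
For example, the diffusers-maintained SDXL Canny ControlNet drops into the same pipeline (a sketch; producing the edge-map conditioning image with a Canny detector is up to you):

from diffusers import ControlNetModel
import torch

# Swap this in for `controlnet` in the pipeline above; the conditioning
# image should then be a Canny edge map rather than a brightness map.
canny_controlnet = ControlNetModel.from_pretrained(
    "diffusers/controlnet-canny-sdxl-1.0",
    torch_dtype=torch.float16,
)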

Known Issues

  1. Training Duration: This model was trained for only 1 epoch. The original SD 1.5 brightness ControlNet was trained for 2 epochs, which may explain the difference in geometric pattern preservation.

  2. Dataset Focus: Trained exclusively on natural aesthetic images, which may limit performance on synthetic/geometric content.

Bias

The model inherits biases from:

  • The SDXL base model
  • The grayscale_image_aesthetic_3M dataset (aesthetic-focused images)

Available Checkpoints

This repository includes multiple checkpoints from the training run:

  • Root directory: Final model (step 782, end of epoch 1)
  • checkpoint-390: Mid-training checkpoint (50% completion)
  • checkpoint-780: Near-final checkpoint (99% completion)
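
A sketch of loading an intermediate checkpoint, assuming the checkpoint directories follow the standard diffusers training-script layout (a controlnet subfolder inside each checkpoint directory; adjust the subfolder if the layout differs):

from diffusers import ControlNetModel
import torch

# Assumption: the repo stores checkpoint-390/controlnet/ as written by
# train_controlnet_sdxl.py.
controlnet_mid = ControlNetModel.from_pretrained(
    "Oysiyl/controlnet-brightness-sdxl-100k",
    subfolder="checkpoint-390/controlnet",
    torch_dtype=torch.float16,
)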

Planned Improvements

🔄 Version 2 (Planned): Training for 2 epochs to match the original SD 1.5 brightness ControlNet methodology, which should improve:

  • Geometric pattern preservation
  • Sharp edge retention
  • Overall structural coherence

Citation

If you use this model, please cite:

@misc{controlnet-brightness-sdxl-100k,
  author = {Oysiyl},
  title = {SDXL ControlNet - Brightness Control},
  year = {2025},
  publisher = {HuggingFace},
  journal = {HuggingFace Model Hub},
  howpublished = {\url{https://huggingface.co/Oysiyl/controlnet-brightness-sdxl-100k}}
}

License

This model is released under the Apache 2.0 License. See the LICENSE file for details.

The base SDXL model has its own license terms which you should review at stabilityai/stable-diffusion-xl-base-1.0.
