PAIR Reflection Scorer (Cross‑Encoder)

This repository provides weights for a PAIR‑style cross‑encoder that scores the quality of counselor reflections in Motivational Interviewing (MI). Given a client/patient prompt and a counselor response, the model outputs a scalar score in [0,1] indicating how strongly the response reflects the prompt.

This model is based on the approach described in:

  • Min, Do June; Pérez‑Rosas, Verónica; Resnicow, Kenneth; Mihalcea, Rada. “PAIR: Prompt‑Aware margIn Ranking for Counselor Reflection Scoring in Motivational Interviewing.” EMNLP 2022. https://aclanthology.org/2022.emnlp-main.11/

Please credit the authors above when using this model or derivative works.

Task & Motivation (from the paper)

  • Reflections are a core verbal counseling skill used to convey understanding and acknowledgment of clients’ experiences.
  • The goal is to automatically score counselor reflections to provide timely, useful feedback for training and education.
  • Input to the scorer: a dialog turn consisting of a client prompt (likely to elicit a reflection) and the counselor’s response.
  • Output: a numeric reflection score capturing the quality/strength of the reflection.

Method: Prompt‑Aware Margin Ranking (PAIR)

PAIR trains a prompt‑aware cross‑encoder that contrasts positive and negative (prompt, response) pairs. The key idea is to learn, for a given prompt, to rank higher‑quality reflections above lower‑quality or mismatched responses using margin‑based ranking losses.

High‑level components reflected by this implementation:

  • Encoder: roberta-base cross‑encoder over concatenated (prompt, response).
  • Scoring head: a small MLP over the [CLS] token (768 → 512 → 1) with ELU (see the sketch after this list).
  • Training objective (as per the paper/code): multi‑gap margin ranking that separates:
    • High‑quality (HQ) reflections from medium‑quality (MQ) and low‑quality (LQ).
    • HQ/MQ reflections from explicit mismatches (responses paired with the wrong prompt).
  • Inference: apply sigmoid to the logit to obtain a reflection score in [0,1].
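
For orientation, here is a minimal PyTorch sketch of such a scoring head. The class name ReflectionHead and its exact wiring are illustrative assumptions; the actual module is CrossScorerCrossEncoder in cross_scorer_model.py.

import torch.nn as nn

class ReflectionHead(nn.Module):
    # Illustrative head: MLP over the first ([CLS]/<s>) token, 768 -> 512 -> 1, with ELU.
    def __init__(self, hidden_size=768):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(hidden_size, 512), nn.ELU(), nn.Linear(512, 1))

    def forward(self, encoder_outputs):
        cls = encoder_outputs.last_hidden_state[:, 0]  # first-token embedding
        return self.mlp(cls).squeeze(-1)               # one logit per (prompt, response) pair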

The included cross_scorer_model.py shows the MLP head and margin losses consistent with a PAIR‑style training setup.
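
The objective can be sketched as a set of pairwise hinge constraints. The margin values and the exact pair construction below are placeholders, not the paper's settings; consult cross_scorer_model.py for the actual losses.

import torch.nn.functional as F

def multi_gap_margin_loss(hq, mq, lq, mismatch, m_quality=0.1, m_mismatch=0.3):
    # hq/mq/lq/mismatch: score logits for high-, medium-, low-quality,
    # and wrong-prompt (mismatched) responses for the same prompts.
    loss = F.relu(m_quality - (hq - mq)).mean()          # rank HQ above MQ
    loss += F.relu(m_quality - (mq - lq)).mean()         # rank MQ above LQ
    loss += F.relu(m_mismatch - (hq - mismatch)).mean()  # rank HQ above mismatches
    loss += F.relu(m_mismatch - (mq - mismatch)).mean()  # rank MQ above mismatches
    return loss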

Files

  • reflection_scorer_weight.pt — fine‑tuned cross‑encoder weights (encoder + head).
  • cross_scorer_model.py — CrossScorerCrossEncoder module used for inference/training.
  • min_pair_2022.txt — plain-text summary of the PAIR paper (for reference within this repo).

Intended Use & Limitations

  • Intended for research, education, and tooling around reflection scoring in counseling‑style conversations.
  • Not a clinical or diagnostic tool; do not use for high‑stakes decisions.
  • Scores are not calibrated probabilities; interpret them as relative rankings, and treat even relative differences with caution.
  • As with all ML models, outputs may reflect biases in pretraining/fine‑tuning data.

Quickstart

from huggingface_hub import hf_hub_download
from transformers import AutoModel, AutoTokenizer
import torch, importlib.util, sys

repo_id = "Khriis/PAIR"  # replace if you fork

# 1) Download weights and model code from the repo
ckpt_path = hf_hub_download(repo_id=repo_id, filename="reflection_scorer_weight.pt")
code_path = hf_hub_download(repo_id=repo_id, filename="cross_scorer_model.py")

# 2) Import model definition
spec = importlib.util.spec_from_file_location("cross_scorer_model", code_path)
mod = importlib.util.module_from_spec(spec)
sys.modules["cross_scorer_model"] = mod
spec.loader.exec_module(mod)

# 3) Build encoder + head and load state dict
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
encoder = AutoModel.from_pretrained("roberta-base", add_pooling_layer=False)
model = mod.CrossScorerCrossEncoder(encoder).to(device)
tokenizer = AutoTokenizer.from_pretrained("roberta-base")

state = torch.load(ckpt_path, map_location=device)
sd = state.get("model_state_dict", state)
model.load_state_dict(sd)
model.eval()

# 4) Score a (prompt, response) pair
prompt = "I’ve been overwhelmed at work and can’t focus."
response = "It sounds like you’re under a lot of pressure, and it’s affecting your ability to concentrate."
batch = tokenizer(prompt, response, padding="longest", truncation=True, return_tensors="pt").to(device)
with torch.no_grad():
    score = model.score_forward(**batch).sigmoid().item()
print("Reflection score:", round(score, 3))

Using in the Toolkit

Since the repo is public, the toolkit can download the weights automatically. For offline use, place reflection_scorer_weight.pt locally and set REFLECTION_CKPT_PATH to its path.
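
How the toolkit consumes REFLECTION_CKPT_PATH is toolkit-specific; the equivalent manual load looks roughly like this (a sketch, assuming the model and device objects from the Quickstart):

import os
import torch

ckpt_path = os.environ.get("REFLECTION_CKPT_PATH", "reflection_scorer_weight.pt")
state = torch.load(ckpt_path, map_location=device)
model.load_state_dict(state.get("model_state_dict", state))  # handle raw or wrapped state dicts
model.eval()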

Citation

If you use this model or code, please cite the PAIR paper:

Informal citation: “PAIR: Prompt‑Aware margIn Ranking for Counselor Reflection Scoring in Motivational Interviewing” (Min et al., EMNLP 2022). https://aclanthology.org/2022.emnlp-main.11/

BibTeX (adapt based on official entry):

@inproceedings{min-etal-2022-pair,
  title     = {PAIR: Prompt-Aware margIn Ranking for Counselor Reflection Scoring in Motivational Interviewing},
  author    = {Min, Do June and P{\'e}rez-Rosas, Ver{\'o}nica and Resnicow, Kenneth and Mihalcea, Rada},
  booktitle = {Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing},
  year      = {2022},
  url       = {https://aclanthology.org/2022.emnlp-main.11/}
}

Also cite RoBERTa:

@misc{liu2019roberta,
  title         = {{RoBERTa}: A Robustly Optimized {BERT} Pretraining Approach},
  author        = {Liu, Yinhan and others},
  year          = {2019},
  url           = {https://arxiv.org/abs/1907.11692}
}