Granite RAG Library

The Granite RAG Library includes six LoRA adapters for ibm-granite/granite-4.0-micro, ibm-granite/granite-4.1-3b, ibm-granite/granite-4.1-8b, and ibm-granite/granite-4.1-30b. Each adapter expects as input a (single-turn or multi-turn) conversation between a user and an AI assistant, and most also expect a set of grounding passages. For four of the adapters, we also release aLoRA versions. Each adapter has been developed for a specific task that is likely to be useful in agentic RAG pipelines. We give only a brief overview of each adapter's functionality here; the details can be found in each adapter's README.

Capabilities implemented as LoRA adapters

The adapters made available in this HF repository are:

Query Rewrite (QR): Given a conversation ending with a user query, QR will decontextualize that last user query by rewriting it (whenever necessary) into an equivalent version that is standalone and can be understood by itself. While this adapter is general purpose for any multi-turn conversation, it is especially effective in RAG settings where its ability to rewrite a user query into a standalone version directly improves the retriever performance, which in turn improves the answer generation performance. This is a pre-retrieval adapter since its suggested use is before invoking retrieval. Further details can be found in the Query Rewrite README.

Query Clarification (QC): Given a conversation ending with a user query (and, optionally, relevant content such as RAG documents), QC detects whether the last user query is underspecified (no clear interpretation or multiple valid interpretations) and, if so, formulates an appropriate clarification request back to the user. The adapter is designed for conversational use cases where user queries may be ill-formed, unclear, or have multiple valid interpretations based on the underlying system or content. This adapter is either pre-retrieval or pre-generation, since it can be used before or after invoking retrieval. Further details can be found in the Query Clarification README.

Context Relevance (CR): Given a conversation ending with a user query, and an individual passage, CR classifies whether the passage is relevant, partially relevant, or irrelevant for answering the last user query, or whether the passage may instead mislead or harm the downstream generator model’s response quality. This is a pre-generation adapter. Note that the Context Relevance adapter is only released for ibm-granite/granite-4.0-micro. Further details can be found in the Context Relevance README.

Answerability Determination (AD): Given a conversation ending with a user query, and a set of passages, AD classifies whether that final user query is answerable or unanswerable based on the available information in the passages. It is valuable for restraining over-eager models by identifying unanswerable queries and preventing the generation of hallucinated responses. It can also be used to indicate that the system should re-query the retriever with alternate formulations, to fetch more relevant passages. This is a pre-generation adapter. Further details can be found in the Answerability Determination README.

Hallucination Detection (HD): Given a conversation ending with an assistant response, and a set of passages, HD outputs a hallucination risk range for each sentence in the last assistant response, with respect to the set of passages. This output could be used in concert with sampling techniques that yield multiple generated responses, some of which could then be filtered out according to their HD scores. This is a post-generation adapter since its expected use is after invoking the LLM to create the response. Further details can be found in the Hallucination Detection README.

Citation Generation (CG): Given a conversation ending with an assistant response, and a set of passages, CG generates citations for that last assistant response from the provided passages. Citations are generated for each sentence in the response (when available), where each citation consists of a set of sentences from the supporting passages. This is a post-generation adapter since its expected use is after invoking the LLM; it can therefore be used to create citations for responses generated by any model. Further details can be found in the Citation Generation README.

Recommended Use

The recommended way to call all adapters is through the Mellea framework. For code snippets demonstrating how to use them, please refer to the Mellea examples.

Model Signing

All adapter artifacts in this repository are signed to ensure integrity and provenance. Each adapter includes a model.sig signature file in its lora/ or alora/ directory that covers all artifacts in that directory (adapter_config.json, adapter_model.safetensors, io.yaml).
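Before running a signature check, it can help to confirm that a downloaded adapter directory is complete. The following is a minimal sketch, not part of the released tooling: the file names are the ones listed above, and missing_artifacts is a hypothetical helper name introduced here for illustration.

```shell
# Sketch: report which expected files are absent from an adapter directory.
# missing_artifacts is a hypothetical helper, not a released tool.
# Prints one line per missing file and returns non-zero if any is missing.
missing_artifacts() {
  rc=0
  for f in adapter_config.json adapter_model.safetensors io.yaml model.sig; do
    if [ ! -f "$1/$f" ]; then
      echo "missing: $1/$f"
      rc=1
    fi
  done
  return "$rc"
}
```

For example, `missing_artifacts answerability/granite-4.1-3b/lora` prints nothing and returns 0 when all four files are present. This only checks that the files exist; it does not replace signature verification.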

Adapter	Signature File Path (example)	Signing Identity
Answerability	`answerability/granite-4.1-3b/lora/model.sig`	`Granite-sign@ibm.com`
Citations	`citations/granite-4.1-3b/lora/model.sig`	`Granite-sign@ibm.com`
Context Relevance	`context_relevance/granite-4.0-micro/lora/model.sig`	`Granite-sign@ibm.com`
Hallucination Detection	`hallucination_detection/granite-4.1-3b/lora/model.sig`	`Granite-sign@ibm.com`
Query Clarification	`query_clarification/granite-4.1-3b/lora/model.sig`	`Granite-sign@ibm.com`
Query Rewrite	`query_rewrite/granite-4.1-3b/lora/model.sig`	`Granite-sign@ibm.com`

The same pattern applies to every model variant for which an adapter is released (granite-4.0-micro, granite-4.1-3b, granite-4.1-8b, granite-4.1-30b). Adapters that are also released as aLoRA include both lora/ and alora/ directories, each with its own model.sig signature file.
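The path pattern described above can be written down as a small helper. This is a sketch under stated assumptions: sig_path is a hypothetical function name, and the function only assembles the string; it does not check that the adapter, variant, or aLoRA version actually exists in the repository.

```shell
# Sketch: build a signature path of the form
#   <adapter>/<model-variant>/<lora|alora>/model.sig
# sig_path is a hypothetical helper; it does not check that the path exists.
sig_path() {
  adapter="$1"
  variant="$2"
  kind="${3:-lora}"   # defaults to the LoRA directory
  case "$kind" in
    lora|alora)
      printf '%s/%s/%s/model.sig\n' "$adapter" "$variant" "$kind"
      ;;
    *)
      echo "unknown adapter kind: $kind" >&2
      return 1
      ;;
  esac
}
```

For example, `sig_path answerability granite-4.1-3b` prints `answerability/granite-4.1-3b/lora/model.sig`, matching the first row of the table.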

Verifying Model Signatures

To verify the integrity of a downloaded adapter, use the model-signing tool:

# Install the model signing verification tool
pip install model-signing

# Verify all artifacts in an adapter's lora/ directory
model_signing verify sigstore \
  --signature <adapter>/<model-variant>/lora/model.sig \
  --identity Granite-sign@ibm.com \
  --identity_provider https://sigstore.verify.ibm.com/oauth2 \
  <adapter>/<model-variant>/lora/

For example, to verify the answerability adapter for granite-4.1-3b:

model_signing verify sigstore \
  --signature answerability/granite-4.1-3b/lora/model.sig \
  --identity Granite-sign@ibm.com \
  --identity_provider https://sigstore.verify.ibm.com/oauth2 \
  answerability/granite-4.1-3b/lora/

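Each model.sig file contains a signature over all adapter artifacts in the corresponding lora/ or alora/ directory, signed with the identity Granite-sign@ibm.com. This allows users to confirm that the adapter has not been tampered with after release. To verify an aLoRA variant, substitute alora/ for lora/ in both the --signature path and the directory argument.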
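To check a whole local checkout rather than one directory, the verification command can be wrapped in a loop over every directory that contains a model.sig file. The following is a minimal sketch, not part of the released tooling: list_signed_dirs, verify_all, and the VERIFY_BIN override are hypothetical names introduced here, and the sketch assumes that model_signing is on PATH and that directory paths contain no newlines.

```shell
# Sketch: verify every signed adapter directory under a local checkout.
# list_signed_dirs, verify_all, and VERIFY_BIN are hypothetical names.

# Print each directory under $1 that contains a model.sig file.
list_signed_dirs() {
  find "$1" -type f -name model.sig -exec dirname {} \; | sort
}

# Run the verification command on each signed directory.
# Returns non-zero if any directory fails verification.
verify_all() {
  list_signed_dirs "$1" | {
    failed=0
    while IFS= read -r dir; do
      # stdin is redirected so the verifier cannot consume the directory list
      if "${VERIFY_BIN:-model_signing}" verify sigstore \
          --signature "$dir/model.sig" \
          --identity Granite-sign@ibm.com \
          --identity_provider https://sigstore.verify.ibm.com/oauth2 \
          "$dir/" </dev/null; then
        echo "OK   $dir"
      else
        echo "FAIL $dir"
        failed=1
      fi
    done
    [ "$failed" -eq 0 ]
  }
}
```

Calling `verify_all .` from the root of a local copy of the repository prints one OK or FAIL line per lora/ or alora/ directory. The loop keeps going after a failure so that a single run reports every directory that does not verify.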