Merge branch 'main' of https://huggingface.co/THUDM/visualglm-6b

Browse files
- MODEL_LICENSE +3 -3
- README.md +5 -4
MODEL_LICENSE CHANGED
@@ -1,10 +1,10 @@
-The
+The VisualGLM-6B License
 
 1. Definitions
 
-“Licensor” means the
+“Licensor” means the VisualGLM-6B Model Team that distributes its Software.
 
-“Software” means the
+“Software” means the VisualGLM-6B model parameters made available under this license.
 
 2. License Grant
README.md CHANGED
@@ -4,6 +4,7 @@ language:
 - en
 tags:
 - glm
+- visualglm
 - chatglm
 - thudm
 ---
@@ -17,7 +18,7 @@ tags:
 </p>
 
 ## Introduction
-
+VisualGLM-6B is an open-source multimodal dialogue language model supporting **images, Chinese, and English**. Its language model is based on [ChatGLM-6B](https://github.com/THUDM/ChatGLM-6B), with 6.2 billion parameters; the image side bridges the visual model and the language model by training [BLIP2-Qformer](https://arxiv.org/abs/2301.12597), for 7.8 billion parameters in total.
 
 VisualGLM-6B is pre-trained on 30M high-quality Chinese image-text pairs from the [CogView](https://arxiv.org/abs/2105.13290) dataset and 300M filtered English image-text pairs, with Chinese and English weighted equally. This training regime aligns the visual information well with ChatGLM's semantic space; in the subsequent fine-tuning stage, the model is trained on long visual question answering data to generate answers that match human preferences.
 
@@ -33,12 +34,12 @@ pip install SwissArmyTransformer>=0.3.6 torch>1.10.0 torchvision transformers>=4
 
 ```ipython
 >>> from transformers import AutoTokenizer, AutoModel
->>> tokenizer = AutoTokenizer.from_pretrained("THUDM/
->>> model = AutoModel.from_pretrained("THUDM/
+>>> tokenizer = AutoTokenizer.from_pretrained("THUDM/visualglm-6b", trust_remote_code=True)
+>>> model = AutoModel.from_pretrained("THUDM/visualglm-6b", trust_remote_code=True).half().cuda()
 >>> image_path = "your image path"
 >>> response, history = model.chat(tokenizer, image_path, "描述这张图片。", history=[])
 >>> print(response)
->>> response, history = model.chat(tokenizer, "这张图片可能是在什么场所拍摄的?", history=history)
+>>> response, history = model.chat(tokenizer, image_path, "这张图片可能是在什么场所拍摄的?", history=history)
 >>> print(response)
 ```
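The README fix above does three things: it completes the truncated `from_pretrained` calls with the `THUDM/visualglm-6b` model ID and `trust_remote_code=True`, moves the model to half precision on GPU, and restores the `image_path` argument in the follow-up `model.chat` call. For reference, here is a minimal end-to-end sketch assembled from the post-change snippet; the placeholder image path and the assumption of a CUDA-capable GPU (implied by `.half().cuda()`) are mine, not part of the diff.

```python
# Minimal sketch assembled from the updated README snippet.
# Assumptions: a CUDA-capable GPU (required by .half().cuda()) and a local
# image file at IMAGE_PATH; both are placeholders, not taken from the diff.
from transformers import AutoTokenizer, AutoModel

MODEL_ID = "THUDM/visualglm-6b"
IMAGE_PATH = "example.jpg"  # replace with a real image path

# trust_remote_code=True is required because the repo ships custom model code.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModel.from_pretrained(MODEL_ID, trust_remote_code=True).half().cuda()
model = model.eval()

# First turn: pass the image path and an empty history.
response, history = model.chat(tokenizer, IMAGE_PATH, "描述这张图片。", history=[])
print(response)

# Follow-up turn: per the fixed diff line, image_path is passed again,
# together with the accumulated history.
response, history = model.chat(
    tokenizer, IMAGE_PATH, "这张图片可能是在什么场所拍摄的?", history=history
)
print(response)
```

Note that `model.chat` threads `history` through successive turns, so follow-up questions about the same image keep their conversational context.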