Private Deployment of ChatGLM on Windows | A Complete Beginner's Guide to the Pitfalls

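Site admin

April 24, 2024, 09:28 · 73 views

This post is 41,319 characters long and takes about 5 minutes to read; the condensed summary alone takes about 1 minute.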

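What you will get:

A condensed, beginner-friendly method for deploying ChatGLM locally on Windows with a consumer-grade GPU
The causes of some errors you may hit during deployment, and how to fix them
Your own local quantized ChatGLM model, the kind that still runs with the network unplugged~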

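Before you dig in:

This post does not cover installing the NVIDIA CUDA Toolkit; please make good use of a search engine.
My setup is a GTX 1070 running chatglm-6b-int4.
If you have a higher-end setup, such as a 3090 or 4090 with 24 GB of VRAM, you can go straight to the full chatglm-6b, because inference with the quantized model is really slow! (poor me, crying)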

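Coming up in this series:

langchain_chatglm: how to attach a knowledge base locally, even offline (plus adding more input file formats)
Quantized inference is just too slow! How can a local model speed things up by calling cloud GPUs while keeping the data confidential?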

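Ramblings:

Another day of intermittent ambition, persistent emo, and coasting along! I listened to classmate after classmate present "Attention Is All You Need", thinking about how I still haven't caught up on probability theory and math fundamentals, how my thesis proposal is due to my advisor in two months, and how I have no idea what to write... let me be emo a little longer...

OK! I'm back! I had so much to do that I haven't posted in a long time, and I finally realized I can teach myself through task-driven learning!

What the task is stays secret for now; in short, I came up with a business scenario that needs offline inference, hence this pitfall guide to deploying ChatGLM on Windows~

My skills are limited, so experts, roast away~ My mental resilience and stress tolerance are great~

(Inner voice: do you believe that yourself, coming from someone who gets emo just listening to other people explain the transformer?)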

Summary: bug fixes and detailed steps come later in the post; here is the simplified full procedure

conda create -n your_chatglm_env python=3.9
conda activate your_chatglm_env

# If your CUDA version is >= 11.8 (mine is 12.1, for example)
pip uninstall torch torchvision torchaudio
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# Verify that CUDA and torch match
>>> import torch
>>> torch.cuda.is_available()
True
>>>

-----------------------------------------------------------
-----------------------------------------------------------

# Calling from code: load the model remotely
>>> from transformers import AutoTokenizer, AutoModel
>>> tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
>>> model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
>>> model = model.eval()
>>> response, history = model.chat(tokenizer, "你好", history=[])
>>> print(response)
你好👋!我是人工智能助手 ChatGLM-6B,很高兴见到你,欢迎问我任何问题。
>>> response, history = model.chat(tokenizer, "晚上睡不着应该怎么办", history=history)
>>> print(response)

-----------------------------------------------------------
-----------------------------------------------------------

# Calling from code: load the model from a local folder

fork repo  # you need to fork it yourself on GitHub
git clone your_chatglm_repo 

cd your_chatglm_repo 
pip install -r requirements.txt  # this installs torch 2.0.1 by default

pip uninstall torch torchvision torchaudio
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# Verify that CUDA and torch match
>>> import torch
>>> torch.cuda.is_available()
True
>>>

# Download the model weights

- Full model: [download everything here](https://huggingface.co/THUDM/chatglm-6b/tree/main), then drag the files into the repo you cloned

- chatglm-6b-int8: [download everything here](https://huggingface.co/THUDM/chatglm-6b-int8/tree/main), then drag the files into your cloned repo (chatglm-6b-int8)

- chatglm-6b-int4: [download everything here](https://huggingface.co/THUDM/chatglm-6b-int4/tree/main), then drag the files into your cloned repo (chatglm-6b-int4)


# Preparing to run the quantized model: install gcc with the "all packages" option and add the correct entry to PATH; note that the topmost entry is matched first

# The actual call

from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("D:\\workspace_valeria\\ChatGLM-6b_int4",trust_remote_code=True,revision="v1.1.0")
model = AutoModel.from_pretrained("D:\\workspace_valeria\\ChatGLM-6b_int4",trust_remote_code=True,revision="v1.1.0").half().cuda()
model = model.eval()
response, history = model.chat(tokenizer, "你好", history=[])
print(response)

Replace "D:\\workspace_valeria\\ChatGLM-6b_int4" above with your own local your_chatglm_repo_path.

Final output

(chatglm_env) PS D:\workspace_valeria\ChatGLM-6B_git> python
Python 3.9.16 (main, May 17 2023, 17:49:16) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
True
>>> from transformers import AutoTokenizer, AutoModel
>>> tokenizer = AutoTokenizer.from_pretrained("D:\\workspace_valeria\\ChatGLM-6b_int4",trust_remote_code=True,revision="v1.1.0")
>>> model = AutoModel.from_pretrained("D:\\workspace_valeria\\ChatGLM-6b_int4",trust_remote_code=True,revision="v1.1.0").half().cuda()


No compiled kernel found.
Compiling kernels : C:\Users\godli\.cache\huggingface\modules\transformers_modules\ChatGLM-6b_int4\quantization_kernels_parallel.c
Compiling gcc -O3 -fPIC -pthread -fopenmp -std=c99 C:\Users\godli\.cache\huggingface\modules\transformers_modules\ChatGLM-6b_int4\quantization_kernels_parallel.c -shared -o C:\Users\godli\.cache\huggingface\modules\transformers_modules\ChatGLM-6b_int4\quantization_kernels_parallel.so
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libmingwthrd.a when searching for -lmingwthrd
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libmingwthrd.a when searching for -lmingwthrd
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libmingw32.a when searching for -lmingw32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libmingw32.a when searching for -lmingw32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libkernel32.a when searching for -lkernel32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libkernel32.a when searching for -lkernel32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libmoldname.a when searching for -lmoldname
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libmoldname.a when searching for -lmoldname
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libmingwex.a when searching for -lmingwex
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libmingwex.a when searching for -lmingwex
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libmsvcrt.a when searching for -lmsvcrt
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libmsvcrt.a when searching for -lmsvcrt
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libkernel32.a when searching for -lkernel32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libkernel32.a when searching for -lkernel32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libadvapi32.a when searching for -ladvapi32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libadvapi32.a when searching for -ladvapi32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libshell32.a when searching for -lshell32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libshell32.a when searching for -lshell32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libuser32.a when searching for -luser32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libuser32.a when searching for -luser32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libkernel32.a when searching for -lkernel32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libkernel32.a when searching for -lkernel32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libmingwthrd.a when searching for -lmingwthrd
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libmingwthrd.a when searching for -lmingwthrd
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libmingw32.a when searching for -lmingw32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libmingw32.a when searching for -lmingw32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libkernel32.a when searching for -lkernel32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libkernel32.a when searching for -lkernel32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libmoldname.a when searching for -lmoldname
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libmoldname.a when searching for -lmoldname
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libmingwex.a when searching for -lmingwex
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libmingwex.a when searching for -lmingwex
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libmsvcrt.a when searching for -lmsvcrt
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libmsvcrt.a when searching for -lmsvcrt
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libkernel32.a when searching for -lkernel32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libkernel32.a when searching for -lkernel32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: i386 architecture of input file `C:/MinGW/lib/../lib/dllcrt2.o' is incompatible with i386:x86-64 output
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: warning: cannot find entry symbol DllMainCRTStartup; defaulting to 00000002b4091000
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/MinGW/lib/../lib/dllcrt2.o:(.text+0x39): undefined reference to `_free'
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/MinGW/lib/../lib/dllcrt2.o:(.text+0x4f): undefined reference to `_fflush'
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/MinGW/lib/../lib/dllcrt2.o:(.text+0x83): undefined reference to `_DllMain@12'
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/MinGW/lib/../lib/dllcrt2.o:(.text+0xb8): undefined reference to `_malloc'
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/MinGW/lib/../lib/dllcrt2.o:(.text+0xd1): undefined reference to `___dyn_tls_init_callback'
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/MinGW/lib/../lib/dllcrt2.o:(.text+0xee): undefined reference to `__pei386_runtime_relocator'
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/MinGW/lib/../lib/dllcrt2.o:(.text+0xf3): undefined reference to `___main'
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/MinGW/lib/../lib/dllcrt2.o:(.text+0x107): undefined reference to `_DllMain@12'
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/MinGW/lib/../lib/dllcrt2.o:(.text+0x131): undefined reference to `__errno'
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/MinGW/lib/../lib/dllcrt2.o:(.text+0x16b): undefined reference to `___dllonexit'
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/MinGW/lib/../lib/dllcrt2.o:(.text+0x19b): undefined reference to `___dllonexit'
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: ertr000001.o:(.rdata+0x0): undefined reference to `_pei386_runtime_relocator'
collect2.exe: error: ld returned 1 exit status
Compile default cpu kernel failed, using default cpu kernel code.
Compiling gcc -O3 -fPIC -std=c99 C:\Users\godli\.cache\huggingface\modules\transformers_modules\ChatGLM-6b_int4\quantization_kernels.c -shared -o C:\Users\godli\.cache\huggingface\modules\transformers_modules\ChatGLM-6b_int4\quantization_kernels.so
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libmingw32.a when searching for -lmingw32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libmingw32.a when searching for -lmingw32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libkernel32.a when searching for -lkernel32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libkernel32.a when searching for -lkernel32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libmoldname.a when searching for -lmoldname
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libmoldname.a when searching for -lmoldname
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libmingwex.a when searching for -lmingwex
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libmingwex.a when searching for -lmingwex
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libmsvcrt.a when searching for -lmsvcrt
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libmsvcrt.a when searching for -lmsvcrt
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libkernel32.a when searching for -lkernel32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libkernel32.a when searching for -lkernel32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libadvapi32.a when searching for -ladvapi32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libadvapi32.a when searching for -ladvapi32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libshell32.a when searching for -lshell32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libshell32.a when searching for -lshell32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libuser32.a when searching for -luser32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libuser32.a when searching for -luser32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libkernel32.a when searching for -lkernel32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libkernel32.a when searching for -lkernel32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libkernel32.a when searching for -lkernel32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libmoldname.a when searching for -lmoldname
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libmoldname.a when searching for -lmoldname
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libmingwex.a when searching for -lmingwex
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libmingwex.a when searching for -lmingwex
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libmsvcrt.a when searching for -lmsvcrt
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libmsvcrt.a when searching for -lmsvcrt
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib/libkernel32.a when searching for -lkernel32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/lib/../lib\libkernel32.a when searching for -lkernel32
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: i386 architecture of input file `C:/MinGW/lib/../lib/dllcrt2.o' is incompatible with i386:x86-64 output
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: warning: cannot find entry symbol DllMainCRTStartup; defaulting to 00000003307e1000
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/MinGW/lib/../lib/dllcrt2.o:(.text+0x39): undefined reference to `_free'
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/MinGW/lib/../lib/dllcrt2.o:(.text+0x4f): undefined reference to `_fflush'
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/MinGW/lib/../lib/dllcrt2.o:(.text+0x83): undefined reference to `_DllMain@12'
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/MinGW/lib/../lib/dllcrt2.o:(.text+0xb8): undefined reference to `_malloc'
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/MinGW/lib/../lib/dllcrt2.o:(.text+0xd1): undefined reference to `___dyn_tls_init_callback'
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/MinGW/lib/../lib/dllcrt2.o:(.text+0xee): undefined reference to `__pei386_runtime_relocator'
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/MinGW/lib/../lib/dllcrt2.o:(.text+0xf3): undefined reference to `___main'
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/MinGW/lib/../lib/dllcrt2.o:(.text+0x107): undefined reference to `_DllMain@12'
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/MinGW/lib/../lib/dllcrt2.o:(.text+0x131): undefined reference to `__errno'
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/MinGW/lib/../lib/dllcrt2.o:(.text+0x16b): undefined reference to `___dllonexit'
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:/MinGW/lib/../lib/dllcrt2.o:(.text+0x19b): undefined reference to `___dllonexit'
D:/TDM-GCC-64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: ertr000001.o:(.rdata+0x0): undefined reference to `_pei386_runtime_relocator'
collect2.exe: error: ld returned 1 exit status
Compile default cpu kernel failed.
Failed to load kernel.
Cannot load cpu kernel, don't use quantized model on cpu.

Using quantization cache
Applying quantization to glm layers


>>> model = model.eval()
>>> response, history = model.chat(tokenizer, "你好", history=[])
The dtype of attention mask (torch.int64) is not bool
>>> print(response)
你好👋！我是人工智能助手 ChatGLM-6B，很高兴见到你，欢迎问我任何问题。
>>>
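The linker output above is long, but it keeps saying the same thing: the 64-bit TDM-GCC linker skips libraries under C:/MinGW/lib and then complains that dllcrt2.o is i386. That suggests an older 32-bit MinGW install is still being picked up. Here is a minimal sketch for grouping such a log; the function name `summarize_linker_log` and the shortened sample lines are my own, not part of any tool.

```python
import re
from collections import Counter

# Matches lines such as:
#   ...ld.exe: skipping incompatible C:/MinGW/lib/../lib/libmingw32.a when searching for -lmingw32
SKIP_RE = re.compile(r"skipping incompatible (\S+) when searching for (-l\S+)")


def summarize_linker_log(log_text):
    """Count, per drive and top-level folder, how many libraries ld skipped."""
    skipped = Counter()
    for match in SKIP_RE.finditer(log_text):
        lib_path = match.group(1).replace("\\", "/")
        # Keep the drive and the first folder, e.g. "C:/MinGW"
        root = "/".join(lib_path.split("/")[:2])
        skipped[root] += 1
    return dict(skipped)


# Shortened sample lines, shaped like the log above
sample = (
    "D:/TDM-GCC-64/bin/ld.exe: skipping incompatible "
    "C:/MinGW/lib/../lib/libmingw32.a when searching for -lmingw32\n"
    "D:/TDM-GCC-64/bin/ld.exe: skipping incompatible "
    "C:/MinGW/lib/../lib\\libkernel32.a when searching for -lkernel32\n"
)
print(summarize_linker_log(sample))
```

If every skipped library sits under one old install folder, cleaning up that install or its environment entries is probably the first thing to try.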

Environment that worked in the end


-   OS:Windows11
-   Python:3.9.0
-   Transformers:4.27.1
-   PyTorch:2.0.1+cu118
-   CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :True
-   gcc: 6.3.0

Create the virtual environment

conda create -n chatglm_env python=3.9

conda activate chatglm_env

(chatglm_env) PS C:\Users\godli> pip list
Package                 Version
----------------------- ------------
accelerate              0.20.3
aiofiles                23.1.0
aiohttp                 3.8.4
aiosignal               1.3.1
altair                  5.0.1
astroid                 2.15.1
async-timeout           4.0.2
attrs                   23.1.0
autopep8                1.6.0
coloredlogs             15.0.1
cpm-kernels             1.0.11
dataclasses-json        0.5.8
dill                    0.3.6
docstring-to-markdown   0.12
ffmpy                   0.3.0
filelock                3.12.1
flake8                  6.0.0
frozenlist              1.3.3
fsspec                  2023.6.0
future                  0.18.3
google-search-results   2.4.2
gradio                  3.35.2
gradio_client           0.2.7
greenlet                2.0.2
huggingface-hub         0.15.1
humanfriendly           10.0
isort                   5.12.0
itchat                  1.2.32
jedi                    0.18.2
jsonschema              4.17.3
langchain               0.0.198
langchainplus-sdk       0.0.8
latex2mathml            3.76.0
lazy-object-proxy       1.9.0
linkify-it-py           2.0.2
lxml                    4.9.2
markdown-it-py          2.2.0
marshmallow             3.19.0
marshmallow-enum        1.5.1
mccabe                  0.7.0
mdit-py-plugins         0.3.3
mdtex2html              1.2.0
mdurl                   0.1.2
multidict               6.0.4
mypy-extensions         1.0.0
numexpr                 2.8.4
openai                  0.27.8
openapi-schema-pydantic 1.2.4
orjson                  3.9.1
packaging               23.1
parso                   0.8.3
pip                     23.1.2
platformdirs            3.2.0
pluggy                  1.0.0
psutil                  5.9.5
pycodestyle             2.10.0
pydantic                1.10.9
pydocstyle              6.2.3
pydub                   0.25.1
pyflakes                3.0.1
pylint                  2.17.1
pypdf                   3.9.1
pypdfium2               4.15.0
pypiwin32               223
pypng                   0.20220715.0
PyQRCode                1.2.1
pyreadline3             3.4.1
pyrsistent              0.19.3
python-dotenv           1.0.0
python-lsp-jsonrpc      1.0.0
python-lsp-server       1.7.1
pytoolconfig            1.2.5
pywin32                 306
PyYAML                  6.0
qrcode                  7.4.2
requests-toolbelt       1.0.0
rope                    1.7.0
semantic-version        2.10.0
sentencepiece           0.1.99
setuptools              67.8.0
snowballstemmer         2.2.0
SQLAlchemy              2.0.16
tenacity                8.2.2
toml                    0.10.2
tomli                   2.0.1
tomlkit                 0.11.7
typing_extensions       4.5.0
typing-inspect          0.9.0
uc-micro-py             1.0.2
ujson                   5.7.0
whatthepatch            1.0.4
wheel                   0.38.4
wxpy                    0.3.9.8
yapf                    0.32.0
yarl                    1.9.2

Clearly, conda did not install PyTorch or the other packages we need; we have to install them manually.


Install the modules and resolve conflicts

Fork the chatglm-6b repo, clone it locally, enter that folder, and run:

pip install -r requirements.txt

Now let's look at the package list in the environment

(chatglm_env) PS D:\workspace_valeria\ChatGLM-6B_git> pip list
Package                 Version
----------------------- ------------
accelerate              0.20.3
aiofiles                23.1.0
aiohttp                 3.8.4
aiosignal               1.3.1
altair                  5.0.1
anyio                   3.7.0
astroid                 2.15.1
async-timeout           4.0.2
attrs                   23.1.0
autopep8                1.6.0
certifi                 2022.12.7
charset-normalizer      2.1.1
click                   8.1.3
colorama                0.4.6
coloredlogs             15.0.1
contourpy               1.1.0
cpm-kernels             1.0.11
cycler                  0.11.0
dataclasses-json        0.5.8
dill                    0.3.6
docstring-to-markdown   0.12
exceptiongroup          1.1.1
fastapi                 0.97.0
ffmpy                   0.3.0
filelock                3.12.1
flake8                  6.0.0
fonttools               4.40.0
frozenlist              1.3.3
fsspec                  2023.6.0
future                  0.18.3
google-search-results   2.4.2
gradio                  3.35.2
gradio_client           0.2.7
greenlet                2.0.2
h11                     0.14.0
httpcore                0.17.2
httpx                   0.24.1
huggingface-hub         0.15.1
humanfriendly           10.0
idna                    3.4
importlib-metadata      6.6.0
importlib-resources     5.12.0
isort                   5.12.0
itchat                  1.2.32
jedi                    0.18.2
Jinja2                  3.1.2
jsonschema              4.17.3
kiwisolver              1.4.4
langchain               0.0.198
langchainplus-sdk       0.0.8
latex2mathml            3.76.0
lazy-object-proxy       1.9.0
linkify-it-py           2.0.2
lxml                    4.9.2
Markdown                3.4.3
markdown-it-py          2.2.0
MarkupSafe              2.1.2
marshmallow             3.19.0
marshmallow-enum        1.5.1
matplotlib              3.7.1
mccabe                  0.7.0
mdit-py-plugins         0.3.3
mdtex2html              1.2.0
mdurl                   0.1.2
mpmath                  1.2.1
multidict               6.0.4
mypy-extensions         1.0.0
networkx                3.0
numexpr                 2.8.4
numpy                   1.24.1
openai                  0.27.8
openapi-schema-pydantic 1.2.4
orjson                  3.9.1
packaging               23.1
pandas                  2.0.2
parso                   0.8.3
Pillow                  9.3.0
pip                     23.1.2
platformdirs            3.2.0
pluggy                  1.0.0
protobuf                4.23.3
psutil                  5.9.5
pycodestyle             2.10.0
pydantic                1.10.9
pydocstyle              6.2.3
pydub                   0.25.1
pyflakes                3.0.1
Pygments                2.15.1
pylint                  2.17.1
pyparsing               3.0.9
pypdf                   3.9.1
pypdfium2               4.15.0
pypiwin32               223
pypng                   0.20220715.0
PyQRCode                1.2.1
pyreadline3             3.4.1
pyrsistent              0.19.3
python-dateutil         2.8.2
python-dotenv           1.0.0
python-lsp-jsonrpc      1.0.0
python-lsp-server       1.7.1
python-multipart        0.0.6
pytoolconfig            1.2.5
pytz                    2023.3
pywin32                 306
PyYAML                  6.0
qrcode                  7.4.2
regex                   2023.6.3
requests                2.28.1
requests-toolbelt       1.0.0
rope                    1.7.0
semantic-version        2.10.0
sentencepiece           0.1.99
setuptools              67.8.0
six                     1.16.0
sniffio                 1.3.0
snowballstemmer         2.2.0
SQLAlchemy              2.0.16
starlette               0.27.0
sympy                   1.11.1
tenacity                8.2.2
tokenizers              0.13.3
toml                    0.10.2
tomli                   2.0.1
tomlkit                 0.11.7
toolz                   0.12.0
torch                   2.0.1
tqdm                    4.65.0
transformers            4.27.1
typing_extensions       4.5.0
typing-inspect          0.9.0
tzdata                  2023.3
uc-micro-py             1.0.2
ujson                   5.7.0
urllib3                 1.26.13
uvicorn                 0.22.0
websockets              11.0.3
whatthepatch            1.0.4
wheel                   0.38.4
wxpy                    0.3.9.8
yapf                    0.32.0
yarl                    1.9.2
zipp                    3.15.0

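We see torch 2.0.1. There is a problem here: requirements.txt only asks for torch>=1.10, so the build pip picks is not guaranteed to match your CUDA setup, and in that case you will get errors when the model tries to use the GPU.

This comes down to matching the CUDA and torch versions.

Fix:

Check whether your torch supports CUDA:


import torch 
torch.cuda.is_available()

# Prints True if they match, but right now I get False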
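A bare True/False does not say why the check failed. Here is a small sketch that prints a bit more, assuming `torch.version.cuda` is None on a CPU-only build (which is how I understand it to behave); the helper name `describe_torch_build` is my own invention.

```python
def describe_torch_build(version, cuda_build, cuda_available):
    """Explain a torch install from three facts.

    version:        torch.__version__, e.g. "2.0.1" or "2.0.1+cu118"
    cuda_build:     torch.version.cuda, None for a CPU-only build
    cuda_available: torch.cuda.is_available()
    """
    if cuda_build is None:
        return f"torch {version} is a CPU-only build; reinstall a CUDA build"
    if not cuda_available:
        return (f"torch {version} was built for CUDA {cuda_build}, "
                "but no usable GPU/driver was found")
    return f"torch {version} with CUDA {cuda_build} is ready"


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        print(describe_torch_build(torch.__version__,
                                   torch.version.cuda,
                                   torch.cuda.is_available()))
```

In my case the plain `torch 2.0.1` entry in the pip list (no `+cu118` suffix) would most likely land in the first branch.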

What if it prints False?

Check your CUDA version

You can check it from the command line:


nvidia-smi


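You can see that my CUDA version is 12.1

Although the PyTorch website currently only offers builds for CUDA 11.8, the community has made it clear that they are compatible with newer CUDA versions. (The version nvidia-smi shows is the highest CUDA version the driver supports, so a 12.1 driver can run the cu118 build.)

The matching PyTorch should be the latest version

 pip uninstall torch torchvision torchaudio
 pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
 
 # You can also use
 `pip install torch==2.0.1+cu118 torchvision==0.15.2+cu118 torchaudio==2.0.2+cu118 -f https://download.pytorch.org/whl/torch_stable.html`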
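To make the "12.1 driver, cu118 wheel" reasoning concrete, here is a rough sketch that reads the CUDA version out of the nvidia-smi header and compares it with the wheel's CUDA version. The sample header line is made up in the shape nvidia-smi prints, both function names are mine, and the "driver version >= wheel version" rule is a simplification.

```python
import re


def driver_cuda_version(nvidia_smi_text):
    """Extract the 'CUDA Version: X.Y' field from nvidia-smi's header, or None."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", nvidia_smi_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def can_use_wheel(driver_version, wheel_version=(11, 8)):
    """True when the driver's CUDA version is at least the wheel's CUDA version."""
    return driver_version is not None and driver_version >= wheel_version


# Hypothetical header line, shaped like the one nvidia-smi prints
header = "| NVIDIA-SMI 531.14    Driver Version: 531.14    CUDA Version: 12.1 |"
version = driver_cuda_version(header)
print(version, can_use_wheel(version))
```

On a real machine you could feed it the text from `subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout` instead of the sample line.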

The packages now look like this:

torch                   2.0.1+cu118
torchaudio              2.0.2+cu118
torchvision             0.15.2+cu118

Verification:


import torch
torch.cuda.is_available()
>> True

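Download the model

Strongly recommended: download the model weights manually!!!

If you let them download automatically, all you need to do is run this locally:

>>> from transformers import AutoTokenizer, AutoModel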
>>> tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
>>> model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
>>> model = model.eval()
>>> response, history = model.chat(tokenizer, "你好", history=[])
>>> print(response)
你好👋!我是人工智能助手 ChatGLM-6B,很高兴见到你,欢迎问我任何问题。
>>> response, history = model.chat(tokenizer, "晚上睡不着应该怎么办", history=history)
>>> print(response)
晚上睡不着可能会让你感到焦虑或不舒服,但以下是一些可以帮助你入睡的方法:

1. 制定规律的睡眠时间表:保持规律的睡眠时间表可以帮助你建立健康的睡眠习惯,使你更容易入睡。尽量在每天的相同时间上床,并在同一时间起床。
2. 创造一个舒适的睡眠环境:确保睡眠环境舒适,安静,黑暗且温度适宜。可以使用舒适的床上用品,并保持房间通风。
3. 放松身心:在睡前做些放松的活动,例如泡个热水澡,听些轻柔的音乐,阅读一些有趣的书籍等,有助于缓解紧张和焦虑,使你更容易入睡。
4. 避免饮用含有咖啡因的饮料:咖啡因是一种刺激性物质,会影响你的睡眠质量。尽量避免在睡前饮用含有咖啡因的饮料,例如咖啡,茶和可乐。
5. 避免在床上做与睡眠无关的事情:在床上做些与睡眠无关的事情,例如看电影,玩游戏或工作等,可能会干扰你的睡眠。
6. 尝试呼吸技巧:深呼吸是一种放松技巧,可以帮助你缓解紧张和焦虑,使你更容易入睡。试着慢慢吸气,保持几秒钟,然后缓慢呼气。

如果这些方法无法帮助你入睡,你可以考虑咨询医生或睡眠专家,寻求进一步的建议。
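The second call in the session above passes `history=history`, which is what turns separate questions into one conversation. Here is a minimal sketch of that threading, with a fake chat function standing in for the model so it runs without a GPU; `run_dialogue` and `fake_chat` are names I made up, and I am assuming the history is a list of (query, response) pairs.

```python
def run_dialogue(chat_fn, prompts):
    """Send prompts in order, feeding each returned history into the next call."""
    history = []
    responses = []
    for prompt in prompts:
        response, history = chat_fn(prompt, history)
        responses.append(response)
    return responses, history


def fake_chat(prompt, history):
    """Stand-in for the model: echoes the prompt and appends the turn to history."""
    response = f"echo: {prompt}"
    return response, history + [(prompt, response)]


responses, history = run_dialogue(fake_chat, ["hello", "how are you"])
print(responses)
print(len(history))
```

With the real model loaded, the stand-in could be swapped for `lambda q, h: model.chat(tokenizer, q, history=h)`.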

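In practice the connection is usually poor, so I recommend downloading manually and loading the model from a local folder!

Downloading with git lfs failed with an error for me~

You can find the weights you need (here)

Full model: download everything here, then drag it into the repo you cloned
chatglm-6b-int8: download everything here, then drag it into the repo you cloned (chatglm-6b-int8)
chatglm-6b-int4: download everything here, then drag it into the repo you cloned (chatglm-6b-int4)

But the weights alone are not enough: required files such as model_config.json are missing. So the place you should really download from is this one!

Full model: download everything here, then drag it into the repo you cloned
chatglm-6b-int8: download everything here, then drag it into the repo you cloned (chatglm-6b-int8)
chatglm-6b-int4: download everything here, then drag it into the repo you cloned (chatglm-6b-int4)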
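After dragging the files in, a quick check that nothing is missing can save a confusing error later. This is only a sketch: `missing_files` is a hypothetical helper, and the file names in `ASSUMED_REQUIRED` are placeholders I picked for illustration, so compare them against the file list on the model page you downloaded from.

```python
from pathlib import Path

# Assumed file names for illustration only; use the real list from the model page
ASSUMED_REQUIRED = ("config.json", "tokenizer_config.json")


def missing_files(model_dir, required=ASSUMED_REQUIRED):
    """Return the names from `required` that are absent from model_dir."""
    folder = Path(model_dir)
    return [name for name in required if not (folder / name).is_file()]
```

Calling `missing_files(r"D:\workspace_valeria\ChatGLM-6b_int4")` returns an empty list when every listed file is present.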

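Preparing to run the quantized model

Official text:

Environment setup: install the dependencies with pip: pip install -r requirements.txt. The recommended transformers version is 4.27.1, but in theory anything not lower than 4.23.1 works. In addition, running the quantized model on CPU requires gcc and openmp. Most Linux distributions ship them by default. On Windows, tick openmp when installing TDM-GCC. The Windows test environment used TDM-GCC 10.3.0 and the Linux one gcc 11.3.0. On MacOS, see Q1.

Downloading gcc

jmeubank.github.io/tdm-gcc/

Version 10.3.0 is the usual choice; the largest file in the middle of the list, tdm64-gcc-10.3.0-2.exe, is the one you want.

Choose the all packages install.

The installer adds gcc to PATH by default, but please double check whether PATH contains more than one gcc entry. A previous install may not have used all packages~ If there is an older entry, delete it.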


Now we can call the model from code!

Code for loading from a local folder:

from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("D:\\workspace_valeria\\ChatGLM-6b_int4",trust_remote_code=True,revision="v1.1.0")
model = AutoModel.from_pretrained("D:\\workspace_valeria\\ChatGLM-6b_int4",trust_remote_code=True,revision="v1.1.0").half().cuda()
model = model.eval()
response, history = model.chat(tokenizer, "你好", history=[])
print(response)

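# Note that D:\\workspace_valeria\\ChatGLM-6b_int4 is where the repo sits on my machine; replace it with your own path
# If the path is wrong, for example written without \\ double backslashes as D:\workspace\chatglm_6b_int4, it gets treated as a Hugging Face repo id to download remotely, and you get this error: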
huggingface_hub.utils._validators.HFValidationError: Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden, '-' and '.' cannot start or end the name, max length is 96: 'D:\workspace\chatglm_6b_int4'.

Fixing bugs

huggingface_hub.utils._validators.HFValidationError: Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden, '-' and '.' cannot start or end the name, max length is 96: 'D:\workspace\chatglm_6b_int4'.
This error is raised at

`tokenizer = AutoTokenizer.from_pretrained("YOUR_REPO_ABS_PATH",trust_remote_code=True,revision="v1.1.0")`

Cause:

The path is wrong: it was not written with \\ double backslashes, for example D:\workspace\chatglm_6b_int4. A string that does not point to an existing local folder is treated as a Hugging Face repo id to download remotely, which raises the error.

Fix:

Use the correct local path: D:\\workspace_valeria\\ChatGLM-6b_int4
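One way to catch a bad path before transformers ever sees it is to check that the folder exists first. The sketch below uses a hypothetical helper, `local_model_dir`, and also shows why a single backslash in a normal Python string can silently change the path.

```python
from pathlib import Path


def local_model_dir(path_str):
    """Return path_str as a Path if it is an existing folder, else raise."""
    folder = Path(path_str)
    if not folder.is_dir():
        raise FileNotFoundError(f"not a local folder: {path_str!r}")
    return folder


# In a normal string a backslash can start an escape sequence: "\t" is a tab.
assert "D:\table" != r"D:\table"
# A raw string and a doubled backslash spell the same path.
assert r"D:\table" == "D:\\table"
```

Passing `str(local_model_dir(r"D:\workspace_valeria\ChatGLM-6b_int4"))` to `from_pretrained` then fails early with a readable message if the folder is missing, instead of falling through to the repo id check.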

[BUG/Help] Error when loading the INT-4 model on GPU under Windows

The error output:

>>> model = AutoModel.from_pretrained("D:\\workspace_valeria\\ChatGLM-6b_int4",trust_remote_code=True,revision="v1.1.0").half().cuda()
No compiled kernel found.
Compiling kernels : C:\Users\godli\.cache\huggingface\modules\transformers_modules\local\quantization_kernels_parallel.c
Compiling gcc -O3 -fPIC -pthread -fopenmp -std=c99 C:\Users\godli\.cache\huggingface\modules\transformers_modules\local\quantization_kernels_parallel.c -shared -o C:\Users\godli\.cache\huggingface\modules\transformers_modules\local\quantization_kernels_parallel.so
c:/mingw/bin/../lib/gcc/mingw32/6.3.0/../../../../mingw32/bin/ld.exe: cannot find -lpthread
collect2.exe: error: ld returned 1 exit status
Compile default cpu kernel failed, using default cpu kernel code.
Compiling gcc -O3 -fPIC -std=c99 C:\Users\godli\.cache\huggingface\modules\transformers_modules\local\quantization_kernels.c -shared -o C:\Users\godli\.cache\huggingface\modules\transformers_modules\local\quantization_kernels.so
Load default cpu kernel failed:
Traceback (most recent call last):
  File "C:\Users\godli/.cache\huggingface\modules\transformers_modules\local\quantization.py", line 178, in __init__
    kernels = ctypes.cdll.LoadLibrary(kernel_file)
  File "D:\Python\Python39\lib\ctypes\__init__.py", line 452, in LoadLibrary
    return self._dlltype(name)
  File "D:\Python\Python39\lib\ctypes\__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: [WinError 193] %1 不是有效的 Win32 应用程序。

Failed to load kernel.
Cannot load cpu kernel, don't use quantized model on cpu.
Using quantization cache
Applying quantization to glm layers

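Final fix (you can read just this part; the cause and the full story are also written up below)

When you download gcc, you should install the all packages version

and make sure to remove the 32-bit C:\MinGW\bin from PATH, keeping D:\TDM-GCC-64\bin

Cause: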


Compiling gcc -O3 -fPIC -pthread -fopenmp -std=c99 C:\Users\godli\.cache\huggingface\modules\transformers_modules\local\quantization_kernels_parallel.c -shared -o C:\Users\godli\.cache\huggingface\modules\transformers_modules\local\quantization_kernels_parallel.so

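This log line shows the program trying to compile a C extension library with GCC. The compile command fails because the linker cannot find the -lpthread library. GCC on Windows does not necessarily come with pthread support, and the error path shows that the gcc being picked up is the 32-bit MinGW 6.3.0 under c:/mingw.

The program then falls back to the default CPU kernel code and compiles that instead, but loading the resulting library fails with [WinError 193] %1 不是有效的 Win32 应用程序 (%1 is not a valid Win32 application). This error usually means the file being loaded is not a valid executable or DLL for the current process; a common cause is a 32-bit library being loaded into a 64-bit Python.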
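Since WinError 193 often points at a 32-bit versus 64-bit mismatch, it is worth confirming which one your Python is. A small sketch using only the standard library; `python_bitness` is a hypothetical helper name.

```python
import platform
import struct


def python_bitness():
    """Pointer size of the running interpreter, in bits."""
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    print(python_bitness(), platform.machine())
```

A 64-bit Python cannot load a library built by a 32-bit gcc, which fits the c:/mingw/.../mingw32 path in the log above.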

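Fix:

Install and configure a GCC compiler properly, or run your code in an environment that supports GCC and the -pthread option (for example Linux).

How to make GCC support the -pthread option on Windows

Many GCC builds for Windows do not support the -pthread option out of the box, because Windows has no native support for POSIX threads (Pthreads). Some third-party libraries provide Pthreads on Windows, however, for example pthreads-w32.

The steps to install and use pthreads-w32 on Windows:

1. Download pthreads-w32 from www.sourceware.org/pthreads-wi…
   That page serves FTP links hosted in the US; you can use the mirror mirrors.tuna.tsinghua.edu.cn/sourceware/… instead.
   Pick the latest version, 2.9.1: pthreads-w32-2-9-1-release.zip
2. Unzip the download. You get a directory of header files such as pthread.h, sched.h and semaphore.h, and a directory of dynamic libraries such as pthreadGC2.dll and pthreadVC2.dll.
3. Add the header directory to your include path by setting the environment variable C_INCLUDE_PATH.
   Include path: the list of directories the compiler searches for header files (for example #include <pthread.h>).
   Separate multiple paths with semicolons (;).
4. Add the library directory to your library path by setting the environment variable LIBRARY_PATH.
   Library path: the list of directories the linker searches for library files (for example -lpthreadGC2).
   Separate multiple paths with semicolons (;).
5. In your GCC compile command, use -lpthreadGC2 in place of -pthread.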

gcc -O3 -fPIC -lpthreadGC2 -fopenmp -std=c99 C:\Users\godli\.cache\huggingface\modules\transformers_modules\local\quantization_kernels_parallel.c -shared -o C:\Users\godli\.cache\huggingface\modules\transformers_modules\local\quantization_kernels_parallel


c:/mingw/bin/../lib/gcc/mingw32/6.3.0/../../../../mingw32/bin/ld.exe: cannot find -lpthreadGC2
c:/mingw/bin/../lib/gcc/mingw32/6.3.0/../../../../mingw32/bin/ld.exe: cannot find -lpthread
collect2.exe: error: ld returned 1 exit status

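I tried a few more things and got stuck, and by the time I was about to try WSL it was very late. After talking it over with someone, it turned out that gcc had been installed twice: the earlier gcc, installed without pthread ticked, had been added to PATH first, and PATH is searched from top to bottom, so the old one won. That is what caused this bug.

I removed the 32-bit C:\MinGW\bin from PATH and kept D:\TDM-GCC-64\bin (all packages).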
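To see which gcc wins on your own machine, you can list every copy along PATH in the order it is searched. A minimal sketch; `find_on_path` is a hypothetical helper, and the `.exe` extension list is an assumption aimed at Windows.

```python
import os


def find_on_path(name, path_value=None, extensions=(".exe", "")):
    """List every match for `name` along PATH, in search order."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for folder in path_value.split(os.pathsep):
        if not folder:
            continue
        for ext in extensions:
            candidate = os.path.join(folder, name + ext)
            if os.path.isfile(candidate):
                hits.append(candidate)
                break  # one hit per folder is enough
    return hits


if __name__ == "__main__":
    for position, path in enumerate(find_on_path("gcc")):
        print(position, path)
```

The entry at position 0 is the one that runs when a program calls gcc, so an old MinGW entry listed before TDM-GCC reproduces the situation described above.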

Now model = model.eval() runs through successfully~ Hooray!

AttributeError: 'Logger' object has no attribute 'warning_once'

Cause

The transformers version was 4.26.1 at this point, not 4.27.1

Fix

Upgrading transformers to 4.27.1 solves it!

pip install protobuf transformers==4.27.1 cpm_kernels
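To confirm which transformers version actually ended up in the environment, something like the following works. It is a sketch with hypothetical helper names, and the version parsing is deliberately naive: it stops at the first non-numeric part rather than handling every version format.

```python
from importlib import metadata


def version_tuple(text):
    """Turn '4.27.1' into (4, 27, 1); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum="4.27.1"):
    """True when the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


def installed_version(package="transformers"):
    """Installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(found, meets_minimum(found) if found else "not installed")
```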

>>> response, history = model.chat(tokenizer, "你好", history=[])
The dtype of attention mask (torch.int64) is not bool
>>> print(response)
你好👋！我是人工智能助手 ChatGLM-6B，很高兴见到你，欢迎问我任何问题。
>>> response, history = model.chat(tokenizer, "你好", history=[])
>>> print(response)
你好👋！我是人工智能助手 ChatGLM-6B，很高兴见到你，欢迎问我任何问题。
>>> response, history = model.chat(tokenizer, "你是谷歌开发的", history=[])
>>> print(response)
我不是谷歌开发的。我是一个名为 ChatGLM-6B 的人工智能助手，是由清华大学 KEG 实验室和智谱 AI 公司于 2023 年共同训练的语言 模型开发的。我的任务是针对用户的问题和要求提供适当的答复和支持。
>>>

🎉🎉🎉🎉🎉🎉 That's a wrap!

References:

zhuanlan.zhihu.com/p/620455056

github.com/THUDM/ChatG…

blog.csdn.net/AiTanXiing/…

Reposted from: https://juejin.cn/post/7245682364932669496