第一步:先向昇騰方申請設備,申請到 Atlas 800 9000 服務器,使用昇騰官方提供的賬號和密碼保證可以登錄上服務器;
(1) 更新一下驅動,因為昇騰官方的提供的鏡像需要指定版本的驅動固件,下載安裝更新 Version: 23.0.rc2 將會變更為 Version: 23.0.0,下載地址:社區版 - 固件與驅動 - 昇騰社區
更新安裝固件,并更新固件,重啟設備,一切以昇騰官方的最新驅動和公告為準
[root@dify HwHiAiUser]# pwd /home/HwHiAiUser [root@dify HwHiAiUser]# ls -l total 131112 -rw------- 1 root root 134251528 Dec 7 16:16 Ascend-hdk-910-npu-driver_23.0.0_linux-aarch64.run [root@dify HwHiAiUser]# chmod 777 Ascend-hdk-910-npu-driver_23.0.0_linux-aarch64.run [root@dify HwHiAiUser]# ls Ascend-hdk-910-npu-driver_23.0.0_linux-aarch64.run [root@dify HwHiAiUser]# sudo ./Ascend-hdk-910-npu-driver_23.0.0_linux-aarch64.run --full --force Verifying archive integrity... 100% SHA256 checksums are OK. All good. Uncompressing ASCEND DRIVER RUN PACKAGE 100% [Driver] [2025-02-23 15:46:26] [INFO]Start time: 2025-02-23 15:46:26 [Driver] [2025-02-23 15:46:26] [INFO]LogFile: /var/log/ascend_seclog/ascend_install.log [Driver] [2025-02-23 15:46:26] [INFO]OperationLogFile: /var/log/ascend_seclog/operation.log [Driver] [2025-02-23 15:46:26] [INFO]base version is 23.0.rc2. [Driver] [2025-02-23 15:46:26] [WARNING]Do not power off or restart the system during the installation/upgrade [Driver] [2025-02-23 15:46:26] [INFO]set username and usergroup, HwHiAiUser:HwHiAiUser [Driver] [2025-02-23 15:46:26] [INFO]Driver package has been installed on the path /usr/local/Ascend, the version is 23.0.rc2, and the version of this package is 23.0.0,do you want to continue? [y/n] y [Driver] [2025-02-23 15:46:36] [INFO]driver install type: Direct [Driver] [2025-02-23 15:46:36] [INFO]upgradePercentage:10% [Driver] [2025-02-23 15:46:40] [INFO]upgradePercentage:30% [Driver] [2025-02-23 15:46:40] [INFO]upgradePercentage:40% [Driver] [2025-02-23 15:46:42] [INFO]upgradePercentage:90% [Driver] [2025-02-23 15:46:45] [INFO]upgradePercentage:100% [Driver] [2025-02-23 15:46:45] [INFO]Driver package installed successfully! Reboot needed for installation/upgrade to take effect! [Driver] [2025-02-23 15:46:45] [INFO]End time: 2025-02-23 15:46:45 [root@dify HwHiAiUser]# sudo reboot
固件更新完成,查看驅動版本為 Version: 23.0.0
(2) 將基礎模型先下載下來,一會進行掛載推理模型,分詞模型、到排序模型,進行使用,可以去魔搭社區下載 ModelScope 魔搭社區,先下載模型:DeepSeek-R1-Distill-Qwen-32B , 下載使用方式參考官方指導方式即可;
使用 python 腳本下載模型
[root@dify HwHiAiUser]# pwd /home/HwHiAiUser [root@dify HwHiAiUser]# pip3 install modelscope==1.18.0 -ihttps://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple [root@dify HwHiAiUser]# python3 Python 3.7.0 (default, May 11 2024, 10:32:14) [GCC 7.3.0] on linux Type "help", "copyright", "credits" or "license" for more information. >>> import modelscope >>> exit() [root@dify HwHiAiUser]# cat down.py #模型下載 from modelscope import snapshot_download model_dir = snapshot_download('deepseek-ai/DeepSeek-R1-Distill-Qwen-32B',cache_dir=".") [root@dify HwHiAiUser]# python3 down.py Downloading [figures/benchmark.jpg]: 100%|██████████████████████████████████████████████████████████████████████| 759k/759k [00:00<00:00, 1.78MB/s] Downloading [config.json]: 100%|██████████████████████████████████████████████████████████████████████████████████| 664/664 [00:00<00:00, 2.10kB/s] Downloading [configuration.json]: 100%|███████████████████████████████████████████████████████████████████████████| 73.0/73.0 [00:00<00:00, 233B/s] Downloading [generation_config.json]: 100%|█████████████████████████████████████████████████████████████████████████| 181/181 [00:00<00:00, 686B/s] Downloading [LICENSE]: 100%|██████████████████████████████████████████████████████████████████████████████████| 1.04k/1.04k [00:00<00:00, 2.92kB/s] Downloading [model-00001-of-000008.safetensors]: 0%| | 1.00M/8.19G [00:00<59:21, 2.47MB/s]Downloading [model-00001-of-000008.safetensors]: 0%| | 16.0M/8.19G [00:00<03:43, 39.3MB/s]
下載完成,查看權重目錄
[root@dify HwHiAiUser]# pwd /home/HwHiAiUser [root@dify HwHiAiUser]# tree -L 2 ├── Ascend-hdk-910-npu-driver_23.0.0_linux-aarch64.run ├── deepseek-ai │ ├── DeepSeek-R1-Distill-Qwen-32B │ └── down.py 3 directories, 3 files
二、使用官方鏡像 昇騰鏡像倉庫詳情(https://www.hiascend.com/developer/ascendhub/detail/mindie),進行昇騰 MindIE 環境構建,因為計劃測試 DeepSeek-R1-Distill-Qwen-32B-W8A8 模型,所以記得創建容器掛載兩張卡即可
(1)拉取 Atals 800 9000 鏡像,建議從官方拉取,自己要根據自己的機型拉取對應的鏡像,一切以官方為主,青島的鏡像也是在官方鏡像上,打包做過細微不影響運行的修改
也可以從下面的公開鏈接拉取鏡像,創建雙卡容器
[root@dify HwHiAiUser]#yum install docker [root@dify HwHiAiUser]# docker pull swr.cn-east-317.qdrgznjszx.com/sxj731533730/mindie:atlas_800_9000 Error response from daemon: Get https://swr.cn-east-317.qdrgznjszx.com/v2/: x509: certificate signed by unknown authority [root@dify HwHiAiUser]#
修改配置源,添加 mindie 的鏡像源;
解決辦法: [root@dify HwHiAiUser]#vim /etc/docker/daemon.json 填入內容 { "insecure-registries": ["https://swr.cn-east-317.qdrgznjszx.com"/], "registry-mirrors": ["https://docker.mirrors.ustc.edu.cn"/] } 保存退出、然后重啟 docker 即可 [root@dify HwHiAiUser]# systemctl restart docker.service [root@dify HwHiAiUser]# docker pull swr.cn-east-317.qdrgznjszx.com/sxj731533730/mindie:atlas_800_9000 atlas_800_9000: Pulling from qd-aicc/mindie edab87ea811e: Pull complete 72906c864c93: Pull complete 98f62a370e96: Pull complete Digest: sha256:6ceefe4506f58084717ec9bed7df75e51032fdd709d791a627084fe4bd92abea Status: Downloaded newer image for swr.cn-east-317.qdrgznjszx.com/qd-aicc/mindie:atlas_800_9000 [root@dify HwHiAiUser]#
創建容器,進入容器,計劃使用兩張昇騰 NPU 卡推理 DeepSeek-R1-Distill-Qwen-32B 的 W8A8 模型,所以構建的容器用兩張卡,選 6、7 卡吧,0-6 號卡可以跑文本嵌入模型、重排序模型;創建容器腳本
[root@dify ~]# cd /home/HwHiAiUser/ [root@dify HwHiAiUser]# ls Ascend-hdk-910-npu-driver_23.0.0_linux-aarch64.run deepseek-ai down.py [root@dify HwHiAiUser]# docker images REPOSITORY TAG IMAGE ID CREATED SIZE swr.cn-east-317.qdrgznjszx.com/sxj731533730/mindie atlas_800_9000 69f30d0c15be 5 weeks ago 16.5GB [root@dify HwHiAiUser]# vim docker_run.sh [root@dify HwHiAiUser]# vim docker_run.sh [root@dify HwHiAiUser]# vim docker_run.sh [root@dify HwHiAiUser]# cat docker_run.sh #!/bin/bash docker_images=swr.cn-east-317.qdrgznjszx.com/sxj731533730/mindie:atlas_800_9000 model_dir=/home/HwHiAiUser #根據實際情況修改掛載目錄 docker run -it --name qdaicc --ipc=host --net=host \ --device=/dev/davinci6 \ --device=/dev/davinci7 \ --device=/dev/davinci_manager \ --device=/dev/devmm_svm \ --device=/dev/hisi_hdc \ -v /usr/local/dcmi:/usr/local/dcmi \ -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \ -v /usr/local/Ascend/driver/lib64/common:/usr/local/Ascend/driver/lib64/common \ -v /usr/local/Ascend/driver/lib64/driver:/usr/local/Ascend/driver/lib64/driver \ -v /etc/ascend_install.info:/etc/ascend_install.info \ -v /etc/vnpu.cfg:/etc/vnpu.cfg \ -v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \ -v ${model_dir}:${model_dir} \ -v /var/log/npu:/usr/slog ${docker_images} \ /bin/bash [root@dify HwHiAiUser]#
填進去內容如上,啟動鏡像
[root@dify HwHiAiUser]# bash docker_run.sh (Python310) root@dify:/usr/local/Ascend/atb-models# cd /home/HwHiAiUser/ (Python310) root@dify:/home/HwHiAiUser# ls Ascend-hdk-910-npu-driver_23.0.0_linux-aarch64.run deepseek-ai docker_run.sh down.py
因為之前掛在的目錄是 /home/HwHiAiUser/ ,所以可以在 docker 里面看到物理機的下載權重,再查看一下卡數是兩張
(2)進行模型量化 Ascend/ModelZoo-PyTorch - Gitee.com(https://gitee.com/ascend/ModelZoo-PyTorch/tree/master/MindIE/LLM/DeepSeek/DeepSeek-R1-Distill-Qwen-32B) 直接進入量化階段,在容器外面操作即可,環境不用管,因為系統已經默認配置了環境,直接跳到 權重量化 階段,安裝過程缺什么,在 docker 外面 git 下源碼,進入容器內部進行量化,這里的容器建議在創建個 8 卡的容器,雙卡容器量化會顯示 npu 顯存不夠,除非你用 cpu 轉模型,我就懶得創建容器了,使用 cpu 量化吧;
[root@dify HwHiAiUser]# pwd /home/HwHiAiUser [root@dify HwHiAiUser]# git clone https://gitee.com/ascend/msit.git Cloning into 'msit'... remote: Enumerating objects: 81125, done. remote: Total 81125 (delta 0), reused 0 (delta 0), pack-reused 81125 Receiving objects: 100% (81125/81125), 71.73 MiB | 12.14 MiB/s, done. Resolving deltas: 100% (59704/59704), done. [root@dify HwHiAiUser]# cd msit/ .git/ .gitee/ msit/ msmodelslim/ msserviceprofiler/ [root@dify Qwen]# docker start b5399c4da202 b5399c4da202 [root@dify Qwen]# docker exec -it b5399c4da202 /bin/bash (Python310) root@dify:/home/HwHiAiUser/msit# cd msmodelslim/ (Python310) root@dify:/home/HwHiAiUser/msit/msmodelslim# bash install.sh #安裝成功,pip 缺啥安裝啥 (Python310) root@dify:/home/HwHiAiUser# cd /home/HwHiAiUser/msit/msmodelslim/example/Qwen #量化模型 (Python310) root@dify:/home/HwHiAiUser/msit/msmodelslim/example/Qwen# python3 quant_qwen.py --model_path /home/HwHiAiUser/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B/ --save_directory /home/HwHiAiUser/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B-W8A8 --calib_file ../common/boolq.jsonl --w_bit 8 --a_bit 8 --device_type npu 2025-02-23 18:15:25,404 - msmodelslim-logger - WARNING - The current CANN version does not support LayerSelector quantile method. 或者 cpu 處理 (Python310) root@dify:/home/HwHiAiUser/msit/msmodelslim/example/Qwen# python3 quant_qwen.py --model_path /home/HwHiAiUser/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B/ --save_directory /home/HwHiAiUser/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B-W8A8 --calib_file ../common/boolq.jsonl --w_bit 8 --a_bit 8 --device_type cpu 2025-02-23 18:25:10,776 - msmodelslim-logger - WARNING - The current CANN version does not support LayerSelector quantile method.2025-02-23 18:25:10,783 - msmodelslim-logger - WARNING - `cpu` is set as `dev_type`, `dev_id` cannot be specified manually!
轉換完成之后生成權重文件
(Python310) root@dify:/home/HwHiAiUser/deepseek-ai# cd /home/HwHiAiUser/msit/msmodelslim/example/Qwen (Python310) root@dify:/home/HwHiAiUser/msit/msmodelslim/example/Qwen# ls /home/HwHiAiUser/deepseek-ai/ DeepSeek-R1-Distill-Qwen-32B DeepSeek-R1-Distill-Qwen-32B-W8A8 (Python310) root@dify:/home/HwHiAiUser/msit/msmodelslim/example/Qwen# 因為 Atlas 800 9000 不支持 bf16, 所以修改 float16, 其它設備參考昇騰手冊 (Python310) root@dify:/home/HwHiAiUser/msit/msmodelslim/example/Qwen# vim /home/HwHiAiUser/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B-W8A8/config.json
(3)啟動 MindIE 服務,先記錄本機的 ip 地址,模型路徑和以及模型名字
模型路徑權重: /home/HwHiAiUser/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B-W8A8/
模型名字:DeepSeek-R1-Distill-Qwen-32B-W8A8
修改配置文件
(Python310) root@dify:/usr/local/Ascend/mindie/latest/mindie-service# pwd /usr/local/Ascend/mindie/latest/mindie-service (Python310) root@dify:/usr/local/Ascend/mindie/latest/mindie-service# vim conf/config.json
修改解釋一下,ipAddress, 主要為了后面搭建 dify 使用的推理引擎模型,其它參考 mindie 手冊
MindSpore Models 服務化使用 - MindSpore Models 使用 - 模型推理使用流程 - MindIE LLM 開發指南 - 大模型開發 - MindIE1.0.0 開發文檔 - 昇騰社區
https://www.hiascend.com/document/detail/zh/mindie/100/mindiellm/llmdev/mindie_llm0012.html
單機推理 - 配置 MindIE Server - 配置 MindIE-MindIE 安裝指南 - 環境準備 - MindIE1.0.0 開發文檔 - 昇騰社區
https://www.hiascend.com/document/detail/zh/mindie/100/envdeployment/instg/mindie_instg_0026.html
"ipAddress" : "192.168.1.115", 改為本地地址 "httpsEnabled" : false, "npuDeviceIds" : [[0,1]], "modelName" : "DeepSeek-R1-Distill-Qwen-32B-W8A8", "modelWeightPath" : "/home/HwHiAiUser/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B-W8A8/", "maxInputTokenLen" : 4096, "maxIterTimes" : 4096, "truncation" : true,
修改內容如下
(Python310) root@dify:/usr/local/Ascend/mindie/latest/mindie-service# cat conf/config.json { "Version" : "1.0.0", "LogConfig" : { "logLevel" : "Info", "logFileSize" : 20, "logFileNum" : 20, "logPath" : "logs/mindie-server.log" }, "ServerConfig" : { "ipAddress" : "192.168.1.115", "managementIpAddress" : "127.0.0.2", "port" : 1025, "managementPort" : 1026, "metricsPort" : 1027, "allowAllZeroIpListening" : false, "maxLinkNum" : 1000, "httpsEnabled" : false, "fullTextEnabled" : false, "tlsCaPath" : "security/ca/", "tlsCaFile" : ["ca.pem"], "tlsCert" : "security/certs/server.pem", "tlsPk" : "security/keys/server.key.pem", "tlsPkPwd" : "security/pass/key_pwd.txt", "tlsCrlPath" : "security/certs/", "tlsCrlFiles" : ["server_crl.pem"], "managementTlsCaFile" : ["management_ca.pem"], "managementTlsCert" : "security/certs/management/server.pem", "managementTlsPk" : "security/keys/management/server.key.pem", "managementTlsPkPwd" : "security/pass/management/key_pwd.txt", "managementTlsCrlPath" : "security/management/certs/", "managementTlsCrlFiles" : ["server_crl.pem"], "kmcKsfMaster" : "tools/pmt/master/ksfa", "kmcKsfStandby" : "tools/pmt/standby/ksfb", "inferMode" : "standard", "interCommTLSEnabled" : true, "interCommPort" : 1121, "interCommTlsCaPath" : "security/grpc/ca/", "interCommTlsCaFiles" : ["ca.pem"], "interCommTlsCert" : "security/grpc/certs/server.pem", "interCommPk" : "security/grpc/keys/server.key.pem", "interCommPkPwd" : "security/grpc/pass/key_pwd.txt", "interCommTlsCrlPath" : "security/grpc/certs/", "interCommTlsCrlFiles" : ["server_crl.pem"], "openAiSupport" : "vllm" }, "BackendConfig" : { "backendName" : "mindieservice_llm_engine", "modelInstanceNumber" : 1, "npuDeviceIds" : [[0,1]], "tokenizerProcessNumber" : 8, "multiNodesInferEnabled" : false, "multiNodesInferPort" : 1120, "interNodeTLSEnabled" : true, "interNodeTlsCaPath" : "security/grpc/ca/", "interNodeTlsCaFiles" : ["ca.pem"], "interNodeTlsCert" : "security/grpc/certs/server.pem", "interNodeTlsPk" : "security/grpc/keys/server.key.pem", "interNodeTlsPkPwd" : "security/grpc/pass/mindie_server_key_pwd.txt", "interNodeTlsCrlPath" : "security/grpc/certs/", "interNodeTlsCrlFiles" : ["server_crl.pem"], "interNodeKmcKsfMaster" : "tools/pmt/master/ksfa", "interNodeKmcKsfStandby" : "tools/pmt/standby/ksfb", "ModelDeployConfig" : { "maxSeqLen" : 2560, "maxInputTokenLen" : 4096, "truncation" : true, "ModelConfig" : [ { "modelInstanceType" : "Standard", "modelName" : "DeepSeek-R1-Distill-Qwen-32B-W8A8", "modelWeightPath" : "/home/HwHiAiUser/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B-W8A8/", "worldSize" : 2, "cpuMemSize" : 5, "npuMemSize" : -1, "backendType" : "atb", "trustRemoteCode" : false } ] }, "ScheduleConfig" : { "templateType" : "Standard", "templateName" : "Standard_LLM", "cacheBlockSize" : 128, "maxPrefillBatchSize" : 50, "maxPrefillTokens" : 8192, "prefillTimeMsPerReq" : 150, "prefillPolicyType" : 0, "decodeTimeMsPerReq" : 50, "decodePolicyType" : 0, "maxBatchSize" : 200, "maxIterTimes" : 4096, "maxPreemptCount" : 0, "supportSelectBatch" : false, "maxQueueDelayMicroseconds" : 5000 } } }
修改模型權限,啟動服務
(Python310) root@dify:/usr/local/Ascend/mindie/latest/mindie-service# chmod -R 750 /home/HwHiAiUser/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B-W8A8/ (Python310) root@dify:/usr/local/Ascend/mindie/latest/mindie-service# ./bin/mindieservice_daemon Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. [2025-02-23 19:04:44,279] [89160] [281464373506464] [llm] [INFO][logging.py-227] : Skip binding cpu. Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. Daemon start success!
重啟一個終端,查看 npu 使用狀況
本機測試 [root@dify ~]# curl -H "Accept: application/json" -H "Content-type: application/json" -X POST -d '{"inputs":"如何賺大錢","parameters":{"decoder_input_details":true,"details":true,"do_sample":true,"max_new_tokens":50,"repetition_penalty":1.03,"return_full_text":false,"seed":null,"temperature":0.5,"top_k":10,"top_p":0.95,"truncate":null,"typical_p":0.5,"watermark":false}}' http://192.168.1.115:1025/generate {"details":{"prompt_tokens":5,"finish_reason":"length","generated_tokens":50,"prefill":[{"id":151646,"logprob":null,"special":null,"text":null},{"id":100007,"logprob":null,"special":null,"text":null},{"id":102223,"logprob":null,"special":null,"text":null},{"id":26288,"logprob":null,"special":null,"text":null},{"id":99428,"logprob":null,"special":null,"text":null}],"seed":2240260787,"tokens":[{"id":26850,"logprob":null,"special":null,"text":null},{"id":100007,"logprob":null,"special":null,"text":null},{"id":102223,"logprob":null,"special":null,"text":null},{"id":26288,"logprob":null,"special":null,"text":null},{"id":99428,"logprob":null,"special":null,"text":null},{"id":11319,"logprob":null,"special":null,"text":null},{"id":1406,"logprob":null,"special":null,"text":null},{"id":151649,"logprob":null,"special":null,"text":null},{"id":271,"logprob":null,"special":null,"text":null},{"id":102223,"logprob":null,"special":null,"text":null},{"id":26288,"logprob":null,"special":null,"text":null},{"id":99428,"logprob":null,"special":null,"text":null},{"id":102119,"logprob":null,"special":null,"text":null},{"id":85106,"logprob":null,"special":null,"text":null},{"id":100374,"logprob":null,"special":null,"text":null},{"id":99605,"logprob":null,"special":null,"text":null},{"id":9370,"logprob":null,"special":null,"text":null},{"id":101139,"logprob":null,"special":null,"text":null},{"id":5373,"logprob":null,"special":null,"text":null},{"id":85329,"logprob":null,"special":null,"text":null},{"id":33108,"logprob":null,"special":null,"text":null},{"id":99345,"logprob":null,"special":null,"text":null},{"id":101135,"logprob":null,"special":null,"text":null},{"id":1773,"logprob":null,"special":null,"text":null},{"id":87752,"logprob":null,"special":null,"text":null},{"id":99639,"logprob":null,"special":null,"text":null},{"id":97084,"logprob":null,"special":null,"text":null},{"id":102716,"logprob":null,"special":null,"text":null},{"id":39907,"logprob":null,"special":null,"text":null},{"id":48443,"logprob":null,"special":null,"text":null},{"id":14374,"logprob":null,"special":null,"text":null},{"id":220,"logprob":null,"special":null,"text":null},{"id":16,"logprob":null,"special":null,"text":null},{"id":13,"logprob":null,"special":null,"text":null},{"id":3070,"logprob":null,"special":null,"text":null},{"id":99716,"logprob":null,"special":null,"text":null},{"id":102447,"logprob":null,"special":null,"text":null},{"id":1019,"logprob":null,"special":null,"text":null},{"id":256,"logprob":null,"special":null,"text":null},{"id":481,"logprob":null,"special":null,"text":null},{"id":3070,"logprob":null,"special":null,"text":null},{"id":104023,"logprob":null,"special":null,"text":null},{"id":5373,"logprob":null,"special":null,"text":null},{"id":100025,"logprob":null,"special":null,"text":null},{"id":334,"logprob":null,"special":null,"text":null},{"id":5122,"logprob":null,"special":null,"text":null},{"id":67338,"logprob":null,"special":null,"text":null},{"id":101930,"logprob":null,"special":null,"text":null},{"id":99716,"logprob":null,"special":null,"text":null},{"id":101172,"logprob":null,"special":null,"text":null}]},"generated_text":"?\n\n 如何賺大錢?\n\n\n\n\n 賺大錢通常需要結合個人的技能、資源和市場機會。以下是一些常見的方法:\n\n### 1. ** 投資理財 **\n - ** 股票、基金 **:通過長期投資優質"}[root@dify ~]#
三、啟動分詞服務和重排序服務,首先去昇騰倉下載鏡像 昇騰鏡像倉庫詳情(https://www.hiascend.com/developer/ascendhub/detail/mis-tei), 對應自己的設備查找鏡像
(1) 拉取鏡像 Atlas 800 9000,已經要根據自己的硬件版本去官方倉拉取鏡像,進行分詞服務啟動
[root@dify ~]# docker pull swr.cn-east-317.qdrgznjszx.com/sxj731533730/mis-tei:6.0.RC3-910-aarch64 [root@dify ~]# docker images REPOSITORY TAG IMAGE ID CREATED SIZE swr.cn-east-317.qdrgznjszx.com/sxj731533730/mis-tei 6.0.RC3-910-aarch64 affece68b209 2 days ago 22.6GB swr.cn-east-317.qdrgznjszx.com/sxj731533730/mindie atlas_800_9000 69f30d0c15be 5 weeks ago 16.5GB [root@dify ~]#
拉取完鏡像之后,進行必要的權重模型下載
[root@dify ~]# cd /home/HwHiAiUser/ [root@dify HwHiAiUser]# pwd /home/HwHiAiUser [root@dify HwHiAiUser]# vim down.py [root@dify HwHiAiUser]# cat down.py #模型下載 from modelscope import snapshot_download model_dir = snapshot_download('BAAI/bge-m3',cache_dir=".") from modelscope import snapshot_download model_dir = snapshot_download('BAAI/bge-large-zh-v1.5',cache_dir=".") from modelscope import snapshot_download model_dir = snapshot_download('BAAI/bge-reranker-large',cache_dir=".") [root@dify HwHiAiUser]# python3 down.py
下載完模型,修改每一個模型內部的配置項 Atlas800 9000/300I Duo/300V Pro 設備,Atlas 800T A2 等設備不用走該步驟
[root@dify HwHiAiUser]# ls Ascend-hdk-910-npu-driver_23.0.0_linux-aarch64.run BAAI deepseek-ai docker_run.sh down.py msit [root@dify HwHiAiUser]# vim BAAI/bge-large-zh-v1___5/config.json [root@dify HwHiAiUser]# vim BAAI/bge-m3/config.json [root@dify HwHiAiUser]# vim BAAI/bge-reranker-large/config.json "torch_dtype": "float16",
(2)創建三個容器,暫定容器名字是 bge-m3、bge-large-zh-v1___5、bge-reranker-large, 在創建之前,需要聯系昇騰技術人員,開通服務器對外端口,暫定開通的為 8001,8002,8003 和 niginx 轉發端口 - 入方向:| 出方向:TCP/8001,8002,8003,8004,442
將模型拷貝到 /home/data 下,參考官方手冊來即可
[root@dify ~]# cd /home/HwHiAiUser/ [root@dify HwHiAiUser]# ls Ascend-hdk-910-npu-driver_23.0.0_linux-aarch64.run BAAI deepseek-ai docker_run.sh down.py msit [root@dify HwHiAiUser]# pwd /home/HwHiAiUser [root@dify HwHiAiUser]# mkdir -p /home/data [root@dify HwHiAiUser]# cp -r BAAI/* /home/data/ [root@dify HwHiAiUser]# ls /home/data/ bge-large-zh-v1___5 bge-m3 bge-reranker-large [root@dify HwHiAiUser]#
參考官方說明:
ASCEND_VISIBLE_DEVICES 環境變量表示將宿主機上的 npu 卡掛載到容器,如果掛載多張卡使用逗號分隔,如:ASCEND_VISIBLE_DEVICES=0,1,2,3;掛載多張卡到容器時,默認會尋找最優的一張卡調用,如果不希望容器內部自動尋找最優的卡,啟動容器時可通過 TEI_NPU_DEVICE = 卡 id 指定使用哪張卡,注意這里的變量 TEI_NPU_DEVICE 配置從 0 開始取,容器內已將外部卡 id 進行了邏輯映射,編號從 0 連續映射;注意:配置的 ASCEND_VISIBLE_DEVICES 對應的卡不能被其他容器已掛載,否則會報錯
[root@dify ~]# docker run -u root -e TEI_NPU_DEVICE=0 -itd --name=bge-reranker-large --net=host -e HOME=/home/HwHiAiUser --privileged=true -v /home/data:/home/HwHiAiUser/model -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi -v /usr/local/Ascend/driver:/usr/local/Ascend/driver --entrypoint /home/HwHiAiUser/start.sh swr.cn-east-317.qdrgznjszx.com/sxj731533730/mis-tei:6.0.RC3-910-aarch64 BAAI/bge-reranker-large 192.168.1.115 8001 ef2383785c58ec5a650eb9d852ba965c48eb7b8cc7679cb7c194d2f2d0eb1a0d [root@dify ~]# docker start ef2383785c58ec5a650eb9d852ba965c48eb7b8cc7679cb7c194d2f2d0eb1a0d ef2383785c58ec5a650eb9d852ba965c48eb7b8cc7679cb7c194d2f2d0eb1a0d [root@dify ~]# docker run -u root -e TEI_NPU_DEVICE=1 -itd --name=bge-m3 --net=host -e HOME=/home/HwHiAiUser --privileged=true -v /home/data:/home/HwHiAiUser/model -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi -v /usr/local/Ascend/driver:/usr/local/Ascend/driver --entrypoint /home/HwHiAiUser/start.sh swr.cn-east-317.qdrgznjszx.com/sxj731533730/mis-tei:6.0.RC3-910-aarch64 BAAI/bge-m3 192.168.1.115 8002 50dd3573f1ae1363211791425a2f681445b220f5a45bbdbe572a361ce974f63a [root@dify ~]# docker start 50dd3573f1ae1363211791425a2f681445b220f5a45bbdbe572a361ce974f63a 50dd3573f1ae1363211791425a2f681445b220f5a45bbdbe572a361ce974f63a bge-large-zh-v1___5 bge-m3 bge-reranker-large [root@dify ~]# docker run -u root -e TEI_NPU_DEVICE=2 -itd --name=bge-large-zh-v1___5 --net=host -e HOME=/home/HwHiAiUser --privileged=true -v /home/data:/home/HwHiAiUser/model -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi -v /usr/local/Ascend/driver:/usr/local/Ascend/driver --entrypoint /home/HwHiAiUser/start.sh swr.cn-east-317.qdrgznjszx.com/sxj731533730/mis-tei:6.0.RC3-910-aarch64 BAAI/bge-large-zh-v1___5 192.168.1.115 8003 d360f2b558c6556af53e19abd9f0782600f8cab1a7c60dc90fcf0b6061511c96 [root@dify ~]# docker start d360f2b558c6556af53e19abd9f0782600f8cab1a7c60dc90fcf0b6061511c96 d360f2b558c6556af53e19abd9f0782600f8cab1a7c60dc90fcf0b6061511c96
查看一下三個服務,兩個分詞,一個排序模型,當然也可以放在一個 NPU 上運行編輯
記錄一下對外的服務端口 mindie 推理服務 192.168.1.115:1025 ;bge-reranker-large 服務:192.168.1.115:8001 bge-m3 服務:192.168.1.115:8002 bge-large-zh-v1___5 服務: 192.168.1.115:8003
四、部署 dify 環境進行部署配置,部署遇到的最大問題就是昇騰架構使用的 aarch64,gitee 使用 docker 鏡像容器是 x86_64, 所以找鏡像替代即可
(1)拉取 dify 的源碼
[root@dify HwHiAiUser]# git clone https://gitee.com/dify_ai/dify.git Cloning into 'dify'... remote: Enumerating objects: 206836, done. remote: Counting objects: 100% (10350/10350), done. remote: Compressing objects: 100% (5418/5418), done. remote: Total 206836 (delta 6559), reused 7867 (delta 4637), pack-reused 196486 Receiving objects: 100% (206836/206836), 80.47 MiB | 3.03 MiB/s, done. Resolving deltas: 100% (161147/161147), done. [root@dify HwHiAiUser]# cd dify [root@dify dify]# git checkout 0.15.3 Note: checking out '0.15.3'. You are in 'detached HEAD' state. You can look around, make experimental changes and commit them, and you can discard any commits you make in this state without impacting any branches by performing another checkout. If you want to create a new branch to retain commits you create, you may do so (now or later) by using -b with the checkout command again. Example: git checkout -b HEAD is now at ca19bd31d chore(*): Bump version to 0.15.3 (#13308) [root@dify HwHiAiUser]# cd docker/ [root@dify docker]# cp .env.example .env [root@dify docker]# vim .env 修改 848 行、906 行 NGINX_PORT=80 # SSL settings are only applied when HTTPS_ENABLED is true NGINX_SSL_PORT=443 修改 NGINX_PORT=8004 # SSL settings are only applied when HTTPS_ENABLED is true NGINX_SSL_PORT=442 另一處 EXPOSE_NGINX_PORT=80 EXPOSE_NGINX_SSL_PORT=443 修改 EXPOSE_NGINX_PORT=8004 EXPOSE_NGINX_SSL_PORT=442
修改配置文件
[root@dify docker]# vim docker-compose.yaml 第 486 行添加 --ignore-warnings ARM64-COW-BUG
將 492 行 修改 0.2.10 修改為 0.2.1
(2)下載 docker-compose,配置工具
sudo curl -L https://github.com/docker/compose/releases/download/v2.33.0/docker-compose-linux-aarch64-o /usr/local/bin/docker-compose 或者這樣下載 [root@dify docker]# cd /usr/local/bin/ [root@dify bin]# pwd /usr/local/bin [root@dify bin]# wget https://sxj731533730.obs.cn-east-317.qdrgznjszx.com/docker-compose --2025-02-25 21:07:54--https://sxj731533730.obs.cn-east-317.qdrgznjszx.com/docker-compose Resolving sxj731533730.obs.cn-east-317.qdrgznjszx.com (sxj731533730.obs.cn-east-317.qdrgznjszx.com)... 100.125.32.125 Connecting to sxj731533730.obs.cn-east-317.qdrgznjszx.com (sxj731533730.obs.cn-east-317.qdrgznjszx.com)|100.125.32.125|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 71778465 (68M) [application/octet-stream] Saving to: ‘docker-compose’ docker-compose 100%[=====================================================================>] 68.45M 220MB/s in 0.3s 2025-02-25 21:07:54 (220 MB/s) - ‘docker-compose’ saved [71778465/71778465] [root@dify bin]# ls cloud-id cloud-init-per jsondiff jsonpointer modelscope npu-healthcheck.sh tqdm cloud-init docker-compose jsonpatch jsonschema normalizer npu-smi [root@dify bin]# chmod 777 docker-compose [root@dify bin]# docker-compose -v Docker Compose version v2.33.0
(3)拉取鏡像,準備啟動 dify 環境,根據。yaml 找 aarch64 位庫即可
docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/langgenius/dify-api:0.15.3-linuxarm64 docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/langgenius/dify-api:0.15.3-linuxarm64 docker.io/langgenius/dify-api:0.15.3 docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/langgenius/dify-web:0.15.3-linuxarm64 docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/langgenius/dify-web:0.15.3-linuxarm64 docker.io/langgenius/dify-web:0.15.3 docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/postgres:15-alpine-linuxarm64 docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/postgres:15-alpine-linuxarm64 docker.io/postgres:15-alpine docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/redis:6-alpine-linuxarm64 docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/redis:6-alpine-linuxarm64 docker.io/redis:6-alpine docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/langgenius/dify-sandbox:0.2.10-linuxarm64 docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/langgenius/dify-sandbox:0.2.10-linuxarm64 docker.io/langgenius/dify-sandbox:0.2.10 docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/langgenius/dify-sandbox:0.2.1-linuxarm64 docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/langgenius/dify-sandbox:0.2.1-linuxarm64 docker.io/langgenius/dify-sandbox:0.2.1 docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/ubuntu/squid:latest-linuxarm64 docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/ubuntu/squid:latest-linuxarm64 docker.io/ubuntu/squid:latest docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/certbot/certbot:v3.1.0-linuxarm64 docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/certbot/certbot:v3.1.0-linuxarm64 docker.io/certbot/certbot:latest docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/nginx:latest-linuxarm64 docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/nginx:latest-linuxarm64 docker.io/nginx:latest docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/pingcap/tidb:v8.4.0-linuxarm64 docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/pingcap/tidb:v8.4.0-linuxarm64 docker.io/pingcap/tidb:v8.4.0 docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/semitechnologies/weaviate:1.19.0-linuxarm64 docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/semitechnologies/weaviate:1.19.0-linuxarm64 docker.io/semitechnologies/weaviate:1.19.0 docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/langgenius/qdrant:v1.7.3-linuxarm64 docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/langgenius/qdrant:v1.7.3-linuxarm64 docker.io/langgenius/qdrant:v1.7.3 docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/pgvector/pgvector:pg16-linuxarm64 docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/pgvector/pgvector:pg16-linuxarm64 docker.io/pgvector/pgvector:pg16 docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/tensorchord/pgvecto-rs:pg16-v0.3.0-linuxarm64 docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/tensorchord/pgvecto-rs:pg16-v0.3.0-linuxarm64 docker.io/tensorchord/pgvecto-rs:pg16-v0.3.0 docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/ghcr.io/chroma-core/chroma:0.5.20-linuxarm64 docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/ghcr.io/chroma-core/chroma:0.5.20-linuxarm64 ghcr.io/chroma-core/chroma:0.5.20 docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/quay.io/oceanbase/oceanbase-ce:4.3.3.0-100000142024101215-linuxarm64 docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/quay.io/oceanbase/oceanbase-ce:4.3.3.0-100000142024101215-linuxarm64 quay.io/oceanbase/oceanbase-ce:4.3.3.0-100000142024101215 docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/container-registry.oracle.com/database/free:latest-linuxarm64 docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/container-registry.oracle.com/database/free:latest-linuxarm64 docker.io/container-registry.oracle.com/database/free:latest docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/quay.io/coreos/etcd:v3.5.5-linuxarm64 docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/quay.io/coreos/etcd:v3.5.5-linuxarm64 quay.io/coreos/etcd:v3.5.5 docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/minio/minio:RELEASE.2023-03-20T20-16-18Z-linuxarm64 docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/minio/minio:RELEASE.2023-03-20T20-16-18Z-linuxarm64 docker.io/minio/minio:RELEASE.2023-03-20T20-16-18Z docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/milvusdb/milvus:v2.5.0-beta-linuxarm64 docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/milvusdb/milvus:v2.5.0-beta-linuxarm64 docker.io/milvusdb/milvus:v2.5.0-beta docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/opensearchproject/opensearch:latest-linuxarm64 docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/opensearchproject/opensearch:latest-linuxarm64 docker.io/opensearchproject/opensearch:latest docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/opensearchproject/opensearch-dashboards:latest-linuxarm64 docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/opensearchproject/opensearch-dashboards:latest-linuxarm64 docker.io/opensearchproject/opensearch-dashboards:latest docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/myscale/myscaledb:1.6.4-linuxarm64 docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/myscale/myscaledb:1.6.4-linuxarm64 docker.io/myscale/myscaledb:1.6.4 docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.elastic.co/elasticsearch/elasticsearch:8.14.3-linuxarm64 docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.elastic.co/elasticsearch/elasticsearch:8.14.3-linuxarm64 docker.elastic.co/elasticsearch/elasticsearch:8.14.3 docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.elastic.co/kibana/kibana:8.14.3-linuxarm64 docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.elastic.co/kibana/kibana:8.14.3-linuxarm64 docker.elastic.co/kibana/kibana:8.14.3 docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/robwilkes/unstructured-api:latest-linuxarm64 docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/robwilkes/unstructured-api:latest-linuxarm64 docker.io/robwilkes/unstructured-api:latest
然后啟動 dify 成功
[root@dify HwHiAiUser]# cd dify/ [root@dify dify]# cd docker [root@dify docker]# pwd /home/HwHiAiUser/dify/docker [root@dify docker]# docker-compose up -d [+] Running 11/11 ? Network docker_default Created ? Network docker_ssrf_proxy_network Created ? Container docker-sandbox-1 Started ? Container docker-redis-1 Started ? Container docker-web-1 Started ? Container docker-weaviate-1 Started ? Container docker-db-1 Started 1. ? Container docker-ssrf_proxy-1 Started ? Container docker-api-1 Started ? Container docker-worker-1 Started ? Container docker-nginx-1 Started [root@dify docker]#
后臺啟動成功
五、啟動 dify 進行配置界面,在地址欄輸入 http://ip(訪問服務器的 ip 地址):8084 端口,可以刷新出 dify 界面
注冊一下,這個是所有者權限,只能注冊一次,無法修改,如果修改,需要重新拉 dify 服務
使用所有者權限進入賬戶,點擊右邊的設置
選擇模型供應商
在下面的列表中找到這兩個配置項
添加第一個模型 deepseek
OpenAI-API-compatible
類型選 LLM 模型名字對應你的 mindie 的 name:DeepSeek-R1-Distill-Qwen-32B-W8A8 mindie 的 URL:http://192.168.1.115:1025/v1 只要后臺服務啟動中,前端可以保存,就是 ok,秘鑰隨意填
Text Embedding Inference
然后配置排序模型和分詞模型,支持 RAG, 秘鑰隨便寫,只要后臺服務啟動中,前端可以保存,就是 ok
1.1 選擇 RERANK URL 設置 http://192.168.1.115:8001/ 模型名 :bge-reranker-large
1.2 選擇 TEXT EMBEDDING URL 設置 http://192.168.1.115:8002/ 模型名 :bge-large-zh-v1___5
1.3 選擇 TEXT EMBEDDING URL 設置 http://192.168.1.115:8003/ 模型名 :bge-m3
六、實際測試,跑在昇騰上面的 DeepSeek-R1-Distill-Qwen-32B-W8A8 雙卡
測試知識庫 RAG, 看一下知識庫的內容
開始處理文本
測試不掛知識庫結果
測試掛知識庫結果
郵箱分發功能,需要修改源碼 ,修改源碼,從郵箱拿到秘鑰,重啟服務
[root@wuzhoutuili-0003 docker]# vim ../api/tasks/mail_invite_member_task.py [root@wuzhoutuili-0003 docker]# pwd/home/HwHiAiUser/dify/docker
邀約郵件
埋個彩蛋,敬請期待 昇騰服務器部署 one-api+fastgpt, 內測中
特別聲明:以上內容(如有圖片或視頻亦包括在內)為自媒體平臺“網易號”用戶上傳并發布,本平臺僅提供信息存儲服務。
Notice: The content above (including the pictures and videos if any) is uploaded and posted by a user of NetEase Hao, which is a social media platform and only provides information storage services.