Installing CUDA and cuDNN on Windows and Linux
Windows: Preparation Before Installation
-
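Check hardware compatibility: confirm that the computer has an NVIDIA GPU. Press Win + R to open the Run dialog, enter “control /name Microsoft.DeviceManager” to open Device Manager, then expand “Display adapters” and look for an NVIDIA entry.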
-
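Verify CUDA support: press Win + R to open the Run dialog, enter “cmd” to open a command prompt, and run “nvidia-smi”. The CUDA version shown in the upper-right corner of the output is the highest CUDA version the installed driver supports. The CUDA Toolkit you install should not be newer than that version, and the cuDNN build must be one published for your CUDA version; a mismatch leads to compatibility problems.
Installing CUDA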
-
Download the CUDA Toolkit: visit the CUDA Toolkit Archive (https://developer.nvidia.com/cuda-toolkit-archive) and download the installer that matches your operating system version and the CUDA version you need, keeping within what your GPU and driver support.
-
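Run the installer: double-click the downloaded installer and follow the wizard. A custom installation is recommended so that you can adjust settings such as the install path.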
-
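Configure environment variables: after installation, the CUDA paths need to be on the system Path. On Windows, right-click “Computer” (or “This PC”) -> Properties -> Advanced system settings -> Environment Variables, find the “Path” variable under System variables, and add the “bin” directory of the CUDA installation. The installer normally adds this automatically; manual configuration is only needed if the nvcc or nvidia-smi command is reported as not found.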
-
Verify the installation: open a command prompt and enter “nvcc -V”. If the version information is printed correctly, CUDA is installed.
Installing cuDNN
-
Download cuDNN: visit the cuDNN Archive (https://developer.nvidia.com/rdp/cudnn-archive) and download the cuDNN version that matches the installed CUDA version.
-
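Extract and install: extract the downloaded cuDNN archive and copy the contents of its “bin”, “include”, and “lib” folders into the folders of the same name in the CUDA installation directory. With the default install path, that directory is “C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4”, where v12.4 is the installed CUDA version. If Windows prompts “Replace the files in the destination”, choose to replace them.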
-
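Verify the installation: deviceQuery.exe is a CUDA sample program, so it confirms that CUDA can reach the GPU rather than testing cuDNN itself. If the CUDA demo suite was installed, open a command prompt, go to the “extras\demo_suite” folder under the CUDA installation directory, and run “deviceQuery.exe”; the last line should read “Result = PASS”. To confirm that cuDNN was copied correctly, check that the cudnn DLL files are present in the “bin” folder and that the cuDNN header files are present in the “include” folder.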
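Detailed guide: installing the NVIDIA driver, CUDA, and cuDNN on Ubuntu
First, pull an Ubuntu image with Docker and work inside a container so that the host environment is left untouched.
Installing Docker and docker-compose is not covered here.
Pull the Ubuntu image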
docker pull docker.m.daocloud.io/ubuntu:20.04
Create a directory named Ubuntu to hold the files, and create a docker-compose.yaml file in it
services:
  ubuntu:
    build: ./build
    image: ubuntu_k
    container_name: ubuntu_k
    restart: always
    runtime: nvidia
    privileged: true
    environment:
      # - CUDA_VISIBLE_DEVICES=1
      - HF_ENDPOINT=https://hf-mirror.com
      - HF_HUB_ENABLE_HF_TRANSFER=1
    ports:
      - "60:22"  # quoted so YAML does not read 60:22 as a base-60 number
    volumes:
      - ./data:/data
      - ./root:/root
    tty: true
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
        window: 120s
# Dockerfile (placed in ./build, the build context named in docker-compose.yaml)
FROM docker.m.daocloud.io/ubuntu:20.04
# LABEL replaces the deprecated MAINTAINER instruction
LABEL maintainer="Csars (Csars@qq.com)"
ADD ./sources.list /etc/apt/
RUN export DEBIAN_FRONTEND=noninteractive \
    && apt-get update \
    && apt-get install -y curl \
    && apt-get install -y git \
    && apt-get install -y openssh-server
# Configure SSH server (-p keeps the build from failing if the directory already exists)
RUN mkdir -p /var/run/sshd
RUN echo 'root:root' | chpasswd
RUN sed -i 's/PermitRootLogin prohibit-password/PermitRootLogin yes/' /etc/ssh/sshd_config
# SSH login fix. Otherwise user is kicked off after login
RUN sed 's@session\s*required\s*pam_loginuid.so@session optional pam_loginuid.so@g' -i /etc/pam.d/sshd
#ENV NOTVISIBLE "in users profile"
RUN echo "export VISIBLE=now" >> /etc/profile
ADD ./sshd_config /etc/ssh/
#RUN npm install -g https://gaccode.com/claudecode/install
# Expose the SSH port
EXPOSE 22
ENTRYPOINT ["/usr/sbin/sshd", "-D"]
# sources.list: replace these entries with the mirror that suits your Ubuntu release
deb http://mirrors.aliyun.com/ubuntu/ focal main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-security main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-security main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-updates main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-updates main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-proposed main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-proposed main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-backports main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-backports main restricted universe multiverse
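Changing the package sources on Ubuntu 20.04
Step 1: back up the existing sources file:
sudo cp /etc/apt/sources.list /etc/apt/sources.list.backup
Step 2: edit /etc/apt/sources.list and add the following entries at the top of the file (make the backup above first):
sudo vi /etc/apt/sources.list
NetEase 163 mirror
# Source-package mirrors are commented out by default to speed up apt update; uncomment them if needed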
deb http://mirrors.163.com/ubuntu/ focal main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ focal-security main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ focal-updates main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ focal-backports main restricted universe multiverse
# deb-src http://mirrors.163.com/ubuntu/ focal main restricted universe multiverse
# deb-src http://mirrors.163.com/ubuntu/ focal-security main restricted universe multiverse
# deb-src http://mirrors.163.com/ubuntu/ focal-updates main restricted universe multiverse
# deb-src http://mirrors.163.com/ubuntu/ focal-backports main restricted universe multiverse
# Pre-release (proposed) packages; enabling this is not recommended
# deb http://mirrors.163.com/ubuntu/ focal-proposed main restricted universe multiverse
# deb-src http://mirrors.163.com/ubuntu/ focal-proposed main restricted universe multiverse
Step 3: run the update commands:
sudo apt-get update
sudo apt-get upgrade
Commonly used mirrors in China:
Alibaba Cloud mirror
deb http://mirrors.aliyun.com/ubuntu/ focal main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-security main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-security main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-updates main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-updates main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-proposed main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-proposed main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-backports main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-backports main restricted universe multiverse
Tsinghua University mirror
# Source-package mirrors are commented out by default to speed up apt update; uncomment them if needed
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-updates main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-updates main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-backports main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-backports main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-security main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-security main restricted universe multiverse
# Pre-release (proposed) packages; enabling this is not recommended
# deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-proposed main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-proposed main restricted universe multiverse
USTC mirror
deb https://mirrors.ustc.edu.cn/ubuntu/ focal main restricted universe multiverse
deb-src https://mirrors.ustc.edu.cn/ubuntu/ focal main restricted universe multiverse
deb https://mirrors.ustc.edu.cn/ubuntu/ focal-updates main restricted universe multiverse
deb-src https://mirrors.ustc.edu.cn/ubuntu/ focal-updates main restricted universe multiverse
deb https://mirrors.ustc.edu.cn/ubuntu/ focal-backports main restricted universe multiverse
deb-src https://mirrors.ustc.edu.cn/ubuntu/ focal-backports main restricted universe multiverse
deb https://mirrors.ustc.edu.cn/ubuntu/ focal-security main restricted universe multiverse
deb-src https://mirrors.ustc.edu.cn/ubuntu/ focal-security main restricted universe multiverse
deb https://mirrors.ustc.edu.cn/ubuntu/ focal-proposed main restricted universe multiverse
deb-src https://mirrors.ustc.edu.cn/ubuntu/ focal-proposed main restricted universe multiverse
NetEase 163 mirror
deb http://mirrors.163.com/ubuntu/ focal main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ focal-security main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ focal-updates main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ focal-proposed main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ focal-backports main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ focal main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ focal-security main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ focal-updates main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ focal-proposed main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ focal-backports main restricted universe multiverse
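The mirror lists above differ only in their base URL, so they can be generated instead of copied by hand. The following is a minimal sketch, assuming a hypothetical helper named gen_sources_list (it is not part of apt) and the four suites used above; deb-src lines are written commented out, as in the faster variants.

```shell
# Print a sources.list for the given mirror base URL ($1) and release codename ($2).
# gen_sources_list is a hypothetical helper for illustration, not an apt tool.
gen_sources_list() {
  local mirror="${1%/}"        # drop a trailing slash so the URL is joined cleanly
  local codename="${2:-focal}" # default to Ubuntu 20.04
  local suite
  for suite in "$codename" "$codename-security" "$codename-updates" "$codename-backports"; do
    echo "deb $mirror/ $suite main restricted universe multiverse"
    echo "# deb-src $mirror/ $suite main restricted universe multiverse"
  done
}

# Example: print the Alibaba Cloud entries for focal
gen_sources_list http://mirrors.aliyun.com/ubuntu/ focal
```

Redirect the output to a file and review it before copying it over /etc/apt/sources.list.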
# sshd_config configuration file (output of cat sshd_config)
# $OpenBSD: sshd_config,v 1.103 2018/04/09 20:41:22 tj Exp $

# This is the sshd server system-wide configuration file. See
# sshd_config(5) for more information.

# This sshd was compiled with PATH=/usr/bin:/bin:/usr/sbin:/sbin

# The strategy used for options in the default sshd_config shipped with
# OpenSSH is to specify options with their default value where
# possible, but leave them commented. Uncommented options override the
# default value.

#Port 22
#AddressFamily any
#ListenAddress 0.0.0.0
#ListenAddress ::

#HostKey /etc/ssh/ssh_host_rsa_key
#HostKey /etc/ssh/ssh_host_ecdsa_key
#HostKey /etc/ssh/ssh_host_ed25519_key

# Ciphers and keying
#RekeyLimit default none

# Logging
#SyslogFacility AUTH
#LogLevel INFO

# Authentication:

#LoginGraceTime 2m
#PermitRootLogin prohibit-password
PermitRootLogin yes
#StrictModes yes
#MaxAuthTries 6
#MaxSessions 10

PubkeyAuthentication yes
#RSAAuthentication yes

# Expect .ssh/authorized_keys2 to be disregarded by default in future.
#AuthorizedKeysFile .ssh/authorized_keys .ssh/authorized_keys2

#AuthorizedPrincipalsFile none

#AuthorizedKeysCommand none
#AuthorizedKeysCommandUser nobody

# For this to work you will also need host keys in /etc/ssh/ssh_known_hosts
#HostbasedAuthentication no
# Change to yes if you don't trust ~/.ssh/known_hosts for
# HostbasedAuthentication
#IgnoreUserKnownHosts no
# Don't read the user's ~/.rhosts and ~/.shosts files
#IgnoreRhosts yes

# To disable tunneled clear text passwords, change to no here!
#PasswordAuthentication yes
#PermitEmptyPasswords no

# Change to yes to enable challenge-response passwords (beware issues with
# some PAM modules and threads)
ChallengeResponseAuthentication no

# Kerberos options
#KerberosAuthentication no
#KerberosOrLocalPasswd yes
#KerberosTicketCleanup yes
#KerberosGetAFSToken no

# GSSAPI options
#GSSAPIAuthentication no
#GSSAPICleanupCredentials yes
#GSSAPIStrictAcceptorCheck yes
#GSSAPIKeyExchange no

# Set this to 'yes' to enable PAM authentication, account processing,
# and session processing. If this is enabled, PAM authentication will
# be allowed through the ChallengeResponseAuthentication and
# PasswordAuthentication. Depending on your PAM configuration,
# PAM authentication via ChallengeResponseAuthentication may bypass
# the setting of "PermitRootLogin without-password".
# If you just want the PAM account and session checks to run without
# PAM authentication, then enable this but set PasswordAuthentication
# and ChallengeResponseAuthentication to 'no'.
UsePAM yes

#AllowAgentForwarding yes
#AllowTcpForwarding yes
#GatewayPorts no
X11Forwarding yes
#X11DisplayOffset 10
#X11UseLocalhost yes
#PermitTTY yes
PrintMotd no
#PrintLastLog yes
#TCPKeepAlive yes
#PermitUserEnvironment no
#Compression delayed
#ClientAliveInterval 0
#ClientAliveCountMax 3
#UseDNS no
#PidFile /var/run/sshd.pid
#MaxStartups 10:30:100
#PermitTunnel no
#ChrootDirectory none
#VersionAddendum none

# no default banner path
#Banner none

# Allow client to pass locale environment variables
AcceptEnv LANG LC_*

# override default of no subsystems
Subsystem sftp /usr/lib/openssh/sftp-server

# Example of overriding settings on a per-user basis
#Match User anoncvs
# X11Forwarding no
# AllowTcpForwarding no
# PermitTTY no
# ForceCommand cvs server
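Before building the image it helps to confirm that every file described above is where the build expects it. The following is a rough sketch, assuming the layout implied by "build: ./build" and the ADD lines in the Dockerfile (Dockerfile, sources.list, and sshd_config inside ./build); missing_files is a hypothetical helper name.

```shell
# Print each expected file that is missing under project directory $1 (default: current directory).
# The file layout is an assumption inferred from the compose file and the Dockerfile ADD lines.
missing_files() {
  local root="${1:-.}"
  local f
  for f in docker-compose.yaml build/Dockerfile build/sources.list build/sshd_config; do
    [ -f "$root/$f" ] || echo "$f"
  done
}

missing="$(missing_files .)"
if [ -z "$missing" ]; then
  echo "all files present; next step: docker compose up -d --build"
else
  echo "missing files:"
  echo "$missing"
fi
```

Once the container is running, the port mapping in docker-compose.yaml makes its SSH server reachable on host port 60, for example with ssh -p 60 root@localhost.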
Installing the NVIDIA Driver
-
Check whether the GPU is detected
lspci | grep -i nvidia
If NVIDIA GPU information is listed, the system has detected the card.
-
Install the kernel headers:
sudo apt-get install linux-headers-$(uname -r)
-
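Add the CUDA repository and install the driver. The cuda-keyring package has to be downloaded before dpkg can install it; the URL below assumes Ubuntu 20.04 on x86_64:
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get install nvidia-driver-535 -y
sudo reboot
After the reboot, verify that the driver was installed with: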
nvidia-smi
If the driver version and CUDA version are displayed, the driver is installed correctly.
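The two version numbers can also be pulled out of the nvidia-smi header for use in scripts. The following is a small sketch, assuming the header contains the usual "Driver Version: X" and "CUDA Version: Y" fields; parse_driver_version and parse_cuda_version are hypothetical helper names.

```shell
# Read nvidia-smi output on stdin and print the "CUDA Version" field (e.g. 12.2).
parse_cuda_version() {
  sed -n 's/.*CUDA Version: *\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Read nvidia-smi output on stdin and print the "Driver Version" field.
parse_driver_version() {
  sed -n 's/.*Driver Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Query the real tool only when it is installed; otherwise just say so.
if command -v nvidia-smi >/dev/null 2>&1; then
  header="$(nvidia-smi)"
  echo "driver version: $(printf '%s\n' "$header" | parse_driver_version)"
  echo "highest supported CUDA: $(printf '%s\n' "$header" | parse_cuda_version)"
else
  echo "nvidia-smi not found; the driver is probably not installed"
fi
```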
Installing the CUDA Toolkit
-
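Add the package sources. Note that ppa:graphics-drivers/ppa is the Ubuntu graphics-drivers PPA, which supplies driver packages; the cuda-toolkit packages come from the NVIDIA CUDA repository that the cuda-keyring package enabled in the previous section.
sudo apt-get install -y software-properties-common
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update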
-
Install the CUDA Toolkit
sudo apt-get install -y cuda-toolkit-12-5
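CUDA 12.5 is used as the example here, and apt resolves the dependencies automatically. The cuda-toolkit-12-5 package does not include the driver, so the toolkit runs on the driver installed earlier; since driver 535 reports CUDA 12.2 as its highest supported version, a newer driver branch is the safer match for CUDA 12.5.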
-
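Configure the environment variables: edit the ~/.bashrc file and append the following lines to the end of it:
export PATH=/usr/local/cuda-12.5/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-12.5/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
Save and close the file, then run the following command to apply the change immediately: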
source ~/.bashrc
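The ${PATH:+:${PATH}} form appends a colon and the old value only when the variable is already set, which avoids a stray leading or trailing colon. Sourcing ~/.bashrc repeatedly still adds the same directory again each time, so the following sketch shows one way to make the step idempotent; prepend_dir is a hypothetical helper, not a shell built-in.

```shell
# Print list $1 (colon-separated) with directory $2 prepended, unless it is already in the list.
prepend_dir() {
  local list="$1"
  local dir="$2"
  case ":$list:" in
    *":$dir:"*) printf '%s\n' "$list" ;;           # already present: leave unchanged
    *) printf '%s\n' "$dir${list:+:$list}" ;;      # add ":" only when the list is non-empty
  esac
}

PATH="$(prepend_dir "$PATH" /usr/local/cuda-12.5/bin)"
LD_LIBRARY_PATH="$(prepend_dir "${LD_LIBRARY_PATH:-}" /usr/local/cuda-12.5/lib64)"
export PATH LD_LIBRARY_PATH
```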
-
Verify the CUDA installation
nvcc --version
If the version information is printed, CUDA is installed correctly.
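Because nvidia-smi reports the highest CUDA version the driver supports, the toolkit version printed by nvcc can be compared against it. The following is a minimal sketch, assuming GNU sort (for its -V version ordering) and the usual "release X.Y" fragment in the nvcc output; version_le and parse_nvcc_release are hypothetical helper names.

```shell
# Succeed when version $1 is less than or equal to version $2 (uses GNU sort -V).
version_le() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Read nvcc --version output on stdin and print the release number (e.g. 12.5).
parse_nvcc_release() {
  sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

if command -v nvcc >/dev/null 2>&1 && command -v nvidia-smi >/dev/null 2>&1; then
  toolkit="$(nvcc --version | parse_nvcc_release)"
  driver_max="$(nvidia-smi | sed -n 's/.*CUDA Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1)"
  if version_le "$toolkit" "$driver_max"; then
    echo "toolkit $toolkit is within the driver limit $driver_max"
  else
    echo "warning: toolkit $toolkit is newer than the driver limit $driver_max"
  fi
else
  echo "nvcc or nvidia-smi not found; skipping the comparison"
fi
```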
Installing cuDNN
-
Download cuDNN: visit the cuDNN Archive and download the cuDNN version that matches the installed CUDA version.
-
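Extract and install: extract the downloaded cuDNN archive and copy its files into the CUDA installation directory, for example /usr/local/cuda-12.5. The header files go into the include subdirectory and the libraries into lib64 (root privileges are required). On the command line there is no “replace the files in the destination” dialog; copying simply overwrites files of the same name.
-
Verify the installation: refresh the dynamic linker cache so that the newly copied libraries can be found. This command registers the libraries rather than testing them, so also confirm that the cuDNN header files and libcudnn libraries are present under /usr/local/cuda-12.5.
sudo ldconfig /usr/local/cuda-12.5/lib64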
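Running ldconfig only refreshes the linker cache, so a quick extra check is to read the version macros from the cuDNN header. The following is a sketch under the assumption that the archive's headers were copied to /usr/local/cuda-12.5/include and include cudnn_version.h with the CUDNN_MAJOR, CUDNN_MINOR, and CUDNN_PATCHLEVEL macros; package-based installs may place the header elsewhere, and cudnn_version_from_header is a hypothetical helper name.

```shell
# Print MAJOR.MINOR.PATCH read from a cudnn_version.h-style header file ($1).
# Returns non-zero when the file is unreadable or defines no CUDNN_MAJOR.
cudnn_version_from_header() {
  local header="$1"
  [ -r "$header" ] || return 1
  awk '
    $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
    $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
    $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
    END { if (major == "") exit 1; print major "." minor "." patch }
  ' "$header"
}

# Assumed location; adjust it to wherever the headers were copied.
header=/usr/local/cuda-12.5/include/cudnn_version.h
if version="$(cudnn_version_from_header "$header")"; then
  echo "cuDNN $version"
else
  echo "no readable cuDNN version header at $header"
fi
```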
Driver and CUDA Install Locations
-
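NVIDIA driver: older driver packages install into /usr/lib/nvidia-<driver-version> and /usr/lib32/nvidia-<driver-version>; recent Ubuntu packages typically place the libraries under /usr/lib/x86_64-linux-gnu instead.
-
CUDA Toolkit: the default install path is /usr/local/cuda-<version>, for example /usr/local/cuda-12.5.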
Optional: Install the NVIDIA Container Toolkit (for Docker)
To let Docker containers use the GPU, install the NVIDIA Container Toolkit:
-
Set up the GPG key and the package repository
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
-
Update the package list and install
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
-
Configure the Docker daemon
sudo nvidia-ctk runtime configure
-
Restart the Docker service
sudo systemctl restart docker
-
Verify that a Docker container can use the GPU
sudo docker run --rm --gpus all nvidia/cuda:12.5.1-base-ubuntu22.04 nvidia-smi
If the command succeeds and the container prints the same nvidia-smi table as the host does, the configuration works.
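If the container test fails, one thing worth checking is whether the "Configure the Docker daemon" step actually registered the NVIDIA runtime. The following is a rough sketch, assuming Docker's default configuration path /etc/docker/daemon.json; it does a plain text match rather than parsing the JSON, and has_nvidia_runtime is a hypothetical helper name.

```shell
# Succeed when the Docker daemon config file ($1) mentions nvidia-container-runtime.
# A plain text match, not a JSON parser: enough for a quick sanity check.
has_nvidia_runtime() {
  local config="${1:-/etc/docker/daemon.json}"
  [ -r "$config" ] && grep -q 'nvidia-container-runtime' "$config"
}

if has_nvidia_runtime /etc/docker/daemon.json; then
  echo "nvidia runtime is registered in /etc/docker/daemon.json"
else
  echo "nvidia runtime not found; rerun: sudo nvidia-ctk runtime configure"
fi
```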