====== K80 ======

<code bash>
pacman -S nvidia-470xx-dkms nvidia-470xx-settings nvidia-470xx-utils
pacman -U https://
</code>
| + | |||
| + | < | ||
| + | # cat / | ||
| + | NVRM version: NVIDIA UNIX x86_64 Kernel Module | ||
| + | GCC version: | ||
| + | |||
| + | # nvcc -V | ||
| + | nvcc: NVIDIA (R) Cuda compiler driver | ||
| + | Copyright (c) 2005-2021 NVIDIA Corporation | ||
| + | Built on Sun_Aug_15_21: | ||
| + | Cuda compilation tools, release 11.4, V11.4.120 | ||
| + | Build cuda_11.4.r11.4/ | ||
| + | |||
| + | </ | ||
| + | |||
| + | |||
| + | samples | ||
| + | < | ||
| + | git clone --depth 1 --branch v11.4.1 https:// | ||
| + | cd / | ||
| + | cd Samples/ | ||
| + | make | ||
| + | </ | ||
| + | |||
| + | deviceQuery | ||
| + | < | ||
| + | cd / | ||
| + | ./ | ||
| + | |||
| + | ./ | ||
| + | |||
| + | CUDA Device Query (Runtime API) version (CUDART static linking) | ||
| + | |||
| + | Detected 2 CUDA Capable device(s) | ||
| + | |||
| + | Device 0: "Tesla K80" | ||
| + | CUDA Driver Version / Runtime Version | ||
| + | CUDA Capability Major/Minor version number: | ||
| + | Total amount of global memory: | ||
| + | (013) Multiprocessors, | ||
| + | GPU Max Clock rate: 824 MHz (0.82 GHz) | ||
| + | Memory Clock rate: 2505 Mhz | ||
| + | Memory Bus Width: | ||
| + | L2 Cache Size: | ||
| + | Maximum Texture Dimension Size (x, | ||
| + | Maximum Layered 1D Texture Size, (num) layers | ||
| + | Maximum Layered 2D Texture Size, (num) layers | ||
| + | Total amount of constant memory: | ||
| + | Total amount of shared memory per block: | ||
| + | Total shared memory per multiprocessor: | ||
| + | Total number of registers available per block: 65536 | ||
| + | Warp size: 32 | ||
| + | Maximum number of threads per multiprocessor: | ||
| + | Maximum number of threads per block: | ||
| + | Max dimension size of a thread block (x,y,z): (1024, 1024, 64) | ||
| + | Max dimension size of a grid size (x,y,z): (2147483647, | ||
| + | Maximum memory pitch: | ||
| + | Texture alignment: | ||
| + | Concurrent copy and kernel execution: | ||
| + | Run time limit on kernels: | ||
| + | Integrated GPU sharing Host Memory: | ||
| + | Support host page-locked memory mapping: | ||
| + | Alignment requirement for Surfaces: | ||
| + | Device has ECC support: | ||
| + | Device supports Unified Addressing (UVA): | ||
| + | Device supports Managed Memory: | ||
| + | Device supports Compute Preemption: | ||
| + | Supports Cooperative Kernel Launch: | ||
| + | Supports MultiDevice Co-op Kernel Launch: | ||
| + | Device PCI Domain ID / Bus ID / location ID: 0 / 3 / 0 | ||
| + | Compute Mode: | ||
| + | < Default (multiple host threads can use :: | ||
| + | |||
| + | Device 1: "Tesla K80" | ||
| + | CUDA Driver Version / Runtime Version | ||
| + | CUDA Capability Major/Minor version number: | ||
| + | Total amount of global memory: | ||
| + | (013) Multiprocessors, | ||
| + | GPU Max Clock rate: 824 MHz (0.82 GHz) | ||
| + | Memory Clock rate: 2505 Mhz | ||
| + | Memory Bus Width: | ||
| + | L2 Cache Size: | ||
| + | Maximum Texture Dimension Size (x, | ||
| + | Maximum Layered 1D Texture Size, (num) layers | ||
| + | Maximum Layered 2D Texture Size, (num) layers | ||
| + | Total amount of constant memory: | ||
| + | Total amount of shared memory per block: | ||
| + | Total shared memory per multiprocessor: | ||
| + | Total number of registers available per block: 65536 | ||
| + | Warp size: 32 | ||
| + | Maximum number of threads per multiprocessor: | ||
| + | Maximum number of threads per block: | ||
| + | Max dimension size of a thread block (x,y,z): (1024, 1024, 64) | ||
| + | Max dimension size of a grid size (x,y,z): (2147483647, | ||
| + | Maximum memory pitch: | ||
| + | Texture alignment: | ||
| + | Concurrent copy and kernel execution: | ||
| + | Run time limit on kernels: | ||
| + | Integrated GPU sharing Host Memory: | ||
| + | Support host page-locked memory mapping: | ||
| + | Alignment requirement for Surfaces: | ||
| + | Device has ECC support: | ||
| + | Device supports Unified Addressing (UVA): | ||
| + | Device supports Managed Memory: | ||
| + | Device supports Compute Preemption: | ||
| + | Supports Cooperative Kernel Launch: | ||
| + | Supports MultiDevice Co-op Kernel Launch: | ||
| + | Device PCI Domain ID / Bus ID / location ID: 0 / 4 / 0 | ||
| + | Compute Mode: | ||
| + | < Default (multiple host threads can use :: | ||
| + | > Peer access from Tesla K80 (GPU0) -> Tesla K80 (GPU1) : Yes | ||
| + | > Peer access from Tesla K80 (GPU1) -> Tesla K80 (GPU0) : Yes | ||
| + | |||
| + | deviceQuery, | ||
| + | Result = PASS | ||
| + | |||
| + | </ | ||
| + | |||
| + | |||
| + | ===== pytorch ===== | ||
| + | |||
| + | < | ||
| + | pip3 install torch torchvision --index-url https:// | ||
| + | |||
| + | python -c " | ||
| + | # 2.3.1+cu118 | ||
| + | # True | ||
| + | </ | ||
| + | |||
| + | ===== ultralitycs ===== | ||
| + | |||
| + | < | ||
| + | pip install ultralytics | ||
| + | |||
| + | python -c " | ||
# True
</code>
| + | |||