ML tools for legacy NVIDIA Kepler cards, some with working Docker containers.
Supported Cards
The card currently in use is a GT 720.
Latest NVIDIA Driver: 470.256.02
Latest CUDA Version: 11.4
Latest CUDA Compute Capability: 3.5
Additional supported cards are listed at the link below. Other cards should work the same way as long as they have the same CUDA compute capability.
https://en.wikipedia.org/wiki/Kepler_(microarchitecture)
Tested ML Tools
Ollama → Currently not working due to an incompatible CUDA compute capability, but progress can be tracked here: https://github.com/ollama/ollama/issues/1756
llama.cpp → Working when forced via environment variables to use CUDA compute capability 3.5 and CUDA 11.4.
A working Docker container is available at https://hub.docker.com/r/liej6799/llama-kepler; you can supply your own GGUF model:
docker run -it -v /root/models:/models -p 8080:8080 liej6799/llama-kepler --host 0.0.0.0 --port 8080 -m /models/qwen2.5-0.5b-instruct-fp16.gguf
Change the volume mount and the model path to match your setup.
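Once the container is up, you can test it with a simple request to the llama.cpp server's /completion endpoint (adjust the host, port, and prompt as needed):
curl http://localhost:8080/completion -H "Content-Type: application/json" -d '{"prompt": "Hello, my name is", "n_predict": 32}'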
PyTorch (Torch and TorchVision)
Special thanks to Nelson Liu for building newer torch versions compatible with older GPUs: https://blog.nelsonliu.me/2020/10/13/newer-pytorch-binaries-for-older-gpus/
Check your CUDA Version by running nvidia-smi
How to read the wheel file name?
torch-1.12.1+cu113-cp310-cp310-linux_x86_64.whl
torch-1.12.1 is the PyTorch version
cu113 is the CUDA version (11.3), the newest build available for my CUDA 11.4 setup; it works because the driver supports older CUDA runtimes
cp310 is the Python version (3.10), which needs to match your interpreter when installing with pip
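For example, to install the wheel used in the requirements below (torch 1.11.0 built against CUDA 11.3 for Python 3.10, downloaded from the blog above) together with its matching torchvision from PyPI:
pip install ./torch-1.11.0+cu113-cp310-cp310-linux_x86_64.whl
pip install torchvision==0.12.0 "numpy<2"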
Here is a sample run of the official MNIST training example (https://github.com/pytorch/examples/blob/main/mnist/main.py).
Due to the older PyTorch version, you may need to disable MPS when running the official example:
python3 main.py --no-mps
The final pip requirements look like this:
certifi==2025.1.31
charset-normalizer==3.4.1
idna==3.10
numpy==1.26.4
pillow==11.1.0
requests==2.32.3
torch @ file:///torch-1.11.0%2Bcu113-cp310-cp310-linux_x86_64.whl
torchvision==0.12.0
typing_extensions==4.12.2
urllib3==2.3.0
Notes:
The numpy version needs to be older than 2.0.0, and the torchvision version needs to match the torch version.
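To confirm the installed torch build has CUDA support and can see the Kepler card, a quick sanity check (this should print True and (3, 5) for a compute capability 3.5 GPU):
python3 -c "import torch; print(torch.cuda.is_available(), torch.cuda.get_device_name(0), torch.cuda.get_device_capability(0))"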
onnxruntime
An onnxruntime build for CUDA 11.4 can be found and installed, but it requires the CUDA Toolkit and fails due to missing CUDA library files. It is currently still not working; it may work in a container that ships CUDA Toolkit 11.4.
https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html
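A possible route, not yet verified here: run onnxruntime inside a base image that already ships CUDA Toolkit 11.4 so the missing libraries are present, then install the CUDA 11.4 build of onnxruntime there. The image tag below is an assumption; check which CUDA 11.4 tags are still published on Docker Hub:
docker run --rm -it --runtime=nvidia --gpus all nvidia/cuda:11.4.3-cudnn8-runtime-ubuntu20.04 bash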
BONUS: GPU Sharing for Multiple Docker/Kubernetes Containers
As this GPU is still supported by the NVIDIA Container Toolkit, it is possible to share the GPU in either Docker or Kubernetes.
Currently the official NVIDIA device plugin (https://github.com/NVIDIA/k8s-device-plugin) does not allow sharing the GPU on each node.
Workaround:
For Docker
It is possible to share GPU resources with Docker and the NVIDIA Container Toolkit.
You can test by passing --runtime=nvidia --gpus all to a plain Ubuntu container; for a real workload, use a container that does not exit right away.
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
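The toolkit does not enforce exclusive access, so multiple containers started this way can see the same GPU at once. A quick demonstration (container names are just examples):
sudo docker run -d --runtime=nvidia --gpus all --name gpu-test-1 ubuntu bash -c "while true; do nvidia-smi; sleep 10; done"
sudo docker run -d --runtime=nvidia --gpus all --name gpu-test-2 ubuntu bash -c "while true; do nvidia-smi; sleep 10; done"
sudo docker logs gpu-test-1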
For Kubernetes
As the setup is limited to one GPU per node, we can create multiple LXC containers with GPU access; this way each LXC container is treated as a separate node.
If you use k3s for Kubernetes, use Docker as the backend.
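A minimal sketch of installing k3s with the Docker backend inside each LXC container, based on the k3s install script (double-check the --docker flag against your k3s version):
curl -sfL https://get.k3s.io | sh -s - --docker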
REF https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/sample-workload.html