Demo

Share the GPU across multiple Kubernetes applications, stream and game with Sunshine and Moonlight, and create a clean container for ML development (trying out the latest ROCm driver with ONNX Runtime, PyTorch, tinygrad).
Current Limitation on Kubernetes
Due to limitations in the current device plugin implementations for Kubernetes, it is not possible to share a GPU across different pods on a single node. This limitation applies to both AMD and NVIDIA GPUs.
one GPU instance → one node
https://github.com/NVIDIA/k8s-device-plugin
https://github.com/ROCm/k8s-device-plugin
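With these plugins, a pod requests the GPU as an extended resource, and the plugin allocates the whole device to that one pod. A minimal sketch (the pod name and image are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-app            # illustrative name
spec:
  containers:
    - name: app
      image: ubuntu:22.04  # placeholder image
      resources:
        limits:
          nvidia.com/gpu: 1   # or amd.com/gpu: 1 with the ROCm plugin
```

Because the limit must be a whole integer, a second pod requesting the same resource on a single-GPU node stays Pending — which is exactly the limitation above.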
Current Tested Cards
NVIDIA GeForce GT 720
AMD Radeon RX 6600
Workaround
Create an LXC container on Proxmox and pass the GPU through to it. You can create multiple containers with the same GPU passed in. Since each container is treated as a separate system with the GPU available, you can run a Kubernetes node in each one and share the single GPU between them.
So if you want to run 3 GPU apps such as:
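As a sketch, passing the GPU into an LXC container usually means bind-mounting the device nodes in the container config (the container ID `101` is an example; NVIDIA cards expose `/dev/nvidia*` nodes with a different device major instead of `/dev/dri`):

```
# /etc/pve/lxc/101.conf  (101 is an example container ID)
# AMD/Intel: allow the DRI devices (char major 226) and bind-mount them
lxc.cgroup2.devices.allow: c 226:* rwm
lxc.mount.entry: /dev/dri dev/dri none bind,optional,create=dir
```

The same stanza can be added to every container that should share the GPU, since the host kernel driver multiplexes access to the device nodes.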
- jellyfin (for hardware transcoding)
- ollama (for running llm models)
- onnxruntime (for running models)
you will create 3 LXC instances with the same driver installed in all 3 containers, then manually join those containers to the Kubernetes cluster.
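The join step can be sketched like this for a kubeadm-based cluster (assumption: your cluster uses kubeadm; the address, token, and hash below are placeholders printed by the first command):

```
# On the control plane: print a fresh join command
kubeadm token create --print-join-command

# Inside each LXC instance (after installing the GPU driver,
# a container runtime, and kubelet), run the printed command, e.g.:
#   kubeadm join <control-plane>:6443 --token <token> \
#     --discovery-token-ca-cert-hash sha256:<hash>
```

Each container then shows up as its own node, each with "its own" GPU as far as the device plugin is concerned.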
Bonus
Other than Kubernetes, you can also create an LXC container for running headless Steam games and stream them over Sunshine + Moonlight.
https://github.com/games-on-whales/gow
Other than gaming, it is also possible to run XFCE applications (though currently not as root). This can replace existing remote desktop solutions such as RDP, SPICE, and VNC, with hardware-accelerated encoding in the H.265 format.