

I believe you can run 30B models on a single used RTX 3090 (24 GB); at least, I run the 32B DeepSeek-R1 distill on it using Ollama. Just make sure you have enough system RAM (> 24 GB).
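If you'd rather script it than sit in the CLI, Ollama also exposes a small HTTP API on localhost. A minimal sketch in Python (assuming the default port 11434 and that the model tag below has already been pulled; only the standard library is needed):

    # Ask the locally served model a question through Ollama's HTTP API.
    # Assumes Ollama is running on its default port 11434 and the model
    # tag below has already been pulled with `ollama pull`.
    import json
    import urllib.request

    payload = {
        "model": "deepseek-r1:32b-qwen-distill-q4_K_M",
        "prompt": "Explain what a Slurm partition is in one sentence.",
        "stream": False,  # return one JSON object instead of a token stream
    }

    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["response"])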
If you want to use supercomputer software, set up the SLURM scheduler on those machines. There are many tutorials on how to do distributed GPU computing with Slurm (rough sketch of the Python side after the links below); I have it on my to-do list.
https://github.com/SchedMD/slurm
https://slurm.schedmd.com/
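The usual pattern once Slurm is running is to let srun launch one process per GPU and have each process read its rank from the Slurm environment. A rough sketch of what the Python side typically looks like, assuming PyTorch with the NCCL backend and that MASTER_ADDR/MASTER_PORT are exported in the job script (names and layout are illustrative, not a complete recipe):

    # Rough sketch of a Slurm-launched multi-GPU worker.
    # Assumes: started via srun with one task per GPU, PyTorch + NCCL
    # installed, and MASTER_ADDR / MASTER_PORT exported in the sbatch script.
    import os
    import torch
    import torch.distributed as dist

    rank = int(os.environ["SLURM_PROCID"])         # global rank across all nodes
    world_size = int(os.environ["SLURM_NTASKS"])   # total number of processes
    local_rank = int(os.environ["SLURM_LOCALID"])  # rank within this node -> GPU index

    torch.cuda.set_device(local_rank)
    dist.init_process_group(backend="nccl", rank=rank, world_size=world_size)

    # Each process now owns one GPU; model sharding / DDP setup would go here.
    print(f"rank {rank}/{world_size} using GPU {local_rank}: "
          f"{torch.cuda.get_device_name(local_rank)}")

    dist.destroy_process_group()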
I run this one: https://ollama.com/library/deepseek-r1:32b-qwen-distill-q4_K_M with this frontend: https://github.com/open-webui/open-webui on a single RTX 3090 with 64 GB of RAM. It works quite well for what I wanted it to do. I wanted to connect 2x 3090 cards with Slurm to run 70B models but haven't found the time to do it.
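One nice side effect of that stack: anything that speaks the OpenAI API can also point at Ollama's /v1 endpoint, so you aren't tied to one frontend. A small sketch, assuming the openai Python package is installed and Ollama is on its default port:

    # Talk to the same local model through Ollama's OpenAI-compatible endpoint.
    # Assumes the `openai` package is installed and Ollama is on :11434;
    # the API key is required by the client but ignored by Ollama.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

    reply = client.chat.completions.create(
        model="deepseek-r1:32b-qwen-distill-q4_K_M",
        messages=[{"role": "user",
                   "content": "Why does Q4 quantization let a 32B model fit in 24 GB?"}],
    )
    print(reply.choices[0].message.content)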