PsychoWards Posted December 29, 2024

Does anyone have experience, tips or recommendations on which GPU (or similar) to get for HW acceleration and/or machine learning? Immich, for example, has some machine learning features which I've currently offloaded to my PC, but I'm looking to get a dedicated GPU that I can plug directly into the server. If anyone has recommendations on what to get, and some tips, I would greatly appreciate it.

I'm not planning on running my own full-blown ChatGPT replacement or anything similar, just the machine learning in Immich, plus being ready for future lightweight applications that might need some raw power.
Ornival Posted December 29, 2024

53 minutes ago, PsychoWards said:
"Does anyone have experience, tips or recommendations on which GPU or similar to get for HW acceleration and/or machine learning? ... and to be ready for future lightweight applications which might need some raw power."

(I don't dabble in AI, but I have a couple of clusters running model rendering and CAM/CFD simulation jobs.)

If your current setup fulfils your needs, there's no need for extra hardware or peripherals for offloading. Without knowing your current setup, I think you will run into other issues before you need more processing power. Current-gen Intel, AMD and Nvidia can all handle some form of AI/HPC offloading. Once you do expect a bottleneck, you'll probably have a much better idea which process needs more offloading/acceleration. It really depends on your use case(s); you can then look up what you need.

If you really want to buy something now and price is of no concern: you can't go wrong with more CPU processing power (higher core/thread count), more RAM and any current-gen GPU. Intel Arc B-series, AMD RX 7000 and Nvidia RTX 4000 all have acceleration and offloading capabilities (depending on your field or software requirements) and should be mentioned in any software usage guidelines. Nvidia has better support in general, so an RTX (or RTX Quadro) will always be useful. Usually more cores (TMU/ROP) and more VRAM is better.

On the other hand: buying (expensive) equipment now really doesn't save money in the future or even yield better results, if any. If you don't need it now, you won't need it in the near future, and on the horizon there may be better options available. It depends mostly on the software support and requirements, I guess. Just beware: it (again) really makes no sense to buy an RTX 3080 (for example) if you have no use for it now, because your application or software stack might not make optimal use of it. Current lightweight applications rarely diverge into resource hogs...

Plex/ffmpeg seems a good example: Intel iGPUs up to the current gen already outperform any dedicated AMD/Nvidia GPU in realtime transcoding quality, and even when AI optimisation/processing comes into the picture, you can still offload the processing (via a docker application container) to a remote processing node, whether CPU- or GPU-backed — see the sketch at the end of this post. I don't see a future path where these applications by themselves would support/warrant realtime image enhancement via AI. Or in the case of Immich: both CPU and GPU (Nvidia) load balancing is supported, so you are most likely already using your current setup without bottlenecks. I find it unlikely that you would put a better CPU/GPU in your Immich server just to "future proof" your current server machine. You are more likely to just put your current GPU in the server and have it do its thing in the background. You can then treat yourself to a shiny new GPU for gaming, and use that if you need to help Immich a bit, but for the most part Immich can do its thing on CPU just fine.

Maybe your Immich example is just the wrong example, but my advice, in case that was not expected 😉 is to just wait until you have a clear target. Machine learning was already ubiquitous long before AI and machine learning became trendy with the general public. You are not missing out on anything, because you are already on the train, and unless you are unhappy with your seat, there is no need to upgrade your seat ticket.
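PS: to illustrate the "offload to a remote node" idea for Immich specifically, here's a minimal docker-compose sketch — untested and from memory, so check the current Immich docs for the exact image tag — for running just the machine-learning container on whichever box has the GPU. It assumes the Nvidia container toolkit is installed on that host; port 3003 and the /cache volume are the usual defaults.

```yaml
# docker-compose.yml on the GPU box — only the ML service, nothing else from the Immich stack
services:
  immich-machine-learning:
    image: ghcr.io/immich-app/immich-machine-learning:release-cuda  # CUDA build of the ML container
    restart: unless-stopped
    ports:
      - "3003:3003"            # the Immich server talks to this port over the LAN
    volumes:
      - model-cache:/cache     # keeps downloaded models between restarts
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia   # requires the Nvidia container toolkit on the host
              count: 1
              capabilities: [gpu]

volumes:
  model-cache:
```

The Immich server itself then only needs to be pointed at `http://<gpu-box>:3003` instead of its bundled ML container.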
PsychoWards Posted December 30, 2024 (Author)

Hey, thank you so much for this detailed answer. My current setup consists of an i5-14600K and no dedicated GPU in the server, and Immich is connecting to my PC to run the ML stuff on my 3080 Ti (basically the machine-learning URL in the server config pointing at the PC — rough sketch below). The "problem", however, is that my PC is not running 24/7, and I don't want Immich to hit my GPU while I'm gaming. That's why I thought a dedicated GPU might be helpful for this part. I don't want to upgrade my GPU, especially since it has a water cooling block installed, and I don't want to put a custom water cooling loop in my server.

My only current use case is indeed Immich (which does not support ML on the iGPU), but in the future some AI recognition for security cameras might be added as well. I'm not looking to get a high-end GPU, but maybe something lower class, or a used Nvidia P-series GPU (P40) or something similar in the €300/$300 price range, since they have way faster single and double precision FP performance than any high-end 4000/5000 series GPU. For transcoding I'm using my iGPU anyway.
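For context, the split looks roughly like this on the server side — the environment variable name is from memory (newer Immich releases also expose the same setting in the admin UI), and the hostname is obviously a placeholder:

```yaml
# Fragment of the server-side docker-compose.yml — hypothetical hostname
services:
  immich-server:
    environment:
      # ML requests currently go to the gaming PC; once a dedicated GPU sits in
      # the server, this would simply point at a local ML container instead.
      IMMICH_MACHINE_LEARNING_URL: http://gaming-pc.lan:3003
```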
Ornival Posted December 30, 2024

Ah, thanks for the extra info, now I think I get where you're coming from! You did do some research! 🙌

I am not sure if you're game for some tinkering, but the P40 has better FP performance due to the Quadro drivers, and you might find a better deal if you switch to a hypervisor and unlock the potential of the vGPU capabilities with patched drivers on a consumer card. (The mentioned Tesla P40 has a GP102 die with native vGPU driver support.)

For reference: one of my machines hosts the Proxmox hypervisor on an i5-12500T, 128GB RAM and a GTX 1080 Ti (also GP102). The main reason for using the GTX 1080 Ti is exactly that similarity with the Tesla P40, albeit with only 11GB vs 24GB for the P40! As long as your ML jobs or models don't exceed 11GB, you won't experience severe penalties for it, if any. The GPU identifies as a P40 and is assigned within Proxmox to a VM; I specified several mdevs for different use cases in different VMs (a rough sketch of that step follows below). My host also serves my HexOS test VM (containing Immich), a dedicated Plex VM (iGPU passthrough), Home Assistant OS, docker/kubernetes and many more. Since your CPU is (much) more powerful, I expect much better results/performance in comparison to my machine.

Only buy a Tesla P10/P40/T10/T40/Quadro P6000/RTX 6000/RTX 8000 if you can get a sweet deal on it or really need the 24GB; otherwise a 1080 Ti 11GB, Titan X Pascal (both GP102), 2070 Super (TU104 version!), 2080 non-Ti/Super (TU104) or even a 2080 Ti (TU102). Here in the Netherlands a used 1080 Ti is about €175-200 and a 2080 Ti about €300-400.

But still: only buy an extra GPU if you need it. My 12500T has much less performance, and still my Immich app in HexOS processes all my pictures just fine on CPU alone. The only time HexOS feels 'laggy' is when I haven't synced my pictures for a while and the bulk upload strains my wireless network. The dedicated GPU became "free" after my upgrade to an AMD RX 6950 XT, and I just haven't found a need to assign an mdev to a VM yet.

(All of the above only applies to your/my mentioned use case, of course. I have triples of Nvidia Tesla M40 24GB and AMD MI25 16GB for jobs/applications that do need or benefit from more VRAM.)
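In case it helps, the mdev assignment itself is only a couple of commands once the (patched) vGPU driver is loaded — the PCI address, VM ID and profile name below are placeholders, and the available profiles depend on the driver version:

```sh
# List which vGPU (mediated device) profiles the card exposes
ls /sys/bus/pci/devices/0000:01:00.0/mdev_supported_types/
# (recent Proxmox versions can also list them via:
#  pvesh get /nodes/<node>/hardware/pci/0000:01:00.0/mdev)

# Attach one profile to VM 101 as a virtual GPU
qm set 101 --hostpci0 0000:01:00.0,mdev=nvidia-46
```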
Ornival Posted December 30, 2024

Wow! That quick reply escalated rather fast... My experience is just n=1 and should not be applied to general ML/HPC considerations. @PsychoWards Uhm... just follow your original idea, as you already put some thought into it... nothing wrong with your initial approach 😉
PsychoWards Posted December 31, 2024 (Author)

@Ornival thank you so much for the very detailed explanations and remarks. This helps me a ton. Funnily enough, my server is already running on Proxmox anyway, so that part is already done =D I forgot to mention this earlier, but my Immich is running in Docker in an Ubuntu VM, it's not running in HexOS itself.

I still have a 1060 with 3GB VRAM (which is really weak, both GPU and VRAM) and a 2080 Ti, which however crashes while gaming. But maybe only the rasterization part is causing issues; the CUDA part may be fine, I'll have to test this (probably with some sustained CUDA load, something like the snippet below). If the 2080 Ti isn't usable, I will go on the hunt for one of your recommended GPUs if I find a good deal. I don't think a lot of VRAM will be required for my use cases, because most programs with machine learning capabilities for home labs are meant to run on any kind of hardware and don't specifically need a high-end GPU with lots of VRAM. Another example would be Paperless-ngx, which also uses machine learning but doesn't even support HW acceleration. So technically I don't require a GPU, but the tinkering is the fun part.
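One way to test that without involving games at all: a few minutes of sustained matrix multiplications on the card. This is just a rough sketch assuming PyTorch with CUDA support is installed in whatever VM/container sees the 2080 Ti; the matrix size and iteration count are arbitrary.

```python
# Quick CUDA sanity/load check — if the card is flaky on the compute side,
# this usually surfaces CUDA errors (or crashes) within a few minutes.
import time
import torch

assert torch.cuda.is_available(), "No CUDA device visible"
device = torch.device("cuda:0")
print("Testing:", torch.cuda.get_device_name(device))

a = torch.randn(8192, 8192, device=device)
b = torch.randn(8192, 8192, device=device)

start = time.time()
for i in range(200):
    c = a @ b
    torch.cuda.synchronize()  # force the kernel to finish so errors show up here
    if i % 20 == 0:
        print(f"iter {i:3d}, mean={c.mean().item():.4f}")
print(f"done in {time.time() - start:.1f}s without CUDA errors")
```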
Ornival Posted January 1

No need to go hunting then! The 2080 Ti seems a natural choice for your setup 🙂