Both RTX and GTX cards can be used for deep learning, but RTX (Turing) cards have a clear edge: their Tensor Cores accelerate mixed-precision training, and on some models the larger, faster memory helps hide the cost of data transfers between system memory and the GPU. Exact numbers depend heavily on the model and input pipeline, but for a workload like ResNet-50 on ImageNet the gap between an RTX card and a GTX Titan Xp is typically a substantial fraction of total epoch time. Once data transfer becomes the bottleneck, you do not want your training to stall because of it. So if you are buying new cards for research purposes, by all means buy RTX ones. Even if you are already running GTX cards, adding RTX cards is reasonable, and the older cards can be kept as spares for any that go faulty.
GTX compared to RTX
RTX cards can be substantially faster than a GTX 1080 Ti for training. The speed difference varies a lot depending on the model you are using, but with mixed-precision (FP16) training it is usually around 2-3x. Sometimes you only get a fraction of the expected speedup because current CUDA libraries and framework kernels cannot yet fully utilize the Tensor Cores; this will likely improve with newer cuDNN and framework releases. In other words, if you have been waiting for better performance, now is a good time to upgrade. The RTX 20xx generation also brings a large improvement in inference throughput over the previous GTX 10xx generation.
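The honest way to compare cards is to measure training throughput (images/second) on your own workload and convert it to epoch time. A minimal sketch; the throughput numbers below are hypothetical placeholders, not benchmarks:

```python
def epoch_time_s(images_per_epoch: int, images_per_sec: float) -> float:
    """Seconds per epoch at a measured training throughput."""
    return images_per_epoch / images_per_sec

# Hypothetical measured throughputs for ResNet-50-style training
# (replace with numbers from your own benchmark runs).
gtx_ips = 200.0   # GTX 1080 Ti, FP32 (placeholder)
rtx_ips = 500.0   # RTX card, FP16 with Tensor Cores (placeholder)

imagenet = 1_281_167  # ImageNet-1k training set size

print(f"speedup: {rtx_ips / gtx_ips:.1f}x")
print(f"GTX epoch: {epoch_time_s(imagenet, gtx_ips) / 3600:.1f} h")
print(f"RTX epoch: {epoch_time_s(imagenet, rtx_ips) / 3600:.1f} h")
```

With these placeholder numbers the speedup works out to 2.5x, in line with the 2-3x range above; plugging in your own measurements tells you what the upgrade buys on your models.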
Which one should I buy: GeForce or Quadro?
If you just need a card for research or personal use, GeForce is fine and offers by far the best price-to-performance. Quadro cards mainly make sense when you need more GPU memory than GeForce offers (the high-end Quadro models carry 16-48 GB), or when certified drivers and NVIDIA's licensing terms matter: the GeForce EULA restricts data-center deployment, so servers in shared facilities generally need Quadro or Tesla parts. If your center needs Quadro-class cards only occasionally, used cards from Amazon or other online shops can cost much less than brand-new ones, but note that the performance and condition of used GPUs vary significantly between brands and models.
How much GPU RAM is enough?
It depends on the model and data size, but generally 6 GB or more of fast memory is enough to avoid serious bottlenecks on most common models for the next few years. If you are working with very large models or datasets, the 8-11 GB RTX cards help, though this mainly affects what you can train (batch size, model size) rather than inference speed. Also worth noting: the GDDR6 memory on Turing GPUs has considerably higher bandwidth than GDDR5, which makes the high-memory RTX cards (such as the 24 GB Titan RTX) a real alternative to Tesla V100s in many scenarios.
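A rough way to check whether a model fits a given memory budget is to count parameters and multiply by the bytes needed for weights, gradients, and optimizer state. A minimal sketch, assuming FP32 training and an Adam-style optimizer that keeps two extra states per parameter; activation memory is workload-dependent and deliberately ignored here:

```python
def training_mem_gb(n_params: int, bytes_per_param: int = 4,
                    optimizer_states: int = 2) -> float:
    """Approximate GPU memory for weights + gradients + optimizer state, in GB.

    Ignores activations, which often dominate for conv nets at large
    batch sizes, so treat the result as a lower bound.
    """
    copies = 1 + 1 + optimizer_states  # weights + gradients + optimizer states
    return n_params * bytes_per_param * copies / 1024**3

resnet50_params = 25_600_000  # ~25.6M parameters
print(f"ResNet-50 lower bound: {training_mem_gb(resnet50_params):.2f} GB")
```

For ResNet-50 this lower bound is well under 1 GB; in practice activations at realistic batch sizes are what push such models toward the 6-11 GB range.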
RTX 2080 Ti compared to Tesla V100
A Tesla V100 costs several times more than an RTX 2080 Ti for comparable deep learning compute, so the choice depends largely on your budget. Price per FLOP is not the whole story, though: lower-capacity cards can force smaller batch sizes and longer training runs, which raises the effective cost per unit of training time, and in dense multi-GPU servers only cards with enough memory and interconnect bandwidth (NVLink on the V100) keep all GPUs fed. So if funding is limited, the RTX 2080 Ti is usually the better buy; go with the Tesla V100 if you need its 16-32 GB of memory, NVLink, or data-center licensing.
For similar training performance the 2080 Ti costs a fraction of the V100, so it is often the cheaper solution overall. You may need more cards, and therefore more space in your server or desktop, to match a V100 setup, but this also gives you flexibility if your center does not have many spare high-end cards for research purposes.
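One concrete way to make this call is price per unit of measured throughput. A minimal sketch; the prices and images/second figures are hypothetical placeholders that you would replace with your own quotes and benchmark results:

```python
def dollars_per_ips(price_usd: float, images_per_sec: float) -> float:
    """Price divided by measured training throughput: lower is better."""
    return price_usd / images_per_sec

cards = {
    # name: (price USD, images/sec) -- placeholder numbers, not benchmarks
    "RTX 2080 Ti": (1200.0, 500.0),
    "Tesla V100":  (8000.0, 600.0),
}

for name, (price, ips) in cards.items():
    print(f"{name}: ${dollars_per_ips(price, ips):.2f} per img/s")
```

Under these placeholder numbers the consumer card wins decisively on cost efficiency; the V100 justifies its price only through the memory, NVLink, and licensing advantages discussed above.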