Here is a note about video cards for use with Stable Diffusion and some of their characteristics that need to be taken into account for good generation speed.
Long story short, I have a GeForce GTX 1660 Super, which is too slow for playing around with Stable Diffusion. Generating a basic 512×512 image with the SD1.5 model takes around 20–30 seconds – not too bad. However, when using inpaint mode, the time increases to several minutes. It’s fine for experimentation but requires a lot of waiting. So, I decided to review the market to understand what is currently available (as of February 2025) within my budget. Like most people, I don’t want to overspend, so I set my budget limit at 1,250 euros (roughly the same in USD).
The first step was to determine the key characteristics to consider when researching video cards.
First of all, I prefer Nvidia, so I didn’t look at AMD.
The key factors include:
- Generation (e.g., 3xxx vs. 4xxx)
- VRAM amount (8GB – 24GB)
- VRAM frequency
- CUDA core count
- GPU clock speed
- Other factors: card size, number of ports, number of fans, LED backlighting, power consumption, etc.
And, of course, the king – price.
My current video card
Here are the specs of my current GPU:
- Name: GeForce GTX 1660 Super
- Generation: 16xx
- VRAM: 6GB GDDR6
- VRAM frequency: 14 Gbps (effective data rate)
- CUDA Cores: 1408
- GPU clock speed: 1.53GHz
What do we need for Stable Diffusion?
- VRAM – This is used to load models and process images. VRAM consumption during image processing depends on resolution: higher resolutions require more VRAM.
  - SD1.5 (with a 2GB model file) easily maxes out my 6GB VRAM when generating 512×512 images.
  - SDXL needs even more since the model itself is around 6GB. The recommended minimum for SDXL is likely 12GB, given that it was trained on 1024×1024 images.
- CUDA cores – More cores mean faster generation, roughly in proportion to the core count.
  - For example, if my current card has 1,408 CUDA cores and a new one has 4,500, I can expect roughly 3× faster generation – cutting the time from 20 seconds to about 7 seconds (or even faster).
- Generation – Each new GPU generation introduces better technology, higher memory bandwidth, and improved efficiency.
  - Choosing a newer generation is always preferable if all other specs are comparable.
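To make the CUDA core arithmetic concrete, here is a minimal Python sketch. It assumes generation time scales inversely with the core count, which is only a rough approximation since it ignores clock speed, architecture, and memory bandwidth. The function name `estimate_generation_time` is my own invention, not part of any library.

```python
def estimate_generation_time(current_seconds: float, current_cores: int, new_cores: int) -> float:
    """Scale a measured generation time by the ratio of CUDA core counts.

    Assumption: speed is proportional to the number of CUDA cores.
    Real-world results depend on much more than that, so treat the
    returned value as a rough guess, not a benchmark.
    """
    if current_cores <= 0 or new_cores <= 0:
        raise ValueError("core counts must be positive")
    return current_seconds * current_cores / new_cores


# GTX 1660 Super (1,408 cores) at about 20 s per image
# versus a hypothetical card with 4,500 cores.
print(round(estimate_generation_time(20, 1408, 4500), 1))  # 6.3
```

The result lands a little under the "about 7 seconds" estimate because 4,500 / 1,408 is closer to 3.2 than to 3.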
GPUs that fit my requirements
I checked amazon.it and found the following cards that match my filter. I also included some top-end and popular cards like the 3090, just for comparison:
| Model | VRAM | CUDA cores | CUDA cores vs. GTX 1660 Super | VRAM freq | GPU freq | Avg price, EUR |
|---|---|---|---|---|---|---|
| GTX 1660 Super | 6 GB GDDR6 | 1408 | 1.00 | 14 Gbps | 1.53 GHz | 250 |
| RTX 3060 Ti | 8 GB GDDR6 | 4864 | 3.45 | 14 Gbps | 1.41 GHz | 450 |
| RTX 4060 Ti | 8 GB GDDR6 | 4352 | 3.09 | 18 Gbps | 2.54 GHz | 500 |
| RTX 3060 | 12 GB GDDR6 | 3584 | 2.55 | 15 Gbps | 1.32 GHz | 300 |
| RTX 3080 | 12 GB GDDR6X | 8704 | 6.18 | 19 Gbps | 1.44 GHz | 1000 |
| RTX 4070 | 12 GB GDDR6X | 5888 | 4.18 | 21 Gbps | 2.47 GHz | 650 |
| RTX 5070 | 12 GB GDDR7 | 6144 | 4.36 | 28 Gbps | 2.51 GHz | 780 |
| RTX 4070 Super | 12 GB GDDR6X | 7168 | 5.09 | 21 Gbps | 1.98 GHz | 850 |
| RTX 4070 Ti | 12 GB GDDR6X | 7680 | 5.45 | 21 Gbps | 2.47 GHz | 1200 |
| RTX 4060 Ti | 16 GB GDDR6 | 4352 | 3.09 | 18 Gbps | 2.54 GHz | 550 |
| RTX 5060 Ti | 16 GB GDDR7 | 4608 | 3.27 | 28 Gbps | 2.4 GHz | 530 |
| RTX 5070 Ti | 16 GB GDDR7 | 8960 | 6.36 | 28 Gbps | 2.67 GHz | 1100 |
| RTX 4080 | 16 GB GDDR6X | 9728 | 6.91 | 23 Gbps | 2.51 GHz | 2200 |
| RTX 3090 | 24 GB GDDR6X | 10496 | 7.45 | 19.5 Gbps | 1.4 GHz | 1900 |
| RTX 3090 Ti | 24 GB GDDR6X | 10752 | 7.64 | 21 Gbps | 1.86 GHz | 2800 |
| RTX 5080 | 16 GB GDDR7 | 10752 | 7.64 | 30 Gbps | 2.62 GHz | 1300-1600 |
| RTX 5090 | 32 GB GDDR7 | 21760 | 15.45 | 28 Gbps | 2.62 GHz | 3400 |
As you can see, the most balanced card is the RTX 4070 Super, which should be approximately 5× faster than my GTX 1660 Super based on the number of CUDA cores alone. A more modern and efficient option is the RTX 5070 Ti, but it costs about 30% more, which isn’t worth it if we’re only talking about SD and have a limited budget.
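The filtering I did by hand can be sketched in a few lines of Python. The rows are a subset copied from the table, the 1,250 EUR budget and the 12 GB VRAM floor come from the sections above, and the `Card` and `shortlist` names are made up for this illustration. Sorting by CUDA cores is just one possible ranking under the same "cores roughly equal speed" assumption, not a benchmark.

```python
from typing import NamedTuple

BASELINE_CORES = 1408  # GTX 1660 Super, my current card


class Card(NamedTuple):
    name: str
    vram_gb: int
    cuda_cores: int
    price_eur: int  # average price from the table


# A few rows copied from the comparison table (not the full list).
CARDS = [
    Card("RTX 3060", 12, 3584, 300),
    Card("RTX 4060 Ti", 8, 4352, 500),
    Card("RTX 4070", 12, 5888, 650),
    Card("RTX 5070", 12, 6144, 780),
    Card("RTX 4070 Super", 12, 7168, 850),
    Card("RTX 4070 Ti", 12, 7680, 1200),
    Card("RTX 4080", 16, 9728, 2200),
]


def shortlist(cards, budget_eur, min_vram_gb):
    """Keep cards within the budget and VRAM floor, most CUDA cores first."""
    fits = [c for c in cards if c.price_eur <= budget_eur and c.vram_gb >= min_vram_gb]
    return sorted(fits, key=lambda c: c.cuda_cores, reverse=True)


for card in shortlist(CARDS, budget_eur=1250, min_vram_gb=12):
    ratio = card.cuda_cores / BASELINE_CORES
    print(f"{card.name:<16} {ratio:.2f}x  {card.price_eur} EUR")
```

With these inputs the RTX 4070 Ti comes out on top and the RTX 4070 Super second, at roughly 70% of the price, which is why I consider the Super the more balanced of the two.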
Here is the same list of cards, sorted by release year and generation:
| Card Name | VRAM | Release Year | NVIDIA Generation | Avg Price, eur |
|---|---|---|---|---|
| GTX 1660 Super | 6GB | 2019 | Turing | 250 |
| RTX 3060 Ti | 8GB | 2020 | Ampere | 450 |
| RTX 3080 | 12GB | 2020 | Ampere | 1000 |
| RTX 3090 | 24GB | 2020 | Ampere | 1900 |
| RTX 3060 | 12GB | 2021 | Ampere | 300 |
| RTX 3090 Ti | 24GB | 2022 | Ampere | 2800 |
| RTX 4080 | 16GB | 2022 | Ada Lovelace | 2200 |
| RTX 4060 Ti | 8GB | 2023 | Ada Lovelace | 500 |
| RTX 4070 | 12GB | 2023 | Ada Lovelace | 650 |
| RTX 4070 Ti | 12GB | 2023 | Ada Lovelace | 1200 |
| RTX 4060 Ti | 16GB | 2023 | Ada Lovelace | 550 |
| RTX 4070 Super | 12GB | 2024 | Ada Lovelace | 850 |
| RTX 5060 Ti | 16GB | 2025 | Blackwell | 530 |
| RTX 5070 | 12GB | 2025 | Blackwell | 780 |
| RTX 5070 Ti | 16GB | 2025 | Blackwell | 1100 |
| RTX 5080 | 16GB | 2025 | Blackwell | 1300-1600 |
| RTX 5090 | 32GB | 2025 | Blackwell | 3400 |
In the end, I decided to go slightly over budget and bought the PNY GeForce RTX 5080 with 16 GB. It is much better than the 5070 Ti (it has 20% more CUDA cores). Yes, it has only 16 GB of VRAM, which is not really enough for the latest AI models. But with the current VRAM shortage, it doesn’t make sense to wait for a 5090 price drop. I think this is the best choice for my budget.
- PNY GEFORCE RTX™ 5080 16GB Triple Fan DLSS 4 Graphics Card
- VRAM: GDDR7, 16 GB (256-bit)
- CUDA cores: 10,752
- Release year: 2025
- Nvidia generation: Blackwell
- Price: 1,318 EUR (January 2026)
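As a side note, the per-pin VRAM speed from the comparison table (30 Gbps for the 5080) only turns into memory bandwidth once it is combined with the bus width listed above. Here is a minimal sketch of the standard formula; the function name `memory_bandwidth_gb_s` is my own, and the result is a theoretical peak, not a measured figure.

```python
def memory_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Theoretical peak bandwidth in GB/s.

    bus width (bits) x per-pin data rate (Gbit/s) / 8 bits per byte
    """
    return bus_width_bits * data_rate_gbps / 8


# RTX 5080: 256-bit bus at 30 Gbps per pin
print(memory_bandwidth_gb_s(256, 30))  # 960.0
```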