GeForce GTX 1650 to the test. Review of the new Nvidia low-end card with TU117 Turing GPU, flanked by 4GB of GDDR5 memory.
welcome GeForce GTX 1650. It is the third Nvidia GeForce video card of the 16 series after the GTX 1660 Ti and the GTX 1660. Like the other solutions, it is based on the architecture Turing, but in spite of the GeForce RTX cards, does not offer RT core and Tensor core units.
The former are responsible for accelerating the calculations related to ray tracing, while the latter always help with these operations, in addition to allowing the activation of the DLSS, the anti-aliasing "based on artificial intelligence" introduced by Nvidia in recent months .
Apart from these deficiencies, the GTX 1650 it still preserves the other peculiarities of the Turing architecture, which we have explored in a dedicated article. The new Nvidia architecture has several other innovations, such as a renewed Streaming Multiprocessor (SM). Nvidia claims to have improved the shading speed for CUDA core by 50% compared to Pascal. Progress is linked to two fundamental changes to architecture.
The first is the presence of a new integer datapath that can execute instructions simultaneously with floating point calculations. More precisely, there are dedicated cores that process FP32 and integer calculations simultaneously and allow FP16 calculations to be performed at twice the speed of FP32 operations.
What does this mean? According to Nvidia, modern games are increasingly mixing floating point operations with integer instructions. For every 100 instructions in Shadow of the Tomb Raider, for example, an average of 62 are in floating point and 38 are integers. With Pascal and previous GPUs it is not possible to perform different types of calculations simultaneously, with a consequent loss of speed and efficiency.
The GPU is also partitioned into four blocks, each with a warp scheduler and a dispatch unit, a new L0 instruction cache and a 64 KB log file, 16 FP32 cores, 16 INT32 cores and cores dedicated to FP16 operations management. "Modern games are increasingly using FP16 to create effects that do not require a high level of accuracy, such as Far Cry 5" water simulation. Unlike the GTX 1660, in the GeForce RTX the FP16 calculations are managed by the core Tensors.
The second big news is that the memory path of the SM has been redesigned to unify the shared memory, texture caching and memory caching in a unit. This translates into a doubling of bandwidth and available capacity for the L1 cache with common workloads.
Turing's L1 cache is configurable, with a maximum size of 64 KB, combined with 32 KB per SM of shared allocated memory, or can be reduced to 32 KB, leaving 64 KB of allocation for shared memory. "Combining the L1 data cache with shared memory reduces latency and guarantees a higher bandwidth than Pascal GPU implementation", assures the company. According to Nvidia, CoD: BO4 is one of the titles that most benefits from the new Turing GPU caching architecture, with performance up to 50% greater than a GTX 1060 6GB.
Another noteworthy new feature of Turing is the new shading technology called Rate Shading Variable (VRS). The two algorithms that take advantage of VRS are Content Adaptive Shading and Motion Adaptive Shading. With both, the GPU can adjust the shading rate on different regions of the scene or even for specific objects, so areas that do not need to be rendered in full detail can be shaded with fewer samples to improve performance.
Motion Adaptive Shading intervenes on the shading rate based on the amount of movement present in a particular region of the scene. For example, during a racing game the machine is rendered in maximum detail, while the road, stripes and signs can be rendered with less detail – but nothing the eye can perceive – because they are overcome at high speed.
With Content Adaptive Shading, the shading rate is related to the spatial and temporal coherence of colors between frames. In the event that color changes are minimal from one frame to another – for example the sky in a scene or a wall – the shading rate can be lowered between successive images to improve performance. Also in this case Nvidia ensures that the eye does not notice the difference. Among the games that apply this technology is Wolfenstein II: The New Colossus.
Like the other Turing GPUs, the GTX 1650 also has an improved NVENC encoder which adds support for H.265 (HEVC) 8K encoding at 30 fps. The new NVENC encoder guarantees up to 25% bitrate savings for HEVC and up to 15% in H.264. The new NVDEC decoder has been updated to support HEVC 4: 4: 4 decoding with 8/10/12 bit video streams and is also compatible with VP9 10/12 bit HDR like the GPUs Pascal GP102, GP107 and GP108 and Volta GV100.
According to Nvidia, Turing improves the encoding quality compared to previous Pascal GPUs and compared to software encoders. The Turing video encoder exceeds the quality of the x264 software encoder using the fast setting, with significantly less CPU usage. Nvidia also points out that the 4K streaming is too heavy to encode it on the CPU, so the company worked with OBS to offer an improved version of the software capable of using Turing's video enhancements. As a result, players can stream and transmit simultaneously from the same system, without having a PC dedicated to encoding video.