To meet the demands of mainstream 1440p gaming, Intel has significantly reworked the architecture of Xe2, the GPU core behind the Battlemage generation. The update strengthens the ray tracing units and adds hardware acceleration for XeSS 2, yielding a notable gain in overall performance. The Xe2 GPU is still built around the Render Slice as its fundamental unit. The new Xe2 Render Slice is the result of in-depth micro- and macro-level analysis of every graphics acceleration function, with the goals of lower latency, less stuttering, and more efficient interaction between hardware and software.
Each Render Slice contains four second-generation Xe cores (referred to as Xe2 cores). Within each core, compute resources have been reallocated around a native SIMD16 engine to improve efficiency. Each Xe2 core has eight 512-bit XVE vector engines and eight 2048-bit XMX AI engines. That is half as many engines as in the previous Xe-HPG architecture, but each engine is twice as wide, so the total number of processing units per core is unchanged while data processing efficiency improves significantly. The L1 cache has also grown from 192 KB in the previous generation to 256 KB.
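The "half the engines, twice the width" trade-off can be checked with a little arithmetic. The sketch below is only an illustration: the Xe-HPG figures (16 engines at 256 and 1024 bits) are inferred from the description above rather than quoted from Intel, and the dictionary and function names are made up for this example.

```python
# Per-core engine layout. The Xe2 numbers come from the article; the Xe-HPG
# numbers are inferred from "half as many engines, each twice as wide".
XE_HPG = {"xve_count": 16, "xve_bits": 256, "xmx_count": 16, "xmx_bits": 1024}
XE2 = {"xve_count": 8, "xve_bits": 512, "xmx_count": 8, "xmx_bits": 2048}


def total_bits(core, engine):
    """Aggregate datapath width in bits for one engine type in one core."""
    return core[f"{engine}_count"] * core[f"{engine}_bits"]


def fp32_lanes(core):
    """Number of 32-bit lanes across all vector engines in one core."""
    return total_bits(core, "xve") // 32


for name, core in (("Xe-HPG", XE_HPG), ("Xe2", XE2)):
    print(name, total_bits(core, "xve"), total_bits(core, "xmx"), fp32_lanes(core))
```

Both generations come out at the same aggregate width per core, and a single 512-bit XVE holds sixteen 32-bit lanes, which matches the native SIMD16 design mentioned above.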
Ray Tracing Unit Enhancements
The ray tracing units in the Xe2 core have been upgraded substantially. Throughput of the traversal pipelines and of box intersection tests (which check whether a ray hits a bounding box) is 1.5 times higher, while triangle intersection tests (which check whether a ray hits a polygon) and the BVH (Bounding Volume Hierarchy) cache have both doubled, greatly improving overall ray tracing performance.
The Arc B580 uses the BMG-G21 chip, which has five Render Slices that together contain 20 Xe2 cores, 20 ray tracing units, 160 XMX engines, 20 texture samplers, and 10 pixel backends, along with 18 MB of L2 cache, two MFX (multi-format codec) media engines, and a 192-bit memory controller.
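The chip-level totals follow directly from the per-slice and per-core figures given earlier. As a rough sketch, the arithmetic looks like this. The constant and function names are hypothetical, and the one-ray-tracing-unit-per-core ratio is an assumption inferred from the 20-to-20 count.

```python
# Figures taken from the article; the per-core RT unit count is inferred.
RENDER_SLICES = 5
XE2_CORES_PER_SLICE = 4
XVE_PER_CORE = 8
XMX_PER_CORE = 8
RT_UNITS_PER_CORE = 1  # assumption: 20 RT units spread across 20 cores


def chip_totals(slices=RENDER_SLICES):
    """Scale the per-core resources up to a whole chip."""
    cores = slices * XE2_CORES_PER_SLICE
    return {
        "xe2_cores": cores,
        "xve_engines": cores * XVE_PER_CORE,
        "xmx_engines": cores * XMX_PER_CORE,
        "rt_units": cores * RT_UNITS_PER_CORE,
    }


totals = chip_totals()
print(totals)
```

With five Render Slices this reproduces the 20 Xe2 cores and 160 XMX engines listed for the BMG-G21.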
Thanks to these architectural changes, per-core performance of Xe2 is 70% higher than that of its predecessor, and performance per watt is 50% higher. The data indicates that the Arc B580 needs 32.7% less execution time than the Arc A750, which corresponds to an overall performance increase of 48.6%.
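The step from a 32.7% shorter execution time to a 48.6% performance gain is not obvious at first glance: performance is the reciprocal of time, so the two percentages differ. A minimal sketch of the conversion, with a function name chosen only for this example:

```python
def speedup_from_time_reduction(reduction):
    """Convert a fractional cut in execution time into a speedup factor.

    If the new time is (1 - reduction) of the old time, throughput rises
    by a factor of 1 / (1 - reduction).
    """
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in the range [0, 1)")
    return 1.0 / (1.0 - reduction)


gain = speedup_from_time_reduction(0.327) - 1.0
print(f"Performance gain: {gain:.1%}")
```

Plugging in the 32.7% figure gives a gain of roughly 48.6%, so the two numbers quoted above are consistent with each other.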
XeSS 2: A Game Changer in Graphics Performance
XeSS 2 combines super-resolution, frame generation, and low latency, drawing on the compute power of the XMX AI engines in the Xe2 cores. It can apply three acceleration techniques at the same time: XeSS-SR for super-resolution, XeSS-FG for frame generation, and XeSS-LL for low latency. XeSS-SR renders the game at a lower resolution to raise the frame rate and then uses AI to reconstruct a higher-resolution final image. Intel claims that with a 1440p target resolution, XeSS can raise average frames per second by 47% compared with native rendering.
This iteration focuses on improving the XeSS-SR SDK (Software Development Kit), which now fully supports the major APIs DirectX 11, DirectX 12, and Vulkan, making broader game support easier. XeSS-SR is currently supported in more than 150 games. XeSS-FG, similar to frame generation in NVIDIA's DLSS 3, generates new frames from previous frames, motion vectors, and depth information, resulting in smoother visuals. In F1 24, for example, combining super-resolution with frame generation can raise FPS by up to 2.8 times, and up to 3.9 times in Ultra Performance mode.
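To get a feel for why rendering at a lower resolution helps, the sketch below computes the internal render resolution and the share of pixels actually shaded for a 1440p target. The scale factors used here (1.5 and 2.0) are purely illustrative assumptions, not Intel's official XeSS preset values, and the function names are made up for this example.

```python
def render_resolution(target_w, target_h, scale):
    """Internal render size when each axis is reduced by `scale`."""
    return round(target_w / scale), round(target_h / scale)


def pixel_fraction(scale):
    """Share of the target's pixels that are rendered before upscaling."""
    return 1.0 / (scale * scale)


TARGET = (2560, 1440)  # 1440p output resolution

# Hypothetical per-axis scale factors, chosen only to show the trend.
for scale in (1.5, 2.0):
    w, h = render_resolution(*TARGET, scale)
    print(f"scale {scale}: render {w}x{h}, {pixel_fraction(scale):.0%} of the pixels")
```

Because the pixel count falls with the square of the scale factor, even a modest reduction per axis removes a large part of the shading work, which is where the frame rate gain comes from.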
Low Latency Improvements
XeSS-LL is comparable to NVIDIA's Reflex technology: it optimizes the render queue so that frames reach the screen sooner, significantly reducing display latency. Again using F1 24 as an example, enabling XeSS-LL cuts latency by approximately 45% at 1440p. Furthermore, with XeSS 2 fully enabled (super-resolution, frame generation, and low latency together), latency is lower still than with XeSS-LL alone.
Broader Applications of XMX AI Engine
Beyond improving gaming performance, the XMX AI engine is designed to support a range of AI workflows, with broad framework, tool, and API support spanning model building, optimization, and execution. Intel's data shows the Arc B580 outperforming the GeForce RTX 4060 by 40% to 50% in large language model workloads.
Next-Gen Intel Graphics Software Features
The new Intel graphics software offers precise display and 3D graphics settings and also exposes performance tuning and overclocking controls through an intuitive graphical interface, making it easy for users to adjust their settings. According to theoretical performance figures released by Intel, the Arc B580's raster performance exceeds that of the GeForce RTX 4060 by 32%, and its ray tracing performance is 25% higher.
Upcoming Performance Metrics
As for real-world gaming performance, the Arc B580 is on average 24% faster than the Arc A750 and about 10% faster than the GeForce RTX 4060. Detailed performance results for the Intel Arc B580 will be published on December 12 at 22:00 Taiwan time, so stay tuned for our comprehensive review.