In computer graphics, high performance is a guiding principle. In the early days of personal computing, discrete, add-on graphics cards were mostly focused on specialized applications such as CAD/CAM and gaming. Even early on, there was a view that all of this graphics horsepower could be used for more: notably a better user interface and experience. One of the first graphics cards for a PC was called a “Windows Accelerator” from S3 Graphics, which focused on the user experience by moving windows around the screen faster. As graphics hardware evolved, so, too, did the methods that developers use to interact with that hardware.
DirectX is the part of Windows that provides a common application programming interface, or API, that allows developers to use the graphics hardware in the PC to draw text, shapes, and three-dimensional scenes, and display them on the screen. DirectX has also evolved over time in both capabilities and performance characteristics. In the early years, DirectX was focused mainly on games. As applications evolved to provide richer and more graphically intense user experiences, many of them started to use DirectX as a way to get better performance and richer visuals.
Enter Windows 8
When we started to plan the work we’d undertake for graphics in Windows 8, we knew that we would be creating a new, visually rich way for users to interact with apps and with Windows itself. We also knew that we’d be building a new platform for creating Metro style apps, and that we’d be targeting a more diverse set of hardware than ever before. While we had a great graphics platform to start with, there was more work to do in order to support those efforts. We came up with four main goals:
Ensure that all Metro style experiences are rendered smoothly and quickly.
Provide a hardware-accelerated platform for all Metro style apps.
Add new capabilities to DirectX to enable stunning visual experiences.
Support the widest diversity of graphics hardware ever.
While each of these focuses on a different aspect of building Windows 8, they all depend on great performance and capabilities from the graphics platform.
Planning for performance
Graphics performance on Windows depends on both the operating system and the hardware system, which is made up of the CPU, the GPU (graphics processing unit), and the associated display driver. To ensure that we could deliver a great experience for new Metro style apps, we needed to make sure that both the software platform and the hardware system would deliver great performance.
In the past we’ve used many different benchmarks and apps to measure the performance of DirectX. These have been largely focused on 3D games. While games are still very important, we knew that many of these existing ways to measure graphics performance did not tell us everything we needed to know for graphics-intensive, 2D, mainstream apps.
So we created new scenario-focused tests and metrics to track our progress. The metrics we use are as follows:
1. Frame rate
We express frame rate in frames per second (FPS). This metric is widely reported for gaming benchmarks, and is equally important for video content and other apps. When something is animating on the screen, a rate of 60 FPS makes the animation appear smooth. We target that rate because most computer screens refresh at 60 hertz. With that frame rate, Windows can provide very smooth animations with “stick to your finger” touch interactions.
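As a rough illustration of the metric, the sketch below computes an average frame rate from a list of frame presentation times. The function name and the sample capture are hypothetical and are not part of any Windows tooling; the only assumption is that a timestamp in seconds is available for each presented frame.

```python
def average_fps(timestamps):
    """Average frames per second over a list of frame presentation times (seconds)."""
    if len(timestamps) < 2:
        raise ValueError("need at least two frame timestamps")
    elapsed = timestamps[-1] - timestamps[0]
    if elapsed <= 0:
        raise ValueError("timestamps must be increasing")
    # N timestamps bound N - 1 frame intervals.
    return (len(timestamps) - 1) / elapsed


# Hypothetical capture: 121 frames presented exactly 1/60 s apart (two seconds of animation).
steady = [i / 60 for i in range(121)]
print(round(average_fps(steady), 1))  # prints 60.0
```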
2. Glitch count
While frame rate is an important metric, it doesn't tell the whole story. For example, running a benchmark for 10 minutes and getting 60 FPS on average sounds perfect, but it doesn’t tell us how low the frame rate might have dropped during the test. If the frame rate dips to 10 FPS momentarily during demanding parts, the animations will stutter. The glitch count metric is the total number of times that rendering a frame took more than 1/60 of a second, thus resulting in a reduced frame rate. It also looks at the number of consecutive frames missed. The goal here is to have no missed frames during animations.
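To make the idea concrete, here is a minimal sketch of a glitch counter. It assumes a list of per-frame render times in seconds is available; the function name, the small tolerance, and the sample trace are all hypothetical. Alongside the total count it reports the longest run of back-to-back missed frames, which is one plausible reading of the second measurement described above.

```python
FRAME_BUDGET = 1 / 60  # seconds available per frame at a 60 Hz refresh rate


def glitch_stats(frame_times, budget=FRAME_BUDGET, tolerance=1e-9):
    """Return (glitch_count, longest_run) for per-frame render times in seconds."""
    glitches = 0
    longest_run = 0
    current_run = 0
    for t in frame_times:
        if t > budget + tolerance:  # this frame missed its refresh deadline
            glitches += 1
            current_run += 1
            longest_run = max(longest_run, current_run)
        else:
            current_run = 0
    return glitches, longest_run


# Hypothetical trace: on-budget frames, a three-frame stall, then one isolated slow frame.
trace = [1 / 60] * 100 + [0.1] * 3 + [1 / 60] * 50 + [0.02] + [1 / 60] * 100
print(glitch_stats(trace))  # prints (4, 3)
```

In this trace only 4 of 254 frames are late, so an average frame rate would still look healthy even though the three-frame stall would be visible as a stutter.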
3. Time to first frame
Most people expect their apps to launch quickly, so initializing DirectX needs to be fast. “Time to first frame” tells us how much time it takes from the moment you tap or click to launch an app until you see the first frame of the app on the screen. To measure this, we created simple apps to help analyze and optimize the graphics system for the time it takes to initialize a graphics device, allocate the required memory, and so on. This helps us ensure that the work to set up DirectX takes very little time.
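The measurement can be sketched as a simple timing harness. Everything here is a stand-in: the step names, the `time_to_first_frame` function, and the `sleep` calls that simulate startup cost are hypothetical, and a real measurement would time actual device creation and memory allocation instead.

```python
import time


def time_to_first_frame(init_steps, present_first_frame, clock=time.perf_counter):
    """Return (total_seconds, per_step_seconds) from launch until the first frame is presented."""
    launch = clock()
    per_step = {}
    for name, step in init_steps:
        start = clock()
        step()
        per_step[name] = clock() - start
    present_first_frame()
    return clock() - launch, per_step


# Hypothetical stand-ins for real startup work; sleep() only simulates the cost.
steps = [
    ("create_device", lambda: time.sleep(0.01)),
    ("allocate_memory", lambda: time.sleep(0.005)),
]
total, breakdown = time_to_first_frame(steps, lambda: None)
print(f"time to first frame: {total * 1000:.1f} ms")
for name, seconds in breakdown.items():
    print(f"  {name}: {seconds * 1000:.1f} ms")
```

Recording a per-step breakdown as well as the total makes it easier to see which part of initialization is worth optimizing.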
4. Memory utilization
The more memory our graphics components use, the less memory is available for apps. By ensuring that most of the system’s memory is available for apps, you get the best app performance, and more apps can run at the same time. Apps use a mix of system memory and GPU memory. GPU memory is mostly used for rendering operations such as drawing images, geometric shapes, and text. Additionally, there are graphics operations that use the CPU and therefore use system memory.
In order to characterize memory utilization, we measure the memory used by the system for the following scenarios:
The app is idle. That is, it is not doing any work and is not rendering or displaying new information to the screen.
The app is displaying information to the screen. This represents the base memory cost of a simple drawing.
Texture creation. This represents the memory used for creating a large number of image objects on the GPU.
Vertex buffer creation. This represents the memory overhead of creating geometric shapes.
GPU data upload. This measures memory overhead involved in uploading data to the GPU.
Measuring memory usage across many types of apps and these various scenarios has helped us further optimize DirectX and the display drivers.
5. CPU utilization
Most graphics operations utilize the CPU in addition to the GPU. For example, when an app is figuring out what it’s going to draw, it typically does these calculations on the CPU. CPU utilization is important to understand because the higher the percentage of the CPU used by a task, the fewer cycles the CPU can devote to other tasks. For good graphics performance and overall system responsiveness, it is important to effectively balance work between the CPU and the GPU.
These benchmarks and metrics help us ensure that the experiences and apps are smooth and have great performance. They play a big role in our understanding of mainstream apps. Of course, we still utilize industry benchmarks, games, and other ways to measure our overall performance.
Hardware accelerating mainstream graphics
There are many ways to look at mainstream graphics. To ensure that our work would give users the right performance and the right experiences, we studied many examples of both Metro style and desktop apps to understand how they used the graphics hardware. In particular, Internet Explorer 9, Windows Live Mail, and Windows Live Messenger make excellent use of DirectX. Because these apps have done great work utilizing DirectX, they're good examples of what other apps might do. This led to a number of investments to ensure mainstream apps were fast and looked great.
Improving text performance
Text is by far the most frequently used graphical element in Windows, so improving text rendering performance goes a long way towards creating a better experience. Web pages, email programs, instant messaging, and other reading apps all benefit from high-quality and high-performance text display.
The Metro style design language is typographically rich, and a number of Metro style experiences are focused on providing an excellent reading experience. DirectWrite enables great typographic quality and super-fast processing of font data for rendering, and it provides industry-leading global text support. We’ve continued to improve text performance in Windows 8 by optimizing our default text rendering in Metro style apps to deliver better performance and efficiency, while maintaining typographic quality and global text support.
The bar chart below illustrates the performance improvements that result from this work. It includes measurements for the following text scenarios:
Rendering a screen full of reading-size text formatted as paragraphs as you would find in a web page or Word document
Rendering a screen full of small chunks of text at reading sizes as you would find in user interface controls such as button labels or menus
Rendering a screen full of small chunks of heading-sized text as you would see in titles and headings in Metro style apps and as headlines on blog posts and news articles on the web
The most noticeable performance improvement can be seen when scrolling through a long document on a touch screen. The reduction in time required to render the characters frees up CPU cycles to handle other tasks like processing high-frequency touch input, or displaying more complex document layouts.
Improving geometry rendering performance
Along with text, we also made dramatic performance improvements for 2D geometry rendering. Geometry rendering is the core graphics technology that is used to create things like tables, charts, graphs, diagrams, and user interface elements, as shown in the example below. For Windows 8, our improvements in this area have primarily focused on delivering high-performance implementations of HTML5 Canvas and SVG technologies for use in Metro style apps and in webpages viewed with Internet Explorer 10.
When Direct2D draws geometry, it takes instructions from the app about what to draw in the form of 2D figures (e.g. rectangles, ellipses, and paths), the size and location of the figures, and specifics about the style of rendering, including brush color and stroke style. Then it converts those instructions into a set of triangles and commands that it sends to Direct3D to generate the desired output. We call this conversion process tessellation.
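To give a feel for what tessellation produces, the sketch below converts two simple figures into triangles. It is an illustration only and does not reflect Direct2D's actual algorithm: the function names are hypothetical, the rectangle is split along a diagonal, and the ellipse is approximated by a fan of triangles around its center.

```python
import math


def tessellate_rectangle(x, y, width, height):
    """Split an axis-aligned rectangle into two triangles that share a diagonal."""
    a, b = (x, y), (x + width, y)
    c, d = (x + width, y + height), (x, y + height)
    return [(a, b, c), (a, c, d)]


def tessellate_ellipse(cx, cy, rx, ry, segments=32):
    """Approximate an ellipse with a fan of `segments` triangles around its center."""
    if segments < 3:
        raise ValueError("an ellipse needs at least 3 segments")
    center = (cx, cy)
    rim = [
        (cx + rx * math.cos(2 * math.pi * i / segments),
         cy + ry * math.sin(2 * math.pi * i / segments))
        for i in range(segments)
    ]
    return [(center, rim[i], rim[(i + 1) % segments]) for i in range(segments)]


def triangle_area(tri):
    """Area of a triangle given as three (x, y) points."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    return abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)) / 2


rect = tessellate_rectangle(0, 0, 4, 3)
ellipse = tessellate_ellipse(0, 0, 2, 1, segments=32)
print(len(rect), len(ellipse))  # prints 2 32
```

In this simplified model, raising `segments` makes the curved edge smoother but produces more triangles, so the work done on the CPU grows with the complexity of the figure.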
To improve geometry rendering performance in Windows 8, we focused on reducing the CPU cost associated with tessellation in two ways.
First, we optimized our implementation of tessellation when rendering simple geometries like rectangles, lines, rounded rectangles, and ellipses. Below is a chart showing the impact of these improvements.
Second, to improve performance when rendering irregular geometry (e.g. geographical borders on a map), we use a new graphics hardware feature called Target Independent Rasterization, or TIR.
TIR enables Direct2D to spend fewer CPU cycles on tessellation, so it can give drawing instructions to the GPU more quickly and efficiently, without sacrificing visual quality. TIR is available in new GPU hardware designed for Windows 8 that supports DirectX 11.1.
Below is a chart showing the performance improvement for rendering anti-aliased geometry from a variety of SVG files on a DirectX 11.1 GPU supporting TIR: