As someone who analyzes GPUs for a living, one of the more vexing things in my life has been NVIDIA’s Maxwell architecture. The company’s 28nm refresh offered a huge performance-per-watt increase for only a modest die size increase, essentially allowing NVIDIA to offer a full generation’s performance improvement without a corresponding manufacturing improvement. We’ve had architectural updates on the same node before, but never anything quite like Maxwell.

The vexing aspect to me has been that while NVIDIA shared some details about how they improved Maxwell’s efficiency over Kepler, they have never disclosed all of the major improvements under the hood. We know, for example, that Maxwell implemented a significantly altered SM structure that was easier to reach peak utilization on, and thanks to its partitioning wasted much less power on interconnects. We also know that NVIDIA significantly increased the L2 cache size and did a number of low-level (transistor level) optimizations to the design. But NVIDIA has also held back information – the technical advantages that are their secret sauce – so I’ve never had a complete picture of how Maxwell compares to Kepler.

For a while now, a number of people have suspected that one of the ingredients of that secret sauce was that NVIDIA had applied some mobile power efficiency technologies to Maxwell. It was, after all, their first mobile-first GPU architecture, and now we have some data to back that up. Friend of AnandTech and all-around tech guru David Kanter of Real World Tech has gone digging through Maxwell/Pascal, and in an article & video published this morning, he outlines how he has uncovered very convincing evidence that NVIDIA implemented a tile based rendering system with Maxwell.

In short, by playing around with some DirectX code specifically designed to look at triangle rasterization, he has come up with some solid evidence that NVIDIA’s handling of triangles has significantly changed since Kepler, and that their current method of triangle handling is consistent with a tile based renderer.

NVIDIA Maxwell Architecture Rasterization Tiling Pattern (Image Courtesy: Real World Tech)

Tile based rendering is something we’ve seen for some time in the mobile space, with both Imagination PowerVR and ARM Mali implementing it. The significance of tiling is that by splitting a scene up into tiles, tiles can be rasterized piece by piece by the GPU almost entirely on die, as opposed to the more memory (and power) intensive process of rasterizing the entire frame at once via immediate mode rendering. The trade-off with tiling, and why it’s a bit surprising to see it here, is that the PC legacy is immediate mode rendering, and this is still how most applications expect PC GPUs to work. So to implement tile based rasterization on Maxwell means that NVIDIA has found a practical means to overcome the drawbacks of the method and the potential compatibility issues.
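To make the tiling idea concrete, here is a minimal sketch (not NVIDIA's actual implementation, and all names and the tile size are illustrative assumptions): triangles are first binned into screen-space tiles by bounding box, so that each tile can later be rasterized against only the triangles that touch it, keeping the working set small enough to stay on die.

```python
# Illustrative tile-binning sketch. TILE is a hypothetical tile edge in pixels;
# real GPUs choose tile sizes to fit their on-chip buffers.
TILE = 16

def bounding_box(tri):
    """Axis-aligned screen-space bounds of a triangle given as [(x, y), ...]."""
    xs = [p[0] for p in tri]
    ys = [p[1] for p in tri]
    return min(xs), min(ys), max(xs), max(ys)

def bin_triangles(triangles, width, height):
    """Map each tile coordinate (tx, ty) to the triangle indices touching it."""
    bins = {}
    for i, tri in enumerate(triangles):
        x0, y0, x1, y1 = bounding_box(tri)
        # Walk every tile the (screen-clamped) bounding box overlaps.
        for ty in range(max(0, y0) // TILE, min(height - 1, y1) // TILE + 1):
            for tx in range(max(0, x0) // TILE, min(width - 1, x1) // TILE + 1):
                bins.setdefault((tx, ty), []).append(i)
    return bins

triangles = [
    [(2, 2), (12, 2), (2, 12)],      # fits entirely in tile (0, 0)
    [(10, 10), (40, 12), (12, 40)],  # spans several tiles
]
bins = bin_triangles(triangles, 64, 64)
```

After binning, a tiler rasterizes one tile at a time: only the short per-tile triangle lists and a tile-sized framebuffer need to be live at once, which is where the memory bandwidth (and power) savings come from.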

In any case, Real World Tech’s article goes into greater detail about what’s going on, so I won’t spoil it further. But with this information in hand, we now have a more complete picture of how Maxwell (and Pascal) work, and consequently how NVIDIA was able to improve over Kepler by so much. Finally, at this point in time Real World Tech believes that NVIDIA is the only PC GPU manufacturer to use tile based rasterization, which also helps to explain some of NVIDIA’s current advantages over Intel’s and AMD’s GPU architectures, and gives us an idea of what we may see them do in the future.

Source: Real World Tech


  • Yojimbo - Monday, August 1, 2016 - link

    NVIDIA has tile-based rendering patents from Gigapixel via 3dfx.
  • jabbadap - Monday, August 1, 2016 - link

    NVIDIA has lots of tile-based rendering patents. Maybe one of the most interesting is one of the newest:

    Terms are quite familiar for NVIDIA's modern architectures.
  • Yojimbo - Tuesday, August 2, 2016 - link

    Yes, and jumping through the citations and referenced-by links a bit, I see other patents mentioning tiling by NVIDIA, ARM, Qualcomm, Intel, Apple, and Imagination, but not AMD. AMD and NVIDIA have a patent cross-licensing deal, though. Not sure how that affects things.
  • Jedi2155 - Monday, August 1, 2016 - link

    I'm glad Nvidia has finally figured out how to make use of a decades old technology!

    Also kinda shows you how old I am xD
  • kn00tcn - Monday, August 1, 2016 - link

    i dunno, you could have still read anand at age 10, it's still a potentially large age range
  • wumpus - Wednesday, August 3, 2016 - link

    Oddly enough, before AMD bought ATI it seemed that all the "cool" tech that never went anywhere was bought up by ATI. And no, at least none of the "cool" tech I was interested in ever showed up in ATI or AMD products that I was aware of.

    Some examples: the "bicmos" high speed powerpc design, and the leading "mediaprocessor" company (the "mediaprocessors'" work would eventually be done by GPUs, but via a completely different design, and the patents likely didn't help).

    Now that I think about it, it is *possible* that the "bicmos" patents were used by *Intel* (who has access to AMD's patents) to come up with the Pentium 4, which is about the only time that whole bit actually helped AMD.
  • MobiusPizza - Tuesday, August 2, 2016 - link

    So what are the disadvantages of tile based rasterization? Does this affect the IQ? How good is Nvidia in mitigating the disadvantages? These are the questions that need answering

    People in the comments are all praising Nvidia for this efficiency, but until we know whether this change affects IQ, we can't say whether it's a benign optimization or blatant cheating.
  • Scali - Tuesday, August 2, 2016 - link

    "So what are the disadvantages of tile based rasterization? Does this affect the IQ? How good is Nvidia in mitigating the disadvantages? These are the questions that need answering"

    I think they were already answered by the fact that NVidia has been doing this since Maxwell v2 at least, so the hardware had been on the market for about 2 years before people found out that NVidia renders in a slightly unconventional way.
    If it affected IQ or had any other disadvantages, people would long have noticed.

    Aside from that, you can figure out for yourself that IQ is not affected at all. The order in which (parts of) triangles are rasterized does not affect the end result in a z-buffered renderer. If it did, the depth-sorting problems would be immediately apparent.
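    Scali's z-buffer argument can be sketched in a few lines (a toy model, not any GPU's actual pipeline): feed the same fragments to a standard depth test in two different orders and the final image comes out identical, which is why a tiled traversal order can't change IQ.

    ```python
    # Toy z-buffer: the nearest fragment per pixel wins, regardless of arrival order.
    def shade(fragments, size):
        """fragments: list of (pixel_index, depth, color). Returns final colors."""
        depth = [float("inf")] * size
        color = [0] * size
        for px, z, c in fragments:
            if z < depth[px]:  # standard depth test: keep the nearest fragment
                depth[px] = z
                color[px] = c
        return color

    frags = [(0, 0.5, 1), (0, 0.2, 2), (1, 0.9, 3)]
    front_to_back = shade(sorted(frags, key=lambda f: f[1]), size=2)
    back_to_front = shade(sorted(frags, key=lambda f: -f[1]), size=2)
    # Both orders converge on the same image.
    ```

    (The exception, as in any renderer, is order-dependent blending such as transparency, which the hardware has to handle in submission order either way.)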
  • owan - Tuesday, August 2, 2016 - link

    Your questions are basically exactly the same ones I was asking myself after reading this article, as no mention of the actual disadvantages is made anywhere. I'm curious as to what hurdles they actually had to overcome to implement this and don't feel like watching a 20 minute video when 30s of text reading would have sufficed.

    The second part of your reply is a complete non-sequitur though. If it works it works. Nobody has commented on IQ issues or anything related to it and the cards have been on the market for years. "Blatant cheating"? Why are you even bringing that up? If the end result is the same (or better) quality than a competitive method, why would you even think to call it cheating? Because it's faster than the other company's method? This isn't a sport, there aren't "rules" in the same way as long as it works. If the ball goes in the net I don't give a crap if you throw it or kick it.
  • Wolfpup - Tuesday, August 2, 2016 - link

    Wow, thanks for this article! It's been a while since I've been to realworldtech (I just don't visit websites like I did 10-20 years ago).

    Really interesting...maybe tile based rendering is winning after all, after dying off on PC nearly two decades ago!
