For the past several days I've been playing around with Futuremark's new 3DMark for Android, as well as Kishonti's GL and DXBenchmark 2.7. All of these tests are scheduled to be available on Android, iOS, Windows RT and Windows 8 - giving us the beginning of a very wonderful thing: a set of benchmarks that allow us to roughly compare mobile hardware across (virtually) all OSes. The computing world is headed for convergence in a major way, and with benchmarks like these we'll be able to better track everyone's progress as the high performance folks go low power, and the low power folks aim for higher performance.

The previous two articles I did on the topic were really focused on comparing smartphones to smartphones, and tablets to tablets. What we've been lacking however has been perspective. On the CPU side we've known how fast Atom was for quite a while. Back in 2008 I concluded that a 1.6GHz single core Atom processor delivered performance similar to that of a 1.2GHz Pentium M, or a mainstream Centrino notebook from 2003. Higher clock speeds and a second core would likely push that performance forward by another year or two at most. Given that most of the ARM based CPU competitors tend to be a bit slower than Atom, you could estimate that any of the current crop of smartphones delivers CPU performance somewhere in the range of a notebook from 2003 - 2005. Not bad. But what about graphics performance?

To find out, I went through my parts closet in search of GPUs from a similar time period. I needed hardware that supported PCIe (to make testbed construction easier), and I needed GPUs that supported DirectX 9, which had me starting at 2004. I don't always keep everything I've ever tested, but I try to keep parts of potential value to future comparisons. Rest assured that back in 2004 - 2007, I didn't think I'd be using these GPUs to put smartphone performance in perspective.

Here's what I dug up:

The Lineup (Configurations as Tested)
GPU | Release Year | Pixel Shaders | Vertex Shaders | Core Clock | Memory Data Rate | Memory Bus Width | Memory Size
NVIDIA GeForce 8500 GT | 2007 | 16 (unified) | (unified, shared) | 520MHz (1040MHz shader clock) | 1.4GHz | 128-bit | 256MB DDR3
NVIDIA GeForce 7900 GTX | 2006 | 24 | 8 | 650MHz | 1.6GHz | 256-bit | 512MB DDR3
NVIDIA GeForce 7900 GS | 2006 | 20 | 7 | 480MHz | 1.4GHz | 256-bit | 256MB DDR3
NVIDIA GeForce 7800 GT | 2005 | 20 | 7 | 400MHz | 1GHz | 256-bit | 256MB DDR3
NVIDIA GeForce 6600 | 2004 | 8 | 3 | 300MHz | 500MHz | 128-bit | 256MB DDR

I wanted to toss in a GeForce 6600 GT, given just how awesome that card was back in 2004, but alas I had cleared out my old stock of PCIe 6600 GTs long ago. I had an AGP 6600 GT but that would ruin my ability to keep CPU performance in-line with Surface Pro, so I had to resort to a vanilla GeForce 6600. Both core clock and memory bandwidth suffered as a result, with the latter being cut in half from using slower DDR. The core clock on the base 6600 was only 300MHz compared to 500MHz for the GT. What does make the vanilla GeForce 6600 very interesting however is that it delivered similar performance to a very famous card: the Radeon 9700 Pro (chip codename: R300). The Radeon 9700 Pro also had 8 pixel pipes, but 4 vertex shader units, and ran at 325MHz. The 9700 Pro did have substantially higher memory bandwidth, but given the bandwidth-limited target market of our only cross-platform benchmarks we won't always see tons of memory bandwidth put to good use here.
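To put numbers on the bandwidth gap, here is a rough sketch that derives peak theoretical memory bandwidth from the data rate and bus width listed in the table (data rate x bus width / 8 bits per byte). The function and dictionary names are mine, not from any benchmark, and the figures are theoretical peaks rather than measured throughput.

```python
# Peak theoretical memory bandwidth from the lineup table.
# Names here (CARDS, peak_bandwidth_gbps) are illustrative only.

# (effective memory data rate in MHz, bus width in bits), as listed in the table
CARDS = {
    "GeForce 8500 GT": (1400, 128),
    "GeForce 7900 GTX": (1600, 256),
    "GeForce 7900 GS": (1400, 256),
    "GeForce 7800 GT": (1000, 256),
    "GeForce 6600": (500, 128),
}


def peak_bandwidth_gbps(data_rate_mhz, bus_width_bits):
    """Return peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return data_rate_mhz * 1e6 * bytes_per_transfer / 1e9


for name, (rate, width) in CARDS.items():
    print(f"{name}: {peak_bandwidth_gbps(rate, width):.1f} GB/s")
```

By this math the vanilla 6600 tops out at 8GB/s. A card with double the data rate on the same 128-bit bus, which is what "cut in half" implies for the 6600 GT, would land at 16GB/s.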

The 7800 GT and 7900 GS/GTX were included to showcase the impacts of scaling up compute units and memory bandwidth, as the architectures aren't fundamentally all that different from the GeForce 6600 - they're just bigger and better. The 7800 GT in particular was exciting as it delivered performance competitive with the previous generation GeForce 6800 Ultra, but at a more attractive price point. Given that the 6800 Ultra was cream of the crop in 2004, the performance of the competitive 7800 GT will be important to look at.
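As a crude way to see how much "bigger" these parts are on paper, the sketch below multiplies pixel shader count by core clock and normalizes to the GeForce 6600. This is only a proxy I'm using for illustration: it ignores ROP counts, per-unit efficiency differences, and memory bandwidth, so treat it as a rough scaling guide and not a performance prediction. The 8500 GT is left out because its unified shaders run on a separate shader clock.

```python
# Crude on-paper scaling: pixel shader units x core clock, relative to the 6600.
# LINEUP values come from the table; the metric itself is an illustrative proxy.

# (pixel shader units, core clock in MHz)
LINEUP = {
    "GeForce 7900 GTX": (24, 650),
    "GeForce 7900 GS": (20, 480),
    "GeForce 7800 GT": (20, 400),
    "GeForce 6600": (8, 300),
}


def pixel_shader_rate(units, clock_mhz):
    """Units x MHz: a unitless proxy for peak pixel shading throughput."""
    return units * clock_mhz


BASELINE = pixel_shader_rate(*LINEUP["GeForce 6600"])


def relative_to_baseline(name):
    """How many times the 6600's on-paper pixel shading rate a card offers."""
    return pixel_shader_rate(*LINEUP[name]) / BASELINE


for name in LINEUP:
    print(f"{name}: {relative_to_baseline(name):.2f}x the GeForce 6600")
```

On paper the 7900 GTX has 6.5x the pixel shading resources of the 6600, while the 7800 GT sits at roughly 3.3x.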

Finally we have a mainstream part from NVIDIA's G8x family: the GeForce 8500 GT. Prior to G80 and its derivatives, NVIDIA used dedicated pixel and vertex shader hardware - similar to what it does today with its ultra mobile GPUs (Tegra 2 - 4). Starting with G80 (and eventually trickling down to G86, the basis of the 8500 GT), NVIDIA embraced a unified shader architecture with a single set of execution resources that could be used to run pixel or vertex shader programs. NVIDIA will make a similar transition in its Tegra lineup with Logan in 2014. The 8500 GT won't outperform the 7900 GTX in most gaming workloads, but it does give us a look at how NVIDIA's unified architecture deals with our two cross-platform benchmarks. Remember that both 3DMark and GL/DXBenchmark 2.7 were designed (mostly) to run on modern hardware. Although hardly modern, the 8500 GT does look a lot more like today's architectures than the G70 based cards.

You'll notice a distinct lack of ATI video cards here - that's not from a lack of trying. I dusted off an old X800 GT and an X1650 Pro, neither of which would complete the first graphics test in 3DMark or DXBenchmark's T-Rex HD test. Drivers seem to be at fault here. ATI dropped support for DX9-only GPUs long ago, and the latest Catalyst available for these cards (10.2) was put out well before either benchmark was conceived. Unfortunately I don't have any AMD based ultraportables, but I did grab the old Brazos E-350. As a reminder, the E-350 was a 40nm APU that used two Bobcat cores and featured 80 GPU cores (Radeon HD 6310). While we won't see the E-350 in a tablet, a faster member of its lineage will find its way into tablets beginning this year.


View All Comments

  • lmcd - Thursday, April 4, 2013 - link

    if the C-50 is about equal to the Z-2760, + 15% IPC + die shrink suddenly AMD is in this, so I'd say you didn't factor it in very well then...
  • whyso - Friday, April 5, 2013 - link

    The ASUS VivoTab Smart, which uses the Z2760, is pretty much rock bottom in these tests.
  • kyuu - Thursday, April 4, 2013 - link

    No, it's not. The E-350 is actually still stronger than the A15s. The only reason those ARM SoCs get a better score in the physics test is because they are quad-core compared to the E-350's two cores. Also, as ET stated below, performance doesn't scale down linearly with power, so a Bobcat at, say, a 75% lower TDP isn't going to have 75% lower performance. Heck, you can look at 3DMark results for the C-60 to see that it still outperforms the ARM SoCs at a lower TDP (assuming nothing fishy is going on with those results). The Z-60, with a 4.5W TDP, should still have comparable performance to the C-60.

    Plus, Bobcat is a couple years old. When AMD (finally) gets Jaguar out sometime this year, it should handily beat any A15-based SoC.

    Finally, if this shows that AMD is "bad", then it would show Intel as "absolutely pathetic".
  • Wilco1 - Thursday, April 4, 2013 - link

    E-350 certainly looks good indeed, however I would expect that A15 and E-350 will be close in most single threaded benchmarks, especially considering the latest SoCs will have higher frequencies (eg. Tegra 4 is 1.9GHz). On multi-threaded E-350 will lose out against any quad-core as the physics results show.

    A Z-60 at 1GHz will be considerably slower than an E-350 at 1.6GHz, so will lose out against pretty much all A15 SoCs. How well Jaguar will do obviously depends on the IPC improvement and clock speeds. But unless it is quad core, it certainly won't win all benchmarks.
  • Spunjji - Friday, April 5, 2013 - link

    Kabini uses 4 Jaguar cores, Temash uses a pair of them. 15% IPC improvement combined with a clock speed bump from the die shrink should see it easily reaching competitive levels with the corresponding A15-based SoCs.
  • milli - Thursday, April 4, 2013 - link

    You're looking at it the wrong way. The E-350 is a desktop/netbook cpu. If you want to make a direct comparison, you should compare it to the C-60. That one is 9W (compared to an estimated 5W for the A6X). It will still beat the A6X on CPU performance and stay close enough in 3D. It's still produced on 40nm (compared to A6's 32nm) and has a much smaller die (75mm² vs 123mm²). AMD 64-bit memory interface, A6X 128-bit. If you look at it that way, AMD isn't doing too bad.
    Jaguar/GCN based Temash will be able to get C-60 performance or better under 4W.
  • lmcd - Thursday, April 4, 2013 - link

    The die size is what really gets me. AMD should be able to push Temash everywhere if they hit the die shrink advantages, push the GPU size up a bit and Jaguar core delivers.
  • jabber - Thursday, April 4, 2013 - link

    Interesting article. I'm always intrigued to know how far we've come.

    I think what would be really handy is an annual "How things have progressed!" round up at the end of the year.

    This would entail picking up all the generations past and present top of the range flagship cards (excluding the custom limited edition bizzaro models) and doing a range of core benchmarks. You could go back as far as the first PCI-e cards (2004 ish).

    Up until a few years ago I was still running a 7900GTX but really have no clue as to how much better current cards are in real hard facts and figures.

    Would be good to see how far we have been progressing over the past 10 years.
  • marc1000 - Thursday, April 4, 2013 - link

    I too would like to see an actual gaming comparison on GPUs from 2004 to now. Even if just on a small number of games, and with limited IQ settings.

    Something like testing 3 games: 1 real light, 1 console port, and 1 current and heavy. All benchmarked at medium settings, on 1280x720 or 1680x1050 resolution. no need to scale IQ, AA settings or triple-monitor situations.

    (pick just DX9 games if needed, most people who are NOT tech enthusiasts can't tell the difference between DX9 and DX11 anyway.)
  • dj christian - Thursday, April 4, 2013 - link

    +1 to that!
