I'm concerned that AMD's latest Performance Rating claims originated solely in the marketing department rather than having any basis in engineering.
As background, here are the actual clock speeds of various Athlons:
Code:
Model   L2 cache  FSB     Clock    Multiplier  Clock as % of rating
XP1600  256kB     133MHz  1400MHz  10.5        87.5
XP1700  256kB     133MHz  1467MHz  11.0        86.3
XP1800  256kB     133MHz  1533MHz  11.5        85.2
XP1900  256kB     133MHz  1600MHz  12.0        84.2
XP2000  256kB     133MHz  1667MHz  12.5        83.4
XP2100  256kB     133MHz  1733MHz  13.0        82.5
XP2200  256kB     133MHz  1800MHz  13.5        81.8
XP2400  256kB     133MHz  2000MHz  15.0        83.3
XP2500  512kB     166MHz  1833MHz  11.0        73.3
XP2600  256kB     133MHz  2133MHz  16.0        82.0
XP2600  256kB     166MHz  2083MHz  12.5        80.1
XP2700  256kB     166MHz  2167MHz  13.0        80.3
XP2800  256kB     166MHz  2250MHz  13.5        80.4
XP2800  512kB     166MHz  2083MHz  12.5        74.4
XP3000  512kB     166MHz  2167MHz  13.0        72.2
XP3200  512kB     200MHz  2200MHz  11.0        68.8
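For anyone who wants to check my sums, here is a rough Python sketch that re-derives the last two columns for a handful of rows. The function names are my own, and I'm assuming the nominal 133 and 166MHz buses really run at 133.33 and 166.67MHz.

```python
# Rows copied from the table: (model, rating, nominal FSB MHz, clock MHz).
ROWS = [
    ("XP1600", 1600, 133, 1400),
    ("XP2400", 2400, 133, 2000),
    ("XP2500", 2500, 166, 1833),
    ("XP3000", 3000, 166, 2167),
    ("XP3200", 3200, 200, 2200),
]

# Assumption: the nominal 133 and 166MHz buses are really 133.33 and 166.67MHz.
TRUE_FSB_MHZ = {133: 400 / 3, 166: 500 / 3, 200: 200.0}


def multiplier(clock_mhz, fsb_mhz):
    """Core clock divided by bus clock, snapped to the nearest half step."""
    return round(2 * clock_mhz / TRUE_FSB_MHZ[fsb_mhz]) / 2


def clock_pct_of_rating(rating, clock_mhz):
    """Actual clock as a percentage of the model number."""
    return 100 * clock_mhz / rating


for model, rating, fsb, clock in ROWS:
    pct = clock_pct_of_rating(rating, clock)
    print(f"{model}: x{multiplier(clock, fsb):.1f}, "
          f"{pct:.1f}% of rating, {100 - pct:.1f}% handicap")
```

The "handicap" it prints is simply 100 minus the last column, i.e. how far the real clock sits below the model number.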
The "QuantiSpeed" rating system is not linear, nor does it have an origin of zero. The last column of the table shows the actual clock speed as a percentage of the model number: the XP1600 was clocked a mere 12.5% slower than the equivalent P4, whereas the XP3200 cops a whopping 31.2% handicap.
Huh?
I can see that as the clock speed of each generation of P4 rises, the IPC drops as the CPU starts to run out of memory bandwidth. So you could argue that it's easier for the Athlon to keep up. The problem is, the Athlon suffers from exactly the same effect! It may not have the bandwidth requirements of the P4 (although Northwood's larger cache must have reduced that slightly anyway), but higher clock speeds still need correspondingly higher bandwidth.
I'm going to propose a rule of thumb: Athlon performance is significantly impaired when the multiplier exceeds 12. I suggest that 20 could be the equivalent number for Pentium 4. Conversely, decreasing the multiplier is unlikely to have a noticeable effect beyond a certain point.
We already know that the 1GHz Pentium III didn't appreciably benefit from increasing memory bandwidth beyond the theoretical 1066MB/s of PC133, whereas the 1GHz Athlon did, suggesting it preferred a multiplier of 7 or less with SDRAM. So, the Athlon 2600 (256kB Thoroughbred), with 82% of the P4 clock speed, may have the additional impediment of at least 20% less bandwidth than it needs. Not good. It's now quite some time since people discovered that you could significantly improve Athlon performance by increasing the FSB from 133 to 166MHz.
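To put a number on the "20% less bandwidth" idea, here is a minimal sketch. It assumes (my simplification, not anything measured) that the bandwidth a core needs scales linearly with its clock, and that a multiplier of 12 is the point where the bus just keeps up. The function name is hypothetical.

```python
def bandwidth_shortfall(multiplier, comfortable_multiplier=12.0):
    """Fraction of the needed bandwidth that is missing.

    Assumes demand scales with core clock and that the bus is exactly
    adequate at comfortable_multiplier (the rule-of-thumb limit of 12).
    """
    if multiplier <= comfortable_multiplier:
        return 0.0
    return 1.0 - comfortable_multiplier / multiplier


for m in (11.0, 12.5, 13.5, 15.0):
    print(f"x{m}: {100 * bandwidth_shortfall(m):.0f}% short")
```

On that assumption, any Athlon running a multiplier of 15 or more is at least 20% short, which is where the figure for the fastest 133MHz FSB parts comes from.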
Of course, this is how AMD's marketing department has managed to prop up their outrageous claims for Barton CPUs: they run faster than the strangulated FSB models that they replace. They're ignoring the fact that the previous models were already underperforming. Intel, on the other hand, has simply introduced faster FSB as needed to support faster clock speeds. I'm betting the latest '800FSB' (200MHz QDR) is designed to carry them through to 4GHz, with a tangible benefit for CPUs from 2.8GHz up.
If you look at the table, the CPUs on which many of us based our perception that Athlons were fast do not have either problem. The XP1600 and XP1700 have tons of bandwidth and run nearly as fast as the equivalent P4 anyway. I put it to you that these were in fact under-rated. Here's another rule of thumb: given an adequate FSB, an Athlon Palomino/Thoroughbred doesn't need any more than 5/6 of the clock speed of a Pentium 4 to be broadly competitive (faster in some things, slower in others). Or put another way, the performance rating should be at least 20% higher than the clock speed.
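The 5/6 rule is easy to turn into a quick calculation. This is only a sketch of my rule of thumb, not AMD's method, and the names are made up for the example.

```python
from fractions import Fraction

# Rule of thumb: a Palomino/Thoroughbred needs 5/6 of the P4's clock.
PALOMINO_TBRED_RATIO = Fraction(5, 6)


def p4_equivalent_mhz(athlon_clock_mhz, ratio=PALOMINO_TBRED_RATIO):
    """P4 clock that an Athlon at athlon_clock_mhz should roughly match."""
    return float(athlon_clock_mhz / ratio)


# (model, rating, clock MHz) from the table.
for model, rating, clock in [("XP1600", 1600, 1400),
                             ("XP1700", 1700, 1467),
                             ("XP2000", 2000, 1667)]:
    equiv = p4_equivalent_mhz(clock)
    print(f"{model}: roughly a P4 at {equiv:.0f}MHz, "
          f"{equiv - rating:+.0f} against the rating")
```

A positive difference means the rule of thumb puts the chip above its model number, which is the sense in which I'm calling the early XPs under-rated.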
Things are probably still okay at XP2000, but going off the rails at XP2400 (of course, a P4 with 100MHz FSB will be even worse). By the time we move up to the XP2800 Thoroughbred, it's hard to see how there could be much less than a 5% variance between AMD's PR and what we actually see (referenced to the slower Athlons). That would make it more comparable to a Pentium 4 2667 (although I guess one of these is on the edge with the 533FSB, and the Athlon would still be better for many applications).
Then along comes Barton. Based on the tests at Aces Hardware, applications that benefit seem to average a boost of about 9% from the bigger 512kB cache. AMD reckons it needs 8% less clock speed for the Barton version of the XP2800, so let's round it off at 10%. That would change the rule of thumb from 83% (5/6) to 75% (3/4).
So if they were both run at a high enough FSB, I'd theorize that the Barton 2500 is about the same as the Thoroughbred 2400 (1833 vs 2000MHz). In practice, the Barton multiplier is a trim 11 compared to a stodgy 15 for its older cousin. You'd expect it to be faster, and it is. But I suspect it's not quite comparable to a P4 2533. After all, the Pentium 4 already whipped the old Athlon in apps that are dependent mainly on raw clock speed. The bigger L2 cache may not help here at all.
The Barton XP2800 looks a little better except for its slight bandwidth restriction.
Applying the 75% rule to the XP3000 places it as almost a 2900. Indeed, Aces' conclusion was that XP2900 would have been a more realistic rating. It also breaks the multiplier limit of 12, so perhaps it should best be compared with a P4 2800 with 800FSB?
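Here is the arithmetic behind the 75% rule and the Barton comparisons above, as a small sketch. The 10% cache allowance is my rounded-up figure, and the variable names are just for illustration.

```python
from fractions import Fraction

tbred_ratio = Fraction(5, 6)      # Palomino/Thoroughbred rule of thumb
barton_gain = Fraction(1, 10)     # assumed 10% allowance for the 512kB cache
barton_ratio = tbred_ratio * (1 - barton_gain)

print(f"Barton ratio: {barton_ratio} = {float(barton_ratio):.0%}")

# Thoroughbred XP2400 at 2000MHz under the 5/6 rule.
print(f"Thoroughbred 2000MHz -> {float(2000 / tbred_ratio):.0f}")

# Barton clocks from the table under the 3/4 rule.
for model, clock in [("XP2500", 1833), ("XP2800", 2083), ("XP3000", 2167)]:
    print(f"Barton {model} {clock}MHz -> {float(clock / barton_ratio):.0f}")
```

Using exact fractions makes it obvious that 5/6 less 10% lands exactly on 3/4 rather than somewhere near it.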
Then there's the XP3200. It's actually identical to the XP2500 except for the FSB: same multiplier of 11. The clock speed rises 20% (1833 to 2200MHz) but the rating rises 28% (2500 to 3200), so how the hell did it pick up a further 7% performance?
Based on AMD's claimed rating for the XP2500, the XP3200 should be called an XP3000! Comparing it to the XP2800 (and allowing for slightly less bandwidth restriction) brings the same conclusion. The 75% rule of thumb suggests XP2900+. When comparing apples to oranges, or Athlon to Pentium 4 800FSB, I reckon this might be a better bet.
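The XP3200 numbers can be checked the same way. This sketch scales the XP2500's rating by clock speed alone (ignoring any benefit from the faster FSB, which is exactly the point in dispute) and then applies the 75% rule. The helper name is my own.

```python
def scaled_rating(base_rating, base_clock_mhz, clock_mhz):
    """Rating a chip would get if ratings scaled linearly with clock."""
    return base_rating * clock_mhz / base_clock_mhz


# XP2500: rated 2500 at 1833MHz. XP3200: rated 3200 at 2200MHz.
from_xp2500 = scaled_rating(2500, 1833, 2200)
unexplained = 3200 / from_xp2500 - 1
from_75_rule = 2200 / 0.75

print(f"Scaled from XP2500: {from_xp2500:.0f}")
print(f"Rating beyond clock scaling: {unexplained:.1%}")
print(f"75% rule: {from_75_rule:.0f}")
```

The gap between the scaled figure and 3200 is the extra performance AMD is implicitly crediting to the 200MHz FSB.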
Here endeth the treatise. Thank you for reading this far. Now you can throw the brickbats!