CougTek said:
I didn't know Barton was rumored to be cancelled. Thanks for bringing this info.
Well, I've seen this tidbit more than once on the web, and it simply makes sense given AMD's current roadmap : if T-Bred is to ship in June and AMD swears CHammer will both ship this year (i.e. December? I don't suppose it'll be any earlier than that) and be positioned for the desktop (as opposed to servers - it's not even MP-capable), I hardly see any meaningful role Barton could play. Either it'll cut T-Bred's useful life to a mere 3-4 months, which is laughable for a core, or it'll interfere with CHammer's introduction, or both...
I don't think the problem with the Athlon's L2 cache is that it's slow.
Now, this is a matter of semantics. What is "slow" anyway? Is a cache with 4.6ns latency "slow"? And what about 20ns, is that "slow"? There's no universal measure you can tie speed to; speeds can only be compared relative to one another to obtain qualifiers such as "slower" or "faster", and as far as CPU cache speeds go - relative to the respective CPUs. For a 100MHz CPU, a cache with 20ns latency is blazing fast at just 2 clock cycles of latency, but for a 1.5GHz CPU it's a snail at 30 cycles. It gets worse, though: if you compare those latencies in terms of their respective clock cycles, you'll realize that a 4.6ns cache is "slower" for a 1.5GHz CPU than the 20ns cache is for the 100MHz CPU, since 7 cycles is over three times the latency of 2 cycles.
Now, to the situation at hand : Athlon XP's cache vs P4's vs PIII's. You say that you don't see the Athlon XP's L2 cache problem in the fact that it's slow, but it's a given that the faster a CPU's caches are, the faster it works. Now let's compare :
- P4's L1 cache is indeed blazing fast - just 2 cycles of latency. But there's a price to pay: it's only 8kB large.
- P4's L2 cache has an access latency of 7 cycles (i.e. 9 cycles load-to-use after an L1 miss) and it's 512kB large.
- P4 Xeon's L3 cache has a latency of 14 cycles (a total of about 23 cycles after L1 & L2 misses) and is either 512kB or 1MB in size.
- PIII's L1 cache has a latency of 3 cycles.
- PIII's L2 cache has an access latency of 4 cycles (for a total of 7 cycles after an L1 miss).
- Athlon XP's L1 cache has a latency of 3 cycles, and it's an enormous 128kB (split into a 64kB I-cache and a 64kB D-cache!).
- Athlon XP's L2 cache is 256kB in size. And here's where things get tricky - it has a total latency (L1 miss + L2 hit) of 11 cycles.
Sort of. Kind of. Almost. Well, it does, except its latency is 20-21 cycles.
What???
How can it be both 11 and 20-21 cycles at the same time, you ask? Quite easy - it's either 11 or 20-21 cycles, depending on the circumstances. Let me explain. The Athlon XP has an exclusive cache architecture, i.e. L2 acts as a copy-back buffer for L1 once data must be evicted from L1. To save time on this copy-back, L1 has a victim buffer: if data must be evicted, it can be copied into this victim buffer at the same time as the new data is fetched from L2 (if there was an L2 hit at all!), and then copied back from the victim buffer into L2 for later propagation to memory. Sounds brilliant, no? Just 3 cycles for the L1 miss + 8 cycles for the L2 hit (while also copying stuff into the victim buffer) = 11 cycles.
But there's surely a catch : the victim buffer is only 8x64 bytes (8 cache lines) large. And if the victim buffer happens to be full, you get the full cache latency : 3 cycles for the L1 miss + 8 cycles to copy the victim buffer's contents into L2 + a 2-cycle L2 turnaround + 8 cycles to get the first data word out of L2 = 21 cycles. So why the heck was I mentioning 20-21 cycles? Because on the web I keep seeing the 20-cycle figure, while my own math - as given above, using the figures in AMD's manuals - gives 21 cycles. Cachemem also indicates 20 cycles, so either there's one cycle AMD managed to somehow shave off since their docs were published, or the docs were off by one cycle (perhaps in the turnaround?). Either way that's not critical: both 20 and 21 cycles are more than twice the P4's L2 latency and come within 2-3 cycles of the Xeon's L3 latency.
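(To lay that arithmetic out in one place, here's a little Python sketch of the two cases - the cycle counts are the ones quoted above from AMD's docs, the split into named steps is my own bookkeeping:)

[code]
# Athlon XP L2 load-to-use latency, per the cycle counts discussed above.
L1_MISS    = 3  # cycles to detect the L1 miss
L2_HIT     = 8  # cycles to get the first data word from L2
WRITEBACK  = 8  # cycles to drain the victim buffer's contents into L2
TURNAROUND = 2  # cycles of L2 bus turnaround

def l2_latency(victim_buffer_full):
    if not victim_buffer_full:
        # Eviction goes into the victim buffer in parallel with the L2 fetch.
        return L1_MISS + L2_HIT                       # = 11 cycles
    # Buffer full: drain it into L2 first, turn the bus around, then fetch.
    return L1_MISS + WRITEBACK + TURNAROUND + L2_HIT  # = 21 cycles

print(l2_latency(False), l2_latency(True))  # 11 21
[/code]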
Now comes the question - how often does the victim buffer get filled (i.e. "how often do 11 cycles turn into 20")? The rule of thumb suggests that when there's an oddball struct that must be accessed in memory, the latency will be 11 cycles as advertised. But on large streaming accesses (as with various multimedia loads etc., I suspect) you're down to the good old 20 cycles, just like the first Athlon models...
So, yes, I maintain that compared to the 256kB of 7-cycle L2 cache on the PIII and the 512kB of 9-cycle L2 cache on the P4, the 11/20-cycle L2 of the Athlon XP is relatively slow. And relating it to other caches is the only way I know of deciding whether a particular cache is "fast" or "slow"...
It is fast, but it is too narrow (accessed only 64 bits at a time, compared to Intel's 256-bit bus). I agree that enlarging the bus to the L2 cache should be a high priority for AMD in their future processors.
While widening the bus to L2 certainly wouldn't hurt the Athlon's performance (provided L2's latency didn't get any higher!), AMD keeps maintaining that thanks to the large L1 the performance benefits would be negligible. And on this one I'm rather inclined to believe them. An increase in the size of the Athlon's L2 and/or bringing its latency down to the PIII's, or at least the P4's (and making it a uniform latency!!!), would be very nice though...
<sigh, as the chances are slim, at least until CHammer - IF even then>
AMD knows that it will need ClawHammer ASAP, especially if they have to rely solely on Thoroughbred until their 64-bit CPU arrives.
Knowing you're in deep doo-doo doesn't necessarily mean you have a way out. CHammer was supposed to be out late last year or early this year at the latest, yet AMD will be lucky to push it out the door by the end of the year. I can't blame them for their pace of advancement - core development and validation are a PITA - but I hate overly optimistic statements... I do wish them luck, though, as I'd hate it if we were back to the years of Intel being the performance leader with everybody else playing catch-up in the low end...
Compared to the situation prior to the introduction of the first Athlon processor and the need to make an all-new chipset to support the EV6 bus, there seems to be a lot less improvisation now with ClawHammer than back then.
I beg to differ. CHammer won't be using the EV6 bus, so VIA can forget about farting out another KT266 re-touch. The good thing is that long gone will be the days of relying on chipset manufacturers to come up with fast memory controllers. The bad thing is that AMD will still have to rely on other manufacturers to provide high-performing AGP and PCI (Uh-oh! :eekers: ) controller implementations.
Unless AMD pull their collective head out of their collective arse and start supporting their CPUs with reliable (yes, including the USB!), feature-rich in-house chipsets.
I hope Hyper-threading is something that AMD will eventually try to mimic in their future designs, although I don't think it will happen any time soon.
It's not so much about mimicking, as Intel aren't pioneers in this department and HyperThreading is not the ideal implementation of on-die multithreading; rather, it's something all CPU manufacturers are likely to implement in one form or another (spare EPIC-style architectures, I guess).
Generally speaking, there are three methodologies for increasing a CPU's speed in today's multi-tasking world (as opposed to DOS, for example) :
1. Increase the CPU's clock speed. Pretty much self-explanatory; everybody has been doing this forever. But there appears to be a problem with this approach: your CPU begins to turn into a perfectly viable home heating appliance, the die shrinks become harder to make, and their benefits are diminishing.
2. Increase the CPU's raw computational power. In a generic form this approach has also been around forever : the 80286 executed instructions in fewer clock cycles on average than the 8086, and most other CPU manufacturers have done about the same. Later on came the idea of making CPUs super-scalar - an attempt to throw raw silicon at the problem. In even more recent times the idea behind vector processors has found its way into scalar (by now super-scalar) CPUs : SIMD. MMX, SSE, SSE2, AltiVec, VIS and other SIMD implementations are just another way of throwing more silicon at the problem - but in a more elegant way. Those solutions (spare SIMD, perhaps) hit another brick wall almost immediately after their introduction - you can't properly utilize those extra EUs given the existing code.
3. Increase the CPU's efficiency. In other words, raise the effective (as opposed to theoretical) IPC of the CPU. Making a CPU super-scalar turned out to be not enough : the dumb programmers (and compiler writers) refuse to line up their instructions in ways suitable for filling all the execution units and have a bad habit of using branches in their code. Thus came about caches, register renaming, branch prediction, out-of-order execution, trace caches, TLBs and others - their sole goal being to find ways to keep the execution units working as much of the time as possible.
Contrary to #1 and #2, solutions of type #3 are somewhat difficult to come up with and implement, as they require a lot of brainstorming and a lot of testing, not to mention the silicon. The majority of those solutions are already incorporated into pretty much all CPU architectures today - some shine on RISCs, some do better on CISCs - but in general they're already used up. Yet the average effective IPC for the x86 lines of CPUs is still somewhere between 2 and 3, far below the maximum theoretical rates.
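(To put a rough number behind "far below the maximum theoretical rates", a back-of-the-envelope sketch - the 9-execution-unit figure is the one usually quoted for the Athlon, and the IPC values are just the 2-3 range mentioned above, so treat it as illustration only:)

[code]
# Rough measure: instructions retired per cycle relative to the number of execution units.
def eu_utilization(effective_ipc, execution_units):
    return effective_ipc / execution_units

for ipc in (2.0, 3.0):
    print("effective IPC %.1f -> ~%.0f%% of the units busy" % (ipc, 100 * eu_utilization(ipc, 9)))
[/code]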
On-chip multithreading is just another elegant solution of type #3 : an attempt to improve the CPU's efficiency. It is already present in slightly different forms in two mass-produced CPU lines that I'm aware of, and was being implemented in a much more elegant way in a third one which sadly will never see the light of day thanks to Intel's move to eliminate competition ("Cry, the beloved country!"), not to mention some more exotic designs! Sooner or later a good portion of the CPU architectures that survive Intel's/Wintel's attack will probably implement multi-threading in one form or another. So, yes, I presume that if AMD lives long enough, there's a good chance one of the later Hammers will sport some SMT capabilities. If the ex-Digital folks are still around by then, my crude guess is they'll go with 4-way SMT...
My point in the original post was that the dual-CPU platform from Intel wasn't very impressive compared to the current dual-CPU platform from AMD, despite their much-hyped Hyper-threading novelty and their new dual-channel DDR E7500 chipset.
"Ye ain't seen nofin' yet, boy, ye hear me? NOFIN'!" I'd love to watch dualie Athlon MPs with their puny 384kB of L1+L2 try to put up a good fight against a pair of 512kB L2 + 1-2MB L3 equipped Xeons running at 2.5-3GHz on a 533MHz bus, preferably with some multi-channel (2 or better yet - 4) Hastings RDRAM platform... Unfortunately I have some doubts about the RDRAM part, but the same on a dual-channel DDR chipset should be about as entertaining...
Intel simply matched AMD, they didn't beat them sharp and clear. Future tech from Intel matches current tech from AMD = problem for Intel when future tech from AMD arrives.
I hope the days of beating "sharp and clear" (by either side) are over - those were the days of expensive CPUs and no alternatives. I hope they never return, even if the side taking the beating is Intel. And those aren't "future tech" anymore - they're here, which makes them "present tech".
Worry not, Intel always has something up its sleeve...
As far as AMD's future tech is concerned... Off the top of my head I can't think of any [published] dramatic improvements in CHammer's core as opposed to Palomino/T-Bred, spare the on-die memory controller. To counter this single lower-latency DDR channel, by year's end Intel will have the 533MHz bus, dual-channel DDR chipsets and (if luck would have it) some 3rd-party PC1066 RDRAM chipsets. Perhaps some tweaks to the core as well (according to some info on the web, the P4's instruction latencies keep decreasing compared to the original P4 as released back in 2000, i.e. Intel keeps actively improving the core itself). So, as usual, only time will tell whose "future" is better...
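(Just to give a feel for the peak-bandwidth side of that matchup - the memory speeds below are my guesses purely for illustration, and latency, which is CHammer's real advantage, isn't captured here at all:)

[code]
# Peak theoretical memory bandwidth, in GB/s, for a few configurations.
def bandwidth_gbs(bus_bits, mega_transfers_per_sec, channels=1):
    return bus_bits / 8 * mega_transfers_per_sec * 1e6 * channels / 1e9

print(bandwidth_gbs(64, 333))              # single-channel DDR333 (a CHammer-style on-die controller?) ~2.7
print(bandwidth_gbs(64, 266, channels=2))  # dual-channel DDR266 chipset                                ~4.3
print(bandwidth_gbs(16, 1066, channels=2)) # dual-channel PC1066 RDRAM                                  ~4.3
print(bandwidth_gbs(64, 533))              # P4's 533MHz FSB - conveniently the same                    ~4.3
[/code]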
Come to think of it, if x86-64 actually catches on (and that's a big "if"), AMD may in fact do the industry much more harm than good. The x86 ISA (aka IA-32 (R)(TM)) is a brain-dead architecture that's been kept alive for the last two decades for the sake of application compatibility - at the expense of CPU performance. Look at all the impossible tricks CPU designers have to pull in order to remain x86 "compatible" yet at the same time create high-performance designs! Quite often I wish IA-32 would just die an agonizing death; we'd all switch over to something more suitable and be done with it. Mac users lived through it once, and nothing horrible happened.
(As a side note, I keep reading rumors that Apple may switch horses once more, to x86-64. Picture this : x86-64 doesn't catch on in the PC mainstream, PCs move on to EPIC-like Itanic successors, and AMD becomes Apple's source of CPUs, leaving the PC stage for good due to financial pressure from Intel and an inability to come up with a VLIW design that would fly with Itanic compilers. A horror movie, I tell ya, A HORROR MOVIE!!! :lol: Though I've seen stranger things happen to the industry in my lifetime...)
My guess was that Barton would be the first part of AMD's upcoming goodies.
Could you give me some links to some Barton info? I haven't looked too thoroughly, I must admit, but except for SOI and the rumor that it's cancelled I haven't found anything that would be up to date...
...if ClawHammer is on time and is as good as the rumors say.
Ah, yes, the hype. Well, I'm personally past the age of believing in elves, so...
I do hope CHammer will give Intel a run for its money, even though I have my misgivings about x86-64 in general, as I mentioned above.
Yep, that's about all I had to say at this time. :roll: No animals or CPUs were harmed in the making of this post.
P.S. Don't know why, but somehow the words "long winded" keep ringing in my head... :lol: