Hyperthreading (SMT)

flagreen

Storage Freak Apprentice
Joined
Jan 14, 2002
Messages
1,529
Ah ha!!! The very first post in the Computer Forum!! Aren't I special? Just to be on topic here, what do you guys think of Hyperthreading? Looks very promising but that's about all. My Xeons have it enabled but so far it hasn't shown any big improvement over the performance when I disable it. Other than in Sandra that is. It appears that apps are going to have to be written to take advantage of it before we see any real big improvement.
 

time

Storage? I am Storage!
Joined
Jan 18, 2002
Messages
4,932
Location
Brisbane, Oz
I confess I have no idea how it's supposed to help. If it pretends to split the entire CPU, then that would suggest that it's making up for poor arbitration in the OS.

My guess is that Intel reckons they can achieve some sort of sub-processor optimization by doing this. Sounds like optimizing for benchmarks ... again.

Skallas, prove you're reading this by explaining what it really means :)
 

Adcadet

Storage Freak
Joined
Jan 14, 2002
Messages
1,861
Location
44.8, -91.5
From what I read over at Anand's (I think) it seems a clever trick to make use of multiple ALUs etc.

I'm all for it. I need a 4-way system. Conventional SMP is becoming so passé ;)
 

CougTek

Hairy Aussie
Joined
Jan 21, 2002
Messages
8,728
Location
Québec, Québec
If it can enhance the CPU's performance, then I vote for it. Hyperthreading or not, I still don't like the main idea behind the design of the P4 core, that is, decreasing the IPC in order to boost the frequency. Hyperthreading sounds like a way to raise the IPC somewhat, so it certainly ain't a bad thing.

I read somewhere that Intel had tremendous difficulties with the early P4 processors, so they had to disable several functions in order to improve the yield. Hyperthreading was probably one of those functions they originally had to put aside. Now that the process has matured, they can re-enable some (if not all) of the features that were originally supposed to be part of the first P4 core. My guess is that the early P4s were very disappointing because they were basically severely crippled versions of what they should have been.

[rant]
But I still fundamentally don't like the P4 design. All in all, after the dust settles down, it's a design based on marketing decisions more than on performance decisions. Yeah, I know Intel is a company and the goal of all companies is to make money, no matter how. I simply prefer AMD's way (and Motorola's, heard about the G5? Promising). I hope Intel's next major design won't have the same shortcomings.
[/rant]
 

Adcadet

Storage Freak
Joined
Jan 14, 2002
Messages
1,861
Location
44.8, -91.5
CougTek said:
[rant]
But I still fundamentally don't like the P4 design. All in all, after the dust settles down, it's a design based on marketing decisions more than on performance decisions. Yeah, I know Intel is a company and the goal of all companies is to make money, no matter how. I simply prefer AMD's way (and Motorola's, heard about the G5? Promising). I hope Intel's next major design won't have the same shortcomings.
[/rant]

Ya know, I used to agree with you totally. But as the P4 scales above 2 GHz, I do see its performance merit. If the architecture scales as well (frequency-wise) as the P3, then I think Intel has a winner, despite selling all those P4s <2 GHz based only upon marketing. It will also be nice to see what happens when Intel doubles the on-die cache.
 

The JoJo

Wannabe Storage Freak
Joined
Jan 25, 2002
Messages
1,490
Location
Finland, Turku
Website
www.thejojo.com
Too bad AMD isn't enlarging the cache on their processors. That 512KB L2 cache seems to me like a very nice boost.

Have I understood correctly that the Hammer series will only have 256KB cache?
Back on topic, go for hyperthreading, if it helps. Other than that I must agree with CougTek and his rant, damn marketing decisions...
The P4 seems to be able to grow in MHz speed quite well though, what's the latest record, 3800MHz with LN I think....;)
 

Corvair

Learning Storage Performance
Joined
Jan 25, 2002
Messages
231
Location
Desolation Boulevard
.

CougTek said:
If it can enhance the CPU's performance, then I vote for it.

It will -- significantly.


CougTek said:
Hyperthreading or not, I still don't like the main idea behind the design of the P4 core, that is, decreasing the IPC in order to boost the frequency.

There is indeed at least one solid benefit to the current revved-up clocky design of the P4 when compared to the P2/P3 core, and that's good I/O performance. The downside of such designs is heat and the sundry problems that you can expect when working with high frequency electronic designs.


CougTek said:
Hyperthreading sounds like a way to raise the IPC somewhat, so it certainly ain't a bad thing.

Er... no. Think virtual processors and the doings of work when other things (threads) are in between cycles doing virtually nothing. So, in theory, a Hyper-Threading architecture will allow for the doing of REAL work (computations) virtually 100% of the time (processor time).

One of the negatives of SMT or Hyper-Threading that has been discussed only occasionally so far is that these SMT / H-T processors will run hotter than a non-Hyper-Threading processor at an equivalent clock rate. The reason for this heat is just what we covered above -- the friggin' processor is now actually performing computations virtually 100% of the time ! ! ! Prediction: Give it 3 years after the introduction of the first Intel H-T microprocessor and ALL Intel processors will have some level of H-T capability, even lowly Celerons -- if they decide to keep them around that long.


CougTek said:
I read somewhere that Intel had tremendous difficulties with the early P4 processors, so they had to disable several functions in order to improve the yield.

Not particularly more difficult than with any other processor they've had before. It does have a higher transistor count than any processor of the past, which is simply going to require more time to run verification tests on. Every previous major Intel processor family design, except maybe for the StrongARM line, has had more transistors and a more complex design than the one before.


CougTek said:
Hyperthreading was probably one of those functions they had to originally put aside.

The next-generation Compaq / API Alpha 64-bit RISC processor called EV-7 was the first "announced" (i.e. -- not yet released) processor design back in 1998 to utilise Simultaneous Multi-Threading. The first SMT Alpha processor is supposed to finally be released in the next year or so. It will have SMT architecture as well as the ability to run in lockstep with another Alpha EV-7 processor, meaning that two processors run in perfect lockstep at full speed comparing results in the background. If an exception is found in a processor's computation accuracy, the system will take the failed processor offline.

As far as Intel goes, they have simply gained or licensed (likely a combination of both) the essential architectural building blocks to construct a 32-bit SMT x86 Pentium, and renamed Simultaneous Multi-Threading to "Hyper Threading" to differentiate themselves from Alpha. Basically, you can file the term Hyper Threading right along with other Intel sugarcoated feelgood gobbledygook names like the infamous "NetBurst architecture." (iGary's definition of NetBurst = The sudden urination on oneself after several hours of non-stop gonzo web surfing.)


.
 

Cliptin

Wannabe Storage Freak
Joined
Jan 22, 2002
Messages
1,206
Location
St. Elmo, TN
Website
www.whstrain.us
CougTek said:
I read somewhere that Intel had tremendous difficulties with the early P4 processors, so they had to disable several functions in order to improve the yield.

IIRC, it had more to do with the extra functions increasing the die size such that it was cost prohibitive. Now that they have jumbo wafers and have decreased the trace width, it is more doable.
 

flagreen

Storage Freak Apprentice
Joined
Jan 14, 2002
Messages
1,529
I'm net bursting all over the place. I'm using two P4 2.2 Prestonias.
 

Corvair

Learning Storage Performance
Joined
Jan 25, 2002
Messages
231
Location
Desolation Boulevard
Cliptin said:
IIRC, it had more to do with the extra functions increasing the die size such that it was cost prohibitive. Now that they have jumbo wafers and have decreased the trace width, it is more doable.

The now-established 0.13 micron and eventual 0.10 micron manufacturing process was absolutely key in ushering in this major architectural update. SMT / H-T will knock yer sox off when it finally arrives. It will be as good as two processors, but only one socket used. Let's just hope the price of an H-T Pentium 4 won't be "hyper," nor the operating temperature. Hmm.... I can just see the thermal naysayers calling the H-T Pentium 4 the HoT Pentium 4.
 

flagreen

Storage Freak Apprentice
Joined
Jan 14, 2002
Messages
1,529
Here are my Xeons rendering VOB files with Xmpeg. They run at about the same temp as my old 1.7 Foster Xeons. Though it shows all four processors working, the improvement in performance is minimal. I suspect software is going to have to be optimised for SMT before we see any real benefit.

[Image: Hund.gif, Task Manager showing all four processors during the Xmpeg render]
 

Corvair

Learning Storage Performance
Joined
Jan 25, 2002
Messages
231
Location
Desolation Boulevard
flagreen said:
I'm net bursting all over the place. I'm using two P4 2.2 Prestonias.

Probably right now in some coffeeshop somewhere in Silicon Valley, an Intel gobbledygook blathermeister is creating a slew of bothersome trademarked names like:

NetSpigot®, NetRush®, NetPort®, NetWarp®, NetFlush®, NetSafe®, NetGain®, NetLoss®, NetNut®, NetButt®, NetMommy®, NetDaddy®, NetSissy®, NetBro®, NetDump®, NetHouse®, NetBrain®, NetHead®, NetGuru®, NetPoet®, NetVision®, NetFriend®, NetBuddy®, NetExperience®, NetTalk®, NetSound®, NetMusic®, NetHappy®, NetHippy®, NetPsychic®, NetPsycho®, NetVillageIdiot®, NetMcNuggets®...

Yet these blathermeisters who brought us such misnomers as NetBurst probably don't know the difference between an Ethernet and a hairnet.


.
 

flagreen

Storage Freak Apprentice
Joined
Jan 14, 2002
Messages
1,529
BTW you'd better copyright those real quick. They have eyes everywhere you know.
 

Corvair

Learning Storage Performance
Joined
Jan 25, 2002
Messages
231
Location
Desolation Boulevard
.

flagreen said:
Though it shows all four processors working, the improvement in performance is minimal. I suspect software is going to have to be optimised for SMT before we see any real benefit.

I had been hearing for a while, and even just lately it seems, that "Jackson" (or was it "Jacksonville") technology processors would not be available until *late* in 2002, like November or December at best. I'll be damned if I can recall the code name of the first H-T technology P4 processor. Intel's multitude of multi-dimensional codewords are hard to keep up with. I might have been reading about an H-T IA-64 processor, but I don't believe so. In any case, H-T is supposed to deliver on average a 30% to 40% increase in performance at a given clockrate with 32-bit multi-threaded code. That's a pretty substantial increase!

So, the "Prestonia" is supposed to be a Hyper-Threading processor? I believe I keep hearing the term(s) NetBurst / Hyper-Pipelined, which are simply supposed to be Intel gobbledygook buzzwords for their revved-up P4 core's tuned pipeline. The word Hyper-Pipelined is fine, but NetBurst hasn't a thing to do with networking, though Intel claims it makes your InterNET experience better somehow.

[Image: w020123.jpg]




.
 

flagreen

Storage Freak Apprentice
Joined
Jan 14, 2002
Messages
1,529
It is in the Prestonias. Here is a link from Intel's developer site (you'll need to register I think to get on it) explaining it. http://developer.intel.com/technology/hyperthread/
Note that what they claim is a 30-40% improvement in the allocation of "resources", not in performance. If you'd rather not register with them I'll post the text here. :) By the way, Windows XP Pro supports SMT. The latest BIOS for my Supermicro P4DC6 motherboard has an option to turn Hyperthreading on or off. So far the only great improvement I've seen is in SiSoft Sandra benchmarks, which have been optimised to take SMT into consideration.
 

HellDiver

Learning Storage Performance
Joined
Jan 22, 2002
Messages
130
Corvair :
SMT / H-T will knock yer sox off when it finally arrives. It will be as good as two processors, but only one socket used.
WRONG!!! I'm not sure you know what you're talking about, mate.

The idea behind on-chip multi-threading is to put the CPU cycles normally wasted (cache misses, etc) to good use. While the thread currently being executed is put on hold waiting for something to happen (like for instance resources to be freed up), current generation CPUs waste time. A CPU implementing on-chip multi-threading is able to say "OK, there's not much left to do with this thread at the moment, so why not try to run some other thread and see if anything good comes out of it?" If the code being executed was "favorable" and the CPU architecture was suitable (which IA-32 isn't!) - you'll get some performance boost, but the exact amount of boost you'll get is an extremely dynamic value, dependent on many factors, and varying from rig to rig, from app to app, from time to time. And that's all. No magic, no miracles. You can't get something from nothing.
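
A toy model can make the point above concrete. The sketch below is only an illustration under stated assumptions, not a description of how any real CPU schedules work: the core has a single issue slot per cycle, each thread is reduced to a list of per-instruction latencies (1 for plain work, a larger number standing in for something like a cache miss), and threads are tried in a fixed priority order. The names `run_serial` and `run_smt` are made up for the example.

```python
def run_serial(threads):
    """Cycles needed when the threads run one after another with no SMT.

    Each thread is a list of latencies; a thread cannot issue its next
    instruction until the previous one's latency has elapsed, so a lone
    thread takes the sum of its latencies.
    """
    return sum(sum(thread) for thread in threads)


def run_smt(threads):
    """Cycles needed when the threads share one issue slot per cycle.

    Assumption: whenever the higher-priority thread is waiting, the slot
    is offered to the next thread that is ready.
    """
    next_index = [0] * len(threads)  # next instruction to issue, per thread
    ready_at = [0] * len(threads)    # cycle at which each thread may issue again
    cycle = 0
    finish = 0
    while any(next_index[i] < len(t) for i, t in enumerate(threads)):
        for i, thread in enumerate(threads):
            if next_index[i] < len(thread) and ready_at[i] <= cycle:
                ready_at[i] = cycle + thread[next_index[i]]
                finish = max(finish, ready_at[i])
                next_index[i] += 1
                break  # only one instruction issues per cycle
        cycle += 1
    return finish


if __name__ == "__main__":
    workloads = {
        "stall-heavy": [[1, 9] * 4, [1, 9] * 4],    # long waits after every 2nd op
        "compute-bound": [[1] * 40, [1] * 40],      # never waits
    }
    for name, threads in workloads.items():
        print(name, "serial:", run_serial(threads), "shared:", run_smt(threads))
```

In this model the stall-heavy pair finishes in roughly half the cycles when sharing the core, while the compute-bound pair gains nothing, since there are no idle slots to hand over.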

In the case of IA-32 it gets tougher - while RISC architectures currently available on the market have a substantial number of hardware assets to go around (some have truckloads - like Alpha's 152 usable registers, for instance), IA-32 has to get by with 8 GPRs. That means bye-bye register renaming, fast context switches, etc.

So frankly, I wouldn't hold my breath on IA-32 implementations of SMT... Sorry folks.

If you want to get a better idea of what SMT was supposed to look like - have a lookie at a pretty good writeup on EV8 (which is no longer to be!) by Paul DeMone (Part 1, Part 2, Part 3).
 

flagreen

Storage Freak Apprentice
Joined
Jan 14, 2002
Messages
1,529
Your analysis pretty much corresponds with what my testing of real world apps has shown HD. That is, very minor (a few percentage points) improvement in performance when running apps with SMT enabled vs. disabled (through the system BIOS). Of course none of these apps were written to take advantage of whatever "advantage" SMT might provide. If it is simply a matter of allocating resources better so that the primary task at hand can be processed without interruption to take care of other mundane chores, then properly written software should be easy enough to develop I would think. If an app can be written multi-threaded for two cpus, it could also be written for four cpus, two physical cpus to handle specific primary tasks, and two logical cpus to handle the background crap, if they are capable of doing it, could it not? As you say, how much of an edge this would provide depends on how burdensome the background or secondary tasks were in the first place. Then there is the problem of what resources there are to allocate to begin with. If 100% of something is already being used then there is nothing left to share or allocate to any other processors, logical or physical. I'm just a layman so don't come down hard on me if I'm way off base here. Is RWT going under?
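
The "nothing left to share" point can be put as back-of-the-envelope arithmetic. Assume, purely as an idealisation, that one thread keeps the core's execution resources busy a fraction u of the time and that extra logical CPUs could fill every idle slot perfectly. Then the best possible throughput gain from n logical CPUs is min(1, n*u) / u. The function name `smt_speedup_bound` and the model itself are assumptions for illustration, not Intel's figures.

```python
def smt_speedup_bound(utilization, logical_threads=2):
    """Idealised upper bound on throughput gain from sharing one core.

    utilization: fraction (0 < u <= 1) of the core's capacity that a
    single thread keeps busy on its own. Assumes idle capacity can be
    filled perfectly by the other logical CPUs, which real hardware
    will not achieve.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return min(1.0, logical_threads * utilization) / utilization


if __name__ == "__main__":
    for u in (0.4, 0.75, 1.0):
        print(f"u = {u:.2f}: at most {smt_speedup_bound(u):.2f}x")
```

At u = 1.0 the bound is exactly 1x, which is the case described above: if one thread already uses everything, a second logical CPU has nothing to work with.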
 

Corvair

Learning Storage Performance
Joined
Jan 25, 2002
Messages
231
Location
Desolation Boulevard
HellDiver said:
Corvair :
SMT / H-T will knock yer sox off when it finally arrives. It will be as good as two processors, but only one socket used.
WRONG!!! I'm not sure you know what you're talking about, mate.

Mr. Helldiver: I believe we got the 30% ~ 40% increase in performance versus 30% ~ 40% increase in resources issue cleared up earlier, which was my mistake. We had already uncovered the fact that a claimed 30% ~ 40% increase in resources represents a somewhat dubious gain in reality.

At this point what I'm trying to get straightened out is if Intel's "NetBurst" technology has mysteriously been redefined during the past year or so to now include Hyper-Threading. Unfortunately for me, I haven't been able to follow every marketing and technical development in the computer industry in recent times. A 40+ hour a week job and one year into a business degree leaves little free time for the, er... fun stuff.
 

HellDiver

Learning Storage Performance
Joined
Jan 22, 2002
Messages
130
Corvair said:
A 40+ hour a week job and one year into a business degree leaves little free time for the, er... fun stuff.
Screw hyperthreading and Intel, mate. As someone who had to work near full-time and do his undergrads at the same time, I really sympathize with you man, and wish you the best of luck!
 

flagreen

Storage Freak Apprentice
Joined
Jan 14, 2002
Messages
1,529
Well if anyone is not confused enough already as to what Intel is up to with SMP processors, check out this article for interesting news regarding the Prestonia and Foster-MP Xeons. It is really strange that according to the article, Intel still hasn't announced the Prestonias one month after they have all been available on the retail market.

http://news.com.com/2100-1001-830470.html
 

HellDiver

Learning Storage Performance
Joined
Jan 22, 2002
Messages
130
Corvair said:
Mr. Helldiver: I believe we got the 30% ~ 40% increase in performance...
Flagreen said:
Take a look at this announcement from Intel about HT and the benefits...
I think the important thing to note while discussing HT benefits are the words "up to" Intel uses while discussing them... It sure would be interesting to watch what performance boost Apache gets from HT as it is, and then to [wait and] see how its performance will improve once recompiled, for example. And then to compare those results under Win and under Lin... :roll:
 

HellDiver

Learning Storage Performance
Joined
Jan 22, 2002
Messages
130
To quote Mike : "Intel's segmentation of its server products has now reached a stage of complexity we've never seen before". And if you ask me - that's an understatement!
 

Corvair

Learning Storage Performance
Joined
Jan 25, 2002
Messages
231
Location
Desolation Boulevard
flagreen said:
Hey Gary it turns out you were right. Take a look at this announcement from Intel about HT and the benefits. I'd be interested to hear your thoughts on it. - http://www.intel.com/pressroom/archive/releases/20020206tech.htm?iid=Homepage+Update_020206


About all I can say is that, early on (2000 or 2001), I read various forecasts of performance increases for the SMT Pentium IV, including increases of 30% ~ 40%. I guess that I also saw increases in processor resources of 30% ~ 40% as well and took that as a performance increase. After all, with the meager register count of the Pentium x86 architecture, any increase in the number of registers -- even virtual ones -- would seemingly have to give a near-linear increase in performance with SMP applications.


Flagreen: I realise now that there was one thing that I never did ask you earlier: Are you absolutely sure that you indeed have an SMT Pentium IV processor, one that does not have its SMT capability disabled? I keep reading how the SMT Pentium IV will be "coming soon" in Q2.

If you are running Windows XP, I believe the SMT functionality will be detected as two processors for every SMT Pentium IV processor. Windows 2000 and NT 4.0 can't detect the SMT functionality. I would imagine that the latest version of the Supermicro BIOS for that mobo can detect the presence of the SMT processor and can toggle on / off its SMT functionality. But, if Intel permanently disabled the SMT functionality during manufacturing, then the BIOS commands are simply manipulating a mode register and nothing more. Still, it would seem to me that an SMT-disabled processor really should not be detected as an SMT processor at all.
 

flagreen

Storage Freak Apprentice
Joined
Jan 14, 2002
Messages
1,529
Gary,
As you probably know by now, Intel has finally announced the release of the SMT Prestonias. I am using XP Pro and the Device Manager shows four cpus. It does not distinguish between physical and logical cpus. It also showed four with my previous Foster Xeons. However with the Fosters the Task Manager only showed two, whereas with the Prestonias four are shown, as above in the image I posted using Xmpeg. So in short, yes I'm positive the Prestonias have SMT enabled whereas it was permanently disabled within the core of the Fosters. I don't know why it took so long to announce it.

http://www.intel.com/products/serve...eon/index.htm?iid=Homepage+Spot2_Text_020225&
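
For anyone wanting to check the logical versus physical count outside Device Manager, here is a rough sketch. It assumes a Linux system whose `/proc/cpuinfo` lists `processor`, `physical id` and `core id` fields (not every kernel or architecture does). `count_cpus` and `SAMPLE` are names invented for the example, and the sample text is hand-written to resemble two single-core packages with two logical CPUs each, not captured from real hardware.

```python
def count_cpus(cpuinfo_text):
    """Return (logical_cpus, physical_cores) from /proc/cpuinfo-style text.

    Logical CPUs are counted as entries with a 'processor' field.
    Physical cores are counted as distinct (physical id, core id) pairs;
    if those fields are missing, each entry is assumed to be its own core.
    """
    logical = 0
    cores = set()
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue
        logical += 1
        package = fields.get("physical id", "cpu" + fields["processor"])
        cores.add((package, fields.get("core id", "0")))
    return logical, len(cores)


# Hand-written sample (an assumption, not real output): two packages,
# each exposing two logical CPUs on one core.
SAMPLE = """\
processor : 0
physical id : 0
core id : 0

processor : 1
physical id : 0
core id : 0

processor : 2
physical id : 3
core id : 0

processor : 3
physical id : 3
core id : 0
"""

if __name__ == "__main__":
    logical, physical = count_cpus(SAMPLE)
    print("logical:", logical, "physical cores:", physical)
```

On a live Linux box the same function could be fed the contents of `/proc/cpuinfo`. Python's own `os.cpu_count()` reports logical CPUs only, so on its own it has the same blind spot as the Device Manager view described above.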
 