Perhaps this is a trickle-down from a merged unified OS in the Server edition?
		
		
	 
I thought about this on the way to work this morning (it's amazing how clear the mind becomes going down the freeway on my bike at 110).
We know the actual kernel itself is pretty much the same across all editions of Windows (with respect to the I/O scheduler, the task/process/thread scheduler, the memory management system, etc.). What if these new restrictions flow down from the requirements/features of the kernel used in Windows Server? Doug, as you mentioned, CMPXCHG16B allows Windows to use more than 16TB of RAM, and since it's expensive to maintain two separate bits of code (especially something like an OS kernel), why not merge the two, keep the features of the higher-performing one, and reduce the restrictions all round?
The other use, as mentioned with CMPXCHG16B, is lock-free algorithms/data structures. Using lock-free algorithms/data structures inside the kernel itself means fewer locks are involved (locks are bad for performance in multi-threaded code, but they are also what make multi-threaded code possible/reliable), which means the kernel can scale out to more cores and get better performance. In ye olde days, the NT kernel (and this included the Linux and *BSD kernels as well) pretty much shat itself when you had more than 8 cores, all because of locks... The introduction of CMPXCHGx allowed the OS to scale better as the number of cores increased. (This is what allowed Linux to scale to those 1024-core computers without going to sh*t.) As it's now affordable to purchase desktop machines with 8+ cores, something has to give, otherwise performance degrades to the point where adding more cores gives no increase at all, because the kernel is blocked from doing work by all the locks. (I think, Doug or Dave, you mentioned looking at quad-socket servers with 10-core CPUs, so that's 40 cores (or 80 logical cores with Hyperthreading) that the kernel has to manage.)
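For anyone who hasn't seen what "lock-free" looks like in code, here's a rough sketch in C++ of the compare-and-swap retry loop that the CMPXCHG family makes possible. It's only an illustration, not anything from the NT kernel: `LockFreeCounter` and `run_counter` are names I made up, and it uses an ordinary 8-byte compare-and-swap through `std::atomic`. The 16-byte variant is the same idea applied to a wider value, typically a pointer paired with a version counter.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical example type: a counter updated with a compare-and-swap
// retry loop instead of a mutex.
struct LockFreeCounter {
    std::atomic<std::uint64_t> value{0};

    void increment() {
        std::uint64_t seen = value.load(std::memory_order_relaxed);
        // compare_exchange_weak stores seen + 1 only if value still equals
        // seen. On failure it refreshes seen with the current value, so the
        // loop simply retries with up-to-date data. No thread ever blocks.
        while (!value.compare_exchange_weak(seen, seen + 1,
                                            std::memory_order_relaxed)) {
        }
    }
};

// Hypothetical driver: starts `threads` workers that each increment the
// shared counter `per_thread` times, then returns the final total.
std::uint64_t run_counter(int threads, int per_thread) {
    assert(threads > 0 && per_thread >= 0);
    LockFreeCounter counter;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                counter.increment();
            }
        });
    }
    for (auto& worker : pool) {
        worker.join();
    }
    return counter.value.load();
}
```

Calling `run_counter(4, 10000)` should come back with 40000 no matter how the threads interleave, since a failed swap just retries rather than losing an update. The point is that a thread that loses the race never waits on whoever won, which is why this style tends to hold up better than a single contended lock as the core count grows.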
I know some of you have mentioned better performance in Win8.1. Maybe these new requirements are the reason?