Memory Node Interleaving - enable or disable?

Handruin

Administrator
Joined
Jan 13, 2002
Messages
13,920
Location
USA
Does anyone know which option is most desirable for memory node interleaving (enabled/disabled)? I was under the impression that NUMA was the way to go, but I'm not sure. I have a few new Dell R710s that I'm configuring and I don't know whether leaving the default (disabled) is the best option or not; the default would allow NUMA on my systems.

Dell Owner's Manual said:
Node Interleaving - (Disabled default) If this field is Enabled, memory interleaving is supported if a symmetric memory configuration is installed.

If Disabled, the system supports Non-Uniform Memory Architecture (NUMA) (asymmetric) memory configurations.
 

MaxBurn

Storage Is My Life
Joined
Jan 20, 2004
Messages
3,245
Location
SC
I believe enabled gives the best performance, but the memory modules must match and be installed in the correct slots.
 

sechs

Storage? I am Storage!
Joined
Feb 1, 2003
Messages
4,709
Location
Left Coast
If you have non-uniform memory, then NUMA is usually (but not always) the better choice.

I'm not familiar with the machine, but given that it's Intel, I'm sure that your memory is uniform.
 

Handruin

Administrator
Joined
Jan 13, 2002
Messages
13,920
Location
USA
Sounds like I need to check whether all RAM modules are of the same size and speed for the memory to be uniform? If they are, interleaving is the better option for performance rather than disabling it. I'll read through the hardware manual some more to see what I can find. All four systems are configured with 96GB of memory (12 x 8GB), but the system will hold 16 DIMMs. I don't know if that will cause problems with the interleave setting.
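If it helps with that check, here's a rough sketch of how I might list the installed DIMMs from a Linux live image. It assumes dmidecode is available and that its Size/Speed fields look like they usually do; I haven't run this on these exact boxes.

Code:
# Rough sketch: list DIMM sizes/speeds to confirm the modules all match.
# Assumes a Linux environment with dmidecode installed (run as root).
import subprocess

def dimm_summary():
    # "--type memory" dumps the SMBIOS Memory Device records
    out = subprocess.run(["dmidecode", "--type", "memory"],
                         capture_output=True, text=True, check=True).stdout
    sizes, speeds = [], []
    for line in out.splitlines():
        line = line.strip()
        if line.startswith("Size:") and "No Module" not in line:
            sizes.append(line.split(":", 1)[1].strip())
        elif line.startswith("Speed:") and "Unknown" not in line:
            speeds.append(line.split(":", 1)[1].strip())
    print("Populated DIMMs:", len(sizes))
    print("Distinct sizes:", set(sizes))
    print("Distinct speeds:", set(speeds))

if __name__ == "__main__":
    dimm_summary()

If the size and speed sets each come back with a single value, the modules match.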
 

MaxBurn

Storage Is My Life
Joined
Jan 20, 2004
Messages
3,245
Location
SC
I think you could find out relatively fast whether it works if you enable it and run memtest86 on the thing. I think interleaving has taken a backseat with dual/triple channel? Or maybe they interleave within the channels too; I don't know.
 

Handruin

Administrator
Joined
Jan 13, 2002
Messages
13,920
Location
USA
I'm going to try the interleaving. You're right that it's the preferred option for performance, but there also seem to be a lot of software-specific enhancements made for NUMA. What's not clear to me is whether, if I'm using software (such as ESX 4.0) that is NUMA-aware, I should leave the memory as non-uniform to take advantage of this. Also, am I understanding the options correctly? If I set the memory to interleaving, is that the same as Uniform Memory Access? To better clarify my questions, is this true?

UMA (Uniform Memory Access) = Node Interleaving enabled
NUMA (Non-Uniform Memory Access) = Node Interleaving disabled
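As a sanity check after flipping the BIOS option, here's a rough sketch (assuming a Linux live environment where sysfs is mounted) of counting how many memory nodes the OS ends up seeing. With interleaving enabled I'd expect a single node; with it disabled, one node per populated socket.

Code:
# Rough sketch: count the NUMA nodes the kernel exposes via sysfs.
# Node Interleaving enabled  -> typically a single node (UMA-style view)
# Node Interleaving disabled -> one node per populated socket (NUMA)
import glob

nodes = sorted(glob.glob("/sys/devices/system/node/node[0-9]*"))
print(f"Kernel sees {len(nodes)} memory node(s)")
for node in nodes:
    with open(f"{node}/meminfo") as f:
        # first line reads "Node N MemTotal: ... kB"
        print(" ", f.readline().strip())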


Funny you should mention the channels. There are also settings for the memory in a different category. They have the following modes:

Memory Operating Mode: This field displays the type of memory operation if a valid memory configuration is installed. When set to Optimizer Mode, the memory controllers run independently of each other for improved memory performance. When set to Mirror Mode, memory mirroring is enabled. When set to Advanced ECC Mode, two controllers are joined in 128-bit mode running multi-bit advanced ECC. For information about the memory modes, see "System Memory."

I have it set to the Optimizer Mode.

Optimizer (Independent Channel) Mode
In this mode, all three channels are populated with identical memory modules. This mode permits a larger total memory capacity but does not support SDDC with x8-based memory modules. A minimal single-channel configuration of one 1-GB memory module per processor is also supported in this mode. Table 3-2 and Table 3-3 show sample memory configurations that follow the appropriate memory guidelines stated in this section. The samples show identical memory-module configurations and their physical and available memory totals. The tables do not show mixed or quad-rank memory-module configurations, nor do they address the memory speed considerations of any configuration.
 

MaxBurn

Storage Is My Life
Joined
Jan 20, 2004
Messages
3,245
Location
SC
I might be grossly wrong on this, but to put it in hard drive terms, I think of interleave like RAID 0 for memory and dual/triple channel like having two/three RAID adapter cards. You can interleave on one physical DIMM. All desirable things.

The UMA and NUMA stuff has me confused, though, and the wiki articles didn't help much. For example, I don't know how Intel breaks down the four CPUs on the die against the one or possibly three memory controllers. Anyway, from what I gather, NUMA-aware software is coded to avoid the costly hit of assigning a task to a processor that doesn't have the needed data in its local memory and would have to transfer it over from another processor's memory before it could start work.
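To make that concrete, here's a very rough sketch (Linux only, and just one way to approximate it without a real NUMA library) of keeping a process on node 0's CPUs so that, under the kernel's default local-allocation policy, its memory tends to come from node 0 as well. A tool like numactl can also bind the memory explicitly; this is only the CPU-affinity half.

Code:
# Rough sketch: restrict this process to node 0's CPUs so its allocations
# tend to land in node 0's memory under Linux's default "local" policy.
import os

def cpus_of_node(node=0):
    # cpulist looks like "0-5,12-17"; expand it into a set of CPU ids
    with open(f"/sys/devices/system/node/node{node}/cpulist") as f:
        spec = f.read().strip()
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            lo, hi = map(int, part.split("-"))
            cpus.update(range(lo, hi + 1))
        else:
            cpus.add(int(part))
    return cpus

if __name__ == "__main__":
    os.sched_setaffinity(0, cpus_of_node(0))  # pid 0 = the current process
    print("Now restricted to CPUs:", sorted(os.sched_getaffinity(0)))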
 

MaxBurn

Storage Is My Life
Joined
Jan 20, 2004
Messages
3,245
Location
SC
Are these multiple-socket servers? If so, any CPU design that has the memory controller on the CPU die and has multiple sockets is intrinsically NUMA; there's no way to get around that since they moved the memory controller off of the chipset.
 

Handruin

Administrator
Joined
Jan 13, 2002
Messages
13,920
Location
USA
We got 4 of these systems a little over a week ago.

Yes, they are dual socket. They consist of the following:

2x Intel Xeon X5680 (Westmere) @ 3.33 GHz
96GB (12 x 8GB) 1066MHz
2x 146 GB SAS drives (10K RPM, 2.5")
6x GigE NICs
1x Emulex LPe 11002 4Gb FC (dual port)
 

MaxBurn

Storage Is My Life
Joined
Jan 20, 2004
Messages
3,245
Location
SC
Found this explaining the situation. With the memory controllers on the die now, the way the architecture is organized you don't seem to have a choice but to do NUMA. Maybe if you only had the one CPU socket populated you could disable it, or maybe the system presents itself to the OS as UMA and handles things in the background if that's selected.

This first, high-end desktop implementation of Nehalem is code-named Bloomfield, and it's essentially the same silicon that should go into two-socket servers eventually. As a result, Bloomfield chips come with two QPI links onboard, as the die shot above indicates. However, the second QPI link is unused. In 2P servers based on this architecture, that second interconnect will link the two sockets, and over it, the CPUs will share cache coherency messages (using a new protocol) and data (since the memory subsystem will be NUMA)—again, very similar to the Opteron.
http://techreport.com/articles.x/15818
 

sechs

Storage? I am Storage!
Joined
Feb 1, 2003
Messages
4,709
Location
Left Coast
Handruin said:
Sounds like I need to check whether all RAM modules are of the same size and speed for the memory to be uniform?
Uh, no.

Being "uniform" has to do with access. If all memory has the same latency and bandwidth to all processors, then it is "uniform." At least until recently, most Intel systems had uniform memory because they only had one node (controller).

When AMD integrated the memory controller onto the die, it introduced real non-uniform memory into multi-CPU systems. In Opteron systems, each processor socket can have its own local node of memory. A processor can access memory on another node (local to another socket), but there is a penalty for reaching across external buses to get to it.

With NUMA, the system is aware of what memory is attached to which nodes, i.e., what memory is local to what processor. A NUMA-aware operating system can then prevent the penalty of accessing non-local memory by matching local memory to local threads.

If you interleave nodes, memory bandwidth goes up, since addresses are "striped" across the controllers, but you encounter the latency penalty when you access the non-local memory.
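To put a rough number on that penalty, the Linux kernel publishes a relative "distance" for each pair of nodes (local access is normally 10 and a remote hop something like 20 or 21). A quick sketch, assuming sysfs is mounted at /sys:

Code:
# Rough sketch: print the SLIT-style node distance matrix from sysfs.
# Entry j on node i's line = relative cost for node i to reach node j's
# memory (10 = local; larger numbers mean a slower, remote hop).
import glob

for node in sorted(glob.glob("/sys/devices/system/node/node[0-9]*")):
    with open(f"{node}/distance") as f:
        print(node.rsplit("/", 1)[-1], "->", f.read().strip())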
 

MaxBurn

Storage Is My Life
Joined
Jan 20, 2004
Messages
3,245
Location
SC
I guess the (my?) point of confusion is that NUMA and interleaving, as we generally understand it, are not mutually exclusive. Each memory controller can interleave on its own bank of memory and likely does even if NUMA is selected. You just aren't interleaving across all of the system memory in the whole box.


Did you order them purpose-built?

Talked to a friend who just got an R710 about this thread, and he ordered them complete and configured, with ESXi installed on an SD card, straight from Dell like that. He says he's using the optimized settings but didn't configure a thing; Dell did all the work. He says he just unboxed them and started adding VMs.
 

Handruin

Administrator
Joined
Jan 13, 2002
Messages
13,920
Location
USA
Thanks to both of you for feedback on this. The systems were ordered to come with no OS installed. We just picked the hardware configuration. Since EMC owns VMware, we have a different way of getting our licensing so we don't order ESX through Dell.

We also configure the systems for dual purposes. For example, these 4 systems came with two internal drives in each. We split the drives and I take out the bottom internal drive. ESX is installed on drive 0, and later on we may need to test Windows or Linux, so we'll remove drive 0 and reconnect drive 1 to install that OS. Since we store the data on a CLARiiON array, the internal drives are expendable.

The BIOS was mostly configured how I wanted it, but I just didn't know much about the memory interleaving. The one option Dell always disables that I use is the Virtualization Technology option. If these had been ordered like your friend's, they likely would have come with that enabled. Do you know if your friend's R710 had the Memory Interleaving option enabled?
 

MaxBurn

Storage Is My Life
Joined
Jan 20, 2004
Messages
3,245
Location
SC
I asked, but he didn't know. He said if he remembered they would look at it the next time they bounce the box, but I think it's in production, so that's going to be a while. IMO, if your OS supports NUMA you definitely want it on to avoid the cross-bus data-gathering hit. You may want to look into your Linux flavors to see if they need a switch or a kernel option to support it; I don't know much about Linux.
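For the Linux side, here's a quick sketch of checking whether the running kernel was even built with NUMA support. It assumes the distro ships its kernel config under /boot (many do; otherwise there may be a /proc/config.gz to check instead).

Code:
# Rough sketch: check whether the running kernel was built with CONFIG_NUMA.
# Assumes the distro installs its kernel config under /boot (common, not universal).
import os
import re

config = f"/boot/config-{os.uname().release}"
try:
    with open(config) as f:
        match = re.search(r"^CONFIG_NUMA=(\S+)$", f.read(), re.MULTILINE)
    print("CONFIG_NUMA:", match.group(1) if match else "not set")
except FileNotFoundError:
    print(f"{config} not found; try: zcat /proc/config.gz | grep CONFIG_NUMA")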
 

sechs

Storage? I am Storage!
Joined
Feb 1, 2003
Messages
4,709
Location
Left Coast
MaxBurn said:
I guess the (my?) point of confusion is that NUMA and interleaving, as we generally understand it, are not mutually exclusive. Each memory controller can interleave on its own bank of memory and likely does even if NUMA is selected. You just aren't interleaving across all of the system memory in the whole box.
You're confusing bank interleaving with node interleaving. You can bank interleave with NUMA (and that's a good thing), but node interleaving and NUMA *are* mutually exclusive.
 

Pradeep

Storage? I am Storage!
Joined
Jan 21, 2002
Messages
3,845
Location
Runny glass
Handruin said:
I know this is a late reply to the thread I started a while ago, but I did find this piece of information in a VMware Best Practices guide for vSphere 4.0 on page 15.

Late reply here too. The IBM X series servers I've used with the 5160-type Xeons have an option to use a spare bank of memory as a hot spare, in addition to some kind of RAID-type setting, IIRC. That's what I always went with (integrity over ultimate speed) since there were available slots, in the health care setting. The gold standard for us was the Z series mainframe, which only went down for time zone changes.
 