Best server for OpenStack

CougTek

Hairy Aussie
Joined
Jan 21, 2002
Messages
8,728
Location
Québec, Québec
I want to propose running OpenStack on a local server in order to replace the costly Amazon EC2 instances we buy to run some shared online video software. We have several large instances on Amazon, but medium instances would do in most cases. However, the guy in charge of that project has absolutely no discipline regarding the proper administration of resources (and company funds; he's been racking up $5,000+ monthly bills for the last few months). Hence my desire to run everything on a local server, which only costs money once, up front, and almost nothing afterward.
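
To put rough numbers on that, here's a quick break-even sketch in Python. The EC2 bill and server prices are the figures quoted in this thread; the ~$300/month colocation cost is an assumption I'll come back to further down.

# Rough break-even estimate: one-time server purchase vs. ongoing EC2 spend.
# ec2_monthly and server_cost are figures from this thread;
# colo_monthly is an assumed hosting cost (rack space + bandwidth).
ec2_monthly  = 5000   # approximate current EC2 bill, $/month
server_cost  = 5800   # one-time hardware cost, $ (AMD option; Intel is ~$4,600)
colo_monthly = 300    # assumed colocation cost, $/month

monthly_savings  = ec2_monthly - colo_monthly
breakeven_months = server_cost / monthly_savings
print(f"Break-even after ~{breakeven_months:.1f} months")  # roughly 1.2 months

Even if the hosting assumption is off by a factor of two, the hardware pays for itself in the first quarter.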

I'm hesitating between an AMD and an Intel solution. Here's the description of both:

AMD server
*SuperMicro AS-1042G-TF (1 node/4 sockets)
*4x Opteron 6376 2.3GHz 115W (8 fp cores/16 int cores)
*8x 8GB DDR3 1600MHz ECC Reg RAM Low Voltage
*SSD storage (enough capacity)
~$5,800

Intel Server
*SuperMicro SYS-1027TR-TF (2 nodes/2 sockets each)
*4x Intel Xeon E5-2620 2.0GHz 95W (6 cores/12 threads)
*8x 8GB DDR3 1600MHz ECC Reg RAM
*SSD storage (enough capacity)
~$4,600

I could bump the Xeon to the 2.3GHz E5-2630 model, but that would add ~$720 to the server's price. Intel's 8-core Xeons are simply too expensive to be considered here (using 4x Xeon E5-2650 would add almost $3,000 to the server's cost).

Even with its lower frequency, I expect the Xeon to deliver higher performance per thread than the Opteron. However, since the Opteron has more cores than the Xeon, I would also expect it to accomplish slightly more work overall. Being a single node should also simplify the OpenStack configuration, although from what I've read on a few OpenStack websites, that shouldn't be much of an issue.
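
For what it's worth, here's the kind of back-of-the-envelope comparison I have in mind. The per-core efficiency factor and SMT bonus are placeholders I made up, not benchmark results; real numbers from the actual workload would be needed to settle it.

# Back-of-the-envelope aggregate throughput model. per_core_factor and
# smt_bonus are placeholder assumptions, NOT measured results; replace
# them with benchmark data from the real workload before concluding anything.
def aggregate(sockets, cores, ghz, per_core_factor, smt_bonus=1.0):
    """Relative throughput = sockets * cores * clock * per-core efficiency * SMT bonus."""
    return sockets * cores * ghz * per_core_factor * smt_bonus

opteron_6376 = aggregate(sockets=4, cores=16, ghz=2.3, per_core_factor=1.0)
xeon_e5_2620 = aggregate(sockets=4, cores=6,  ghz=2.0, per_core_factor=1.4, smt_bonus=1.25)

print(f"Opteron 6376 box : {opteron_6376:6.1f} relative units")  # 147.2
print(f"Xeon E5-2620 box : {xeon_e5_2620:6.1f} relative units")  #  84.0

Under those made-up factors the Opteron box comes out ahead on raw throughput, but a different per-core factor, or a workload that doesn't scale past a few threads, would change the picture.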

So my concern is: is the AMD server worth 25% more than the dual-node Intel server for my intended purpose? I'm really not sure. I guess I should ask the question on some OpenStack forum, but I wanted to check with you guys first. I'll install Grizzly on top of CentOS; Red Hat has a nice and simple how-to guide.
 

CougTek

Hairy Aussie
Joined
Jan 21, 2002
Messages
8,728
Location
Québec, Québec
Yeah. I'll probably even propose to opt for a SuperMicro SYS-2027TR-HTRF+ and populate only two nodes to begin with, because of the redundant power supplies, something the 1U solutions lack. It would push the Intel solution up to ~$6,000, though. It would be the best solution in the medium to long term, but the added cost won't help me make them swallow the pill in the short term.
 

Chewy509

Wotty wot wot.
Joined
Nov 8, 2006
Messages
3,348
Location
Gold Coast Hinterland, Australia
Hi Coug,

Some items to consider:

1. Like the idea of a two node setup, for redundancy reasons.

2. You've got 4 CPUs and 8 DIMMs, which I assume means 2 DIMMs per CPU. Both the AMD and Intel CPUs are quad-channel, so there will be some performance loss from running them in dual-channel mode instead of quad-channel mode (effectively halving the available bandwidth to the CPU cores; not a good thing when you're trying to feed 6 or 8 cores). See the quick bandwidth sketch after this list for the numbers.

3. What is the workload: mainly FP or integer work? If it's mainly integer work, then the AMD solution will be faster than the Intel solution (8 more cores, plus each core runs at a higher clock. Note: AMD and Intel are nearly on par per MHz on pure integer code, and AMDs tend to do better with very branch-heavy code or code that runs on bytecode, e.g. .NET or Java).

4. What processes are you running? If they're heavily multi-threaded, like lots of RAM, and need to share that RAM between all the threads, then the single memory space of the AMD solution should come out ahead. Otherwise, if it's mostly single-threaded with few shared resources, it won't matter. (However, 2 nodes with 32GB each may not be as good as a single server with 64GB, depending on your load.) What's the maximum RAM requirement?

5. With a dual-node setup, you may have to be mindful of how much communication there is between the 2 nodes. You don't want to get bogged down waiting on the LAN adapters to transfer data between nodes for processing.

6. Which one offers the best bang for the buck will most likely come down to points 3, 4 and 5 above...

7. If the guy is spending $3-6K per month, then one or two servers for $6K is a bargain. (Try to push it based on annual costs, e.g. even at $3K x 12 months = $36K.) ;) This is what you show the company's accountants, since all they understand is the $$$.

8. ????

9. Profit
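
To put point 2 in numbers, here's a rough sketch of the theoretical peak per-socket memory bandwidth with DDR3-1600; real-world throughput will be lower, but the ratio is what matters.

# Theoretical peak DDR3-1600 bandwidth: 1600 MT/s x 8 bytes = 12.8 GB/s per channel.
channels_populated = 2               # 2 DIMMs per CPU, as in the quotes above
channels_available = 4               # both the Opteron 6376 and Xeon E5-2620 are quad-channel
per_channel_gbs = 1600e6 * 8 / 1e9   # 12.8 GB/s

print(f"As configured  : {channels_populated * per_channel_gbs:.1f} GB/s per socket")  # 25.6
print(f"Fully populated: {channels_available * per_channel_gbs:.1f} GB/s per socket")  # 51.2

Filling all four channels per socket (e.g. with more, smaller DIMMs) would double the theoretical figure for the same total capacity.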
 

CougTek

Hairy Aussie
Joined
Jan 21, 2002
Messages
8,728
Location
Québec, Québec
Both the AMD and Intel CPUs are quad-channel, so there will be some performance loss from running them in dual-channel mode instead of quad-channel mode (effectively halving the available bandwidth to the CPU cores; not a good thing when you're trying to feed 6 or 8 cores).
Didn't know that. I thought dual channel would be fine. Thanks for letting me know.

Regarding the workload and multi-threaded vs. single-threaded applications, I'll have to ask the programmer who's working with the project manager in question.

Network bandwidth shouldn't be an issue; gigabit connections should do the job. Anyway, I plan to eventually plug everything into 10GbE SFP+ DAC ports, but not immediately (that's further down the road, when more machines get renewed).

I plan to present things exactly the way you describe them in point #7. That's the reason motivating me to push this.
 

Chewy509

Wotty wot wot.
Joined
Nov 8, 2006
Messages
3,348
Location
Gold Coast Hinterland, Australia
It's interesting that you're looking at OpenStack. At university (7 weeks before I finish/graduate), I'm working in the HPC (High Performance Computing) lab on a project, primarily looking at information retrieval systems that can handle complete genomes of bacteria (approx. 10M base pairs) and perform various processing in a distributed manner (primarily similarity matrices and location awareness for machine learning applications).

Currently I'm negotiating to get exclusive access to one of the test nodes in the HPC lab for a day. It has 4x Intel E7540s, 256GB of RAM, a single Tesla M2090 and 11TB of local 15K SAS-based storage. To get access I need to justify why I need the resources (that part is pretty easy; the dataset I'm working with has a memory requirement of 145GB), but I also had to specify how many cores I needed, for how long, what type of code it is (int vs FP, single-threaded vs heavily multi-threaded), etc. It wasn't that hard to get access, but I needed to think through everything I needed, since the node has to be configured by HPC lab staff for me (I just get a shell account to run my stuff). I would have loved to get access to the Uni's supercomputer for a day, but they don't like giving a mere student access to it. (While there are no hard figures, the HPC staff member said the test node was the same as each node in the supercomputer, except the real nodes have updated Teslas for GPGPU. Unfortunately he wouldn't tell me how many nodes there are, but he did indicate that there is over 256TB of RAM and just under 1.2PB of storage, which by my math is 100 nodes :-o)

The types of questions I asked above are the same things I had to answer a few days ago... it can make the difference between looking good and proposing a sub-optimal solution that they won't like (even though the cost justification makes it necessary).
 

Chewy509

Wotty wot wot.
Joined
Nov 8, 2006
Messages
3,348
Location
Gold Coast Hinterland, Australia
Network bandwidth shouldn't be an issue; gigabit connections should do the job. Anyway, I plan to eventually plug everything into 10GbE SFP+ DAC ports, but not immediately (that's further down the road, when more machines get renewed).
In most cases it's not; heck, most small- to medium-scale Hadoop clusters run on stock 1Gb Ethernet. Would InfiniBand be an option if you needed low latency and high throughput? 40Gb InfiniBand HBAs are available at prices competitive with 10GbE, aren't they?
 

CougTek

Hairy Aussie
Joined
Jan 21, 2002
Messages
8,728
Location
Québec, Québec
40Gb InfiniBand HBAs are available at prices competitive with 10GbE, aren't they?

Not unless what I saw 6 months ago has completely changed. 40Gb adapters are significantly more expensive than SFP+ DAC 10Gb adapters. The price difference on the cables wasn't that high (~25% IIRC), but the adapters make the solution prohibitive unless you actually need the bandwidth. You just don't get it "just because".

Slightly unrelated, but a few moments ago I tried to fit my new 2.5" SSD, inside its 3.5" adapter, into a 3.5" SAS hot-swap bay of the spare PowerEdge 2950 I do my testing on, and the 2.5"-to-3.5" adapter places the SSD some 2mm too far toward the center for the SAS connector to mate inside the server. I'm pissed. Now I have to find another adapter that places the SSD flush with the side of the 3.5" sled.
 

Mercutio

Fatwah on Western Digital
Joined
Jan 17, 2002
Messages
22,232
Location
I am omnipresent
Unrelated to Coug's topic, but perhaps of general interest for people who get to do large-scale deployments: I found this link on Slashdot a little while ago, describing the peering requirements for ISPs who want their own private Netflix cache, as well as the machines Netflix uses to deliver content.

If you're willing to buy some space at one of the datacenters Netflix uses and can sustain a constant 2Gbps over a 10Gbps link, they'll give you a 108TB file server that can handle 90-95% of Netflix streaming requests.
 

blakerwry

Storage? I am Storage!
Joined
Oct 12, 2002
Messages
4,203
Location
Kansas City, USA
Website
justblake.com
I want to propose running OpenStack on a local server in order to replace the costly Amazon EC2 instances we buy to run some shared online video software. We have several large instances on Amazon, but medium instances would do in most cases. However, the guy in charge of that project has absolutely no discipline regarding the proper administration of resources (and company funds; he's been racking up $5,000+ monthly bills for the last few months). Hence my desire to run everything on a local server, which only costs money once, up front, and almost nothing afterward.

You mentioned the hardware needed, but I haven't seen any consideration regarding network bandwidth, performance, or availability. Are these services accessed by your staff only, or are they for clients? Will moving the servers in-house incur any additional WAN costs? Will it save on WAN costs? Why are these services on EC2 in the first place?
 

CougTek

Hairy Aussie
Joined
Jan 21, 2002
Messages
8,728
Location
Québec, Québec
The services on EC2 are used to send video files to our customers. We send between 2TB and 3TB of data monthly. The services are on EC2 because the people working on the project didn't know better when it started more than two years ago, and they haven't adjusted since. There's also a desire on their part to show that their video streaming technology is portable, easy to use and customizable.

I agree that EC2 can be a viable option for many scenarios, but I'm sure there are far more advantageous alternatives in this particular case.

Regarding network use, I can rent 4U of rack space with 5TB of monthly traffic for less than $300 per month. That's a far cry from the $5,000+ we pay right now. If we ever need more than a gigabit connection, an SFP+ dual-port network adapter costs between $400 and $800, depending on the model and whether it uses fiber or copper cable.
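
A rough sanity check on the bandwidth side, using the 2-3TB/month figure above; the peak-to-average ratio is an assumption and would need to be adjusted to the real traffic pattern.

# Average bandwidth needed to push ~3TB/month, plus an assumed burstiness factor.
tb_per_month = 3                      # upper end of the 2-3TB monthly volume above
avg_mbps = tb_per_month * 1e12 * 8 / (30 * 86400) / 1e6
peak_to_average = 10                  # assumed burstiness; replace with measured data
print(f"Average: {avg_mbps:.1f} Mb/s, assumed peak: {avg_mbps * peak_to_average:.0f} Mb/s")
# ~9 Mb/s average, ~93 Mb/s assumed peak -- comfortably within a 1GbE link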
 