Cheapest and most scalable web server used only for storage and URL links

John3478

What is this storage?
Joined
Aug 5, 2005
Messages
2
Which option would you choose for a server that acts only as storage, with URL links pointing to it (served with IIS or PHP)?
The purpose is to anticipate the size growing as big as Flickr or MySpace.com.
1. CX400 can support up to 60 disk drives (with a total capacity of 4.3 TB), up to 512 logical disk units, and up to 64 host server connections. It has a peak performance of 60,000 I/O operations per second and an I/O bandwidth of 680 MB/sec.
OR

2. Or use a smaller number of disk drives, maybe 15 per operating system (host). This takes more space in hosting, but it should be able to handle more connections per second on each link.

So basically I'm comparing:
a. 5 operating systems (hosts), each with 5 terabytes of disk
OR
b. 25 operating systems (hosts), each with 1 terabyte

Thank you
 

Handruin

Administrator
Joined
Jan 13, 2002
Messages
13,862
Location
USA
If you're considering a mid-range storage solution of this size, you may first want to consider your overall budget. From the little information you've provided, a decent storage solution at the CX400 level will involve a significant cost.

The CX400 (assuming you're referring to the EMC CLARiiON line of arrays) has been superseded by the CX500, and the CX300 offers specs comparable to the CX400. I would encourage you NOT to connect 60 hosts to a single array of this size, because your performance will suffer greatly. The CX500 and CX300 both ship with two service processors (SPs), each with 2Gb Fibre Channel ports (4 ports total). To connect more than four hosts, you will need a Fibre Channel arbitrated loop or a switched fabric environment. This, again, will add to your cost. As you can see, the bottleneck quickly becomes apparent at the SP level, and hopefully you can see why 60 host connections is not advised, even if supported.
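
To put rough numbers on that bottleneck, here is a minimal Python sketch that splits the array's quoted peak figures (680 MB/sec and 60,000 I/O operations per second, from the spec in the first post) evenly across the connected hosts. The even split and the function name `per_host_share` are assumptions for illustration only; a real workload will not divide this neatly, and peak figures are rarely sustained.

```python
def per_host_share(peak_mb_per_sec, peak_iops, hosts):
    """Return (MB/sec, IOPS) per host if the array's peak is split evenly.

    Assumption: every host gets an equal slice of the quoted peak numbers.
    """
    if hosts <= 0:
        raise ValueError("hosts must be positive")
    return peak_mb_per_sec / hosts, peak_iops / hosts


# Compare the two host counts from the question with the 60-host maximum.
for hosts in (5, 25, 60):
    mb, iops = per_host_share(680, 60_000, hosts)
    print(f"{hosts:>2} hosts: {mb:6.1f} MB/sec and {iops:8.0f} IOPS each")
```

Even under this optimistic even split, 60 hosts would see only about 11 MB/sec each, which is why the supported maximum is not a sensible target.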

The next aspect you really need to plan for is data availability. If you really want a high-availability solution to complement this level of storage, you will want to seriously consider a two-switch fabric solution with redundant ISLs (inter-switch links). Each switch will have two fibre connections running to each SP, and two links connecting the switches together. All Fibre connections should also be run in different locations. By running a setup in this fashion, you will remove any single point of failure, short of the entire array dying. You can also plan for an entire array failure with MirrorView, but you'd better break out your wallet for that level of availability.

Next, you will want to consider redundant host connections with a PowerPath solution and dual HBAs. This will increase the reliability and availability of your environment and also improve performance to some degree. If you've seen large dollar signs fly by your eyes, you aren't mistaken.

Next, you should also consider your backup strategy and recovery plan. If you fill up the CX300/500 to full capacity, you'll need a backup device such as tape, or a second CX300/500 (possibly even a NAS solution). You will also want to plan for enough storage space to utilize SnapView for an added backup strategy while in production.

You also need a decent HVAC environment and adequate power for the array and hosts. They will shut down if overheated. None of this was mentioned in your plan, and you need to consider it in your budget.

In my opinion, you should connect as few hosts as you need to get the job done. I'd personally suggest 5 hosts over 25 if they will be doing the same amount of work. 25 hosts create more heat, consume more power, and offer 5x the opportunity for failure just at the host level, never mind the individual parts within the hosts. If you need more horsepower per host, bump up the specs for CPU/RAM if that makes sense. Since I don't really know what the hosts will be doing individually, my recommendations may not be correct, so please provide more info.
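
The "5x the opportunity for failure" point can be sketched with a little probability. This is a rough illustration only: the 5% yearly failure chance per host is a made-up number, the hosts are assumed to fail independently, and `prob_any_host_fails` is just a name picked for the example.

```python
def prob_any_host_fails(hosts, p_fail):
    """Chance that at least one of `hosts` machines fails.

    Assumption: each host fails independently with probability p_fail.
    """
    return 1.0 - (1.0 - p_fail) ** hosts


P_FAIL = 0.05  # hypothetical per-host failure chance per year, not a measured figure

for hosts in (5, 25):
    expected = hosts * P_FAIL  # expected number of failed hosts scales linearly
    at_least_one = prob_any_host_fails(hosts, P_FAIL)
    print(f"{hosts:>2} hosts: expect {expected:.2f} failures, "
          f"{at_least_one:.0%} chance of at least one")
```

With that assumed 5% figure, the expected number of failures goes from 0.25 to 1.25 (the 5x), and the chance of seeing at least one failure in a year goes from roughly 23% to roughly 72%.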
 

John3478

Wow Handruin, you are the man. Thank you for the detailed explanation.

We are in the planning stage, designing the architecture and working out how to accommodate current and future storage. Currently, each individual will have 10GB of space, and there are about 300 users in total. As the number of users increases every year, we plan to increase the size of the storage.
Now, if it is opened to the public, we will have the same scenario as MySpace.com.
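
As a rough sizing sketch in Python: 300 users at 10GB each is 3,000GB of quota today. The yearly growth rate below is purely an assumption (we do not have a real figure yet), and `projected_storage_gb` is just a name for this example.

```python
def projected_storage_gb(users, quota_gb, yearly_growth, years):
    """Return total quota in GB for year 0 through `years`.

    Assumption: the user count compounds by `yearly_growth` each year
    (1.0 means the user base doubles) and every user fills the full quota.
    """
    totals = []
    for year in range(years + 1):
        totals.append(users * (1 + yearly_growth) ** year * quota_gb)
    return totals


# 300 users, 10GB each, hypothetically doubling every year for 3 years.
for year, total in enumerate(projected_storage_gb(300, 10, 1.0, 3)):
    print(f"year {year}: {total / 1000:.0f} TB of quota")
```

This counts quota only. Any RAID overhead, backup copies, and snapshot space would come on top of these totals.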

Hopefully this example scenario will make it clearer. We plan to create the same setup as Flickr photo storage:
http://www.flickr.com/photos/stewart/1795/

The server basically stores photos and videos, and at the same time it will run a PHP web server.
For example:
http://photos1.flickr.com/1795_c3348e07d2.jpg?v=0
The above link points to the server and grabs the photo or video.

I think there will be a lot of hits to the web servers (PHP servers). PHP is not storing sessions or doing user authentication. It just grabs the image or video and sends it to the user.
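
Here is a minimal Python sketch of that idea: take a photo URL, work out which file it names, and decide which storage host would hold it. Everything named here (`STORAGE_HOSTS`, `pick_host`, `local_path`, the `/data/photos` root) is hypothetical, and this is not how Flickr actually does it; it is only meant to show the shape of the lookup.

```python
import hashlib
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical host names, loosely modeled on the photos1 prefix in the URL above.
STORAGE_HOSTS = ["photos1", "photos2", "photos3", "photos4", "photos5"]


def pick_host(filename, hosts=STORAGE_HOSTS):
    """Choose a storage host by hashing the file name (simple modulo sharding)."""
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    return hosts[int(digest, 16) % len(hosts)]


def local_path(url, root="/data/photos"):
    """Map a photo URL to a file path under `root`, ignoring the query string."""
    name = PurePosixPath(urlparse(url).path).name
    if not name or name in (".", ".."):
        raise ValueError("URL does not name a file")
    return str(PurePosixPath(root) / name)


url = "http://photos1.flickr.com/1795_c3348e07d2.jpg?v=0"
print(local_path(url))
print(pick_host(PurePosixPath(urlparse(url).path).name))
```

One caveat with modulo sharding: changing the number of hosts remaps most files, so a lookup table or consistent hashing would probably suit a system that is expected to grow.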

The initial budget is not big, probably around 50000. However, it can increase as the number of users increases. That's why I mapped out the 5-terabyte and 1-terabyte scenarios.

-----From google storage lesson-------------------------------
http://www.techworld.com/features/i...isplayfeatures&featureid=467&page=1&pagepos=4

"With 6 billion web pages to index and millions of Google searches run daily you would think, wouldn't you, that Google has an almighty impressive storage setup. It does, but not the way you think. The world's largest search company does use networked storage but in the form of networked clusters of Linux servers, cheap rack'em high, buy'em cheap x86 servers with one or two internal drives.
A cluster will consist of several hundred, even thousands of machines, each with their internal disk. At the last public count, in April 2003, there were 15,000 plus such machines with 80GB drives. As an exercise let's assume 16,000 machines with 1.5 disk drives, 120MB, per machine. That totals up to 1.84TB. In fact Google probably has between two and five petabytes altogether, if we add in duplicated systems, test systems and news systems and Froogle systems and so forth. Why does Google use such a massively distributed system?
"
 

Handruin

Sounds like a nice project. My initial response is probably overkill, but not impractical depending on your required availability. Are you planning to do RAID 1+0 like Flickr, or another level of RAID like 5 or 1?
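
The RAID level matters for your budget because it changes how much of the raw disk you can actually use. A minimal Python sketch, assuming equal-sized disks and ignoring hot spares and formatting overhead (the 10 x 146GB example is hypothetical, not a quote for any particular array):

```python
def usable_capacity_gb(disks, disk_gb, level):
    """Usable capacity of one RAID group built from equal-sized disks.

    Assumptions: no hot spares, no formatting overhead, a single group.
    """
    if level == "1":
        if disks != 2:
            raise ValueError("RAID 1 here means a single mirrored pair")
        return disk_gb                    # one disk's worth, the other is a mirror
    if level == "1+0":
        if disks < 4 or disks % 2:
            raise ValueError("RAID 1+0 needs an even number of disks, at least 4")
        return (disks // 2) * disk_gb     # half the disks hold mirror copies
    if level == "5":
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) * disk_gb      # one disk's worth goes to parity
    raise ValueError(f"unsupported RAID level: {level}")


for level in ("1+0", "5"):
    print(level, usable_capacity_gb(10, 146, level), "GB usable")
```

For the same ten disks, RAID 1+0 gives 730GB usable and RAID 5 gives 1,314GB, so RAID 1+0 costs noticeably more per usable gigabyte.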

You can buy a CX300 with a partial rack; however, I think it needs at least 5 disks populated to start (1 DAE). The array can grow as you need it while online. You can choose between internal Fibre Channel disks, SATA, or ATA disks for different budgets. If you don't want to start with switches and a SAN, you can attach the hosts directly to the Fibre Channel ports, but you'll be limited to four connections.

For your PHP web server, you may want to consider a caching tool such as PHP Accelerator or Turk MMCache. I've used the former with good results, and also heard good things about the latter.

If your traffic will be high and the content relatively static, you could also look into a caching solution such as Squid. Wikipedia uses this in their environment.
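
To show why caching pays off for static content, here is a toy Python sketch. It is not Squid and not a Squid configuration, just an in-memory cache in front of a pretend storage read; `fetch` and `origin_reads` are names made up for the example.

```python
from functools import lru_cache

origin_reads = 0  # counts how often the slow backing store is actually touched


@lru_cache(maxsize=1024)
def fetch(path):
    """Stand-in for reading a photo from the storage array (hypothetical)."""
    global origin_reads
    origin_reads += 1
    return f"bytes of {path}".encode("utf-8")


# Simulate 1,000 requests for the same popular photo.
for _ in range(1000):
    fetch("/1795_c3348e07d2.jpg")

print("reads from storage:", origin_reads)
print("served from cache:", fetch.cache_info().hits)
```

In this toy run, only the first request reaches storage and the other 999 are answered from the cache. A real cache in front of the web servers works on the same principle, with eviction and expiry rules on top.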
 