Blades and Virtual Infrastructure: Who thought this was a good idea?

Blade servers are all the rage these days, but let’s jump back a bit and remember the original intent. Blades are a consolidation mechanism. The first blade servers crammed more underpowered physical servers into a smaller U (rack unit) footprint. Over time the chassis have become more redundant and the blades have grown to equal their physical brethren, but the intent remains the same! Sounds a lot like the mission statement for virtualization, doesn’t it? I have several issues with blade servers and virtualization:

  1. Most datacenters don’t have enough PSC (power, space, and cooling) to support blades.
  2. If you’re using virtualization then you already have a consolidation mechanism.
  3. Your upgrade path is fairly static, and exorbitantly expensive.
  4. Losing a blade center is too destructive to the IT organization.

If you’re in a datacenter that was built more than three years ago, you can’t run modern blades! Well, that is, unless you place an 8-16U blade center in a rack all by itself. If we can agree on that last part, then I ask you: what’s the point? This is further compounded when you start to place the VMware stack on the infrastructure. DPM is purpose-built to address the PSC problem, but powering down a blade lowers the amperage draw far less than powering down a standalone physical server would! Modern blades are also a nightmare for the cooling guys, concentrating a large thermal footprint into a small space.

If your datacenter has been constructed to support blades, then you sir are responsible for global warming…

Regardless of the OS or workload, isolating a service will drastically increase reliability. Blades were a small step toward consolidation; virtualization is the quantum leap! A blade center enables 2x-3x the rack density (assuming adequate PSC) vs. physical servers. A conservative virtualization effort will offer at least 5x-20x. I know the counter is to put virtualization on blades. In theory this would result in 10x-60x (2x-3x × 5x-20x), but in practice blades lower your consolidation ratio due to the restrictions they place on RAM, network, and PCIe.
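The density arithmetic above is just an interval product; a quick sketch using the ranges from this post:

```python
# Consolidation ratios from the post: blades give 2x-3x rack density,
# virtualization gives 5x-20x, and stacking them multiplies the ranges.
blade_density = (2, 3)   # blade center vs. physical servers
virt_ratio = (5, 20)     # conservative virtualization effort

combined = (blade_density[0] * virt_ratio[0],
            blade_density[1] * virt_ratio[1])
print(f"theoretical combined consolidation: {combined[0]}x-{combined[1]}x")
# In practice the blade constraints (RAM, network, PCIe) pull the real
# virtualization ratio below the rack-mount figure, so the product shrinks.
```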

This is one point you just can’t argue. Blades work by eliminating elements of the server to fit in a smaller space: remove a processor to fit a hard drive, remove the power supply, eliminate PCIe slots. Vendors will tout the modularity and ease of upgrade, but if you look closely it’s a closed list of supported components. Worse yet, most of these will only work with that blade/blade center, and they’re all more expensive than their physical brethren.

Blades also have Moore’s Law laughing at them. It takes time to re-engineer component X to fit in a blade. It also takes money, thus guaranteeing that some components will never be ported to the blade architecture. And in the event that they are, they’re always out of date, and more expensive!

As IT does the nines dance around SLAs/OLAs with internal and external customers, our infrastructure becomes increasingly complicated and less manageable. When you compound consolidation on top of availability, the price skyrockets! Again I ask, what’s the point? The very technologies that enable blades make them inherently less reliable. If you lose a blade center you lose 10-28 servers. If those servers were running VMs, you’re screwed! No reasonable infrastructure can handle the loss of 10-280+ servers at once. If yours can, then I assure you you’re in the minority. Instead IT is forced to assume that this won’t happen, and then privately plan for when it does!
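The failure-domain math above works out as follows; the VMs-per-blade figure is implied by the 10-280+ range rather than stated outright, so treat it as an assumption:

```python
# A lost blade center takes out every blade in the enclosure at once.
blades_per_center = (10, 28)   # range quoted in the post
vms_per_blade = 10             # assumed, implied by the 10-280+ figure

worst_case_vms = blades_per_center[1] * vms_per_blade
print(f"servers lost: {blades_per_center[0]}-{blades_per_center[1]}")
print(f"VMs lost if virtualized: up to {worst_case_vms}+")
```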

If you’re utilizing one of the many high availability technologies – MSCS, Veritas Cluster, VMware HA/Lock-Step, etc. – then blades are a serious thorn in your side. With every step you now have to calculate an additional dependency! A particularly nasty one at that: the number of interdependencies at play in a blade center makes it almost impossible to account for them all.

As you’ve probably already surmised, I am not a big fan of virtualization on blades. I think it’s a waste of money; the ROI of a blade center and the ROI of virtualization are not in concert. In theory a blade center pays for itself around 80% capacity, but the inherent limitations of a blade server mean you need more servers! Since most virtualization is licensed per processor, the more VMs you can get on a processor, the better the ROI!

Seriously, think about it: what do we run out of first in a virtual infrastructure? Memory is usually the first to go, and it is also the most expensive asset on a blade. Blades force one to utilize higher-density DIMMs, and even then they can’t reach the capacity of their physical cousins. I/O is usually the next to go; while it is true that saturation of the pipe (FC, iSCSI, SAS, NFS) is less common than simply overcommitting the underlying disks, the lack of onboard storage with blades eliminates one of the mitigating techniques in play: moving non-essential, low-priority VMs onto local storage!
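A back-of-the-envelope check makes the memory point concrete. The slot counts and RAM target below are hypothetical, chosen only to illustrate why fewer DIMM slots force you onto higher-density (and pricier) modules for the same capacity:

```python
# Hypothetical figures for illustration only.
target_ram_gb = 64        # RAM we want per host
rack_dimm_slots = 16      # a full-size rack server of the era
blade_dimm_slots = 8      # a space-constrained blade

def dimm_size_needed(target_gb, slots):
    """Smallest power-of-two per-DIMM capacity that reaches the target."""
    size = 1
    while size * slots < target_gb:
        size *= 2
    return size

print(dimm_size_needed(target_ram_gb, rack_dimm_slots))   # 4 GB DIMMs
print(dimm_size_needed(target_ram_gb, blade_dimm_slots))  # 8 GB DIMMs
```

Halving the slot count doubles the module density required, and high-density DIMMs historically carried a steep price premium per gigabyte.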

It’s hard to convince the money guys that a $40k 2U server will save them money in the long run. I’m learning that first hand. OEMs have their sales guys out in force and the pitch for blades has been perfected, but I’ll take 4-6 physicals over 14-28 blades any day!

4 thoughts on “Blades and Virtual Infrastructure: Who thought this was a good idea?”

  1. Simply going through the first paragraph and some more made me sad.
    A somewhat intelligent person is not considering:
    a) power consumption per cpu core
    b) MORE consolidation is always* better than LESS consolidation (*- but you have to manage the risks)
    c) you do NOT want to start with only one Blade center.
    I’m currently in the middle of moving into a new datacenter that was built to the specs of blades. Yes, I will consume more power per rack, but NO – I will NOT consume more energy than the same datacenter (same amount of cores) run on “regular” rackmount servers.
    My calculations, taking into account 42U rack cost (rent) + energy cost + SAN fabric cost + Ethernet cost (access and core layers), show that one full HP C7000 blade enclosure populated with 16 fully loaded BL495s + 6 Cisco switches + 2x BR SAN FC, etc., will give out a total of 332.8 GHz (CPU clock × cores) and 1024 GB of RAM; with the average VM being 1 GHz + 3 GB, you end up with 333 VMs on that setup.
    It will consume 10U of rack space and 5250 W of energy (max).
    The same setup with the best price/value rack server for VMs (DL585) is:
    8 servers, total GHz = 320, total GB = 1024, total U’s = 32 (3x as much), total energy = 7016 W (33% more). Giving 320 VMs.
    And this is without calculating in the power and U’s of the extra switches you will need to get the same amount of Ethernet ports that was calculated into the blade setup.

    So in short: if you want to use blades you have to start with more than one enclosure (otherwise you will also consolidate your risks into one box).
    With the same amount of computing power you will save 3x+ the space and 30% of the energy.
    The TCO over 5 years, depending on your U price and electricity alone, will be about 20% better (the blade is a bigger investment, but not by much – less than 10% more for the same power).

    So – if somebody tries to convince me that I contribute to global warming by having a datacenter full of blades… sorry, do your calculations based not on energy per rack but on energy per unit of computing power.
    Bottom line:
    With blades my average VM takes about 16 W of energy to run vs. 22 W with the DL585, I will fit 33 of them per U vs. 10(-) with the DL585, and they will weigh about 619 grams per VM instead of 1075(+).
    Yes – they will take less power (more green) and weigh less (more green) than rack machines.
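The commenter’s bottom-line figures can be reproduced directly from the numbers given in the comment above:

```python
# Figures quoted in the comment above.
blade = {"watts": 5250, "vms": 333, "rack_u": 10}   # C7000 blade setup
rack  = {"watts": 7016, "vms": 320, "rack_u": 32}   # DL585 rack setup

for name, s in (("C7000 blade", blade), ("DL585 rack", rack)):
    w_per_vm = s["watts"] / s["vms"]
    vms_per_u = s["vms"] / s["rack_u"]
    print(f"{name}: {w_per_vm:.0f} W/VM, {vms_per_u:.0f} VMs/U")
# → roughly 16 W/VM and 33 VMs/U for the blade setup,
#   22 W/VM and 10 VMs/U for the rack setup.
```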

  2. Toomas Kärner,

    Consolidation is my point: in my environment I do not have a single host that is out of CPU. As a matter of fact, 80% of my hosts are seriously underutilized. The reason for this is the constraints the small form factor (SFF) blades must conform to. Bottom line: yes, big boxes eat more PSC, but you need far fewer servers when they are used in conjunction with virtualization. I’ll sit down this evening and try to better state my case.


Leave a Reply