VMware vSphere 5 Host NIC Network Design Layout and vSwitch Configuration [Major Update]

This is an update to an older post and I wanted to overhaul it for the Indy VMUG... This was also another VMworld submission that didn't get the votes. See what you guys are missing out on? :)

 

As vSphere has progressed, my current 6, 10, and 12 NIC designs have slowly become outdated. In an effort to update these for VMware vSphere 5, I took the 2 most popular configurations of 6 and 10 NICs and updated the Visios to make them a bit prettier. I also don't know how much longer these will be necessary as the industry moves toward 10GbE as a standard. I also added a few more designs because the inclusion of Fibre Channel has been requested.

 

The assumption behind these physical NIC designs is that the hosts are going to be configured with Enterprise Plus licensing so all the vSphere features can be used. I didn't create a bunch of different designs as before because ideally you want to stick with a simple design that meets the criteria for most people. In addition, I have updated these configs to perform multipathing for iSCSI and removed the use of EtherChannel configurations because those were mostly needed on standard vSwitch configurations. I would also recommend that you start moving everything over to a vNetwork Distributed Switch configuration because it is the easiest way to standardize across all of your hosts. vSphere 5 implemented a better HA and failed-host policy, so the use of a hybrid solution is fading as well. If you are using a standard vSwitch, please adjust these designs appropriately.

 

The key to any design is discovering requirements. I shouldn't have to say it because that's really the first rule of design, so I won't go any deeper on it. Once you have your design requirements, you need to start thinking about the rest of these components. The goal of any good design is to find the right balance of redundancy and performance while keeping budgetary constraints in mind.

 

 

 

Design considerations

  • Discovering Requirements
  • Network Infrastructure
  • IP Infrastructure
  • Storage
  • Multiple-NIC vMotion
  • Fault Tolerance
  • vSphere or vCloud
  • Strive for Redundancy & Performance

 

Network Infrastructure

The size of the business will usually dictate the type of servers you acquire. An SMB might go with rack mount servers, while a larger enterprise will probably go with a blade infrastructure. A blade infrastructure gives larger enterprises the ability to pack more compute power into a tighter space. An SMB will go with rack mount servers because you need at least 2 blade chassis to account for an entire chassis failure. While it doesn't happen often, you need to be prepared. Again, it's about the balance of performance and redundancy. In addition, filling 2 chassis with only a few blades is not cost effective.

 

Rack mount servers have the same compute potential as blade servers, but each server is its own failure domain, so a single hardware failure affects less of your environment.

 

Blade servers, as I said before, can compress your computing footprint, and with platforms like UCS you also get technologies such as FCoE, which can reduce your cabling footprint. We will talk about that next.

 

The number of VMs your host can hold should be an indicator of whether you need 1GbE or 10GbE. Using 6 1GbE NICs and IP storage can satisfy servers that have 96GB of RAM. What if your server has 192GB of RAM? Depending on the number of VMs you're hosting and the type of applications and traffic patterns, a 10GbE environment might be necessary. Again, this will depend on budget and performance needs.

 

Physical networking infrastructure also plays a crucial role. By default, you should have two switches. This accounts for a completely failed switch while still allowing production workloads to function while the problem is addressed. Now let's examine how the switches talk to each other. The only approach that IS NOT recommended is a set of switches that can only be paired with one another via trunk links on the switch ports themselves. This leads to configuration headaches, decisions on port group routing, and possible spanning tree problems, among others.

 

Having stacked switches that are configured via stacking cables or through Cisco virtual port channels (vPC) is the recommended solution. This will give you the redundancy needed during a failed switch while at the same time making configuration much easier because it's all viewed as a single solution. This can be accomplished with switches like the Catalyst 3750s or the Nexus 5500s. Another option is using a single high-end chassis switch such as a Cisco Catalyst 4500/6500 or Nexus 7000, which contains slots that can be filled with Ethernet line cards, so that by design you can distribute your connections among the slots.

 

On the configuration side, PortFast needs to be enabled for a vSphere environment because if a port comes online during a failure, or if packets start flowing through that port, you want it to skip the Listening and Learning stages of the Spanning Tree port states and go directly into the Forwarding state. Without it, when a port comes online there is a convergence period of around 20-60 seconds while spanning tree makes sure the port isn't part of a loop. Enabling PortFast makes HA failover and failback events more predictable and reduces downtime as well. Read more at VMware Networking, Don't Forget STP

 

IP Infrastructure

 

Running a 1GbE infrastructure will require more hardware. A typical server will come with 2 1GbE NICs, and this is of course an insanely horrible vSphere design. If you're running hosts with only 2 1GbE NICs, you're in the right session. If you are running a substantial workload, you are easily maxing out bandwidth capacity by running management, vMotion, and virtual machine traffic, and if you are running IP storage, it's definitely saturated. Not to mention, the loss of a physical NIC will put you in a very bad spot as well.

You will want to add an expansion card with more NICs into your server. Remember, these NICs need to be on the VMware HCL or you may have wasted a ton of money. Always refer to the VMware HCL for any piece of equipment you add to your vSphere environment. VMware has a maximum of 32 1GbE NICs that can be on a single host. If you have a use case for 32 1GbE NICs, I would love to hear it, and if you do, it's probably because you still believe in physical separation. The most common case is adding 1 or 2 4x1GbE expansion cards, which gives a total of 6 or 10 NICs per host, because most rack mount servers only have 2 PCI slots available. If you have an FC network, adding 1 2x1GbE card may suit most needs as well. You may have more PCI slots available and can add more NIC expansion cards for greater redundancy, but performance-wise, most use cases are satisfied with 6 NICs.

You also have to understand that more NICs mean more switch ports. The more switch ports you need, the more physical switches are needed. The more NICs you have, the more cables are needed. The more cables needed, the more meticulous you have to be about making sure your datacenter isn't turning into a rat's nest. More hardware also means more electricity to power the PCI cards, switches, and ports, and you need to cool all of this as well. So the cost of adding more 1GbE NICs to your environment compounds quickly.
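
To put rough numbers on how NIC count drives cabling and switch count, here is a minimal Python sketch. The host count and the 48-port switch size are assumptions chosen for illustration, and the function name is hypothetical. It ignores ports lost to uplinks or stacking.

```python
import math

def cabling_footprint(hosts, nics_per_host, ports_per_switch=48):
    """Return (cables, switches) needed to connect every host NIC.

    ports_per_switch is an assumed value; real switches also lose
    ports to uplinks and stacking, which this sketch ignores.
    """
    cables = hosts * nics_per_host  # one cable and one switch port per NIC
    # Never fewer than two switches, so a single switch failure is survivable.
    switches = max(2, math.ceil(cables / ports_per_switch))
    return cables, switches

for nics in (2, 6, 10):
    cables, switches = cabling_footprint(hosts=16, nics_per_host=nics)
    print(f"{nics:>2} NICs/host: {cables} cables, {switches} switches")
```

With these assumed numbers, going from 6 to 10 NICs per host on 16 hosts takes you from 96 to 160 cables and from 2 to 4 switches, before counting power and cooling.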

 

10GbE on the other hand works in the opposite manner. You can now start densely populating your environment. Fewer wires are needed and more bandwidth is gained, but it also comes with a cost factor. Who here is running 10GbE today? And I'm sure many of you are looking at implementing 10GbE within your environment, or at least at your core, within the next year or two. If I had to suggest one or the other for building a virtualized environment, I would pony up the money and go with a 10GbE solution. You can also expand your hosts with up to 8 10GbE NICs, but what I've found is that using only 2 10GbE NICs satisfies 95% of performance requirements in most environments. Adding more than 2 10GbE NICs will only increase redundancy in most cases. One requirement is of course that your switches need to be 10GbE capable; that should go without saying. On the flip side, depending on the number of hosts you have, you only need a limited number of switches.

 

There is also the possibility of getting a server with 2 1GbE NICs and adding an expansion card for 2 10GbE NICs. Can we use the 1GbE NICs? Absolutely. Are we going to use them in conjunction with the production side? Perhaps. But an even better idea is to use them as a fail-safe, a design we will show later. You have to remember that even if you aggregate NICs together in a port group or do LACP trunks, the throughput of any single flow on that link will never be greater than 1Gb. So why use a 1GbE NIC when you have a 10GbE NIC that can push 10x the bandwidth?

 

Then there is Cisco UCS or HP Virtual Connect FlexFabric. I don't know jack about HP because I work for VCE and I deal with Cisco UCS, so that's what we will focus on. The Cisco UCS platform is a revolutionary approach to blade technology. A caveat is that it requires Cisco-specific hardware for the functionality. The other crazy thing is that you can deploy 256 virtual interfaces on the most recent VIC 1280, which means 256 virtual NICs. The only reason you would do this is because you don't understand logical separation. It's a bit overkill because at the end of the day, there are really only 2 10GbE adapters on the backend. I will show this design later on as well. The one thing to note is that these 2 10GbE adapters are running FCoE, so you use the same 10Gb links to deliver 4Gb lossless FC connectivity.

 

 

Storage Infrastructure

 

If you have an IP-based storage infrastructure, you will need to take that into account in your physical NIC design. It's a best practice to have at least 2 1GbE NICs dedicated to IP storage. You also have the option to configure jumbo frames on your switches. It's recommended to make sure your switches are enabled for jumbo frames, even if you don't plan on using them. Configuring anything for jumbo frames later on down the road usually involves a switch reboot. I won't go into the 1500 vs 9000 MTU settings for IP storage because there are plenty of discussions out there about it. Find out what works for you and is supported by your storage manufacturer.

 

If you have a 1GbE infrastructure, you have the option of creating an isolated set of stacked switches just for your storage infrastructure. This is primarily for security reasons. Though this might strain the budget, it does heighten the security of the storage to a varying degree. If you find the separate stack of switches to be inflexible and cost prohibitive, you need to create your IP storage network on a non-routable layer 2 network. This is done to limit the amount of cross-chatter from VMs and other devices sending out broadcast packets. The other big reason is that IP storage protocols send data in clear text, so it's very easy to sniff and steal information. The easiest way to mitigate this with iSCSI is to make sure CHAP authentication is set up between your initiator and target.

 

If you have a 10GbE infrastructure, you do not have to have a dedicated pair of NICs for IP storage, but you must remember to use NIOC on a vDS or QoS on the 1000v to keep traffic prioritized. And of course you aren't going to have separate switches.

 

If you have a Fibre Channel backend, then you don't need to worry about setting aside 1GbE NICs for IP storage. You can use these extra NICs for other features in vSphere. If you have a 10GbE infrastructure, this relieves the stress that IP storage would put on the shared bandwidth of that pipe, allowing other vSphere functions to take advantage of greater bandwidth.

 

I'm not going to get into the battle of IP storage vs Fibre Channel because that's a battle no one will win. Instead, I want you to think about what design decisions need to be made if IP storage or FC storage is in play.

 

Moving on, I’m going to show you the configuration differences between NFS and iSCSI.

 

Here is the logical representation. The iSCSI array has a single interface. After binding the vmkernel interface to the adapter and presenting the LUNs, there are multiple paths to the datastores as seen by the hypervisor. By default, most arrays will choose a single path. This setup will only use a single NIC, and the other NIC is basically only there in case of an emergency. The link being utilized is shown as Active (I/O) while the other link is shown as Active.

 

If your array supports Round Robin, you can change your datastores to be accessed via Round Robin. This mechanism utilizes both NICs and paths to storage. When viewed inside the vSphere Client, both paths to the datastore show as active with I/O flowing. The caveat to this is that traffic is flip-flopped every 30 seconds, so in reality only 1 NIC is used at a time for all traffic. Failover is helped a bit in this situation because instead of your datastores being down for 2-3 minutes while the path is deemed dead, it may only be a matter of 30 seconds during the flip time.

 

You may be thinking that NFS configuration and design would be similar. It is, but there are ways to optimize it to gain the best performance. Since vSphere doesn't use NFSv4 and still relies on v3, there is a 1:1 relationship with NFS connections. Just like iSCSI, NFS traffic will only flow through one pipe at a time. So utilizing EtherChannel or LACP isn't gaining you much benefit unless you know what you are doing. The only way the hashing algorithm load balances is if a specific initiator has to send large chunks of data to separate recipients. Not to mention that utilizing EtherChannel and IP hash algorithms has a lot of caveats involved. So how do we make the most out of the NICs being used for NFS traffic?
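
A simplified model helps show why an IP-hash team does not speed up a single NFS connection. The sketch below assumes the uplink is chosen by XOR-ing the source and destination addresses and taking the result modulo the number of uplinks, which is how the vSphere IP hash policy is commonly described. Physical switches may hash on other fields, and the addresses are hypothetical.

```python
import ipaddress

def ip_hash_uplink(src, dst, uplinks):
    """Pick an uplink index from a source/destination IPv4 pair.

    Simplified model: XOR the two addresses as integers, then take
    the result modulo the number of uplinks in the team.
    """
    s = int(ipaddress.IPv4Address(src))
    d = int(ipaddress.IPv4Address(dst))
    return (s ^ d) % uplinks

host = "10.0.0.10"  # hypothetical vmkernel address
for target in ("10.0.0.50", "10.0.0.50", "10.0.0.51"):
    # The same host/target pair always hashes to the same uplink.
    print(host, "->", target, "uses uplink", ip_hash_uplink(host, target, 2))
```

In this model, all traffic between one vmkernel port and one NFS target lands on the same uplink no matter how many NICs are in the team. A second uplink only carries traffic once there is a second target address that happens to hash differently.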

 

The first thing you have to do is see if your storage array can present multiple NFS targets (or virtual interfaces) on separate subnets. The objective is to create multiple channels of communication. The bad part is that a datastore can only be associated with a single interface on the storage array, but we can work around this. In this scenario, our storage array is presenting 4 datastores from 2 different interfaces. We create 2 port groups, each containing a vmkernel port on the subnet of one virtual interface. By setting the vmnics as Active / Stand-by on each port group, we know that we are going to be utilizing a single NIC for each interface to access the datastores. By setting the other vmnic to stand-by, we know that in case of a NIC failure, or even a switch failure, communication can be re-routed over to the other uplink.

Thanks to this solution from Chris Wahl - NFS on vSphere – Technical Deep Dive on Multiple Subnet Storage Traffic
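
The multiple-subnet layout above can be sketched as a small Python model. Every name and address here (vmk2, vmk3, the vmnic assignments, the 10.0.10.0/24 and 10.0.20.0/24 subnets, the datastore names) is a hypothetical placeholder, not taken from the diagrams.

```python
import ipaddress

# Two vmkernel ports, one per storage subnet, with mirrored Active / Stand-by uplinks.
VMKERNEL_PORTS = {
    "vmk2": {"ip": "10.0.10.11/24", "active": "vmnic2", "standby": "vmnic3"},
    "vmk3": {"ip": "10.0.20.11/24", "active": "vmnic3", "standby": "vmnic2"},
}

# Four datastores presented from two array interfaces (one per subnet).
DATASTORES = {
    "ds01": "10.0.10.50",
    "ds02": "10.0.10.50",
    "ds03": "10.0.20.50",
    "ds04": "10.0.20.50",
}

def vmk_for_target(target_ip):
    """Return the vmkernel port whose subnet contains the array interface."""
    addr = ipaddress.ip_address(target_ip)
    for name, cfg in VMKERNEL_PORTS.items():
        if addr in ipaddress.ip_interface(cfg["ip"]).network:
            return name
    return None

def uplink_for_datastore(datastore, failed=()):
    """Return the vmnic carrying a datastore's traffic, honoring failover order."""
    cfg = VMKERNEL_PORTS[vmk_for_target(DATASTORES[datastore])]
    for nic in (cfg["active"], cfg["standby"]):
        if nic not in failed:
            return nic
    return None  # both uplinks are down

for ds in DATASTORES:
    print(ds, "normal:", uplink_for_datastore(ds),
          "| vmnic2 failed:", uplink_for_datastore(ds, failed={"vmnic2"}))
```

In normal operation the four datastores are split across both NICs, and when one NIC fails its datastores fall over to the stand-by uplink.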

 

 

 

As we have heard time and time again, one of the most common configurations is to set your port groups to Route Based on Physical NIC Load. This is an optimal setting in most configurations because the traffic can do a few things.

 

First, it will probably route traffic as you expected in the last scenario. Given that everything is running normally in your environment, both uplinks will be utilized and traffic will be dispersed between them.

But what if you are running in a 10GbE scenario and you just kicked off a Storage vMotion? Since vSphere is now running the show, it knows exactly how much bandwidth it's pushing. Therefore, it can be smart enough to move other traffic over to the second uplink for a period of time because that link isn't being saturated by vMotion. Once bandwidth has been restored, it will move the appropriate workload back over to the original uplink.

 

 

This scenario will still overcome an uplink failure as well by re-routing the traffic over the surviving uplink. Of course, the time it takes for a link to fail over can be up to two minutes.
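
The rebalancing behavior described above can be sketched in a few lines of Python. This is a simplified model, not the actual vSphere algorithm. It assumes a 75% utilization trigger, which is the threshold commonly documented for Route Based on Physical NIC Load. The choice to move the lightest port first, along with the port names and traffic figures, is purely illustrative.

```python
THRESHOLD = 0.75  # assumed utilization trigger for moving traffic

def rebalance_once(uplinks, capacity_gbps):
    """Move one port off a saturated uplink; return (port, src, dst) or None.

    uplinks maps an uplink name to {port_name: current_gbps}.
    """
    load = {name: sum(ports.values()) for name, ports in uplinks.items()}
    for name, ports in uplinks.items():
        if load[name] / capacity_gbps <= THRESHOLD or len(ports) < 2:
            continue  # not saturated, or nothing sensible to move
        target = min(load, key=load.get)  # least-loaded uplink
        port = min(ports, key=ports.get)  # lightest port (illustrative choice)
        fits = (load[target] + ports[port]) / capacity_gbps <= THRESHOLD
        if target != name and fits:
            uplinks[target][port] = ports.pop(port)
            return port, name, target
    return None

uplinks = {
    "uplink0": {"vmotion": 7.0, "vm-a": 1.0, "nfs": 0.5},  # 8.5 Gb on a 10Gb link
    "uplink1": {"vm-b": 1.0},
}
move = rebalance_once(uplinks, capacity_gbps=10.0)
while move:
    print("moved %s from %s to %s" % move)
    move = rebalance_once(uplinks, capacity_gbps=10.0)
print(uplinks)
```

In this toy run the storage and VM traffic drift over to the second uplink until the busy link drops back under the threshold, leaving the vMotion on its own.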

 

 

Multiple-NIC vMotion

 

As of vSphere 5, we can now use vMotion over multiple uplinks. The stated maximums are 16 simultaneous 1GbE NICs or 4 10GbE NICs. That's a lot of bandwidth. So this is a pretty cool feature, but let's evaluate where and why you would use it. 99% of the time you are ONLY going to want to do this in a 1GbE environment, either when you have more than 6 NICs or when you can reuse the NICs that would otherwise be reserved for FT because you know you won't be using FT. If we are using FC storage, then we have extra NICs available in a 1GbE setup. Even if you are using 10GbE and FC, it's still recommended not to use both NICs for multi-NIC vMotion.

 

vMotion will use all the bandwidth available and can easily saturate an entire link. There is an intense amount of data that needs to flow through the network to make sure the vMotion succeeds, let alone a Storage vMotion that isn't paired with a VAAI-capable array. This is where NIOC plays a critical role in 10GbE environments. If a vMotion can completely saturate a 10GbE uplink, your other services are going to suffer and everyone is going to be fighting for bandwidth.

 

Of course, you are probably saying, well, if I have a 10GbE environment, can't I just use NIOC while having both 10GbE NICs take part in vMotion? Yes, you probably could, but multi-NIC vMotion WILL use all available uplinks and bandwidth it's been given. So if you start a vMotion using resources on both of your uplinks, you are putting an unnecessary strain on the host. Instead, by leaving vMotion pinned to a single uplink in a 2x 10GbE environment, you are letting vSphere be smart enough to utilize as much bandwidth as it can on a single adapter for vMotion while allowing other workloads to flip over to the other link so services aren't interrupted.
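
As a rough illustration of how NIOC shares divide a saturated uplink, the sketch below splits link bandwidth in proportion to the shares of whichever traffic types are actively competing. The share values are hypothetical, picked so the math is easy to follow, and the model ignores limits and any traffic type that is idle.

```python
def nioc_allocation(shares, link_gbps=10.0, active=None):
    """Split a contended link in proportion to the shares of active traffic types.

    shares maps traffic type -> share value (hypothetical numbers).
    active optionally restricts the split to the types currently sending.
    """
    active = set(shares) if active is None else set(active)
    total = sum(shares[t] for t in active)
    return {t: link_gbps * shares[t] / total for t in sorted(active)}

SHARES = {"management": 10, "vmotion": 20, "ft": 10, "ip_storage": 30, "vm": 30}

# Everything competing at once on one 10GbE uplink.
print(nioc_allocation(SHARES))

# Only a vMotion and VM traffic competing: vMotion gets 4Gb, VMs get 6Gb.
print(nioc_allocation(SHARES, active=["vmotion", "vm"]))
```

Shares only matter when the link is contended. When nothing else is competing, vMotion can still take the whole pipe, which is why pinning it to a single uplink leaves the other one free for everything else.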

 

A commenter on Duncan Epping's blog Yellow-Bricks performed a multi-NIC vMotion test. Using 2 identical blades in the same chassis, each blade having 4x1GbE NICs and 2 HBAs for FC storage, he tested the speed from one blade to the other 5 times and averaged out the numbers. As we can see, the greatest benefit came from just adding that second adapter.

 

In a normal vMotion scenario, only a single uplink is used. So a 1GbE uplink can only transfer at 1Gb. A 10GbE uplink can transfer at 10Gb, but depending on NIOC's control, it will more than likely never be given the full 10Gb of bandwidth. It's safe to say vMotion on a 10GbE link is faster than on a 1GbE link.

 

In a Multi-NIC vMotion scenario, multiple VMkernel ports are used to create multiple channels of communication, and data can transfer 50%+ faster.

If you want to learn how to configure it please visit Eric Sloof's Video - Running vMotion on multiple–network adaptors
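
For a back-of-the-envelope feel for these numbers, the sketch below estimates the best-case time to copy a VM's memory across one or more uplinks. It is an idealized upper bound on speed: it assumes perfect scaling across NICs and ignores protocol overhead and the re-copying of memory pages that change during the migration. The 64GB VM size is an arbitrary example.

```python
def vmotion_seconds(memory_gb, link_gbps, nics=1):
    """Best-case seconds to copy memory_gb of RAM over nics links of link_gbps each."""
    bits = memory_gb * 8  # gigabytes -> gigabits
    return bits / (link_gbps * nics)

for label, gbps, nics in (("1 x 1GbE", 1.0, 1), ("2 x 1GbE", 1.0, 2), ("1 x 10GbE", 10.0, 1)):
    print(f"{label:>9}: {vmotion_seconds(64, gbps, nics):6.1f} s for a 64GB VM")
```

Real results will land below the ideal, which is consistent with the roughly 50% improvement reported from the second adapter rather than a clean doubling.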

 

Fault Tolerance

 

When vSphere 4 was released, one of the coolest features was Fault Tolerance: the ability to run a shadow virtual machine on a separate host so that if the primary host fails for any reason, the secondary shadow virtual machine picks up where the first left off. This is a great feature for anyone who needs greater availability than what HA offers today. Of course the caveat of 1 vCPU exists, and many people struggle with this concept. There have been plenty of studies and whitepapers showing that lower vCPU counts for virtual machines can actually increase performance. So before you start making 4 vCPUs the standard in your image, you should give the forums a quick read. The other limitation is that you can only have 4 FT-protected VMs on a single host at one time. It's limited to 4 because more than 4 FT-protected VMs can easily saturate a link.

 

Even though FT isn't widely used, I prefer to implement it anyway. Heck, you are paying for the feature in Ent+, so you might as well be ready for it. Making your infrastructure FT-ready will make it easier to go ahead and FT protect a VM without having to do a lot of configuration down the line. In a 1GbE infrastructure, the number of NICs, the use of IP storage, and the actual use of the FT network will all play a role in the design. I prefer to have 2 dedicated NICs for FT traffic because if you are really using FT, the VMs you are protecting with it are critical VMs, so you need to make sure the availability is there. If you want to try to consolidate functions, you can do that as well and let vDS NIOC take care of the bandwidth throttling, but note that putting vMotion and FT on the same NICs can affect FT performance if you're not careful.

 

In a 10GbE environment with 2 NICs, FT simply shares the two uplinks with everything else, so it's going to be normal operations.

 

If you have dedicated NICs in a 1GbE environment, or you have 2 10GbE NICs, we are going to set this to Route Based on Physical NIC Load. This will allow vSphere to decide where to send traffic based on the ingress and egress traffic flows on the NIC. We will see later how this is set up in different scenarios. Where this doesn't fit, more often than not you will need to set NICs to Active / Stand-by mode.

 

Key Criteria:

  • Make sure your physical networking infrastructure is redundant and you have multiple paths defined
  • Use the correct routing mechanism on your port group, usually Route Based on Physical NIC Load
  • Type of storage dictates physical NIC design
  • Use of vSphere features dictate physical and logical design
  • When using a 10GbE Infrastructure remember to use NIOC to help prioritize IP Storage, vMotion, etc
  • 1Gb infrastructure needs to overcome:
    • Failure of a single NIC port
    • Failure of an entire PCI-x card or on-board NIC
    • Failure of an entire switch
    • Teaming != More Performance
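
The failure cases in this list can be checked mechanically. The Python sketch below flags any virtual switch whose uplinks all sit on the same physical card or all plug into the same physical switch. The NIC inventory, switch names, and vDS names are hypothetical examples for a 4 NIC host, not the layouts from the diagrams.

```python
# vmnic -> (physical card, upstream physical switch); hypothetical 4 NIC host
NIC_INVENTORY = {
    "vmnic0": ("onboard", "switch-a"),
    "vmnic1": ("onboard", "switch-b"),
    "vmnic2": ("pci1", "switch-a"),
    "vmnic3": ("pci1", "switch-b"),
}

def single_points_of_failure(vswitch_uplinks, inventory):
    """Return (vswitch, kind) for every vswitch that one card or switch failure would isolate."""
    problems = []
    for vswitch, nics in vswitch_uplinks.items():
        for index, kind in ((0, "card"), (1, "switch")):
            # All uplinks sharing one card (or one switch) means no redundancy there.
            if len({inventory[nic][index] for nic in nics}) < 2:
                problems.append((vswitch, kind))
    return problems

good = {"vds-vm": ["vmnic0", "vmnic3"], "vds-features": ["vmnic1", "vmnic2"]}
bad = {"vds-vm": ["vmnic0", "vmnic1"], "vds-features": ["vmnic2", "vmnic3"]}

print("good layout:", single_points_of_failure(good, NIC_INVENTORY))
print("bad layout: ", single_points_of_failure(bad, NIC_INVENTORY))
```

The good layout crosses on-board and PCI ports on each vDS and comes back clean. The bad layout keeps each vDS on a single card, so losing that card takes the whole vDS offline.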

 

This first diagram is a 4 NIC host layout with FC networking. It has an emphasis on Multi-NIC vMotion because we know that FT won't be used. Since we don't have any FT traffic, we can use the NICs that would have carried it to drive more vMotion traffic.

 

This is another FC and 4 NIC design but we want to make sure all services are available including reserving room for FT.

 

Now moving on to 6 NICs with FC, we are dividing up all 6 NICs onto two vDSs. Three NICs are responsible for virtual machine traffic while the other three handle vSphere features. Of course you can run FT, but it's not going to be ideal because failure of a single link will put FT and multi-NIC vMotion on a single uplink. YMMV.

 

This scenario puts an emphasis on FT. Think about it... If you have VMs being protected by FT, then making sure you have a redundant FT network is critical. 2 NICs are enough for virtual machine traffic because VM traffic usually isn't network intensive.

 

Let's examine 6 NICs with IP storage. 2 NICs need to be dedicated to IP storage services. This scenario depicts a single stacked switch network with adequate port groups to account for MPIO of iSCSI and NFS. Now, you might notice that I put vmnic4 and vmnic5 on dvSwitch1 for vSphere features such as Management, vMotion, and FT. This is because I want to make sure that in the event that I lose the entire PCI card, operations will still function. By operations, I mean that all NICs assigned to dvSwitch2 and dvSwitch3 are for virtual machine operations. VM network connectivity as well as VM storage will still work if you lose either NIC (on-board or PCI). Since vSphere 5 uses datastore heartbeats as an additional layer of HA, losing the primary management vmkernel port still won't leave us in a false positive HA situation.

 

This diagram is similar to the first except I'm depicting what it looks like to have a segregated set of IP storage switches. Since I don't plan on using FT, I can utilize vmnic4 and vmnic5 for Multi-NIC vMotion.

 

This scenario depicts 10 NICs with IP storage. This scenario will give plenty of bandwidth and redundancy to all vSphere features.

 

When examining a 10GbE environment, everything becomes much simpler: create a single vDS and set all port groups to Route Based on Physical NIC Load. This is just clean and simple. I'm not going to discuss this much because I did so in a recent article. To read more on these, please visit vSphere and vCloud Host 10Gb NIC Design with UCS & More

 

If you have a server with 2 1GbE NICs and 2 10GbE NICs, I wouldn't recommend using the 2 1GbE NICs at all because of the extra unnecessary cabling. If you want to use them, I wouldn't put any production workloads on them. Instead I would use them as failover uplinks for a redundant management port group.

 

Thanks for reading the update. There will probably be a small amendment when vSphere5.next comes out.
