I wanted to post a short article about the Nutanix technology that protects an environment against a CVM (Controller VM) failure, and how the storage remains accessible when one occurs. I’m not going to go into too much detail on the Nutanix architecture, as it would take too long; to understand the technology I suggest reading the Nutanix “Bible” written by Steven Poitras. I will, however, quickly explain the purpose of the CVM.
A CVM is deployed with each node in a block. In other words, each physical ESXi (or other hypervisor) host has a CVM, which is installed as part of the Nutanix deployment and is a key part of the software-defined solution. The job of the CVM is to manage and serve all of the IO between the VMs and the storage, regardless of the hypervisor. It is here that compression, replication and deduplication are handled, with no noticeable impact on performance. So as you can see it has a pretty important job, and one might think: what happens if it goes down? As there is only 1 deployed per host, it must be a single point of failure. Wrong.
Recently I was working for a customer who had this exact concern, and who also wanted to know what happens to the route of the traffic. The image below is an example of the solution they had designed. Essentially, what they wanted to do was contain all of the traffic so that it routes internally through the top-of-rack switches and not back to the core. In this design they have 2 Arista switches and a Nutanix block with 3 ESXi hosts inside. The picture shows what happens to the traffic in 3 scenarios, illustrated by 3 different coloured lines.
Red – If a VM is migrated to a different host, the hot data blocks don’t immediately follow. When a read request is made, the local CVM directs the request to the CVM on the host where the blocks are located, until the blocks are localised.
Green – As blocks are accessed from an alternate node, they are copied to the local storage of the node the VM now runs on, so that the data is localised.
Purple – In the event of a CVM failure, the hypervisor can redirect requests to another CVM, which locates the data on the DSF (Distributed Storage Fabric) and returns it, acting in place of the local CVM until the local CVM is back online and able to serve IO once again.
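To make the red and green lines a little more concrete, here is a minimal Python sketch of the idea. It is a toy model and not Nutanix code: the `Node` class, the `read_block` function and the host and block names are all made up for illustration, and it assumes a block is localised on the first remote read, which is a simplification.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A toy host: the set of block IDs held on its local storage."""
    name: str
    local_blocks: set = field(default_factory=set)


def read_block(block_id, local, cluster):
    """Return the name of the node that served the read.

    A local hit is served directly. On a miss, the read is served by the
    remote node that holds the block (red line) and the block is then
    copied to the local node (green line), so the next read is local.
    """
    if block_id in local.local_blocks:
        return local.name
    for node in cluster:
        if node is not local and block_id in node.local_blocks:
            local.local_blocks.add(block_id)  # localise for future reads
            return node.name
    raise KeyError(f"block {block_id!r} not found in the cluster")


# Hypothetical two-host cluster: the VM has just moved to host-b,
# but its hot block is still on host-a.
host_a = Node("host-a", {"blk-1"})
host_b = Node("host-b")
cluster = [host_a, host_b]

first = read_block("blk-1", host_b, cluster)   # served remotely by host-a
second = read_block("blk-1", host_b, cluster)  # now served locally by host-b
print(first, second)
```

The first read crosses the network to the other host, and every read after that stays local, which is why a settled cluster should see very little of this traffic.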
The CVM health state is monitored by Prism, and in the event of a CVM failure the hypervisor initiates an HA.py (high availability) script. This runs an esxcfg-route command, which adds a route pointing at the public IP address of an alternate CVM in the block/cluster. The NFS information is shared by all CVMs, as mentioned above with the red line, so in this scenario the NFS traffic can be routed via an alternate CVM until the original CVM is back online and healthy once again. Information about configuring a CVM is further down in this article.
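The decision being made here can be sketched in a few lines of Python. This is only an illustration of the failover logic as I have described it, not the contents of the real HA script: the function name `pick_io_target` and the IP addresses are hypothetical, and the sketch only chooses a target, it does not touch any routing table.

```python
def pick_io_target(local_cvm, cvms, healthy):
    """Return the CVM IP that should receive this host's storage traffic.

    local_cvm: public IP of the CVM on this host
    cvms:      public IPs of every CVM in the cluster
    healthy:   set of IPs currently passing health checks
    """
    if local_cvm in healthy:
        return local_cvm  # normal case: IO stays on the local CVM
    for ip in cvms:
        if ip != local_cvm and ip in healthy:
            return ip  # failover: redirect to an alternate CVM
    raise RuntimeError("no healthy CVM available in the cluster")


# Hypothetical addresses for a 3-node block.
cvms = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]

normal = pick_io_target("10.0.0.11", cvms, {"10.0.0.11", "10.0.0.12", "10.0.0.13"})
failed = pick_io_target("10.0.0.11", cvms, {"10.0.0.12", "10.0.0.13"})
recovered = pick_io_target("10.0.0.11", cvms, {"10.0.0.11", "10.0.0.12", "10.0.0.13"})
print(normal, failed, recovered)
```

Once the local CVM passes its health checks again, the same function returns the local address, which matches the behaviour of traffic moving back when the original CVM is healthy.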
Nutanix forms a cluster of hosts and storage in order to manage resources effectively, mainly the storage; the compute resource is managed by vSphere. It uses data locality to migrate hot blocks of data to the local host, and as space becomes an issue it migrates cold blocks to alternate hosts where there is more space. All of this should happen infrequently: after VMs are provisioned there should be little need for them to migrate, so the data should be localised and settled. When new VMs are created, either 2 or 3 copies of the data blocks (depending on what is configured) are kept across the nodes in the cluster for availability.
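As a rough picture of what "2 or 3 copies" means for placement, here is a small Python sketch. It is an assumption-laden toy: the function `place_replicas` is hypothetical, it assumes one copy stays on the node where the VM runs and the rest go to other nodes, and it simply takes the first other nodes in the list rather than weighing free space or load as a real cluster would.

```python
def place_replicas(local_node, nodes, copies):
    """Return the nodes that hold a block, with the local node first.

    copies is the configured number of copies (2 or 3 in this article).
    """
    if copies not in (2, 3):
        raise ValueError("copies must be 2 or 3")
    others = [n for n in nodes if n != local_node]
    if len(others) < copies - 1:
        raise ValueError("not enough nodes for the requested number of copies")
    # One copy stays local (assumption); the rest go to other nodes.
    return [local_node] + others[: copies - 1]


nodes = ["host-a", "host-b", "host-c"]  # hypothetical 3-node block

two_copies = place_replicas("host-b", nodes, 2)
three_copies = place_replicas("host-b", nodes, 3)
print(two_copies)
print(three_copies)
```

With 3 nodes and 3 copies every node ends up holding the block, so losing any single host still leaves the data available elsewhere.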
The CVM has 2 NICs attached, as shown below. One has a non-routable address, which is the same on every CVM; this NIC allows local access to data that is immediately available on the host. The other is a public, routable NIC, which is used to migrate cold and hot data to or from the node. It is also the address that requests are redirected to by CVM autopathing when a CVM is not available, as mentioned above.
The next image represents the back of a Nutanix node or ESXi host (there are 4 nodes in a block). There is one IPMI port for lights-out management (like iLO), 2x 10GbE SFP+ ports and 2x 1GbE ports. We are using the 10GbE ports with twinax cables.
The CVM address needs to be a routable address and is required to be on the same VLAN as the hypervisor for the following reasons.
- If a CVM is unavailable, IO is redirected via the hypervisor to other CVMs in the cluster. Without this ability, vSphere would assume that the host is isolated because IO is not being directed anywhere, and HA would be invoked to restart the VMs on other hosts.
- Prism, the cluster management tool that connects all the nodes into a cluster, runs as a service on each node, and each node is reachable via an HTTPS address. One of the CVMs is elected as leader and hosts the cluster IP.
- DNS and Active Directory are used to control access to the Prism interfaces, so the CVM must be able to reach them.
- One-click upgrades would not be available without it. This is a feature where components such as hosts and CVMs can be upgraded while the servers and traffic are managed accordingly.
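A quick way to sanity check the addressing before deployment is to confirm that the CVM and its hypervisor sit in the same IP subnet. The Python sketch below does that with the standard `ipaddress` module. Note the assumptions: the function `same_subnet` and the addresses are made up for illustration, and a shared subnet is only a proxy for a shared VLAN, so the switch port and port group configuration still needs checking separately.

```python
import ipaddress


def same_subnet(cvm_ip, hypervisor_ip, prefix_len):
    """Return True if both addresses fall in the same IP network."""
    cvm_net = ipaddress.ip_interface(f"{cvm_ip}/{prefix_len}").network
    host_net = ipaddress.ip_interface(f"{hypervisor_ip}/{prefix_len}").network
    return cvm_net == host_net


# Hypothetical addresses: a CVM and two candidate hypervisor addresses.
ok = same_subnet("10.0.0.11", "10.0.0.21", 24)
mismatch = same_subnet("10.0.0.11", "10.0.1.21", 24)
print(ok, mismatch)
```

Running the same check for every CVM and host pair in the block catches a mistyped address before it shows up as a failed redirect.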