::: Virtual Aleph ::: Virtualization Techniques: BC8274 VMware Fault Tolerance best practice and usage scenario

14 October, 2010

BC8274 VMware Fault Tolerance best practice and usage scenario

Speaker for this session: Mr. Tom Stephens, Sr. TMA, VMware Inc.


Recap of this session.

VMware provides protection at component level, server level, storage level, backup level and site level. FT adds another level of availability to the Virtual Datacenter.

With HA, VMs get restarted after a failure. With FT there is zero downtime.
FT uses a vMotion-like mechanism to create a secondary copy of the FT-protected VM.
The secondary machine is always up and running but never communicates with the outside world.

FT uses vLockstep technology: the two VMs are exactly the same and access the same vmdk disk. Only the primary writes to disk.

FT requirements
   CPU: AMD Barcelona, Intel Penryn or beyond
   storage: FC, iSCSI, NAS
   NICs: 2, FT logging network at 1 Gb or better
   ESX: same FT version on each host (release number and patch number)
   features: VMware HA must be enabled

   VMs must be: 1 vCPU, not thin provisioned, VMware Tools recommended, no USB, no floppy, no CD-ROM, no RDM. No need for special guest drivers or patches
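
The VM-side checklist above can be sketched as a small pre-check. This is only an illustration: the VmConfig class, its fields and the device names are hypothetical stand-ins for whatever inventory data you have, not a VMware API, and the rules are just the ones noted in the session.

```python
from dataclasses import dataclass, field


@dataclass
class VmConfig:
    """Hypothetical, simplified description of a VM (not a VMware API object)."""
    name: str
    vcpus: int = 1
    thin_provisioned: bool = False
    devices: set = field(default_factory=set)  # e.g. {"usb", "cdrom"}
    uses_rdm: bool = False


# Devices the session listed as not allowed on an FT-protected VM.
BLOCKED_DEVICES = {"usb", "floppy", "cdrom"}


def ft_blockers(vm: VmConfig) -> list:
    """Return the reasons (possibly none) why the VM fails the checklist."""
    reasons = []
    if vm.vcpus != 1:
        reasons.append(f"{vm.vcpus} vCPUs (must be 1)")
    if vm.thin_provisioned:
        reasons.append("thin-provisioned disk")
    for dev in sorted(vm.devices & BLOCKED_DEVICES):
        reasons.append(f"{dev} device attached")
    if vm.uses_rdm:
        reasons.append("RDM disk")
    return reasons
```

An empty list means the VM passes this checklist; host-side requirements (CPU, NICs, ESX version, HA) still have to be verified separately.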

suggestion:
download the SiteSurvey tool from the VMware site and run it against your hosts to know whether you can enable FT on them.

best practices
   disable FT for the duration of a Storage vMotion task.
   do not put too many FT-protected primary VMs on the same host
   mix protected and ghost (secondary) machines on each host
   consider a 10 Gb link for the logging NIC if possible
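
The advice about not stacking primaries on one host can be checked with a few lines of code. This is a minimal sketch: the placement dictionary is a hypothetical input, and the limit of 2 primaries per host is an assumed threshold for illustration only, since the session did not give a number.

```python
from collections import Counter


def overloaded_hosts(primary_placement: dict, max_primaries_per_host: int = 2) -> dict:
    """Map each host that exceeds the limit to its count of FT primaries.

    primary_placement maps a primary VM name to the host it runs on.
    max_primaries_per_host is an assumed limit, not a value from the session.
    """
    counts = Counter(primary_placement.values())
    return {host: n for host, n in counts.items() if n > max_primaries_per_host}
```

For example, with three primaries on esx1 and one on esx2, the function reports esx1 with a count of 3 under the default limit.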

If latency is less than 1 millisecond, guest networking performance improves.

scenario example
   FT on demand: enable it only for a specific period of time, e.g. the end of the quarter

where to use FT
   databases: medium-size DBs
   Exchange with < 1000 users
   remote branch offices: many workloads that cannot be clustered
   custom applications that are not cluster aware

examples
   in SAP, ASCS (the message and transaction locking service) is a SPOF and should, for example, be configured for FT.

   BES is a good candidate (1 vCPU with 200 users receiving 200 emails a day)

performance
   when you enable FT there is a spike on the network while the ghost machine is created.
   impact on CPU is <10%
   FT introduces some I/O latency, but NOT enough for users to notice
   traffic on the network depends on the workload of the primary VM: a gigabit link is sufficient for most workloads

On Oracle 11g, performance is almost the same with and without FT.
On Exchange, FT has a small impact on virtual machine performance (circa 1.5% worse than without FT).

Nehalem processors do a great job in conjunction with FT.

\mf