wiki:VirtualMachineBehaviors
Last modified on 05/22/11 17:27:57

RGManager handles virtual machines slightly differently from other (non-VM) services.

Normal Operations

VMs managed by rgmanager should only be administered using clusvcadm or another cluster-aware tool. Most behaviors are shared with normal services, including:

  • Starting (enabling)
  • Stopping (disabling)
  • Status monitoring
  • Relocation
  • Recovery
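As a rough sketch of the operations above, the shell functions below wrap clusvcadm for a VM resource. The function names are hypothetical; the sketch assumes the VM is addressed as vm:<name> and that the target node is a cluster member running rgmanager.

```shell
# Hypothetical convenience wrappers; only clusvcadm itself is a real tool.
# A <vm name="foo" .../> resource is assumed to be addressed as "vm:foo".

# Start (enable) the VM on a node chosen by rgmanager.
vm_enable() { clusvcadm -e "vm:$1"; }

# Stop (disable) the VM.
vm_disable() { clusvcadm -d "vm:$1"; }

# Relocate: stop the VM and start it on the given cluster member.
vm_relocate() { clusvcadm -r "vm:$1" -m "$2"; }
```

For example, vm_enable foo starts the VM and vm_relocate foo node2 moves it with a stop/start. The resulting state can be checked with clustat.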

Migration

In addition to normal service operations, virtual machines support one behavior not supported by other services: migration. Migration minimizes downtime of virtual machines by removing the requirement for a start/stop in order to change the location of a virtual machine within a cluster.

Rgmanager supports two types of migration, selected on a per-VM basis with the migrate attribute:

  • live (default) - the virtual machine continues to run while most of its memory contents are copied to the destination host. This minimizes the time the VM is inaccessible (typically well under 1 second) at the expense of VM performance during the migration and the total time the migration takes to complete.
  • pause - the virtual machine is frozen in memory while its memory contents are copied to the destination host. This minimizes the amount of time it takes for a virtual machine migration to complete.

Which migration style you use depends on your availability and performance requirements. For example, a live migration may mean 29 seconds of degraded performance and 1 second of complete unavailability, while a pause migration may mean 8 seconds of complete unavailability and no otherwise degraded performance.
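A migration can be requested with the migrate option of clusvcadm. The sketch below is a hypothetical wrapper, assuming the VM is addressed as vm:<name>. Whether the migration is live or paused comes from the VM's migrate attribute, not from the command line.

```shell
# Hypothetical wrapper: migrate VM $1 to cluster member $2 without a
# stop/start. The migration style (live or pause) is taken from the
# migrate attribute in cluster.conf, so it is not passed here.
vm_migrate() { clusvcadm -M "vm:$1" -m "$2"; }
```

For example, vm_migrate foo node2 asks rgmanager to migrate the VM named foo to node2.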

For increased performance of migration operations, virtual machine resources may have a migration mapping which maps alternative hostnames to cluster nodes. This may be added on a per-VM basis by adding a migration_mapping attribute to the virtual machine resource(s) in question.

<cluster ...>
  <clusternodes>
    <clusternode name="node1" nodeid="1" votes="1">
      <fence> ... </fence>
    </clusternode> 
    <clusternode name="node2" nodeid="2" votes="1">
      <fence> ... </fence>
    </clusternode> 
  </clusternodes>
  ...
  <rm>
    <vm name="foo" migrate="live"
        migration_mapping="node1:migration-hostname-1,node2:migration-hostname-2" />
    <vm name="foo2" migrate="pause" />
  </rm>
</cluster>
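To illustrate the format of migration_mapping (comma-separated node:hostname pairs), the following sketch resolves the hostname to use for a given target node. The function name is hypothetical, and falling back to the node name when no pair matches is an assumption made for this example, not documented behavior.

```shell
# Print the migration hostname mapped to a cluster node.
# Usage: migration_target <mapping> <node>
# Assumption: if the node has no entry, the node name itself is used.
migration_target() {
    mapping=$1
    node=$2
    old_ifs=$IFS
    IFS=,
    for pair in $mapping; do
        # Each pair has the form "clusternode:alternative-hostname".
        if [ "${pair%%:*}" = "$node" ]; then
            IFS=$old_ifs
            printf '%s\n' "${pair#*:}"
            return 0
        fi
    done
    IFS=$old_ifs
    printf '%s\n' "$node"
}
```

With the mapping from the example configuration, migration_target "node1:migration-hostname-1,node2:migration-hostname-2" node2 prints migration-hostname-2.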

Important Notes

  • A virtual machine may be a component of a service, but doing this disables all forms of migration and most of the convenience features described below.
  • The use of migration with KVM requires careful configuration of ssh. See KvmMigration for more details.

Convenience Features

Stuff rgmanager does to try to make life easier.

Virtual Machine Tracking

  • If you start a virtual machine with clusvcadm while the VM is already running, rgmanager will search the cluster for the VM and mark it as 'started' wherever it is found.
  • If an administrator accidentally migrates a VM between cluster nodes with a non-cluster tool such as virsh, rgmanager will search the cluster for the VM and mark it as 'started' wherever it is found.

Note: If the VM is running in two or more places, rgmanager does nothing to warn you.
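Because rgmanager gives no warning in that case, a read-only check can help spot a VM that is running on more than one node. The sketch below is one possible approach, assuming password-less ssh to each node and that virsh domstate reports running for an active domain. The function name and node list are hypothetical. It only queries state, so it does not change anything the cluster manages.

```shell
# Print each node on which the VM currently reports the state "running".
# Usage: vm_locations <vm-name> <node>...
vm_locations() {
    vm=$1
    shift
    for node in "$@"; do
        # Read-only query. Errors (for example, the domain is not known
        # on that node) are discarded and treated as "not running here".
        state=$(ssh "$node" virsh domstate "$vm" 2>/dev/null </dev/null) || state=""
        if [ "$state" = "running" ]; then
            printf '%s\n' "$node"
        fi
    done
}
```

If vm_locations foo node1 node2 prints more than one node, the VM is running in multiple places and needs attention.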

Transient Domain Support

Rgmanager supports the transient virtual machines provided by libvirt. This enables rgmanager to create and remove virtual machines on the fly, helping reduce the possibility of accidental double-starts of virtual machines due to the use of non-cluster tools.

Support of transient virtual machines also enables you to store libvirt XML description files on a clustered file system so that you do not have to manually keep /etc/libvirt/qemu in sync across the cluster.

Using XML Files Directly

This is done using an optional attribute on each virtual machine called xmlfile. For example:

<vm name="foo" xmlfile="/mnt/gfs2_vmstore/foo.xml" />

The above will cause rgmanager to look for a file called /mnt/gfs2_vmstore/foo.xml in order to manage a virtual machine named foo.

Notes:

  • The file name must match the virtual machine name.
  • If the file name does not have a .xml extension, it may be treated as a Xen domain configuration file.
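The two notes above can be turned into a quick sanity check before adding an xmlfile attribute. This is only a sketch: the function name and the labels it prints are hypothetical, and it looks at the file name alone, not at the file contents.

```shell
# Classify an xmlfile value against the VM name, based on file name only.
# Usage: check_xmlfile <vm-name> <path-to-file>
check_xmlfile() {
    name=$1
    base=${2##*/}   # strip the directory part
    case $base in
        "$name.xml") echo "libvirt-xml" ;;          # name matches, .xml extension
        "$name")     echo "possibly-xen-config" ;;  # name matches, no extension
        *)           echo "name-mismatch"; return 1 ;;
    esac
}
```

For example, check_xmlfile foo /mnt/gfs2_vmstore/foo.xml prints libvirt-xml, while a file named bar.xml is reported as a mismatch for a VM named foo.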

Using Path Spec Support

This is done using an optional attribute on each virtual machine called path. For example:

<vm name="foo" path="/mnt/gfs2_vmstore:/mnt/gfs2_vmstore2" />

The above will cause rgmanager to look in /mnt/gfs2_vmstore for a file called foo (Xen) or foo.xml (Xen or KVM; libvirt XML description). If it does not find either in /mnt/gfs2_vmstore, it will then look in /mnt/gfs2_vmstore2.

Once found, the xmlfile parameter is set and the domain is started.

Notes:

  • The path attribute is not required and should not be specified when using default locations for virtual machine descriptions.
  • See the notes above on the xmlfile parameter.
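The path search described above can be sketched as a small shell function. This illustrates the lookup order only and is not rgmanager's actual code: the function name is hypothetical, and checking the bare name before name.xml within each directory is an assumption.

```shell
# Search a colon-separated path spec for a VM description file.
# Usage: find_vm_file <vm-name> <dir1:dir2:...>
# Prints the first match and returns 0, or returns 1 if nothing is found.
find_vm_file() {
    name=$1
    spec=$2
    old_ifs=$IFS
    IFS=:
    for dir in $spec; do
        # Directories are tried in the order listed; within a directory,
        # "<name>" is checked before "<name>.xml" (assumed order).
        for candidate in "$dir/$name" "$dir/$name.xml"; do
            if [ -f "$candidate" ]; then
                IFS=$old_ifs
                printf '%s\n' "$candidate"
                return 0
            fi
        done
    done
    IFS=$old_ifs
    return 1
}
```

For example, find_vm_file foo /mnt/gfs2_vmstore:/mnt/gfs2_vmstore2 prints the file that would be used as the xmlfile for foo, if one exists.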

Management Features

  • Adding or removing a VM from cluster.conf will not start or stop the VM; it will simply cause rgmanager to start or stop paying attention to the VM
  • Failback (moving to a more preferred node) is performed using migration to minimize downtime

Unhandled Behaviors

Stuff you should never do.

  • Using a non-cluster-aware tool (such as virsh or xm) to manipulate a virtual machine's state or configuration while the cluster is managing the virtual machine. Checking the virtual machine's state is fine (e.g. virsh list, virsh dumpxml).
  • Migrating a cluster-managed VM to a non-cluster node or a node in the cluster which is not running rgmanager. Rgmanager will restart the VM in the previous location, causing two instances of the VM to be running, resulting in file system corruption.