Last modified on 05/23/11 17:57:40

Cluster in a box (cbox)

The goal of the project is to make it easy to deploy KVM-based test clusters that showcase cluster technologies, by executing one simple command.

It is NOT meant to create production clusters and never will be.

status

There are no official/stable releases at this point in time.

Code is being developed at very high speed and it can break your environment.

Use with extreme caution.

requirements

  • an x86_64 host machine with virt capabilities, one that can be sacrificed to the Gods.
  • enough CPU/RAM/disk.
  • a clean Fedora 14 installation.
  • root access (cbox must be run as root).
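
The requirements above can be checked before installing anything. What follows is only a sketch and not part of cbox: the function names are made up for illustration, and it assumes that "virt capabilities" means the vmx (Intel) or svm (AMD) CPU flag is visible in /proc/cpuinfo.

```sh
#!/bin/sh
# Hypothetical pre-flight check; none of these functions exist in cbox.

# Succeeds if the given cpuinfo file lists the vmx (Intel) or svm (AMD) flag.
has_virt_flags() {
    grep -Eqw 'vmx|svm' "${1:-/proc/cpuinfo}" 2>/dev/null
}

# Prints one line per unmet requirement and nothing when every check passes.
# Arguments default to the live system: architecture, user id, cpuinfo path.
preflight() {
    arch=${1:-$(uname -m)}
    uid=${2:-$(id -u)}
    cpuinfo=${3:-/proc/cpuinfo}
    [ "$arch" = "x86_64" ] || echo "host architecture is $arch, not x86_64"
    [ "$uid" -eq 0 ] || echo "must be run as root"
    has_virt_flags "$cpuinfo" || echo "no vmx/svm CPU flag found in $cpuinfo"
    return 0
}
```

Calling preflight with no arguments checks the running host. It does not try to judge "enough CPU/RAM/disk", since that depends on the cluster you ask for.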

download and install

cbox is currently only available in git.

On the freshly installed Fedora 14 host, run:

yum install git automake autoconf appliance-tools qemu-system-x86

git clone git://github.com/fabbione/cbox.git

cd cbox

./autogen.sh

./configure

make all install

first time execution

Before diving into the many options that cbox has to offer, please take a second to run it standalone.


[root@foobar14 ~]# cbox

***** WARNING *****

cbox is an experimental script to build virtual TEST cluster
and it performs actions and configurations that might not be
considered safe on both the host and the guests.
Please abort the execution now unless you fully understand
what you are doing.
The resulting setup has to be used only for TESTING.
Do NOT place this cluster into production.

cbox can only be executed once on each host to create a virtual cluster.
Second execution will safely request to destroy the previously
created cluster, and start all over again.

***** END OF WARNING *****

YOU HAVE BEEN WARNED

If you agree, and want to proceed, please type:
Yes, I fully understand the risks of what I am doing

type here: Yes, I fully understand the risks of what I am doing

***** requested configuration *****

cluster name:             testcluster
cluster nodes:            2
cluster type:             cman
cluster qdiskd:           
cluster resource manager: none
cluster fedora release:   14

node capacity (for each node):

RAM:                      2048 (MB)
CPU(s):                   2
root partition size:      4096 (MB)
swap partition size:      1024 (MB)

shared disk size:         20544 (MB)

under the hood info
cbox data dir:            /usr/share/cbox
cbox hooks dir:           /usr/share/cbox/hooks
cbox log dir:             /var/log/cbox
cbox log file:            /var/log/cbox/cbox-2011-02-09_04:26:27.log
cbox VMs storage dir:     /srv/cbox

Proceed (Y/n)?

This will show you the defaults that can be tuned via options.


[root@foobar14 ~]# cbox -h
cbox usage:

cbox [options]

Options:
 -c cluster_name                set clustername (default: testcluster)
 -m [pacemaker|rgmanager|none]  use a resource manager (default: none)
 -n number_of_nodes             a value between 2 and 16 (default: 2)
 -o path_to_storage_dir         set path where to store VMs (default: /srv/cbox)
 -q                             enable qdiskd for cman clusters (default: no)
 -r fedora_release              force fedora release to install on the nodes (defaults: autodetect/same as host)
 -t [cman|corosync]             define cluster type (default: cman)
 -h                             this help
 -v virt_opts                   define virtual nodes and storage size in a comma separated list of parameters
    valid parameters are:
    ram=XXX                     amount of RAM (for each node) in MB (min. 512 - default 2048)
    cpus=XXX                    number of virtual cpus
    root=XXX                    size of the root partition in MB (min. 2048 - default 4096)
    swap=XXX                    size of the swap partition in MB (min. 512 - default is 1024)
    share=XXX                   size of the shared storage disk in MB (min. 20544 - default 20544)

NOTE: not all options work yet and many are missing. Specifically, -m and -t are still ignored; the cluster always defaults to cman/rgmanager.
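
To make the -v syntax concrete, here is a rough sketch of how a comma-separated virt_opts string could be split and checked against the minimums in the help text. It is not the actual cbox code. The function names are hypothetical, the cpus default of 2 is taken from the configuration summary shown earlier, and the minimum of 1 for cpus is an assumption, since the help text gives none.

```sh
#!/bin/sh
# Hypothetical -v parser; defaults and minimums come from the help text.

# Fails with a message on stderr if value ($2) is below minimum ($3).
check_min() {
    [ "$2" -ge "$3" ] && return 0
    echo "$1=$2 is below the minimum of $3" >&2
    return 1
}

# Prints the effective settings for a virt_opts string such as "ram=1024,cpus=1".
parse_virt_opts() {
    ram=2048 cpus=2 root=4096 swap=1024 share=20544
    old_ifs=$IFS
    IFS=,
    set -- $1            # split the argument on commas
    IFS=$old_ifs
    for pair in "$@"; do
        key=${pair%%=*}
        val=${pair#*=}
        case $val in
            ''|*[!0-9]*) echo "invalid value for $key: $val" >&2; return 1 ;;
        esac
        case $key in
            ram)   check_min ram "$val" 512 || return 1; ram=$val ;;
            cpus)  check_min cpus "$val" 1 || return 1; cpus=$val ;;   # minimum assumed
            root)  check_min root "$val" 2048 || return 1; root=$val ;;
            swap)  check_min swap "$val" 512 || return 1; swap=$val ;;
            share) check_min share "$val" 20544 || return 1; share=$val ;;
            *)     echo "unknown parameter: $key" >&2; return 1 ;;
        esac
    done
    echo "ram=$ram cpus=$cpus root=$root swap=$swap share=$share"
}
```

Parameters that are left out keep their defaults, so `parse_virt_opts "ram=1024"` only changes the RAM size, while a value under the minimum (for example ram=256) is rejected.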

second time execution

Assuming your first execution completed (or failed miserably), running cbox a second time will detect the previously installed cluster (or its leftovers) and offer to clean it up.


[root@foobar14 ~]# cbox
Cluster: testcluster appears to be installed on the system

Do you want to totally detroy testcluster (y/N)? 

This operation rolls the machine back to its original state on a best-effort basis.

default run

This is the output you should see when executing cbox with default settings:


Proceed (Y/n)? 

Creating cluster... this might take several minutes

Executing 01_host_update: OK
Executing 02_host_install_packages: OK
Executing 03_host_net_setup: OK
Executing 04_host_fence_setup: OK
Executing 05_host_create_ssh_keys: OK
Executing 06_host_create_guest_kickstart: OK
Executing 50_guest_create: OK
Executing 51_guest_mount: OK
Executing 52_guest_net_setup: OK
Executing 53_guest_fstab_setup: OK
Executing 54_guest_install_ssh_keys: OK
Executing 55_guest_create_cluster_conf: OK
Executing 58_guest_apply_hacks_prerun: OK
Executing 59_guest_umount: OK
Executing 60_guest_create_first_node: OK
Executing 61_guest_start_first: OK
Executing 62_guest_setup_clvmd: OK
Executing 63_guest_setup_gfs2: OK
Executing 64_guest_setup_qdiskd: OK
Executing 67_guest_enable_cluster_daemons: OK
Executing 68_guest_apply_hacks_live: OK
Executing 69_guest_stop_first: OK
Executing 70_guest_clone_nodes: OK
Executing 71_guest_autostart_nodes: OK
Executing 72_guest_start_nodes: OK
Executing 99_final_cleanup: OK
cluster testcluster successfully created and running

At this point cluster nodes are available and running.

Try (from the host):

[root@foobar14 ~]# ssh testcluster-node1 -i /root/.ssh/id_rsa_testcluster

This should bring you straight to node1 of the selected cluster (and the same should work for all the other cluster nodes).

The default root password is: cluster
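
To reach every node rather than just node1, something like the sketch below may help. It assumes nodes are named clustername-nodeN and numbered consecutively from 1, which is inferred from the testcluster-node1 example rather than documented. The function name is made up.

```sh
#!/bin/sh
# Hypothetical helper: prints one ssh command per node instead of running it,
# so the list can be reviewed first. Node naming is an assumption (see above).
node_ssh_commands() {
    cluster=${1:-testcluster}
    nodes=${2:-2}
    i=1
    while [ "$i" -le "$nodes" ]; do
        echo "ssh -i /root/.ssh/id_rsa_$cluster $cluster-node$i hostname"
        i=$((i + 1))
    done
}
```

Piping the output to sh (`node_ssh_commands testcluster 2 | sh`) would run hostname on each node in turn.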

a slightly more complex example

[root@foobar14 ~]# cbox -c evilcluster -n 16 -q -r 13 -v ram=1024,cpus=1,root=4096,swap=512,share=1000000
** usual warning **

***** requested configuration *****

cluster name:             evilcluster
cluster nodes:            16
cluster type:             cman
cluster qdiskd:           yes
cluster resource manager: none
cluster fedora release:   13

node capacity (for each node):

RAM:                      1024 (MB)
CPU(s):                   1
root partition size:      4096 (MB)
swap partition size:      512 (MB)

shared disk size:         1000000 (MB)

under the hood info
cbox data dir:            /usr/share/cbox
cbox hooks dir:           /usr/share/cbox/hooks
cbox log dir:             /var/log/cbox
cbox log file:            /var/log/cbox/cbox-2011-02-09_04:48:33.log
cbox VMs storage dir:     /srv/cbox

Proceed (Y/n)? 

current final setup

The resulting setup currently is:

host:

  • updated and installed all required packages
  • libvirt configured to handle all the nodes and the network bridges (clustername-br0 for services and clustername-br1 for cluster activities)
  • fence_virtd configured and running to handle node fencing
  • dedicated ssh keys in /root/.ssh/id_rsa_clustername to easily access the cluster
  • updated /etc/hosts to map IP and hostnames

guests:

  • cman cluster manager
  • fencing configured
  • a shared block device divided in:
    • 64 MB partition for qdisk
    • 50% to a GFS2 filesystem mounted on all nodes
    • 50% allocated to a cluster VG, of which:
      • 50% is another GFS2 filesystem mounted on all nodes
      • 50% is free to play around
  • 2 networks for each node, a service net (eth0) and a network dedicated to all cluster activities (eth1)
  • all cluster daemons up and running
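
The shared disk split is easier to see with numbers. The sketch below works through the arithmetic under one assumption that the list above does not spell out: the 50% shares apply to the space left after the 64 MB qdisk partition. The function name is hypothetical.

```sh
#!/bin/sh
# Hypothetical helper: prints the shared-disk layout in MB for a given size.
# Assumes the 50% splits apply to what remains after the 64 MB qdisk partition.
shared_disk_layout() {
    share=${1:-20544}
    qdisk=64
    rest=$((share - qdisk))
    gfs2=$((rest / 2))            # GFS2 filesystem directly on a partition
    vg=$((rest - gfs2))           # partition handed to the cluster VG
    vg_gfs2=$((vg / 2))           # second GFS2 filesystem, inside the VG
    vg_free=$((vg - vg_gfs2))     # left unallocated to play around with
    echo "qdisk=$qdisk gfs2=$gfs2 vg=$vg vg_gfs2=$vg_gfs2 vg_free=$vg_free"
}
```

With the default size of 20544 MB this gives 64 MB for qdisk, 10240 MB for the first GFS2 filesystem and 10240 MB for the VG, of which 5120 MB is GFS2 and 5120 MB is free. Under this reading the default works out to exactly 20 GB plus the qdisk partition.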

Next in list

Please refer to the TODO list in the git checkout. There is a lot to do.

Developers guide

The idea behind cbox is extremely simple, and let's keep it that way.

cbox itself is a run-wrap-like command that does just a bit more: it checks and validates the options passed by the user, exports the appropriate environment variables, and then executes a series of hooks.

The idea behind hooks is also simple. One hook, one operation.

Hooks are numbered to guarantee ordered execution. Numbers below 50 are operations performed on the host; 50 and above are operations performed on the guests.
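
The ordering rule can be sketched in a few lines of shell. This is not the real cbox runner, only an illustration: it assumes hooks are shell scripts named NN_description, relies on the shell's sorted glob expansion for ordering, and prints FAILED on error, a string that is not taken from actual cbox output.

```sh
#!/bin/sh
# Hypothetical hook runner, not the actual cbox code.
# Runs every NN_* file in the given directory in sorted order and stops at
# the first hook that fails.
run_hooks() {
    hooks_dir=$1
    for hook in "$hooks_dir"/[0-9][0-9]_*; do
        [ -f "$hook" ] || continue      # also skips the unexpanded pattern
        name=${hook##*/}
        printf 'Executing %s: ' "$name"
        if sh "$hook" >/dev/null 2>&1; then
            echo "OK"
        else
            echo "FAILED"               # wording assumed for illustration
            return 1
        fi
    done
    return 0
}
```

Because the prefix is always two digits, plain lexical sorting matches numeric order, so 06_host_create_guest_kickstart runs before 50_guest_create without any extra sorting step.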

When developing hooks for cbox keep the following in mind:

  • all hooks are always executed, so make sure each one checks for the options it depends on
  • whenever possible, do not limit what a hook does based on options. For example, if the user selects rgmanager as the resource manager, configure pacemaker too, no matter what, and use chkconfig only to enable the selected one. This way it should be easy to switch from one to the other without rebuilding the cluster.
  • implement sanity checks in cbox itself, before the hooks are executed. A failure inside a hook is more difficult to recover from.
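
As an illustration of the first guideline, a hook that only makes sense with -q could be wrapped as in the sketch below. The variable name CBOX_QDISKD is invented for this example; check the cbox source for the names it really exports.

```sh
#!/bin/sh
# Hypothetical skeleton for a hook such as 64_guest_setup_qdiskd.
# CBOX_QDISKD is an invented name standing in for whatever cbox exports.
hook_main() {
    if [ "${CBOX_QDISKD:-no}" != "yes" ]; then
        # The hook still runs, but exits cleanly when its option is not set.
        echo "qdiskd not requested, nothing to do"
        return 0
    fi
    # The single operation this hook is responsible for would go here.
    echo "setting up qdiskd"
}
# A real hook file would end by calling: hook_main
```

Returning success when the option is absent keeps the run going, which matters because cbox executes every hook regardless of the options given.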

NOTES

  • Fedora rawhide and Fedora 15 are currently supported on a best-effort basis