With the latest release of RancherVM, we’ve added the ability to schedule virtual machines (guests) to specific Kubernetes Nodes (hosts).
This declarative placement (in Kubernetes terms: required node affinity) can be modified at any time. For stopped VMs, no change is observed until the VM starts. For running VMs, the VM enters a migrating state while RancherVM migrates the running guest from the old host to the new one. Upon completion, the VM returns to the running state and the old host’s VM pod is deleted. Active noVNC sessions disconnect for a few seconds before automatically reconnecting; secure shell (SSH) sessions do not disconnect, though a sub-second pause in communication may be observed.
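For reference, required node affinity on a plain Kubernetes pod spec looks roughly like the snippet below. RancherVM manages the equivalent constraint on its VM pods for you, so this is only an illustration of the concept, not RancherVM’s own API (node2 is a placeholder hostname):

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - node2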
Migration of guest machines (live or offline) requires some form of shared storage. Since we use the virtio-blk-pci para-virtualized block device driver, which stores virtual block devices as files on the host filesystem, NFS works nicely.
Note: You are welcome to install RancherVM before configuring shared storage, but do not create any VM Instances yet. If you already created some instances, delete them before proceeding.
Install/Configure NFS server
Let’s walk through NFS server installation and configuration on an Ubuntu host. This can be a dedicated host or one of the Nodes in your RancherVM cluster.
Install the required package:
sudo apt-get install -y nfs-kernel-server
Create the directory that will be shared:
sudo mkdir -p /var/lib/rancher/vm-shared
Append the following line to /etc/exports:
/var/lib/rancher/vm-shared *(rw,sync,no_subtree_check,no_root_squash)
This allows any host IP to mount the NFS share; if your machines are public facing, you may want to restrict * to an internal subnet such as 192.168.100.0/24 (see the example below) or add firewall rules.
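For example, an exports entry restricted to that subnet would read (substitute your own subnet):

/var/lib/rancher/vm-shared 192.168.100.0/24(rw,sync,no_subtree_check,no_root_squash)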
The directory will now be exported during the boot sequence. To export the directory without rebooting, run the following command:
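sudo exportfs -a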
From one of the RancherVM nodes, query for registered RPC programs. Replace <nfs_server_ip> with the (private) IP address of your NFS server:
rpcinfo -p <nfs_server_ip>
You should see program 100003 (NFS service) present, for example:
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    2   tcp    111  portmapper
    100000    4   udp    111  portmapper
    100000    3   udp    111  portmapper
    100000    2   udp    111  portmapper
    100005    1   udp  47321  mountd
    100005    1   tcp  33684  mountd
    100005    2   udp  47460  mountd
    100005    2   tcp  45270  mountd
    100005    3   udp  34689  mountd
    100005    3   tcp  51773  mountd
    100003    2   tcp   2049  nfs
    100003    3   tcp   2049  nfs
    100003    4   tcp   2049  nfs
    100227    2   tcp   2049
    100227    3   tcp   2049
    100003    2   udp   2049  nfs
    100003    3   udp   2049  nfs
    100003    4   udp   2049  nfs
    100227    2   udp   2049
    100227    3   udp   2049
    100021    1   udp  49239  nlockmgr
    100021    3   udp  49239  nlockmgr
    100021    4   udp  49239  nlockmgr
    100021    1   tcp  45624  nlockmgr
    100021    3   tcp  45624  nlockmgr
    100021    4   tcp  45624  nlockmgr
The NFS server is now ready to use. Next we’ll configure RancherVM nodes to mount the exported file system.
Install/Configure NFS clients
Follow this procedure on every host participating as a RancherVM node, including the NFS server if that machine is also a node in the RancherVM cluster.
Install the required package:
sudo apt-get install -y nfs-common
Create the directory that will be mounted:
sudo mkdir -p /var/lib/rancher/vm
Be careful to use this exact path. Append the following line to /etc/fstab. Replace <nfs_server_ip> with the (private) IP address of your NFS server:
<nfs_server_ip>:/var/lib/rancher/vm-shared /var/lib/rancher/vm nfs auto 0 0
The exported directory will now be mounted to /var/lib/rancher/vm during the boot sequence. To mount the directory without rebooting, run the following command:
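sudo mount /var/lib/rancher/vm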
This should return quickly without output. Verify the mount succeeded by checking the mount table:
mount | grep /var/lib/rancher/vm
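If the mount succeeded, you should see a line resembling the following (the NFS version and mount options will vary with your setup):

<nfs_server_ip>:/var/lib/rancher/vm-shared on /var/lib/rancher/vm type nfs4 (rw,relatime,...)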
If an error occurred, refer to the rpcinfo command in the previous section, then check the firewall settings on both NFS server and client.
Let’s ensure we can read/write to the shared directory. On one client, touch a file:
touch /var/lib/rancher/vm/read-write-test
On another client, look for the file:
ls /var/lib/rancher/vm | grep read-write-test
If the file exists, you’re good to go.
Live Migration
Now that shared storage is configured, we are ready to create and migrate VM instances. Install RancherVM into your Kubernetes cluster if you haven’t already.
Usage
You will need at least two ready hosts with sufficient resources to run your instance.
We create an Ubuntu Xenial server instance with 1 vCPU and 1 GB RAM and explicitly assign it to node1.
After waiting a bit, our instance enters running state and is assigned an IP address.
Now, let’s trigger the live migration by clicking the dropdown under the Node Name column. To the left is the requested node; to the right, the currently scheduled node.
Our instance enters migrating state. This does not pause execution; the migration is mostly transparent to the end user.
Once migration completes, the instance returns to the running state. The currently scheduled node now reflects node2, which matches the desired node.
That’s all there is to it. Migrating instances off of a node for maintenance or decommissioning is now a breeze.
How It Works
Live migration is a three-step process:
- Start the new instance on the desired node and configure an incoming socket to expect memory pages from the old instance.
- Initiate the transfer of memory pages, in order, from the old instance to the new one. Pages that change after they have been transferred are tracked and resent on the next pass. This repeats until the remaining dirty pages are few enough to be transferred within a configurable maximum downtime (300 ms by default).
- Stop the old instance, transfer the remaining memory pages and start the new instance. The migration is complete.
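Under the hood this is QEMU/KVM pre-copy migration, which RancherVM orchestrates automatically. A rough sketch of the equivalent manual steps looks like this (the port, hostnames, and downtime value below are purely illustrative):

# On the destination node: launch the new instance waiting for incoming migration state
qemu-system-x86_64 ... -incoming tcp:0:49152

# On the source instance's QEMU monitor: set the allowed downtime (in seconds), then start a detached migration
migrate_set_downtime 0.3
migrate -d tcp:node2:49152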
Moving Forward
We’ve covered manually configuring a shared filesystem and demonstrated the capability to live migrate guest virtual machines from one node to another. This brings us one step closer to achieving a fault tolerant, maintainable virtual machine cloud.
Next up, we plan to integrate RancherVM with Project Longhorn, a distributed block storage system that runs on Kubernetes. Longhorn brings performant, replicated block devices to the table and includes valuable features such as snapshotting. Stay tuned!
James Oliver
Tools and Automation Engineer. Prior to Rancher, James’ first exposure to cluster management was writing frameworks on Apache Mesos, predating the release of DC/OS. A self-proclaimed jack of all trades, James loves reverse engineering complex software solutions and building systems at scale. A proponent of FOSS, he has made it his personal goal to automate the complexities of creating, deploying, and maintaining scalable systems to empower hobbyists and corporations alike. James has a B.S. in Computer Engineering from the University of Arizona.