Checkpointing and live migration

A live migration and checkpointing feature was released for OpenVZ in mid-April 2006. It allows a container to be migrated from one physical server to another without shutting down or restarting the container. The underlying process is known as checkpointing: a container (CT) is frozen and its whole state is saved to a file on disk. This file can then be transferred to another machine, where the CT is unfrozen (restored). The whole procedure takes a few seconds, and this is a delay rather than downtime.
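The freeze, transfer, and restore steps described above can be sketched as a short Python helper that only builds the command lines and does not execute anything. It relies on the `vzctl chkpnt` and `vzctl restore` subcommands with their `--dumpfile` option; the container ID, destination host name, dump file path, and the use of `scp`/`ssh` as root are illustrative assumptions. The sketch also assumes that the container's files and configuration are already present on the destination server, since the dump file holds only the saved running state.

```python
from typing import List


def checkpoint_restore_plan(ctid: int, dest_host: str, dumpfile: str) -> List[List[str]]:
    """Build (but do not run) the commands for a manual checkpoint/restore cycle.

    ctid, dest_host and dumpfile are hypothetical example values supplied
    by the caller; nothing here contacts a real server.
    """
    ct = str(ctid)
    remote = f"root@{dest_host}"  # assumption: root SSH access to the destination
    return [
        # 1. Freeze the container and save its state to a file on disk.
        ["vzctl", "chkpnt", ct, "--dumpfile", dumpfile],
        # 2. Transfer the dump file to the destination machine.
        ["scp", dumpfile, f"{remote}:{dumpfile}"],
        # 3. Restore (unfreeze) the container on the destination.
        ["ssh", remote, "vzctl", "restore", ct, "--dumpfile", dumpfile],
    ]


if __name__ == "__main__":
    for cmd in checkpoint_restore_plan(101, "dest.example.com", "/tmp/ct101.dump"):
        print(" ".join(cmd))
```

Returning the commands as lists keeps the sketch safe to run anywhere; an administrator could pass each list to `subprocess.run` once the values have been checked against a real setup.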

Since every piece of the container state, including open network connections, is saved, from the user's perspective the migration looks like a delay in response: say, one database transaction takes longer than usual, then everything continues as normal, and the user does not notice that the database is now running on another machine.

This feature makes possible scenarios such as upgrading your server without any need to reboot the container: if your database needs more memory or CPU resources, you buy a newer, better server, live migrate your container to it, and then increase its limits. If you want to add more RAM to your server, you migrate all containers to another server, shut the original one down, add memory, start it again, and migrate all containers back.
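The memory-upgrade scenario above can be sketched as a two-phase plan. This is a minimal illustration, assuming the `vzmigrate` utility with its `--online` option is used for each live migration; the host names, container IDs, and the use of `ssh` to start the return migration from the spare server are hypothetical. As before, the helper only builds command lines and runs nothing.

```python
from typing import List, Sequence, Tuple

Command = List[str]


def maintenance_plan(
    ctids: Sequence[int], spare_host: str, original_host: str
) -> Tuple[List[Command], List[Command]]:
    """Build commands to evacuate containers to a spare server and bring them back.

    All arguments are hypothetical example values. The first list is meant
    to be run on the original server before it is shut down; the second
    after it has been started again with the new memory installed.
    """
    # Phase 1: live migrate every container away from the original server.
    evacuate = [["vzmigrate", "--online", spare_host, str(ct)] for ct in ctids]
    # Phase 2: ask the spare server (over SSH) to live migrate them back.
    bring_back = [
        ["ssh", f"root@{spare_host}", "vzmigrate", "--online", original_host, str(ct)]
        for ct in ctids
    ]
    return evacuate, bring_back


if __name__ == "__main__":
    away, back = maintenance_plan([101, 102], "spare.example.com", "main.example.com")
    for cmd in away + back:
        print(" ".join(cmd))
```

Keeping the two phases as separate lists makes the manual step in between explicit: the shutdown, hardware change, and restart happen after the first list has completed and before the second is used.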