Wednesday 21 June 2017

Boot Environments with Live Upgrade under the hood

Every Solaris admin who patches their Solaris infrastructure will be familiar with the Live Upgrade feature of Solaris. I consider it most useful with ZFS, although it can be used with UFS as well. In this article I'll focus primarily on what happens when we create and activate a boot environment; I won't be demonstrating a Live Upgrade patching procedure here.

A boot environment is basically a bootable instance of the Solaris operating system, comprising a root dataset and, optionally, other datasets underneath it. A dataset is a generic name for a ZFS entity; with respect to boot environments, the datasets in question are the components that make up the boot environment, and they live in the root zpool, usually named rpool by convention.

To clarify, for the purposes of this demonstration I'm using Solaris 10, the root zpool is named rpool, and there is no separate /var dataset.

When we create a boot environment, the Live Upgrade utilities provided within the operating system take a snapshot of the root ZFS file system, clone that snapshot, and populate the new boot environment from the clone. Datasets other than the root file system are not copied; they are shared between the active and inactive boot environments, which I find quite elegant!
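Conceptually, the cloning step boils down to ordinary ZFS commands. The sketch below is a rough manual equivalent of what lucreate does for the root dataset in this demonstration (lucreate additionally maintains its own configuration files and the GRUB menu, so this is an illustration only, not something to run by hand):

     # take a snapshot of the current root dataset
     zfs snapshot rpool/ROOT/s10x_u8wos_08a@testBE
     # clone the snapshot into a new root dataset for the new BE
     zfs clone rpool/ROOT/s10x_u8wos_08a@testBE rpool/ROOT/testBE
     # the new root must not mount automatically alongside the live one
     zfs set canmount=noauto rpool/ROOT/testBE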

Let's get into the command line for a demonstration.

First we'll get the current status of our rpool file systems.

root@sandbox:/# zfs list -r rpool
NAME                        USED  AVAIL  REFER  MOUNTPOINT
rpool                      5.64G  9.98G    34K  /rpool
rpool/ROOT                 3.64G  9.98G    21K  legacy
rpool/ROOT/s10x_u8wos_08a  3.64G  9.98G  3.64G  /
rpool/dump                 1.00G  9.98G  1.00G  -
rpool/export                 44K  9.98G    23K  /export
rpool/export/home            21K  9.98G    21K  /export/home
rpool/swap                    1G  11.0G    16K  -


Now let's run lustatus to check if there are any boot environments created on the system.

root@sandbox:/# lustatus
ERROR: No boot environments are configured on this system
ERROR: cannot determine list of all boot environment names
root@sandbox:/#

This is a fresh install, so lustatus shows no BEs. Now let's create one.

root@sandbox:/# lucreate -n testBE
Checking GRUB menu...
Analyzing system configuration.
No name for current boot environment.
INFORMATION: The current boot environment is not named - assigning name <s10x_u8wos_08a>.
Current boot environment is named <s10x_u8wos_08a>.
Creating initial configuration for primary boot environment <s10x_u8wos_08a>.
The device </dev/dsk/c0d0s0> is not a root device for any boot environment; cannot get BE ID.
PBE configuration successful: PBE name <s10x_u8wos_08a> PBE Boot Device </dev/dsk/c0d0s0>.
Comparing source boot environment <s10x_u8wos_08a> file systems with the
file system(s) you specified for the new boot environment. Determining
which file systems should be in the new boot environment.
Updating boot environment description database on all BEs.
Updating system configuration files.
Creating configuration for boot environment <testBE>.
Source boot environment is <s10x_u8wos_08a>.
Creating boot environment <testBE>.
Cloning file systems from boot environment <s10x_u8wos_08a> to create boot environment <testBE>.
Creating snapshot for <rpool/ROOT/s10x_u8wos_08a> on <rpool/ROOT/s10x_u8wos_08a@testBE>.
Creating clone for <rpool/ROOT/s10x_u8wos_08a@testBE> on <rpool/ROOT/testBE>.
Setting canmount=noauto for </> in zone <global> on <rpool/ROOT/testBE>.
WARNING: split filesystem </> file system type <zfs> cannot inherit
mount point options <-> from parent filesystem </> file
type <-> because the two file systems have different types.
Saving existing file </boot/grub/menu.lst> in top level dataset for BE <testBE> as <mount-point>//boot/grub/menu.lst.prev.
File </boot/grub/menu.lst> propagation successful
Copied GRUB menu from PBE to ABE
No entry for BE <testBE> in GRUB menu
Population of boot environment <testBE> successful.
Creation of boot environment <testBE> successful.
root@sandbox:/#


Now that we have created a new boot environment let's check its status.

root@sandbox:/# lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
s10x_u8wos_08a             yes      yes    yes       no     -
testBE                     yes      no     no        yes    -

After the creation of the boot environment, a recursive view of rpool looks as follows:

root@sandbox:/# zfs list -r rpool
NAME                               USED  AVAIL  REFER  MOUNTPOINT
rpool                             5.64G  9.98G    36K  /rpool
rpool/ROOT                        3.64G  9.98G    21K  legacy
rpool/ROOT/s10x_u8wos_08a         3.64G  9.98G  3.64G  /
rpool/ROOT/s10x_u8wos_08a@testBE  68.5K      -  3.64G  -
rpool/ROOT/testBE                 99.5K  9.98G  3.64G  /
rpool/dump                        1.00G  9.98G  1.00G  -
rpool/export                        44K  9.98G    23K  /export
rpool/export/home                   21K  9.98G    21K  /export/home
rpool/swap                           1G  11.0G    16K  -
root@sandbox:/#

We can observe that the snapshot rpool/ROOT/s10x_u8wos_08a@testBE and the new dataset rpool/ROOT/testBE have been created.
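If you only want to see the snapshots that back the clones, a plain zfs command (independent of Live Upgrade) does the job:

     # list all snapshots under the root pool
     zfs list -t snapshot -r rpool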

The datasets associated with boot environments can be viewed with the lufslist command as shown below:

root@sandbox:/# lufslist -n s10x_u8wos_08a
               boot environment name: s10x_u8wos_08a
               This boot environment is currently active.
               This boot environment will be active on next system boot.

Filesystem              fstype    device size Mounted on          Mount Options
----------------------- -------- ------------ ------------------- --------------
/dev/zvol/dsk/rpool/swap swap       1073741824 -                   -
rpool/ROOT/s10x_u8wos_08a zfs        3911067648 /                   -
rpool                   zfs        6059543040 /rpool              -
rpool/export            zfs             45056 /export             -
rpool/export/home       zfs             21504 /export/home        -
root@sandbox:/#

root@sandbox:/# lufslist -n testBE
               boot environment name: testBE

Filesystem              fstype    device size Mounted on          Mount Options
----------------------- -------- ------------ ------------------- --------------
/dev/zvol/dsk/rpool/swap swap       1073741824 -                   -
rpool/ROOT/testBE       zfs            103936 /                   -
rpool/export            zfs             45056 /export             -
rpool/export/home       zfs             21504 /export/home        -
rpool                   zfs        6059564544 /rpool              -
root@sandbox:/#


Now, to prove that the testBE boot environment is in fact a clone of a snapshot of the original boot environment s10x_u8wos_08a's root file system, we'll check its origin property.

The origin property of a ZFS dataset tells us which snapshot, if any, the dataset was cloned from.
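If you'd rather check every root dataset in one go, a recursive zfs get works as well (just a convenience, not something Live Upgrade requires):

     # show the origin property for every dataset under rpool/ROOT
     zfs get -r origin rpool/ROOT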

If I check for the origin property of my root zfs dataset rpool/ROOT/s10x_u8wos_08a, I get:

root@sandbox:/# zfs get origin rpool/ROOT/s10x_u8wos_08a
NAME                       PROPERTY  VALUE   SOURCE
rpool/ROOT/s10x_u8wos_08a  origin    -       -

The value is dashed out, meaning this dataset is not a clone of any snapshot.

If I check the origin property of my new BE's dataset rpool/ROOT/testBE, I get:

root@sandbox:/# zfs get origin rpool/ROOT/testBE
NAME               PROPERTY  VALUE                             SOURCE
rpool/ROOT/testBE  origin    rpool/ROOT/s10x_u8wos_08a@testBE  -
root@sandbox:/#


There we have it. The origin of this dataset is the snapshot of our original root dataset, which verifies that rpool/ROOT/testBE is a clone.

We could go ahead and patch the alternate boot environment at this point, but I'm not going to do that here.
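For completeness, patching the inactive BE is done with luupgrade; a typical invocation looks roughly like the following, where the patch directory and patch IDs are placeholders for illustration:

     # apply patches to the inactive BE (paths and patch IDs are hypothetical)
     luupgrade -t -n testBE -s /var/tmp/patches 123456-01 123457-02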

Now let's activate testBE.

root@sandbox:/# luactivate testBE
Generating boot-sign, partition and slice information for PBE <s10x_u8wos_08a>
Saving existing file </etc/bootsign> in top level dataset for BE <s10x_u8wos_08a> as <mount-point>//etc/bootsign.prev.
A Live Upgrade Sync operation will be performed on startup of boot environment <testBE>.

Generating boot-sign for ABE <testBE>
Saving existing file </etc/bootsign> in top level dataset for BE <testBE> as <mount-point>//etc/bootsign.prev.
Generating partition and slice information for ABE <testBE>
Copied boot menu from top level dataset.
Generating multiboot menu entries for PBE.
Generating multiboot menu entries for ABE.
Disabling splashimage
Re-enabling splashimage
No more bootadm entries. Deletion of bootadm entries is complete.
GRUB menu default setting is unaffected
Done eliding bootadm entries.

**********************************************************************

The target boot environment has been activated. It will be used when you
reboot. NOTE: You MUST NOT USE the reboot, halt, or uadmin commands. You
MUST USE either the init or the shutdown command when you reboot. If you
do not use either init or shutdown, the system will not boot using the
target BE.

**********************************************************************

In case of a failure while booting to the target BE, the following process
needs to be followed to fallback to the currently working boot environment:

1. Boot from Solaris failsafe or boot in single user mode from the Solaris
Install CD or Network.

2. Mount the Parent boot environment root slice to some directory (like
/mnt). You can use the following command to mount:

     mount -Fzfs /dev/dsk/c0d0s0 /mnt

3. Run <luactivate> utility with out any arguments from the Parent boot
environment root slice, as shown below:

     /mnt/sbin/luactivate

4. luactivate, activates the previous working boot environment and
indicates the result.

5. Exit Single User mode and reboot the machine.

**********************************************************************

Modifying boot archive service
Propagating findroot GRUB for menu conversion.
File </etc/lu/installgrub.findroot> propagation successful
File </etc/lu/stage1.findroot> propagation successful
File </etc/lu/stage2.findroot> propagation successful
File </etc/lu/GRUB_capability> propagation successful
Deleting stale GRUB loader from all BEs.
File </etc/lu/installgrub.latest> deletion successful
File </etc/lu/stage1.latest> deletion successful
File </etc/lu/stage2.latest> deletion successful
Activation of boot environment <testBE> successful.
root@sandbox:/#


Now the lustatus output looks like this:

root@sandbox:/# lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
s10x_u8wos_08a             yes      yes    no        no     -
testBE                     yes      no     yes       no     -

The testBE will become the active boot environment for this system after reboot.
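As the luactivate output above insists, reboot with init or shutdown rather than reboot, halt or uadmin, otherwise the switch to the new BE will not complete properly:

     # either of these is fine; reboot/halt/uadmin are not
     init 6
     shutdown -y -g0 -i6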


Let's take a look at the zfs list output for rpool again.

root@sandbox:/# zfs list -r rpool
NAME                        USED  AVAIL  REFER  MOUNTPOINT
rpool                      5.64G  9.98G  36.5K  /rpool
rpool/ROOT                 3.64G  9.98G    21K  legacy
rpool/ROOT/s10x_u8wos_08a   682K  9.98G  3.64G  /
rpool/ROOT/testBE          3.64G  9.98G  3.64G  /
rpool/ROOT/testBE@testBE    154K      -  3.64G  -
rpool/dump                 1.00G  9.98G  1.00G  -
rpool/export                 44K  9.98G    23K  /export
rpool/export/home            21K  9.98G    21K  /export/home
rpool/swap                    1G  11.0G    16K  -
root@sandbox:/#

Notice that the dataset rpool/ROOT/s10x_u8wos_08a is now using only 682K of space, while rpool/ROOT/testBE is using 3.64G, and the testBE snapshot now sits under rpool/ROOT/testBE.

The origin properties for these datasets have also changed as shown below:


root@sandbox:/# zfs get origin rpool/ROOT/testBE
NAME               PROPERTY  VALUE   SOURCE
rpool/ROOT/testBE  origin    -       -
root@sandbox:/#
root@sandbox:/# zfs get origin rpool/ROOT/s10x_u8wos_08a
NAME                       PROPERTY  VALUE                     SOURCE
rpool/ROOT/s10x_u8wos_08a  origin    rpool/ROOT/testBE@testBE  -
root@sandbox:/#

This basically means that a zfs promote operation has been carried out: the original root dataset rpool/ROOT/s10x_u8wos_08a has been replaced by the clone dataset rpool/ROOT/testBE, which will be the root file system after reboot. Once rpool/ROOT/testBE is the active root file system, we can delete rpool/ROOT/s10x_u8wos_08a and its associated datasets, as they'll no longer be needed unless we want to roll back to the previous BE.
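In other words, the activation performed the ZFS-level equivalent of a promote, and the old BE can later be removed through Live Upgrade itself. A rough sketch of the corresponding commands (shown only to illustrate the mechanism; on a Live Upgrade managed system let the lu* tools do this for you):

     # the ZFS operation that swaps the clone/origin relationship
     zfs promote rpool/ROOT/testBE
     # once booted into testBE, remove the old BE via Live Upgrade
     ludelete s10x_u8wos_08a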

I've not been able to drill down into exactly what happens during the reboot that automatically promotes an activated BE to be the root file system. I'll definitely write about that if I'm able to ascertain the details.
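One place I'd start digging is the root pool's bootfs property, which on a ZFS root system records the default root dataset to boot; comparing it before and after activation could be instructive (this is just a pointer for further investigation, not the full story):

     # check which root dataset the pool is configured to boot
     zpool get bootfs rpool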

I hope this article has been helpful in understanding the Live Upgrade process beyond the lucreate, luupgrade and luactivate commands.
