Mirroring root / boot disk with SVM

Guide to mirroring root / boot disks with Solaris Volume Manager (DiskSuite) on Solaris 10


Here's a step-by-step tutorial on mirroring your system drives with Solaris Volume Manager (SVM) under Solaris 10. It will also work with Solaris 8 and 9 (SVM was previously known as DiskSuite, Online Disk Suite, or just ODS). SVM is mature and stable; I'd recommend using it until ZFS matures.

Step 0:

The following packages should be installed on Solaris 10:
   SUNWmdr   (Solaris Volume Manager)
   SUNWmdu   (Solaris Volume Manager)
   SUNWmdar  (Solaris Volume Manager Assistant (Root))
   SUNWmdau  (Solaris Volume Manager Assistant (Usr))
You can use:
   pkginfo -l | grep _packagename_
to make sure these packages are installed. If not, you'll have to install them (from CD or another source; that's beyond the scope of this document).
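The package check can also be scripted. This is a minimal sketch, assuming a POSIX-style shell (ksh or /usr/xpg4/bin/sh) and relying on pkginfo -q, which reports only through its exit status. The function name check_svm_packages is my own invention, not a Solaris command.

```shell
# Report which of the SVM packages listed above are missing.
# check_svm_packages is a hypothetical helper name, not part of Solaris.
check_svm_packages() {
    missing=0
    for pkg in SUNWmdr SUNWmdu SUNWmdar SUNWmdau; do
        # pkginfo -q prints nothing; exit status 0 means the package is installed
        if pkginfo -q "$pkg"; then
            echo "$pkg installed"
        else
            echo "$pkg MISSING"
            missing=1
        fi
    done
    return $missing
}

# Usage: check_svm_packages || echo "install the missing packages first"
```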

Step 1: Label the disks

I suggest that you put a Sun label on your disks. This may make them slightly smaller than they otherwise would be, but it lets you use any replacement disk of the same general size when a disk fails. For example, if you were originally using two Seagate ST336705LC drives with their native labels and one failed, and you could only get an ST336704LC as a replacement, this wouldn't work: the ST336705LC by default has some extra blocks, which makes it slightly larger than the ST336704LC. Partitions on the replacement have to be the same size or larger, or mirroring will fail. You can set this up by using format:
AVAILABLE DISK SELECTIONS:
       0. c0t0d0 
          /pci@1f,4000/scsi@3/sd@0,0
       1. c0t1d0 
          /pci@1f,4000/scsi@3/sd@1,0
..
..
      10. c4t6d0 
          /pci@1f,4000/IntraServer-Ultra160,scsi@5/sd@6,0
Specify disk (enter its number): 10
selecting c4t6d0
[disk formatted]


FORMAT MENU:
        disk       - select a disk
        type       - select (define) a disk type
        partition  - select (define) a partition table
        current    - describe the current disk
        format     - format and analyze the disk
        repair     - repair a defective sector
        label      - write label to the disk
        analyze    - surface analysis
        defect     - defect list management
        backup     - search for backup labels
        verify     - read and display labels
        save       - save new disk/partition definitions
        inquiry    - show vendor, product and revision
        volname    - set 8-character volume name
        !<cmd>     - execute <cmd>, then return
        quit
format> type

AVAILABLE DRIVE TYPES:
        0. Auto configure
        1. SUN146G
        2. SUN18G
        3. SUN36G
        ...
        23. SEAGATE-ST318451LC-0003
        24. SEAGATE-ST336704LC-0004
        25. SEAGATE-ST318451LC-0002
        26. other
Specify disk type (enter its number)[25]: 2
selecting c4t6d0
NOTE: for the rest of this document, I will assume that your two system drives are c0t0d0 and c0t1d0.

Step 2: Partition the disks identically

Again, using format or whatever partitioning tool you prefer, partition the two system disks identically. Here's an example:
Part      Tag    Flag     Cylinders         Size            Blocks
  0       root    wm    2701 -  7561        8.79GB    (4861/0/0) 18432912
  1       swap    wu       0 -  2700        4.88GB    (2701/0/0) 10242192
  2     backup    wm       0 - 18901       34.18GB    (18902/0/0) 71676384
  3        var    wm    7562 -  9722        3.91GB    (2161/0/0) 8194512
  4 unassigned    wm    9723 - 11870        3.88GB    (2148/0/0) 8145216
  5 unassigned    wm   11871 - 16731        8.79GB    (4861/0/0) 18432912
  6 unassigned    wm   16732 - 18892        3.91GB    (2161/0/0) 8194512
  7 unassigned    wm   18893 - 18901       16.66MB    (9/0/0) 34128
I'll be using slice 0 (s0) for root (/), slice 3 (s3) for /var, and slice 1 (s1) for swap. Slice 7 will be used by disksuite for the metadb state database. You'll need root, /var, swap, and a small partition for the metadb at a minimum; I use the other partitions here for the excellent Solaris Live Upgrade feature.
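One common way to make the second disk's layout identical to the first is to pipe prtvtoc into fmthard instead of retyping the table in format. This is a sketch under this guide's assumption that the disks are c0t0d0 and c0t1d0. The wrapper name copy_vtoc is hypothetical, and running it overwrites the slice table on the target disk.

```shell
# Copy the slice table (VTOC) from one disk to another.
# copy_vtoc is a hypothetical helper; $1 = source disk, $2 = target disk.
copy_vtoc() {
    # Slice 2 conventionally covers the whole disk, so it is used to
    # read the source label and write the target label.
    prtvtoc "/dev/rdsk/${1}s2" | fmthard -s - "/dev/rdsk/${2}s2"
}

# Usage (destroys the existing slice table on the target):
#   copy_vtoc c0t0d0 c0t1d0
```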
OK, what's a metadb? From the metadb man page:
   The metadevice state database contains the configuration of all
   metadevices and hot spare pools in the system. Additionally, the
   metadevice state database keeps track of the current state of
   metadevices and hot spare pools, and their components.

Step 3: create the metadb state databases

   metadb -a -f -c 3 /dev/dsk/c0t0d0s7
   metadb -a -c 3 /dev/dsk/c0t1d0s7
You can view the current metadb configuration with:
   metadb -i
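To confirm that each disk really carries three replicas, the metadb output can be tallied per slice. This is a rough sketch. It assumes each replica line ends with its /dev/dsk/... slice name, and that the awk in use supports arrays (on Solaris, nawk or /usr/xpg4/bin/awk). The name count_replicas is made up for illustration.

```shell
# Count state database replicas per slice from "metadb" output on stdin.
# count_replicas is a hypothetical helper name.
count_replicas() {
    # Only lines whose last field is a /dev/dsk/ path are replicas;
    # the header line is skipped because its last field is not a path.
    awk '$NF ~ /^\/dev\/dsk\// { n[$NF]++ }
         END { for (d in n) print d, n[d] }' | sort
}

# Usage: metadb | count_replicas
```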

Step 4: Create /etc/lvm/md.tab file

At this point I create a text file called /etc/lvm/md.tab. From the man page:
   The /etc/lvm/md.tab file contains Solaris Volume Manager configuration
   information that can be used to reconstruct your Solaris Volume Manager
   configuration. 
You could bypass this file and specify everything on the command line, but I find this easier, and it serves as documentation of what was done. Your file should look like this:
d10 -m d101 d102 1
d101 1 1 c0t0d0s0
d102 1 1 c0t1d0s0
d11 -m d111 d112 1
d111 1 1 c0t0d0s1
d112 1 1 c0t1d0s1
d12 -m d121 d122 1
d121 1 1 c0t0d0s3
d122 1 1 c0t1d0s3
Thus the d10 metadevice is going to be our root (/) mirror, and d101 and d102 are going to be our root submirrors. Note that if you use metadevice names above d127, you might have to increase the "nmd" parameter in /kernel/drv/md.conf; see the "Tips and Tricks" section, below.
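The naming pattern in that file (mirror dN, submirrors dN1 and dN2) is regular enough to generate. The sketch below prints the same nine lines from a list of mirror:slice pairs. It assumes a shell with ${var%%pattern} expansion (ksh, bash, or /usr/xpg4/bin/sh, not the legacy /bin/sh). The name gen_mdtab is my own.

```shell
# Print md.tab entries for a set of two-way mirrors.
# gen_mdtab is a hypothetical helper.
#   $1, $2 : the two disks
#   rest   : mirror-number:slice pairs, e.g. 10:0 for d10 on slice 0
gen_mdtab() {
    d1=$1; d2=$2; shift 2
    for pair in "$@"; do
        m=${pair%%:*}    # mirror number, e.g. 10
        s=${pair##*:}    # slice number, e.g. 0
        echo "d${m} -m d${m}1 d${m}2 1"
        echo "d${m}1 1 1 ${d1}s${s}"
        echo "d${m}2 1 1 ${d2}s${s}"
    done
}

# Usage: gen_mdtab c0t0d0 c0t1d0 10:0 11:1 12:3
```

Review the output before saving it to /etc/lvm/md.tab.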

Step 5: Set up one-way mirrors

At this point, we can start creating our mirrors. Run the following commands to initialize the mirror devices, and half of the submirrors:
    metainit d122
    metainit -f d121
    metainit d12 -m d121

    metainit d102
    metainit -f d101
    metainit d10 -m d101
    
    metaroot d10
    
    metainit d112
    metainit -f d111
    metainit d11 -m d111

Step 6: Edit /etc/vfstab

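I suggest you make a quick backup of /etc/vfstab (for example: cp /etc/vfstab /etc/vfstab.ORIG). Then change the entries for /var and swap to reflect the new devices. The metaroot command from Step 5 should already have updated the root (/) entry, so just verify it. For example: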
ORIGINAL:
/dev/dsk/c0t0d0s0 /dev/rdsk/c0t0d0s0 /  ufs 1 no logging
/dev/dsk/c0t0d0s3 /dev/rdsk/c0t0d0s3 /var ufs 1 no logging
/dev/dsk/c0t0d0s1 -    -      swap - no -
NEW:
/dev/md/dsk/d10 /dev/md/rdsk/d10    /        ufs    1     no logging
/dev/md/dsk/d12 /dev/md/rdsk/d12    /var     ufs    1     no logging
/dev/md/dsk/d11  -                  -        swap   -     no       -
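Since a mistake in /etc/vfstab can leave the system unbootable, it may be worth a sanity check before rebooting. The sketch below only reads the file and flags /, /var, or swap entries that do not point at a /dev/md/dsk device. The name check_vfstab is hypothetical, and on Solaris you may need nawk or /usr/xpg4/bin/awk in place of awk.

```shell
# Verify that /, /var and swap in a vfstab file use SVM metadevices.
# check_vfstab is a hypothetical helper; $1 = path to the vfstab file.
# Returns non-zero and prints the offending lines if any are found.
check_vfstab() {
    awk '
        /^#/ || NF < 7 { next }     # skip comments and malformed lines
        ($3 == "/" || $3 == "/var" || $4 == "swap") &&
        $1 !~ /^\/dev\/md\/dsk\// {
            print "not on a metadevice: " $0
            bad = 1
        }
        END { exit bad }
    ' "$1"
}

# Usage: check_vfstab /etc/vfstab && echo "vfstab looks ready for reboot"
```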

Step 7: Reboot

sync
sync
init 6

Step 8: Sync the second half (submirror: make 2-way mirrors)

metattach d10 d102
metattach d11 d112
metattach d12 d122
Note that the metattach command will come back almost immediately, but the disks will be syncing in the background. You can run the following command to see the progress of the sync, as well as to check for errors (denoted by "Maintenance" status):
   metastat
OR, to filter out the uninteresting output:
   metastat | grep -i resync
At this point, your system disks are mirrored. I would recommend setting up a small shell script to check for disk/mirroring problems (using metastat), and run it via cron.
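A monitoring script along those lines could be as small as the sketch below. It reads metastat output on stdin and treats any line mentioning "maintenance" as a problem, which follows the "Maintenance" status described above. The name check_mirrors and the mail recipient in the usage comment are placeholders.

```shell
# Scan metastat output (on stdin) for components needing maintenance.
# check_mirrors is a hypothetical helper; returns 1 if a problem is found.
check_mirrors() {
    # "|| true" keeps a no-match grep from aborting scripts run with set -e
    problems=$(grep -i 'maintenance' || true)
    if [ -n "$problems" ]; then
        echo "SVM problem detected:"
        echo "$problems"
        return 1
    fi
    return 0
}

# Possible use from a cron-driven script (recipient is a placeholder):
#   metastat | check_mirrors || echo "run metastat" | mailx -s "SVM alert" root
```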

Tips and Tricks:

By default the system will not boot unless a majority (one more than half) of the metadb replicas is available. I suggest you put extra copies of the metadb on some of your non-system disks. If you only have two disks (that is, just the system disks), I suggest you add this entry to your /etc/system file:
set md:mirrored_root_flag=1
This will allow you to boot with only half of the metadb replicas available, which is the situation you are in when one of your two disk drives fails.
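The arithmetic behind this is worth spelling out. A tiny sketch of the default rule as described above (the function name replicas_needed is made up):

```shell
# Majority needed to boot by default: one more than half of all replicas.
# replicas_needed is a hypothetical helper; $1 = total number of replicas.
replicas_needed() {
    expr "$1" / 2 + 1
}

# With 3 replicas on each of two disks (6 total, as created in Step 3):
#   replicas_needed 6
# prints 4, but only 3 replicas survive the loss of one disk.
```

That shortfall of one replica is why a two-disk system needs either the /etc/system setting or extra replicas on other disks.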
Also, you might want to add:
set md_mirror:md_resync_bufsz = 2048
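to /etc/system as well. This greatly speeds up resyncs of failed mirror components by using a larger resync buffer. As far as I can tell from Sun's tunables documentation, the value is counted in 512-byte blocks, so 2048 is a 1 MB buffer versus the default of 128 blocks (64 KB). On one of our SunFire V215 machines, this increased the resync speed from 8-9 MB/second to 36-37 MB/second.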
The default maximum number of metadevices is 128 (d0 through d127). For example, creating a metadevice named "d129" will fail with an error. You can increase this by changing:
nmd=128 
to
nmd=400
in /kernel/drv/md.conf. This requires a reconfiguration boot ("boot -r") to take effect.
Lastly, if your system ever panics or something else causes it to reboot while one of the system disks has failed, you can manually boot from the other disk (without physically moving it) by running:
boot disk1
from the OK prompt. Similarly:
boot disk0
will boot off the primary disk (this is the default, and I wouldn't recommend changing it).