Plans for upgrading Fritz to Debian Squeeze

'''Upgrade was completed 2011-07-17'''

= Preliminaries =

See the [[http://www.debian.org/releases/stable/amd64/release-notes/ch-upgrading.en.html|Debian release notes on upgrading from lenny]].

== Pre-Install Cleanup Tasks ==

=== Sanitize NSS Configuration ===

'''DONE'''

 * Synchronize the UIDs of locally created users with their counterparts in AFS
   * Affected users
     * `docelic_admin`
     * `rkd_admin`
     * `clinton_admin`
     * `adamc_admin`
     * `shadowfax_admin`
   * Ensure ssh and console login for `root` works and keep the password handy in case all `_admin` accounts are locked out because of the UID changes.
 * Locate and update any files owned by an obsolete UID to the new UID
 * Set up `libnss-afs` (`afs files`)
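
The "locate and update" step above can be sketched with `find`. This is a minimal sketch, not the procedure that was actually used: the function names are made up for illustration, and the UIDs are whatever the old local and new AFS values turn out to be. Note that `chown` clears setuid/setgid bits, so check for those separately.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from the original plan).

# List everything under TREE (default /) still owned by an obsolete UID.
# Usage: list_owned_by OLD_UID [TREE]
list_owned_by() {
    # -xdev keeps find on one filesystem, so /afs and other mounts are skipped.
    find "${2:-/}" -xdev -uid "$1" -print
}

# Re-own everything under TREE (default /) from OLD_UID to NEW_UID.
# Needs root when the two UIDs differ.
# Usage: reown_uid OLD_UID NEW_UID [TREE]
reown_uid() {
    # chown -h changes a symlink itself rather than the file it points to.
    find "${3:-/}" -xdev -uid "$1" -exec chown -h "$2" {} +
}
```

Running `list_owned_by` first and reviewing the output before calling `reown_uid` keeps the change auditable; repeat per local filesystem, since `-xdev` does not cross mount points.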

=== Reconfigure PAM ===

This may be better to do after the installation.

Configure `sshd` and `login` to use `pam_localuser` instead of `pam_unix` so that only local users can log in, regardless of the NSS configuration. Right now non-local users cannot log in using just `pam_unix`, but this is an accident of how `libnss-afs` is implemented and should not be relied upon.
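
`pam_localuser` only succeeds for accounts listed in `/etc/passwd`, so before switching it is worth confirming that `root` and every `_admin` account really are local. A minimal sketch of that check follows; the function name is hypothetical, and it reads the passwd file directly rather than going through NSS (which would also return AFS users).

```shell
#!/bin/sh
# Hypothetical helper: succeed only if NAME has an entry in the passwd file.
# Reads the file directly, bypassing NSS, which is the point of the check.
# Usage: is_local_user NAME [PASSWD_FILE]
is_local_user() {
    awk -F: -v user="$1" '
        $1 == user { found = 1 }
        END { exit !found }
    ' "${2:-/etc/passwd}"
}

# Example loop (commented out so that sourcing this file has no side effects):
# for u in root docelic_admin rkd_admin clinton_admin adamc_admin shadowfax_admin; do
#     is_local_user "$u" || echo "WARNING: $u is not in /etc/passwd"
# done
```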

== Pre-Install Software Upgrades ==

=== Jabber ===

The same version of `ejabberd` must be used across a cluster, and the easiest way to migrate the installation to another machine is to do it with a running cluster. Luckily, `deleuze` is running the version from `etch-backports`, which is the same version as in `lenny`.

==== DONE ====

 1. Install `ejabberd` from `lenny` on `fritz`
 1. Add firewall rules to permit connections to/from `deleuze` on port `4369` (check `deleuze` as well)
 1. Add `fritz` to the mnesia cluster
 1. Add XMPP SRV records to provide both `deleuze` and `fritz`
 1. Ensure everything works for ~24 hours
 1. Remove XMPP SRV records pointing to `deleuze`
 1. Ensure everything continues to work for ~72 hours (DNS propagation &c)
 1. Disable `ejabberd` on `deleuze`
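
While waiting out the 24 and 72 hour windows in the steps above, the published SRV targets can be checked from a few outside resolvers. A minimal sketch, assuming `dig` is available; the function names are hypothetical and `DOMAIN` stands for whichever zone carries the XMPP records.

```shell
#!/bin/sh
# Hypothetical helpers for checking which hosts the XMPP SRV records name.

# Reduce "dig +short SRV" output ("priority weight port target.") to a
# sorted, de-duplicated list of target host names.
srv_hosts() {
    awk 'NF == 4 { sub(/\.$/, "", $4); print $4 }' | sort -u
}

# Query both the client and server-to-server records for DOMAIN.
# Usage: xmpp_srv_targets DOMAIN
xmpp_srv_targets() {
    for svc in _xmpp-client._tcp _xmpp-server._tcp; do
        dig +short SRV "$svc.$1"
    done | srv_hosts
}
```

After adding the records, both `deleuze` and `fritz` should be listed; after removing the `deleuze` records, only `fritz` should remain once caches expire.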

According to the [[http://www.process-one.net/docs/ejabberd/guide_en.html#htoc18|ejabberd guide]], `ejabberd` will automatically update the `mnesia` tables once `fritz` has been upgraded to `squeeze`. Once this is all done, it may be a good idea to add `hopper` to the `ejabberd` cluster for a bit of fault tolerance.

= Installation environment =

'''On All Machines'''

 1. `su` to root, start a `screen` session (preventing partial upgrade issues if the network connection drops)
 1. Open a physical console root login just in case

After the upgrade, remember to log out of the KVM root console on the other machines.

= Installation Steps =

== Early Preparations ==

 * `dpkg --audit`
 * Remove `lenny` and `lenny-backports` from `sources.list`
 * `apt-get update`
 * Run `apt-get upgrade` and ensure no essential packages conflict (e.g. `postgresql-8.1`)
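
One cautious way to handle the `sources.list` step above is to comment the entries out rather than delete them, so the old configuration stays visible. A minimal sketch follows; the function name is hypothetical, and it assumes plain one-line `deb`/`deb-src` entries with the suite in the third field.

```shell
#!/bin/sh
# Hypothetical helper: print FILE with every lenny / lenny-backports entry
# (including suites such as lenny/updates) commented out. The file itself
# is not modified; redirect the output to a new file and review it.
# Usage: disable_lenny_sources FILE
disable_lenny_sources() {
    awk '
        $1 ~ /^deb(-src)?$/ && $3 ~ /^lenny(-backports)?(\/.*)?$/ {
            print "# " $0
            next
        }
        { print }
    ' "$1"
}
```

For example, `disable_lenny_sources /etc/apt/sources.list > /tmp/sources.list.new`, then compare the two files with `diff` before replacing the original.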

== Backup Important Data ==

 * `ejabberd` mnesia database
 * Debian stuff (package lists, ..., ?)
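
The backup could be scripted roughly as follows. This is a sketch under assumptions, not the list the plan leaves open: the set of Debian paths is the one suggested by the release notes, the file names and function names are made up, and `ejabberdctl backup` writes the file as the `ejabberd` user, so the destination must be writable by it. By default the script only prints what it would run.

```shell
#!/bin/sh
# Prints the commands by default; set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical wrapper; file names under DEST_DIR are illustrative.
# Usage: pre_upgrade_backup DEST_DIR
pre_upgrade_backup() {
    dest="$1"
    run mkdir -p "$dest"
    # Dump the mnesia database of the running ejabberd node.
    run ejabberdctl backup "$dest/ejabberd.backup"
    # Configuration and package state (assumed set, per the release notes).
    run tar -czf "$dest/dpkg-state.tar.gz" /etc /var/lib/dpkg /var/lib/apt/extended_states
    # Package selections, for reinstalling the same package set if needed.
    run sh -c 'dpkg --get-selections "*" > "$1"' sh "$dest/dpkg-selections.txt"
}
```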

== Upgrade Kernel and udev ==

 1. Install new kernel image and `openafs-modules-dkms`
 1. Install `udev`
 1. Reboot

== Basic Upgrade ==

 1. `apt-get upgrade`
 1. Reboot?

== Full Upgrade ==

 1. `apt-get dist-upgrade`
 1. Reboot?

== Clean Up ==

 1. Make sure the other machines are still sane after losing volume access for a while.

= Caveats =

== pam_unix_session locking all login access ==

'''Not an issue'''

This bit us on `hopper`. ClintonEbadi has confirmed this is not in use; it appears `hopper`'s PAM configuration was copied from another machine that had been running `etch` earlier and used deprecated modules.

== Locally built packages ==

'''Not an issue'''

ClintonEbadi scanned the currently installed packages: we are using the backports versions of afs and kerberos, and nothing else is locally built.

= Service Interruption Mitigation =

== Read Only Volumes on Deleuze ==

''Not Doing This'' (the time required is not worth a few minutes of afs downtime at this point)

Since we have openafs we may as well take advantage of it by adding deleuze's `vicepa` as a site for `user.$USER` volumes. There does not appear to be enough room for `mail.$USER` volumes so we won't worry about those (mail will still be queued and having a read only copy of mail volumes is of dubious value).

=== Preparation ===

A few days before the upgrade:

 * Prevent backup from running (uncomment `exit 0` in `hcoop-backup-wrapper`) before scheduled upgrade date
 * Purge last backup data
 * Purge `db.$USER` volumes
 * Purge `{user,mail}.$USER.d` volumes for members who departed more than (tentatively) 90 days ago
 * For all active `user.$USER` volumes: `vos addsite deleuze vicepa user.$USER`

Immediately before upgrading:

 * For all active `user.$USER` volumes: `vos release user.$USER`
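
The two "for all active `user.$USER` volumes" steps could be driven from a list of user names. A minimal sketch: the function name is hypothetical, and how the list of active users is produced is left open. It prints the `vos` commands rather than running them, so the output can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: read user names on stdin (one per line) and print the
# matching vos command for each user.$USER volume.
# Usage: vos_user_cmds addsite|release < USER_LIST
vos_user_cmds() {
    action="$1"
    while IFS= read -r user; do
        [ -n "$user" ] || continue   # skip blank lines
        case "$action" in
            addsite) echo "vos addsite -server deleuze -partition vicepa -id user.$user" ;;
            release) echo "vos release -id user.$user" ;;
            *) echo "unknown action: $action" >&2; return 1 ;;
        esac
    done
}
```

Once the printed commands look right, pipe them to `sh -x` from a session holding AFS admin tokens.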

=== Clean Up ===

For all user volumes, run `vos remsite deleuze vicepa user.$USER` to free space for the backup. Alternatively, since the backup will be moved to fritz anyway, the read-only sites could be left in place. There seems to be little benefit in that, though: deleuze does not have much space compared to fritz, and nothing is in place to regularly `vos release` the volumes, so the read-only copies would be effectively useless.