BackupInfo

This page describes the procedure for accessing and using our off-site backups. Only admins can do this -- if you want to get some file or directory back from the dead and are not an admin, please open a Bugzilla bug.

The backup/restore procedure below is being replaced with obnam, a backup manager that can perform incremental backups while simultaneously keeping the backup encrypted.

1. Managing Backups

The backup manager script is currently broken and needs to be rewritten on top of obnam.

Using backup-manager:

backup-manager list
backup-manager list YYYY.MM.DD

1.2. Retrieving a backup

(NOTE: $VOLNAME is not simply the username; it is <db|mail|user>.USERNAME)

Using backup-manager:

backup-manager get YYYY.MM.DD $VOLNAME.dump.gz.aescrypt
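
The naming rule from the note above can be sketched as two small shell helpers. Both function names are hypothetical (they are not part of backup-manager), and "alice" is a made-up username.

```shell
# Hypothetical helpers, not part of backup-manager.

# volname TYPE USERNAME -> TYPE.USERNAME, where TYPE is db, mail, or user
volname() {
    case "$1" in
        db|mail|user) printf '%s.%s\n' "$1" "$2" ;;
        *) echo "unknown volume type: $1" >&2; return 1 ;;
    esac
}

# dumpfile VOLNAME -> file name of the encrypted dump for that volume
dumpfile() {
    printf '%s.dump.gz.aescrypt\n' "$1"
}

# Example (made-up user and date):
# backup-manager get 2019.03.31 "$(dumpfile "$(volname user alice)")"
```

Rejecting unknown volume types early avoids asking for a dump that cannot exist.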

1.3. Restoring the volume dump to a volume with a new name

Using backup-manager:

backup-manager restore YYYY.MM.DD $VOLNAME.dump.gz.aescrypt $VOLNAME.restored

Manually:

cat /vicepa/hcoop-backups/restored/YYYY.MM.DD-$VOLNAME.dump.gz.aescrypt | \
  ccrypt -cdk /etc/backup-encryption-key | \
  gunzip | \
  vos restore deleuze /vicepa $VOLNAME.restored
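
A minimal sketch of the manual pipeline wrapped in a function that checks its inputs first. restore_dump is a hypothetical name, not an existing script; the directory, key file, server, and partition are the ones used in the manual steps above.

```shell
# Sketch only: restore_dump is a hypothetical wrapper.
BACKUP_DIR=${BACKUP_DIR:-/vicepa/hcoop-backups/restored}
KEY_FILE=${KEY_FILE:-/etc/backup-encryption-key}

# restore_dump YYYY.MM.DD VOLNAME
restore_dump() {
    dump="$BACKUP_DIR/$1-$2.dump.gz.aescrypt"
    if [ ! -r "$dump" ]; then
        echo "cannot read dump: $dump" >&2
        return 1
    fi
    if [ ! -r "$KEY_FILE" ]; then
        echo "cannot read key file: $KEY_FILE" >&2
        return 1
    fi
    # Decrypt, decompress, and feed the dump to vos restore on stdin.
    ccrypt -cdk "$KEY_FILE" < "$dump" | gunzip |
        vos restore deleuze /vicepa "$2.restored"
}
```

In a plain sh pipeline the exit status is that of the last command, so a failure in ccrypt or gunzip is only visible through the error output of vos restore.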

1.4. Mounting the newly restored volume onto the filesystem

fs mkm /afs/hcoop.net/.old/tmp-mount $VOLNAME.restored
vos release old

1.5. Restoring a particular file

# examine /afs/hcoop.net/.old/tmp-mount and copy the needed file back to its original location

1.6. Unmounting the restored volume

fs rm /afs/hcoop.net/.old/tmp-mount
vos release old

1.7. Renaming the restored volume so it takes the place of the damaged/corrupted/erased volume

Do this if you want to restore an entire volume. This deletes the old volume and replaces it with the backup.

vos remove -id $VOLNAME
vos rename $VOLNAME.restored $VOLNAME

1.8. Removing the restored volume

If you only wanted to restore a few files from the volume, you should remove the local copy of the backup volume when done.

vos remove -id $VOLNAME.restored
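
Sections 1.4 through 1.8 can be strung together when only one file is needed. The following is a sketch under assumptions: restore_file and run are hypothetical helpers, the cp step is an illustration of "restoring a particular file", and the script defaults to a dry run that only prints the commands.

```shell
# Sketch only: restore_file and run are hypothetical helpers.
DRY_RUN=${DRY_RUN:-1}    # 1 = print commands instead of running them
MOUNT=/afs/hcoop.net/.old/tmp-mount

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# restore_file VOLNAME PATH_IN_VOLUME DESTINATION
restore_file() {
    run fs mkm "$MOUNT" "$1.restored" &&
    run vos release old &&
    run cp -pR "$MOUNT/$2" "$3" &&
    run fs rm "$MOUNT" &&
    run vos release old &&
    run vos remove -id "$1.restored"
}
```

Chaining with && stops at the first failing step, so the restored volume is not removed if the copy did not succeed.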

2. Database Backups

cd /vicepa/hcoop-backups/restored
mkdir YYYY.MM.DD-db
cd YYYY.MM.DD-db
cat ../YYYY.MM.DD-databases.tar.gz.aescrypt | \
  ccrypt -cdk /etc/backup-encryption-key | \
  gunzip | \
  tar -xvf -

3. Implementation

3.1. Requirements

Thus, obnam. Things that might seem unobvious for anyone setting it up:

3.2. Configuration

There is an encrypted obnam repository at path on rsync.net. We are only using one repository for now, and only backing up afs.

The non-human user hcoopbackup has a gpg key and an ssh key that allow it to access the obnam repository and rsync.net, respectively. Admins should generate individual gpg keys for accessing the backup.

The script /afs/hcoop.net/common/etc/scripts/hcoop-obnam-backup is run on an openafs file server machine from cron.daily as root (it must be a file server, since we use -localauth when dumping). The backup node has a keytab /etc/keytabs/hcoopbackup that allows the backup script to become the backup user.

First the entire system is backed up using vos backupsys. Then all backup volumes are dumped to /backups/hcoop-backups/dumps. Then the backup user takes over, and does an incremental backup to the remote repository at rsync.net.
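
The three steps above might look roughly like this. It is not the real hcoop-obnam-backup script: nightly_backup and run are hypothetical, the volume list is passed in by hand, the dump file naming is an assumption, OBNAM_REPO is a placeholder for the repository location, and switching to the backup user before the obnam step is left out. It defaults to a dry run that only prints the commands.

```shell
# Sketch only: not the real hcoop-obnam-backup script.
DRY_RUN=${DRY_RUN:-1}    # 1 = print commands instead of running them
DUMP_DIR=/backups/hcoop-backups/dumps

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

# nightly_backup VOLNAME...
nightly_backup() {
    # 1. Create or refresh the backup clone of every volume.
    run vos backupsys -localauth || return 1
    # 2. Dump each backup volume to local disk (-time 0 asks for a full dump).
    for vol in "$@"; do
        run vos dump -id "$vol.backup" -time 0 \
            -file "$DUMP_DIR/$vol.dump" -localauth || return 1
    done
    # 3. Push the dump directory to the remote repository.
    #    (The real setup does this as the backup user.)
    run obnam backup --repository="$OBNAM_REPO" "$DUMP_DIR"
}
```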

On the first day of each week, make a full volume dump, and afterward record the time of the dump (or of the creation of the backup volume, whichever afs needs). On the following days of the week, make an incremental dump covering the changes from the weekly full dump to the current date. obnam can then be used to retrieve the dumps for any given date, and a restore needs at most two dump files: the weekly full dump and that day's incremental dump.
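
The weekly schedule can be sketched as two helpers that choose the value passed to vos dump -time. Both function names are hypothetical, and treating Monday as the first day of the week is an assumption.

```shell
# Sketch only: assumes the week starts on Monday (date +%u prints 1 for Monday).

# dump_kind DAY_OF_WEEK -> "full" on the first day of the week, else "incremental"
dump_kind() {
    if [ "$1" -eq 1 ]; then echo full; else echo incremental; fi
}

# dump_time_arg KIND FULL_DUMP_TIME -> value for vos dump -time
# FULL_DUMP_TIME is the recorded time of this week's full dump,
# in the "mm/dd/yyyy hh:MM" form that vos dump accepts.
dump_time_arg() {
    if [ "$1" = full ]; then echo 0; else echo "$2"; fi
}

# Example for today (last_full and out are set by the caller):
# kind=$(dump_kind "$(date +%u)")
# vos dump -id user.alice.backup -time "$(dump_time_arg "$kind" "$last_full")" \
#     -file "$out" -localauth
```

Keeping the decision in small functions makes it easy to check the schedule without touching any volume.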

3.3. Future Plans

To make life simpler, individual machines would be backed up to afs volumes ({recovery|backup}.$machine.hcoop.net?), probably as uncompressed tarballs, using an adaptation of the current system backup scripts that copies only files that cannot be restored automatically by setting the status file and running apt. Should each machine have a unique backup user? (What happens if you have a $user/$host.hcoop.net key: can it authenticate on all nodes via pam_krb5, or just on $host?)

Databases could be backed up similarly, just by rsyncing over the /srv/databases directory. However, the same issue as with previous backups could arise: the files may be copied in a bad file system state. If possible, we should do a proper database dump for each database separately. Perhaps we could "resurrect" $user.db, but instead of $user.dbbackup containing the last snapshot of the user's database dumps? Possibly not worth the effort; it might be better to keep them unexposed to the world at large (violation of database acls).

spamassassin should probably be writing its bayes database directly to afs anyway, so we can punt on that. There should not be any other local state.

In a world where every child gets a free pony for their tenth birthday, we could teach obnam about afs acls and just mount the $user.backup volumes (i.e. not double the local space requirements for volumes!) and back up from those without the intermediate dump. This would also allow users to control what data gets backed up via ACLs. However, reality bites.


CategorySystemAdministration CategoryNeedsWork

BackupInfo (last edited 2019-03-31 19:34:13 by ClintonEbadi)