These steps are listed in approximately the order in which they should be performed, after performing all of the "generic" steps in SetupNewMachines.
Update Existing Machines
Update AFSDB DNS Records
You'll want to add a new AFSDB record for the new server. Note that the numeric field in an AFSDB record must always be "1": it is a subtype, not a priority as in MX records! The order of the records determines their priority (unlike SRV records).
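As a rough sanity check for the rule above, the following sketch reads lines in the form printed by `dig +short AFSDB <cell>` (a subtype followed by a hostname) and complains if any subtype is not 1. The function name `check_afsdb` is made up for this page, and the sample record is illustrative only.

```shell
#!/bin/sh
# check_afsdb (hypothetical helper): read "SUBTYPE HOSTNAME" lines on stdin,
# as printed by `dig +short AFSDB <cell>`, and fail if any subtype is not 1
# or if there is no input at all.
check_afsdb() {
    awk '
        { seen = 1 }
        $1 != 1 { printf "unexpected subtype %s for %s\n", $1, $2; bad = 1 }
        END { exit (bad || !seen) }
    '
}

# Offline demo with an illustrative record:
printf '1 afs1.hcoop.net.\n' | check_afsdb && echo "AFSDB records look sane"

# Against live DNS (needs network access and dig):
#   dig +short AFSDB hcoop.net | check_afsdb
```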
Update CellServDB on AFS Servers
On all existing AFS servers, add the IP address for the new machine to /etc/openafs/server/CellServDB (this should be a symlink to /etc/openafs/CellServDB but not vice-versa). The format of this file is very strange, and often confuses people:
- A line starting with a ">" (greater-than sign) indicates the start of the declaration of the servers for a cell. The name of the cell comes after the greater-than. 
- All lines between that header line and the next line starting with a greater-than sign list the servers for that cell. Each of these lines consists of an IP address, one or more tabs, a hash mark, and the hostname of the server.
Here is an example:
>hcoop.net
1.1.1.1 #afs1.hcoop.net
2.2.2.2 #afs2.hcoop.net
>whitehouse.gov
0.0.0.0 #ovaloffice.whitehouse.gov
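To make the format concrete, here is a small sketch that lists the servers declared for one cell. The function name `list_cell_servers` is invented for this page; the demo runs against a throwaway copy of the example above rather than the real file.

```shell
#!/bin/sh
# list_cell_servers FILE CELL (hypothetical helper):
# print "IP HOSTNAME" for every server listed under CELL in FILE.
list_cell_servers() {
    awk -v cell="$2" '
        /^>/ { current = substr($1, 2); next }    # ">cellname" starts a new cell
        current == cell && NF >= 2 {
            host = $2
            sub(/^#/, "", host)                   # drop the leading hash mark
            print $1, host
        }
    ' "$1"
}

# Demo on a temporary file holding the example from this page.
demo=$(mktemp)
printf '>hcoop.net\n1.1.1.1\t#afs1.hcoop.net\n2.2.2.2\t#afs2.hcoop.net\n' > "$demo"
printf '>whitehouse.gov\n0.0.0.0\t#ovaloffice.whitehouse.gov\n' >> "$demo"
list_cell_servers "$demo" hcoop.net
rm -f "$demo"
```

Run as written, the demo prints the two hcoop.net entries, one per line, as `1.1.1.1 afs1.hcoop.net` and `2.2.2.2 afs2.hcoop.net`.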
Restart All AFS Servers
Now, restart each of the existing AFS servers, one at a time, so they reload their CellServDB files. To completely ensure continuity of service, always wait a full five minutes after restarting one server before restarting the next one (five minutes is the worst-case time needed for AFS peer servers to "recognize" each other and rejoin the cluster; in practice the time required is usually much, much shorter).
Unfortunately this really is necessary; see https://lists.openafs.org/pipermail/openafs-info/2008-April/029087.html.
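The one-at-a-time restart with a pause in between could be scripted roughly as follows. This is only a sketch: `rolling_restart` and `restart_one` are names made up here, and `restart_one` is a dry-run placeholder that just prints what it would do. What the real restart command should be (for example something along the lines of `bos restart -server "$host" -all`) is an assumption that needs checking against your setup.

```shell
#!/bin/sh
# restart_one HOST (hypothetical placeholder): dry run only.
# Replace the echo with the real restart command for your servers.
restart_one() {
    echo "would restart AFS on $1"
}

# rolling_restart WAIT_SECONDS HOST...: restart the hosts one at a time,
# sleeping WAIT_SECONDS between consecutive restarts (not after the last).
rolling_restart() {
    wait_secs=$1
    shift
    first=1
    for host in "$@"; do
        if [ "$first" -eq 0 ]; then
            sleep "$wait_secs"
        fi
        first=0
        restart_one "$host"
    done
}

# Dry run with no pause; use 300 for the full five minutes.
rolling_restart 0 afs1.hcoop.net afs2.hcoop.net
```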
Set Up New AFS Server
Ensure Hostname Resolves
Execute this command, and make sure it prints this machine's IP address. If it doesn't, the AFS server will fail cryptically and mysteriously.
dig +short `hostname`
Copy CellServDB, UserList, KeyFile, BosConfig, ThisCell
Copy the CellServDB, UserList, KeyFile, and BosConfig from an existing AFS server:
mkdir -p /etc/openafs/server/
scp deleuze.hcoop.net:/etc/openafs/server/UserList /etc/openafs/server/
scp deleuze.hcoop.net:/etc/openafs/server/KeyFile /etc/openafs/server/
chown root:wheel /etc/openafs/server/KeyFile
chmod o-r /etc/openafs/server/KeyFile
scp deleuze.hcoop.net:/etc/openafs/CellServDB /etc/openafs/CellServDB
scp deleuze.hcoop.net:/etc/openafs/BosConfig /etc/openafs/BosConfig
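Since the KeyFile is sensitive, it may be worth confirming afterwards that the `chmod o-r` took effect. A minimal sketch, with `not_world_readable` being a helper name invented for this page:

```shell
#!/bin/sh
# not_world_readable FILE (hypothetical helper): succeed only if FILE exists
# and "others" have no read permission on it. Returns 2 if FILE is missing.
not_world_readable() {
    [ -e "$1" ] || return 2
    # find prints the path only when the o+r bit (004) is set.
    [ -z "$(find "$1" -prune -perm -004)" ]
}

# Usage on the new server:
#   not_world_readable /etc/openafs/server/KeyFile || echo "KeyFile permissions need fixing"
```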
Relink CellServDB and ThisCell
The AFS client and server (which can both be installed on the same machine) keep their CellServDB files in different places, for historical reasons. We can simplify our setup by symlinking the server's copy to the client's (the reverse will not work due to restrictive permissions on /etc/openafs/server/):
mkdir -p /etc/openafs/server/
ln -sf /etc/openafs/CellServDB /etc/openafs/server/CellServDB
ln -sf /etc/openafs/ThisCell /etc/openafs/server/ThisCell
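To double-check that the links point the right way round (server directory pointing to the client files), something like the sketch below could be used. `check_link` is a made-up helper, and it assumes `readlink` is available, as it is on Debian.

```shell
#!/bin/sh
# check_link LINK TARGET (hypothetical helper): succeed only if LINK is a
# symlink whose stored target is exactly TARGET.
check_link() {
    [ -L "$1" ] && [ "$(readlink "$1")" = "$2" ]
}

# Usage on the new server:
#   check_link /etc/openafs/server/CellServDB /etc/openafs/CellServDB || echo "CellServDB link is wrong"
#   check_link /etc/openafs/server/ThisCell /etc/openafs/ThisCell || echo "ThisCell link is wrong"
```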
Create /vicepa
The AFS server will store its files in /vicepa. So, you should create that directory, ensuring it resides on whatever storage (RAID, etc.) you want to use for AFS backing. Furthermore, you must let AFS know that it is safe to use it:
touch /vicepa/AlwaysAttach
Install Debian Packages
dpkg -i /afs/hcoop.net/common/debian/openafs/1.4.6/openafs-{fileserver,dbserver}*.deb
Replicate Volumes
We want most of our readonly volumes to be replicated as widely as possible. So, for each readonly volume, you should:
vos addsite newserver.hcoop.net /vicepa volname
vos release volname
Currently, the minimum set of volumes you should replicate is:
common.bin
common.databases
common.logs
old
root.afs
root.cell
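Rather than typing the two commands for each volume by hand, a loop can generate them. The sketch below only prints the commands so they can be reviewed first. `replication_commands` is a name invented for this page, and newserver.hcoop.net is the same stand-in hostname used above.

```shell
#!/bin/sh
# replication_commands SERVER PARTITION VOLUME... (hypothetical helper):
# print the addsite/release command pair for each volume, without running them.
replication_commands() {
    server=$1
    partition=$2
    shift 2
    for vol in "$@"; do
        printf 'vos addsite %s %s %s\n' "$server" "$partition" "$vol"
        printf 'vos release %s\n' "$vol"
    done
}

replication_commands newserver.hcoop.net /vicepa \
    common.bin common.databases common.logs old root.afs root.cell
```

Once the printed commands look right, the output can be piped to `sh` on a machine with AFS admin tokens.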
Remove AFS server
Here's a list of tasks that were done when we were removing Krunk:
- Run vos listvol HOST to find existing volumes on the server. 
- Run vos remove -server HOST -id NAME|ID for each of them (note: this really deletes data! That is fine for replicated volumes whose read/write copy is kept elsewhere)
- Run vos changeaddr -oldaddr HOST_IP -remove 
- Edit /etc/openafs/CellServDB on all machines to remove mention of HOST 
- Run bos shutdown HOST (krunk, in our case)
- Edit /afs/hcoop.net/common/etc/scripts/hcoop-kprop and remove mention of HOST, apply with DOMTOOL_USER=hcoop domtool hcoop.net 
- If the cell is published with grand.central.org, mail cellservdb@central.org and tell them the new CellServDB configuration 
To Do
The information in CellServDB needs to stay in sync with the AFSDB DNS entries -- they both contain essentially exactly the same data in different formats. Unfortunately AFS can't be modified to "do away with" the CellServDB file because the AFS fileservers are supposed to be able to operate correctly even when DNS is down (clients are another story). So, it would be nice to have some way of generating the CellServDB from the AFSDB records periodically.
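One possible shape for such a generator is sketched below. It is untested against the real cell and rests on several assumptions: `dig` is installed, AFSDB lookups print "subtype hostname" lines, and each server has an A record. The function names `afsdb_hosts`, `resolve_a` and `gen_cellservdb` are made up for this page. The output follows the CellServDB format described earlier (IP address, tab, hash mark, hostname).

```shell
#!/bin/sh
# afsdb_hosts CELL (hypothetical): print the hostnames from the cell's
# AFSDB records that have subtype 1.
afsdb_hosts() {
    dig +short AFSDB "$1" </dev/null | awk '$1 == 1 { print $2 }'
}

# resolve_a HOST (hypothetical): print the first IPv4 address for HOST.
resolve_a() {
    dig +short A "$1" </dev/null | grep -E '^[0-9]+(\.[0-9]+){3}$' | head -n 1
}

# gen_cellservdb CELL (hypothetical): print a CellServDB stanza for CELL,
# in whatever order the DNS answer lists the records.
gen_cellservdb() {
    cell=$1
    printf '>%s\n' "$cell"
    afsdb_hosts "$cell" | while read -r host; do
        host=${host%.}                      # drop the trailing dot of the FQDN
        ip=$(resolve_a "$host")
        if [ -n "$ip" ]; then
            printf '%s\t#%s\n' "$ip" "$host"
        fi
    done
}

# Usage (needs network access):
#   gen_cellservdb hcoop.net > /tmp/CellServDB.new
```

Writing to a scratch file first and comparing it with the live CellServDB before replacing anything seems the safer way to use this, since a DNS outage would otherwise produce an empty stanza.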
CategorySystemAdministration CategoryNeedsWork CategoryOutdated
