
Diff for "FritzVirtualization"

Differences between revisions 47 and 48
Revision 47 as of 2012-12-09 07:22:19
Size: 18726
Editor: ClintonEbadi
Comment: time to start moving content to new pages
Revision 48 as of 2012-12-09 07:51:31
Size: 11490
Editor: ClintonEbadi
Comment: prune completed tasks from task list
Line 9: Line 9:
 * (./) Set up network bridge
 * (./) Create test KVM to discover preseed values and other config bits
 * (./) Preseed Debian Install
   * (./) Generate basic preseed file where login + `kinit && aklog` work
   * (./) Custom partman config
     * `/`, `/boot`, `/tmp`, and `/var/cache/openafs/`
   * (./) Preseed that installs `libnss-afs` and `hcoop-nsswitch-config`
   * (./) Add hcoop archive to installed `sources.list`
 * (./) Create local Debian archive
   * {o} gnupg keyring etc. for verified package builds
   * {o} Archive key for secure apt installs
 * (./) Package `nsswitch.conf` changes and generate preseed for a machine that recognizes pts users (ssh $hcoop-user@machine should work at this point)
   * (./) Update DebianPackaging with information on creating config packages and using `dput` to push them to the archive
 * (./) Update FirewallRules `closed.conf` for the modern age
   * (./) Add hostname field to `fwtool` firewall config (so that users / services can have different ports on different machines)
   * (./) Codify universal afs / kerberos / etc. ports that always have to be open in firewall config (can probably mostly yank this info from fritz)
   * (./) Create `hcoop-firewall-config` package with default restrictive firewall for all nodes
     * (./) Create empty system rule files as conffiles
     * {X} Create per-machine system services conffiles package (alternative: hcoop-$service-config includes ferm.d rules)
 * (./) Package essential configuration
   * (./) sudoers + login.restrict
     * Change: list all admins explicitly; adding admins to `wheel` on each machine is prone to decoherence. Less than ideal having to update two packages with per-node admin access info, but better than the situation before, and it can be improved.
   * (./) pam files to check login.restrict via pam_listfile on admin-only nodes
     * --(I think there is a pam config framework that we should use for this)-- (it doesn't help us for restricting logins but not auth in general)
   * (./) sshd (GSSAPI support)
     * {o} restart sshd after install (punting for now, not really ''necessary'' for preseeding)
       * Actually easy: just set up a trivial postinst/prerm à la `hcoop-apache2-config`
   * (./) krb5.conf (admin_server)
     * Fun fact: the SRV record based way of locating the admin server does not actually work (even in kerberos 1.10) so you still have to distribute a custom krb5.conf (listing the admin server in the `admin_server` relation of the realm's entry under `[realms]`). Feh!
     * (./) Alias `kerberos-adm.hcoop.net` to whichever machine is the master KDC (hack until MitKerberos can use SRV records)
     * (./) Set `domain_realm`. Default mapping logic seems like it would work, but it doesn't, feh!
 * (./) Essential custom Debian packages
   * (./) Move `libnss-afs` sources from `~clinton_admin` to repo source directory
   * (./) Check packaging of `mod_waklog` and include in repository
 * (./) Install domtool (automate as much as possible)
   * (./) Backport `mlton-tools` from Wheezy
     * Through some wizardry `ml-nlffigen` was in lenny, didn't make it into squeeze, but got back into testing afterward?
   * (./) Update `domtool` init scripts to work with `insserv` since non-dependency-based init is deprecated and will be removed in `wheezy`
   * (./) Synchronize keytabs (just updating the existing script to sync to kvm is fine for the next year...)
   * (./) Fix aklog failure in `domtool-admin-sudo` so it actually works
     * Turns out pagsh in pagsh doesn't create a new pag, but just runs in the current pag. You only get a new pag if you don't have one. k5start performs the needed incantations internally to take care of the job luckily.
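The krb5.conf items above (explicit `admin_server`, explicit `domain_realm`) could look roughly like the fragment below. This is a sketch only: the realm name `HCOOP.NET` is an assumption, and only the `kerberos-adm.hcoop.net` alias comes from the task list. In MIT krb5, `admin_server` is set per realm under `[realms]`.

```
# /etc/krb5.conf (fragment, sketch)
# Assumption: the realm is named HCOOP.NET.
[realms]
    HCOOP.NET = {
        # alias pointing at whichever machine is the master KDC
        admin_server = kerberos-adm.hcoop.net
    }

[domain_realm]
    # explicit host-to-realm mapping, since the default logic did not work
    .hcoop.net = HCOOP.NET
    hcoop.net = HCOOP.NET
```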
Line 51: Line 11:
   * (./) Package the result
     * (./) Remember `apache-sync-logs` cron job
     * {i} Not enabling `mod_cache` -- do we need to enable `mod_disk_cache` (and related cron job to keep cache pruned?)
Line 56: Line 13:
   * {X} Add new `phpVersion 53` to DomTool and (hopefully this can be done) make `phpVersion` support checking if the host supports that version (easy check: if the node is mire, support 4/5, if the node is fritz only support 5.3)
     * Can't be done -- we can only differentiate between 4 and 5 :-\
Line 60: Line 15:
   * {X} Metapackage for php, basic cgi libraries, etc. (enough to bring up HCoop services)
     * OK if this is an ugly package
     * Punting to post install script for now
Line 64: Line 16:
   * (./) Build local fork of `suphp` (dang maintainer builds it in an insecure mode)
   * (./) Craft firewall rules
     * (./) Need to add services.d/* reading to firewall config
     * (./) Allow to listen on 80/443 (naturally)
       * {X} (--(If Possible,)-- iptables --(may not)-- doesn't support this) Restrict listening on those ports to `www-data` user
     * (./) Allow proxying to some port range on mire/bog (30000-... seems semi-reasonable -- basically give the upper half of port space to members)
       * Might want to do initial ad-hoc allocations near the upper end of the range (top 1024/2048) to avoid having a cluttered base (or: implement something to easily manage blocks of ports so that members can add a reasonable number of ports in the future and not have their addresses jumping all over the place... might benefit firewall filtering too)
       * {i} Opened to all ports > 1024 on mire for the transitional period.
 * (./) Name machines (HostnameSuggestions)
   * (./) User server: `bog`
   * (./) Web server: `navajos` (decided to use `mccarthy` for future admin node)
 * (./) Integrate into infrastructure
   * (./) Sync keytabs
   * (./) DomTool configuration
     * (./) Add as slave
     * (./) Add as user webnode
     * (./) Generate node certificate
   * (./) Route outgoing mail through deleuze
   * (./) Add to portal for package requests etc.
   * (./) Finalize network bridge name: `br0:navajos`. But we need to see how bridges work when you have multiple kvm instances.
 * (./) Automate post-install
   * (./) Extract host keytab
   * (./) Deploy domtool slave
   * (./) Create domtool node directories
     * Not sure of the best way to do this ... only nodes that run particular services ought to have the required directories and it seems better handled during domtool installation. For now we can live with a navajos-specific postinst (in addition to a generic one)
   * (./) Install packages needed for web server
 * (./) Deal with Kerberos principals and weak crypto
   * Issue: most of our daemon and user principals were generated with older kerberos and use weak crypto. The krb5 on navajos is set to reject these ciphers so... Either we:
     * {X} Enable `allow_weak_crypto` in krb5.conf -- bad! We are just delaying the inevitable with this, and likely will never be able to turn it off. On the other hand, if all other paths prove untenable :(
     * (./) Regenerate all principals (or at least /daemon). I think the only side effect of regenerating the /daemon principals would be that processes running under k5start would cease to work, but that's not too bad of a cost with an announced time. Regenerating user principals IIRC invalidates their password... kind of a pesky problem, but one that can probably be punted to bringing up bog (we need to force a password reset anyway, our policy sucks). More research is needed.
   * {i} It also turns out that we had to use an undocumented krb5 API to enable weak crypto when acquiring tokens in mod_waklog. That was fun hunting... hooray for openafs still using DES.
Line 97: Line 18:
     * Remember!: Checkpoint the machine after running the preseed since that is known to work rather than reinstalling repeatedly...
Line 107: Line 27:
   * {o} Clone wiki and upgrade to moin 1.9 on navajos
     * Move wiki into afs
Line 114: Line 32:
     * {o} Set up postgresql 9.1
     * {o} Install postgres/mysql client libraries
Line 125: Line 45:
==== Other Tasks ==== Other tasks, lower priority:
Line 127: Line 47:
These need to be done, but aren't going to kill anyone if they go undone until after the new machine is up. A lot of them were surfaced through the setup process, but we don't have a year to right every wrong...

 * {o} Package configurations
   * {o} debarchiver
 * {o} Fix local user vs ptdb user for system services UID mismatch
 * {o} Repackage `libnss-afs` for the current Debian standards version
   * Through some magic the package builds fine for the time being
 * {o} Clean up local Debian archive
   * {o} gnupg keyring etc. for verified package builds
   * {o} Archive key for secure apt installs
   * {o} Package debarchiver config
     * OR: switch to another archiver, and then package that config. I hear debarchiver isn't recommended.
   * {o} Create an afs group that can write to the `incoming` directory of the archive
   * {o} Debug debarchiver cron job (it isn't working, feh!)
 * {o} Configuration package nits
   * {o} ssh: restart sshd after installation
     * Actually easy: just set up a trivial postinst/prerm à la `hcoop-apache2-config`
   * {o} firewall: restart ferm after installation
Line 136: Line 60:

{{{#!wiki note
This, and other information, should be merged into a general description of our infrastructure and how it differs from a stock Debian installation.
}}}
Line 146: Line 74:

{{{#!wiki note
This, and other things, should be merged into an "Undecided Infrastructure Issues" document, so that folks don't make the mistake of thinking that "the path of least resistance" is how we wanted to do things.
}}}
Line 198: Line 131:
{{{#!wiki note
Move to DebianMirror
}}}
Line 199: Line 136:

== TODO ==

 * {o} Create an afs group that can write to the `incoming` directory of the archive
 * {o} See why `libnss-afs` can no longer build with openafs 1.6 (not an issue at the moment, but it will be sooner than we expect...)
 * {o} Debug debarchiver cron job (it isn't working, feh!)
Line 244: Line 175:
{{{#!wiki note
Move to a page describing infrastructure decisions
}}}

Initial scratch notes on getting kvm working on fritz. This will need to be integrated into SetupNewMachines and AdminArea after everything is working.

See http://wiki.hcoop.net/Migration2009/SoftwareSetup for the gist of what ClintonEbadi is trying to do here, but s/OpenVZ/KVM via libvirt/g.

1. Tasks

(./) = done, {o} = not done, <!> = possibly done, awaiting verification, {X} = gave up or died trying

  • <!> Apply advanced wine making techniques to carefully blend the Apache configurations on fritz and mire

    • {o} Per machine NameVirtualHost config

      • It turns out that both the NameVirtualHost and VirtualHost directive must use * or an explicit IP. For the sake of correctness, keeping the IP in VirtualHost directives seems like a Good Idea (tm), so we need to have domtool install /etc/apache2/conf.d/hcoop-namevhost-$machine for every web serving node.

    • {o} Change defaultPhpVersion to 5

      • {o} Check all user domtool configs and explicitly set phpVersion = 4 if needed

    • <!> Domtool mod_proxy support to machines other than localhost

  • <!> Spin up the fancy new Apache KVM and pray that it works

    • (./) Reinstall using preseed + postinst

    • {o} Make openafs start earlier in boot process

      • Things like apache need to resolve pts users; it's easier to divert/transform the openafs script than to divert every script that needs access to afs uids.
    • {o} Migrate and upgrade hcoop wiki

      • Ideally into afs space, owned by the hcoop account.
    • {o} Move gitweb and git hosting over

    • {o} Set up rcube

    • {o} Install squirrelmail at webmail.hcoop.net temporarily (mod_proxy from deleuze later)

    • {o} Turn off fritz's Apache (it's the KVM host and KDC ... change of plans, eh)

      • {o} Deal with http://debian.hcoop.net (if we move it to navajos, we can't reinstall navajos... but we probably should move it to navajos)

    • (./) Move unknownlamer.org onto navajos (what better a guinea pig)

    • {o} Assist jbms with moving the conkeror wiki

      • It's pretty high traffic and will benefit the most from having access to more power / relieves the immediate load pressure on mire
    • {o} Migrate Portal

      • Move all data storage into afs!
      • {o} Set up postgresql 9.1

      • {o} Install postgres/mysql client libraries

    • {X} Point hcoop.net at the new machine (also a huge reconfiguration PITA)

    • {o} Start assisting the first brave users with "moving" to new machine (i.e. webAt "newNode", or adding an env var to Easy_Domain to change the default web node for everything)

      • "Volunteers": SteveKillen, BtTempleton

      • After sure of everything working, inspect all user DomTool configs and make the needed changes for the users to switch their hosting to new node (in trivial cases e.g. mod_proxy to app on mire, static file serving)

  • {o} Using lessons from above tasks, spin up new user shell machine

  • {o} Harass any users who refuse to leave mire

    • {o} Remove php4 support from domtool

  • {o} Turn mire off, remove from rack, set on fire

    • {o} Move secondary DNS to hopper, update ns aliases

    • {o} Move phpmyadmin hosting to navajos
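
For the per-machine NameVirtualHost item in the list above, the file domtool would install on each web node might look like the following Apache 2.2 fragment. It is a sketch under assumptions: the address `192.0.2.10` is a documentation placeholder, `example.hcoop.net` is a hypothetical vhost, and only the `hcoop-namevhost-$machine` file name pattern comes from the notes.

```
# /etc/apache2/conf.d/hcoop-namevhost-navajos (sketch)
# Placeholder IP; the real file would carry the node's own address.
NameVirtualHost 192.0.2.10:80
NameVirtualHost 192.0.2.10:443

# A matching vhost must use the same explicit IP (or *), for example:
<VirtualHost 192.0.2.10:80>
    ServerName example.hcoop.net
</VirtualHost>
```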

Other tasks, lower priority:

  • {o} Clean up local Debian archive

    • {o} gnupg keyring etc. for verified package builds

    • {o} Archive key for secure apt installs

    • {o} Package debarchiver config

      • OR: switch to another archiver, and then package that config. I hear debarchiver isn't recommended.
    • {o} Create an afs group that can write to the incoming directory of the archive

    • {o} Debug debarchiver cron job (it isn't working, feh!)

  • {o} Configuration package nits

    • {o} ssh: restart sshd after installation

      • Actually easy: just set up a trivial postinst/prerm à la hcoop-apache2-config

    • {o} firewall: restart ferm after installation
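
The "restart after installation" nits above could be handled by a postinst along these lines. This is a minimal sketch, not the contents of `hcoop-apache2-config`: the function name `restart_on_configure` and the overridable `INVOKE` variable are hypothetical, and the service name `ssh` is assumed to be the Debian init script name.

```shell
#!/bin/sh
# Sketch of a postinst for a config package such as a hypothetical hcoop-sshd-config.
set -e

# Command used to talk to init scripts; overridable so the script can be dry-run.
INVOKE="${INVOKE:-invoke-rc.d}"

# Restart the given service only when dpkg calls us with "configure".
restart_on_configure() {
    action="$1"
    service="$2"
    case "$action" in
        configure)
            $INVOKE "$service" restart
            ;;
        *)
            # abort-upgrade, abort-remove, ...: nothing to do
            :
            ;;
    esac
}

# dpkg passes the action as the first argument.
restart_on_configure "${1:-}" ssh
```

The same function would cover the firewall case by passing `ferm` instead of `ssh`.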

2. Packages Config

This, and other information, should be merged into a general description of our infrastructure and how it differs from a stock Debian installation.

Things not mentioned on SetupNewMachines that had to have their default debconf values changed.

  • ssmtp

    • forward all mail for UID < 1000 to logs

    • Masquerade as hcoop.net

  • PAM
    • Newfangled pam-config framework for a fresh squeeze install looks quite promising... (enabled kerberos + unix + afs session)
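
The ssmtp settings listed above correspond roughly to the fragment below. It is a sketch: `logs` as the recipient and the fully qualified name of deleuze as the mail hub are assumptions drawn from the notes, not a copy of the deployed file.

```
# /etc/ssmtp/ssmtp.conf (sketch)
# Recipient of all mail for UIDs below 1000
root=logs
# Assumption: outgoing mail is relayed through deleuze
mailhub=deleuze.hcoop.net
# Masquerade as hcoop.net
rewriteDomain=hcoop.net
```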

3. Major Open issues

This, and other things, should be merged into an "Undecided Infrastructure Issues" document, so that folks don't make the mistake of thinking that "the path of least resistance" is how we wanted to do things.

  • Exim setup (have to add to forwardable domains on deleuze)
  • Figuring out what to do wrt local users for system services that need to access afs
    • e.g. Apache, Exim, debarchiver, domtool, imapd, spamd, ...
    • AFAICT, it makes more sense to just have afs users -- if the ptdb is gone, the services will not operate in a correct way anyway
    • Removes issues with keeping UIDs in sync
    • How does this interact with Debian automatically creating system users for packages?
    • A few system users were created using create-user -- mail is routed to them and they are subscribed to mailing lists and whatnot which is ... probably bad. i.e. We probably want to split create-user into the portions to just create an afs/kerberos user and then to do the fancy stuff an actual factual human user needs.

  • Integration with package requests
    • Preseeding means we can kill/respin web node images with ease -- but not restore packages users have requested to support their cgi programs
    • If the portal stores this info, we need a package to reinstall user packages
  • Is the keytab situation messy? Having the domtool keytab at the toplevel seems out of place
    • Reading old -sysadmin posts revealed that only having $user.daemon keys was supposed to be temporary -- we really should have a few standard principals for hcoop services to access data (e.g. $user.mail for mail delivery), and have a portal interface (and domtool integration) allowing users to request additional principals if they want (e.g. $user.$webapp-they-run ... or just $user.cgi in the default Just Works (tm) configuration)

3.1. fwtool

Making FirewallRules support all of the needed functionality for a user machine is proving difficult:

  • Want to store per-node firewall config for system services (apache, exim, imap, etc.)
    • Ideally, also store common port config (afs, kerberos, domtool, etc.)
  • Need to easily grant certain users additional permissions (e.g. all admins should be allowed outgoing ssh, normal users should not) on multiple nodes
  • Don't want to tie configuration to physical nodes (e.g. moving to a new shell server)
  • Restrict users to having rules on certain nodes (statically enforced)

Conclusion: the current fwtool implementation would require duplicating a lot of functionality already present in the support machinery for the domtool domain type. A new syntax for user rule files would need to be created (or tons of hackish supporting code) so ...

The only (in)sane way forward is to create a domtool node type and firewall plugin to manage rules. This has distinct advantages:

  • Takes advantage of current domtool infrastructure for pushing configs
  • The domtool language is quite nice and has the needed functionality for abstracting groups of users, common config, etc.
  • Lays the groundwork for using domtool to perform node management in addition to domain management

And a few distinct disadvantages:

  • Need to solve the uid mismatch between nodes for system users problem beforehand
    • ClintonEbadi says: openafs pts shall be the only source for users (except those needed before afs can come up)

  • Pretty hefty time/code investment
    • ClintonEbadi's SML-fu and type theory are weak (but knowledge is power™)

    • The portal will need updating (but its interface to firewall rules sucks anyway)

Interim solution:

Getting a user shell machine online is slightly less important than shifting cgi hosting off of mire (load average is usually high, software is outdated). Users can live for another month with logging into an etch system while running their php and whatnot on a new machine... Therefore:

  • Common ferm config distributed using Debian package
    • conffiles for local system ports created
  • Local ports config distributed using another Debian package (divert of site-wide conffiles)

This will force codification of the open ports for the web server machine, and will be easy to undo when domtool support is in place. A slightly hacked together FirewallRules may need to be used for the user node (time, what is time?) -- but a restrictive firewall must be used (it's impossible to implement one on a box that didn't have one before without breaking things). Unfortunately, without SELinux, we can't restrict what ports members listen on, so input firewall rules will be less useful than they could be for now.
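
The split between a common restrictive config and per-node service rules might be laid out as below. This is a sketch of the idea only: the file names under `services.d/` and the exact rules are assumptions, not the contents of `hcoop-firewall-config`.

```
# /etc/ferm/ferm.conf (sketch, shipped by the common package)
table filter {
    chain INPUT {
        policy DROP;

        # connection tracking
        mod state state INVALID DROP;
        mod state state (ESTABLISHED RELATED) ACCEPT;

        # local traffic
        interface lo ACCEPT;
    }
}

# per-node service rules; shipped as empty conffiles by default
@include 'services.d/';

# /etc/ferm/services.d/apache.conf (sketch, web node only)
table filter chain INPUT proto tcp dport (http https) ACCEPT;
```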

4. Debian Mirror

Move to DebianMirror

See DebianPackaging (look ma, I kept the docs up to date)

5. Config

As part of standardizing the config ... these should be put into hcoop-debarchiver-config and hcoop-dput-config

/etc/debarchiver.conf: see hopper, too long to include

/etc/cron.d/debarchiver: Unfortunately not quite working -- for some reason this has to be run twice before Packages is updated (this happens with my local debarchiver too, so I ... have no idea)

#
# Regular cron jobs for the debarchiver package
#
# Run the archiver every five minutes.
*/5 * * * *     debian-archive  test -x /usr/bin/debarchiver && k5start -f /etc/keytabs/user.daemon/debian-archive -t -U -- debarchiver --autoscanall --addoverride | logger -t debarchiver -p daemon.info

6. Debian Based Package Config

Most info updated at DebianPackaging

Packages needing customization on all machines:

Packages that need customization if installed:

  • whatever imapd we use on the new machines
  • exim
  • ejabberd
  • apache

Ideas:

  • virtual packages hcoop-user-node-config and hcoop-services-node-config that conflict with each other and depend on the appropriate basic config settings (e.g. for setting up login.restrict, default ulimits, etc.) This is trivial using equivs.

  • If we want to use runit for services, we might include the service files and init.d overrides

  • What copyright policy should we take on conffiles (are they copyrightable? ... at least disclaim copyright? Does basing them on debian base files mean they have to take the license of the package?)
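
The first idea above could be expressed as an equivs control file roughly like this one. It is a sketch: the version, maintainer address, and dependency list are assumptions, and only the package names already mentioned in these notes are used.

```
# hcoop-user-node-config.control (sketch; build with equivs-build)
Section: misc
Priority: optional
Standards-Version: 3.9.2

Package: hcoop-user-node-config
Version: 0.1
# Hypothetical maintainer address
Maintainer: HCoop Admins <admins@example.org>
# Assumption: the user node pulls in the common config packages
Depends: hcoop-firewall-config, hcoop-nsswitch-config
Conflicts: hcoop-services-node-config
Description: HCoop configuration for user shell nodes
 Depends on the configuration packages every user node needs and
 conflicts with the services-node variant.
```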

7. Installer Preseeding

Move to a page describing infrastructure decisions

http://wiki.debian.org/DebianInstaller/Preseed

http://git.hcoop.net/?p=hcoop/machine-template.git;a=summary

Pretty useful, need to document more.

Installer command line: auto url=http://hcoop.net/~clinton_admin/preseed-test-0.cfg

Proof this is worth it (enter network info -> hot damn any afs user can login to the kvm)

http://unknownlamer.org/tmp/proof.png
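
For reference, the parts of the preseed that pull in the local archive and the AFS login packages might look like the lines below. This is a sketch, not an excerpt from machine-template.git: the suite and component of the archive line are assumptions, and the unauthenticated setting is only plausible while the archive key is still on the todo list.

```
# preseed fragment (sketch)
# Assumption: archive layout "squeeze main" on debian.hcoop.net
d-i apt-setup/local0/repository string http://debian.hcoop.net/ squeeze main
d-i apt-setup/local0/comment string HCoop local archive
d-i apt-setup/local0/source boolean false

# Only while the archive is unsigned; drop once an archive key exists
d-i debian-installer/allow_unauthenticated boolean true

# Packages that let pts users log in after install
d-i pkgsel/include string libnss-afs hcoop-nsswitch-config
```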


CategorySystemAdministration CategoryWorkInProgress

FritzVirtualization (last edited 2013-01-28 07:21:09 by ClintonEbadi)