Diff for "ClintonEbadi"

Differences between revisions 83 and 84
Revision 83 as of 2019-02-10 21:07:02
Size: 11639
Editor: ClintonEbadi
Comment: did a couple of small things
Revision 84 as of 2019-02-10 21:43:56
Size: 14762
Editor: ClintonEbadi
Comment: scratching down some thoughts on managing member data stored outside of afs
I am Clinton Ebadi. I am the Treasurer of the coop (someone has to do it), and the current lead sysadmin / DomTool maintainer / lo-fi AdamChlipala replacement.

1. Board Statements

See /BoardStatements

2. General Coop Goals

Unstructured musing on when/what I think the coop ought to be.

  • Summer 2018: Shiny new servers (achieved! in Winter 2018)

  • Spring 2019: Ponies for everyone.

3. Contact

  • <clinton at unknownlamer dot org> (Email)

  • <clinton at hcoop dot net> (XMPP)

  • unknown_lamer on freenode (IRC) in #hcoop

  • +1 443 538 8058 (Phone, SMS preferred as I will ignore your call if I don't have your number already and check voicemail approximately once per century)

4. Websites

5. Immediate Tasks

  • Letsencrypt integration into domtool
  • Managing data when members depart better
  • Useful, member accessible backups

5.1. January/February 2019 High Priority Tasks

These must get done this month

  • (./) File e-postcard with IRS

  • File change of directors form with PA
  • Fix portal list subscriptions
  • Get on-site database backups working again so we can get rid of mysql-fixperms
    • Need regular backups in case members drop their entire db by accident
  • Implement SetEnvIf[NoCase] in domtool

    • Nextcloud needs this
    • Done, but not 100% on current syntax, waiting a few days before announcing formally

6. Admin Stuff

6.1. Improve SSL Experience

  • We are not managing certificates well
    • we don't check and warn members when their certs are going to expire
    • requesting certs via the filesystem is clunky for most users, we should support upload via the portal instead
  • Need to add letsencrypt support

6.1.1. Domtool

Replace use_cert with a cert function that takes just the final file name instead of the full pathname; providing the full pathname is kind of clunky.

extern type your_cert;
extern val cert : your_cert -> ssl
(* SSL = cert "mydomain.pem"; *)

We also need to support letsencrypt, perhaps like so:

extern type your_letsencrypt_cert;
extern val letsencrypt : your_letsencrypt_cert -> ssl
(* SSL = letsencrypt "my.domain"; *)

Which would find the certs in the standard location used by certbot... or we could use symlinks there from our location.

6.1.2. Portal

We should allow submission of certs / keys through the web interface. Can use an insert-only afs dir to securely allow hcoop.daemon to write certs without being able to access them, which would then be installed manually by an admin using ca-install.

The portal request page should display all certs a user is permitted to use already, their common name, and their expiration date.

6.1.3. Managing Certificates

Needs Updating: this is not very fleshed out, and it does not consider how we're going to manage letsencrypt certs.

Since we no longer need to support explicit intermediate certs (everything nowadays accepts the chain in the main certificate file), we can just use domtool's cert permission to track things.

A cron should check for things like:

  • Certs that are not owned by any member in domtool
  • Permissions to certs that don't exist
  • Expired or soon to expire certs (should exponentially back off notice until expiration date changes, like quotacheck does for low space), emailing the member as well as admins@
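The checks above could be sketched roughly as follows. This is only an illustration, not existing hcoop code: the function names, the 30 day warning window, and the doubling notice schedule (days 0, 1, 2, 4, 8, ... after the first warning) are all assumptions, and the real inputs would have to come from the cert directory and domtool's permission database.

```python
from datetime import date, timedelta

# Assumed warning window; the page does not say how early to warn.
WARN_WINDOW = timedelta(days=30)


def audit_permissions(certs_on_disk, perms):
    """Compare cert files against cert permissions.

    certs_on_disk: iterable of cert file names found on disk.
    perms: dict mapping member name -> iterable of cert names they may use.
    """
    on_disk = set(certs_on_disk)
    granted = set()
    for names in perms.values():
        granted |= set(names)
    return {
        "unowned": sorted(on_disk - granted),   # certs no member owns
        "dangling": sorted(granted - on_disk),  # perms to missing certs
    }


def expiring(not_after, today):
    """True for certs that are expired or expire within WARN_WINDOW."""
    return not_after - today <= WARN_WINDOW


def notice_due(first_warned, notices_sent, today):
    """Doubling back-off: notice k (k >= 1) is due 2**(k-1) days after
    the first warning. Reset the state when the expiration date changes."""
    if notices_sent == 0:
        return True
    return (today - first_warned).days >= 2 ** (notices_sent - 1)
```

The cron job would store first_warned and notices_sent per cert, keyed on the cert's expiration date, so that installing a renewed cert resets the schedule.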

Certificate CN and validity dates should be shown on the portal ssl page; ssl check cron should cache this somewhere the portal can read it.

destroy-user needs to nuke certs for leaving members ... we need to overhaul this generally and stash all member data in one location when destroying for later removal / restoration (if they return in the 30 day deletion window).

6.1.4. Managing Member Data Better

Trying to hash out how we can better manage member data when members depart. Right now we rely on someone (me...) to go in 30 days after removing a member and remove all of their non-afs data manually, which is error prone and needs to be managed better if we want to comply with things like GDPR. I don't think it applies to us directly as a US corporation, but it would be nice to comply anyway: we really don't want to be retaining data longer than needed, and the US will likely get stringent privacy laws aimed at facebook/google that will punish everyone all the same for failing to remove data immediately.

  • Databases (/srv/databases/u/us/user/*)
  • SSL certs /etc/apache2/ssl/user/$domain
    • Maybe add username to the path; we are managing these with domtool perms so it's easy enough to find which certs belonged to a member... until we remove all of their permissions when removing their account.
  • Portal database data
    • Payment records / archived member app: cannot be deleted, we are legally required to have open books for our members
    • URL and location database entries: these should definitely be purged after the retention period
  • Firewall rules
    • Currently rules are stored in one file, would be easier to manage if we split each member's rules into a separate file
  • GNU Mailman lists
    • We don't track who owns which list outside of the list control request in the portal
  • ejabberd rosters (assuming ejabberd keeps these forever)
  • Incidental data in the roundcube webmail database (address book, preferences)
    • Including data for any vmail users associated with the member (additional wrinkle there: should we be removing this data as soon as a member deletes a vmail user?)

Not personal data, but things we should clear for housekeeping in general:

  • Apache davlockdb directory

We already manage mail and $HOME data fine, both are easy to clear (delete the volumes, done).

One aspect of the solution I've been thinking about would also make it easier for members to export their data in general: we could set up a backups.$user volume, and store dumps of the data we hold on behalf of the member there: database backups, mailing list archives, an exported dump of their ejabberd roster, maybe even copies of their ssl certs.

We also need a registry of data we keep on behalf of members that can't immediately be identified based on their username: at least ssl certs and mailing lists (those might be it though... and certs are already tracked with domtool perms so maybe just adding a "list" perm would be adequate, which could then be used to maybe allow members to perform some list management through a domtool program). And we need to store things like domtool permissions when destroying (might be able to achieve some of this by first freezing members before removing them), so that we can use them later when performing the final purge of data after the retention period ends (important to keep a brief retention period for those "oh right, I have to actually pay dues" moments...).
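A rough sketch of what such a registry might record at destroy time, and how the final purge could find its work. Everything here is hypothetical (the record layout, the function names, and the resource kind labels); only the 30 day retention period and the idea of snapshotting cert and list permissions come from the notes above.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

RETENTION = timedelta(days=30)  # the 30 day deletion window


@dataclass
class DepartureRecord:
    """What we must remember about a removed member until the purge."""
    member: str
    removed_on: date
    # (kind, identifier) pairs for data not findable by username alone
    resources: list = field(default_factory=list)


def snapshot(member, removed_on, cert_perms, list_perms):
    """Capture permissions before destroy-user wipes them."""
    resources = [("ssl-cert", name) for name in sorted(cert_perms)]
    resources += [("mailman-list", name) for name in sorted(list_perms)]
    return DepartureRecord(member, removed_on, resources)


def due_for_purge(records, today):
    """Records whose retention period has ended as of today."""
    return [r for r in records if today - r.removed_on >= RETENTION]
```

A member who returns inside the window would simply have their record dropped and their permissions restored from it, rather than being purged.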

6.2. etc.

  • default DirectoryIndex does not include index.shtml, should this be changed?

  • vhostDefault makes configuring the default vhost slightly unpleasant. Extend host with host_default token and eliminate?

6.3. Website

(create Website bugzilla product and move these there)

  • Convert hcoop.net into domtool config (looks trivial, a few rewrites... except for userdir support?)
    • On the topic of user dirs: allow members to register a redirect for hcoop.net/~foo?
    • Perhaps: Move userdirs to http://users.hcoop.net/~foo (302ing from hcoop.net/~foo)

  • (./) Replace facebook links with other "get to know the members" text

    • Inspire members to join the planet
    • Make the locations tool usable again (something we can use with Openstreetmap).
  • Give RobinTempleton access as needed

6.3.1. Wiki

6.4. domtool plans

  • Feature backlog
  • Networked domtool-tail
  • Improve fwtool as needs become clearer (FirewallTool)

6.4.1. restricted modules for apache

Inspiration: hcoop.net's vhost is not generated by domtool, and only because it enables mod_userdir.

Idea: have a set of restricted modules that can only be used by superusers. Easiest to just have another ad-hoc list setting in config for domtool. ACL example: hcoop priv www, www priv overloaded to also allow use of restricted module.

Problems: no way currently to restrict access to actions or lib files.

Deficiencies: priv www is a blunt instrument. The priv system in general is mediocre. It might be nice to be able to do something like hcoop priv www apache-module/userdir mail/hopper.hcoop.net (i.e. access to all www nodes, access to the userdir module only, access to hopper). Keys gain some hierarchy, polluting the purity of the triples db, but it is already a bit polluted... is there any difference between adding hierarchy to priv keys and the existing implicit hierarchies in domains and paths?

Solution: might be overkill just for mod_userdir, if it looks like minimal additional code is required perhaps implement hierarchical privs (extending www and mail privs to support limiting to particular admin hosts) and restricted apache modules.
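To make the hierarchical priv idea concrete, here is a small sketch of the matching rule it implies: a granted key allows any request at or below it in the hierarchy. The function names and the "/" separated request format are assumptions for illustration, not how domtool's acl code works today.

```python
def priv_allows(granted, requested):
    """A granted key covers itself and everything beneath it.

    Matching is done on whole "/" separated components, so "www"
    covers "www/some-node" but "mail/hopper" does not cover
    "mail/hopper.hcoop.net".
    """
    g = granted.split("/")
    r = requested.split("/")
    return r[:len(g)] == g


def has_priv(grants, requested):
    """True if any of the user's priv keys covers the request."""
    return any(priv_allows(g, requested) for g in grants)
```

With the grants from the example above (www, apache-module/userdir, mail/hopper.hcoop.net), a request for any www node or for the userdir module is allowed, while a request for another apache module or another mail host is refused.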

6.4.2. Pattern Matching and New Types

A vague idea that may prove to be unworkable. I think at least implementing list matching in domtool would be quite useful. Abstraction syntax would need to be improved to support multiple clauses. case would also be needed to make it useful. Syntax would be easy enough to add except for having to deal with runtime non-exhaustive match exceptions (perhaps requiring exhaustive matches and living with the limitation). Ambitious, probably time consuming, might require adding tail call optimization to the interpreter. Example:

(* A map operator *)
val map = \action -> \list -> 
  case list of head::tail =>
    begin
      action head;
      map action tail;
    end
     | [] => Skip;
(* Alias a list of email addresses to a single target *)
val multiAlias = \sources -> \target -> map (\source -> emailAlias source target) sources;

I probably lack the skill/willpower in the short term... alternative idea: just implement a loop primitive in SML and magic the types away by making it a primitive construct (defining its type on DomTool/LanguageReference). Maybe implement polymorphic actions if adding them is secretly easy:

extern val map : (('a -> 'b) -> ['a]) -> [^Root];

(* Alias a list of email addresses to a single target *)
val multiAlias = \sources -> \target -> map (\source -> emailAlias source target) sources;

Most of the gain, none of the pain.

New types: even more ambitious. Supporting at least tuples or named records, and perhaps a construct for querying the domtool acl database. Idea would be to use it for something like the firewall, where only primitive "generate one firewall rule" constructs would be needed, and then user firewalls could be constructed by querying the ports available to each user and matching/looping.

One pattern that has recurred in domtool is that of special purpose client + server commands that operate on a simple database, e.g. spamassassin prefs, vmail users, firewall rules, and the domtool acl database. It would be useful to have a generalized library for serializing/unserializing sets of SML records, perhaps with a generalized/queryable tuples database built on top of the primitive raw-records database. Even better would be to allow databases to be exposed to domtool, and simple queries performed on them. Maybe.

val writeRecord' : [('record -> 

6.4.3. ip / ipv4 / ipv6

There are a few places (mostly apache) where it would be great to be able to interchange ip and ipv6 addresses. But there's no way to subtype in domtool (except for refining base int and string).

It's been shoehorned in for now (always requiring a node to have an ipv6 address), but this can be a bit awkward (e.g. webAtIp requires that an ipv4 and ipv6 address be provided).

It also might make sense to be able to pass an array of IPs in a few spots instead of just fixing it at one ipv4 and one ipv6 address per WebPlace, although you can already just pass more than one WebPlace.
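For comparison, this is the kind of interchangeability being wished for, shown with Python's standard ipaddress module rather than domtool: one parser accepts either family, and the caller sorts addresses by version afterwards. The split_addresses helper is made up for this illustration.

```python
import ipaddress


def split_addresses(addresses):
    """Parse a mixed list of address strings into (ipv4, ipv6) lists.

    ipaddress.ip_address accepts either family and raises ValueError
    for anything that is not a valid address.
    """
    v4, v6 = [], []
    for text in addresses:
        addr = ipaddress.ip_address(text)
        (v4 if addr.version == 4 else v6).append(str(addr))
    return v4, v6
```

A WebPlace-like construct built this way could take any number of addresses of either kind, including none of one family.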

7. Board Stuff

  • What address should we be using for the "owner" of hcoop services? It's a mix of the current treasurer, registered agent, and data center right now... is the registered agent correct? Do we need to get a PO Box or something?
  • Possible policy change: pro-rate the first month of dues for new members. It seems a bit unfair to make folks pay an entire month of dues, especially if they join near the end of the month.
    • Board did approve this, but it was never implemented :-\
  • Do we need a private wiki of some sort for filing sensitive information like welcome emails for services and various credentials... is it safe enough to do that using the public moin and acls? I'm not so sure...

8. etc

Barely formed sentences.

8.1. tt-rss at hcoop

Our postgresql does not use passwords. The installer needs a single tweak to remove the required attribute of the database password to install.

8.2. Fastcgi Problems with openafs

Old Content: This section is from before fastcgi was implemented. Leaving so I might remember one day to try the idea of adding an suexec hook to apache.

The mod_fcgid spawner runs in its own process pool and therefore without tokens or the ability to acquire tokens for processes it launches. Thus, all fcgi processes must be wrapped to avoid surprising behavior. The FcgidWrapper directive is not very expressive: the "wrapper" is the fastcgi application that is launched and then passed any files matching the extension using SCRIPT_NAME. A wrapper wrapping script is needed to grab tokens before launching the actual wrapper, and just using {Add,Set}Handler fcgid-script won't work as expected (users could of course arrange for programs run that way to grab tokens manually).

mod_wsgi and mod_cgid have identical problems. Inspecting the apache source code suggests that it would be possible to fix the situation generally by adding a pre/post suexec hook. The process managers for mod_cgid/mod_fcgid/mod_wsgi are forked from the primordial apache process, which I understand has all modules loaded. Modules like mod_auth_kerb and mod_waklog could then inject tickets/tokens/etc. into the environment from which external processes are spawned using the suexec hooks.


CategoryHomepage

ClintonEbadi (last edited 2019-05-10 14:33:51 by ClintonEbadi)