Diff for "SoftwareArchitecturePlans"

Differences between revisions 2 and 3
Revision 2 as of 2006-07-08 17:52:52
Size: 5450
Editor: AdamChlipala
Comment:
Revision 3 as of 2006-07-08 18:20:07
Size: 8379
Editor: AdamChlipala
Comment:


= Security =

Here are the security issues we need to worry about, sorted by resource category at varying levels of abstraction. We mostly deal here with avoiding negative consequences of actions by members with legitimate access to our servers.


== CPU time ==

We haven't really encountered any trouble with this literal resource yet. However, potential problems come in when we're talking about user dynamic web site programs called by a shared Apache daemon. Apache allocates a fixed set of child processes, and each pending dynamic web site program takes up one child process for the duration of its life. Enough infinite-looping or slow CGI scripts can bring Apache down for everyone.

=== Current remedies ===

As per ResourceLimits, we use patched `suexec` programs to limit dynamic page generation programs to 10 seconds of running time. We also have a time-out for `mod_proxy` accesses, which we provide to allow members to implement dynamic web sites through their own daemons that the main Apache proxies.


== Disk usage ==

We can't let one person use up all of the disk space, now can we?

=== Current remedies ===

We use group quotas so that members can be charged for files that they don't own. This is still hackish and allows some unintended behaviors. DaemonFileSecurity has more detail.


== Network bandwidth ==

We don't do a thing to limit this now, since our current host provides significantly more bandwidth than we need.

=== Questions to be resolved ===

 1. Should we start doing anything beyond monitoring?


== Network connection privileges ==

It's good to follow least privilege in who is allowed to connect to/listen on which ports.

=== Current remedies ===

We have a firewall system in place now. It uses a custom tool documented partially on FirewallRules.


== Number of processes ==

Fork bombs are no fun, and many resource limiting schemes are per-process and so require a limit on process creation to be effective.

=== Current remedies ===

As per ResourceLimits, we use the `nproc` ulimit.


== RAM ==

This is probably the most surprising thing for novices to the hosting co-op planning biz. If you would classify yourself as such, then I bet you would leave RAM off your list of resources that need to be protected with explicit security measures!

Nonetheless, it may just be the most critical resource to control. In our experience back when everything ran on Abulafia, the most common cause of system outage was some user running an out-of-control process that allocated all available memory, causing other processes to drop dead left and right as memory allocation calls failed. We're letting people run their own daemons 24/7, so this just can't be ignored.

=== Current remedies ===

As per ResourceLimits, we use the `as` ulimit to put a cap on how much virtual memory a process can allocate.

1. Terminology

To save space below, we'll use the following working names for the different pieces of hardware involved:

  • Main is the machine hosting most services.

  • Dynamic is the machine hosting member dynamic web sites and other services where we run arbitrary code written by members.

  • Shell is the "most anything goes" shell server.

2. Daemons shared by members

2.1. DNS

2.1.1. Decisions that we've agreed on

  • Running djbdns on Main

2.1.2. Questions to be resolved

  1. How do we arrange redundant DNS infrastructure?

2.1.3. References to how we do things now

DnsConfiguration, DomainRegistration

2.2. FTP

2.2.1. Decisions that we've agreed on

  • Run an FTP daemon on Main
  • Only allow encrypted authentication methods
  • Only allow users on a white-list to use FTP; they should be using SCP if possible

2.2.2. References to how we do things now

FtpConfiguration, FileTransfer

2.3. HTTP

2.3.1. Decisions that we've agreed on

  • Using Apache 2
  • Running all official/administrative HCoop web sites on Main
  • Running all member dynamic web sites on Dynamic

2.3.2. Questions to be resolved

  1. Do we completely separate administrative web sites from the rest, or do we allow any member static web site to be served by Main?

2.3.3. References to how we do things now

UserWebsites, DynamicWebSites, VirtualHostConfiguration

2.4. IMAP/POP

2.4.1. Decisions that we've agreed on

  • Running the primary IMAP/POP daemons on Main
  • Running both SSL and normal versions, where the normal versions can only be used over the local network

2.4.2. Questions to be resolved

  1. Do we keep using Courier IMAP or do we switch to something like Cyrus?

2.4.3. References to how we do things now

UsingEmail, EmailConfiguration

2.5. Jabber

2.5.1. Decisions that we've agreed on

  • Run the same thing we're running now, on Main

2.5.2. References to how we do things now

JabberServer

2.6. Mailing lists

2.6.1. Decisions that we've agreed on

  • Using the Mailman software
  • Running the daemon on Main

2.6.2. Questions to be resolved

  1. How/where do we store mailing list data so that it is appropriately charged towards a member's storage quota?

2.6.3. References to how we do things now

MailingListConfiguration

2.7. Relational database servers

2.7.1. Decisions that we've agreed on

  • Running PostgreSQL and MySQL servers on Main

2.7.2. Questions to be resolved

  1. Are we satisfied with the latest versions from Debian stable, or do we want to do something special?
  2. Do we do remote PostgreSQL authentication (from Dynamic, etc.) via the ident method?

2.7.3. References to how we do things now

UsingDatabases

2.8. SMTP

2.8.1. Decisions that we've agreed on

  • Using Exim 4
  • Running the primary SMTP daemon on Main

2.8.2. Questions to be resolved

  1. Run secondary MX on Dynamic or elsewhere?

2.8.3. References to how we do things now

UsingEmail, EmailConfiguration

2.9. Spam detection

2.9.1. Decisions that we've agreed on

  • Running the SpamAssassin spamd daemon on Main

  • Running it via the spamc client on all mail to opted-in addresses, but leaving filtering based on the added headers up to the individual recipients
  • Keeping a shared Bayes filtering database that can be trained by members by depositing misclassified messages into shared folders

2.9.2. References to how we do things now

UsingEmail, SpamAssassin, FeedingSpamAssassin, SpamAssassinAdmin

2.10. SSH

2.10.1. Decisions that we've agreed on

  • Use the standard SSH daemon in Debian
  • Run it on all of our servers, with varying access permissions based on the shared user list

2.10.2. References to how we do things now

SshConfiguration

3. Services run on top of these daemons

3.1. Domtool

Everyone's favorite spiffy system for letting legions of users manage the same daemons securely.

AdamChlipala says:

  • I would like to rewrite this completely, for reasons including:
      • From a software engineering perspective, the implementation is not so nice.
      • There is no support for configuring multiple machines from the same configuration file source.
      • Scalability with the increasing amount of configuration is not so hot.
      • The current configuration scheme encourages copying-and-pasting, which makes it hard to make sweeping changes to our suggested configuration base.

3.1.1. References to how we do things now

DomainTool

3.2. Portal

3.2.1. Decisions that we've agreed on

  • Keep doing the same as now, running on Main

3.2.2. References to how we do things now

[https://members.hcoop.net/ The portal]

3.3. Web e-mail client

3.3.1. Decisions that we've agreed on

3.3.2. References to how we do things now

[http://mail.hcoop.net/ SquirrelMail]

3.4. Webmin/Usermin

3.4.1. Decisions that we've agreed on

  • Keep doing the same as now, running on Main

3.4.2. References to how we do things now

[https://members.hcoop.net/usermin/ Usermin]

3.5. Wiki

3.5.1. Decisions that we've agreed on

  • Keep the current MoinMoin wiki, starting from the same data, run on Main

3.5.2. References to how we do things now

[http://wiki.hcoop.net/ This wiki]

4. Security

Here are the security issues we need to worry about, sorted by resource category at varying levels of abstraction. We mostly deal here with avoiding negative consequences of actions by members with legitimate access to our servers.

4.1. CPU time

We haven't really encountered any trouble with this literal resource yet. However, potential problems come in when we're talking about user dynamic web site programs called by a shared Apache daemon. Apache allocates a fixed set of child processes, and each pending dynamic web site program takes up one child process for the duration of its life. Enough infinite-looping or slow CGI scripts can bring Apache down for everyone.

4.1.1. Current remedies

As per ResourceLimits, we use patched suexec programs to limit dynamic page generation programs to 10 seconds of running time. We also have a time-out for mod_proxy accesses, which we provide to allow members to implement dynamic web sites through their own daemons that the main Apache proxies.
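To make the mechanism concrete, here is a minimal Python sketch of capping a child program's CPU time. It is not the patched suexec: it assumes the 10-second figure is a CPU-time cap imposed through the kernel's RLIMIT_CPU limit, which the text above does not confirm, and the helper name run_with_cpu_limit is hypothetical.

```python
import resource
import subprocess
import sys


def run_with_cpu_limit(argv, cpu_seconds):
    """Run argv in a child whose CPU time is capped at cpu_seconds.

    Returns the child's exit status; a negative value means the child
    was terminated by a signal, as happens when it exceeds the cap.
    """
    def apply_limit():
        # Runs in the child between fork and exec, so the parent's own
        # limits are left untouched.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))

    return subprocess.run(argv, preexec_fn=apply_limit).returncode
```

Calling run_with_cpu_limit([sys.executable, "-c", "while True: pass"], 10) would let an infinite loop burn at most about 10 seconds of CPU before the kernel kills it. Note that RLIMIT_CPU counts CPU time only, so a script that sleeps or blocks on I/O would need a separate wall-clock time-out to free its Apache child.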

4.2. Disk usage

We can't let one person use up all of the disk space, now can we?

4.2.1. Current remedies

We use group quotas so that members can be charged for files that they don't own. This is still hackish and allows some unintended behaviors. DaemonFileSecurity has more detail.
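As an illustration of what charging by group rather than by owner means, the following Python sketch tallies file sizes per group ID under a directory. It is only an accounting model, not how kernel quotas are implemented or how we configure them; the function name usage_by_group is hypothetical, and it sums apparent file sizes where real quotas count disk blocks.

```python
import os
import stat
from collections import Counter


def usage_by_group(root):
    """Return a Counter mapping group ID to total bytes of regular files."""
    totals = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # lstat so that symlinks are not followed and double-counted.
            info = os.lstat(os.path.join(dirpath, name))
            if stat.S_ISREG(info.st_mode):
                # Charge the file to its group, whoever the owning user is.
                totals[info.st_gid] += info.st_size
    return totals
```

Under this model, a file written by a shared daemon's user still counts against a member as long as it carries that member's group.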

4.3. Network bandwidth

We don't do a thing to limit this now, since our current host provides significantly more bandwidth than we need.

4.3.1. Questions to be resolved

  1. Should we start doing anything beyond monitoring?

4.4. Network connection privileges

It's good to follow least privilege in who is allowed to connect to/listen on which ports.

4.4.1. Current remedies

We have a firewall system in place now. It uses a custom tool documented partially on FirewallRules.

4.5. Number of processes

Fork bombs are no fun, and many resource limiting schemes are per-process and so require a limit on process creation to be effective.

4.5.1. Current remedies

As per ResourceLimits, we use the nproc ulimit.
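For readers unfamiliar with this limit, here is a small Python sketch that starts a child with RLIMIT_NPROC lowered. How our limits are actually applied is described on ResourceLimits, not here; the helper name run_with_nproc_limit is hypothetical, and the sketch assumes a Unix-like system that provides RLIMIT_NPROC.

```python
import resource
import subprocess
import sys


def run_with_nproc_limit(argv, max_procs):
    """Run argv with RLIMIT_NPROC capped at max_procs; return its stdout."""
    def apply_limit():
        _, hard = resource.getrlimit(resource.RLIMIT_NPROC)
        # An unprivileged process cannot raise its hard limit, so clamp
        # the requested value to whatever is already in force.
        if hard == resource.RLIM_INFINITY:
            cap = max_procs
        else:
            cap = min(max_procs, hard)
        resource.setrlimit(resource.RLIMIT_NPROC, (cap, cap))

    done = subprocess.run(
        argv, preexec_fn=apply_limit, stdout=subprocess.PIPE, text=True
    )
    return done.stdout
```

On Linux the kernel counts this limit against all processes of the same real user ID rather than against a single process tree, and it is not enforced for root, so it only restrains fork bombs from unprivileged member accounts.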

4.6. RAM

This is probably the most surprising thing for novices to the hosting co-op planning biz. If you would classify yourself as such, then I bet you would leave RAM off your list of resources that need to be protected with explicit security measures!

Nonetheless, it may just be the most critical resource to control. In our experience back when everything ran on Abulafia, the most common cause of system outage was some user running an out-of-control process that allocated all available memory, causing other processes to drop dead left and right as memory allocation calls failed. We're letting people run their own daemons 24/7, so this just can't be ignored.

4.6.1. Current remedies

As per ResourceLimits, we use the "as" (address space) ulimit to put a cap on how much virtual memory a process can allocate.
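The effect of an address-space cap can be seen with a short Python sketch. It assumes Linux, where RLIMIT_AS is enforced; the names run_with_memory_cap and GREEDY_CHILD are hypothetical, and the 512 MiB figure in the note below is an arbitrary example, not our configured limit.

```python
import resource
import subprocess
import sys


def run_with_memory_cap(argv, max_bytes):
    """Run argv with its virtual address space capped at max_bytes."""
    def apply_limit():
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        # Never try to raise an existing hard limit; only lower it.
        if hard == resource.RLIM_INFINITY:
            cap = max_bytes
        else:
            cap = min(max_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (cap, cap))

    return subprocess.run(
        argv, preexec_fn=apply_limit, stdout=subprocess.PIPE, text=True
    )


# A child program that tries to grab 1 GiB in one allocation and reports
# whether the request was refused.
GREEDY_CHILD = (
    "try:\n"
    "    block = bytearray(1024 ** 3)\n"
    "    print('allocated')\n"
    "except MemoryError:\n"
    "    print('refused')\n"
)
```

Running run_with_memory_cap([sys.executable, "-c", GREEDY_CHILD], 512 * 1024 ** 2) makes the oversized allocation fail inside the offending process alone, instead of exhausting memory for everyone else on the machine. Because the limit is per-process, it only bounds total usage when combined with a cap on the number of processes.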

SoftwareArchitecturePlans (last edited 2018-04-22 01:34:40 by ClintonEbadi)