Diff for "SystemArchitecturePlans"

Differences between revisions 7 and 26 (spanning 19 versions)
Revision 7 as of 2006-03-25 17:23:38
Size: 7014
Comment:
Revision 26 as of 2012-12-14 17:03:35
Size: 2124
Editor: ClintonEbadi
Comment: categorize (historical info)
Line 1: Line 1:
= Details about the next Hcoop Architecture = = Details about the next HCoop Architecture =
Line 3: Line 3:
This page is intended to facilitate discussion of details relating to our next server architecture. Currently, the first draft of this page, written on Sat Mar 25 10:18:12 EST 2006 by JustinLeitgeb, is based upon discussions from the hcoop mailing list. Please feel free to contribute or change anything here! This page will serve as the blueprint for the architecture that HCoop is constructing right now, to be colocated at Peer 1.
Line 5: Line 5:
== Hcoop Future Network Overview == <<TableOfContents(2)>>

== Network Overview ==
Line 9: Line 11:
 * A fileserver, running AFS which is accessible via ssh only for administrative purposes
 * A public login and http server, accessible by all members. User files are stored on the fileserver mentioned above. This will host all user pages, including dynamic content.
 * A server for hcoop needs that most users won't need direct shell access to. This will run Cyrus IMAP, exim, and Apache primarily for hcoop administrative purposes.
 * A back-end server, which will serve IMAP, MySQL, PostgreSQL, primary mail, and AFS.
 * A public shell server for development and deployment of files to web server.
 * A public web server, also used for daemon processes written by our users. This will eventually be one node of a web cluster.
Line 13: Line 15:
Additionally, we will need certain networking equipment: == High-Level Architecture Description ==
Line 15: Line 17:
 * A gigabit switch that will be the initial backbone of the hcoop LAN.
 * Perhaps (still not finalized in plans) a hardware firewall for the hcoop LAN. Ideas on this from members?
The new HCoop architecture initially involves three servers: one for user shell logins, one for back-end services, and one for web service. Our goal is to build an architecture that serves us well based on our current needs, and can be expanded for increased capacity in the future with little effort.
Line 18: Line 19:
We should also remember that all of our servers will most likely have at least two NICs. How can we utilize these best? Some sites have one NIC doing backups or logging, and another handling requests from the Internet. Perhaps we could segment our traffic to two local area networks, one for services to the Internet and another for local file access (i.e., traffic between the two "public" servers and the file server). Based on this, we should have one shell server that users edit and develop their web sites from. This will allow users to modify files on one web server in the present, and a cluster as we grow.
Line 20: Line 21:
== Hcoop Future Network Diagrams == == Design Goals ==
Line 22: Line 23:
The following is a preliminary version of a network plan that JustinLeitgeb created on March 25, 2006, after discussions on the hcoop.net mailing list. Included in the design is a hardware firewall, which was not finalized in previous discussions. Let's collect thoughts and alternate plans here as we work towards solidifying plans. We should be able to plug new web servers into our architecture in the future in a manner that doesn't break our software systems. We may also want to think about doing the same thing with shell and file services.
Line 24: Line 25:
 * [attachment:network_diagram_20060325.dia Network planning diagram in "dia" format for editing]
 * [attachment:network_diagram_20060325.png Network planning diagram in PNG format for easier viewing]
== Physical Network Layout ==
Line 27: Line 27:
== Server Hardware == This section describes the HCoop physical network layout.
Line 29: Line 29:
This may be a moot point as we are looking for a shop that can give us hardware support, and this may require that we buy their supported machines. However, it seems that many colocation providers will try to push us into a deal where their support consists in a "remote hands" plan where they will fix any reasonably standard hardware that we send to them for an hourly rate. If that is the case, our discussions on possible server hardware on the list may still be valid. Generally, we have decided that what we need in terms of hardware is more or less as follows: === Physical Network Description ===
Line 31: Line 31:
 * Two web servers with at least 1GB of RAM each. Redundancy should include a RAID 1 configuration with two 73 GB drives, and dual power supplies.
 * One file server with more storage space and room to grow. It doesn't need to be exceptionally fast because of AFS's caching mechanisms. Perhaps a small RAID 5 configuration of 3 x 500 GB SATA devices would be a good place to start. It should certainly be hardware - based RAID so that main CPU power is not needed for read and write operations. JustinLeitgeb suggested a [http://www.3ware.com/products/serial_ata2-9000.asp 3Ware Escalade Controller] in this machine.
Gigabit switch, divided into VLAN for server inter-connections. LAN out to Peer 1.
Line 34: Line 33:
The list also discussed hardware vendors. If this isn't a moot point based on our decision of a colo provider with specific needs, the following list of possibilities may still be relevant: === Physical Network Diagram(s) ===
Line 36: Line 35:
 * Dell PowerEdge servers. JustinLeitgeb suggested [http://www1.us.dell.com/content/products/productdetails.aspx/pedge_1850?c=us&cs=04&l=en&s=bsd Dell 1850's] for the web servers, and a [http://www1.us.dell.com/content/products/productdetails.aspx/pedge_2850?c=us&cs=04&l=en&s=bsd 2850] for the fileserver. One drawback to this server line is that it uses Intel, which is less desirable currently than AMD. However, this server line has been in production for quite a while and it has proven stable in many situations.
 * Sun Fire servers. These machines use AMD processors but are considerably more expensive than comparable Dell machines.
 
== Networking Hardware ==
XXX add diagram here.
Line 41: Line 37:
Here we should talk about the specific networking equipment that we need. Ideas on vendors or models for the gigabit switch? Thoughts on if we should start with a hardware firewall device? Also it was mentioned that we should invest in a serial console for remote access when a machine goes down. Thoughts on this? == Software/System Layout ==
Line 43: Line 39:
== Backup Configuration == This deserves the separate page SoftwareArchitecturePlans.
Line 45: Line 41:
All are in agreement that we need a robust backup plan in our new architecture. It seems that it will include the continued use of [http://www.rsnapshot.org rsnapshot], and that this utility will save even the front-end server data to the fileserver with RAID 5. Additionally, we should have data stored off-site in a manner that allows us to recover, even in the event that we are "rooted". We are looking for backup capabilities in colocation providers. Another option could be to have rsync-style backups to some administrator's connection over the Internet, but this might not be tenable given the amount of data, the need for quick restores, etc. Let's continue to edit this section!
=== Software/System Description ===
Line 48: Line 43:
== Scaling Out ==

The next configuration should be reasonably scalable, as we are expecting to grow rapidly in size. How should we scale our systems? Some ideas follow:

 * Make sure our software is able to publish to a web cluster. Eventually, we may have a dedicated "development" web node with a testing Apache configuration. Once users are satisfied with their changes, they could push them out to a cluster. This could just mean putting them in a special location on the fileserver that is pulled from by a set of web nodes. Think about how to do load balancing on this cluster. F5 Networks has a device that will load-balance, but perhaps a Linux solution would be more affordable.
 * Precisely how do we scale our fileserver? Does AFS have a mechanism where new fileservers can be added to the available space? Is RAID 5 the best solution, or should we start with something like RAID 10 for better IO throughput? Will AFS caching to the front-end web nodes be sufficient to mitigate the latency that will be introduced by a relatively slow RAID 5 configuration?
 * Database clustering possibilities: Many of our sites seem to depend on MySQL. In the future, something like MySQL clustering may help us by having a tuned database cluster to be shared by a lot of dynamic web sites. This could then be put behind a VIP (Virtual IP).

== Page version history ==

Initial page created Sat Mar 25 11:52:03 EST 2006 JustinLeitgeb.
=== Software/System Diagram(s) ===
Line 63: Line 48:
----
CategorySystemAdministration CategoryHistorical

1. Details about the next HCoop Architecture

This page will serve as the blueprint for the architecture that HCoop is constructing right now, to be colocated at Peer 1.

1.1. Network Overview

The architecture for the next hcoop.net network involves three physical servers:

  • A back-end server, which will serve IMAP, MySQL, PostgreSQL, primary mail, and AFS.
  • A public shell server for development and deployment of files to web server.
  • A public web server, also used for daemon processes written by our users. This will eventually be one node of a web cluster.

1.2. High-Level Architecture Description

The new HCoop architecture initially involves three servers: one for user shell logins, one for back-end services, and one for web service. Our goal is to build an architecture that serves us well based on our current needs, and can be expanded for increased capacity in the future with little effort.

Based on this, we should have one shell server from which users edit and develop their web sites. This will allow users to modify files on one web server now, and on a cluster as we grow.

1.3. Design Goals

We should be able to plug new web servers into our architecture in the future in a manner that doesn't break our software systems. We may also want to think about doing the same thing with shell and file services.
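
One way to picture this goal is a small role-to-host inventory that tools consult instead of hard-coding server names. The following is only a minimal sketch in Python, not part of the actual HCoop tooling: the Inventory class, the deploy_targets function, and the host names (backend1, shell1, web1, web2) are all hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Inventory:
    """Maps a service role (e.g. "web") to the hosts that currently provide it."""
    roles: Dict[str, List[str]] = field(default_factory=dict)

    def add_host(self, role: str, host: str) -> None:
        # Register a host under a role; adding the same host twice is a no-op.
        hosts = self.roles.setdefault(role, [])
        if host not in hosts:
            hosts.append(host)

    def hosts_for(self, role: str) -> List[str]:
        # Return a copy so callers cannot mutate the inventory by accident.
        return list(self.roles.get(role, []))


def deploy_targets(inventory: Inventory) -> List[str]:
    """Hosts a site deployment must reach: every current web node."""
    return inventory.hosts_for("web")


# Hypothetical host names for the three initial servers.
inventory = Inventory()
inventory.add_host("backend", "backend1")
inventory.add_host("shell", "shell1")
inventory.add_host("web", "web1")
```

Under these assumptions, plugging in a second web node is a single add_host("web", "web2") call, and code that asks deploy_targets for its hosts picks up the new node without being changed. The same lookup would work for the shell and back-end roles.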

1.4. Physical Network Layout

This section describes the HCoop physical network layout.

1.4.1. Physical Network Description

A gigabit switch, divided into VLANs for server interconnections. LAN out to Peer 1.

1.4.2. Physical Network Diagram(s)

XXX add diagram here.

1.5. Software/System Layout

This deserves the separate page SoftwareArchitecturePlans.

1.5.1. Software/System Description

1.5.2. Software/System Diagram(s)

ColocationPlans is the main page for items related to the new architecture. ColocationPlansServiceProviders provides information about the service providers we are currently looking at.


CategorySystemAdministration CategoryHistorical

SystemArchitecturePlans (last edited 2012-12-14 17:03:35 by ClintonEbadi)