This page lists the pages that are of interest to the admins.
Sysadmin work
Links to detailed policies, procedures and information specific to HCoop. The resources here should allow HCoop admin team members to share information about every part of the complete system, and should make it easier to train future team members.
The linked pages are sorted by relevance to day-to-day operations. Current admins will most often consult pages from the top of the list, while new admins or people wishing to get familiar with the setup will start from the end and move upwards.
Admins: it is recommended that you create a wiki account and subscribe to the page regex .* (all pages) to keep informed of what everyone is up to. Documenting your work here is recommended.
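As a rough illustration of why the regex .* covers every page, the sketch below checks a subscription pattern against a few page names from this index. The helper name and the full-match behaviour are assumptions made for the example; how the wiki engine actually applies subscription patterns may differ.

```python
import re

# A few page names taken from this index; the real wiki has many more.
pages = ["TaskDistribution", "AdminArea/ListOfVolunteers", "RebootingDeleuze"]

def subscribed(pattern, names):
    """Return the names a subscription regex would cover.

    Assumption: the pattern must match the whole page name.
    """
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]

# ".*" matches any page name, including subpages containing "/".
all_pages = subscribed(r".*", pages)

# A narrower pattern only covers pages whose names start with "Rebooting".
rebooting_only = subscribed(r"Rebooting.*", pages)
```

A narrower pattern is useful for someone who only wants notifications for one area; the recommendation above is to use .* so that nothing is missed.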
To be an admin
Sections you should read if you are interested in being an admin.
TipsAndTricks
Admins and Admin Responsibilities
TaskDistribution: What each sysadmin is responsible for.
VolunteerResponsePolicy: Guidelines for responding to requests and email.
AdminArea/ListOfVolunteers: Volunteers who can help us with tasks.
AdminGroup: People who can delete and despam pages on the wiki.
Introductory material
Refer to the documentation of each of the listed components. Our wiki pages cover only the most basic principles and quickly move on to HCoop-specific setup, assuming you are already familiar with the technology.
DaemonFileSecurity
DomTool
AuthenticationScheme
OpenLDAP
MitKerberos
AndrewFileSystem
EtcKeeper
Planning and Records
RoadMaps: Announcements of future plans and events.
Migration2009
Migration2009/SoftwareSetup
Migration2010Notes: Notes about the new server setup and how to transfer the old data over.
Technical Records
IpAddresses: Listing of IPs that we use.
Hardware: Information on HCoop hardware.
HcoopAddresses: Physical addresses relevant to us.
OnSiteVisits: Records of visits by HCoop volunteers to our colocation facilities.
Views
Fritz.hcoop.net - Munin reports (http://fritz.hcoop.net/munin/)
Specific Machines
This section documents machine-specific (hardware) details and any configuration that applies only to a particular machine.
Hardware
SetupNewMachines: How to install a machine that adheres to our policies.
KvmAccess: How to use the remote KVM and avoid going on site.
- deleuze
PowerEdge2850 is about deleuze
RebootingDeleuze: Steps to take after rebooting deleuze.
- mire
RebootingMireSp: How to reboot mire using its SP interface.
- hopper
HopperServiceProcessor
- fritz
FritzInfo
- outpost
Services
This section documents everything software-related that is not machine-specific.
General Sysadmin
BackupInfo: Information on how to recover deleted files from our off-site backups.
DebianPackaging: How to make custom HCoop Debian packages.
ResourceLimits
InstalledSoftware lists non-Debian installed software.
SystemAuthentication lists authentication
UsingResourceLimits: If this is still accurate, we should move it to the MemberManual area.
- Member Management
UserManagement only talks about adduser/deluser right now.
MemberFreezing: How to freeze and unfreeze members who get behind on dues.
AdminUserSetup lists steps to create (currently blank), delete, and change the passwords of admin users.
ChangingAdminPassword: How admins can change their UNIX passwords.
Specific Services
DaemonAdmin: How to set up various daemons. NOTE: many of the services below are linked from here; we should migrate the contents of this page into the outline below.
- AFS / Kerberos
SetupNewAfsServer: How to set up a new AFS server.
PrincipalsForNonHumans talks about Kerberos for automated tasks.
- Mail
MailMan contains no information...
SpamAssassinAdmin
- Web
- DNS
ZoneTransfers is also mostly blank.
- Databases
- Backups
- Version Control
- wiki.hcoop.net
- jabber
- Other
CertificateAuthority: How to sign user SSL certificates and the like.
Historical
Pages no longer considered relevant:
SoftwareArchitecturePlans: Plans for software installation.
SystemArchitecturePlans: Plans regarding our hardware.
InstallationLog contains ancient (~2005) records of software and hardware installation.
KrunkInfoz (Krunk is out of service)