
Revision 13 as of 2014-04-10 16:11:18


NewServerDiscussion2014

1. New Virtualization Host

Now that we are using KernelVirtualMachine, getting systems into use without weeks or months of delay is finally feasible. We're making good strides toward getting rid of mire, and now the time has come to allow deleuze to retire as well. Unfortunately, fritz is effectively at capacity with the web server and member shell VMs, so we need a new machine.

The immediate uses for the new machine:

Over the next few months, other uses will appear:

This means we need hardware featuring something like:

1.1. Possible Hardware

Find a price quote, and list any machines you think would suit the task.

1.1.1. PowerEdge R515

Quotes:

1.1.1.1. Other Considerations

2. Backup Drives

As feared, it turns out that we cannot do an afs volume dump to the afs volume partition of the fileserver without grinding hcoop to a halt. ClintonEbadi's early test resulted in fritz thrashing itself into oblivion and the entire site losing access to the fileserver (never fear! services recovered flawlessly on their own mere minutes after the dump was killed). None of deleuze, mire, or hopper has the free space required for backing up all of the data we have in afs now, never mind future data.

Thus: we need a pair of drives for a third RAID1 on fritz or the new machine, dedicated to backups. We have nearly 200G in data that needs backing up now, so 500G is the absolute minimum. However, much larger drives are not much more expensive and it's not unrealistic that we would have well over 500G of data in need of backing up 18 months from now (inserting new stuff at the data center is a pain).
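To put a number on "not unrealistic", here is a rough sketch of the growth arithmetic. It uses only the figures above (about 200G now, a 500G minimum, an 18-month horizon). The function names are made up for illustration, and the sample growth rates are guesses, not measurements of our actual usage. It also counts a single copy of the data; keeping more than one dump around would raise the requirement.

```python
def required_monthly_growth(current_gb, target_gb, months):
    """Monthly compound growth rate at which current_gb reaches target_gb."""
    return (target_gb / current_gb) ** (1.0 / months) - 1.0


def projected_size_gb(current_gb, monthly_growth, months):
    """Size after `months` of compound growth (0.05 means 5% per month)."""
    return current_gb * (1.0 + monthly_growth) ** months


if __name__ == "__main__":
    # 200G now, 500G drives, 18 months: figures taken from the text above.
    rate = required_monthly_growth(200, 500, 18)
    print(f"Growth needed to outgrow 500G in 18 months: {rate:.1%} per month")

    # Illustrative guesses only; we have not measured real growth.
    for guess in (0.02, 0.05, 0.08):
        size = projected_size_gb(200, guess, 18)
        print(f"{guess:.0%} per month -> {size:.0f}G after 18 months")
```

In other words, 500G drives only last the full 18 months if the data grows by less than roughly 5% per month, which is the case for buying larger drives up front.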

2.1. Drive Options

We have caddies for all six bays in FritzInfo plus an extra, and the new server will have 12 drive bays total, so we have options.

We currently have two 500G drives that would work in the new machine, and less than 300G of space in use by all openafs volumes. However, we might also want to use those for the operating system disks or similar.

3. OpenAFS Drives

We're only using a bit under 200G of data now, and have a 1TB partition on fritz for afs. So we probably only need another RAID1 of 1TB disks, although we should investigate larger drives to make sure we hit the optimal price/GB vs predicted needs. We can put off expanding storage for now since we are a few months of software work away from being able to host a new openafs fileserver.
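The price/GB comparison is simple enough to script once we have real quotes. The following is only a sketch: the drive names and prices are hypothetical placeholders, not quotes, and `best_price_per_gb` is an illustrative helper rather than anything we already have.

```python
def best_price_per_gb(drives, min_capacity_gb):
    """Return the (name, capacity_gb, price) tuple with the lowest price/GB
    among drives of at least min_capacity_gb, or None if none qualify."""
    candidates = [d for d in drives if d[1] >= min_capacity_gb]
    if not candidates:
        return None
    return min(candidates, key=lambda d: d[2] / d[1])


if __name__ == "__main__":
    # Hypothetical drives and prices, for illustration only.
    drives = [
        ("1TB drive", 1000, 80.0),
        ("2TB drive", 2000, 110.0),
        ("3TB drive", 3000, 180.0),
    ]
    choice = best_price_per_gb(drives, min_capacity_gb=1000)
    if choice is not None:
        name, capacity_gb, price = choice
        print(f"{name}: {price / capacity_gb:.3f} per GB")
```

Remember that a RAID1 needs two of whatever drive is chosen, so the total cost is double the single-drive price.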

4. Fixing Remote Access

Or: what to do with hopper and mire?

Our KvmAccess is currently broken. The Belkin KVM is working properly, but either the StarTech IpKvm or its power brick is dead. We are going to test it with a new power brick, but there are other reasons to get rid of it...

It turns out that all of our Dell machines support IPMI power control and serial console over Ethernet. So we could use hopper or mire as a frontend to Fritz/Deleuze/The-new-machine in addition to light duty as a secondary machine. Both of them have out-of-band remote power and console, although RebootingMireSp has a much saner interface than HopperServiceProcessor.

The Argument:

The KvmAccess system involves a cumbersome VGA+PS/2 cable for each machine. These are the chief contributors to the mess in the rack. And now that all of our systems are native USB, we've had trouble with at least two different USB-to-PS/2 adapters leading to loss of the remote keyboard. The IpKvm is also pretty useless if the machine powers off; our only recourse for rebooting is sending ctrl-alt-delete or SysRq-B.

All of our important machines (Deleuze, Fritz, New Server) support IPMI over ethernet. This gives us power control and access equivalent to a physical console via a serial tool. Both mire and hopper have two ethernet ports, so we could just swap a few cables around and have a private IPMI management lan behind either. If we really wanted to, we could also ditch the Belkin KVM switch and attach the IpKvm directly to either machine to provide a secondary out-of-band access method.
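As a rough sketch of what this would look like from the console server, the helper below builds command lines for `ipmitool`, one common IPMI client. Nothing above names a specific tool, so the use of `ipmitool` is an assumption, as are the hostname `fritz-ipmi.example` and the user `admin`. The `lanplus` interface is needed for serial-over-LAN; whether every one of our Dell BMCs accepts it is untested. The code only prints the commands and does not run anything.

```python
import shlex


def ipmi_command(host, user, *args, interface="lanplus"):
    """Build an ipmitool argv list for a remote BMC.

    -E tells ipmitool to read the password from the IPMI_PASSWORD
    environment variable, so it never appears on the command line.
    """
    return ["ipmitool", "-I", interface, "-H", host, "-U", user, "-E", *args]


def power_status(host, user):
    """Query whether the chassis is powered on."""
    return ipmi_command(host, user, "chassis", "power", "status")


def power_cycle(host, user):
    """Power the machine off and back on."""
    return ipmi_command(host, user, "chassis", "power", "cycle")


def serial_console(host, user):
    """Attach to the serial-over-LAN console."""
    return ipmi_command(host, user, "sol", "activate")


if __name__ == "__main__":
    # Hypothetical BMC address and user on the private management lan.
    host, user = "fritz-ipmi.example", "admin"
    for cmd in (power_status(host, user), power_cycle(host, user),
                serial_console(host, user)):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Unlike the IpKvm, the power commands work even when the machine is off, which is the main point of the switch.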

We also need to keep at least one of mire or hopper around for a few tasks, so we're going to have to keep one of them in the power budget until after we decommission deleuze. It therefore seems sensible (assuming configuring out-of-band access to the Dell servers via IPMI works as well as it should) to use one as a console server.

My (ClintonEbadi) current thinking is that we should remove hopper and not mire. Pros:

Cons:

So... it's up in the air which is better. The tasks required to remove hopper when we install the new machine include:

All of the wheezy updates/pre-mire tests could be carried out on a new VM with a modest RAM allocation (512M seems OK) on fritz, now that we have more disk space for disk images. The VM could even remain online running spamd, if nothing else. The KDC, however, needs to be on mire shortly after re-installation, since we have a bit of a chicken-and-egg problem that leads to slow reboots when no kdc or afs servers are available on the kvm hosts.