## page was renamed from OnSiteVisits/20140409
Upgrade Fritz's RAM and attempt to restore KvmAccess.

== People ==

 * AnishJacob
 * SrikanthSastry

== Goals ==

 * Restore KvmAccess
 * Upgrade fritz to 24G of RAM

== Outcome ==

=== Short Version ===

 * RAM upgrade was a success
 * Downtime was ~2h instead of 30 minutes
 * Belkin KVM appears to not be functioning correctly
 * Deleuze rebooted without (major) incident

=== Long Version ===

The actual RAM upgrade for fritz seemed to go smoothly. Fritz was down for ~15 minutes for hardware surgery, well under the expected time. The problems developed when it booted. None were fatal, but combined they meant fritz took ~1h15m before afs was restored and nearly another hour before all services were restored.

Attempts to restore KvmAccess ended with hopper being accessible and the other machines appearing not to work. ClintonEbadi requested that hopper's cables be swapped onto fritz, but fritz still does not display anything. The KVM may be dying.

We managed to reboot deleuze. It required manually pressing the power button to make it finish halting, but it booted cleanly and quickly afterward.

Problems:

 * We hit the ext3 mandatory file system check interval, which added ~40 minutes to the reboot.
 * `/var/lib/libvirt` (`/dev/md3`) was not auto-detected and failed to come up, requiring manual intervention to continue booting.
   * It appears the partition type was not set correctly (`Linux` instead of `Linux RAID autodetect`). The partition type was updated, but there is a high chance that this will not actually fix the boot process.
 * libnss-afs, nscd, nsswitch.conf, and the init order are interacting badly, causing fritz to pause for an additional 15-20 minutes during boot.
 * The Belkin KVM is behaving oddly.
 * '''TBD'''

== Supporting Material ==

 * [[attachment:ram-upgrade-guide.pdf|Excerpts from Maintenance Manual]] relevant to installing memory

{{attachment:inside-fritz.png}}

== Itinerary ==

=== Upgrade Fritz Memory ===

'''SERVICE IMPACTING'''. Goal: 30 minutes downtime. Stretch: One hour. (Fritz can take up to 15 minutes to reboot)

 * Remove Belkin KVM from rack to allow access to fritz
 * Power fritz down
 * Install memory into fritz
 * Power fritz up, ensure that POST succeeds and system boots

=== Restore KVM ===

''Not Service Impacting''

 * Trace and re-attach cables going into Belkin KVM
 * Ensure that all machines can be controlled using KVM
 * Re-rack Belkin KVM
 * Double check no cables were jostled loose while re-racking

=== Reboot Deleuze ===

''Minor Service Impact'' (Mail will be rejected briefly)

 * After (or while) restoring KVM, reboot deleuze.
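
The ext3 check interval listed under Problems is something that could be looked at before a planned reboot, so the extra time can be budgeted into the downtime estimate. The following is a minimal sketch, not what was done on this visit. It assumes the text comes from `tune2fs -l` on the relevant device, and the sample values, the `SAMPLE` constant, and the helper names `parse_fields` and `check_due` are all hypothetical and do not reflect fritz's actual state.

```python
from datetime import datetime, timedelta

# Hypothetical excerpt of `tune2fs -l` output; the values are illustrative
# only and are NOT taken from fritz.
SAMPLE = """\
Mount count:              12
Maximum mount count:      30
Last checked:             Tue Oct  1 09:30:00 2013
Check interval:           15552000 (6 months)
"""


def parse_fields(text):
    """Turn 'Key:   value' lines into a dict, splitting on the first colon."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def check_due(text, now):
    """Return True if the next mount would likely trigger a forced check.

    A check is treated as due when the mount count has reached the maximum
    or the check interval has elapsed since the last check. A maximum of
    zero or less, or an interval of 0, is treated as "disabled".
    """
    f = parse_fields(text)
    mount_count = int(f["Mount count"])
    max_mounts = int(f["Maximum mount count"])
    interval_s = int(f["Check interval"].split()[0])
    # Assumes the default C/English locale for day and month names.
    last_checked = datetime.strptime(f["Last checked"], "%a %b %d %H:%M:%S %Y")

    by_count = max_mounts > 0 and mount_count >= max_mounts
    by_time = interval_s > 0 and now >= last_checked + timedelta(seconds=interval_s)
    return by_count or by_time


if __name__ == "__main__":
    print(check_due(SAMPLE, datetime.now()))
```

On a real host the input text would come from running `tune2fs -l` against the device as root, which is left out here so the sketch stays self-contained.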