Wish List
From UGCS
Revision as of 05:28, 21 July 2008 by Jdhutchin@ugcs.caltech.edu (Talk | contribs)
Summer 2008
- User scripts
- Nice wrappers around mailman stuff- Joshua
- Nice wrappers for vhost- Mostly done, not documented (Joshua)
- Wiki setup scripts
- Remctl to set up tsearch2
- Automated quota notices
- Cron stuff for users
- per-user apache logs
- Get to.ugcs working- perhaps dynamic dns updates through nsupdate?
- Better rwho/rupdate
- Documentation
- Redundancy
- HTTP
- NFS- Done Jdhutchin@ugcs.caltech.edu 22:28, 20 July 2008 (PDT)
- Off-site mail/monitoring
- Documentation
- Power stuff- see [June 2008 Downtime Plan]
- Fix cron notices and tweak Nagios settings
- Set up snort to do something useful
- Log monitoring
- Documentation
My goal is to have the cluster set up for 99.9% uptime from the end of the summer onward. That allows roughly 9 hours of downtime (0.1% of the 8,760 hours in a year) for critical services over the year. I won't count things outside of our control (namely power and network issues), as fixing those would take more money than we have. Some things that will help us meet this goal:
- Failover on critical services. We already have this for ldap and kerberos; we need to add mail and web to this mix.
- Checks in place to prevent sysadmin mistakes. Having redundant systems will help so that we can test changes on one system first. Setting up testing services on hephaestus might help with this.
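As a rough check on the downtime figure above, the budget can be computed directly. This is a minimal sketch: the function name and the 365-day year are assumptions for illustration, not part of any existing UGCS tooling.

```python
# Assumption: a non-leap year of 365 days.
HOURS_PER_YEAR = 365 * 24  # 8760 hours


def downtime_budget_hours(availability, hours=HOURS_PER_YEAR):
    """Return the hours of downtime allowed at a given availability (0..1)."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * hours


if __name__ == "__main__":
    budget = downtime_budget_hours(0.999)
    print(f"{budget:.2f} hours per year")  # prints "8.76 hours per year"
```

At 99.9% this comes to just under 9 hours a year, so a single long unplanned outage of a critical service could use up most of the budget.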
Stuff for Spring Break
- Get lab set up to use projector and sound system
- Bacula backup for servers
- Start on buildserver? I'd put a bit of time into this (Assign: alexr, ?)
- Get iodine reliable and publicized to lusers (Assign: alexr)
- Get shellservers listening to SSH on ports 80 and 53 for people behind restrictive firewalls (I'll do when I get a few mins) (Assign: alexr)
- Start looking into some of the warnings we're receiving from cron, nagios, etc., and silence them (fix or set to ignore) (Assign: alexr, ?)
- Script to let people remove job listings automatically (Assign: alexr)
- This should go in a remctl command
- Netboot UGCS disc
- SVN for software in ugcs-admin
- Virtual hosting (did you mean via some sort of dropfile-esque interface? -alexr)
- Spamassassin/ldap settings- Done before break Jdhutchin@ugcs.caltech.edu 05:05, 16 March 2008 (PDT)
- Debug ldap and mail- this will involve some downtime.
- We should try to get ldaps working again if possible, or at least figure out why it doesn't work anymore. I know we're not using ldap for anything secure right now but I don't want to rule it out in the future. Also, I suspect it's behind the Alpine failures. (alexr)
- It seems to be working- ldaps will have to wait for later versions of libldap that don't suck
- Basically done Jdhutchin@ugcs.caltech.edu 05:05, 16 March 2008 (PDT)
Stuff for Winter break
- Fix account creation system- cgi principals and postgres databases
- Done Jdhutchin@ugcs.caltech.edu 21:03, 8 December 2007 (PST)
- Get cfengine to do dns/bind
- Done Jdhutchin@ugcs.caltech.edu 22:05, 8 December 2007 (PST)
- Setup gale
- Apache logs for users (Joshua)
- Get Spamassassin to let users set their own settings
- Scripts to set up wikis
- Tripwire
- Binaries are sitting on hephaestus in /var/local
- Tabulate cluster usage statistics
- Netboot UGCS Disk
Remaining
Critical
- Allow user creation of mailing lists, automated mailing list updates
- Clean up pagsh entries on poseidon to avoid needing to periodically reboot
High
- Security To-Do
- Add to debian bug#417917 regarding AFS module unload oops
- Get to.ugcs working, suggested to use http://www.stanford.edu/~riepel/lbnamed/Stanford-DNSserver/ or http://twistedmatrix.com/trac/wiki/TwistedNames
- New account scripts
- Create postgres user and database upon user creation
- Mailman list creation by users (this will be done with Mailman SSO)
- Write documentation for users- partially done, what other specific topics do we need?
- Online man pages / software documentation
- redoing the build daemon (maurer)
- Set up poseidon with services
- Gale (packaging 90% complete, need to test maintainer scripts)
- More restrictive iptables rules on Charon
- Fix issues with swap not mounting on muses
- Debug dovecot hanging issue
- Increase AFS Cache sizes--we're not using the disks for much else
- Write a script so users can update their ldap info
- Backups
Medium
- mailAlternateAddress scripts
- Let users see their own Apache logs
- Create list of standard required packages- Mostly done in cfengine package.conf
- Better solution: pkgsync
- Mailman SSO
- Automated quota notices for mail and home
Low
- Install power to center (pending MHF grant result)
- Set up Persephone for backup
- Web based manuals for software
- Do we really want to duplicate manpages online?
- You can do things like hyperlink and have an index with apropos info that man doesn't have. Also, I think the info pages are nicer to browse. Also, some software only has HTML docs / vastly superior HTML docs next to the other documentation --Goldstei@ugcs.caltech.edu 09:43, 8 October 2007 (PDT)
- Various old UGCS niceties
- finger
- dictd running somewhere
- global login records
- sl
- configure "and" (the auto nice daemon)
- tools for viewing global login records
- review old '/ug/adm/scripts' for useful stuff, and ask Josh Goldstein if what is going on makes no sense
- Put all configs in cfengine to ease rebuilding machines from known good state
- General application-specific tweaks
- Get distcc working properly
Completed
- Decide whether to go with AFS or NFS
- Decided to do NFS for root filesystem, AFS for user homedirs and maildirs.
- Do NFS setup for netboot root
- Get DNS on Demeter up so we can properly reference 'task' hostnames
- Agree on IP allocation
- Set up Hermes for mail
- Set up core switches
- Set up Charon for routing, get snort running
- Budget planning
- PXE setup on Demeter
- Start migrating over Pukes to serve as test client machines
- Start using CFEngine to manage sudo so that all machines stay in sync and we can setup different sudoers for different machines (e.g. donut, new sysadmins, etc. when the time comes)
- Investigate pam_access group restrictions to prevent non-sysadmin logins to core machines
- used pam_access and cfengine
- Migrate user data from NIS to LDAP
- Set up password migration frontend
- Send administrative e-mails warning users
- Update hostmaster with new IP allocations for rDNS
- Set up and migrate Kerberos/LDAP to Zeus
- In progress - Zeus is physically up on the Compaq_Proliant_3U.
- Set up new CA properly
- Order muses/naiads
- Migrate mailing lists
- Groundwork completed
- Set up nullmailer to redirect mail to hermes
- Set up poseidon
- Migrate network cabling
- Set up hephaestus
- Migrate Apollo IP
- Convert to task-based CNAMEs, e.g. ldap-head
- Configure pipermail on hermes
- Mount old UGCS NFS on Apollo
- Contact alumni association, i.e. Andy Shaindlin and Karen Carlson
- Set up charon to record logging data
- chsh script
- Migrate mail data
- Prepare to migrate user public html data
- mailForwardingAddress script: mail_forward (available on pukes)
- work out suexec/afs/kerberos interactions - finally freaking done
- secure and test pseudo-suexec
- mailForwardingAddress debugging - found problem (needed to apply the alias map twice; once in virtual alias maps, second time in local alias maps)
- Pine SSL certificate issue
- Finish tweaking netboot- should be pretty much done
- Reformat old pukes to be used as muses- takes about 10min/machine
- Bring down Purchase, migrate demeter IP
- Fix POP/IMAP issue with full resend and relogins for every request
- Migrate homedir data
- Investigate LDAP slave mirror problems
- Mount hardware on racks and remove obsolete hardware
- Send in RMA athena disk
- Fixed mail_forward script
- MHF grant (October 6 deadline)
- submit progress report on previous funds
- ask for splunk, projector, money for constructing overhead power drop, money for general lab improvements
- People to touch base with for 'context' section: Elizabeth Allen (alumni); Ruthanne Bevier (imss security); Michael Vanier (cs); Chris Gonzales (ascit); Michael Woods (ihc); Wenyee Lo (imss houserep program); Marissa Cevallos (the tech); Craig Montuori (donut)
- memory rebate (Liz)
- purchase video cards (Liz)
- Perhaps a more sane way to access webmail, i.e. webmail.ugcs?
- Splunk configuration, apache logs
- reimbursement for purchases (Liz)
- Job posting scripts
- Restore Athena to operation
- Set up Hera with backup KDC and LDAP
- Migrate nfs to hestia
- set up postgres on poseidon, set up databases for each user
- mrsh replacement (mssh <classname> <command> <arg1> ...)
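The `mssh <classname> <command> <arg1> ...` calling convention in the last item could be sketched roughly as follows. This is not the actual mssh script: the class-to-host table, the example.org hostnames, and the function name are all hypothetical, and the sketch only builds the ssh command lines rather than running them.

```python
import shlex

# Hypothetical mapping from class name to hosts; the real classes and
# hostnames are not listed on this page.
HOST_CLASSES = {
    "muses": ["muse1.example.org", "muse2.example.org"],
    "naiads": ["naiad1.example.org"],
}


def build_ssh_commands(classname, command, *args):
    """Return one ssh argv list per host in the named class."""
    try:
        hosts = HOST_CLASSES[classname]
    except KeyError:
        raise ValueError(f"unknown host class: {classname}")
    # Quote each part so arguments containing spaces survive the remote shell.
    remote = " ".join(shlex.quote(part) for part in (command, *args))
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    return [["ssh", "-o", "BatchMode=yes", host, remote] for host in hosts]
```

Each returned list could then be passed to `subprocess.run`, either one host at a time or in parallel, depending on how much output interleaving is acceptable.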