Website:Sysadmin Survey

(Difference between revisions)
(Old sysadmin survey (2001))
 
(The dotted line)
 
(35 intermediate revisions by 2 users not shown)
Line 1: Line 1:
<head>
+
=UGCS Sysadmin Survey=
<title> UGCS Sysadmin Search </title>
+
</head>
+
  
<body>
+
__NOTOC__
<h2> UGCS Sysadmin Search </h2>
+
  
<h4> What's involved in being a sysadmin? </h4>
+
===What's involved in being a sysadmin?===
  
Being a sysadmin means a lot of things.  It means answering multitudes of
+
Being a sysadmin means a lot of things.  It means answering multitudes of questions from users.  It means finding and installing nifty new software, and keeping the existing software working.  It means keeping the lab's hardware working reasonably well and keeping the lab nice and neat.  It means dealing with obscure problems that you might otherwise just ignore.  It means being on call 24 hours a day to deal with minor and major emergencies.  In general, it means spending a lot of time keeping the lab a productive and fun place to get things done.
questions from users.  It means finding and installing nifty new software,
+
and keeping the existing software working.  It means keeping the lab's
+
hardware working reasonably well, and begging the CS department for money
+
when current hardware fails.  It means dealing with obscure problems that
+
you might otherwise just ignore.  It means getting access to just about
+
anything in Jorgensen.  It means maintaining a good deal of generally
+
well-written, but largely undocumented, local hacks.  It means being on
+
call 24 hours a day to deal with minor and major emergencies.  In general,
+
it means spending a lot of time keeping the lab a productive and fun place
+
to get things done. <p>
+
  
<h4> What's the incentive? </h4>
+
===What's the incentive?===
  
As a sysadmin, you will learn the gory details of UNIX systems inside and
+
As a sysadmin, you will learn the gory details of UNIX systems inside and out, and you will gain a lot of experience in dealing with machines and people which may be helpful in later life.  If you're the type of person we're looking for, noodling around on computers will be its own reward. You'll become a well-known person among the undergrad community and the CS department.  And you'll also experience the personal satisfaction of making UGCS a better place in which to compute.
out, and you will gain a lot of experience in dealing with machines and
+
people which may be helpful in later life.  If you're the type of person
+
we're looking for, noodling around on computers will be its own reward.
+
You'll become a well-known person among the undergrad community and the CS
+
department.  And you'll also experience the personal satisfaction of making
+
UGCS a better place in which to compute. <p>
+
  
Of course, there is some pay involved, too.  However, for historical and
+
===How much time?===
political reasons, it's considerably less than it should be.  Currently the
+
UGCS budget is roughly $5000 for the year, to be split up between buying
+
hardware and paying the sysadmins.  However, <b>we've gotten approval for
+
the sysadmin job to count as work-study</b>, which means that the financial
+
aid department will match some of the money that you get paid.  This only
+
helps those of you who have work-study aid, but it's better than before.
+
  
<h4> How much time? </h4>
+
There are no fixed hours.  When the lab crashes, we have to fix it, but otherwise we set our own schedule.  You can work when you have the time, and let other people handle things when you don't.
  
There are no fixed hours.  When the lab crashes, we have to fix it, but
+
Students have run this lab for twenty years while holding regular class schedules.  It can be stressful at times, but it's quite manageable.  It is possible to keep this job over the summer while also working at something else at or very near Tech (like a SURF), but you have to be willing to spend a reasonable amount of time here.  Generally, though, more important than the actual number of hours that you spend is your dedication to the job.
otherwise we set our own schedule.  You can work when you have the time,
+
and let other people handle things when you don't. <p>
+
  
Students have run this lab for over ten years while holding regular class
+
===The dotted line===
schedules.  It can be stressful at times (like midterms), but it's quite
+
manageable.  It is possible to keep this job over the summer while also
+
working at something else at or very near Tech (like a SURF), but you have
+
to be willing to spend a reasonable amount of time here.  Generally,
+
though, more important than the actual number of hours that you spend is
+
your dedication to the job.<p>
+
  
<h4> The dotted line </h4>
+
There have traditionally been between two and four sysadmins at any time.  Since we are looking for people who will be able to remain sysadmins, we prefer sophomores and (especially) freshmen, but encourage everyone interested to apply.  Previous experience in system administration is helpful but not at all necessary; more important is a desire to learn and the ability to deal with people.
  
If the above hasn't scared you away from wanting to be a sysadmin, please
+
===In case of emergency... break glass...===
answer the following questions and email your answers to sysadmin@ugcs by
+
11:59 PM, Sunday, April 1, 2001.
+
 
+
We will send you e-mail acknowledging receipt of your application.  We will
+
decide which candidates to interview and let you know by Wednesday, April
+
11.  <p>
+
 
+
There are usually between two and four UGCS sysadmins at any given time.
+
The position is for the current term this year, continuing through the
+
summer, and into next year.  There is no expiration period, though,
+
and generally sysadmins stay sysadmins until they graduate, and even
+
then some.  We'd like to accept the applicants by this upcoming midterms
+
and train the second half of this term. <p>
+
 
+
Since we are looking for people who will be able to continue, we prefer
+
sophomores and (especially) freshmen, but encourage everyone interested
+
to apply.  Previous experience in system administration is helpful but
+
not at all necessary; more important is a desire to learn and the ability
+
to deal with people. <p>
+
 
+
<h4> In case of emergency... break glass... </h4>
+
  
 
Oh, and if you have any questions, contact one of us below.  Although some
 
Oh, and if you have any questions, contact one of us below.  Although some
of us seem surlier than others, we're all fine sysadmins. <p>
+
of us seem surlier than others, we're all fine sysadmins.
 
+
<hr>
+
 
+
<h3> Let the games begin! </h3>
+
 
+
<ol>
+
 
+
<li>Name: <p>
+
 
+
<li>Email address: <p>
+
 
+
<li>Class (Fr, So, Jr, Sr, S^n Sr): <p>
+
 
+
<li>Option (you don't have to be CS!): <p>
+
 
+
<li>What computing hardware, operating systems and software have you worked
+
    with, and what have you used them for? <p>
+
 
+
<li>What programming languages/scripting languages do you know?  How well?
+
    Of the ones you know, which do you like best and least?<p>
+
 
+
<li>Describe one or two of your favorite programming projects (done for a
+
    class, for a job, on your own -- it doesn't matter). <p>
+
 
+
<li>What do you find to be the most interesting aspects of computing?  When
+
    you "play around" with computers, what sort of things do you do? <p>
+
 
+
<li>Have you had any experience with system administration?  What sort of
+
    work did you do (was it mangling an enterprise-wide gigabit-capacity
+
    network for a Fortune 500 company, or was it dusting off Apple II
+
    monitors in high school)?  Have you done anything particularly
+
    interesting? <p>
+
 
+
<li>Have you worked with Unix-like systems at all?  Have you ever set
+
    one up?  (Yes, Linux counts.) <p>
+
 
+
<li>What is your biggest gripe about Unix?  What would you change? <p>
+
 
+
<li>Emacs or vi? <p>
+
 
+
<li>What's the most difficult computer-related problem you've solved?<p>
+
 
+
<li>Why do you want to be a UGCS system administrator? <p>
+
 
+
<li>Is there anything else we should know?  Be creative.  Lie, if
+
    necessary. <p>
+
 
+
</ol>
+
 
+
<hr>
+
Answer at least one question from each of the following sections.  Give
+
as much detail as you can.  You can ask people for help on any particular
+
concept, but you can't have people answer the questions for you.  Feel free
+
to look at any documentation or source that you want.  <b>Don't worry</b>
+
if you don't know or can't figure out the answers - we're much more
+
interested in your thought process than anything else.  But remember: the
+
more challenging problems you answer, the more chance you have to impress
+
us.  Geez, sounds like a final, eh? <p>
+
 
+
<hr> <h3>Technical questions</h3>
+
 
+
<ol>
+
 
+
<li>Describe NFS's method for dealing with fcntl()-style (e.g.
+
    kernel-supported) file locking.  What processes are involved?  How do
+
    they communicate?  Why is this external system needed - why can't it be
+
    integrated into standard NFS? <p>
+
 
+
    What are some other ways that file locking can be attempted over NFS?
+
    What are the problems with these methods? <p>
+
 
+
    The most critical cluster-locking problems typically involve email
+
    spools.  Describe other methods for dealing with the issue of mail
+
    locking.  Remember that you have to support all three of our major
+
    mail systems: MH, elm, and pine.
+
 
+
<li>How does the name service for "to.ugcs.caltech.edu" work?  What benefit
+
    does it have over regular name service?  Why might it not be
+
    appropriate for serving "www.ugcs.caltech.edu"? <p>
+
 
+
    The to.ugcs service is not as reliable as we'd like it to be.
+
    Describe the major problem or problems that keep it from operating
+
    consistently and how you'd deal with such problems. <p>
+
 
+
    If you were to design a load-balancing service, describe the advantages
+
    and disadvantages of both using UDP broadcast or TCP connections to
+
    assess the load on the clients. <p>
+
 
+
<li>If you haven't already, look up and read the Internet RFC standards
+
    document for the NFS (Network File System) protocol.  To understand it,
+
    you will probably also have to read the RFC for XDR (eXternal Data
+
    Representation) and the ONC implementation of RPC (Remote Procedure
+
    Call). <p>
+
 
+
    Describe how authentication works under NFS.  What is a file handle,
+
    and what role does it play?  Describe at least three ways security
+
    could be breached on a system running NFS; at least two of the ways
+
    have to work from any system on the Internet.  What can be done to fix
+
    these problems?  Do you think your breakin methods would actually work
+
    on systems running on the Internet today? <p>
+
 
+
<li>Managing a user's resource usage is a consistent problem in
+
    administering any cluster.  The biggest problems tend to be disk
+
    quota and mail spool quota.  Please describe solutions to both
+
    problems.  Keep in mind that user home directories are accessed via
+
    NFS and that resource maintenance must work without flaws and
+
    without intervention 99% of the time. <p>
+
 
+
<li>Describe, in as much detail as you want, what happens from the time
+
    you type "telnet <machine>" (or SSH, if you want) to the moment you're
+
    given a prompt.  Assume everything works normally: you're attached to
+
    a pty, etc.  You might want to comment on the differences between
+
    a network login and a console login: what different processes are
+
    involved, what is the role of telnetd, etc.<p>
+
 
+
<li>Sometimes machines, even Unix boxes, hang.  Often this is a result
+
    of high load or too few resources (file descriptors, memory,
+
    available processes, etc.)  Design a program which would allow a
+
    remote administrator to diagnose problems and free up resources.
+
    Remember, it must function in a resource-scarce environment.  State
+
    the considerations that such a program would need to account for
+
    and don't forget the problem of authenticating the user. <p>
+
 
+
<!--
+
<li>How would you write a program to dump the core of another user's
+
    processes? 
+
    Describe the steps, from userland to kernel space,
+
    that such an action would require.  What kernel features are
+
    required for this to work?  What userland considerations should
+
    you take into account? <p>
+
-->
+
 
+
<li>Suppose an important daemon, say "hosed", stops responding to remote
+
    connections, signals, etc.  You want to debug what's happened with
+
    this process, but it's essential that the process continue to run.
+
    A simple system-call trace won't tell you the events that led
+
    up to the freeze, so you need to analyze its core.  If you can,
+
    write a program for Linux to dump the core of "hosed" while it's
+
    still running, and without killing it.  Otherwise describe what
+
    changes (user- or kernel-level) need to be made for this to work.
+
    Be specific.<p>
+
 
+
<li>UGCS has a long history of using wacky directory/symlink arrangements
+
    to arrange source code, executables, and auxiliary files in a logical
+
    manner and of using various programs, OS features, and scripts to keep
+
    copies of commonly used files on the local disk of each computer.  This
+
    is a difficult problem, and one that we still haven't fully solved.
+
    There are a lot of issues to be considered, such as ability to find
+
    source code for installed programs, ability to keep track of what was
+
    required to compile a program for a particular architecture, ability to
+
    tell what package a file belongs to, ability to distribute commonly
+
    used files while keeping less commonly used files on a server, ability
+
    to easily install new software, and ability to utterly and totally
+
    confuse any non-sysadmin looking for a file. <p>
+
 
+
    Describe a cool file arrangement/distribution scheme.  It need not be
+
    at all similar to UGCS's scheme, but should be applicable to the UGCS
+
    setup - a slow network of many computers of several architectures with
+
    limited root disk space.  It should simplify or solve many of the
+
    issues listed above (don't worry about the last one.)  You should
+
    describe how files are arranged, and any scripts or programs which are
+
    required.  Be sure to describe the rationale for various aspects of
+
    your scheme. <p>
+
 
+
<li>Traditionally Unix has used the "utmp" file for tracking login/logout
+
    records.  This is a binary-data file, in a mostly (but not completely)
+
    portable format.  Depending on how you log in, /bin/login, telnetd,
+
    rlogind, etc. are responsible for adding an entry to it, and
+
    /sbin/init, telnetd, rlogind, etc. are responsible for deleting an
+
    entry from it.  What are the weaknesses in this scheme?  (There are a
+
    number of them.) <p>
+
 
+
    UGCS uses a different scheme for login tracking; the program /bin/login
+
    (which has been rewritten from scratch) is now solely responsible for
+
    adding and removing entries from the login tracking "database".  Describe
+
    how the UGCS login program works, and suggest some improvements.  (It
+
    isn't perfect!)  The source for the UGCS login program lies in
+
    <tt>/ug/src/flat/login-0.0</tt> and <tt>/ug/src/flat/libugcs-0.0</tt>.<p>
+
 
+
</ol> <p>
+
 
+
<hr> <h3> Hypothetical questions </h3>
+
 
+
Note: some of these are somewhat UGCS-specific.  If you don't have a UGCS
+
account, you can either fill out an account application in 166 Jorgensen
+
(and send us mail to make sure we process it quickly), or answer the
+
questions for a similar set of Unix systems. <p>
+
 
+
<ol>
+
 
+
<li>You telnet to "to.ugcs.caltech.edu" and get: <pre>
+
  
    Trying 131.215.43.33...
 
    Connected to necro.ugcs.caltech.edu.
 
    Escape character is '^]'.
 
    </pre>
 
  
    ... and then it hangs.  What might have gone wrong?  How can you find out
+
===Let the games begin!===
    for sure?  How can you fix it? <p>
+
  
    Later on, you telnet to "to.ugcs.caltech.edu" and get: <pre>
+
#Name:
 +
#Email address:
 +
#Class (Fr, So, Jr, Sr, S^n Sr):
 +
#Option (you don't have to be CS!):
 +
#What computing hardware, operating systems and software have you worked with, and what have you used them for?
 +
#What programming languages/scripting languages do you know?  How well? Of the ones you know, which do you like best and least?
 +
#Describe one or two of your favorite programming projects (done for a class, for a job, on your own -- it doesn't matter).
 +
#What do you find to be the most interesting aspects of computing?  When you "play around" with computers, what sort of things do you do?
 +
#Have you had any experience with system administration?  What sort of work did you do (was it mangling an enterprise-wide gigabit-capacity network for a Fortune 500 company, or was it dusting off Apple II monitors in high school)?  Have you done anything particularly interesting?
 +
#Have you worked with Unix-like systems at all?  Have you ever set one up?  (Yes, Linux counts.)
 +
#What is your biggest gripe about Unix?  What would you change?
 +
#Emacs or vi?
 +
#What's the most difficult computer-related problem you've solved?
 +
#Why do you want to be a UGCS system administrator?
 +
#Is there anything else we should know?  Be creative. Lie, if necessary.
  
    Trying 131.215.43.33...
 
    </pre>
 
  
    ... and it hangs there.  Now what might have gone wrong?  What tools
+
Answer as many of the following questions as you can. Give as much detail as you can. You can ask people for help on any particular concept, but you can't have people answer the questions for you. Feel free to look at any documentation or source that you want. Don't worry if you don't know or can't figure out the answers - we're much more interested in your thought process than anything else. But remember: the more challenging problems you answer, the more chance you have to impress us. Please remember to cite your sources! Geez, sounds like a final, eh?
    would you use to find out for sure, and how would you use them? What
+
    other machines might you consult for information? <p>
+
  
<li>A couple questions about resource usage: <p>
+
Remember, don't worry if you can't figure out all of the questions; some of them are tricky.  As we said before, we are more interested in your thought processes and reasoning than we are in the technicalities of your answer.
  
    One of the UGCS machines has a full root filesystem. What might have
+
====General UNIX====
    gone wrong?  What's the best thing to do about it? <p>
+
# What is NFS? What problems does it have?  How can these problems be solved?
 +
# What is AFS?  How does it differ from NFS?  We're not looking for a detailed, comprehensive answer; we're just looking for a general overview and brief discussion.  See http://www.openafs.org, specifically the User's Guide, for more information.
 +
# Does Kerberos have something like SSH keys?  If it doesn't, what would it take to give Kerberos something like them?  Is this desirable?  (We suggest you read [http://www.oliebol.org/Security%20Docs/moron%20guide%20to%20kerberos%20brian_security_kerberos.pdf The Moron's Guide to Kerberos].)
 +
# UGCS has a cluster of similar machines.  Sometimes we may want to take some of them down for maintenance.  To minimize disruption to our users, it'd be nice to be able to migrate their processes off one machine and onto another.  Is this possible with the standard Linux kernel?  What steps would you have to take to freeze a currently running process, send it to another machine, and then thaw it?
  
    The load on envy, UGCS's primary mail server, is 14.68.  Is this
+
====Resource Usage====
    okay?  If it is, explain what it's doing, and if not, how would
+
# Suppose the mail server at UGCS has a load average of 100.  Is this ok?  Why or why not?  (Big Hint: We have networked filesystems)
    you fix it?  Cover as many potential cases as you can.  <p>
+
# Suppose the web server on UGCS is locking up every 10 minutes.  What could be going wrong?  List a couple of possible scenarios.  How would you fix it?  Could you do it remotely?
  
<li>If you were a cracker or a miscreant trying to make the lives of the
+
===Hypothetical questions===
    UGCS sysadmins utterly miserable, what sorts of things could you do? <p>
+
  
    In a similar vein, what are some of the biggest security
 
    vulnerabilities in a Unix-like system?  Suggest ways (policies,
 
    background processes, etc.) to circumvent these security problems.<p>
 
  
</ol>
+
====Troubleshooting====
 +
One of the biggest issues that the UGCS sysadmins face is dealing with machines when they stop working.  Let's say you ssh to "to.ugcs.caltech.edu" and it hangs.  What might be wrong?
  
<hr>
+
Later on, you ssh to "to.ugcs.caltech.edu" and it keeps rejecting your password.  What could be wrong?
  
<address> <a href="/cgi-bin/finger?rliu">Harry Liu</a> </address>
+
For each scenario, list as many possible issues as you can think of and how you would distinguish between them.
<address> <a href="/cgi-bin/finger?stew">Shannon Stewman</a> </address>
+
<address> <a href="http://dylex.caltech.edu">Dylan Simon</a> </address>
+
  
</body>
+
====Security====
 +
If you were a cracker or a miscreant trying to make the lives of the UGCS sysadmins utterly miserable, what sorts of things could you do? You may assume that you have access to a UGCS account.  In a similar vein, what are some of the biggest security vulnerabilities in a Unix-like system?  Suggest ways (policies, background processes, etc.) to circumvent these security problems.

Latest revision as of 01:49, 20 December 2010

UGCS Sysadmin Survey

What's involved in being a sysadmin?

Being a sysadmin means a lot of things. It means answering multitudes of questions from users. It means finding and installing nifty new software, and keeping the existing software working. It means keeping the lab's hardware working reasonably well and keeping the lab nice and neat. It means dealing with obscure problems that you might otherwise just ignore. It means being on call 24 hours a day to deal with minor and major emergencies. In general, it means spending a lot of time keeping the lab a productive and fun place to get things done.

What's the incentive?

As a sysadmin, you will learn the gory details of UNIX systems inside and out, and you will gain a lot of experience in dealing with machines and people which may be helpful in later life. If you're the type of person we're looking for, noodling around on computers will be its own reward. You'll become a well-known person among the undergrad community and the CS department. And you'll also experience the personal satisfaction of making UGCS a better place in which to compute.

How much time?

There are no fixed hours. When the lab crashes, we have to fix it, but otherwise we set our own schedule. You can work when you have the time, and let other people handle things when you don't.

Students have run this lab for twenty years while holding regular class schedules. It can be stressful at times, but it's quite manageable. It is possible to keep this job over the summer while also working at something else at or very near Tech (like a SURF), but you have to be willing to spend a reasonable amount of time here. Generally, though, more important than the actual number of hours that you spend is your dedication to the job.

The dotted line

There have traditionally been between two and four sysadmins at any time. Since we are looking for people who will be able to remain sysadmins, we prefer sophomores and (especially) freshmen, but encourage everyone interested to apply. Previous experience in system administration is helpful but not at all necessary; more important is a desire to learn and the ability to deal with people.

In case of emergency... break glass...

Oh, and if you have any questions, contact one of us below. Although some of us seem surlier than others, we're all fine sysadmins.


Let the games begin!

  1. Name:
  2. Email address:
  3. Class (Fr, So, Jr, Sr, S^n Sr):
  4. Option (you don't have to be CS!):
  5. What computing hardware, operating systems and software have you worked with, and what have you used them for?
  6. What programming languages/scripting languages do you know? How well? Of the ones you know, which do you like best and least?
  7. Describe one or two of your favorite programming projects (done for a class, for a job, on your own -- it doesn't matter).
  8. What do you find to be the most interesting aspects of computing? When you "play around" with computers, what sort of things do you do?
  9. Have you had any experience with system administration? What sort of work did you do (was it mangling an enterprise-wide gigabit-capacity network for a Fortune 500 company, or was it dusting off Apple II monitors in high school)? Have you done anything particularly interesting?
  10. Have you worked with Unix-like systems at all? Have you ever set one up? (Yes, Linux counts.)
  11. What is your biggest gripe about Unix? What would you change?
  12. Emacs or vi?
  13. What's the most difficult computer-related problem you've solved?
  14. Why do you want to be a UGCS system administrator?
  15. Is there anything else we should know? Be creative. Lie, if necessary.


Answer as many of the following questions as you can. Give as much detail as you can. You can ask people for help on any particular concept, but you can't have people answer the questions for you. Feel free to look at any documentation or source that you want. Don't worry if you don't know or can't figure out the answers - we're much more interested in your thought process than anything else. But remember: the more challenging problems you answer, the more chance you have to impress us. Please remember to cite your sources! Geez, sounds like a final, eh?

Remember, don't worry if you can't figure out all of the questions; some of them are tricky. As we said before, we are more interested in your thought processes and reasoning than we are in the technicalities of your answer.

General UNIX

  1. What is NFS? What problems does it have? How can these problems be solved?
  2. What is AFS? How does it differ from NFS? We're not looking for a detailed, comprehensive answer; we're just looking for a general overview and brief discussion. See http://www.openafs.org, specifically the User's Guide, for more information.
  3. Does Kerberos have something like SSH keys? If it doesn't, what would it take to give Kerberos something like them? Is this desirable? (We suggest you read The Moron's Guide to Kerberos.)
  4. UGCS has a cluster of similar machines. Sometimes we may want to take some of them down for maintenance. To minimize disruption to our users, it'd be nice to be able to migrate their processes off one machine and onto another. Is this possible with the standard Linux kernel? What steps would you have to take to freeze a currently running process, send it to another machine, and then thaw it?

Resource Usage

  1. Suppose the mail server at UGCS has a load average of 100. Is this ok? Why or why not? (Big Hint: We have networked filesystems)
  2. Suppose the web server on UGCS is locking up every 10 minutes. What could be going wrong? List a couple of possible scenarios. How would you fix it? Could you do it remotely?
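
If you want to poke at load numbers yourself before answering, here is a minimal sketch in Python, assuming a Linux machine with /proc mounted. It is not a UGCS tool; the function names are made up for illustration. It parses the text of /proc/loadavg and tallies processes by state letter, which is handy because the Linux load average counts tasks in uninterruptible sleep (state D) as well as runnable ones (state R). Note that it scans one entry per process, so individual threads are not counted.

```python
import os


def parse_loadavg(text):
    """Parse the contents of /proc/loadavg.

    Returns (1min, 5min, 15min, runnable_now, total_tasks).
    """
    fields = text.split()
    one, five, fifteen = (float(x) for x in fields[:3])
    runnable, total = (int(x) for x in fields[3].split("/"))
    return one, five, fifteen, runnable, total


def parse_state(stat_line):
    """Return the one-letter state from a /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split after the last ')'.
    """
    rest = stat_line[stat_line.rindex(")") + 1:]
    return rest.split()[0]


def count_states(proc_root="/proc"):
    """Count processes by state letter (R = runnable, D = uninterruptible)."""
    counts = {}
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, name, "stat")) as f:
                state = parse_state(f.read())
        except OSError:
            continue  # process exited while we were scanning
        counts[state] = counts.get(state, 0) + 1
    return counts


# Example use on a Linux box (hypothetical session):
#   with open("/proc/loadavg") as f:
#       print(parse_loadavg(f.read()))
#   print(count_states())
```

Comparing the R and D counts against the load figure is only a starting point; what the processes are waiting on is the part we want you to reason about.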

Hypothetical questions

Troubleshooting

One of the biggest issues that the UGCS sysadmins face is dealing with machines when they stop working. Let's say you ssh to "to.ugcs.caltech.edu" and it hangs. What might be wrong?

Later on, you ssh to "to.ugcs.caltech.edu" and it keeps rejecting your password. What could be wrong?

For each scenario, list as many possible issues as you can think of and how you would distinguish between them.
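
One way to start telling the "it hangs" cases apart is to see how far a connection gets. The following is a rough sketch in Python, not something UGCS runs; the function name `probe` and its default values are assumptions for illustration. It separates three stages: name lookup, TCP connect, and waiting for the server's greeting (an SSH server sends its identification string as soon as the connection opens).

```python
import socket


def probe(host, port=22, timeout=5.0):
    """Report how far a TCP login attempt gets: resolve, connect, banner."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return f"name lookup failed: {exc}"

    # Only the first address is tried; a fuller tool would try them all.
    family, socktype, proto, _canon, addr = infos[0]
    sock = socket.socket(family, socktype, proto)
    sock.settimeout(timeout)
    try:
        try:
            sock.connect(addr)
        except socket.timeout:
            return f"connect to {addr[0]} timed out"
        except OSError as exc:
            return f"connect to {addr[0]} failed: {exc}"

        try:
            banner = sock.recv(256)
        except socket.timeout:
            return f"connected to {addr[0]}, but no banner within {timeout}s"
        except OSError as exc:
            return f"connected to {addr[0]}, but read failed: {exc}"

        if not banner:
            return f"connected to {addr[0]}, but peer closed without a banner"
        return f"banner from {addr[0]}: {banner!r}"
    finally:
        sock.close()


# Example use (hypothetical session):
#   print(probe("to.ugcs.caltech.edu"))
```

Each outcome only narrows the list of suspects; it does not say why a stage failed, so you will still need to explain what you would check next.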

Security

If you were a cracker or a miscreant trying to make the lives of the UGCS sysadmins utterly miserable, what sorts of things could you do? You may assume that you have access to a UGCS account. In a similar vein, what are some of the biggest security vulnerabilities in a Unix-like system? Suggest ways (policies, background processes, etc.) to circumvent these security problems.
