Website:Sysadmin Survey

(Difference between revisions)
(Old sysadmin survey (2001))
 
(Wikified the old html and removed some outdated questions)
Line 1: Line 1:
<head>
+
==UGCS Sysadmin Search==
<title> UGCS Sysadmin Search </title>
+
</head>
+
  
<body>
+
===What's involved in being a sysadmin?===
<h2> UGCS Sysadmin Search </h2>
+
 
+
<h4> What's involved in being a sysadmin? </h4>
+
  
 
Being a sysadmin means a lot of things.  It means answering multitudes of
 
Being a sysadmin means a lot of things.  It means answering multitudes of
 
questions from users.  It means finding and installing nifty new software,
 
questions from users.  It means finding and installing nifty new software,
 
and keeping the existing software working.  It means keeping the lab's
 
and keeping the existing software working.  It means keeping the lab's
hardware working reasonably well, and begging the CS department for money
+
hardware working reasonably well as well as keeping the lab nice and neat.  It means dealing with obscure problems that
when current hardware fails.  It means dealing with obscure problems that
+
you might otherwise just ignore.  It means being on
you might otherwise just ignore.  It means getting access to just about
+
anything in Jorgensen.  It means maintaining a good deal of generally
+
well-written, but largely undocumented, local hacks.  It means being on
+
 
call 24 hours a day to deal with minor and major emergencies.  In general,
 
call 24 hours a day to deal with minor and major emergencies.  In general,
 
it means spending a lot of time keeping the lab a productive and fun place
 
it means spending a lot of time keeping the lab a productive and fun place
to get things done. <p>
+
to get things done.
  
<h4> What's the incentive? </h4>
+
===What's the incentive?===
  
 
As a sysadmin, you will learn the gory details of UNIX systems inside and
 
As a sysadmin, you will learn the gory details of UNIX systems inside and
Line 28: Line 20:
 
You'll become a well-known person among the undergrad community and the CS
 
You'll become a well-known person among the undergrad community and the CS
 
department.  And you'll also experience the personal satisfaction of making
 
department.  And you'll also experience the personal satisfaction of making
UGCS a better place in which to compute. <p>
+
UGCS a better place in which to compute.
 
+
Of course, there is some pay involved, too.  However, for historical and
+
political reasons, it's considerably less than it should be.  Currently the
+
UGCS budget is roughly $5000 for the year, to be split up between buying
+
hardware and paying the sysadmins.  However, <b>we've gotten approval for
+
the sysadmin job to count as work-study</b>, which means that the financial
+
aid department will match some of the money that you get paid.  This only
+
helps those of you who have work-study aid, but it's better than before.
+
  
<h4> How much time? </h4>
+
===How much time?==
  
 
There are no fixed hours.  When the lab crashes, we have to fix it, but  
 
There are no fixed hours.  When the lab crashes, we have to fix it, but  
 
otherwise we set our own schedule.  You can work when you have the time,
 
otherwise we set our own schedule.  You can work when you have the time,
and let other people handle things when you don't. <p>
+
and let other people handle things when you don't.
  
Students have run this lab for over ten years while holding regular class
+
Students have run this lab for over fifteen years while holding regular class
 
schedules.  It can be stressful at times (like midterms), but it's quite
 
schedules.  It can be stressful at times (like midterms), but it's quite
 
manageable.  It is possible to keep this job over the summer while also
 
manageable.  It is possible to keep this job over the summer while also
Line 50: Line 34:
 
to be willing to spend a reasonable amount of time here.  Generally,
 
to be willing to spend a reasonable amount of time here.  Generally,
 
though, more important than the actual number of hours that you spend is
 
though, more important than the actual number of hours that you spend is
your dedication to the job.<p>
+
your dedication to the job.
  
<h4> The dotted line </h4>
+
==The dotted line===
  
 
If the above hasn't scared you away from wanting to be a sysadmin, please
 
If the above hasn't scared you away from wanting to be a sysadmin, please
Line 60: Line 44:
 
We will send you e-mail acknowledging receipt of your application.  We will  
 
We will send you e-mail acknowledging receipt of your application.  We will  
 
decide which candidates to interview and let you know by Wednesday, April
 
decide which candidates to interview and let you know by Wednesday, April
11. <p>
+
11.
  
 
There are usually between two and four UGCS sysadmins at any given time.
 
There are usually between two and four UGCS sysadmins at any given time.
Line 67: Line 51:
 
and generally sysadmins stay sysadmins until they graduate, and even
 
and generally sysadmins stay sysadmins until they graduate, and even
 
then some.  We'd like to accept the applicants by this upcoming midterms
 
then some.  We'd like to accept the applicants by this upcoming midterms
and train the second half of this term. <p>
+
and train the second half of this term.
  
 
Since we are looking for people who will be able to continue, we prefer
 
Since we are looking for people who will be able to continue, we prefer
Line 73: Line 57:
 
to apply.  Previous experience in system administration is helpful but
 
to apply.  Previous experience in system administration is helpful but
 
not at all necessary; more important is a desire to learn and the ability
 
not at all necessary; more important is a desire to learn and the ability
to deal with people. <p>
+
to deal with people.  
  
<h4> In case of emergency... break glass... </h4>
+
===In case of emergency... break glass...===
  
 
Oh, and if you have any questions, contact one of us below.  Although some
 
Oh, and if you have any questions, contact one of us below.  Although some
of us seem surlier than others, we're all fine sysadmins. <p>
+
of us seem surlier than others, we're all fine sysadmins.
  
 
<hr>
 
<hr>
  
<h3> Let the games begin! </h3>
+
===Let the games begin!===
  
<ol>
+
#Name:
  
<li>Name: <p>
+
#Email address:
  
<li>Email address: <p>
+
#Class (Fr, So, Jr, Sr, S^n Sr):  
  
<li>Class (Fr, So, Jr, Sr, S^n Sr): <p>
+
#Option (you don't have to be CS!):
  
<li>Option (you don't have to be CS!): <p>
+
#What computing hardware, operating systems and software have you worked
 +
    with, and what have you used them for?
  
<li>What computing hardware, operating systems and software have you worked
+
#What programming languages/scripting languages do you know?  How well?
     with, and what have you used them for? <p>
+
     Of the ones you know, which do you like best and least?
  
<li>What programming languages/scripting languages do you know?  How well?
+
#Describe one or two of your favorite programming projects (done for a
     Of the ones you know, which do you like best and least?<p>
+
     class, for a job, on your own -- it doesn't matter).
  
<li>Describe one or two of your favorite programming projects (done for a
+
#What do you find to be the most interesting aspects of computing?  When
     class, for a job, on your own -- it doesn't matter). <p>
+
     you "play around" with computers, what sort of things do you do?
  
<li>What do you find to be the most interesting aspects of computing?  When
+
#Have you had any experience with system administration?  What sort of
    you "play around" with computers, what sort of things do you do? <p>
+
 
+
<li>Have you had any experience with system administration?  What sort of
+
 
     work did you do (was it mangling an enterprise-wide gigabit-capacity
 
     work did you do (was it mangling an enterprise-wide gigabit-capacity
 
     network for a Fortune 500 company, or was it dusting off Apple II
 
     network for a Fortune 500 company, or was it dusting off Apple II
 
     monitors in high school)?  Have you done anything particularly
 
     monitors in high school)?  Have you done anything particularly
     interesting? <p>
+
     interesting?
  
<li>Have you worked with Unix-like systems at all?  Have you ever set
+
#Have you worked with Unix-like systems at all?  Have you ever set
     one up?  (Yes, Linux counts.) <p>
+
     one up?  (Yes, Linux counts.)
  
<li>What is your biggest gripe about Unix?  What would you change? <p>
+
#What is your biggest gripe about Unix?  What would you change?
  
<li>Emacs or vi? <p>
+
#Emacs or vi?
  
<li>What's the most difficult computer-related problem you've solved?<p>
+
#What's the most difficult computer-related problem you've solved?
  
<li>Why do you want to be a UGCS system administrator? <p>
+
#Why do you want to be a UGCS system administrator?
 +
 
 +
#Is there anything else we should know?  Be creative.  Lie, if
 +
    necessary.
  
<li>Is there anything else we should know?  Be creative.  Lie, if
 
    necessary. <p>
 
  
</ul>
 
  
 
<hr>
 
<hr>
Line 132: Line 114:
 
as much detail as you can.  You can ask people for help on any particular
 
as much detail as you can.  You can ask people for help on any particular
 
concept, but you can't have people answer the questions for you.  Feel free
 
concept, but you can't have people answer the questions for you.  Feel free
to look at any documentation or source that you want.  <b>Don't worry</b>
+
to look at any documentation or source that you want.  '''Don't worry'''
 
if you don't know or can't figure out the answers - we're much more
 
if you don't know or can't figure out the answers - we're much more
 
interested in your thought process than anything else.  But remember: the
 
interested in your thought process than anything else.  But remember: the
 
more challenging problems you answer, the more chance you have to impress
 
more challenging problems you answer, the more chance you have to impress
us.  Geez, sounds like a final, eh? <p>
+
us.  Geez, sounds like a final, eh?
  
<hr> <h3>Technical questions</h3>
+
===Technical questions===
  
<ol>
 
  
<li>Describe NFS's method for dealing with fcntl()-style (e.g.
+
#Describe NFS's method for dealing with fcntl()-style (e.g.
 
     kernel-supported) file locking.  What processes are involved?  How do
 
     kernel-supported) file locking.  What processes are involved?  How do
 
     they communicate?  Why is this external system needed - why can't it be
 
     they communicate?  Why is this external system needed - why can't it be
     integrated into standard NFS? <p>
+
     integrated into standard NFS?
  
 
     What are some other ways that file locking can be attempted over NFS?
 
     What are some other ways that file locking can be attempted over NFS?
     What are the problems with these methods? <p>
+
     What are the problems with these methods?
  
 
     The most critical cluster-locking problems typically involve email
 
     The most critical cluster-locking problems typically involve email
Line 155: Line 136:
 
     mail systems: MH, elm, and pine.
 
     mail systems: MH, elm, and pine.
  
<li>How does the name service for "to.ugcs.caltech.edu" work?  What benefit
+
#How does the name service for "to.ugcs.caltech.edu" work?  What benefit
 
     does it have over regular name service?  Why might it not be
 
     does it have over regular name service?  Why might it not be
     appropriate for serving "www.ugcs.caltech.edu"? <p>
+
     appropriate for serving "www.ugcs.caltech.edu"?  
  
 
     The to.ugcs service is not as reliable as we'd like it to be.
 
     The to.ugcs service is not as reliable as we'd like it to be.
 
     Describe the major problem or problems that keep it from operating
 
     Describe the major problem or problems that keep it from operating
     consistently and how you'd deal with such problems. <p>
+
     consistently and how you'd deal with such problems.  
  
 
     If you were to design a load-balancing service, describe the advantages
 
     If you were to design a load-balancing service, describe the advantages
 
     and disadvantages of both using UDP broadcast or TCP connections to
 
     and disadvantages of both using UDP broadcast or TCP connections to
     assess the load on the clients. <p>
+
     assess the load on the clients.
  
<li>If you haven't already, look up and read the Internet RFC standards  
+
#If you haven't already, look up and read the Internet RFC standards  
 
     document for the NFS (Network File System) protocol.  To understand it,
 
     document for the NFS (Network File System) protocol.  To understand it,
 
     you will probably also have to read the RFC for XDR (eXternal Data
 
     you will probably also have to read the RFC for XDR (eXternal Data
 
     Representation) and the ONC implementation of RPC (Remote Procedure
 
     Representation) and the ONC implementation of RPC (Remote Procedure
     Call). <p>
+
     Call).
  
 
     Describe how authentication works under NFS.  What is a file handle,
 
     Describe how authentication works under NFS.  What is a file handle,
Line 178: Line 159:
 
     have to work from any system on the Internet.  What can be done to fix
 
     have to work from any system on the Internet.  What can be done to fix
 
     these problems?  Do you think your breakin methods would actually work
 
     these problems?  Do you think your breakin methods would actually work
     on systems running on the Internet today? <p>
+
     on systems running on the Internet today?
  
<li>Managing a user's resource usage is a consistent problem in
+
#Managing a user's resource usage is a consistent problem in
 
     adminstering any cluster.  The biggest problems tend to be disk
 
     adminstering any cluster.  The biggest problems tend to be disk
 
     quota and mail spool quota.  Please describe solutions to both
 
     quota and mail spool quota.  Please describe solutions to both
 
     problems.  Keep in mind that user home directories are access via
 
     problems.  Keep in mind that user home directories are access via
 
     NFS and that resource maintainence must work without flaws and
 
     NFS and that resource maintainence must work without flaws and
     without intervention 99% of the time. <p>
+
     without intervention 99% of the time.
  
<li>Describe, in as much detail as you want, what happens from the time
+
#Describe, in as much detail as you want, what happens from the time
     you type "telnet <machine>" (or SSH, if you want) to the moment you're
+
     you type "ssh <machine>" to the moment you're
 
     given a prompt.  Assume everything works normally: you're attached to
 
     given a prompt.  Assume everything works normally: you're attached to
 
     a pty, etc.  You might want to comment on the differences between
 
     a pty, etc.  You might want to comment on the differences between
 
     a network login and a console login: what different processes are  
 
     a network login and a console login: what different processes are  
     involved, what is the role of telnetd, etc.<p>
+
     involved, what is the role of sshd, etc.  
  
<li>Sometimes machines, even Unix boxes, hang.  Often this is a result
+
#Sometimes machines, even Unix boxes, hang.  Often this is a result
 
     of high load or too few resources (file descriptors, memory,
 
     of high load or too few resources (file descriptors, memory,
 
     available processes, etc.)  Design a program which would allow a
 
     available processes, etc.)  Design a program which would allow a
Line 200: Line 181:
 
     Remember, it must function in a resource-scarce environment.  State
 
     Remember, it must function in a resource-scarce environment.  State
 
     the considerations that such a program would need to account for
 
     the considerations that such a program would need to account for
     and don't forget the problem of authenticating the user. <p>
+
     and don't forget the problem of authenticating the user.
  
<!--
+
#Suppose an important daemon, say "hosed", stops responding to remote
<li>How would you write a program to dump the core of another user's
+
    processes? 
+
    Describe the steps, from userland to kernel space,
+
    that such an action would require.  What kernel features are
+
    required for this to work?  What userland considerations should
+
    you take into account? <p>
+
-->
+
 
+
<li>Suppose an important daemon, say "hosed", stops responding to remote
+
 
     connections, signals, etc.  You want to debug what's happened with
 
     connections, signals, etc.  You want to debug what's happened with
 
     this process, but it's essential that the process continue to run.
 
     this process, but it's essential that the process continue to run.
Line 221: Line 193:
 
     Be specific.<p>
 
     Be specific.<p>
  
<li>UGCS has a long history of using wacky directory/symlink arrangements
 
    to arrange source code, executables, and auxiliary files in a logical
 
    manner and of using various programs, OS features, and scripts to keep
 
    copies of commonly used files on the local disk of each computer.  This
 
    is a difficult problem, and one that we still haven't fully solved.
 
    There are a lot of issues to be considered, such as ability to find
 
    source code for installed program, ability to keep track of what was
 
    required to compile a program for a particular architecture, ability to
 
    tell what package a file belongs to, ability to distribute commonly
 
    used files while keeping less commonly used files on a server, ability
 
    to easily install new software, and ability to utterly and totally
 
    confuse any non-sysadmin looking for a file. <p>
 
 
    Describe a cool file arrangement/distribution scheme.  It need not be
 
    at all similar to UGCS's scheme, but should be applicable to the UGCS
 
    setup - a slow network of many computers of several architectures with
 
    limited root disk space.  It should simplify or solve many of the
 
    issues listed above (don't worry about the last one.)  You should
 
    describe how files are arranged, and any scripts or programs which are
 
    required.  Be sure to describe the rationale for various aspects of
 
    your scheme. <p>
 
 
<li>Traditionally Unix has used the "utmp" file for tracking login/logout
 
    records.  This is a binary-data file, in a mostly (but not completely)
 
    portable format.  Depending on how you log in, /bin/login, telnetd,
 
    rlogind, etc. are responsible for adding an entry to it, and
 
    /sbin/init, telnetd, rlogind, etc. are responsible for deleting an
 
    entry from it.  What are the weaknesses in this scheme?  (There are a
 
    number of them.) <p>
 
 
    UGCS uses a different scheme for login tracking; the program /bin/login
 
    (which has been rewritten from scratch) is now solely responsible for
 
    adding and removing entries from the login tracking "database".  Describe
 
    how the UGCS login program works, and suggest some improvements.  (It
 
    isn't perfect!)  The source for the UGCS login program lies in
 
    <tt>/ug/src/flat/login-0.0</tt> and <tt>/ug/src/flat/libugcs-0.0</tt>.<p>
 
  
</ol> <p>
 
  
<hr> <h3> Hypothetical questions </h3>
+
===Hypothetical questions===
  
Note: some of these are somewhat UGCS-specific. If you don't have a UGCS
+
Note: some of these are somewhat UGCS-specific.  
account, you can either fill out an account application in 166 Jorgensen
+
(and send us mail to make sure we process it quickly), or answer the
+
questions for a similar set of Unix systems. <p>
+
  
<ol>
 
  
<li>You telnet to "to.ugcs.caltech.edu" and get: <pre>
+
#You telnet to "to.ugcs.caltech.edu" and get: <pre>
  
 
     Trying 131.215.43.33...
 
     Trying 131.215.43.33...
Line 286: Line 217:
 
     ... and it hangs there.  Now what might have gone wrong?  What tools
 
     ... and it hangs there.  Now what might have gone wrong?  What tools
 
     would you use to find out for sure, and how would you use them?  What
 
     would you use to find out for sure, and how would you use them?  What
     other machines might you consult for information? <p>
+
     other machines might you consult for information?
  
<li>A couple questions about resource usage: <p>
+
#A couple questions about resource usage:
  
 
     One of the UGCS machines has a full root filesystem.  What might have
 
     One of the UGCS machines has a full root filesystem.  What might have
     gone wrong?  What's the best thing to do about it? <p>
+
     gone wrong?  What's the best thing to do about it?  
  
     The load on envy, UGCS's primary mail server, is 14.68.  Is this
+
     The load on hermes, UGCS's primary mail server, is 14.68.  Is this
 
     okay?  If it is, explain what's it's doing, and if not, how would
 
     okay?  If it is, explain what's it's doing, and if not, how would
     you fix it?  Cover as many potential cases as you can. <p>
+
     you fix it?  Cover as many potential cases as you can.  
  
<li>If you were a cracker or a miscreant trying to make the lives of the
+
#If you were a cracker or a miscreant trying to make the lives of the
     UGCS sysadmins utterly miserable, what sorts of things could you do? <p>
+
     UGCS sysadmins utterly miserable, what sorts of things could you do?
  
 
     In a similar vein, what are some of the biggest security
 
     In a similar vein, what are some of the biggest security
 
     vulnerabilities in a Unix-like system?  Suggest ways (policies,
 
     vulnerabilities in a Unix-like system?  Suggest ways (policies,
     background processes, etc.) to circumvent these security problems.<p>
+
     background processes, etc.) to circumvent these security problems.
 
+
</ol>
+
 
+
<hr>
+
 
+
<address> <a href="/cgi-bin/finger?rliu">Harry Liu</a> </address>
+
<address> <a href="/cgi-bin/finger?stew">Shannon Stewman</a> </address>
+
<address> <a href="http://dylex.caltech.edu">Dylan Simon</a> </address>
+
 
+
</body>
+

Revision as of 06:32, 14 January 2008


UGCS Sysadmin Search

What's involved in being a sysadmin?

Being a sysadmin means a lot of things. It means answering multitudes of questions from users. It means finding and installing nifty new software, and keeping the existing software working. It means keeping the lab's hardware working reasonably well and keeping the lab nice and neat. It means dealing with obscure problems that you might otherwise just ignore. It means being on call 24 hours a day to deal with minor and major emergencies. In general, it means spending a lot of time keeping the lab a productive and fun place to get things done.

What's the incentive?

As a sysadmin, you will learn the gory details of UNIX systems inside and out, and you will gain a lot of experience in dealing with machines and people which may be helpful in later life. If you're the type of person we're looking for, noodling around on computers will be its own reward. You'll become a well-known person among the undergrad community and the CS department. And you'll also experience the personal satisfaction of making UGCS a better place in which to compute.

How much time?

There are no fixed hours. When the lab crashes, we have to fix it, but otherwise we set our own schedule. You can work when you have the time, and let other people handle things when you don't.

Students have run this lab for over fifteen years while holding regular class schedules. It can be stressful at times (like midterms), but it's quite manageable. It is possible to keep this job over the summer while also working at something else at or very near Tech (like a SURF), but you have to be willing to spend a reasonable amount of time here. Generally, though, more important than the actual number of hours that you spend is your dedication to the job.

The dotted line

If the above hasn't scared you away from wanting to be a sysadmin, please answer the following questions and email your answers to sysadmin@ugcs by 11:59 PM, Sunday, April 1, 2001.

We will send you an email acknowledging receipt of your application. We will decide which candidates to interview and let you know by Wednesday, April 11.

There are usually between two and four UGCS sysadmins at any given time. The position is for the current term this year, continuing through the summer, and into next year. There is no expiration period, though, and generally sysadmins stay sysadmins until they graduate, and then some. We'd like to accept applicants by the upcoming midterms and train them during the second half of this term.

Since we are looking for people who will be able to continue, we prefer sophomores and (especially) freshmen, but encourage everyone interested to apply. Previous experience in system administration is helpful but not at all necessary; more important is a desire to learn and the ability to deal with people.

In case of emergency... break glass...

Oh, and if you have any questions, contact one of us below. Although some of us seem surlier than others, we're all fine sysadmins.


Let the games begin!

  1. Name:
  2. Email address:
  3. Class (Fr, So, Jr, Sr, S^n Sr):
  4. Option (you don't have to be CS!):
  5. What computing hardware, operating systems and software have you worked
   with, and what have you used them for?
  6. What programming languages/scripting languages do you know? How well?
   Of the ones you know, which do you like best and least?
  7. Describe one or two of your favorite programming projects (done for a
   class, for a job, on your own -- it doesn't matter).
  8. What do you find to be the most interesting aspects of computing? When
   you "play around" with computers, what sort of things do you do?
  9. Have you had any experience with system administration? What sort of
   work did you do (was it mangling an enterprise-wide gigabit-capacity
   network for a Fortune 500 company, or was it dusting off Apple II
   monitors in high school)?  Have you done anything particularly
   interesting?
  10. Have you worked with Unix-like systems at all? Have you ever set
   one up?  (Yes, Linux counts.)
  11. What is your biggest gripe about Unix? What would you change?
  12. Emacs or vi?
  13. What's the most difficult computer-related problem you've solved?
  14. Why do you want to be a UGCS system administrator?
  15. Is there anything else we should know? Be creative. Lie, if
   necessary.



Answer at least one question from each of the following sections. Give as much detail as you can. You can ask people for help on any particular concept, but you can't have people answer the questions for you. Feel free to look at any documentation or source that you want. Don't worry if you don't know or can't figure out the answers - we're much more interested in your thought process than anything else. But remember: the more challenging problems you answer, the more chance you have to impress us. Geez, sounds like a final, eh?

Technical questions

  1. Describe NFS's method for dealing with fcntl()-style (i.e.,
   kernel-supported) file locking.  What processes are involved?  How do
   they communicate?  Why is this external system needed - why can't it be
   integrated into standard NFS?
   What are some other ways that file locking can be attempted over NFS?
   What are the problems with these methods?
   The most critical cluster-locking problems typically involve email
   spools.  Describe other methods for dealing with the issue of mail
   locking.  Remember that you have to support all three of our major
   mail systems: MH, elm, and pine.
  2. How does the name service for "to.ugcs.caltech.edu" work? What benefit
   does it have over regular name service?  Why might it not be
   appropriate for serving "www.ugcs.caltech.edu"?
   The to.ugcs service is not as reliable as we'd like it to be.
   Describe the major problem or problems that keep it from operating
   consistently and how you'd deal with such problems.
   If you were to design a load-balancing service, describe the advantages
   and disadvantages of using either UDP broadcast or TCP connections to
   assess the load on the clients.
  3. If you haven't already, look up and read the Internet RFC standards
   document for the NFS (Network File System) protocol.  To understand it,
   you will probably also have to read the RFC for XDR (eXternal Data
   Representation) and the ONC implementation of RPC (Remote Procedure
   Call).
   Describe how authentication works under NFS.  What is a file handle,
   and what role does it play?  Describe at least three ways security
   could be breached on a system running NFS; at least two of the ways
   have to work from any system on the Internet.  What can be done to fix
   these problems?  Do you think your break-in methods would actually work
   on systems running on the Internet today?
  4. Managing a user's resource usage is a consistent problem in
   administering any cluster.  The biggest problems tend to be disk
   quota and mail spool quota.  Please describe solutions to both
   problems.  Keep in mind that user home directories are accessed via
   NFS and that resource maintenance must work without flaws and
   without intervention 99% of the time.
  5. Describe, in as much detail as you want, what happens from the time
   you type "ssh <machine>" to the moment you're
   given a prompt.  Assume everything works normally: you're attached to
   a pty, etc.  You might want to comment on the differences between
   a network login and a console login: what different processes are
   involved, what is the role of sshd, etc.
  6. Sometimes machines, even Unix boxes, hang. Often this is a result
   of high load or too few resources (file descriptors, memory,
   available processes, etc.)  Design a program which would allow a
   remote administrator to diagnose problems and free up resources.
   Remember, it must function in a resource-scarce environment.  State
   the considerations that such a program would need to account for
   and don't forget the problem of authenticating the user.
  7. Suppose an important daemon, say "hosed", stops responding to remote
   connections, signals, etc.  You want to debug what's happened with
   this process, but it's essential that the process continue to run.
   A simple system-call trace won't tell you the events that led
   up to the freeze, so you need to analyze its core.  If you can,
   write a program for Linux to dump the core of "hosed" while it's
   still running, and without killing it.  Otherwise describe what
   changes (user- or kernel-level) need to be made for this to work.
   Be specific.
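
For applicants who have not met the term before, the fcntl()-style locking mentioned in technical question 1 looks roughly like the following on a local file. This is only a minimal sketch, assuming Python on a POSIX system; the names `append_with_lock` and `demo` are hypothetical, and the scratch file merely stands in for a mail spool. It says nothing about how (or whether) such a lock is honored between NFS clients, which is what the question actually asks.

```python
import fcntl
import os
import tempfile


def append_with_lock(path, message):
    """Append one message to path while holding an exclusive fcntl-style lock."""
    with open(path, "a") as spool:
        # Without LOCK_NB this call blocks until the lock is granted.
        fcntl.lockf(spool, fcntl.LOCK_EX)
        try:
            spool.write(message)
            spool.flush()
            os.fsync(spool.fileno())
        finally:
            # Release explicitly; closing the file would also drop the lock.
            fcntl.lockf(spool, fcntl.LOCK_UN)


def demo():
    """Write two messages to a scratch file (a stand-in for a mail spool)."""
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, "spool")
        append_with_lock(path, "first\n")
        append_with_lock(path, "second\n")
        with open(path) as spool:
            return spool.read()
```

These locks are advisory: they only coordinate programs that all ask for the lock, which is one reason supporting several mail programs at once is tricky.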

Hypothetical questions

Note: some of these are somewhat UGCS-specific.


  1. You telnet to "to.ugcs.caltech.edu" and get:
       Trying 131.215.43.33...
       Connected to necro.ugcs.caltech.edu.
       Escape character is '^]'.
   ... and then it hangs.  What might have gone wrong?  How can you find out
   for sure?  How can you fix it?
   Later on, you telnet to "to.ugcs.caltech.edu" and get:
       Trying 131.215.43.33...
   ... and it hangs there.  Now what might have gone wrong?  What tools
   would you use to find out for sure, and how would you use them?  What
   other machines might you consult for information?
  2. A couple of questions about resource usage:
   One of the UGCS machines has a full root filesystem.  What might have
   gone wrong?  What's the best thing to do about it?
   The load on hermes, UGCS's primary mail server, is 14.68.  Is this
   okay?  If it is, explain what it's doing, and if not, how would
   you fix it?  Cover as many potential cases as you can.
  3. If you were a cracker or a miscreant trying to make the lives of the
   UGCS sysadmins utterly miserable, what sorts of things could you do?
   In a similar vein, what are some of the biggest security
   vulnerabilities in a Unix-like system?  Suggest ways (policies,
   background processes, etc.) to mitigate these security problems.
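
As a starting point for the resource-usage question above, the load average and disk usage it mentions can be read programmatically. The following is a rough sketch, assuming Python on a Unix-like system; `resource_snapshot`, `looks_overloaded`, and the `per_cpu_limit` threshold of 2.0 are hypothetical choices made for illustration, not anything UGCS uses.

```python
import os
import shutil


def resource_snapshot(path="/"):
    """Return load averages and disk usage for the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return {
        # 1-, 5-, and 15-minute load averages as reported by the kernel.
        "load": os.getloadavg(),
        "cpus": os.cpu_count() or 1,
        "disk_used_fraction": usage.used / usage.total if usage.total else 0.0,
    }


def looks_overloaded(snapshot, per_cpu_limit=2.0):
    """Flag a 1-minute load above per_cpu_limit runnable jobs per CPU.

    The threshold is an arbitrary assumption; a sensible value depends on
    what the machine is doing (CPU-bound work versus waiting on disk or NFS).
    """
    return snapshot["load"][0] > per_cpu_limit * snapshot["cpus"]
```

A number like 14.68 means little on its own: the same load can be alarming on a single-CPU machine and unremarkable on a large one, and the sketch does not say which processes are responsible.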
