Shellserver Systemimager

From UGCS

Latest revision as of 19:54, 20 February 2010

We have many shellservers, and we need a way to make sure they stay up to date as well as a system to automatically set up new ones. We have a custom set of scripts that takes care of this for us.

Features

  • Checks the system integrity on every boot and updates files as necessary
  • Easily handles the 500,000+ files we have
  • Uses our current cfengine setup to configure machines
  • Is netbootable
  • Requires no kernel patches
  • Sets up a new machine with no interaction

Overview

The system uses a list of precomputed md5sums, as well as a list of all directories and symlinks, to check the system integrity. Needed files are copied from a central rsync server via the native rsync protocol (not rsync over ssh).

  • The machine netboots off of a standard kernel and initramfs. It starts up normally, but hits a few scripts that we have added.
  • Before the system tries to mount the root filesystem, a script (local-top/make_partitions) checks for the given partitions and creates them if necessary.
  • After the root filesystem is mounted, the main script (init-bottom/ugcs_rsync) goes into action. It runs fschecker, which removes old files and prints a list of files to be rsync'd from the image server.
  • After the system is booted into /sbin/init, we run some scripts in rcS.d to get the initial configuration (using cfengine).

Overview of Debian initramfs boot process

When a Debian system boots, it runs an init shell script from the initramfs (copied from /usr/share/initramfs-tools/init). This script runs through several script directories to mount the root filesystem and perform any initialization. Each script in those directories needs one special feature: when run with "prereqs" as its first argument, it should print a list of the scripts that must run before it. See the existing scripts for how this is done.
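The "prereqs" convention described above can be sketched as follows. This is a minimal illustration rather than the contents of our actual scripts; the PREREQ value "lvm2" and the placeholder message are assumptions chosen for the example.

```shell
#!/bin/sh
# Sketch of the prerequisite handshake used by initramfs scripts.
# PREREQ names scripts in the same directory that must run first.
# "lvm2" is an illustrative assumption, not taken from the real script.
PREREQ="lvm2"

prereqs()
{
    echo "$PREREQ"
}

# When called as "<script> prereqs", print the list and stop.
case "${1:-}" in
prereqs)
    prereqs
    exit 0
    ;;
esac

# The script's real work would start here (placeholder only).
echo "example script: doing the real work"
```

Leaving PREREQ empty is fine when a script has no ordering requirements.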

When the initramfs is created, a series of "hooks" are run to add additional content. These hooks can copy configuration files or executables into the image. See the existing hooks for examples of how to write them.

There are a couple of important kernel command-line arguments that affect the initramfs init script. One of the most interesting is "break=<point>", which drops you into a shell at the named point in the boot; see /usr/share/initramfs-tools/init and look for "maybe_break" to find the possible break points.
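To make the "break=" argument concrete, here is a small sketch of how a script might pick it out of a kernel command line. It is loosely modeled on the idea, not copied from the init script; the function name parse_break and the sample command line are hypothetical.

```shell
#!/bin/sh
# Print the break point requested on a kernel command line, if any.
# In a real boot the argument would be "$(cat /proc/cmdline)".
parse_break() {
    break_point=""
    for arg in $1; do
        case $arg in
        break=*)
            # Strip the "break=" prefix, keeping only the point name.
            break_point=${arg#break=}
            ;;
        esac
    done
    echo "$break_point"
}

# Sample (made-up) command line requesting a shell at "premount".
parse_break "root=/dev/vg0/root ro break=premount"
```

The set of valid point names is whatever the "maybe_break" calls in the init script use, so check there before relying on a particular value.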

If you make changes to any of these scripts, you need to re-run update-initramfs to create the new initramfs, and then copy the result over to the netboot server.
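The rebuild-and-copy step might look like the sketch below. It prints the commands by default instead of running them. The use of scp, the host name netboot.example.org, and the destination path are all assumptions; the page does not say how the image reaches the netboot server.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to really run them (requires root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1 = kernel version, $2 = scp destination (hypothetical host and path).
rebuild_and_publish() {
    run update-initramfs -u -k "$1" &&
    run scp "/boot/initrd.img-$1" "$2"
}

rebuild_and_publish "$(uname -r)" "netboot.example.org:/srv/tftp/"
```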

Things you might need to change

  • You might need to change the LVM partitioning scheme. Look at the bottom of local-top/make_partitions and follow the pattern of the existing entries.
  • You can add a file to the rsync blacklist by adding it to /etc/rsyncimager/blacklist.
  • The image server location is stored in /etc/rsyncimager/rsync_server.

Implementation Notes

update-md5lists

This is the script that creates the md5sum lists. It is currently in /usr/local/bin. After you make changes to the golden client, you need to re-run this script so that the md5sum list is regenerated. It has a long list of find excludes to keep certain areas (such as /var/log) away from rsync.
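The general shape of such a list generator can be sketched as below. This is not the real update-md5lists script: the function names are hypothetical, and of the excluded paths only /var/log is mentioned on this page (the others are illustrative). It assumes GNU find and xargs.

```shell
#!/bin/sh
# Sketch of list generation for a golden client rooted at $1.
# Excludes are illustrative; only /var/log is named in the text.

# Print "md5  ./path" for every regular file outside the excluded subtrees.
make_md5_list() {
    ( cd "$1" &&
      find . \( -path ./var/log -o -path ./proc -o -path ./tmp \) -prune \
             -o -type f -print0 |
      xargs -0 -r md5sum |
      LC_ALL=C sort -k 2 )
}

# Print every directory ($2 = d) or symlink ($2 = l) outside the excludes.
make_type_list() {
    ( cd "$1" &&
      find . \( -path ./var/log -o -path ./proc -o -path ./tmp \) -prune \
             -o -type "$2" -print |
      LC_ALL=C sort )
}
```

A run on the golden client would then be something like `make_md5_list / > md5sums`, with `make_type_list / d` and `make_type_list / l` producing the directory and symlink lists.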

Eventually we will have a utility that lets you update the md5sums of just a few files without re-generating the entire list.

local-top/make_partitions

Partition names and sizes are hard-coded into this script. New LVM logical volumes are only created if they don't already exist, so you can safely change these if you need to set up a new class of machines. Eventually, it might have some intelligence.
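The "create only if missing" behavior can be sketched like this. It is an illustration, not the real make_partitions script: the volume group name vg0, the root volume, and both sizes are hypothetical, and the sketch prints the lvcreate command by default rather than running it.

```shell
#!/bin/sh
# Dry-run by default: print the lvcreate command instead of running it.
# Set DRY_RUN=0 to really create volumes (requires root and LVM tools).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# True if logical volume $2 exists in volume group $1.
lv_exists() {
    lvdisplay "$1/$2" >/dev/null 2>&1
}

# Create volume $2 of size $3 in group $1 only when it is missing.
ensure_lv() {
    if lv_exists "$1" "$2"; then
        return 0
    fi
    run lvcreate -L "$3" -n "$2" "$1"
}

# "vg0", "root", and the sizes are made up; afscache is named in the text.
ensure_lv vg0 root 10G
ensure_lv vg0 afscache 2G
```

Because existing volumes are skipped, re-running the script on an already-imaged machine leaves its data alone.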

For some reason, mkfs.ext3 doesn't seem to work when it is copied into the initramfs. To work around this, we have an init script in rcS.d create the ext3 filesystems (currently just afscache).

init-bottom/ugcs_rsync

A simple way to synchronize a machine with the image would be to use rsync to copy everything. However, rsync has high overhead, even when most of the files are the same. To overcome this, we use our own md5sum lists and diff them. Also, customizing rsync to do exactly what we want can be difficult and error-prone.

To improve speed and features (like including permission checks), we have a set of custom C programs that do this. The source is in /afs/.ugcs/ugcs-admin/source/rsyncimager/src/fschecker.
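The list-diffing idea can be illustrated with a short shell sketch. This is not fschecker, which is written in C and also checks permissions; the function names are hypothetical, and the lists are assumed to be in md5sum's "hash, two spaces, path" format.

```shell
#!/bin/sh
# Compare a reference md5sum list ($2) with the local one ($3).
# $1 is the comm selector: -23 keeps lines only in the reference list,
# -13 keeps lines only in the local list. Prints the bare paths.
list_diff() {
    tmp=$(mktemp -d) || return 1
    LC_ALL=C sort "$2" > "$tmp/ref"
    LC_ALL=C sort "$3" > "$tmp/loc"
    LC_ALL=C comm "$1" "$tmp/ref" "$tmp/loc" | sed 's/^[0-9a-f]\{32\}  //'
    rm -rf "$tmp"
}

# Files missing or different locally: these must come from the image server.
files_to_fetch() { list_diff -23 "$1" "$2"; }

# Files changed or unknown locally: these are removed before fetching.
files_to_remove() { list_diff -13 "$1" "$2"; }
```

The output of files_to_fetch could then be handed to rsync's --files-from option, for example `rsync -a --files-from=fetch.list rsync://SERVER/MODULE/ /`, where SERVER and MODULE are placeholders for values this page does not give.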

init-bottom/ldconfig

Occasionally the linker cache doesn't match what is actually on the system. To fix this, we run ldconfig just to make sure (it doesn't take very long). If you get weird errors about init not being found, it is probably a linker error.

hooks/make_partitions

This copies over the appropriate executables for the make_partitions initramfs script.

hooks/rsyncimager

This copies over the appropriate binaries for rsyncimager as well as the config files.
