Using the Cluster for Fun and Profit

The cluster is a parallel computing resource consisting of ~100 networked "nodes" each of which has a set of processors that can run independently. It is maintained by the UIUC CS architecture group. The initial main purpose of the cluster is for the architecture group to run their simulations for research into theoretical computer architectures. At the time of writing this document, this resource is available to users wishing to run computing- intensive processes (such as running the Charniak parser on the entire ACQUAINT corpus - over 5GB of text). At the risk of stating the obvious, you can do this by splitting up your mammoth task into a set of subtasks, and writing a script that calls one of two cluster schedulers: FAUCETS -- maintained by the parallel programming group, or CONDOR, maintained by TSG -- to run each subtask as a separate process.

These instructions are intended to get you started if you have such a task.

WARNING:
At present, the cluster appears to have some problems running Perl scripts that interact via pipes. For the present, the cluster is recommended for users with simpler processes (the majority of current users have a single executable).


A TALE OF TWO SCHEDULING SYSTEMS

There are two cluster schedulers available: FAUCETS, a home-grown scheduler intended to facilitate explicitly parallel programming, and CONDOR, a more general-purpose system for managing processes.

Some of the differences:

  1. FAUCETS has a more simplistic scheduling system. There are some heuristics applied to prevent a single user monopolizing the system, but once one of your processes is running, it will run until it has completed or its scheduled time has elapsed.

    CONDOR can be set up to run a number of different ways, but the 'vanilla' version installed on the main cluster can stop and reschedule processes based on user demand.

  2. In the past, FAUCETS has had some problems with apparent non-deterministic 'dropping' of processes. It has been significantly improved in 2005-6, but it is not clear that this problem is entirely resolved.

    CONDOR is reported by its users to have no such problems. It is now available on two different clusters, both of which are accessed via arch-server. One is the same cluster as FAUCETS -- 32-bit INTEL machines -- and the other, smaller cluster has 64-bit AMD machines. The second cluster's machines are sometimes individually used by grad students, so often, many are unavailable for use as a cluster. However, these machines are very fast.

INSTRUCTIONS

  1. Getting a cluster account: ask userhelp@cs.uiuc.edu for a cluster account to use the FAUCETS scheduler.


  2. The cluster is reached via arch-server. All development (compiling, testing) should be done on arch-devel. login via ssh. There is a cluster mailing list which gives useful information about preferred practices, maintenance etc.; the list is 'arch-users', URL is:

    http://lists.cs.uiuc.edu/mailman/listinfo/arch-users

    NOTE: you should use arch-devel for everything possible (like debugging, installing local libraries etc.). Use arch-server only for submitting your jobs to FAUCETS or CONDOR.


  3. TSG has mounted one directory from our group's flake server on arch-devel/ arch-server:

    The flake directory is

    /mounts/flake/disks/1/arch/

    It is mounted in the arch-devel (and arch-server) directory

    /external-mount/rothgroup/

    Alternatively, you can copy scads of data to arch-devel (where it will also be accessible from arch-server).


  4. Download Pierre Salverda's extremely helpful scheduler resources.

    This tarball includes a README indicating how to set up and use the scripts and other resources he has provided. These scripts will allow you to write a single straightforward script to submit jobs to either CONDOR or FAUCETS.



  5. Write a script that runs your process with appropriate parameters. The default time allowed for a single process using FAUCETS is 4 hours (you can specify a longer time if you like), so you must plan accordingly. If you have a lot of reads from/writes to file, your process may run faster if you copy the data from the main arch-server machine to the local scratch partition.

    NOTE 1: the FAUCETS scheduler is, at the time of writing, less than 100% reliable. (It has been improved, though problems are still reported.) This means that nodes sometimes fail mid-process, so some of your processes fail, possibly without trace. For this reason, you should probably set up some kind of process tracking -- for some ideas, see ERROR CHECKING below.

    NOTE 2: the files under your working directory are visible to other nodes in the cluster. If, for example, you need a perl library not included in the standard TSG distribution, you can install it somewhere under your home directory, set the appropriate environment variables, and your processes will be able to use it.

    NOTE 3: if you plan to use the 64-bit machines, make sure you link to static libraries, or compile the appropriate code while logged in to one of the 64-bit machines. Otherwise, when your process is run on one of the cluster nodes, the dynamic libraries your code compiled on arch-devel (32-bit machine) wants to use will not sync with your 32-bit code.



  6. Write a script that sets up the n processes you wish to run. The tarball pointed to above contains useful skeleton scripts. See its README for details.

    If you use FAUCETS, you'll mainly need the command 'ufsub'. This is a command to the scheduler to run the desired command. There's a web page that documents this and other useful commands at:

    http://charm.cs.uiuc.edu/research/faucets/clusterscheduler.html

    Some important notes about ufsub:

    • You shouldn't specify number of processors per node -- this is pointless unless your executable is explicitly parallelized.

    • What happens with your K processes is that the first ~N are scheduled according to the current cluster load. As they complete, processes from the rest of the list are submitted. So there's nothing wrong with sending a huge list of 'ufsub' commands to the cluster. (This is the official line -- if anyone complains, please let me know at mssammon@uiuc.edu so I can update this information.)

    • There is a time option for the ufsub command; default is 4 hours, but feel free to specify more per process if you think it's needed. For very long jobs (>100 hours) there may be scheduling restrictions (in the form of a limit on the number of such processes that may run concurrently).

    • In the past, nodes have been known to fail, so it's a good idea to have some in-process checking/ tracking so that in the event that one of your processes dies, you can repeat it. See ERROR CHECKING below.

    • Nodes may run 2 processes at a time -- so if your processes can interact via e.g. filename collision, nasty things will happen. Be careful to give files unique names.

ERROR CHECKING

Nodes on the cluster have been known to fail, though as the cluster has just had a major overhaul, this is much less likely now. Still, it's a good idea to track those processes that successfully completed so that you can re-run any processes that died midway.

Perhaps the simplest way is to have each process write to a log file, then have a script that processes these log files to detect omissions/errors. If you use a local working directory like /scratch/, don't forget to have your process copy the log to your arch-server directory before it finishes.

Other Options

I first tried to find a spiffy way to do the checking, so considered these other options. In case this saves you some time, here's why I didn't use them.

  • perl fork()

    'ufsub' and its companions DO NOT generate a unix process id when invoked; they just enter your process in the scheduler's queue, so you can't just write a perl script that fork()s a bunch of children and waits till they terminate.

  • timer

    The process itself may not even start until hours/days/weeks after submission, so you can't start a timer when the process is submitted and wait four hours, then check for output.

  • email

    ufsub does have a command-line option whereby the scheduler will send you emails when the process starts/ends. TSG no longer supports email on client machines (by which I mean, in such a way that you can use the unix "mail" command to read incoming mails), so if you want to automate the checking process using email notifications you'll have to grab them from your regular email account.

  • fjobs

    There is also a command 'fjobs' which reports the status of queued/running jobs, which you can use to determine when your processes have completed. You could probably set this up to cue the checking script. I decided that I didn't mind checking in on the server once in a while to see when stuff finished.

SAMPLE SCRIPTS

Accompanying this info are a few sample scripts. Two are kindly provided by Vasin Punyakanok: runeach.pl (which is the process to be run) and runarch.pl (which sets up the calls to the faucets scheduler).

I have included three other scripts, processAcquaintData.pl (the main process), writeProcessScript.pl (which generates a shell script that makes the scheduler calls) and checkLogs.pl (which checks for log files written by the process and moves unprocessed files into a specified directory).

processAcquaintData.pl runs the charniak parser on column-format data (as found in our directory of TREC2 corpora). Each process reads a set of files (number of files, starting file index, input and output directories specified by command line arguments), corrects a segmentation mistake in the original files, runs charniak's parser on each sentence in each file, and writes the output and corrected sentences to new files with similar names to the originals.

writeProcessScript takes as arguments the source and target directories, the number of files per chunk, and the working directory (path to the processAcquaintData.pl script).

checkLogs.pl requires that you use the same number of files per chunk.

If you care to improve them, please let me have a copy...