Using the Cluster for Fun and Profit
The cluster is a parallel computing resource consisting of ~100 networked
"nodes", each of which has a set of processors that can run independently.
It is maintained by the UIUC CS architecture group.
The cluster's original purpose is to let the architecture group run
simulations for research into theoretical computer architectures. At the time
of writing, it is available to users wishing to run computing-intensive
processes (such as running the Charniak parser on the entire AQUAINT corpus,
over 5GB of text). At the risk of stating the obvious, you can do this by
splitting up your mammoth task into a set of subtasks and writing a script
that calls one of two cluster schedulers to run each subtask as a separate
process:
FAUCETS, maintained by the parallel programming group, or
CONDOR, maintained by TSG.
These instructions are intended to get you started if you have such a task.
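As a minimal sketch of the splitting step (in Python purely for
illustration; the function name chunk_ranges and the choice of fixed-size
chunks of numbered input files are assumptions, not part of any cluster
tool), a task over N files might be divided into subtasks like this:

```python
def chunk_ranges(total_files, files_per_chunk):
    """Return (start_index, count) pairs that together cover
    file indices 0 .. total_files - 1, one pair per subtask."""
    if files_per_chunk <= 0:
        raise ValueError("files_per_chunk must be positive")
    ranges = []
    for start in range(0, total_files, files_per_chunk):
        # The last chunk may be shorter than the others.
        count = min(files_per_chunk, total_files - start)
        ranges.append((start, count))
    return ranges


if __name__ == "__main__":
    print(chunk_ranges(10, 4))  # -> [(0, 4), (4, 4), (8, 2)]
```

Each pair can then be handed to one scheduled process as its starting file
index and number of files.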
WARNING:
At present, the cluster appears to have some problems running Perl scripts
that interact via pipes. For the present, the cluster is recommended
for users with simpler processes (the majority of current users have a
single executable).
A TALE OF TWO SCHEDULING SYSTEMS
There are two cluster schedulers available: FAUCETS, a home-grown
scheduler intended to facilitate explicitly parallel programming,
and CONDOR, a more general-purpose system for managing processes.
Some of the differences:
FAUCETS has a simpler scheduling system. There are some
heuristics applied to prevent a single user from monopolizing the system,
but once one of your processes is running, it will run until it has
completed or its scheduled time has elapsed.
CONDOR can be set up to run a number of different ways, but the 'vanilla'
version installed on the main cluster can stop and reschedule processes
based on user demand.
In the past, FAUCETS has had some problems with apparently non-deterministic
'dropping' of processes. It was significantly improved in 2005-6,
but it is not clear that this problem is entirely resolved.
CONDOR is reported by its users to have no such problems. It is now
available on two different clusters, both of which are accessed via
arch-server. One is the same cluster as FAUCETS (32-bit Intel machines)
and the other, smaller cluster has 64-bit AMD machines. The second
cluster's machines are sometimes used individually by grad students,
so many of them are often unavailable for use as a cluster. However, these
machines are very fast.
INSTRUCTIONS
- Getting a cluster account:
ask userhelp@cs.uiuc.edu for a cluster account to use the FAUCETS scheduler.
- The cluster is reached via arch-server. All development (compiling, testing)
should be done on arch-devel. Log in via ssh. There is a cluster mailing list,
'arch-users', which gives useful information about preferred practices,
maintenance etc.; its URL is:
http://lists.cs.uiuc.edu/mailman/listinfo/arch-users
NOTE: you should use arch-devel for everything possible (like debugging,
installing local libraries etc.). Use arch-server only for submitting your
jobs to FAUCETS or CONDOR.
- TSG has mounted one directory from our group's flake server on arch-devel
and arch-server.
The flake directory is
/mounts/flake/disks/1/arch/
It is mounted on arch-devel (and arch-server) at
/external-mount/rothgroup/
Alternatively, you can copy scads of data to arch-devel (where it will
also be accessible from arch-server).
- Download Pierre Salverda's extremely helpful scheduler resources.
This tarball includes a README indicating how to set up and use the scripts
and other resources he has provided. These scripts will allow
you to write a single straightforward script to submit jobs to either
CONDOR or FAUCETS.
- Write a script that runs your process with appropriate parameters. The
default time allowed for a single process using FAUCETS is 4 hours (you
can specify a longer time if you like), so you must plan accordingly.
If you have a lot of reads from/writes to file, your process may run
faster if you copy the data from the main arch-server machine to the local
scratch partition.
NOTE 1: the FAUCETS scheduler is, at the time of writing, less than 100% reliable.
(It has been improved, though problems are still reported.) This
means that nodes sometimes fail mid-process, so some of your processes
fail, possibly without trace. For this reason, you should probably set up
some kind of process tracking -- for some ideas, see ERROR CHECKING below.
NOTE 2: the files under your working directory are visible to other nodes
in the cluster. If, for example, you need a Perl library not included
in the standard TSG distribution, you can install it somewhere under
your home directory, set the appropriate environment variables, and
your processes will be able to use it.
NOTE 3: if you plan to use the 64-bit machines, make sure you link to
static libraries, or compile the appropriate code while logged in
to one of the 64-bit machines. Otherwise, when your process runs
on one of the 64-bit nodes, the dynamic libraries it finds there will
not match the 32-bit code you compiled on arch-devel (a 32-bit machine).
- Write a script that sets up the n processes you wish to run.
The tarball pointed to above contains useful skeleton scripts.
See its README for details.
If you use FAUCETS, you'll mainly need the command 'ufsub', which asks the
scheduler to run the desired command.
There's a web page that documents this and other useful commands at:
http://charm.cs.uiuc.edu/research/faucets/clusterscheduler.html
Some important notes about ufsub:
- You shouldn't specify number of processors per node -- this is pointless
unless your executable is explicitly parallelized.
- What happens with your K processes is that the first ~N are scheduled
according to the current cluster load. As they complete, processes from the
rest of the list are submitted. So there's nothing wrong with sending a huge
list of 'ufsub' commands to the cluster. (This is the official line --
if anyone complains, please let me know at mssammon@uiuc.edu so I can
update this information.)
- There is a time option for the ufsub command; default is 4 hours, but feel
free to specify more per process if you think it's needed. For very long
jobs (>100 hours) there may be scheduling restrictions (in the form of a
limit on the number of such processes that may run concurrently).
- In the past, nodes have been known to fail, so it's a good idea to have some
in-process checking/tracking so that if one of your processes dies, you can
repeat it. See ERROR CHECKING below.
- Nodes may run 2 processes at a time -- so if your processes can interact
via e.g. filename collision, nasty things will happen. Be careful to
give files unique names.
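The steps above can be sketched as a small generator for the list of
submission commands. This is an illustration only, under stated
assumptions: the worker command, its argument order (start index, file
count, output file) and the chunk_NNNNNN.out naming are hypothetical, and
the submit command is passed in by the caller because the exact 'ufsub'
options should be taken from the FAUCETS documentation rather than from
this sketch. Naming each output file after its chunk's starting index is
one way to avoid the filename collisions mentioned above.

```python
import shlex


def submission_lines(chunks, worker_cmd, out_dir, submit_cmd):
    """Build one shell command line per chunk.

    chunks     -- list of (start_index, count) pairs
    worker_cmd -- argv list for the (hypothetical) worker program
    out_dir    -- directory for per-chunk output files
    submit_cmd -- argv list for the scheduler call, supplied by the
                  caller (e.g. ["ufsub"] plus whatever options you need)
    """
    lines = []
    for start, count in chunks:
        # The start index is unique per chunk, so two processes sharing
        # a node never write to the same file.
        out_file = f"{out_dir}/chunk_{start:06d}.out"
        argv = list(submit_cmd) + list(worker_cmd) + [
            str(start), str(count), out_file]
        lines.append(" ".join(shlex.quote(arg) for arg in argv))
    return lines


if __name__ == "__main__":
    demo = submission_lines([(0, 4), (4, 4), (8, 2)],
                            ["perl", "worker.pl"], "out", ["ufsub"])
    print("\n".join(demo))
```

Writing these lines to a shell script and running it on arch-server gives
one scheduler call per subtask.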
ERROR CHECKING
Nodes on the cluster have been known to fail, though as the cluster has
just had a major overhaul, this is much less likely now. Still,
it's a good idea to track those processes that successfully completed
so that you can re-run any processes that died midway.
Perhaps the simplest way is to have each process write to a log file, then
have a script that processes these log files to detect omissions/errors.
If you use a local working directory like /scratch/, don't forget to have
your process copy the log to your arch-server directory before it finishes.
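A minimal sketch of this log-file approach follows. Everything named here
is an assumption made for illustration: the chunk_<id>.log naming, the
DONE/FAILED markers, and the functions run_chunk and chunks_to_rerun are
not part of the accompanying scripts. The worker does its work in a
private scratch directory, writes a one-line log, and copies the log back
before cleaning up; the checker then lists every chunk whose log is
missing or does not say DONE.

```python
import shutil
import tempfile
from pathlib import Path


def run_chunk(chunk_id, work, scratch_root, log_dir):
    """Run work(scratch_dir) in a private scratch directory and leave
    a one-line log for this chunk in log_dir."""
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    # A unique directory per process avoids collisions when a node
    # runs two processes at once.
    scratch = Path(tempfile.mkdtemp(prefix=f"chunk_{chunk_id}_",
                                    dir=scratch_root))
    log_path = scratch / f"chunk_{chunk_id}.log"
    try:
        try:
            work(scratch)
            log_path.write_text("DONE\n")
        except Exception as exc:
            log_path.write_text(f"FAILED {exc}\n")
        # Copy the log out of scratch before the process finishes.
        shutil.copy(log_path, log_dir / log_path.name)
    finally:
        shutil.rmtree(scratch, ignore_errors=True)


def chunks_to_rerun(chunk_ids, log_dir):
    """Return the ids whose log is absent (the process died without
    trace) or does not record successful completion."""
    log_dir = Path(log_dir)
    rerun = []
    for chunk_id in chunk_ids:
        log_path = log_dir / f"chunk_{chunk_id}.log"
        if not log_path.is_file() or log_path.read_text().strip() != "DONE":
            rerun.append(chunk_id)
    return rerun
```

A process that dies mid-run never copies its log back, so it shows up in
the result of chunks_to_rerun just like one that reported a failure.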
Other Options
I first tried to find a spiffy way to do the checking, so considered these
other options. In case this saves you some time, here's why I didn't use
them.
perl fork()
'ufsub' and its companions DO NOT generate a unix process id when invoked; they
just enter your process in the scheduler's queue, so you can't just write a
perl script that fork()s a bunch of children and waits till they terminate.
timer
The process itself may not even start until hours/days/weeks after submission,
so you can't start a timer when the process is submitted and wait four hours,
then check for output.
email
ufsub does have a command-line option whereby the scheduler will
send you emails when the process starts/ends. TSG no longer supports email
on client machines (by which I mean, in such a way that you can use the
unix "mail" command to read incoming mails), so if you want to automate the
checking process using email notifications you'll have to grab them from
your regular email account.
fjobs
There is also a command 'fjobs' which reports the status of queued/running
jobs, which you can use to determine when your processes have completed.
You could probably set this up to cue the checking script. I decided
that I didn't mind checking in on the server once in a while to see when
stuff finished.
SAMPLE SCRIPTS
Accompanying this info are a few sample scripts. Two are kindly provided
by Vasin Punyakanok: runeach.pl (which is the process to be run) and
runarch.pl (which sets up the calls to the FAUCETS scheduler).
I have included three other scripts: processAcquaintData.pl (the main
process), writeProcessScript.pl (which generates a shell script that makes
the scheduler calls) and checkLogs.pl (which checks for log files written by
the process and moves unprocessed files into a specified directory).
processAcquaintData.pl runs the Charniak parser on column-format data
(as found in our directory of TREC2 corpora). Each process reads
a set of files (number of files, starting file index, input and output
directories specified by command line arguments), corrects
a segmentation mistake in the original files, runs the Charniak
parser on each sentence in each file, and writes the output and corrected
sentences to new files with similar names to the originals.
writeProcessScript.pl takes as arguments the source and target directories,
the number of files per chunk, and the working directory (path
to the processAcquaintData.pl script).
checkLogs.pl requires that you use the same number of files per chunk
that you gave to writeProcessScript.pl.
If you care to improve these scripts, please let me have a copy...