GNU Queue is a UNIX process network
load-balancing system that features an innovative proxy daemon
mechanism which allows users to control their remote jobs in a nearly
seamless and transparent fashion. When an interactive remote job is
launched, such as Matlab or Emacs interfacing with Allegro Lisp, a
proxy daemon (the stub) runs on the local end. Signals delivered to the
stub, including control signals such as the one generated by the suspend
key, are passed on to the remote side, so the remote process may be
controlled just as easily as if it were a local process. Resuming the
stub resumes the remote job. The user's environment is almost completely
replicated on the remote end: not only environment variables, but also
nice values, rlimits, and terminal settings. Together with
MIT-MAGIC-COOKIE-1 (or xhost +) the system is X Window transparent as
well, provided the user's local DISPLAY variable is set to the fully
qualified hostname of the local machine.
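
To make the replicated state more concrete, here is a minimal Python
sketch of how the local side of such a proxy might gather it. This is
not Queue's own code: the function name snapshot_context, the layout of
the returned dictionary, and the choice of three rlimits are
assumptions made for illustration only.

```python
import os
import resource
import termios

# Hypothetical selection of limits; a real system may copy more.
RLIMIT_NAMES = ("RLIMIT_CPU", "RLIMIT_FSIZE", "RLIMIT_NOFILE")


def snapshot_context(tty_fd=None):
    """Collect local settings that a proxy might ship to the remote end."""
    context = {
        # Copy of the environment variables of this process.
        "environ": dict(os.environ),
        # Nice value of the calling process (pid 0 means "myself").
        "nice": os.getpriority(os.PRIO_PROCESS, 0),
        # (soft, hard) pair for each selected resource limit.
        "rlimits": {
            name: resource.getrlimit(getattr(resource, name))
            for name in RLIMIT_NAMES
        },
        # Terminal settings, filled in only when a terminal is given.
        "termios": None,
    }
    if tty_fd is not None and os.isatty(tty_fd):
        context["termios"] = termios.tcgetattr(tty_fd)
    return context
```

Calling snapshot_context(0) from an interactive session would also
record the terminal settings of standard input; applying the snapshot
on the remote end is left out of this sketch.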
One of the most appealing features of the stub system, even for
experienced users, is that asynchronous job control of remote jobs by
the shell is possible and intuitive. One simply runs the proxy job in the
background under the local shell; by monitoring the proxy daemon, the
shell notifies the user when the remote job changes status.
Want to migrate that CPU-guzzling interactive job to another machine?
Run queue -- jobname, and it behaves as if it were local. Want to
suspend that job? Press Control-Z, and you're back at the local shell;
the job is suspended on the remote machine as well. Type bg or fg and
the job seems to be running on the local machine again, except it's
not: it's running on a remote CPU. This feature is especially practical
when you wish to run many CPU-intensive jobs simultaneously throughout
the cluster; you can control them all effortlessly through the local
shell.
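
The job-control behaviour described above can be sketched in Python as
a stub that relays the signals it receives. This is an illustration
under stated assumptions, not Queue's implementation: a local child
process in its own session stands in for the remote job, and the names
RELAYED, start_job, demo_job, relay, install_relays and run_stub are
all hypothetical.

```python
import os
import signal
import subprocess
import sys

# Signals the stub passes through to the job (an illustrative choice).
RELAYED = (signal.SIGINT, signal.SIGTERM, signal.SIGTSTP, signal.SIGCONT)


def start_job(argv):
    """Start the job in its own session, so that terminal signals reach
    only the stub, much as they would with a job on another machine."""
    return subprocess.Popen(argv, start_new_session=True)


def demo_job():
    """A stand-in job that simply sleeps for a while."""
    return start_job([sys.executable, "-c", "import time; time.sleep(30)"])


def relay(job, signum):
    """Pass one signal on to the job if the job is still running."""
    if job.poll() is None:
        job.send_signal(signum)


def install_relays(job):
    """Make this process forward every signal in RELAYED to the job."""
    def handler(received, frame):
        relay(job, received)
        if received == signal.SIGTSTP:
            # Stop the stub as well, so the shell sees a suspended job.
            os.kill(os.getpid(), signal.SIGSTOP)

    for signum in RELAYED:
        signal.signal(signum, handler)


def run_stub(argv):
    """Run the job under the stub and return its subprocess return code."""
    job = start_job(argv)
    install_relays(job)
    return job.wait()
```

In this sketch, Control-Z suspends both the stand-in job and the stub;
when the shell later continues the stub with fg or bg, the SIGCONT
handler continues the job too.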
When the remote process exits normally, the proxy returns its exit
value to the shell; otherwise, the proxy job simulates death by the same
signal that terminated or suspended the remote job. In this
way, control of the remote process is intuitive even to novice users,
as it is just like controlling a local job from the shell. Many of my
original users had to be reminded that their jobs were, in fact,
running remotely.
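
One way a proxy could reproduce the fate of its job is sketched below
in Python. It is a hedged illustration rather than Queue's own code:
the function names are hypothetical, the status is assumed to follow
the convention of Python's subprocess module (a negative return code
means death by that signal), and only termination is covered, not
suspension.

```python
import os
import signal
import sys


def describe_status(returncode):
    """Split a subprocess-style return code into ('exit', code) or
    ('signal', number)."""
    if returncode < 0:
        return ("signal", -returncode)
    return ("exit", returncode)


def mirror_status(returncode):
    """End this process the way the job ended, so that the shell
    reports the same outcome for the proxy."""
    kind, value = describe_status(returncode)
    if kind == "exit":
        sys.exit(value)
    try:
        # Restore the default action in case a handler was installed.
        signal.signal(value, signal.SIG_DFL)
    except (OSError, ValueError):
        pass  # SIGKILL, for example, cannot have its action changed
    os.kill(os.getpid(), value)
```

A proxy built this way would call mirror_status as its last step, once
it has learned how the job finished.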
Queue also features a more traditional distributed batch processing
environment, with results returned to the user via email. In either
environment (proxy process or email mechanism), traditional batch
processing limits may be placed on jobs, such as suspension of jobs
when the system exceeds a certain load average, limits on CPU time,
disk free requirements, and limits on the times at which jobs may run.
(These are documented in the sample profile file included.)
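
The kind of check such limits imply can be sketched in Python as
follows. The field names in Limits are invented for this example and
are not the keywords of Queue's profile file; the sketch only shows how
load average, free disk space, and a permitted time window might be
combined into a single decision.

```python
import os
from dataclasses import dataclass
from datetime import datetime, time


@dataclass
class Limits:
    """Hypothetical limit set; not Queue's profile syntax."""
    max_load: float       # jobs are held above this load average
    min_free_bytes: int   # free disk space a job requires
    start: time           # jobs may run from this time of day...
    end: time             # ...up to, but not including, this time


def in_window(now, start, end):
    """True if now lies in [start, end); the window may cross midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def may_run(limits, load, free_bytes, now):
    """Decide whether a job is allowed to run under the given readings."""
    return (load <= limits.max_load
            and free_bytes >= limits.min_free_bytes
            and in_window(now, limits.start, limits.end))


def current_readings(path="/"):
    """Read the 1-minute load average, the free bytes on the filesystem
    holding path, and the current time of day."""
    stats = os.statvfs(path)
    free_bytes = stats.f_bavail * stats.f_frsize
    return os.getloadavg()[0], free_bytes, datetime.now().time()
```

A scheduler loop could evaluate may_run(limits, *current_readings())
periodically and suspend or resume jobs according to the result.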
Queue may be installed by any user on the system; root privileges are
not required.