I would appreciate feedback and corrections on how I managed to get
povray-3.7.0.beta.33 installed and mostly working on one node of a Rocks Linux
cluster as a regular (non-privileged) user.
* X Windows display does not work at all, as the installation notes suggest.
* I was able to get boost recognized only by compiling POV-Ray static.
* The resulting executable is about 4 MB.
* I used default optimization. (Is it worth trying variations?)
I posted details of the installation process on my wiki.
http://wiki.waggy.org/dokuwiki/povray/shamu_install
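For readers without the wiki handy, the build described above might be sketched roughly like this. The install prefix, boost location, and static-link flag are my assumptions, not the wiki's exact commands; only the COMPILED_BY requirement is confirmed by this thread.

```shell
# Hypothetical unprivileged static build of povray-3.7.0.beta.33.
# Prefix, boost path, and LDFLAGS are assumptions; see the wiki for
# the actual steps. Use your own name in COMPILED_BY.
./configure COMPILED_BY="Your Name <you@example.org>" \
            --prefix=$HOME/povray37 \
            --with-boost=$HOME/boost \
            LDFLAGS="-static"
make
make install   # installs under $HOME, so no root access is needed
```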
Although with this more-or-less standard installation each POV-Ray instance uses
only the eight processors on one node, I intend to use the software primarily
for animations and plan to write a script to render individual frames using a
separate instance on all 24 nodes. Eventually I may be able to convince the
system administrators to install it properly, but it will be difficult to do so
until the software is out of beta.
I am by no means any kind of Linux guru, and may well have done something
terribly wrong, or at least unconventional. But since I'll be passing this
along to the high-performance computing class I'm assisting with this term, any
suggestions to improve my method (or reduce my madness) would be most welcome.
Thanks in advance for your thoughts!
waggy schrieb:
> * X Windows display does not work at all, as the installation notes suggest.
This is likely just due to the required header files missing.
> I posted details of the installation process on my wiki.
You may want to change the line reading
./configure COMPILED_BY="David Wagner <hon### [at] handbasketorg>"
lest everyone take that literally ;-)
> Although with this more-or-less standard installation each POV-Ray instance uses
> only the eight processors on one node, I intend to use the software primarily
> for animations and plan to write a script to render individual frames using a
> separate instance on all 24 nodes. Eventually I may be able to convince the
> system administrators to install it properly, but it will be difficult to do so
> until the software is out of beta.
A Linux cluster, you say? I guess that would be separate machines (each
with their own main memory) linked via a network-ish infrastructure,
right? In that case, a "properly installed" copy will not be able to
make use of the whole thing either (except by running independent
instances as you intend to do for animations): As of now, POV-Ray
supports multithreading, but not distributed computing, as parts of it
still rely on access to a common address space.
clipka wrote:
> This is likely just due to the required header files missing.
Thanks! I'll see if I can track these down.
> You may want to change the line reading
>
> ./configure COMPILED_BY="[snip]
I was on the fence about this due to the intended audience, a class full of
graduate engineering students with big, gnarly computational problems and
(assumed) zero Linux experience, but I took your advice and obfuscated it a
bit.
> A Linux cluster, you say? I guess that would be separate machines (each
> with their own main memory) linked via a network-ish infrastructure,
> right? In that case, a "properly installed" copy will not be able to
> make use of the whole thing either (except by running independent
> instances as you intend to do for animations): As of now, POV-Ray
> supports multithreading, but not distributed computing, as parts of it
> still rely on access to a common address space.
It's my understanding the cluster has some tools installed to manage these kinds
of batch jobs far better than I'll be able to do with a shell script. The
admins install applications (such as MATLAB, Octave, and presumably POV-Ray) as
modules which need to be loaded before use, which I assumed was at least in part
for efficient resource management. I'm headed for two days of training at the
main University of Texas campus (in Austin where Shamu's "big brother" is; I'm
at UT San Antonio) in a couple of weeks to find out how wrong I am. :)
waggy schrieb:
> It's my understanding the cluster has some tools installed to manage these kinds
> of batch jobs far better than I'll be able to do with a shell script. The
> admins install applications (such as MatLab, Octave, and presumably POV-Ray) as
> modules which need to be loaded before use, which I assumed was at least in part
> for efficient resource management. I'm headed for two days of training at the
> main University of Texas campus (in Austin where Shamu's "big brother" is; I'm
> at UT San Antonio) in a couple of weeks to find out how wrong I am. :)
I'd be eager to hear about it.
clipka wrote:
> I'd be eager to hear about it.
First, thanks for the tip on X Windows. I verified that X Windows is not
available for compiling on the cluster. This is probably a good thing since I
wouldn't want to accidentally leave display on when rendering a few thousand
animation frames.
As for setting up cluster jobs, the workshop was primarily an introduction to
writing SMP code. I suppose the 'proper' way to distribute POV-Ray rendering
would be to patch it so that when it is run on a cluster one piece of the job (a
frame range or still image super-mosaic square) is rendered as usual (one mosaic
square thread per core) by each node, with another thread stitching the
pieces together as they become available. (I'm taking the developers' word for
it that message overhead is prohibitively expensive for converting their pthread
implementation to MPI, even on clusters with massive internode bandwidth.)
But I put together a much simpler scheme that may be more appropriate for a
shared cluster using the installed job queue manager. All I needed to write
were two scripts: one to create job submission tickets, and one to run POV-Ray
on the frame range (or image part) associated with each task number.
The job submission script creates two job tickets and submits them to the queue
manager. The first job submits each frame range as a separate task, each of
which is pushed to a single node as one becomes available. The second job waits
until all the tasks in the first job are finished, then stitches the frames
together in an animation.
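The two job tickets above could be sketched as follows, assuming an SGE-style queue manager (which Rocks ships). The script names, job names, and task count are my assumptions; `-t` submits an array job and `-hold_jid` makes the stitch job wait for the render job.

```shell
# Hypothetical sketch of the two-ticket submission. Script names and
# the task count are assumptions. The commands are built as strings
# and echoed here; a real submission script would execute them.
NTASKS=96   # deliberately many more tasks than the 24 nodes
RENDER_JOB="qsub -N render -t 1-$NTASKS render_range.sh"
STITCH_JOB="qsub -N stitch -hold_jid render stitch_frames.sh"
echo "$RENDER_JOB"
echo "$STITCH_JOB"
```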
The other script simply executes POV-Ray with the appropriate command-line
arguments to render the frame range associated with the environment variable
containing the current task number.
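A minimal version of that per-task script might look like this, assuming an SGE-style array job (SGE exports the task number as SGE_TASK_ID). The frames-per-task count and scene name are assumptions; +SF/+EF are POV-Ray's frame-subset options and -D suppresses display output.

```shell
# Hypothetical worker script: map the array task number onto a frame
# range. FRAMES_PER_TASK and the scene name are assumptions.
FRAMES_PER_TASK=10
TASK=${SGE_TASK_ID:-1}
START=$(( (TASK - 1) * FRAMES_PER_TASK + 1 ))
END=$(( TASK * FRAMES_PER_TASK ))
echo "task $TASK renders frames $START-$END"
# The actual render step would be along these lines:
# povray +Iscene.pov +KFI1 +KFF240 +SF$START +EF$END -D
```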
This strategy seems more appropriate for a shared cluster since it can start
submitting tasks as soon as a single node is available, and higher-priority jobs
can get in between separate tasks. Note the render job ticket creates many more
tasks than there are nodes since it is likely that some frame ranges will take
much longer to render than others. It is also very convenient to be able to
stop the entire process with just two commands, one for each job.
The only performance problem I'm having is with the delay of as much as a few
seconds between a node finishing a frame range and the queue manager pushing the
next task onto it. I also anticipate decreased performance on images with
(single-thread) parse times approaching trace times. I have had some success
overcoming these problems by running two tasks (two POV-Ray instances) on each
node at markedly different niceness to keep each node busy. (Niceness is used
in an attempt to decrease task-switching a bit and to help stagger render times.)
However, this works a bit too well as the processing environments I have
available watch for time-averaged node overloading, and during my tests, many
nodes stop accepting new jobs for a while when they have more than 12 active
threads on an 8-core node, then alarm when over 14.
I will probably need to contact the local cluster admins to set up a processing
environment better suited to distributed rendering with POV-Ray. As far as I
can tell, I am the only user running multinode jobs, and very few changes beyond
the default configuration have been made to Shamu as yet.
Although I have only implemented this for animation, it should be fairly easy to
modify the scripts to break a single image into mosaic parts (rather than frame
ranges) and then stitch them together with ImageMagick (instead of ffmpeg).
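For the animation case, the final stitch might be as simple as one ffmpeg invocation; the frame-name pattern, frame rate, and output name below are assumptions.

```shell
# Hypothetical stitch step once every render task has finished.
# Assumes POV-Ray wrote frames named frame0001.png, frame0002.png, ...
ffmpeg -framerate 24 -i frame%04d.png -pix_fmt yuv420p animation.mp4
```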
Thanks for your interest.
David Wagner
P.S. FWIW, I updated the instructions for beta 34.
waggy schrieb:
>
> But I put together a much simpler scheme that may be more appropriate for a
> shared cluster using the installed job queue manager. All I needed to write
> were two scripts: one to create job submission tickets, and one to run POV-Ray
> on the frame range (or image part) associated with each task number.
That sounds like a pretty plausible approach indeed.
> The only performance problem I'm having is with the delay of as much as a few
> seconds between a node finishing a frame range and the queue manager pushing the
> next task onto it.
I guess the impact of this could be reduced by increasing the number of
frames per job package.
> I also anticipate decreased performance on images with
> (single-thread) parse times approaching trace times.
Why should that /decrease/ performance? At present, parsing is done
over and over again for each image anyway. (Of course it will not max
out the node while parsing, but this would be the case on a single-node
system as well.)
> I have had some success
> overcoming these problems by running two tasks (two POV-Ray instances) on each
> node at markedly different niceness to keep each node busy. (Niceness is used
> in an attempt to decrease task-switching a bit and to help stagger render times.)
> However, this works a bit too well as the processing environments I have
> available watch for time-averaged node overloading, and during my tests, many
> nodes stop accepting new jobs for a while when they have more than 12 active
> threads on an 8-core node, then alarm when over 14.
Hmm... it might be an interesting idea for animations to have POV-Ray
run parse threads for a number of frames in parallel (depending on
available cores), then render the batch of frames.
clipka wrote:
[...]
> I guess the impact of this could be reduced by increasing the number of
> frames per job package.
That's my evil plan. ;)
> > I also anticipate decreased performance on images with
> > (single-thread) parse times approaching trace times.
>
> Why should that /decrease/ performance? At present, parsing is done
> over and over again for each image anyway. (Of course it will not max
> out the node while parsing, but this would be the case on a single-node
> system as well.)
I'm thinking of it this way. Suppose a frame takes 1 second to parse, and 64
seconds for one thread to trace. On a single-core machine, it's 65 seconds
total; on eight cores it's about (1 + 64/8 =) 9 seconds, at about 90%
utilization of all cores. Now consider the reverse: 64 seconds to parse, and 1
to trace. The total time is the same 65 seconds on a single-core machine, but
at best (64 + 1/8 =) 64.125 seconds on the eight-core, with barely over 12%
utilization.
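The arithmetic above can be checked mechanically; the script below just re-derives the two cases, treating parsing as serial and tracing as perfectly parallel across the cores (an Amdahl's-law style bound, which is my framing, not waggy's).

```shell
# Re-derive the example's wall times and core utilization:
# parse runs on one thread, trace splits evenly across 8 cores.
awk 'BEGIN {
  cores = 8; serial = 65            # 65 s of single-core work in both cases
  t1 = 1 + 64 / cores               # parse-light: 1 s parse, 64 s trace
  t2 = 64 + 1 / cores               # parse-heavy: 64 s parse, 1 s trace
  printf "parse-light: %.3f s, %.1f%% utilization\n", t1, 100 * serial / (t1 * cores)
  printf "parse-heavy: %.3f s, %.1f%% utilization\n", t2, 100 * serial / (t2 * cores)
}'
```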
> Hmm... it might be an interesting idea for animations to have POV-Ray
> run parse threads for a number of frames in parallel (depending on
> available cores), then render the batch of frames.
That would be a good way to approach it, I think. The TACC folks made a pretty
big deal about trying to structure your message-based (MPI) multiprocessing
application to pass fewer, larger messages (on the order of megabytes or
greater) rather than many smaller ones. Having some number of parsing threads
pass the resulting data structures (1 thread producing one data structure for
each frame) to a balanced number of rendering nodes (perhaps 1 node for each
frame) seems like a good place to start for animations. And, since most
animation frames in a sequence are much like the ones just before, a fairly
simple adaptive parse/render load balancing scheme should work well.