  Re: Slow parsing from networked drives  
From: Le Forgeron
Date: 11 Jun 2016 03:25:18
Message: <575bbcde$1@news.povray.org>
On 11/06/2016 01:36, mcanance wrote:
> Friends, I'm trying to use a cluster at my university to speed up my renders.
> The rendering parallelizes well (+SR and +ER in a job array), but for some
> reason the parsing stage is just horribly slow when I'm using networked storage.
> I believe the connection to the storage is over gigabit ethernet.
> If I have povray read from the networked storage, it takes 4 minutes to parse
> 22M tokens, and 34 seconds to render.
> If, however, I copy the file to /dev/shm first, it takes 8 seconds to parse. It
> takes no more than a second to copy the file from the network drive to /dev/shm.
> Rendering still takes 34 seconds.
> 
> Unfortunately, if I copy the files to /dev/shm and then abort the render, I
> don't have an opportunity to delete the files and the temporary space fills up
> and then the cluster crashes over a holiday weekend and I have to send an
> apology to the sysadmin and bring him cookies. (He wasn't happy.)
> 
> What I'd like is for povray to read in the scene file from the network drive and
> store it in memory before it starts parsing it.
> 
> I tried looking at the source, but I didn't see the culprit. I did see that
> there's an IMemStream in fileinputoutput.h that looks like it would do the
> trick, but I haven't been able to figure out how it works.
> 
> Has anyone else had experience with speeding up the parser?
> Cheers,
> Charles.
> 
> 

The parsing is done token by token, so it can generate a lot of I/O, especially if the
network protocol is unbuffered: instead of reading a whole segment of, say, 1K in a single
exchange (one request, one reply), which is what happens when copying a file, the parser
asks for a few characters at a time (possibly even a single character per read). For 1K of
data that means up to 1024 requests and 1024 replies, and gigabit Ethernet still has latency
on every request/reply round trip. It is normally up to the OS, through its buffering of the
device/filesystem, to optimise that data exchange.
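
A quick way to check whether this is what is happening (assuming strace is available on the
node; the scene path and render options below are only placeholders) is to count the read()
system calls POV-Ray issues while parsing from the network mount and compare with a local
copy:

# count read() syscalls when parsing from the network mount
strace -f -e trace=read -c povray +I/network/scene.pov +W320 +H240 -F -D

# compare with a copy on local or tmpfs storage
cp /network/scene.pov /tmp/scene.pov
strace -f -e trace=read -c povray +I/tmp/scene.pov +W320 +H240 -F -D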
And that was only the obvious part... there is a hidden cost too: if the filesystem is not
mounted with the noatime option, every read from a file is also a write to the file's
metadata (its access time).
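
If you can influence how the network filesystem is mounted (the mount point below is only an
example), that cost can be avoided with the noatime option, either by remounting:

mount -o remount,noatime /network/mountpoint

or by adding noatime to the options column of that filesystem's entry in /etc/fstab.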

Instead of /dev/shm (which is shared with other processes, hence the crash), why not have
your own ramdisk:
mkdir -p /media/nameme
mount -t tmpfs -o size=2048M tmpfs /media/nameme/

Of course, at the start of your render, if the ramdisk is already there, clean it first (it
is all yours, unlike /dev/shm), because anything in it is left over from an aborted previous
session.
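
A small wrapper along these lines would also handle the cleanup automatically, even when the
job is aborted, which is what caused the trouble with the shared /dev/shm. The paths, image
size and row range are only placeholders:

#!/bin/sh
# Copy the scene to a private ramdisk, render from there, always clean up.
RAMDISK=/media/nameme
SCENE=/network/path/scene.pov

mkdir -p "$RAMDISK/work"
# Cleanup runs on normal exit and on most signals (SIGKILL cannot be trapped).
trap 'rm -rf "$RAMDISK/work"' EXIT INT TERM

cp "$SCENE" "$RAMDISK/work/"
povray "+I$RAMDISK/work/scene.pov" +W1920 +H1080 +SR1 +ER270 +Ooutput.png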

