POV-Ray: Newsgroups: povray.programming: SIMD implementation of dot-product in POV-Ray???



From: Thomas Willhalm
Subject: Memory allocation (was: Re: SIMD implementation of dot-product in POV-Ray???)
Date: 29 Nov 1999 05:59:32
Message: <qqmiu2l5zsb.fsf_-_@goldach.fmi.uni-konstanz.de>

DISCLAIMER:
I'm not too familiar with the way POV-Ray handles its memory.

"Thorsten Froehlich" <tho### [at] trfde> writes:
>
> Whenever I profiled, I found that POV-Ray spends a lot of time doing memory
> allocations...

Couldn't it be useful to have two methods for allocating memory?
1) the POV_MALLOC used so far
2) a new method for memory that can only be freed on exit of the program.

With method 2), POV-Ray can allocate large parts of memory instead of little
chunks. The size of the "little chunks" doesn't matter anymore and therefore
needn't be stored. This decreases memory consumption and its handling
by the OS. The "large parts" could be handled by a linked list. Only for the
last part -- the current one -- a pointer to the last byte (word) that is 
used is necessary.

However, I don't know whether:
a) it's really faster
b) the scenario for 2) occurs often enough in POV-Ray to justify the effort
of implementing this.

Thomas

-- 
http://thomas.willhalm.de/ (includes pgp key)


From: Mark Wagner
Subject: Re: Memory allocation (was: Re: SIMD implementation of dot-product in POV-Ray???)
Date: 30 Nov 1999 00:29:36
Message: <384360c0@news.povray.org>

Thomas Willhalm wrote in message ...
>Couldn't it be useful to have two methods for allocating memory?
>1) the POV_MALLOC used so far
>2) a new method for memory that can only be freed on exit of the program.
>
>With method 2), POV-Ray can allocate large parts of memory instead of little
>chunks. The size of the "little chunks" doesn't matter anymore and therefore
>needn't be stored. This decreases memory consumption and its handling
>by the OS. The "large parts" could be handled by a linked list. Only for the
>last part -- the current one -- a pointer to the last byte (word) that is
>used is necessary.


With a linked list, you still have the overhead of remembering which parts
of the large chunk of memory are in use.

Mark


From: Thorsten Froehlich
Subject: Re: Memory allocation (was: Re: SIMD implementation of dot-product in POV-Ray???)
Date: 30 Nov 1999 01:06:58
Message: <38436982@news.povray.org>

In article <qqmiu2l5zsb.fsf_-_@goldach.fmi.uni-konstanz.de> , Thomas 
Willhalm <tho### [at] willhalmde>  wrote:

> I'm not too familiar with the way POV-Ray handles its memory.

OK :-)

> "Thorsten Froehlich" <tho### [at] trfde> writes:
>>
>> Whenever I profiled, I found that POV-Ray spends a lot of time doing memory
>> allocations...
>
> Couldn't it be useful to have two methods for allocating memory?
> 1) the POV_MALLOC used so far
> 2) a new method for memory that can only be freed on exit of the program.

Well, that would work for the command line versions, but it would create
various problems for the Windows and Macintosh GUI versions.

> With method 2), POV-Ray can allocate large parts of memory instead of little
> chunks. The size of the "little chunks" doesn't matter anymore and therefore
> needn't be stored. This decreases memory consumption and its handling
> by the OS.

The Mac version is currently doing this. You can get a memory allocation
down to about 100 cycles this way (plus a new allocation from the system
every 16th call, on average, with the current Mac method), but there are
easily ten million or more allocations even for simple renders. That makes
a billion cycles, or about 2 seconds even on fast processors, just for the
cached allocations, plus about 100000 cycles for the system memory
allocation functions (Mac OS). A simple example is pyramid2.pov (a sample
scene that comes with POV-Ray 3.1). Rendering its default recursion level
(six) results in 23437 objects. Rendering those at 640 * 480 with
anti-aliasing (Method 1, Threshold 0.300, Depth 5, Jitter 0.00) ends up in
about five million memory allocations. Changing the whole thing to a glass
spheres

> The "large parts" could be handled by a linked list. Only for the
> last part -- the current one -- a pointer to the last byte (word) that is
> used is necessary.

You don't want to walk through lists; it is easier and more efficient to use
a bitmap (not the image term, the computer science term): a simple array of
bits which mark whether memory in a particular location is used or not. You
can then use a few "bit tricks" to find an empty cell. In order not to have
to mess around with different cell sizes, you just divide cells into groups
by size, e.g. if a block of memory with 47 bytes is allocated, you allocate
47 * 32 bytes and manage those yourself. The next allocation of 47 bytes
will be much faster (and POV-Ray uses a lot of blocks of the same sizes). If
you limit yourself to "caching" only the lower range of allocations, e.g. 1
to 4096 bytes, you can manage the whole "cache" of allocated memory easily.
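
The bitmap idea above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the PowerMac code: the names Pool, pool_init, pool_alloc, pool_free, pool_destroy and find_zero_bit are hypothetical, one pool holds 32 cells tracked by a single 32-bit word, the cell size is rounded up for alignment (so 47 becomes 48 on typical platforms), and a plain loop stands in for the single "find a zero bit" instruction. It assumes a C11 compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CELLS_PER_POOL 32u  /* one bit per cell in a 32-bit word */

/* Hypothetical pool for one size class: 32 equal cells plus a bitmap. */
typedef struct Pool {
    uint32_t used_bits;    /* bit i set => cell i is in use */
    size_t cell_size;      /* bytes per cell, rounded up for alignment */
    unsigned char *cells;  /* CELLS_PER_POOL * cell_size bytes */
} Pool;

/* Portable stand-in for a "find a zero bit" instruction.
   Returns the index (0..31) of the lowest clear bit, or -1 if none. */
static int find_zero_bit(uint32_t word)
{
    uint32_t free_bits = ~word;
    int index = 0;
    if (free_bits == 0)
        return -1;
    while ((free_bits & 1u) == 0) {
        free_bits >>= 1;
        index++;
    }
    return index;
}

/* Ask the system once for all 32 cells. Returns 1 on success, 0 on failure. */
int pool_init(Pool *pool, size_t cell_size)
{
    size_t a = _Alignof(max_align_t);
    if (cell_size == 0)
        cell_size = 1;
    cell_size = (cell_size + a - 1) / a * a;  /* keep every cell aligned */
    pool->used_bits = 0;
    pool->cell_size = cell_size;
    pool->cells = malloc(CELLS_PER_POOL * cell_size);
    return pool->cells != NULL;
}

/* Hand out a free cell, or NULL when the pool is full (the caller would
   then fall back to another pool or to the system allocator). */
void *pool_alloc(Pool *pool)
{
    int i = find_zero_bit(pool->used_bits);
    if (i < 0)
        return NULL;
    pool->used_bits |= (uint32_t)1 << i;
    return pool->cells + (size_t)i * pool->cell_size;
}

/* Freeing only clears the bit; no per-block header is needed. */
void pool_free(Pool *pool, void *ptr)
{
    size_t offset = (size_t)((unsigned char *)ptr - pool->cells);
    size_t i = offset / pool->cell_size;
    assert(i < CELLS_PER_POOL && offset % pool->cell_size == 0);
    pool->used_bits &= ~((uint32_t)1 << i);
}

void pool_destroy(Pool *pool)
{
    free(pool->cells);
    pool->cells = NULL;
}
```

A full allocator along these lines would keep one or more such pools per size class up to the 4096-byte limit. The loop in find_zero_bit could be replaced by a compiler intrinsic where one is available (GCC's __builtin_ctz applied to the inverted word, for example), at the cost of portability.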

> However, I don't know whether:
> a) it's really faster

It is, but eliminating memory allocations in "strategic" places altogether
would speed things up even more. However, doing so would need quite a few
modifications in the source code of POV-Ray, while changing the allocation
functions is simpler because they are external.

> b) the scenario for 2) occurs often enough in POV-Ray to justify the effort
> of implementing this.

It does, and doing it is not very difficult or a lot of work - the current
implementation on the PowerMac version of POV-Ray has just a few hundred
lines of code (however, due to a single inline assembler instruction used to
find a zero bit in a word (32 bits) it is not fully portable with the same
speed right now - but that is a very, very long story).

  Thorsten

____________________________________________________
Thorsten Froehlich, Duisburg, Germany
e-mail: tho### [at] trfde

Visit POV-Ray on the web: http://mac.povray.org


From: Thomas Willhalm
Subject: Re: Memory allocation (was: Re: SIMD implementation of dot-product in POV-Ray???)
Date: 30 Nov 1999 06:11:25
Message: <qqmemd85j4y.fsf@goldach.fmi.uni-konstanz.de>

Sorry for replying to my own post, but since Thorsten and Mark didn't get
my point (which is probably due to my limited knowledge about the English 
language) I will give a more detailed description of my idea.

Thomas Willhalm <tho### [at] willhalmde> writes:
> 
> "Thorsten Froehlich" <tho### [at] trfde> writes:
> >
> > Whenever I profiled, I found that POV-Ray spends a lot of time doing memory
> > allocations...
> 
> Couldn't it be useful to have two methods for allocating memory?
> 1) the POV_MALLOC used so far
> 2) a new method for memory that can only be freed on exit of the program.
> 
> With method 2), POV-Ray can allocate large parts of memory instead of little
> chunks. The size of the "little chunks" doesn't matter anymore and therefore
> needn't be stored. This decreases memory consumption and its handling
> by the OS. The "large parts" could be handled by a linked list. Only for the
> last part -- the current one -- a pointer to the last byte (word) that is 
> used is necessary.

I imagine that there are a lot of objects in POV-Ray for which memory is 
allocated once and used until the end of the rendering. To give a concrete
example, imagine a box being parsed. The corresponding memory is allocated
at this time. The memory will be used until rendering finishes.

Now, I make the following assumption: For a lot of objects (in the sense of
programming) we can guarantee that we will use them until the end of the
rendering. This might be the case for objects (in the sense of POV-Ray),
textures, density maps and so on.

After accepting this assumption my idea comes into play: Why should I store
the necessary information to free the memory of every single object when
I will free them all at once? So, let us reserve a large part of memory 
(e.g. 30 MB). When an object is created (i.e. memory is allocated) we put it 
on top of the memory used so far. All we need is to store a pointer to the 
last address that has been used, because we will free our large part of 
memory all at once (when the rendering is finished).

The problem is of course that we don't know at the beginning of parsing,
how much memory we will need. Thus, we have to split the memory into 
smaller parts (e.g. 64 KB). This is where I want to use a linked list:
to connect the parts of memory. We will use this linked list only once
when freeing the memory at the end of the rendering. That's why this method
is still in O(n). (Allocating memory for a new object takes constant time
and should be faster than the standard way.)
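
A minimal sketch of this scheme, assuming a C11 compiler. The names Arena, ArenaChunk, arena_alloc and arena_free_all are made up for illustration and are not part of POV-Ray; the 64 KB chunk size is the one from the example above. No per-object size is stored, and the linked list is walked only once, when everything is freed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define ARENA_CHUNK_SIZE ((size_t)64 * 1024)  /* "smaller parts (e.g. 64 KB)" */

/* One part of memory. Parts are linked only so that they can all be
   released together at the end of the rendering. */
typedef struct ArenaChunk {
    struct ArenaChunk *next;  /* previously filled chunk, or NULL */
    size_t used;              /* bytes handed out so far from data[] */
    size_t capacity;          /* usable bytes in data[] */
    max_align_t data[];       /* storage, suitably aligned for any object */
} ArenaChunk;

typedef struct Arena {
    ArenaChunk *current;  /* the only chunk that still accepts allocations */
} Arena;

/* Constant-time allocation: bump the "last used" offset of the current
   chunk. Returns NULL only if the system allocator fails. */
void *arena_alloc(Arena *arena, size_t size)
{
    size_t a = _Alignof(max_align_t);
    ArenaChunk *c;
    void *p;

    assert(arena != NULL);
    if (size == 0)
        size = 1;
    size = (size + a - 1) / a * a;  /* round up so the next object is aligned */

    c = arena->current;
    if (c == NULL || c->capacity - c->used < size) {
        /* Current chunk cannot hold the request: start a new one.
           Oversized requests get a chunk of their own size. */
        size_t cap = size > ARENA_CHUNK_SIZE ? size : ARENA_CHUNK_SIZE;
        ArenaChunk *fresh = malloc(offsetof(ArenaChunk, data) + cap);
        if (fresh == NULL)
            return NULL;
        fresh->next = c;
        fresh->used = 0;
        fresh->capacity = cap;
        arena->current = c = fresh;
    }
    p = (unsigned char *)c->data + c->used;
    c->used += size;
    return p;
}

/* The only way to free: release every chunk, O(number of chunks). */
void arena_free_all(Arena *arena)
{
    ArenaChunk *c = arena->current;
    while (c != NULL) {
        ArenaChunk *next = c->next;
        free(c);
        c = next;
    }
    arena->current = NULL;
}
```

Usage would be something like `Arena a = { NULL };` at the start of parsing, `arena_alloc(&a, n)` for every object that lives until the end of the rendering, and a single `arena_free_all(&a)` afterwards. Anything that has to be freed earlier would still go through POV_MALLOC.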

What Thorsten writes about the memory management in POV-Ray for Mac is very
similar to what I suggest -- except that I want to "forget" where the
memory belongs to and the size of the corresponding object. As mentioned in
the introduction, I expect this method to work only for some (perhaps even
most) objects in POV-Ray.

I hope that my description is much clearer now.
Thomas

-- 
http://thomas.willhalm.de/ (includes pgp key)


From: Ron Parker
Subject: Re: Memory allocation (was: Re: SIMD implementation of dot-product in POV-Ray???)
Date: 30 Nov 1999 08:23:58
Message: <3843cfee@news.povray.org>

On 30 Nov 1999 12:11:25 +0100, Thomas Willhalm wrote:
>
>Sorry for replying to my own post, but since Thorsten and Mark didn't get
>my point (which is probably due to my limited knowledge about the English 
>language) I will give a more detailed description of my idea.
[...]

I think Thorsten got your point, in that he mentioned it would cause problems
for the GUI versions because they don't terminate.  Of course, they do stop 
rendering at some point, and you could easily clean up the large chunks o' 
memory at that point.  Just make sure that GUI stuff never allocates memory
from there. 

I think this idea has some merit, but I am concerned somewhat about the 
possibility of wasting some large blocks of memory.  Consider what happens
when the current chunk o' memory has 31K left available and we ask to allocate
a 32K block of memory.  Not that 31K of wasted memory is horribly significant
these days, but those blocks could add up.  I suppose it's a tradeoff for the
expected time savings.
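
To put a number on that concern, here is a small sketch in C that counts the bytes abandoned at the ends of chunks for a given sequence of requests. The function name wasted_bytes is hypothetical, and the model is an assumption: fixed-size chunks, with the tail of the current chunk given up whenever a request does not fit.

```c
#include <assert.h>
#include <stddef.h>

/* Simulates bump allocation from fixed-size chunks and returns the total
   number of bytes left unused at the end of abandoned chunks. The unused
   tail of the final chunk is not counted, since it could still be used.
   Requests larger than a chunk are assumed to get a dedicated block. */
size_t wasted_bytes(const size_t *requests, size_t count, size_t chunk_size)
{
    size_t remaining = 0;  /* free bytes in the current chunk */
    size_t wasted = 0;
    size_t i;

    assert(requests != NULL || count == 0);
    for (i = 0; i < count; i++) {
        if (requests[i] > chunk_size)
            continue;                 /* dedicated block, no tail lost */
        if (requests[i] > remaining) {
            wasted += remaining;      /* give up the tail of this chunk */
            remaining = chunk_size;   /* and start a fresh chunk */
        }
        remaining -= requests[i];
    }
    return wasted;
}
```

With 64K chunks (the size from Thomas's message), a 33K request followed by a 32K request leaves 31K free in the first chunk, and all of it is abandoned: `wasted_bytes` returns 31 * 1024 for that sequence.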

Also... it should be made a little more flexible, somehow.  If your scene has
lots of #declared objects, textures, etc. there are also a fairly large number 
of allocations that get freed at the end of parsing.  It might be nice to free
them all at once too.

-- 
These are my opinions.  I do NOT speak for the POV-Team.
The superpatch: http://www2.fwi.com/~parkerr/superpatch/
My other stuff: http://www2.fwi.com/~parkerr/traces.html

