POV-Ray: Newsgroups: povray.beta-test: POV-Ray v3.8.0-x.freetype.1: Re: POV-Ray v3.8.0-x.freetype.1

From: William F Pokorny
Date: 7 Feb 2019 11:31:43
Message: <5c5c5d6f$1@news.povray.org>
On 2/3/19 10:35 AM, William F Pokorny wrote:
> On 2/3/19 4:04 AM, clipka wrote:
>> Am 17.01.2019 um 15:05 schrieb William F Pokorny:
>>
>>>> - Performance has degraded a bit, but I'm willing to accept this for 
>>>> the sake of extended functionality and easier maintenance.
>>>>
>>>
>>> Hmm, I'm surprised some by this. Are your test character strings 
>>> really short? In the existing text shape code all the characters 
>>> ended up more or less as one huge glyph as you know. As the string to 
>>> the text shape got large, performance slowed substantially.
>>
>> Actually, wading through the old code for unrelated reasons, I just 
>> noticed that this isn't true: The old `text` primitive has actually 
>> been a CSG union all along, with one child per character.
> 
> Hmm. Not my recollection or experience. I was focused on inside tests if 
> those were perhaps done differently than intersections. The glyph loop 
> range testing I added helped regular intersection performance too, but 
> less.
> 
...
> 

I'm trying not to slide too far sideways into work I did almost two 
years ago, given I've already got more going than I'll ever finish. Can't 
completely help it, I guess. Over the past days I kept asking myself how 
I could have seen such large performance improvements if the text object 
was already a union (Christoph and Alain being almost certainly right).

- Remembered two years ago I was mostly going after performance using 
hardware counter analysis. I was going after hot spots.

- Remembered a decade ago, with the objectAsIso experiments, how I looked 
hard at converting all but the simplest csg to a mesh because the 
inside test performance got so slow with larger/more complex csg.

- Remembered Lanuhum's Blender hair scene and thinking it shouldn't 
really be that slow, even with super slow sphere_sweeps.

- Remembered thinking, on trying it years ago, that the +bm2 mode should 
provide more performance than it does.

- Remembered learning early last year while finding the cause of a 
particular speckling bug, that POV-Ray does inside tests when it creates 
original rays.

- Remembered thinking that fixes to that particular speckling bug and issues 
like: https://github.com/POV-Ray/povray/issues/139 will likely require 
multiple inside tests rather than one to account for floating point noise, 
no matter how much accuracy is improved.

---
These last two thoughts led me to do a new performance test with our 
ttf1.pov sample scene, one where all I changed was whether or not AA was 
used. My thinking was that the performance improvement of my text branch 
should be more or less the same with and without AA. It should be the 
same unless the text object is a union and the real performance 
problem is the csg inside test mechanism doing something like just 
trundling through ALL the shapes in a csg.

Results were not similar. This leads me to a new suspicion / unproven 
theory that we are sitting on a csg inside test performance issue, one 
perhaps affecting things generally. My code changes to the text object 
only treated a different problem.
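
To make the suspicion concrete, here is a minimal Python sketch of the 
difference between a union inside test that walks every child and one 
that first rejects children by bounding box. It is a toy model only: the 
names (Box, CountingShape, union_inside_naive, union_inside_bounded) are 
hypothetical, and it makes no claim about how POV-Ray's csg code is 
actually written.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned bounding box given by its low and high corners."""
    lo: tuple
    hi: tuple

    def contains(self, p):
        return all(l <= c <= h for l, c, h in zip(self.lo, p, self.hi))


class CountingShape:
    """Stand-in for one character of a text union; counts inside tests.

    Assumption: a real glyph inside test is far more expensive than the
    box check used here as a placeholder.
    """

    def __init__(self, bounds):
        self.bounds = bounds
        self.calls = 0

    def inside(self, p):
        self.calls += 1
        return self.bounds.contains(p)


def union_inside_naive(children, p):
    # Runs the full inside test on each child until one says "inside".
    return any(child.inside(p) for child in children)


def union_inside_bounded(children, p):
    # Cheap bounding-box rejection before the full inside test.
    return any(child.bounds.contains(p) and child.inside(p)
               for child in children)


def make_glyphs(n):
    # n unit-height "characters" laid out side by side along x.
    return [CountingShape(Box((i, 0.0, 0.0), (i + 0.8, 1.0, 0.2)))
            for i in range(n)]


def total_calls(children):
    return sum(child.calls for child in children)


point = (250.0, 0.5, 0.1)  # well outside every glyph

glyphs = make_glyphs(100)
union_inside_naive(glyphs, point)
naive_calls = total_calls(glyphs)

glyphs = make_glyphs(100)
union_inside_bounded(glyphs, point)
bounded_calls = total_calls(glyphs)

print(naive_calls, bounded_calls)  # 100 0
```

In this model a point outside the whole string costs one full inside 
test per character in the naive walk and none with bounding rejection, 
so the gap grows with string length. Whether anything like this is what 
actually happens in the csg inside test is exactly the unproven part.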

I've spent zero time in the csg code and I'm working/playing elsewhere 
for the foreseeable future. I've put digging here on my maybe-someday 
list, for what little that's worth! My new theory is perhaps wrong too, 
but there is something going on inside-test-wise which is not very 
efficient.

Aside: Even looking at single shapes of a type, the inside test 
performance is often awful. This is why that isosurface peeling paint 
skin test of Thomas's superellipsoid using the hard/soft object patch 
was so extremely slow. IIRC, a box replacement was more than 20x faster.

Bill P.

--------------------------- Data for those interested.

/usr/bin/time povray ttf1.pov +am2 +a0.1 +r4 +wt1 \
                      -j -cc -fn -d -p +w2000 +h2000

p380
---------
12.39user 0.04system 0:12.71elapsed With +a0.1
12.21user 0.03system 0:12.51elapsed
12.27user 0.03system 0:12.56elapsed
-----
36.87

4.40user 0.03system 0:04.70elapsed  With -a0.1
4.36user 0.05system 0:04.65elapsed
4.40user 0.02system 0:04.66elapsed
----
13.16

p380 + my text shape branch
---------
12.08user 0.04system 0:12.40elapsed With +a0.1
12.08user 0.04system 0:12.36elapsed
12.18user 0.03system 0:12.45elapsed
-----
36.34                               -1.44%

4.39user 0.01system 0:04.64elapsed  With -a0.1
4.32user 0.04system 0:04.59elapsed
4.34user 0.03system 0:04.64elapsed
----
13.05                               -0.84%  (??? Hmm)


----------- Repeating a few previous perf tests for my text branch.

soft_object.pov   p380 + hard/soft object only.
----
345.96user 0.08system 5:46.50elapsed   As 'text union'

64.91user 0.04system 1:05.52elapsed    Individual chars at top (-81.24%)


soft_object.pov   p380 + hard/soft object + my text shape branch.
----
161.22user 0.05system 2:41.78elapsed  As 'text union' -53.40%

69.42user 0.02system 1:09.99elapsed   Individual chars at top  (+6.95%)
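
For anyone wanting to re-check the percentages, a small Python sketch 
that recomputes them from the user times listed above. The pct_change 
helper and the variable names are made up for illustration, and the 
choice of baseline for each comparison is inferred from which pairs 
reproduce the posted figures.

```python
def pct_change(baseline, candidate):
    """Relative change of candidate versus baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


# ttf1.pov: user seconds for three runs each, copied from the data above.
p380_aa = [12.39, 12.21, 12.27]
p380_noaa = [4.40, 4.36, 4.40]
branch_aa = [12.08, 12.08, 12.18]
branch_noaa = [4.39, 4.32, 4.34]

# soft_object.pov: single runs, user seconds.
soft_union = 345.96          # hard/soft only, as 'text union'
soft_chars = 64.91           # hard/soft only, individual chars at top
soft_branch_union = 161.22   # + text shape branch, as 'text union'
soft_branch_chars = 69.42    # + text shape branch, individual chars

results = {
    "ttf1 +a0.1, branch vs p380":
        pct_change(sum(p380_aa), sum(branch_aa)),
    "ttf1 -a0.1, branch vs p380":
        pct_change(sum(p380_noaa), sum(branch_noaa)),
    "soft_object, chars vs union":
        pct_change(soft_union, soft_chars),
    "soft_object union, branch vs p380":
        pct_change(soft_union, soft_branch_union),
    "soft_object chars, branch vs p380":
        pct_change(soft_chars, soft_branch_chars),
}

for label, value in results.items():
    print(f"{label}: {value:+.2f}%")
```

This prints -1.44%, -0.84%, -81.24%, -53.40% and +6.95% in that order, 
matching the figures in the listing.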