From: William F Pokorny
Subject: Re: Filed() macro for CSV data file handling
Date: 27 Oct 2021 07:02:54
Message: <617931de$1@news.povray.org>
On 10/26/21 9:57 AM, jr wrote:
> "jr" <cre### [at] gmailcom> wrote:
...
>
> with 100k sphere centre and radius records for instance, POV-Ray needs around
> one second for the writing and the same for the read back. Filed() comes in at
> around eight seconds write and ten seconds read.
>
Hi,
I'm a little surprised at the magnitude of the slowdown. Are you timing
with a v3.8 beta or something else? I'm assuming the files are all on a
ramdisk?
Thinking aloud...
If a povr branch, was it heavily optimized ahead of the build/make via
configure script options?
Re hashing: the povr branch is running with a C++11 string hash function
for the symbol tables in the parser / VM in place of the traditional
POV-Ray one. In my testing the C++11 one is only faster if the compile
options are relatively aggressive (1,2).
In general, with the dictionary stuff being relatively new to v3.8,
there is likely room to make the parser faster around that functionality.
I'd be willing to play a little with your 100K spheres case - if you
want?
Bill P.
(1) - A reminder, the povr branch build system does NO optimization by
default when the configure script is run.
(2) - The C++11 hashing of strings distributes much better, but it's more
complex code and takes longer to compute than POV-Ray's method. The old
hashing tends to clump table entries around the leading ASCII character
value(s).
hi,
William F Pokorny <ano### [at] anonymousorg> wrote:
> On 10/26/21 9:57 AM, jr wrote:
> > with 100k sphere centre and radius records for instance, POV-Ray needs around
> > one second for the writing and the same for the read back. Filed() comes in at
> > around eight seconds write and ten seconds read.
> >
>
> I'm a little surprised at the magnitude of the slow down. Are you timing
> with a v3.8 beta or something else? I'm expecting the files are all on a
> ramdisk?
the reading did not surprise much, every item returned is "handled" three times,
but the (lack of) write performance I do not understand.
> Thinking aloud...
>
> If a povr branch, ...
ah, no, "stock" alpha.9945627. fyi (snipped the context below), all POV-Rays
and povr etc built with '-O2'.
> I'd be willing to play a little with your 100K spheres case - if you
> want?
thanks, sure. have attached a reader and writer scene. for the writer can use
'declare=N=number' to create number records, for both scenes can use
'declare=FM=true' to use Filed() instead of "raw" directives.
regards, jr.
Attachments:
Download 'wfp_filed.tar.xz.dat' (1 KB)
From: William F Pokorny
Subject: Re: Filed() macro for CSV data file handling
Date: 28 Oct 2021 06:11:36
Message: <617a7758$1@news.povray.org>
On 10/28/21 3:02 AM, jr wrote:
...
>
> the reading did not surprise much, every item returned is "handled" three times,
> but the (lack of) write performance I do not understand.
...
>> I'd be willing to play a little with your 100K spheres case - if you
>> want?
>
> thanks, sure. have attached a reader and writer scene. for the writer can use
> 'declare=N=number' to create number records, for both scenes can use
> 'declare=FM=true' to use Filed() instead of "raw" directives.
...
Thank you. I've grabbed it. I'll need my morning coffee too! :-)
Thinking I'll do some real profiling again. It's gotten to be a year or
more since I've done any - and I'd like to look at rtr that way too.
With rtr I can tell some of the delay with smaller render sizes is the
mutex locking mechanism itself. This aligns with a note I'd guess
Chris made way back when in the code. Beyond that, I'm not sure what
profiling might reveal - or not.
Bill P.
From: William F Pokorny
Subject: Re: Filed() macro for CSV data file handling
Date: 28 Oct 2021 08:49:03
Message: <617a9c3f$1@news.povray.org>
On 10/28/21 6:11 AM, William F Pokorny wrote:
> On 10/28/21 3:02 AM, jr wrote:
...
>
> Thank you. I've grabbed it. I'll need my morning coffee too! :-)
>
...
Not yet to profiling, but I've turned up a few things. First, I don't
see the +700% write-side slowdown you see, but rather about +280%, so
this is more in line with your expectations for write vs. read.
I'm interested in where povr stands, and I was surprised to find it
slower than p380b1.
On investigation, I found that the parser updates coming after what is
in the v3.8 betas(1) - which povr adopted / branched from - are
themselves slower, more significantly so on the write side. My povr
branch is running faster than its branch point, but... I guess for me
that is the first thing to try and figure out.
Bill P.
(1) - Christoph backtracked the newest parser updates for the v3.8 release.
Ref:
stress_wr.pov (100k)
p380 raw - 2.05user 0.25system 0:02.84elapsed 81%CPU
p380 fld - 6.87user 0.33system 0:07.74elapsed 93%CPU
p380b1 raw - 1.57user 0.02system 0:02.14elapsed 74%CPU
p380b1 fld - 6.02user 0.06system 0:06.61elapsed 91%CPU
povr raw - 1.92user 0.24system 0:02.71elapsed 79%CPU
povr fld - 6.39user 0.28system 0:07.21elapsed 92%CPU
p380b1 raw ---> fld 1.57 -> 6.02 ---> +283.44% (You suggested +700%)
p380 raw ---> fld 2.05 -> 6.87 ---> +235.12%
povr raw ---> fld 1.92 -> 6.39 ---> +232.81%
p380b1 -> p380 raw 1.57 -> 2.05 ---> +30.57% (p380 povr branch point)
p380 -> povr raw 2.05 -> 1.92 ---> -6.34% (povr < 380 branch point)
p380b1 -> p380 fld 6.02 -> 6.87 ---> +14.12% (p380 povr branch point)
p380 -> povr fld 6.87 -> 6.39 ---> -6.99% (povr < 380 branch point)
stress_rd.pov (100k)
p380 raw - 0.64user 0.02system 0:01.20elapsed 55%CPU
p380 fld - 8.39user 0.08system 0:09.00elapsed 94%CPU
p380b1 raw - 0.62user 0.03system 0:01.19elapsed 54%CPU
p380b1 fld - 7.69user 0.07system 0:08.29elapsed 93%CPU
povr raw - 0.61user 0.01system 0:01.17elapsed 53%CPU
povr fld - 7.90user 0.06system 0:08.50elapsed 93%CPU
p380b1 raw ---> fld 0.62 -> 7.69 ---> +1140.32%
p380 raw ---> fld 0.64 -> 8.39 ---> +1210.94%
povr raw ---> fld 0.61 -> 7.90 ---> +1195.08%
p380b1 -> p380 raw 0.62 -> 0.64 ---> +3.23% (p380 povr branch point)
p380 -> povr raw 0.64 -> 0.61 ---> -4.69% (povr < 380 branch point)
p380b1 -> p380 fld 7.69 -> 8.39 ---> +9.10% (p380 povr branch point)
p380 -> povr fld 8.39 -> 7.90 ---> -5.84% (povr < 380 branch point)
hi,
William F Pokorny <ano### [at] anonymousorg> wrote:
> ...
> Not yet to profiling, but I've turned up a few things. First, I don't
> see the +700% write side slow down you see, but rather about +280% so
> this is more in line with your expectations for write vs read.
intend to install the new povr in the coming days. will then run some tests,
including another 100k. thanks for posting the timings. (1200% is a high price
to pay, for convenience.. :-))
regards, jr.
From: William F Pokorny
Subject: Re: Filed() macro for CSV data file handling
Date: 29 Oct 2021 09:36:04
Message: <617bf8c4$1@news.povray.org>
On 10/29/21 5:35 AM, jr wrote:
> hi,
>
> William F Pokorny <ano### [at] anonymousorg> wrote:
>> ...
>> Not yet to profiling, but I've turned up a few things. First, I don't
>> see the +700% write side slow down you see, but rather about +280% so
>> this is more in line with your expectations for write vs read.
>
> intend to install the new povr in the coming days. will then run some tests,
> including another 100k. thanks for posting the timings. (1200% is a high price
> to pay, for convenience.. :-))
>
Ultimately it's user time which matters. If the machines pay the cost
and users still win on measured task time, it's the right trade-off.
---
I understand more as of this morning. A large chunk of the p380 branch /
povr slowdown was due to always-active soft asserts Christoph added in
some of the last parser updates. These took me a while to understand and
untangle / turn off for the 'regular' povr build.
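For readers unfamiliar with the distinction: the cost difference is between checks that always execute and checks a build can strip out. A rough Python analogy (function names are mine, not POV-Ray's) is a hand-rolled check versus a plain `assert`, which disappears under `python -O`:

```python
def soft_assert(cond: bool, msg: str = "") -> None:
    # Always active: this call and branch are paid on every invocation,
    # like the parser checks described above.
    if not cond:
        raise AssertionError(msg)

def parse_value(tok: str) -> float:
    soft_assert(tok != "", "empty token")  # always runs, release or not
    assert tok.strip() == tok              # stripped out under 'python -O'
    return float(tok)

print(parse_value("2.5"))  # 2.5
```

When such checks sit on a hot per-token path, paying them unconditionally shows up directly in parse time, which matches the slowdown described above.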
So! Timing with my in hand povr.
stress_wr.pov
p380b1 ---> povr fld 6.87 -> 5.42 ---> -21.11%
p380b1 raw ---> fld +283.44%
povr raw ---> fld +234.57% (-40%)
stress_rd.pov
p380b1 ---> povr fld 8.39 -> 6.38 ---> -23.98%
p380b1 raw ---> fld +1140.32%
povr raw ---> fld +1103.77% (-36%)
So, I'm happy to see that povr, and likely p380 too, without the soft
asserts, is quite a bit faster. I expect some of the speedup is due to
Christoph's newer parser updates and some due to my changes, but I've
not done the p380 builds and timings to see how the savings break out -
I'm not all that interested, so that work probably won't happen.
A puzzle I've not been able to run down is the larger 'system' write
times post v3.8 beta. It's very roughly around +12.5% of the raw and fld
write slowdown from v3.8 beta to p380 (so it hits povr too, though povr
is faster overall). Something is driving up that system time for raw and
fld writes, but I've not been able to find it.
For now I'm walking away from it - Christoph cleaned up / changed quite
a lot related to file and text streams over the last months ahead of the
branch point for povr. I don't see an obvious cause - and my
guesses (shots in the dark) as to the cause have missed.
I hope to get to some actual code profiling today.
Bill P.
From: William F Pokorny
Subject: Re: Filed() macro for CSV data file handling
Date: 30 Oct 2021 19:10:25
Message: <617dd0e1$1@news.povray.org>
On 10/29/21 9:36 AM, William F Pokorny wrote:
> A puzzle I've not been able to run down is the larger 'system' write
> times post v3.8 beta. It's very roughly around +12.5% of the raw and fld
> write slow down v3.8 beta to p380 (so it hits povr too though it's
> faster overall). Something is driving up that system time from raw and
> fld writes, but I've not been able to find it.
An update.
--- stress_wr.pov
We can cut(1) the write-side cpu time by 7.0% and the elapsed time
by 9.3% by not using rand.inc's VRand(rng_) macro, but rather writing:
#local arr_[i_] = array mixed [2] \
{<rand(rng_),rand(rng_),rand(rng_)>,rand(rng_)};
Further, it appears the macro caching of VRand() is somehow tangled up
in the increased system time from v3.8 beta to the p380 branch / povr -
I still don't see how. It might still be that if that cause could be run
down, one could use VRand() without as much penalty.
(1) - To speed compiles I dropped the link time optimization. The
improvement when configuring with --enable-lto will be somewhat different.
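The gain is analogous to hoisting a per-item helper call out of a hot loop. A hedged Python sketch (function names are mine; POV-Ray's macro overhead comes from re-reading the cached macro body, which is heavier than a Python call): both fillers produce identical records from the same seed, but the inline version drops one wrapper invocation per record:

```python
import random

def vrand(rng: random.Random):
    # Wrapper analogous to rand.inc's VRand() macro: one extra
    # invocation per record.
    return (rng.random(), rng.random(), rng.random())

def fill_via_wrapper(n: int, seed: int = 1):
    rng = random.Random(seed)
    return [(vrand(rng), rng.random()) for _ in range(n)]

def fill_inline(n: int, seed: int = 1):
    rng = random.Random(seed)
    r = rng.random  # bind once, call directly for each field
    return [((r(), r(), r()), r()) for _ in range(n)]

# Same seed, same call sequence -> identical <centre>, radius records.
print(fill_via_wrapper(3, seed=7) == fill_inline(3, seed=7))  # True
```

The design point carries over to SDL: the data written is unchanged, only the per-record invocation overhead goes away.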
--- stress_rd.pov
A big chunk of the time is in the scanner.cpp code qualifying bits and
pieces of potential tokens. As to how to make all that 'significantly'
faster within the current framework - no luck thus far. We'll see.
Bill P.
From: William F Pokorny
Subject: Re: Filed() macro for CSV data file handling
Date: 31 Oct 2021 10:47:09
Message: <617eac6d$1@news.povray.org>
On 10/30/21 7:10 PM, William F Pokorny wrote:
> A big chunk of the time is in the scanner.cpp code qualify bits and
> pieces of potential tokens. As to how to make that all 'significantly'
> faster within the current framework - no luck thus far. We'll see.
OK. I chased one pocket of scanner expense by changing a set of
conditional tests for character classification into a boolean array[256]
with predetermined answers. I wanted to see how much movement /
improvement I could get. It was a gain of about 3% on the write side and
2% on the read.
I think I see my way clear to folding up to eight character
classifications into the space the boolean array is actually taking.
This would allow us to change some (all?) of the 'if else if... else'
conditional chains to a more direct switch construct. A WILD guess is
we'd at most gain 10-15% total (the 2-3% above being part of that
savings).
If the guess is about right, it would not be a game changer - but the
relative ease of implementation is there. I haven't decided whether to
attempt the change in my povr branch.
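A sketch of the folding idea in Python (the flag names and class choices are my guesses for illustration, not the povr implementation): pack several yes/no classifications into one 256-entry table of bit flags, so classifying a character becomes a single indexed load plus a mask instead of a chain of comparisons:

```python
# Bit flags for a few hypothetical character classes; up to eight
# such classes fit in one byte per character code.
DIGIT, ALPHA, IDENT_START, SPACE = 0x01, 0x02, 0x04, 0x08

# Precompute one flags byte per character code (0..255).
FLAGS = [0] * 256
for code in range(256):
    ch = chr(code)
    if "0" <= ch <= "9":
        FLAGS[code] |= DIGIT
    if ("a" <= ch <= "z") or ("A" <= ch <= "Z"):
        FLAGS[code] |= ALPHA | IDENT_START
    if ch == "_":
        FLAGS[code] |= IDENT_START
    if ch in " \t\r\n":
        FLAGS[code] |= SPACE

def is_ident_start(ch: str) -> bool:
    # One table load and one mask, replacing an if/else-if chain.
    return bool(FLAGS[ord(ch)] & IDENT_START)

print(is_ident_start("_"), is_ident_start("7"))  # True False
```

In C++ the same table of flag bytes also enables the switch-style dispatch mentioned above, since the scanner can branch on the flags value rather than re-testing character ranges.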
Random thinking.
---------------
- In the end we are working character by character in the parser and
this is expensive. Further, we are repeatedly classifying characters and
tokens.
- The non-pure-ASCII (<=255) character encoding costs. An advantage the
utf8 encoding for the SDL has over others is that it's smaller and so faster.
- Relatedly, Christoph is using an internal-to-POV-Ray utf8 class based
upon std::string. This is common practice in C++ programs and it's
safe. It is the case, though, that as we walk that utf8 class character
by character, incrementing pointers, the length checking costs a few
percent of the overall run time. Well... at least the profiling
indicates this length-checking cost.
- My updates above are related to the newest version of the parser and
not the older one in v3.8 beta*.
- The newer parser changes in v4.0 and povr are substantial and they
were only the initial push for what Christoph ultimately had in mind.
Testing of those changes showed improved performance. I'm not, however,
sure what all else was intended. Heck, to a degree the new parser is
still new-ish to me. Until now, I've only dug around in the parser code
to try and fix particular bugs.
- On staring at the profile results now over some days, I have a few
wild ideas banging around in my head for the parser. Unfortunately,
almost all are not that easy to just code up and try.
- A reminder: the v4.0 / povr parsing is only faster than v3.8 beta* for
the test cases in this thread if a collection of parser asserts is
turned off (see configparser.h). Christoph was actively working on the
parser in late 2018 and early 2019; these asserts would have eventually
been turned off for normal release compiles.
Bill P.
hi,
William F Pokorny <ano### [at] anonymousorg> wrote:
> ... It was a gain of about 3% on the write side and
> 2% on the read.
> ...
faster parsing is better of course, but I now think I need to do the read side
differently.
regards, jr.
hi,
'filed.inc' has been updated with a new '.Verbatim' key, inspired by the recent
"animation display window does not reset" thread. example use below.
<https://drive.google.com/file/d/10pGH0yi_-8aBTQvTwQPB4AdfRl9JsTGg/view?usp=sharing>
enjoy, jr.
-----<snip>-----
#version 3.8;
global_settings {assumed_gamma 1}
box {0,1}
#declare fild_workingDir = "/tmp/";
#include "filed.inc"
#declare D = dictionary {
    .File     : "bash1.sh",
    .Access   : "write",
    .Fields   : array [1] {"S"},
    .Data     : array [3] {
        "echo \\\"new feature\\\" '.Verbatim' key,",
        "echo *thanks* JB :').'\necho $(date)",
        "ls --hide=systemd* ; echo"
    },
    .Strict   : false,
    .Verbatim : true,
    .Verbose  : true
};
Filed(D)