|
|
|
|
|
|
| |
| |
|
|
|
|
| |
| |
|
|
Warp <war### [at] tagpovrayorg> wrote:
> OTOH, if you are making every operation to 64 items at a time in a loop,
> then maybe JIT compiling won't make *that* significantly faster...
Warp <war### [at] tagpovrayorg> wrote:
> OTOH, if you are making every operation to 64 items at a time in a loop,
> then maybe JIT compiling won't make *that* significantly faster...
Hello,
First, Thank you for your work on fparser, it's really a great job.
I just made some tests with multithreading support and it's also promising.
I don't know about the JIT compiling but I think that graphical applications
like Povray and K3DSurf should take advantage of there math objetcs and give
a parser some hints on the math objects they are handling: isosurfaces
(dim=3)should not be treated like a single point (dim=0) by the parser and
the internal stack of the parser should be one dimension more than the
object it performs. This new parser has the advantage that it uses 100% of
the old one, will take advantage of all opcode optimisation and we don't
have to write a new one from scratch: all to do is just write some special
Eval() fcts suitables for every math object the application is able to
handle (Isosurfaces (dim=3), parametriques(dim=2), simple fcts(dim=1),
point (dim=0) ). It's even possible to write a single new fct Eval() and
use it for heigher dimensional object: we can for example use an Eval() fct
made for "simple fcts" (dimension = 1) to perform parametric object ( by
calculating the parametric map line by line) and Isosurfaces (by
calculating line by line and layer by layer). We can also use an Eval() fct
suitible for calculating objetc with dim =2 (parametric map) to perform
isosurfaces layer by layer (that's what I'm planing to do in the future).
Also, it's a scalable parser since it's easy to change the number of
operations it can perform in the loop and by doing so approching the
"theorical" maximum number of calculations the hardware is able to do
(since most of the CPU time will be used in series of independants
calculations).
As tests shows, examples run up to 5 times faster (the fastest examples are
"Shmutov_2" and "Pseudo_Duplin" in the last k3dsurf-0613-Unstable) but it's
easy to see that it will run faster with "big" functions (since some time is
economised from the loop).
Cheers,
Taha
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
Hi again,
Just to say that the monster has just started to show it's real face to
me...calculations can go really mad by exploting Isosurfaces geometry: New
calcul simplifitions can be done , which are impossible to do by another
way.
Example: Cos(x) + cos(y) + cos(z) (Schwartz). Well, beleave it or not but a
grid of 1000x1000x1000 will require only 3000 cosinus fct calculations (I'm
already able to use this kind of simplifications), rather than the normal
10^9.
Not only it can run fast but it can also run handreds or thousands times
faster than a "hard coded" function...in short, the parser can make the
hardwar run faster than it's original purpose!
Thorsten and Warp: Please consider to take a closer look at this way of
making geometrical math calculations...it looks to me that I'm still seeing
only the top of the iceberg and skilled peoples like you can certainly make
things much more better.
Cheers,
Taha
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
"virtualmeet" <tah### [at] yahoofr> wrote:
> Example: Cos(x) + cos(y) + cos(z) (Schwartz). Well, beleave it or not but a
> grid of 1000x1000x1000 will require only 3000 cosinus fct calculations (I'm
> already able to use this kind of simplifications), rather than the normal 10^9.
Correction: rather than the normal 3x10^9.
PS: This works not only for "cos" but for all kind of unary fcts, ie:sin,
atan, log, "pow" fct with a constant...with the same ratio of calculs
simplifications (1/10^6 for a grid of 10^3x10^3x10^3).
Taha
Post a reply to this message
|
|
| |
| |
|
|
From: Thorsten Froehlich
Subject: Re: Which code evaluate Math functions (ie: isosurfaces) ?
Date: 30 Jan 2007 12:21:15
Message: <45bf7e8b$1@news.povray.org>
|
|
|
| |
| |
|
|
virtualmeet wrote:
> grid of 1000x1000x1000
You do realize that such a grid would also require eight gigabyte (!!!) of
memory? Using the fastest personal computer memory available, using a 128
bit wide bus, you still need more than a second to even fill that grid with
any single identical value.
Assuming you render a 1280 * 1028 pixel image, in the time it takes to fill
the memory of your grid, you can actually compute the intersections of an
isosurface at those 1.25 million points for a cosine function!
Your enthusiasm is certainly commendable, but unfortunately also
misdirected. What you suggest simply won't be faster for anything but very,
very, very special case in very, very, very limited circumstance.
Thorsten
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
Thorsten Froehlich <tho### [at] trfde> wrote:
> You do realize that such a grid would also require eight gigabyte (!!!) of
> memory? Using the fastest personal computer memory available, using a 128
> bit wide bus, you still need more than a second to even fill that grid with
> any single identical value.
Hello,
I'm aware of that since my "workstation" is only a P3 550MH/256M :). This
parser is useful also for smaller grid and "big" isosurfaces fcts: I'm
usually using grids of 120x120x120 and some isosurfaces can take from 1 to
60 minutes to compute. Isosurfaces aren't usually as simples as Schwartz
example and can take hours to compute even for smaller grids.
Perhaps I didn't explain the kind of simplification we can do :
An isosurface formulas with handreds of "cos(x)", "cos(y)" and "cos(z)"
terms inside it will also require as much cosinus fct calculations as
Schwartz!
A normal parser can make (very difficult to generalise but possible) some
simplifications like : cos(x) + cos(x)*y*log(abs(cos(x)) +1)) =
cos(x)*(1+y*log(abs(cos(x)) +1)). However, it still need to compute two
cosinus for every point of the grid (for a grid of 100x100x100, the number
of cosinus is 2x10^6).
Now for the geometrical parser:
1) we dont have to apply any factorisation.
2) The number of required cosinus calculation is 100 for the grid
100x100x100.
3) you can add as much as you want termes of "cos(x)" inside the isosurface
formula, the required number of cosinus calculations is always 100...
In fact, "cos(x)" termes are treated like constantes and this is possible
only because we use a geometrical parser (the parser is able to retreave
the value of "cos(x)" as easy as retreaving the value of the terme "x"
itself).
This example : cos(x)*(1+y*log(abs(cos(x)) +1))+ log(x*cos(x))+
pow(tan(abs(cos(x)+1))) + log(y*cos(x)*cos(x)) require also 100 cosinus
calculations without any hard work of factorisation.
3) This works for all unary fcts defined inside the parser (log, exp, ...)
I'm working on expanding this to binary defined fcts like "+" and "*" : this
is a little bit triky but there is a way to make termes like "cos(x*y)"
requiring as little as 10^4 cosinus calculations for the grid 100x100x100,
and this for any number of termes "cos(x*y)" inside the Isosurface formula.
Also, this way of factorisation has nothing to do with the inclusion of
mutiple instructions inside the evaluator fct of the parser, it's just
another way of sampling the calculations by using the cube symetrys.
> Your enthusiasm is certainly commendable, but unfortunately also
> misdirected.
We will try to find more convincing results :)
Cheers,
Taha
Post a reply to this message
|
|
| |
| |
|
|
From: Warp
Subject: Re: Which code evaluate Math functions (ie: isosurfaces) ?
Date: 30 Jan 2007 19:20:47
Message: <45bfe0df@news.povray.org>
|
|
|
| |
| |
|
|
It's unclear to me what is it that you are doing with the isosurfaces.
Are you rendering 3D images of them, or are you doing something completely
different?
If you are rendering an image of the isosurface, what is it that you
need a grid of values for? Are you tesselating the isosurface (using
something like the marching cubes algorithm)?
POV-Ray doesn't tesselate isosurfaces. It traces them on the fly.
A random ray would basically never hit exactly the node of a regular
3D grid of values. The only way to take advantage of such a grid would
be to interpolate from nearby values (which is, in fact, what tesselation
does basically; the "interpolation" is just done using triangles).
--
- Warp
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
Hi Warp,
In K3DSurf, I'm using mainly three steps to draw Isosurfaces:
1) Grid initialisation:
Preparing the Grid: the grid is a 3D matrix that holds values obtained from
parsing the Isosurface math formula. This process can be simplified to that
code :
++++++++++++++++++++++++++++++++++++++++++++++++++++
void InitGrig(){
FunctionParser Isosurface;
double Var[3];
double X[100], y[100], Z[100];
// init the parser:
Isosurface.Parse("cos(x)+cos(y)+cos(z)+log(abs(cos(x))+1)", "x,y,z");
// Init the axes to a Cube [-1,1][-1,1][-1,1]
for(int i=0; i<100; i++) X[i] = y[i] = Z[i] = -1+i*(2/100);
// Compute the Grid:
for(i=0; i<limitX; i++) {
Var[0] = X[i]; // Axe x
for(j=0; j<limitY; j++) {
Var[2] = Y[j]; // Axe y
for(k=0; k<limitZ; k++) {
Var[3] = Z[k]; // Axe z
Grid[i][j][k] = Isosurface.Eval(Var);
}}}
}
+++++++++++++++++++++++++++++++++++++++++++++++++++++
That's it. This simple example show how to compute the grid for the
isosurface : F(x, y, z) = cos(x)+cos(y)+cos(z)+log(abs(cos(x))+1)
2) Generate Triangles from the grid:
Use the marching cube algorithm to generate triangles from the matrix Grid.
3) Draw the image in the window:
Use OpenGl to draw thoses triangles.
The steps 2 and 3 are now extremely fast in K3DSurf: This part of code is
able to treat
(as Thorsten said) millions of triangles in less than a second.
Infortunatly, the first step is very slow compared the the others.
PovRay is not using the marching cube algorithm but it's also using a
mathematical parser in the process of generating the 3D images: even if
PovRay holds the fastest implementation of the raytracing algo, it's going
to be very slow because of it's math parser and this is especially true
when computing big grids and complicated Isosurfaces formulas.
I know that claiming to make a new parser "technologie" is quite crazy but
I'm here to show you that it's really what it looks like. The only
requirement to understand easily what I'm doing is just to have some
understanding of the 3D drawing technics because this solution is inspired
mainly from some basic geometry thoughts.
Now I'll try to explain what's the relationship between the parser and
Isosurfaces:
The original parser you created was calculating a single point at a time.
This is abvious when you look at the code shown in the top and specialy the
line:
>> Grid[i][j][k] = Isosurface.Eval(Var); // Var holds p(x,y,z) of a single point
Imagine now that we want to define a new Eval() fct that can take as
parameter sets of N points defined like this: P(X[N], Y[N], Z[N]) where X, Y
and Z and vectors.
For N =64,the code of the newly defined fct NewEval() will looks like this:
The original code:
-------------------------------------------
for(IP=0; IP<ByteCodeSize; ++IP){
switch(ByteCode[IP])
{
case cCos: Stack[SP] = cos(Stack[SP]);
break;
........
// Variables:
default:
Stack[++SP] = Vars[ByteCode[IP]-VarBegin];
}
}
Where Stack = new double[StackSize];
and Vars = new double[3];
---------------------------------------------
Changed to :
---------------------------------------------
for(IP=0; IP<ByteCodeSize; ++IP)
{
switch(ByteCode[IP])
{
case cCos:
NewStack = &StackArray[SP*64];
for(i=0; i<64; i++) NewStack[i] = cos(NewStack[i]);
break;
........
// Variables:
default:
++SP;
NewStack = &StackArray[SP*64];
if((ByteCode[IP]-VarBegin) == 0) memcpy(NewStack, &Vars[0] ,
64*sizeof(double));
else if((ByteCode[IP]-VarBegin) == 1) memcpy(NewStack, &Vars[64] ,
64*sizeof(double));
else if((ByteCode[IP]-VarBegin) == 2) memcpy(NewStack, &Vars[124],
64*sizeof(double));
Where StackArray = new double[64*StackSize];
and Vars = new double[3*64];
-------------------------------------------------
This code holds almost all the idea behind this new kind of parsers.
Now the best part is that I'm going to use my knowlege of the isosurfaces
geometry to make the newly created function NewEval() working for real:
Imgine we have an isosurface to evaluate for a grid of 100x100x100, your
original code was used like I showd in code on the top.
To use the second newly created fct, the code on the top should be changed
to something like that :
++++++++++++++++++++++++++++++++++++++++++++++++++++
void InitGrig(){
FunctionParser Isosurface;
double Var[3*64];
double X[100], y[100], Z[100];
// init the parser:
Isosurface.Parse("cos(x)+cos(y)+cos(z)+log(abs(cos(x))+1)", "x,y,z");
// Init the axes to a Cube [-1,1][-1,1][-1,1]
for(int i=0; i<100; i++) X[i] = y[i] = Z[i] = -1+i*(2/100);
// Compute the Grid:
for(StartI=0; StartI<100; StartI+=4){
l=0;
for(i=0; i<4; i++)
for(j=0; j<4; j++)
for(k=0; k<4; k++)
Vars[l++] = X[i+StartI];
for(StartJ=0; StartJ<100; StartJ+=4){
l=0;
for(i=0; i<4; i++)
for(j=0; j<4; j++)
for(k=0; k<4; k++)
Vars[64+(l++)] = Y[j+StartJ];
for(StartK=0; StartK<100; StartK+=4){
l=0;
for(i=0; i<4; i++)
for(j=0; j<4; j++)
for(k=0; k<4; k++)
Vars[128+(l++)] = Z[k+StartK];
Grid[i][j][k] = Isosurface.NewEval(Var);
}}}
}
+++++++++++++++++++++++++++++++++++++++++++++++++++++
That's it that's all! This code can be resumed to this : I splited my
isosurfaces calculations to chunks of 64 operations and that amount of
operations is equivalent to a small grid of 4x4x4.
To calculate the hole grid of 100x100x100, I have to use NewEval() : n =
(100x100x100)/(4x4x4) times.
So, now we can use the parser to evaluate more than one operations in the
loop.
I didn't finish yet...I just discovered (and implemented) a very efficient
way of sampling the calculations by using cubic symetrys.
I'll talk about it later.
Cheers,
Taha
Post a reply to this message
|
|
| |
| |
|
|
From: Warp
Subject: Re: Which code evaluate Math functions (ie: isosurfaces) ?
Date: 31 Jan 2007 03:38:59
Message: <45c055a2@news.povray.org>
|
|
|
| |
| |
|
|
Well, that's the problem: You are tesselating the isosurface, POV-Ray
doesn't. POV-Ray wouldn't benefit from your optimizations because they
can be used only when tesselating the isosurface.
--
- Warp
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
Warp <war### [at] tagpovrayorg> wrote:
> Well, that's the problem: You are tesselating the isosurface, POV-Ray
> doesn't. POV-Ray wouldn't benefit from your optimizations because they
> can be used only when tesselating the isosurface.
Hi,
I know that the 3 steps in generating images by PovRay doesn't have clear
frontiers like in thoses used in K3DSurf. However, it seems to me that
everything can be splitted that way: If we don't see it, that's because we
aren't used to it.
How about changing the raytracing technique to make it able to treat 64 rays
at the time? Is it really impossible to imagine that process ? I don't think
so...
I think that the real problem in using this technic for a raytracer will be
that the internal parser will "hardly" take a full advantage of that
implementation: Sampling the calculations by using cubic symetrys.
I'll try to explain what's this simplification and why a geometrical parser
is so important.
I want to make my code able to simplify the calculations by calculating only
the first "cos(x)" in the formulas. All other occurence of that terme will
be copied from the first result (in the example below, the parser will have
to calculate only the first "cos(x)"):
cos(x)+cos(y)+cos(z)+log(abs(cos(x))+1)
------------------------------------------------
int cosfctx = -1; // defined outside NewEval()
int usedvariable = -1;
double cosfctxT[depth];
for(IP=0; IP<ByteCodeSize; ++IP)
{
switch(ByteCode[IP])
{
case cCos:
NewStack = &StackArray[SP*64];
// This part is executed when the vriable in the top of the stack is "X"
if(variable == 0){
if(cosfctx != 1){
//This code is executed only once in the process of generating the grid:
for(i=0; i<100; i++) cosfctxT[i] = cos(X[i]);
cosfctx = 1;
}
l=0;
for(i=0; i<8; i++)
for(j=0; j<8; j++)
for(k=0; k<8; k++)
NewStack[l++] = cosfctxT[i+StartI];
}
variable = -1;
break;
}
//This part is not always reached:
for(i=0; i<64; i++) NewStack[i] = cos(NewStack[i]);
break;
.............
// Variables:
default:
++SP;
NewStack = &StackArray[SP*64];
if((ByteCode[IP]-VarBegin) == 0) {
memcpy(NewStack, &Vars[0] ,64*sizeof(double));
variable = 0; // tell the parser which variable it's using
}
else if((ByteCode[IP]-VarBegin) == 1) memcpy(NewStack, &Vars[64] ,
64*sizeof(double));
else if((ByteCode[IP]-VarBegin) == 2) memcpy(NewStack, &Vars[124],
64*sizeof(double));
----------------------------------------------------------------
That code is now able to calculate only one occurence of "cos(x)" and use it
for all other termes in the formulas. More over, the parser is calculating
only 100 cosinus fct in the hole process. We can do the same for all unary
fct defined by the parser and for all defined variables. Inlike Thorsten, I
think that those cases are very commun and can do huge time saving.
Moreover, we can expand this technique to heigher level.
Also, this kind of simplification is possible only because the parser
"knows" the geometry it's using: it's impossible to do such thing in
general. Multiple instructions inside the loop can work but not this kind
of optimisation and that can be the case of PovRay but I still didn't
investigate all the possibilitys.
Hope it's clear to you.
Cheers,
Taha
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
"virtualmeet" <tah### [at] yahoofr> wrote:
> l=0;
> for(i=0; i<8; i++)
> for(j=0; j<8; j++)
> for(k=0; k<8; k++)
> NewStack[l++] = cosfctxT[i+StartI];
Correction :
-----------------------------------------------------
l=0;
for(i=0; i<4; i++)
for(j=0; j<4; j++)
for(k=0; k<4; k++)
NewStack[l++] = cosfctxT[i+StartI];
------------------------------------------------------
That's because I'm using an extraordinary 512=8x8x8 version :)...
Taha
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
|
|