Christoph Hormann wrote:
> I think the work probably would have been better invested in
> implementing an internal JIT compiler using the existing hooks as
> Thorsten explained. This would work on all x86 systems (and for Mac an
> implementation already exists).
>
Well, I had a look at the PPC JIT compiler in the Mac version. If I
understand it correctly, it assembles the binary code for the PPC in
memory and then executes it by jumping into the generated code.
After thinking about it, I found two reasons that keep me from
doing something like that for the i386 architecture:
(1) When I saw the POV VM instruction set, it immediately reminded me
of the PPC instruction set, or of a similar RISC instruction set
with a number of general-purpose registers etc. (I do not know the PPC
instructions in detail; this impression is based on what I have read
in computer magazines and on my experience with other RISCs.)
So compiling this code into PPC code turns out to be pretty
straightforward. In contrast, the i387 does not seem to have such
general-purpose registers; instead it uses a register stack with,
I believe, 8 registers, a top-of-stack pointer and so on.
Furthermore, I am far from an expert in i386/i387 assembly, and I do not
want to hack together tons of error-prone code to translate
POV-VM-ASM correctly into i387-ASM. [r0 should be top of stack...]
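To make the mismatch a bit more concrete, here is a rough sketch in C of what a register stack means for a single VM instruction such as r5 = r5 * r0. This is only a toy model: the names FpuStack, fpu_push, fpu_pop, fpu_mulp and vm_mul are made up for illustration and are not part of POV-Ray, and the model ignores tag bits, exceptions and everything else a real i387 does. The three helpers loosely mimic the fld, fstp and fmulp instructions.

```c
/* Toy model of an i387-style register stack: 8 slots and a top-of-stack
 * index that wraps around modulo 8. Illustrative only. */
typedef struct {
    double   slot[8];
    unsigned top;
} FpuStack;

/* Roughly "fld": move top down by one slot and store the value there. */
static void fpu_push(FpuStack *f, double v)
{
    f->top = (f->top - 1u) & 7u;
    f->slot[f->top] = v;
}

/* Roughly "fstp": read the top slot and move top up by one. */
static double fpu_pop(FpuStack *f)
{
    double v = f->slot[f->top];
    f->top = (f->top + 1u) & 7u;
    return v;
}

/* Roughly "fmulp": ST(1) = ST(1) * ST(0), then pop ST(0). */
static void fpu_mulp(FpuStack *f)
{
    double st0 = fpu_pop(f);
    f->slot[f->top] *= st0;
}

/* Hypothetical naive translation of the VM instruction
 * "r[dst] = r[dst] * r[src]", with the VM registers kept in memory. */
static void vm_mul(FpuStack *f, double *r, int dst, int src)
{
    fpu_push(f, r[dst]);
    fpu_push(f, r[src]);
    fpu_mulp(f);
    r[dst] = fpu_pop(f);
}
```

One register-to-register multiply becomes two loads, one multiply and one store in this naive scheme. Avoiding that would mean tracking which VM register currently sits in which stack position, and that bookkeeping is exactly the error-prone part.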
(2) GCC does a decent job of optimizing. The POV VM compiler produces
assembly which IMO has plenty of (seemingly?) pointless register moves.
(Don't get me wrong, Thorsten: good register allocation is a really
tough job.) Take for example this part of the paramflower on my homepage:
------------------------
r0 = sqrt(r0);
r5 = r5 * r0;
r0 = r5;
r5 = r6; <-- completely useless
r5 = r0; <-- useless as well
r0 = r2;
r0 = sqrt(r0);
r5 = r5 + r0;
r6 = r5;
r0 = POVFPU_Consts[k];
r5 = r0; <-- (skip)
r7 = r5; <-- why not r7=r0
r0 = r2; <-- (skip)
r5 = r0; <-- why not r5=r2
r0 = r5; <-- hmm?!
r5 = r5 * r0;
r5 = r5 * r0;
r0 = r5;
r5 = r7;
------------------------
Compiling this assembly directly into i387 code would probably not give
as good runtime performance as handing it to GCC would.
And I do not want to implement an optimizer, especially since there is
really little chance of doing better than GCC if we're only 1 or 2 people...
Maybe it would be easier to translate the POV-VM-ASM into SSE2 instructions.
But two reasons speak against that:
(1) SSE2 is not yet widely available.
(2) My box only has SSE1, so I could not test it.
Wolfgang