Here's a profile of the 3.6 Windows code running the demo:
http://garrionent.com/pov/
I ran the profile to see if I could accelerate it with an FPGA module.
Unfortunately, I don't see many ways it could help. I could offload the
multiplies and adds in MInvTransPoint and MInvTransDirection to the FPGA,
but the DMA overhead kills us on such a small piece of data, so there would
be no net improvement. What I need is a function similar to those that
operates on half a megabyte of data and performs about three times as many
math operations. Those are the kind of functions I could really accelerate.
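
To make the break-even argument concrete, here is a rough back-of-the-envelope sketch in Python. Every number in it is a placeholder assumption, not a measurement from the profile: the CPU and FPGA throughputs, the DMA setup latency, and the DMA bandwidth are invented for illustration, and the 18-operations-per-point figure assumes a point transform costs about 9 multiplies and 9 adds on three doubles. The function name offload_speedup is hypothetical too.

```python
def offload_speedup(n_bytes, ops_per_byte, cpu_ops_per_s,
                    fpga_ops_per_s, dma_setup_s, dma_bytes_per_s):
    """Estimate CPU time divided by FPGA time for one offloaded call.

    A result below 1.0 means the offload is slower than staying on the CPU.
    """
    ops = n_bytes * ops_per_byte
    cpu_time = ops / cpu_ops_per_s
    # The FPGA path pays a fixed DMA setup cost, the transfer time,
    # and then the (faster) compute time.
    fpga_time = dma_setup_s + n_bytes / dma_bytes_per_s + ops / fpga_ops_per_s
    return cpu_time / fpga_time


# Placeholder hardware figures (assumptions, not measured values).
HW = dict(cpu_ops_per_s=1e9, fpga_ops_per_s=1e10,
          dma_setup_s=10e-6, dma_bytes_per_s=1e9)

# One point: 3 doubles = 24 bytes, assumed 18 math operations.
small = offload_speedup(n_bytes=24, ops_per_byte=18 / 24, **HW)

# Half a megabyte with about three times as many operations per byte.
large = offload_speedup(n_bytes=512 * 1024, ops_per_byte=3 * 18 / 24, **HW)

print(f"single point:  speedup {small:.4f}")
print(f"half megabyte: speedup {large:.2f}")
```

With these made-up figures the single-point call comes out far below 1.0, because the fixed DMA setup cost dwarfs the arithmetic, while the half-megabyte case comes out above 1.0. The real crossover depends entirely on the actual hardware numbers.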