On 30/03/2011 15:14, Aidy wrote:
>> Please, read.
>> __device__ is going to inline the function (or I'm just getting it
>> wrong, which is 90% likely).
>
> Sorry, I missed that bit. You're right; I've just been looking through the nVidia
> documentation and the link you sent me, and it will inline it.
>
> If it inlines it, then can I pass the structure itself and NOT have it create a
> copy within the function? It shouldn't increase the overhead at all,
> should it?
>
I would be more concerned about the content of the structure: if it holds
any pointer (and uses it), you will once again be in trouble.
(For the sake of my comprehension so far: the address space used by the
main CPU is not the address space used by the CUDA device, so any
pointer filled in by the CPU is useless on the CUDA device. Which means
only linear/flat data (i.e. plain structures) can be exchanged at the
interface of a __device__ function.)
Now, I did not check the definition of TNORMAL and the others.
As for your question: since it is inlined, no copy is created; it just
uses the actual data. (Unless a C expert wants to show up and embarrass
me over that sentence.)
You might also run into issues with patterns that use a cache, as 3.6 is
not thread-aware and CUDA code could run into collisions.
--
Software is like dirt - it costs time and money to change it and move it
around.
Just because you can't see it, it doesn't weigh anything,
and you can't drill a hole in it and stick a rivet into it doesn't mean
it's free.
I've not got round to porting over the types inside the Opts structure just yet,
but you're right, they won't be able to reference the original memory. But this
will be part of what I copy to the GPU when the kernel function is called.
What sort of things are you referring to when you say patterns that use a
cache?
Le_Forgeron <lef### [at] freefr> wrote:
> On 30/03/2011 15:14, Aidy wrote:
> >> Please, read.
> >> __device__ is going to inline the function (or I'm just getting it
> >> wrong, which is 90% likely).
> >
> > Sorry, I missed that bit. You're right; I've just been looking through the nVidia
> > documentation and the link you sent me, and it will inline it.
> >
> > If it inlines it, then can I pass the structure itself and NOT have it create a
> > copy within the function? It shouldn't increase the overhead at all,
> > should it?
> >
> I would be more concerned about the content of the structure: if it holds
> any pointer (and uses it), you will once again be in trouble.
>
> (For the sake of my comprehension so far: the address space used by the
> main CPU is not the address space used by the CUDA device, so any
> pointer filled in by the CPU is useless on the CUDA device. Which means
> only linear/flat data (i.e. plain structures) can be exchanged at the
> interface of a __device__ function.)
>
> Now, I did not check the definition of TNORMAL and the others.
>
> As for your question: since it is inlined, no copy is created; it just
> uses the actual data. (Unless a C expert wants to show up and embarrass
> me over that sentence.)
>
> You might also run into issues with patterns that use a cache, as 3.6 is
> not thread-aware and CUDA code could run into collisions.
>
> --
> Software is like dirt - it costs time and money to change it and move it
> around.
>
> Just because you can't see it, it doesn't weigh anything,
> and you can't drill a hole in it and stick a rivet into it doesn't mean
> it's free.
On 30/03/2011 16:41, Aidy wrote:
> What sort of things are you referring to when you say patterns that use a
> cache?
IIRC, the crackle pattern needs one (it has been a major headache
for 3.7 & SMP). The 3.6 series uses a single cache for crackle, and
leans heavily on the fact that the rendering is progressive to manage it.
On 29.03.2011 19:33, Aidy wrote:
> That is the exact text of the error. I'm not at a stage of being able to compile
> the code yet, it is simply giving me the errors as if they are syntax errors.
> But the error messages I've given are the exact messages I am receiving. There
> is no more information provided :(
>
> The first error I'm coming across is in the method dents (cuDents for me) :
>
> __device__ static void cuDents (VECTOR EPoint, TNORMAL *Tnormal, VECTOR normal)
What's __device__? You're compiling for CUDA, right? Could it be that
Tnormal is interpreted as a pointer to CPU address space, while the
function is to be executed by the GPU and therefore local pointer
variables need to point to GPU address space?