whenever we want to memcpy because these methods guarantee valid pitched pointers.
“For allocations of 2D and 3D objects, it is highly recommended that programmers perform allocations using cudaMalloc3D() or cudaMallocPitch(). Due to alignment restrictions in the hardware, this is especially true if the application will be performing memory copies involving 2D or 3D objects (whether linear memory or CUDA arrays).” [from the Reference Manual]
However, using pitched pointers complicates array access in the kernels. First, we have to keep track of the data type stored behind a pitched pointer, because ptr is a void pointer. Second, we have to be careful with pointer arithmetic: the pitch is a byte offset (the width of one row in bytes, including any padding), not an element count. Here is some sample code that shows how to access pitched pointer data. We assume we want to store
is linear memory, which needs to be accessed in the way described above.
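The access pattern described above can be sketched as follows. This is a minimal illustration, not the original sample code: the element type float, the array dimensions, and the names elementAt, fillKernel and fillAndCheck are assumptions made for this example. The key point is that the row and slice offsets are applied on a char pointer (bytes) before the cast to the element type.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical helper (not part of the CUDA API): address of element
// (x, y, z) in a pitched 3D allocation that is assumed to hold floats.
// The pitch is in bytes, so offsets are added to a char pointer first.
__host__ __device__ inline float* elementAt(cudaPitchedPtr p, size_t height,
                                            size_t x, size_t y, size_t z)
{
    char*  base       = static_cast<char*>(p.ptr);
    size_t slicePitch = p.pitch * height;              // bytes per z-slice
    char*  row        = base + z * slicePitch + y * p.pitch;
    return reinterpret_cast<float*>(row) + x;          // x counts elements
}

// Each thread writes one element; the value encodes its coordinates.
__global__ void fillKernel(cudaPitchedPtr p, unsigned w, unsigned h, unsigned d)
{
    unsigned x = blockIdx.x * blockDim.x + threadIdx.x;
    unsigned y = blockIdx.y * blockDim.y + threadIdx.y;
    unsigned z = blockIdx.z * blockDim.z + threadIdx.z;
    if (x >= w || y >= h || z >= d) return;
    *elementAt(p, h, x, y, z) = float(x + 10 * y + 100 * z);
}

// Returns 0 if every element read back matches, 1 on a mismatch,
// and -1 if a CUDA call failed.
int fillAndCheck()
{
    const unsigned W = 5, H = 4, D = 3;                // assumed sizes

    // For linear memory the extent width is given in bytes.
    cudaPitchedPtr dev;
    if (cudaMalloc3D(&dev, make_cudaExtent(W * sizeof(float), H, D)) != cudaSuccess)
        return -1;

    dim3 block(8, 4, 2);
    dim3 grid((W + block.x - 1) / block.x,
              (H + block.y - 1) / block.y,
              (D + block.z - 1) / block.z);
    fillKernel<<<grid, block>>>(dev, W, H, D);

    // Slices follow each other at pitch * H bytes, so the volume can be
    // read back as H * D rows with a 2D copy into a tightly packed buffer.
    std::vector<float> host(W * H * D);
    cudaError_t err = cudaMemcpy2D(host.data(), W * sizeof(float),
                                   dev.ptr, dev.pitch,
                                   W * sizeof(float), H * D,
                                   cudaMemcpyDeviceToHost);
    cudaFree(dev.ptr);
    if (err != cudaSuccess) return -1;

    for (unsigned z = 0; z < D; ++z)
        for (unsigned y = 0; y < H; ++y)
            for (unsigned x = 0; x < W; ++x)
                if (host[(z * H + y) * W + x] != float(x + 10 * y + 100 * z))
                    return 1;
    return 0;
}
```

Calling fillAndCheck() from main should return 0 on a machine with a working CUDA device. Indexing the allocation as a plain float array with the row width W would skip the padding and address the wrong elements whenever the pitch is larger than W * sizeof(float).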
We need to know how pitches work when we want to copy arrays (when the copy method expects a pitched pointer which we can create with
. In this case we must specify the extent, in other words the size of our array. All the information needed to execute the copy is now present, so it can be ensured that no invalid memory is accessed. Each element can be accessed by
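As a hedged sketch of such a copy, the following round trip wraps a tightly packed host buffer in a pitched pointer with make_cudaPitchedPtr, copies it into a cudaMalloc3D allocation with cudaMemcpy3D, and copies it back. The element type float, the dimensions, and the function name roundTrip3D are assumptions made for this example, not taken from the original text.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Returns 0 if the data survives host -> device -> host unchanged,
// 1 on a mismatch, and -1 if a CUDA call failed.
int roundTrip3D()
{
    const size_t W = 5, H = 4, D = 3;                  // assumed sizes
    std::vector<float> src(W * H * D), dst(W * H * D, -1.0f);
    for (size_t i = 0; i < src.size(); ++i) src[i] = float(i);

    // The extent describes the array size; for linear memory the width
    // is in bytes, while height and depth are in rows and slices.
    cudaExtent extent = make_cudaExtent(W * sizeof(float), H, D);

    cudaPitchedPtr dev;
    if (cudaMalloc3D(&dev, extent) != cudaSuccess) return -1;

    // Host memory has no padding, so its pitch is exactly one row in bytes.
    cudaMemcpy3DParms up = {};
    up.srcPtr = make_cudaPitchedPtr(src.data(), W * sizeof(float), W, H);
    up.dstPtr = dev;
    up.extent = extent;
    up.kind   = cudaMemcpyHostToDevice;

    cudaMemcpy3DParms down = {};
    down.srcPtr = dev;
    down.dstPtr = make_cudaPitchedPtr(dst.data(), W * sizeof(float), W, H);
    down.extent = extent;
    down.kind   = cudaMemcpyDeviceToHost;

    cudaError_t err = cudaMemcpy3D(&up);
    if (err == cudaSuccess) err = cudaMemcpy3D(&down);
    cudaFree(dev.ptr);
    if (err != cudaSuccess) return -1;

    for (size_t i = 0; i < src.size(); ++i)
        if (dst[i] != src[i]) return 1;
    return 0;
}
```

Because both pitched pointers carry their own pitch and the extent states how many bytes, rows and slices to move, the copy can skip the device-side padding without reading or writing outside either buffer.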