Dear CUDA Scholars,
I am looking for a solution to the problem below.
a) I have two arrays: 1) array1 of size1, whose elements are of typename1; 2) array2 of size1, whose elements are of typename2.
b) I want to write a kernel with the following prototype:
__global__ void kernel(void* dest, void* src, int dest_sizeoftype, int src_sizeoftype, int num_array_elts);
c) Suppose I create num_array_elts CUDA threads, with each thread copying its own element from src to dest.
Issue: a) Where I am stuck is which function to use to copy num_bytes from src to dest inside the kernel.
Thanks in advance.
Regards, Nagaraju
The copy algorithm in Thrust makes this easy.
#include <thrust/copy.h>
#include <thrust/device_ptr.h>
int * src = ...
float * dst = ...
// first wrap the 'raw' pointers
thrust::device_ptr<int> wrapped_src(src);
thrust::device_ptr<float> wrapped_dst(dst);
// then pass wrapped pointers to copy()
thrust::copy(wrapped_src, wrapped_src + num_array_elts, wrapped_dst);
Refer to the QuickStart guide for additional info about Thrust.
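For context, here is a minimal end-to-end sketch of how the snippet above might be used, assuming the raw pointers come from cudaMalloc. The helper name ints_to_floats_thrust is hypothetical (it is not part of Thrust), and error checking on the CUDA calls is omitted for brevity.

```cuda
#include <thrust/copy.h>
#include <thrust/device_ptr.h>
#include <cuda_runtime.h>
#include <vector>

// Hypothetical helper (an assumption, not from the answer above): uploads host
// ints, converts them to float on the device with thrust::copy, and returns
// the converted values.
std::vector<float> ints_to_floats_thrust(const std::vector<int>& host_src)
{
    const size_t n = host_src.size();
    std::vector<float> host_dst(n);
    if (n == 0) return host_dst;

    int*   src = nullptr;
    float* dst = nullptr;
    cudaMalloc(&src, n * sizeof(int));    // sized by the source element type
    cudaMalloc(&dst, n * sizeof(float));  // sized by the destination element type
    cudaMemcpy(src, host_src.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    // wrap the raw device pointers so Thrust knows they live on the device
    thrust::device_ptr<int>   wrapped_src(src);
    thrust::device_ptr<float> wrapped_dst(dst);

    // element-wise copy; each int is converted to float on assignment
    thrust::copy(wrapped_src, wrapped_src + n, wrapped_dst);

    cudaMemcpy(host_dst.data(), dst, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(src);
    cudaFree(dst);
    return host_dst;
}
```

Calling ints_to_floats_thrust({1, 2, 3}) on a machine with a CUDA device should give back three floats holding the same values.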
If you know the types of the two arrays at compile time, this problem becomes fairly trivial.
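__global__ void kernel(float* dest, const int* src, int num_array_elts){
    int idx = blockIdx.x*blockDim.x + threadIdx.x;
    // guard: the grid may contain more threads than array elements
    if (idx < num_array_elts)
        dest[ idx ] = src[ idx ];   // implicit int -> float conversion
}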
If your dest array used a wider type, e.g. double, this would still work: the assignment converts each element and the compiler handles the element sizes, so the kernel never needs to know the number of bytes. Just make sure you allocate the correct number of bytes for each array when calling cudaMalloc.
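One way to generalize this to other type pairs without passing byte sizes is to make the kernel a template. The sketch below assumes both element types are known where the kernel is launched. The names convert_kernel and convert_on_device are hypothetical, the block size of 256 is an arbitrary choice, and error checking is omitted.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Each thread converts one element. The element types are template
// parameters, so no sizeof values have to be passed to the kernel.
template <typename Dst, typename Src>
__global__ void convert_kernel(Dst* dest, const Src* src, int num_array_elts)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < num_array_elts)                     // skip surplus threads
        dest[idx] = static_cast<Dst>(src[idx]);
}

// Hypothetical host wrapper: allocates both arrays with the byte count that
// matches each element type, launches the kernel, and copies the result back.
template <typename Dst, typename Src>
std::vector<Dst> convert_on_device(const std::vector<Src>& host_src)
{
    const int n = static_cast<int>(host_src.size());
    std::vector<Dst> host_dst(n);
    if (n == 0) return host_dst;

    Src* src  = nullptr;
    Dst* dest = nullptr;
    cudaMalloc(&src,  n * sizeof(Src));
    cudaMalloc(&dest, n * sizeof(Dst));
    cudaMemcpy(src, host_src.data(), n * sizeof(Src), cudaMemcpyHostToDevice);

    const int threads = 256;                          // assumed block size
    const int blocks  = (n + threads - 1) / threads;  // round up to cover n
    convert_kernel<Dst, Src><<<blocks, threads>>>(dest, src, n);

    // this blocking copy also waits for the kernel to finish
    cudaMemcpy(host_dst.data(), dest, n * sizeof(Dst), cudaMemcpyDeviceToHost);
    cudaFree(src);
    cudaFree(dest);
    return host_dst;
}
```

For example, convert_on_device<double>(std::vector<int>{1, 2, 3}) would run the int to double case. The void* prototype from the question cannot do this directly, because the conversion between types has to be chosen at compile time.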