I have a particle system that I'm trying to port to CUDA, but I'm having trouble verifying that it does what I want.
I allocate a temporary host array in my constructor, initialize it, and then copy it into a device array for the kernel calls. I'm not very familiar with CUDA and don't know a better way to check my kernels, so for now I copy the device array back into a host array and inspect the values after a single time step. Unfortunately, all I get back is garbage data.
Thanks.
In my header file:
Particle tempParticles[numParticles];
Particle* particles;
Constructor:
ParticleSystem::ParticleSystem() {
    cudaMalloc((void**)&particles, numParticles * sizeof(Particle));
    int count = 0;
    //for loops
        tempParticles[count].newPos = glm::vec3(i, j, k);
        count++;
    //end for
    cudaMemcpy(particles, tempParticles, numParticles * sizeof(Particle), cudaMemcpyHostToDevice);
}
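A failed `cudaMalloc` or `cudaMemcpy` does not throw; it only returns an error code, so an ignored failure can look exactly like garbage data later. The sketch below is one way to rule that out by checking every return value on a plain round trip. `CUDA_CHECK` and `roundTripMismatches` are hypothetical names introduced for illustration, not part of the code above, and `float` stands in for `Particle` to keep the example self-contained.

```cuda
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper macro (not from the post): aborts with a readable
// message if a CUDA runtime call returns anything other than cudaSuccess.
#define CUDA_CHECK(call)                                                 \
    do {                                                                 \
        cudaError_t err_ = (call);                                       \
        if (err_ != cudaSuccess) {                                       \
            std::fprintf(stderr, "CUDA error: %s at %s:%d\n",            \
                         cudaGetErrorString(err_), __FILE__, __LINE__);  \
            std::exit(EXIT_FAILURE);                                     \
        }                                                                \
    } while (0)

// Copies n floats to the device and straight back, with every call checked.
// Returns the number of elements that differ after the round trip.
int roundTripMismatches(int n) {
    assert(n > 0);
    std::vector<float> host(n), back(n, -1.0f);
    for (int i = 0; i < n; ++i) host[i] = static_cast<float>(i);

    float* dev = nullptr;
    const size_t bytes = n * sizeof(float);
    CUDA_CHECK(cudaMalloc((void**)&dev, bytes));
    CUDA_CHECK(cudaMemcpy(dev, host.data(), bytes, cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(back.data(), dev, bytes, cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(dev));

    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (host[i] != back[i]) ++mismatches;
    }
    return mismatches;
}
```

Calling `roundTripMismatches(1024)` from `main` before any kernel runs separates transfer problems from kernel problems: if it reports zero mismatches, the allocation and copies are working and the fault lies elsewhere.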
Update Wrapper for cuda:
void ParticleSystem::updateWrapper() {
    update(particles);
    cudaMemcpy(tempParticles, particles, numParticles * sizeof(Particle), cudaMemcpyDeviceToHost);
}
Example of my cuda update method:
__global__ void predictPositions(Particle* particles) {
    int index = threadIdx.x + (blockIdx.x * blockDim.x);
    // valid indices are 0 .. NUM_PARTICLES_C - 1, so the guard must be >=
    if (index >= NUM_PARTICLES_C) return;
    //predict position x* = xi + dt * vi
    particles[index].newPos += particles[index].velocity * deltaT;
}
void update(Particle* particles) {
    predictPositions<<<dims, blockSize>>>(particles);
}
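To check a kernel like this after one time step, a common approach is to run the same update on the CPU and compare, while also asking the runtime whether the launch itself failed. The following is a minimal sketch under stated assumptions, not the original code: `Particle` is reduced to two `float3` fields in place of `glm::vec3`, the particle count and time step are passed as arguments instead of `NUM_PARTICLES_C` and `deltaT`, and `verifyOneStep` is a hypothetical helper. Every field the kernel reads, including `velocity`, is initialized on the host before the upload.

```cuda
#include <cassert>
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Simplified stand-in for the post's Particle (assumption: float3, not glm::vec3).
struct Particle {
    float3 newPos;
    float3 velocity;
};

__global__ void predictPositions(Particle* particles, int n, float dt) {
    int index = threadIdx.x + blockIdx.x * blockDim.x;
    if (index >= n) return;  // threads past the last particle do nothing
    // predict position x* = xi + dt * vi
    particles[index].newPos.x += particles[index].velocity.x * dt;
    particles[index].newPos.y += particles[index].velocity.y * dt;
    particles[index].newPos.z += particles[index].velocity.z * dt;
}

// True when a and b agree within a small relative tolerance.
static bool closeEnough(float a, float b) {
    return std::fabs(a - b) <= 1e-4f * (1.0f + std::fabs(b));
}

// Runs one step on the GPU and compares it with a CPU reference.
// Returns the number of mismatching particles, or -1 on a CUDA error.
int verifyOneStep(int n, float dt) {
    assert(n > 0);
    std::vector<Particle> host(n), expected(n);
    for (int i = 0; i < n; ++i) {
        const float f = static_cast<float>(i);
        host[i].newPos = make_float3(f, 2.0f * f, 3.0f * f);
        host[i].velocity = make_float3(1.0f, 0.5f, -1.0f);
        expected[i] = host[i];  // CPU reference for the same update
        expected[i].newPos.x += expected[i].velocity.x * dt;
        expected[i].newPos.y += expected[i].velocity.y * dt;
        expected[i].newPos.z += expected[i].velocity.z * dt;
    }

    Particle* dev = nullptr;
    const size_t bytes = n * sizeof(Particle);
    if (cudaMalloc((void**)&dev, bytes) != cudaSuccess) return -1;
    if (cudaMemcpy(dev, host.data(), bytes, cudaMemcpyHostToDevice) != cudaSuccess) {
        cudaFree(dev);
        return -1;
    }

    const int blockSize = 256;
    const int gridSize = (n + blockSize - 1) / blockSize;  // round up to cover all n
    predictPositions<<<gridSize, blockSize>>>(dev, n, dt);
    cudaError_t launchErr = cudaGetLastError();      // bad launch configuration
    cudaError_t syncErr = cudaDeviceSynchronize();   // faults during execution
    if (launchErr != cudaSuccess || syncErr != cudaSuccess) {
        std::fprintf(stderr, "kernel failed: %s\n",
                     cudaGetErrorString(launchErr != cudaSuccess ? launchErr : syncErr));
        cudaFree(dev);
        return -1;
    }

    cudaError_t copyErr = cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (copyErr != cudaSuccess) return -1;

    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (!closeEnough(host[i].newPos.x, expected[i].newPos.x) ||
            !closeEnough(host[i].newPos.y, expected[i].newPos.y) ||
            !closeEnough(host[i].newPos.z, expected[i].newPos.z)) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

Using a particle count that is not a multiple of the block size, for example `verifyOneStep(1000, 0.5f)`, also exercises the bounds check, since the last block then contains threads with no particle to update.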