BLAS or LAPACK in CUDA kernel
Hi, I need to compute x = A\b several hundred million times, along with other trivial arithmetic, for an A that is 4x4 and dense. I was thinking about writing a small CUDA kernel, called from within MATLAB, to do this, but I don't know how I would call something like DGETRS or SGETRS within a thread. cuBLAS, MAGMA, and libraries of that kind seem to parallelize this operation for a single, massive A, but I don't see how they would help me. Is this possible?
Thanks!
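To make the idea concrete, here is a rough sketch of what I mean by "one small solve per thread". It is only an illustration under my own assumptions, not a tested kernel: `solve4x4` and `solve_batch` are hypothetical names, each system is assumed to have its own A stored row-major (MATLAB stores column-major, so the data would need transposing or the indexing swapped), and plain Gaussian elimination with partial pivoting stands in for the DGETRF/DGETRS pair. The `HOST_DEVICE` macro lets the same function compile with an ordinary C++ compiler or be marked `__host__ __device__` under nvcc.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Under nvcc the solver is usable from device code; otherwise it is plain C++.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Solve one dense 4x4 system A*x = b.
// A: 16 values, row-major (assumption); b, x: 4 values each.
// Returns false if a pivot is exactly zero (A is singular).
HOST_DEVICE inline bool solve4x4(const double* A, const double* b, double* x) {
    double M[4][5];  // augmented matrix [A | b], small enough to stay thread-local
    for (int r = 0; r < 4; ++r) {
        for (int c = 0; c < 4; ++c) M[r][c] = A[4 * r + c];
        M[r][4] = b[r];
    }
    // Forward elimination with partial pivoting.
    for (int k = 0; k < 4; ++k) {
        int p = k;
        for (int r = k + 1; r < 4; ++r)
            if (std::fabs(M[r][k]) > std::fabs(M[p][k])) p = r;
        if (M[p][k] == 0.0) return false;
        if (p != k) {
            for (int c = k; c < 5; ++c) {
                double t = M[k][c];
                M[k][c] = M[p][c];
                M[p][c] = t;
            }
        }
        for (int r = k + 1; r < 4; ++r) {
            double f = M[r][k] / M[k][k];
            for (int c = k; c < 5; ++c) M[r][c] -= f * M[k][c];
        }
    }
    // Back substitution.
    for (int r = 3; r >= 0; --r) {
        double s = M[r][4];
        for (int c = r + 1; c < 4; ++c) s -= M[r][c] * x[c];
        x[r] = s / M[r][r];
    }
    return true;
}

// Host stand-in for a kernel launch: iteration i plays the role of thread i.
// Returns the number of systems that were reported singular.
inline std::size_t solve_batch(const std::vector<double>& A,
                               const std::vector<double>& b,
                               std::vector<double>& x) {
    assert(A.size() % 16 == 0 && b.size() == A.size() / 4);
    const std::size_t n = A.size() / 16;
    x.resize(4 * n);
    std::size_t failures = 0;
    for (std::size_t i = 0; i < n; ++i)
        if (!solve4x4(&A[16 * i], &b[4 * i], &x[4 * i])) ++failures;
    return failures;
}
```

In an actual kernel, the loop in `solve_batch` would presumably be replaced by a thread index such as `blockIdx.x * blockDim.x + threadIdx.x` with a bounds check, and the memory layout would probably need more thought than this sketch gives it.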