Posts tagged: C++
Benchmarking the speed of random number generation in C++11 with GCC and with VSL and ICC.
Load Cython modules from PyPy while also using NumPy.
A comparison of the GCC, ICC, and PGI compilers with BLAS/LAPACK, MKL, and ACML in solving an SDP with SDPA.
Self-organizing maps are computationally expensive to train, and emergent maps are even more so. This post looks at the constraints that arise with sparse data.
Getting around Fortran-style array indexing in cuBLAS from C code without transposition. Bonus: Thrust vector casting added.
Thrust-based summing of the elements of a submatrix at a given offset according to a stencil.
A detailed description of how to use Thrust reduce by key to calculate the argmins of the rows of a matrix.