batmat 0.0.19
Batched linear algebra routines
Low-level operations such as gathers, transposes, lane-wise rotations, inverse square roots, conditional negations, etc.
Conditional negation of floating point numbers | |
| template<class T> | |
| T | batmat::ops::detail::cneg (T x, T signs) |
Conditionally negates the sign bit of x, depending on signs, which should contain only ±0 (i.e. only the sign bit of an IEEE-754 floating point number). | |
| template<class T, class Abi> | |
| T | batmat::ops::detail::cneg (T x, T signs) |
Conditionally negates the sign bit of x, depending on signs, which should contain only ±0 (i.e. only the sign bit of an IEEE-754 floating point number). | |
| template<class T, class Abi> | |
| datapar::simd< T, Abi > | batmat::ops::detail::cneg (datapar::simd< T, Abi > x, datapar::simd< T, Abi > signs) |
Conditionally negates the sign bit of x, depending on signs, which should contain only ±0 (i.e. only the sign bit of an IEEE-754 floating point number). | |
Gathering elements from memory | |
| template<class T, class AbiT, class I, class AbiI, class M> | |
| datapar::simd< T, AbiT > | batmat::ops::gather (const T *p, datapar::simd< I, AbiI > idx, M mask) |
Gathers elements from memory at the addresses specified by idx, which should be an integer SIMD vector, and returns them in a SIMD vector of type datapar::simd<T, AbiT>. | |
Lane-wise rotations of SIMD vectors | |
| template<int S, class F, class Abi> | |
| datapar::simd< F, Abi > | batmat::ops::rotl (datapar::simd< F, Abi > x) |
Rotate the elements of x to the left by S positions. | |
| template<int S, class F, class Abi> | |
| datapar::simd< F, Abi > | batmat::ops::rotr (datapar::simd< F, Abi > x) |
Rotate the elements of x to the right by S positions. | |
| template<int S, class F, class Abi> | |
| datapar::simd< F, Abi > | batmat::ops::shiftl (datapar::simd< F, Abi > x) |
Shift the elements of x to the left by S positions, shifting in zeros. | |
| template<int S, class F, class Abi> | |
| datapar::simd< F, Abi > | batmat::ops::shiftr (datapar::simd< F, Abi > x) |
Shift the elements of x to the right by S positions, shifting in zeros. | |
| template<class F, class Abi> | |
| datapar::simd< F, Abi > | batmat::ops::rot (datapar::simd< F, Abi > x, int s) |
Rotate the elements of x to the right by s positions. | |
Inverse square root | |
| template<std::floating_point T> | |
| T | batmat::ops::rsqrt (T x) |
Inverse square root. | |
| template<class T, class Abi> | |
| datapar::simd< T, Abi > | batmat::ops::rsqrt (datapar::simd< T, Abi > x) |
Inverse square root. | |
Transposition | |
| template<index_t R, index_t C, class T> | |
| void | batmat::ops::transpose_dyn (const T *pa, index_t lda, T *pb, index_t ldb, index_t d=R) |
Transposes the R × C matrix at pa with leading dimension lda, writing the result to pb with leading dimension ldb, writing only the first d columns of the result. | |
| template<index_t R, index_t C, class T> | |
| void | batmat::ops::transpose (const T *pa, index_t lda, T *pb, index_t ldb) |
Transposes the R × C matrix at pa with leading dimension lda, writing the result to pb with leading dimension ldb. | |
T batmat::ops::detail::cneg (T x, T signs)
#include <batmat/ops/cneg.hpp>
Conditionally negates the sign bit of x, depending on signs, which should contain only ±0 (i.e.
only the sign bit of an IEEE-754 floating point number).
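One common way to realize a conditional negation of this kind is to XOR the bit pattern of x with the bit pattern of signs. The following is a minimal scalar sketch under that assumption, not the library's implementation: the name `cneg_sketch` is hypothetical, it handles `double` only, and it assumes that a `signs` value of -0.0 flips the sign of x while +0.0 leaves x unchanged.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical scalar stand-in for cneg (double only), for illustration.
// Assumption: signs == -0.0 negates x, signs == +0.0 leaves x unchanged.
inline double cneg_sketch(double x, double signs) {
    std::uint64_t xb, sb;
    std::memcpy(&xb, &x, sizeof xb);     // bit pattern of x
    std::memcpy(&sb, &signs, sizeof sb); // bit pattern of signs
    // Documented precondition: signs holds only a sign bit (i.e. it is ±0).
    assert((sb & ~(std::uint64_t{1} << 63)) == 0);
    xb ^= sb; // flips the sign bit of x exactly when signs is -0.0
    double r;
    std::memcpy(&r, &xb, sizeof r);
    return r;
}
```

Because only the sign bit is touched, the magnitude of x is preserved exactly, and the operation involves no branch, which is what makes the same idea attractive lane by lane in a SIMD vector.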
|
inline |
#include <batmat/ops/cneg.hpp>
Conditionally negates the sign bit of x, depending on signs, which should contain only ±0 (i.e.
only the sign bit of an IEEE-754 floating point number).
|
inline |
#include <batmat/ops/cneg.hpp>
Conditionally negates the sign bit of x, depending on signs, which should contain only ±0 (i.e.
only the sign bit of an IEEE-754 floating point number).
|
inline |
#include <batmat/ops/gather.hpp>
Gathers elements from memory at the addresses specified by idx, which should be an integer SIMD vector, and returns them in a SIMD vector of type datapar::simd<T, AbiT>.
The elements are gathered relative to the base address p, and the gathering is masked by mask.
Definition at line 56 of file gather.hpp.
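As a rough scalar model of a masked gather, the sketch below loops over the lanes and loads `p[idx[i]]` for each active lane. It is an illustration only: `gather_sketch` is a hypothetical name, `std::array` stands in for the SIMD vector and mask types, the indices are assumed to be element offsets (not byte offsets), and inactive lanes are assumed to be left value-initialized, which the description above does not specify.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical scalar model of a masked gather, for illustration.
// Assumptions: idx holds element offsets relative to p, and lanes whose
// mask entry is false are left at T{} (this is not confirmed by the docs).
template <class T, class I, std::size_t N>
std::array<T, N> gather_sketch(const T *p, const std::array<I, N> &idx,
                               const std::array<bool, N> &mask) {
    assert(p != nullptr);
    std::array<T, N> out{}; // value-initialized: inactive lanes stay T{}
    for (std::size_t i = 0; i < N; ++i)
        if (mask[i])
            out[i] = p[idx[i]]; // only active lanes touch memory
    return out;
}
```

Note that an inactive lane never dereferences its index, so a masked-out index may point outside the valid range without causing an out-of-bounds read in this model.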
|
inline |
#include <batmat/ops/rotate.hpp>
Rotate the elements of x to the left by S positions.
For example, rotl<1>([x0, x1, x2, x3]) == [x1, x2, x3, x0] and rotl<-1>([x0, x1, x2, x3]) == [x3, x0, x1, x2].
Definition at line 226 of file rotate.hpp.
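The rotation semantics in the examples above, including negative S, can be modeled on a plain `std::array`. This is a sketch for checking expectations, not the SIMD implementation: `rotl_sketch` and `rotr_sketch` are hypothetical names, and the reduction of S modulo the vector length is an assumption.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical array-based model of a lane-wise left rotation.
// Assumption: S is reduced modulo the number of lanes, so negative S
// rotates to the right.
template <int S, class T, std::size_t N>
std::array<T, N> rotl_sketch(std::array<T, N> x) {
    static_assert(N > 0, "need at least one lane");
    constexpr int n = static_cast<int>(N);
    constexpr int s = ((S % n) + n) % n; // normalize S to [0, n)
    // std::rotate makes the element at begin() + s the new first element.
    std::rotate(x.begin(), x.begin() + s, x.end());
    return x;
}

// A right rotation by S is modeled as a left rotation by -S.
template <int S, class T, std::size_t N>
std::array<T, N> rotr_sketch(std::array<T, N> x) {
    return rotl_sketch<-S>(x);
}

// Usage, mirroring the documented example with x = [0, 1, 2, 3].
inline void rot_sketch_example() {
    std::array<int, 4> x{0, 1, 2, 3};
    assert((rotl_sketch<1>(x) == std::array<int, 4>{1, 2, 3, 0}));
    assert((rotl_sketch<-1>(x) == std::array<int, 4>{3, 0, 1, 2}));
}
```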
|
inline |
#include <batmat/ops/rotate.hpp>
Rotate the elements of x to the right by S positions.
For example, rotr<1>([x0, x1, x2, x3]) == [x3, x0, x1, x2] and rotr<-1>([x0, x1, x2, x3]) == [x1, x2, x3, x0].
Definition at line 239 of file rotate.hpp.
|
inline |
#include <batmat/ops/rotate.hpp>
Shift the elements of x to the left by S positions, shifting in zeros.
For example, shiftl<1>([x0, x1, x2, x3]) == [x1, x2, x3, 0] and shiftl<-1>([x0, x1, x2, x3]) == [0, x0, x1, x2].
Definition at line 252 of file rotate.hpp.
|
inline |
#include <batmat/ops/rotate.hpp>
Shift the elements of x to the right by S positions, shifting in zeros.
For example, shiftr<1>([x0, x1, x2, x3]) == [0, x0, x1, x2] and shiftr<-1>([x0, x1, x2, x3]) == [x1, x2, x3, 0].
Definition at line 267 of file rotate.hpp.
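The shift examples above differ from the rotations only in that vacated lanes are filled with zeros. A small array-based model, with hypothetical names `shiftl_sketch` and `shiftr_sketch`, might look as follows. The behavior when |S| is at least the number of lanes (all zeros here) is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical array-based model of a lane-wise left shift with zero fill.
// Assumption: shifting by at least the number of lanes yields all zeros.
template <int S, class T, std::size_t N>
std::array<T, N> shiftl_sketch(const std::array<T, N> &x) {
    std::array<T, N> out{}; // all lanes start at zero
    constexpr int n = static_cast<int>(N);
    for (int i = 0; i < n; ++i) {
        const int src = i + S; // lane i receives lane i + S of the input
        if (src >= 0 && src < n)
            out[static_cast<std::size_t>(i)] = x[static_cast<std::size_t>(src)];
    }
    return out;
}

// A right shift by S is modeled as a left shift by -S.
template <int S, class T, std::size_t N>
std::array<T, N> shiftr_sketch(const std::array<T, N> &x) {
    return shiftl_sketch<-S>(x);
}

// Usage, mirroring the documented example with x = [1, 2, 3, 4].
inline void shift_sketch_example() {
    std::array<int, 4> x{1, 2, 3, 4};
    assert((shiftl_sketch<1>(x) == std::array<int, 4>{2, 3, 4, 0}));
    assert((shiftr_sketch<1>(x) == std::array<int, 4>{0, 1, 2, 3}));
}
```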
T batmat::ops::rsqrt (T x)
datapar::simd< T, Abi > batmat::ops::rsqrt (datapar::simd< T, Abi > x)
#include <batmat/ops/rsqrt.hpp>
Inverse square root.
Depending on the SIMD ABI, this may be implemented using an rsqrt instruction followed by Newton iterations. This allows it to execute in parallel with a regular square root instruction, which improves the performance of the Cholesky micro-kernels.
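The "estimate plus Newton iterations" idea can be sketched in scalar code. Newton's method applied to f(y) = 1/y² - x gives the update y ← y·(1.5 - 0.5·x·y²), which roughly squares the relative error at each step. In the sketch below, `rsqrt_sketch` is a hypothetical name, and a single-precision computation stands in for the hardware rsqrt estimate. The actual instruction, the number of iterations, and the handling of special values in the library are not described here and are not modeled.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical scalar model of rsqrt via a low-precision estimate that is
// refined by Newton iterations. Assumption: x is positive and within the
// range of float, since the initial guess is computed in single precision.
inline double rsqrt_sketch(double x, int iterations = 2) {
    assert(x > 0);
    // Stand-in for a hardware rsqrt estimate (accurate to float precision).
    double y = static_cast<double>(1.0f / std::sqrt(static_cast<float>(x)));
    for (int i = 0; i < iterations; ++i)
        y = y * (1.5 - 0.5 * x * y * y); // Newton step on f(y) = 1/y^2 - x
    return y;
}
```

Each refinement step uses only multiplications and subtractions, so it does not compete with a square root or division unit, which is consistent with the parallelism argument above.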
|
inline |
#include <batmat/ops/transpose.hpp>
Transposes the R × C matrix at pa with leading dimension lda, writing the result to pb with leading dimension ldb, writing only the first d columns of the result.
Definition at line 20 of file transpose.hpp.
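To make the roles of lda, ldb and d concrete, here is a plain-loop reference sketch. It is not the library's kernel: `transpose_sketch` is a hypothetical name, `index_t` is assumed to be a signed integer type (`std::ptrdiff_t` here), and column-major storage is assumed, so that element (r, c) of the input lives at `pa[r + c * lda]`.

```cpp
#include <cassert>
#include <cstddef>

using index_t = std::ptrdiff_t; // assumption: a signed index type

// Hypothetical loop-based reference for a transpose with leading dimensions.
// Assumption: column-major storage for both input (R x C) and output (C x R).
template <index_t R, index_t C, class T>
void transpose_sketch(const T *pa, index_t lda, T *pb, index_t ldb,
                      index_t d = R) {
    assert(lda >= R && ldb >= C); // leading dimensions must cover a column
    assert(d >= 0 && d <= R);     // the result has R columns in total
    for (index_t r = 0; r < d; ++r)     // column r of the result ...
        for (index_t c = 0; c < C; ++c) // ... is row r of the input
            pb[c + r * ldb] = pa[r + c * lda];
}
```

With d left at its default of R, the full C × R result is written. A smaller d leaves the remaining columns of the output untouched, which matters when the output buffer holds data that should be preserved.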
|
inline |
#include <batmat/ops/transpose.hpp>
Transposes the R × C matrix at pa with leading dimension lda, writing the result to pb with leading dimension ldb.
Definition at line 63 of file transpose.hpp.
|
inline |
#include <batmat/ops/rotate.hpp>
Rotate the elements of x to the right by s positions.
For example, rotr<1>([x0, x1, x2, x3]) == [x3, x0, x1, x2] and rotr<-1>([x0, x1, x2, x3]) == [x1, x2, x3, x0].
Definition at line 18 of file rotate.hpp.