batmat 0.0.14
Batched linear algebra routines
batmat::ops Namespace Reference

Namespaces

namespace  detail

Gathering elements from memory

template<class T, class AbiT, class I, class AbiI, class M>
datapar::simd< T, AbiT > gather (const T *p, datapar::simd< I, AbiI > idx, M mask)
 Gathers elements from memory at the addresses specified by idx, which should be an integer SIMD vector, and returns them in a SIMD vector of type datapar::simd<T, AbiT>.
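The semantics of a masked gather can be illustrated with a scalar reference model. The following is a minimal sketch, not batmat's implementation: `gather_ref` and `gather_example` are hypothetical names, `std::array` stands in for `datapar::simd`, the entries of idx are assumed to be element offsets relative to p, and inactive lanes are assumed to yield zero (the description above does not specify their value).

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Scalar reference model of a masked gather (illustrative only).
// N plays the role of the SIMD width.
template <class T, class I, std::size_t N>
std::array<T, N> gather_ref(const T *p, const std::array<I, N> &idx,
                            const std::array<bool, N> &mask) {
    std::array<T, N> out{}; // value-initialized: inactive lanes stay zero (assumption)
    for (std::size_t i = 0; i < N; ++i)
        if (mask[i])
            out[i] = p[idx[i]]; // only active lanes read memory
    return out;
}

// Usage: gather four elements, with lane 1 masked off.
inline void gather_example() {
    const double data[] = {10, 11, 12, 13, 14, 15};
    const std::array<int, 4> idx{5, 0, 3, 1};
    const std::array<bool, 4> mask{true, false, true, true};
    const std::array<double, 4> r = gather_ref(data, idx, mask);
    assert(r[0] == 15 && r[1] == 0 && r[2] == 13 && r[3] == 11);
}
```

Because masked-off lanes never dereference p, a mask can be used to avoid out-of-bounds reads for lanes whose index is not meaningful.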

Lane-wise rotations of SIMD vectors

template<int S, class F, class Abi>
datapar::simd< F, Abi > rotl (datapar::simd< F, Abi > x)
 Rotate the elements of x to the left by S positions.
template<int S, class F, class Abi>
datapar::simd< F, Abi > rotr (datapar::simd< F, Abi > x)
 Rotate the elements of x to the right by S positions.
template<int S, class F, class Abi>
datapar::simd< F, Abi > shiftl (datapar::simd< F, Abi > x)
 Shift the elements of x to the left by S positions, shifting in zeros.
template<int S, class F, class Abi>
datapar::simd< F, Abi > shiftr (datapar::simd< F, Abi > x)
 Shift the elements of x to the right by S positions, shifting in zeros.
template<class F, class Abi>
datapar::simd< F, Abi > rot (datapar::simd< F, Abi > x, int s)
 Rotate the elements of x to the right by s positions.
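The difference between rotating and shifting can be made concrete with scalar reference models. This is a sketch under stated assumptions rather than batmat's code: the `*_ref` names are hypothetical, `std::array` stands in for `datapar::simd`, "left" is assumed to mean toward lane 0, and negative run-time counts are assumed to wrap around.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Rotate left: lane i takes the value S lanes to its right, wrapping around.
template <int S, class T, std::size_t N>
std::array<T, N> rotl_ref(const std::array<T, N> &x) {
    static_assert(S >= 0 && N > 0, "sketch handles non-negative S only");
    std::array<T, N> y{};
    for (std::size_t i = 0; i < N; ++i)
        y[i] = x[(i + S) % N];
    return y;
}

// Shift left: like rotl, but vacated lanes are filled with zeros.
template <int S, class T, std::size_t N>
std::array<T, N> shiftl_ref(const std::array<T, N> &x) {
    static_assert(S >= 0, "sketch handles non-negative S only");
    std::array<T, N> y{}; // value-initialized, so vacated lanes are zero
    for (std::size_t i = 0; i + S < N; ++i)
        y[i] = x[i + S];
    return y;
}

// Shift right: lane i takes the value S lanes to its left, zeros shifted in.
template <int S, class T, std::size_t N>
std::array<T, N> shiftr_ref(const std::array<T, N> &x) {
    static_assert(S >= 0, "sketch handles non-negative S only");
    std::array<T, N> y{};
    for (std::size_t i = S; i < N; ++i)
        y[i] = x[i - S];
    return y;
}

// Run-time rotation to the right; negative counts wrap around (assumption).
template <class T, std::size_t N>
std::array<T, N> rot_ref(const std::array<T, N> &x, int s) {
    static_assert(N > 0, "empty vectors cannot be rotated");
    const int n = static_cast<int>(N);
    const std::size_t r = static_cast<std::size_t>(((s % n) + n) % n);
    std::array<T, N> y{};
    for (std::size_t i = 0; i < N; ++i)
        y[(i + r) % N] = x[i]; // lane i moves r lanes to the right
    return y;
}

// Compile-time rotation to the right, expressed through the run-time version.
template <int S, class T, std::size_t N>
std::array<T, N> rotr_ref(const std::array<T, N> &x) {
    return rot_ref(x, S);
}

// Usage: expected results under the lane convention assumed above.
inline void rotation_examples() {
    const std::array<int, 4> x{1, 2, 3, 4};
    assert((rotl_ref<1>(x) == std::array<int, 4>{2, 3, 4, 1}));
    assert((rotr_ref<1>(x) == std::array<int, 4>{4, 1, 2, 3}));
    assert((shiftl_ref<1>(x) == std::array<int, 4>{2, 3, 4, 0}));
    assert((shiftr_ref<1>(x) == std::array<int, 4>{0, 1, 2, 3}));
    assert((rot_ref(x, 1) == rotr_ref<1>(x)));
}
```

The compile-time variants take the count as the template parameter S, while rot takes it as a function argument; a real SIMD implementation would typically map the former to a fixed shuffle pattern.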

Inverse square root

template<std::floating_point T>
rsqrt (T x)
 Inverse square root.
template<class T, class Abi>
datapar::simd< T, Abi > rsqrt (datapar::simd< T, Abi > x)
 Inverse square root.

Transposition

template<index_t R, index_t C, class T>
void transpose_dyn (const T *pa, index_t lda, T *pb, index_t ldb, index_t d=R)
 Transposes the R × C matrix at pa with leading dimension lda, writing the result to pb with leading dimension ldb, but writing only the first d columns of the result.
template<index_t R, index_t C, class T>
void transpose (const T *pa, index_t lda, T *pb, index_t ldb)
 Transposes the R × C matrix at pa with leading dimension lda, writing the result to pb with leading dimension ldb.
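The roles of the leading dimensions and of the parameter d can be sketched with a plain loop-based reference. This is an illustration, not the library's register-level implementation: the `*_ref` names are hypothetical, column-major storage is assumed, and `index_t` is assumed to be a signed integer type.

```cpp
#include <cassert>
#include <cstddef>

using index_t = std::ptrdiff_t; // assumption: a signed index type

// Reference transpose of an R x C column-major matrix A into the C x R matrix B.
// Element (r, c) of A is pa[r + c * lda]; element (c, r) of B is pb[c + r * ldb].
// Only the first d columns of B (i.e. the first d rows of A) are written.
template <index_t R, index_t C, class T>
void transpose_dyn_ref(const T *pa, index_t lda, T *pb, index_t ldb, index_t d = R) {
    assert(lda >= R && ldb >= C && 0 <= d && d <= R);
    for (index_t r = 0; r < d; ++r)     // column r of B is row r of A
        for (index_t c = 0; c < C; ++c)
            pb[c + r * ldb] = pa[r + c * lda];
}

// Full transpose: all R columns of the result are written.
template <index_t R, index_t C, class T>
void transpose_ref(const T *pa, index_t lda, T *pb, index_t ldb) {
    transpose_dyn_ref<R, C>(pa, lda, pb, ldb, R);
}
```

For example, transposing the 2 × 3 matrix stored as {1, 2, 3, 4, 5, 6} with lda = 2 into a buffer with ldb = 3 yields {1, 3, 5, 2, 4, 6}; with d = 1, only the first three entries of the output buffer are written.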

Functions

template<class T>
T cneg (T x, T signs)
 Conditionally negates the sign bit of x, depending on signs, which should contain only ±0 (i.e. only the sign bit of an IEEE-754 floating-point number).

Variables

template<class T>
constexpr index_t RowsRegTranspose = 8
template<class T>
constexpr index_t ColsRegTranspose = 8
template<>
constexpr index_t RowsRegTranspose< double > = 4
template<>
constexpr index_t ColsRegTranspose< double > = 4

Function Documentation

◆ cneg()

template<class T>
T batmat::ops::detail::cneg (T x, T signs)

Conditionally negates the sign bit of x, depending on signs, which should contain only ±0 (i.e. only the sign bit of an IEEE-754 floating-point number).

Definition at line 42 of file cneg.hpp.
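
One plausible way to realize this operation is a bitwise XOR of the two bit patterns. The sketch below is an assumption about the technique, not a copy of cneg.hpp: `cneg_ref` is a hypothetical name, it handles only `double`, and it assumes an IEEE-754 binary64 representation.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Illustrative scalar model for double (not batmat's implementation):
// XOR-ing the bit patterns flips the sign bit of x exactly when signs is -0.0.
inline double cneg_ref(double x, double signs) {
    assert(signs == 0.0); // precondition: holds for both +0.0 and -0.0
    std::uint64_t xb, sb;
    std::memcpy(&xb, &x, sizeof xb);     // well-defined way to read the bits
    std::memcpy(&sb, &signs, sizeof sb);
    xb ^= sb; // +0.0 is all zero bits; -0.0 has only the sign bit set
    std::memcpy(&x, &xb, sizeof x);
    return x;
}
```

With this model, `cneg_ref(3.5, -0.0)` gives -3.5, while `cneg_ref(3.5, 0.0)` leaves the value unchanged. Requiring signs to be ±0 is what makes the XOR safe: any other value would also disturb the exponent and mantissa bits of x.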

Variable Documentation

◆ RowsRegTranspose

template<class T>
index_t batmat::ops::RowsRegTranspose = 8
inline constexpr

Definition at line 23 of file avx-512.hpp.

◆ ColsRegTranspose

template<class T>
index_t batmat::ops::ColsRegTranspose = 8
inline constexpr

Definition at line 25 of file avx-512.hpp.

◆ RowsRegTranspose< double >

template<>
index_t batmat::ops::RowsRegTranspose< double > = 4
inline constexpr

Definition at line 29 of file avx-512.hpp.

◆ ColsRegTranspose< double >

template<>
index_t batmat::ops::ColsRegTranspose< double > = 4
inline constexpr

Definition at line 31 of file avx-512.hpp.