Functions

SIMDscan.scan_serial! - Method
scan_serial!(f::F,x::AbstractVector) where {F <: Function}

Replaces the vector x, in place, with the scan of x under f. That is, the elements of x become

x[1] 
f(x[2],x[1]) 
f(x[3],f(x[2],x[1])) 
⋮
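
The definition above can be sketched as a plain loop. This is a minimal reference sketch in Base Julia, not the package's source, and the name `reference_scan!` is hypothetical. Note that f receives the new element first and the running result second, which matters when f is not commutative.

```julia
# Hypothetical reference loop for the definition above (not SIMDscan's implementation).
function reference_scan!(f, x::AbstractVector)
    for i in (firstindex(x) + 1):lastindex(x)
        # New element first, running result second: f(x[i], x[i-1]).
        x[i] = f(x[i], x[i - 1])
    end
    return x
end

sums = reference_scan!(+, [1, 2, 3, 4])   # [1, 3, 6, 10]

# String concatenation is associative but not commutative, so the argument order shows.
strs = reference_scan!((new, acc) -> acc * new, ["a", "b", "c"])   # ["a", "ab", "abc"]
```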

Examples

julia> scan_serial!(+,[1,2,3,4])
4-element Vector{Int64}:
  1
  3
  6
 10
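
As a cross-check, the same values can be obtained with Base's `accumulate`. It calls its operator as op(running, new), so the arguments are swapped relative to the definition above, and it returns a new array instead of working in place. The helper name `scan_via_accumulate` is hypothetical.

```julia
# accumulate(op, x) calls op(running_result, new_element);
# the scan documented here calls f(new_element, running_result).
scan_via_accumulate(f, x) = accumulate((acc, new) -> f(new, acc), x)

a = scan_via_accumulate(+, [1, 2, 3, 4])                          # [1, 3, 6, 10]
b = scan_via_accumulate((new, acc) -> new - acc, [1, 2, 3, 4])    # [1, 1, 2, 2]
```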
SIMDscan.scan_serial! - Method
scan_serial!(f::F,x::NTuple{K, AbstractVector{T}}) where {F <: Function, K, T}

In-place scan for a function that takes two K-tuples as arguments. Replaces the tuple of vectors x with the scan.

Examples

julia> scan_serial!((x,y)->(x[1]+y[1], x[2]*y[2]),([1,2,3,4],[1,2,3,4]))
([1, 3, 6, 10], [1, 2, 6, 24])
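
The tuple form can be sketched the same way: at each position, f receives the K-tuple of new elements and the K-tuple of running results, and returns a K-tuple that is written back. This is a reference sketch under that reading, not the package's source; `reference_tuple_scan!` is a hypothetical name and the vectors are assumed to use 1-based indexing.

```julia
# Hypothetical reference version of the tuple method (not SIMDscan's implementation).
function reference_tuple_scan!(f, x::Tuple{Vararg{AbstractVector}})
    for i in 2:length(first(x))
        cur  = map(v -> v[i], x)        # K-tuple of new elements
        prev = map(v -> v[i - 1], x)    # K-tuple of running results
        out  = f(cur, prev)
        for k in 1:length(x)
            x[k][i] = out[k]
        end
    end
    return x
end

sum_and_product = (x, y) -> (x[1] + y[1], x[2] * y[2])
result = reference_tuple_scan!(sum_and_product, ([1, 2, 3, 4], [1, 2, 3, 4]))
# result == ([1, 3, 6, 10], [1, 2, 6, 24])
```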
SIMDscan.scan_simd! - Method
scan_simd!(f::F, x::AbstractVector{T}, v::Val{N}=Val(8);identity::T=zero(T)) where {F, T, N}

In-place scan for an associative function. Replaces the vector x with the scan.

identity should be a left identity under f. That is, f(identity, y) = y for all y.
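
A quick way to sanity-check a candidate identity is to test the property on a few sample values. This is only a spot check, not a proof, and the helper `is_left_identity` is hypothetical.

```julia
# Hypothetical helper: test f(e, y) == y on a few sample values.
is_left_identity(f, e, samples) = all(y -> f(e, y) == y, samples)

plus_zero  = is_left_identity(+, 0, -3:3)               # true:  0 + y == y
times_zero = is_left_identity(*, 0, -3:3)               # false: 0 * y == 0
times_one  = is_left_identity(*, 1, -3:3)               # true:  1 * y == y
max_min    = is_left_identity(max, typemin(Int), -3:3)  # true

# With SIMDscan loaded, a product scan would pass the keyword from the signature above:
# scan_simd!(*, [1, 2, 3, 4]; identity = 1)
```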

T must be a type that can be loaded into SIMD registers, i.e. one of SIMD.VecTypes.

f must be associative. Otherwise, this will give incorrect results.
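
To see why associativity matters, consider any scan that regroups the work, for example by scanning two halves independently and then folding the last value of the first half into the second. The sketch below is a generic illustration in Base Julia, not a description of how SIMDscan works internally; both function names are hypothetical. The regrouped result matches the serial one for + but not for subtraction.

```julia
# Serial scan with the documented argument order f(new, running).
function serial_scan(f, x)
    y = copy(x)
    for i in 2:length(y)
        y[i] = f(y[i], y[i - 1])
    end
    return y
end

# Scan two halves independently, then combine the second half with the
# last running value of the first half. Valid only if f is associative.
function two_block_scan(f, x)
    h = length(x) ÷ 2
    left  = serial_scan(f, x[1:h])
    right = serial_scan(f, x[(h + 1):end])
    return vcat(left, [f(r, left[end]) for r in right])
end

x = [1, 2, 3, 4]
minus = (new, acc) -> new - acc

same_for_plus  = two_block_scan(+, x) == serial_scan(+, x)          # true
same_for_minus = two_block_scan(minus, x) == serial_scan(minus, x)  # false
```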

Val(N) specifies the SIMD vector width, in elements. The default of 8 should give good performance on CPUs with AVX-512, where 8 Float64 values fill a 512-bit register.
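
The arithmetic behind the default can be sketched as follows. The helper `lanes` is hypothetical, and whether filling the register exactly is the best choice for other element types or CPUs is an assumption worth benchmarking rather than a guarantee.

```julia
# Hypothetical helper: how many values of type T fit in a register of `bits` bits.
lanes(::Type{T}, bits::Integer) where {T} = bits ÷ (8 * sizeof(T))

n_f64_512 = lanes(Float64, 512)   # 8  (the AVX-512 case described above)
n_f32_512 = lanes(Float32, 512)   # 16
n_f64_256 = lanes(Float64, 256)   # 4  (256-bit registers, e.g. AVX2)

# With SIMDscan loaded, the width is the third positional argument in the signature above:
# scan_simd!(+, x, Val(4))
```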

Examples

julia> scan_simd!(+,[1,2,3,4])
4-element Vector{Int64}:
  1
  3
  6
 10
SIMDscan.scan_simd! - Method
scan_simd!(f::F, x::NTuple{K, AbstractVector{T}}, v::Val{N}=Val(8); identity::NTuple{K,T}=ntuple(i->zero(T),Val(K))) where {F, K, T, N}

In-place scan for an associative function that takes two K-tuples as arguments. Replaces the tuple of vectors x with the scan.

identity should be a left identity under f. That is, f(identity, y) = y for all y.
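
For tuple-valued functions the identity is itself a K-tuple, and the default of all zeros is a left identity only when every component of f behaves like addition. As a small check in Base Julia, for a function that adds the first components and multiplies the second, the tuple (0, 1) satisfies the requirement while (0, 0) does not. The name `sum_and_prod` is hypothetical.

```julia
# Adds the first components, multiplies the second.
sum_and_prod = (x, y) -> (x[1] + y[1], x[2] * y[2])

y = (3, 5)
with_zeros = sum_and_prod((0, 0), y)   # (3, 0): not equal to y
with_one   = sum_and_prod((0, 1), y)   # (3, 5): equal to y

# With SIMDscan loaded, a call that follows the requirement above would be:
# scan_simd!(sum_and_prod, (a, b); identity = (0, 1))
```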

T must be a type that can be loaded into SIMD registers, i.e. one of SIMD.VecTypes.

f must be associative. Otherwise, this will give incorrect results.
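
For floating-point element types, + and * are associative only up to rounding. The sketch below shows just that arithmetic fact in Base Julia; it makes no claim about SIMDscan's internals. The practical consequence is that a scan that regroups operations may differ from a serial scan in the last bits, so comparisons are better done with `isapprox` than with `==`.

```julia
a, b, c = 0.1, 0.2, 0.3

left  = (a + b) + c    # 0.6000000000000001
right = a + (b + c)    # 0.6

exactly_equal = left == right          # false
approx_equal  = isapprox(left, right)  # true
```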

Val(N) specifies the SIMD vector width, in elements. The default of 8 should give good performance on CPUs with AVX-512, where 8 Float64 values fill a 512-bit register.

Examples

julia> scan_simd!((x,y)->(x[1]+y[1], x[2]*y[2]),([1,2,3,4],[1,2,3,4]))
([1, 3, 6, 10], [1, 2, 6, 24])
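
One use for a tuple-valued f is a first-order linear recurrence y[i] = a[i]*y[i-1] + b[i]. Treating each pair (a[i], b[i]) as the map y -> a[i]*y + b[i], composing maps is associative, so the recurrence becomes a scan over pairs. The sketch below checks this with a plain serial loop in Base Julia; all names are hypothetical, and it stands in for the package call rather than reproducing it.

```julia
# Compose "new after acc": y -> new[1]*(acc[1]*y + acc[2]) + new[2].
compose(new, acc) = (new[1] * acc[1], new[1] * acc[2] + new[2])

# Serial scan over pairs, using the documented argument order f(new, running).
function pair_scan(f, a, b)
    A, B = copy(a), copy(b)
    for i in 2:length(A)
        A[i], B[i] = f((A[i], B[i]), (A[i - 1], B[i - 1]))
    end
    return A, B
end

# Direct evaluation of the recurrence with y[0] = 0, for comparison.
function recurrence(a, b)
    y = similar(b)
    prev = zero(eltype(b))
    for i in eachindex(a, b)
        prev = a[i] * prev + b[i]
        y[i] = prev
    end
    return y
end

a = [1.0, 0.5, 0.5, 0.5]
b = [1.0, 1.0, 1.0, 1.0]
A, B = pair_scan(compose, a, b)
# B == recurrence(a, b) == [1.0, 1.5, 1.75, 1.875]
```

For this f, the tuple (1.0, 0.0) is a left identity, since compose((1.0, 0.0), y) returns y.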