SIMDscan
Documentation for SIMDscan.
This package provides code for doing a scan using SIMD instructions. Scans are also known as prefix operations. The Base Julia function accumulate
performs the same operation.
Given a sequence $x_1, x_2, ... , x_n$, and an associative operator, $\oplus$, the the scan is
\[x_1, x_2 \oplus x_1, x_3 \oplus x_2 \oplus x_1, ... , x_n \oplus x_{n-1} \oplus \cdots \oplus x_1\]
For parallelization, it is essential that $\oplus$ be associative. The function scan_simd!(⊕, x)
computes the scan of x
in place.
Warnings
x
must be indexed from1
tolength(x)
.⊕
must be associative forscan_simd!
Multivariate Operations
There is also a method for scanning ⊕: ℝᴷ→ℝᴷ
. In this case, ⊕
should accept two NTuple{K,T}
tuples as arguments and return a NTuple{K,T}
.
x
should be a NTuple{K,AbstractVector}
where each element of the tuple is the sequence of values.
The multivariate scan can be used to simulate an AR(1) model.
T = 10
ϵ = randn(T)
# simple recursive version for comparison
y = similar(ϵ)
y[1] = ϵ[1]
α = 0.5
for t = 2:T
y[t] = α*y[t-1] + ϵ[t]
end
# to make ⊕ associative, augment ϵ[t] with a second element that keeps track of the appropriate power of α
⊕(y,e) = (e[1] + α*y[1]*e[2], α*y[2]*e[2])
id = (0.0,1.0/α) # a left identity; ⊕(id,x) = x ∀ x
yscansimd,at = scan_simd!(⊕, (copy(ϵ), ones(T)), identity=id)
[y yscansimd]
10×2 Matrix{Float64}:
0.944978 0.944978
1.56944 1.56944
1.0615 1.0615
0.298787 0.298787
-1.61288 -1.61288
-1.39296 -1.39296
-0.802339 -0.802339
-0.266392 -0.266392
-1.00768 -1.00768
-1.29495 -1.29495
Acknowledgements
The following sources were helpful while developing this package:
- (Slotin, 2023) describes an SIMD implementation for prefix sum with C code
- (Nash *et al.*, 2021) has example code for a threaded scan (prefix sum) in Julia,