Consider these definitions from a previous question:
type Algebra f a = f a -> a
cata :: Functor f => Algebra f b -> Fix f -> b
cata alg = alg . fmap (cata alg) . unFix
fixcata :: Functor f => Algebra f b -> Fix f -> b
fixcata alg = fix $ \f -> alg . fmap f . unFix
type CoAlgebra f a = a -> f a
ana :: Functor f => CoAlgebra f a -> a -> Fix f
ana coalg = Fix . fmap (ana coalg) . coalg
fixana :: Functor f => CoAlgebra f a -> a -> Fix f
fixana coalg = fix $ \f -> Fix . fmap f . coalg
I ran some benchmarks and the results are surprising me. criterion
reports something like a tenfold speedup, specifically when O2
is enabled. I wonder what causes such massive improvement, and begin to seriously doubt my benchmarking abilities.
This is the exact criterion
code I use:
smallWord, largeWord :: Word
smallWord = 2^10
largeWord = 2^20
shortEnv, longEnv :: Fix Maybe
shortEnv = ana coAlg smallWord
longEnv = ana coAlg largeWord
benchCata = nf (cata alg)
benchFixcata = nf (fixcata alg)
benchAna = nf (ana coAlg)
benchFixana = nf (fixana coAlg)
main = defaultMain
[ bgroup "cata"
[ bgroup "short input"
[ env (return shortEnv) $ \x -> bench "cata" (benchCata x)
, env (return shortEnv) $ \x -> bench "fixcata" (benchFixcata x)
]
, bgroup "long input"
[ env (return longEnv) $ \x -> bench "cata" (benchCata x)
, env (return longEnv) $ \x -> bench "fixcata" (benchFixcata x)
]
]
, bgroup "ana"
[ bgroup "small word"
[ bench "ana" $ benchAna smallWord
, bench "fixana" $ benchFixana smallWord
]
, bgroup "large word"
[ bench "ana" $ benchAna largeWord
, bench "fixana" $ benchFixana largeWord
]
]
]
And some auxiliary code:
alg :: Algebra Maybe Word
alg Nothing = 0
alg (Just x) = succ x
coAlg :: CoAlgebra Maybe Word
coAlg 0 = Nothing
coAlg x = Just (pred x)
Compiled with O0
, the digits are pretty even. With O2
, fix~
functions seem to outperform the plain ones:
benchmarking cata/short input/cata
time 31.67 μs (31.10 μs .. 32.26 μs)
0.999 R² (0.998 R² .. 1.000 R²)
mean 31.20 μs (31.05 μs .. 31.46 μs)
std dev 633.9 ns (385.3 ns .. 1.029 μs)
variance introduced by outliers: 18% (moderately inflated)
benchmarking cata/short input/fixcata
time 2.422 μs (2.407 μs .. 2.440 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 2.399 μs (2.388 μs .. 2.410 μs)
std dev 37.12 ns (31.44 ns .. 47.06 ns)
variance introduced by outliers: 14% (moderately inflated)
I would appreciate if someone can confirm or spot a flaw.
*I compiled things with ghc 8.2.2
on this occasion.)
postscriptum
This post from back in 2012 elaborates on the performance of fix
in quite a fine detail. (Thanks to @chi
for the link.)