Here are some convolution benchmarks comparing direct convolution (<code>psImageConvolveDirect</code>) with FFT convolution (<code>psImageConvolveFFT</code>). The source is in <code>psLib/test/convolveBench.c</code>. Times are in seconds per convolution. The kernel sizes are full widths, distributed evenly about 0 (so a 7x7 kernel runs from -3 to +3 on each axis).
No multithreading or other tricks were used; the only optimisation was compiling psLib with <code>--enable-optimize</code>.
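The bottom line is that unless the kernel is really large (more than around 10 pix across, corresponding to about 2.5 arcsec for PS1), direct convolution is the way to go. For the 600x600 test image, FFT convolution only catches up somewhere between the 9x9 and 17x17 kernels, depending on the machine.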
- On <code>alala</code>:
<pre>
# Image    Kernel  Direct    FFT
100x100    7x7     0.014120   0.023974
200x200    7x7     0.004930   0.025839
400x400    7x7     0.019928   0.143949
600x600    7x7     0.045172   0.240070
800x800    7x7     0.080492   0.581588
1000x1000  7x7     0.125420   1.776494
2000x2000  7x7     0.504519   5.551337
4000x4000  7x7     2.009139  38.239530

600x600    3x3     0.016893   0.245433
600x600    5x5     0.028220   0.206202
600x600    7x7     0.045632   0.238620
600x600    9x9     0.095164   0.204965
600x600    13x13   0.176188   0.239563
600x600    17x17   0.271444   0.263421
600x600    21x21   0.394447   0.267914
600x600    31x31   0.786528   0.266132
</pre>
- On <code>mithrandir</code>:
<pre>
# Image    Kernel  Direct    FFT
100x100    7x7     0.029464   0.007483
200x200    7x7     0.013420   0.016342
400x400    7x7     0.053638   0.101447
600x600    7x7     0.123597   0.304330
800x800    7x7     0.223795   0.557118
1000x1000  7x7     0.346237   0.669387
2000x2000  7x7     1.338501   5.108009
4000x4000  7x7     5.397184  25.005011

600x600    3x3     0.036354   0.300892
600x600    5x5     0.071562   0.250613
600x600    7x7     0.122224   0.302385
600x600    9x9     0.218580   0.276135
600x600    13x13   0.412700   0.262436
600x600    17x17   0.662877   0.163675
600x600    21x21   0.969317   0.271112
600x600    31x31   2.036099   0.329363
</pre>
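As a rough illustration of why the direct timings above grow with kernel size while the FFT timings stay fairly flat at fixed image size, here is a minimal sketch of direct convolution in plain C. This is not the psLib implementation: the function name <code>convolve_direct</code>, the row-major <code>float</code> layout, and the zero-valued treatment of pixels beyond the image edge are all assumptions made for this example. It returns the number of multiply-adds so the scaling can be checked directly.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative direct 2-D convolution (hypothetical helper, not psLib code).
 *
 * img, out : row-major arrays of nx*ny pixels
 * ker      : row-major array of kx*ky values; kx and ky are odd full widths,
 *            so the kernel is distributed evenly about 0
 * Pixels outside the image are treated as zero (an assumption).
 * Returns the number of multiply-adds actually performed. */
size_t convolve_direct(const float *img, float *out, int nx, int ny,
                       const float *ker, int kx, int ky)
{
    assert(kx % 2 == 1 && ky % 2 == 1);
    const int hx = kx / 2, hy = ky / 2;   /* kernel half-widths */
    size_t ops = 0;

    for (int y = 0; y < ny; y++) {
        for (int x = 0; x < nx; x++) {
            double sum = 0.0;
            /* out(x,y) = sum over (u,v) of ker(u,v) * img(x-u, y-v) */
            for (int v = -hy; v <= hy; v++) {
                const int yy = y - v;
                if (yy < 0 || yy >= ny) continue;
                for (int u = -hx; u <= hx; u++) {
                    const int xx = x - u;
                    if (xx < 0 || xx >= nx) continue;
                    sum += (double)img[yy * nx + xx]
                         * ker[(v + hy) * kx + (u + hx)];
                    ops++;
                }
            }
            out[y * nx + x] = (float)sum;
        }
    }
    return ops;
}
```

Away from the edges the loop does about <code>nx*ny*kx*ky</code> multiply-adds, so a 31x31 kernel should cost roughly 961/49, or about 20 times, what a 7x7 kernel costs; the measured direct ratios in the 600x600 rows above are about 17 on both machines. An FFT convolution typically pads the kernel up to the image size first, so its cost depends mainly on the image dimensions, which fits the nearly constant FFT column in the kernel-size sweep.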
