Improving calls to the LogGabor library

To code an image as a set of edges, for instance in the SparseEdges sparse coding scheme, we use a model of edges in images. A good model for these edges is the bidimensional log-Gabor filter, as implemented for instance in the LogGabor library. The library was designed to be precise, but not particularly efficient. To improve its speed, we demonstrate here the use of a cache to avoid redundant computations.

Let's first initialize the notebook:

In :
from __future__ import division, print_function
import numpy as np
np.set_printoptions(precision=6, suppress=True)


Timing of the library without cache

Let's make calls to the library and record the wall clock timing:

In :
from LogGabor import LogGabor
lg = LogGabor('https://raw.githubusercontent.com/bicv/SparseEdges/master/default_param.py')
lg.pe.use_cache = False
lg.pe.verbose = 100
lg.init()

In :
%%timeit
edge = [3*lg.pe.N_X/4, lg.pe.N_Y/2, 2, 2]
FT_lg = lg.loggabor(edge[0], edge[1], sf_0=lg.sf_0[edge[3]], B_sf=lg.pe.B_sf, theta=lg.theta[edge[2]], B_theta=lg.pe.B_theta)

7.77 ms ± 1.32 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)

%lprun -s -m LogGabor lg.loggabor(0, 0, sf_0=lg.sf_0, B_sf=lg.pe.B_sf, theta=lg.theta, B_theta=lg.pe.B_theta)

Note that most of the time we compute the filter at the origin, and in that case the library skips the translation step. This makes the call systematically faster:

In :
%%timeit
edge = [0., 0., 2, 2]
FT_lg = lg.loggabor(edge[0], edge[1], sf_0=lg.sf_0[edge[3]], B_sf=lg.pe.B_sf, theta=lg.theta[edge[2]], B_theta=lg.pe.B_theta)

3.49 ms ± 259 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
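
To see why the call at the origin is cheaper, here is a minimal sketch of a translation applied in Fourier space through the shift theorem. The helper `translate_ft` is a hypothetical name written for this note, not the LogGabor implementation; it only assumes that the filter is handled as a 2-D FFT array, and it returns early when no shift is requested.

```python
import numpy as np

def translate_ft(ft, u, v):
    """Shift the image whose 2-D FFT is `ft` by (u, v) pixels along axes (0, 1).

    Hypothetical helper for illustration only (not the LogGabor API).
    """
    if u == 0 and v == 0:
        # nothing to do at the origin: skip the phase ramp entirely
        return ft
    n_x, n_y = ft.shape
    # frequencies in cycles per pixel, shaped for broadcasting
    f_x = np.fft.fftfreq(n_x)[:, np.newaxis]
    f_y = np.fft.fftfreq(n_y)[np.newaxis, :]
    # shift theorem: a translation is a multiplication by a phase ramp
    return ft * np.exp(-2j * np.pi * (u * f_x + v * f_y))

rng = np.random.RandomState(42)
image = rng.randn(16, 16)
ft = np.fft.fft2(image)
shifted = np.fft.ifft2(translate_ft(ft, 3, 5)).real
unshifted = translate_ft(ft, 0, 0)
```

For integer shifts the result matches a circular `np.roll` of the image, while the call at `(0, 0)` hands back its input untouched, which is the work saved in the timing above.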


Using a cache

We will use the fact that the many calls to the LogGabor library repeat the same operations, so we can cache the computed matrices instead of recomputing them. In particular, we cache the scale (band) and orientation envelopes separately and combine them by an element-wise multiplication, which is fast in numpy.

In :
lg = LogGabor('https://raw.githubusercontent.com/bicv/SparseEdges/master/default_param.py')
lg.pe.use_cache = True
lg.pe.verbose = 100
lg.init()
print ('Dictionary that will contain the matrices=', lg.cache)

Dictionary that will contain the matrices= {'band': {}, 'orientation': {}}


In the beginning, the cache is empty, but every time we compute a new matrix, it gets filled up:

In :
edge = [0., 0., 2, 2]
FT_lg = lg.loggabor(edge[0], edge[1], sf_0=lg.sf_0[edge[3]], B_sf=lg.pe.B_sf, theta=lg.theta[edge[2]], B_theta=lg.pe.B_theta)
print ('Dictionary that contains the matrices=', lg.cache)

doing band cache for tag  0.189503126127_0.4
doing orientation cache for tag  -1.1780972451_0.17453277777777776
Dictionary that contains the matrices= {'band': {'0.189503126127_0.4': array([[ 0.006271,  0.006501,  0.006738, ...,  0.006984,  0.006738,
0.006501],
[ 0.006501,  0.00674 ,  0.006988, ...,  0.007244,  0.006988,
0.00674 ],
[ 0.006738,  0.006988,  0.007246, ...,  0.007513,  0.007246,
0.006988],
...,
[ 0.006984,  0.007244,  0.007513, ...,  0.007792,  0.007513,
0.007244],
[ 0.006738,  0.006988,  0.007246, ...,  0.007513,  0.007246,
0.006988],
[ 0.006501,  0.00674 ,  0.006988, ...,  0.007244,  0.006988,
0.00674 ]])}, 'orientation': {'-1.1780972451_0.17453277777777776': array([[  2.857228e+05,   2.536583e+05,   2.249385e+05, ...,
7.830696e-14,   7.439627e-14,   7.074259e-14],
[  3.217782e+05,   2.857228e+05,   2.534195e+05, ...,
7.445581e-14,   7.077041e-14,   6.732618e-14],
[  3.626499e+05,   3.220804e+05,   2.857228e+05, ...,
7.079868e-14,   6.732618e-14,   6.407993e-14],
...,
[  1.720206e+13,   1.639461e+13,   1.561163e+13, ...,
3.499896e-06,   3.101855e-06,   2.752276e-06],
[  1.638199e+13,   1.560551e+13,   1.485306e+13, ...,
3.949803e-06,   3.499896e-06,   3.104815e-06],
[  1.559949e+13,   1.485306e+13,   1.413020e+13, ...,
4.454120e-06,   3.946026e-06,   3.499896e-06]])}}
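
The fill-on-first-use behaviour seen in this output can be sketched with a plain dictionary keyed by a `value_bandwidth` tag. The class `TinyFilterCache` and its envelope formulas are simplified assumptions made for this illustration (a log-normal band and an exponentiated cosine for orientation); they are not the exact code of the LogGabor library.

```python
import numpy as np

class TinyFilterCache:
    """Hypothetical stand-in for a band/orientation cache (not the LogGabor API)."""

    def __init__(self, n_x=64, n_y=64):
        f_x = np.fft.fftfreq(n_x)[:, np.newaxis]
        f_y = np.fft.fftfreq(n_y)[np.newaxis, :]
        self.f = np.sqrt(f_x**2 + f_y**2)
        self.f[0, 0] = 1e-12  # avoid log(0) at the DC component
        self.f_theta = np.arctan2(f_y, f_x) * np.ones((n_x, n_y))
        self.cache = {'band': {}, 'orientation': {}}
        self.n_computed = 0  # counts how many matrices were really computed

    def band(self, sf_0, B_sf):
        tag = '{}_{}'.format(sf_0, B_sf)
        if tag not in self.cache['band']:
            self.n_computed += 1
            # assumed log-normal radial envelope
            self.cache['band'][tag] = np.exp(-.5 * np.log(self.f / sf_0)**2 / B_sf**2)
        return self.cache['band'][tag]

    def orientation(self, theta, B_theta):
        tag = '{}_{}'.format(theta, B_theta)
        if tag not in self.cache['orientation']:
            self.n_computed += 1
            # assumed angular envelope, pi-periodic in orientation
            self.cache['orientation'][tag] = np.exp(np.cos(2 * (self.f_theta - theta)) / B_theta**2)
        return self.cache['orientation'][tag]

    def envelope(self, sf_0, B_sf, theta, B_theta):
        # the only work left on a cache hit is one element-wise product
        return self.band(sf_0, B_sf) * self.orientation(theta, B_theta)

bank = TinyFilterCache()
first = bank.envelope(0.19, 0.4, -1.178, 0.1745)
second = bank.envelope(0.19, 0.4, -1.178, 0.1745)  # both factors come from the cache
other = bank.envelope(0.19, 0.4, 0.0, 0.1745)      # band reused, new orientation
```

Keeping the two factors in separate dictionaries means that a bank of filters spanning several scales and orientations stores one matrix per scale plus one per orientation, rather than one per (scale, orientation) pair.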

In :
%%timeit
edge = [3*lg.pe.N_X/4, lg.pe.N_Y/2, 2, 2]
FT_lg = lg.loggabor(edge[0], edge[1], sf_0=lg.sf_0[edge[3]], B_sf=lg.pe.B_sf, theta=lg.theta[edge[2]], B_theta=lg.pe.B_theta)

5.12 ms ± 1.9 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)

In :
%%timeit
edge = [0., 0., 2, 2]
FT_lg = lg.loggabor(edge[0], edge[1], sf_0=lg.sf_0[edge[3]], B_sf=lg.pe.B_sf, theta=lg.theta[edge[2]], B_theta=lg.pe.B_theta)

486 µs ± 48.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


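That's a great improvement: at the origin the call is about 7 times faster (3.49 ms down to 486 µs), while the translated case gains less (7.77 ms down to 5.12 ms). Let's now apply that to the Matching Pursuit algorithm implemented in the SparseEdges library: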

Application to SparseEdges

In :
from SparseEdges import SparseEdges
mp = SparseEdges('https://raw.githubusercontent.com/bicv/SparseEdges/master/default_param.py')
mp.pe.N = 32 # number of edges
mp.pe.use_cache = False
mp.init()

# defining a test image
image = np.zeros((mp.pe.N_X, mp.pe.N_Y))
image[mp.pe.N_X//2:mp.pe.N_X//2+mp.pe.N_X//4, mp.pe.N_X//2:mp.pe.N_X//2+mp.pe.N_X//4] = 1
image[mp.pe.N_X//2:mp.pe.N_X//2+mp.pe.N_X//4, mp.pe.N_X//4:mp.pe.N_X//2] = -1

In :
%%timeit -n1 -r1
edges, C_res = mp.run_mp(image, verbose=False)

2min 1s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)

In :
from SparseEdges import SparseEdges
mp = SparseEdges('https://raw.githubusercontent.com/bicv/SparseEdges/master/default_param.py')
mp.pe.N = 32 # number of edges
mp.pe.use_cache = True
mp.init()

# defining a test image
image = np.zeros((mp.pe.N_X, mp.pe.N_Y))
image[mp.pe.N_X//2:mp.pe.N_X//2+mp.pe.N_X//4, mp.pe.N_X//2:mp.pe.N_X//2+mp.pe.N_X//4] = 1
image[mp.pe.N_X//2:mp.pe.N_X//2+mp.pe.N_X//4, mp.pe.N_X//4:mp.pe.N_X//2] = -1

In :
%%timeit -n1 -r1
edges, C_res = mp.run_mp(image, verbose=False)

1min 41s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)


This shows a reduction in run time of approximately 17% (from 2min 1s to 1min 41s, each measured on a single run). These changes are now effective in the code (see this commit).

Further profiling shows that most of the time is spent in the backprop function:

%lprun -m SparseEdges edges, C_res = mp.run_mp(image, verbose=False)
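
Outside of a notebook, or when the `%lprun` magic from line_profiler is not available, a similar per-function picture can be obtained with the standard library. A minimal sketch, where `slow_step`, `fast_step` and `run` are toy stand-ins for the real `mp.run_mp` call:

```python
import cProfile
import io
import pstats

def slow_step(n):
    # toy stand-in for an expensive inner function
    return sum(i * i for i in range(n))

def fast_step(n):
    # toy stand-in for a cheap inner function
    return n * 2

def run(n_iter=50):
    total = 0
    for _ in range(n_iter):
        total += slow_step(2000)
        total += fast_step(2000)
    return total

profiler = cProfile.Profile()
profiler.enable()
run()
profiler.disable()

# collect the report in a string, sorted by cumulative time
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats('cumulative')
stats.print_stats(5)
report = stream.getvalue()
print(report)
```

This gives per-function totals rather than the per-line breakdown of `%lprun`, which is usually enough to locate the dominant function before profiling it line by line.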

Some bookkeeping for the notebook

In :
%load_ext watermark
%watermark

2017-10-25T15:08:31+02:00

CPython 3.6.3
IPython 6.1.0

compiler   : GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.37)
system     : Darwin
release    : 17.0.0
machine    : x86_64
processor  : i386
CPU cores  : 4
interpreter: 64bit

In :
%load_ext version_information
%version_information numpy, scipy, matplotlib, sympy, pillow, imageio

Out:
Software Version
Python 3.6.3 64bit [GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.37)]
IPython 6.1.0
OS Darwin 17.0.0 x86_64 i386 64bit
numpy 1.13.1
scipy 0.19.1
matplotlib 2.0.2
sympy 1.1.1
pillow 4.2.1
imageio 2.1.2
Wed Oct 25 15:08:32 2017 CEST