A hitchhiker guide to Matching Pursuit
The Matching Pursuit algorithm is popular in signal processing and applies well to digital images.
I have contributed a python implementation and we will show here how we may use that for extracting a sparse set of edges from an image.
- this will be exposed in the following book chapter (see also https://laurentperrinet.github.io/publication/perrinet-15-bicv ), to be published by the end of this year:
@inbook{Perrinet15bicv,
title = {Sparse models},
author = {Perrinet, Laurent U.},
booktitle = {Biologically-inspired Computer Vision},
chapter = {13},
editor = {Keil, Matthias and Crist\'{o}bal, Gabriel and Perrinet, Laurent U.},
publisher = {Wiley, New-York},
year = {2015}
}
setting things up¶
We will take some "baby steps first to show how it works, and then apply that to an image .
At first, we will define the SparseEdges framework which will create the necessary processing steps, from the raw image, to creating the multi-resolution scheme and the Matching Pursuit algorithm:
from SparseEdges import SparseEdges
mp = SparseEdges('https://raw.githubusercontent.com/bicv/SparseEdges/master/default_param.py')
mp.pe.N = 4
mp.pe.do_mask = False
mp.pe.MP_alpha = 1.
mp.pe.do_whitening = False
This defines the following set of parameters:
print('Default parameters: ', mp.pe)
The useful imports for a nice notebook:
import matplotlib
matplotlib.rcParams.update({'text.usetex': False})
%matplotlib inline
%config InlineBackend.figure_format='retina'
import matplotlib.pyplot as plt
import numpy as np
np.set_printoptions(precision=4)#, suppress=True)
fig_width = 14
print('Range of spatial frequencies: ', mp.sf_0)
print('Range of angles (in degrees): ', mp.theta*180./np.pi)
testing one step of (Matching + Pursuit)¶
Here, we will synthesize an image using 2 log-Gabor filters and check that they are correctly retrieved using a few Matching Pursuit steps. An edge will be represented by a nd.array
with respectively [x position, y center position, indice of angle in multi resolution, index of scale in multi resolution scheme]
and a coefficient (a complex number whose argument determines the phase of the log-Gabor filter):
# one
edge_in, C_in= [3*mp.pe.N_X/4, mp.pe.N_Y/2, 2, 2], 42
# the second
edge_bis, C_bis = [mp.pe.N_X/8, mp.pe.N_Y/4, 8, 4], 4.*np.sqrt(2)*np.exp(1j*np.pi/4.)
Which results in
# filters in Fourier space
FT_lg_in = mp.loggabor(edge_in[0], edge_in[1], sf_0=mp.sf_0[edge_in[3]],
B_sf=mp.pe.B_sf, theta= mp.theta[edge_in[2]], B_theta=mp.pe.B_theta)
FT_lg_bis = mp.loggabor(edge_bis[0], edge_bis[1], sf_0=mp.sf_0[edge_bis[3]],
B_sf=mp.pe.B_sf, theta= mp.theta[edge_bis[2]], B_theta=mp.pe.B_theta)
# mixing both and shows one
FT_lg_ = C_in * FT_lg_in + C_bis * FT_lg_bis
image = mp.invert(FT_lg_)
_ = mp.show_FT(FT_lg_, axis=True)
We may start by computing the linear coefficients in the multi resolution scheme
C = mp.linear_pyramid(image)
These coefficients correspond to a measure of how well the image matches with the filters in the multi-resolution scheme. These will be used in the Matching step. We then detect the best match corresponding to the coefficient with maximum absolute activity:
edge_star = mp.argmax(C)
print('Coordinates of the maximum ', edge_star, ', true value: ', edge_in)
print('Value of the maximum ', C[edge_star], ', true value: ', C_in)
At this particular orientation and scale, the absolute activity looks like:
fig, a1, a2 = mp.show_spectrum(np.absolute(C[:, :, edge_star[2], edge_star[3]]), axis=True)
_ = a2.plot([edge_star[1]], [edge_star[0]], 'r*')
On our image, this looks like:
fig, a1, a2 = mp.show_spectrum(image, axis=True)
_ = a2.plot([edge_star[1]], [edge_star[0]], 'r*')
Let's now extract and show the "winner" of the Matching Step:
FT_star = mp.loggabor(edge_star[0], edge_star[1], sf_0=mp.sf_0[edge_star[3]],
B_sf=mp.pe.B_sf, theta= mp.theta[edge_star[2]], B_theta=mp.pe.B_theta)
im_star = mp.invert(FT_star)
_ = mp.show_FT(FT_star, axis=True)
Great, it looks like we have extracted the first edge. Let's first check that the energy of
- the sum the extracted component energy,
- maximum of its power spectrum (equivalent to the above, but we just check),
- half the mean spectrum energy (half? rememeber the trick used in Log-Gabor filters) are all equal to one:
print(np.sum(im_star**2), mp.FTfilter(im_star, FT_star).max(), np.mean(np.abs(FT_star)**2)/2)
Let's overlay the extracted edge on the image:
edge_star_in = np.array([edge_star[0], edge_star[1], mp.theta[edge_in[2]], mp.sf_0[edge_star[3]], np.absolute(C[edge_star]), np.angle(C[edge_star])])
mp.pe.figsize_edges = 9
fig, a = mp.show_edges(edge_star_in[:, np.newaxis], image=image)
Now, we may substract the residual from the image, it is the Pursuit step:
image_res = (image - C[edge_star] * im_star).real
fig, a1, a2 = mp.show_spectrum(image_res, axis=True)
_ = a2.plot([edge_star[1]], [edge_star[0]], 'r*')
Looks pretty clear now that only the second edge remains. We may now repeat another Matching step on the residual image:
C = mp.linear_pyramid(image_res)
edge_star_bis = mp.argmax(C)
print('Coordinates of the maximum ', edge_star_bis, ', true value: ', edge_bis)
print('Value of the maximum ', C[edge_star_bis], ', true value: ', C_bis)
Note: to store the complex value of the coefficient, we use the fact that it is easy to transform a complex number to 2 reals and back:
z = np.sqrt(2)*np.exp(1j*np.pi/4.)
z_, z_p = np.absolute(z), np.angle(z)
print(z, z_*np.exp(1j*z_p))
Note that the linear coefficients corresponding to the first winning filter are canceled:
print('Value of the residual ', C[edge_star], ', initial value: ', C_in)
Let's overlay the extracted edge on the image:
fig, a1, a2 = mp.show_spectrum(image_res, axis=True)
_ = a2.plot([edge_star_bis[1]], [edge_star_bis[0]], 'r*')
We have well extracted the two edges:
edge_stars = np.vstack((edge_star_in,
np.array([edge_star_bis[0], edge_star_bis[1], mp.theta[edge_star_bis[2]], mp.sf_0[edge_star_bis[3]], np.absolute(C[edge_star_bis]), np.angle(C[edge_star_bis])])))
print(edge_stars)
mp.pe.figsize_edges = 9
fig, a = mp.show_edges(edge_stars.T, image=image)
testing four steps of Matching Pursuit¶
Let's redo these steps using the run_mp
function:
# filters in Fourier space
FT_lg_in = mp.loggabor(edge_in[0], edge_in[1], sf_0=mp.sf_0[edge_in[3]],
B_sf=mp.pe.B_sf, theta= mp.theta[edge_in[2]], B_theta=mp.pe.B_theta)
FT_lg_bis = mp.loggabor(edge_bis[0], edge_bis[1], sf_0=mp.sf_0[edge_bis[3]],
B_sf=mp.pe.B_sf, theta= mp.theta[edge_bis[2]], B_theta=mp.pe.B_theta)
# mixing both and shows one
FT_lg_ = C_in * FT_lg_in + C_bis * FT_lg_bis
fig = mp.show_FT(FT_lg_)
image = mp.invert(FT_lg_)
edges, C_res = mp.run_mp(image, verbose=True)
fig, a = mp.show_edges(edges, image=image)
Testing edge detection on a natural image¶
trying out on a flikr image from @Doug88888:
!curl https://farm7.staticflickr.com/6058/6370387703_5e718ea681_q_d.jpg -o ../files/2015-05-22-a-hitchhiker-guide-to-matching-pursuit/6370387703_5e718ea681_q_d.jpg
%ls -l ../files/2015-05-22-a-hitchhiker-guide-to-matching-pursuit/MPtutorial*
import numpy as np
from SparseEdges import SparseEdges
mp = SparseEdges('https://raw.githubusercontent.com/bicv/SparseEdges/master/default_param.py')
if not mp.pe.do_whitening: print('\!/ Wrong parameters... \!/')
# where should we store the data + figures generated by this notebook
import os
name = 'MPtutorial'
mp.pe.matpath, mp.pe.figpath = '../files/2015-05-22-a-hitchhiker-guide-to-matching-pursuit', '../files/2015-05-22-a-hitchhiker-guide-to-matching-pursuit'
mp.pe.do_mask = True
mp.pe.N = 2048
mp.pe.MP_alpha = 1.
mp.pe.MP_alpha = .8
mp.pe.mask_exponent = 6.
mp.init()
# defining input image (get it @ https://www.flickr.com/photos/doug88888/6370387703)
#image = mp.imread('https://farm7.staticflickr.com/6058/6370387703_5e718ea681_q_d.jpg')
image = mp.imread(os.path.join(mp.pe.matpath, '6370387703_5e718ea681_q_d.jpg'))
white = mp.pipeline(image, do_whitening=True)
import os
matname = os.path.join(mp.pe.matpath, name + '.npy')
try:
edges = np.load(matname)
except:
edges, C_res = mp.run_mp(white, verbose=True)
np.save(matname, edges)
matname_MSE = os.path.join(mp.pe.matpath, name + '_MSE.npy')
try:
MSE = np.load(matname_MSE)
except:
MSE = np.ones(mp.pe.N)
image_rec = np.zeros_like(image)
for i_N in range(mp.pe.N):
MSE[i_N] = ((white-image_rec)**2).sum()
image_rec += mp.reconstruct(edges[:, i_N][:, np.newaxis])
np.save(matname_MSE, MSE)
Showing the original image with the edges overlaid:
#edges = np.load(matname)
fig_width_pt = 318.670 # Get this from LaTeX using \showthe\columnwidth
inches_per_pt = 1.0/72.27 # Convert pt to inches
fig_width = fig_width_pt*inches_per_pt # width in inches
mp.pe.figsize_edges = .382 * fig_width # useful for papers
mp.pe.figsize_edges = 9 # useful in notebooks
mp.pe.line_width = 1.
mp.pe.scale = 1.
fig, a = mp.show_edges(edges, image=image, show_phase=True, show_mask=True)
#mp.savefig(fig, name + '_rec')