Sparse coding of large images
In this post, we show how to infer the sparse representation of an image, knowing an appropriate generative model for its synthesis. We will start with a linear inversion (a pseudo-inverse deconvolution), then move to a gradient descent algorithm. Finally, we will implement a convolutional version of the iterative shrinkage-thresholding algorithm (ISTA) and its fast variant, FISTA.
For computational efficiency, all convolutions will be implemented by a Fast Fourier Transform, so that they are mathematically exactly equivalent to (circular) convolutions (see the quick check right after the notebook's initialization below). We will benchmark this on a realistic image size of $512 \times 512$, giving some timing results on a standard laptop.
Let's first initialize the notebook:
from __future__ import division, print_function
import numpy as np
np.set_printoptions(precision=6, suppress=True)
import os
%matplotlib inline
#%config InlineBackend.figure_format='retina'
%config InlineBackend.figure_format = 'svg'
import matplotlib.pyplot as plt
phi = (np.sqrt(5)+1)/2
fig_width = 15
figsize = (fig_width, fig_width/phi)
%load_ext autoreload
%autoreload 2
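As a quick sanity check (a minimal sketch, not part of the benchmark), let's verify the convolution theorem that underlies everything below: a pointwise product of spectra computes exactly the circular convolution. Here in 1D, against an explicit sum:
rng_check = np.random.RandomState(0) # hypothetical helper, only used for this check
N = 8
a, k = rng_check.randn(N), rng_check.randn(N)
# reference: circular convolution c[n] = sum_m a[m] * k[(n - m) mod N]
c_direct = np.array([np.sum(a * k[(n - np.arange(N)) % N]) for n in range(N)])
# the same computation through the Fourier domain
c_fft = np.fft.ifft(np.fft.fft(a) * np.fft.fft(k)).real
print(np.allclose(c_direct, c_fft)) # should print True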
problem statement: sparse events in a static image
In the first case that we will study, images are produced by the combination of a single kernel with sparse events. Think of it like some brushstrokes on a canvas, or some drops of rain that fell on the surface of a pond...
For simplicity, the combination of these events will be considered linear. This holds approximately, for instance, in watercolor or when combining transparent surfaces, ignoring the saturation of the sensor. To handle linearity, we will use linear algebra and matrices: we will denote the (raveled) image as $x$ and these "events" as $y$. The kernel that defines the brushstrokes is denoted as $A$ and will be made explicit below. Generally, in matrix form, the generative model of the image is written:
$$ x = A y $$

First, coefficients are drawn from a Laplace distribution and sparse events are generated by thresholding these coefficients. These will be convolved with a kernel defined using the MotionClouds library. Let's first illustrate that for a small image size:
N_X, N_Y = 2**9, 2**9 # BENCHMARK
# N_X, N_Y = 2**8, 2**8 # DEBUG
rho = 1.e-3
sf_0 = .15
B_sf = .15
y_lambda = 1.61803
seed = 2020
rng = np.random.RandomState(seed)
events = y_lambda * rng.laplace(size=(N_X, N_Y))
events_max = np.max(np.abs(events))
fig, ax = plt.subplots(figsize=figsize)
ax.hist(events.ravel(), bins=np.linspace(-events_max, events_max, 100, endpoint=True))
ax.set_xlabel('coefficients')
ax.set_yscale('log')
ax.set_ylabel('#');
print('mean, std=', events.mean(), events.std())
From this continuous distribution of coefficients, we will zero out most entries to achieve a desired sparsity (in the $\ell_0$ pseudo-norm sense):
threshold = np.quantile(np.absolute(events).ravel(), 1-rho)
print('threshold=', threshold)
events_thr = events * ((events < -threshold) + (events > threshold))
fig, ax = plt.subplots(figsize=figsize)
ax.hist(events_thr.ravel(), bins=np.linspace(-events_max, events_max, 100, endpoint=True))
ax.set_xlabel('coefficients')
ax.set_yscale('log')
ax.set_ylabel('#');
print('mean, std=', events_thr.mean(), events_thr.std())
fig, ax = plt.subplots(figsize=(fig_width, fig_width))
ax.imshow(events_thr, cmap=plt.viridis());
Let's now generate the image by convolving it with an isotropic kernel, as defined in the https://github.com/NeuralEnsemble/MotionClouds library:
import MotionClouds as mc
fx, fy, ft = mc.get_grids(N_X, N_Y, 1)
opts = dict(V_X=0., V_Y=0., B_V=0, B_theta=np.inf, sf_0=sf_0, B_sf=B_sf)
envelope = mc.envelope_gabor(fx, fy, ft, **opts).squeeze()
#env_smooth = mc.retina(fx, fy, ft, df=0.07)[:, :, 0]
fig, ax = plt.subplots(figsize=(fig_width, fig_width))
#ax.imshow(envelope[(N_X//2-N_X//4):(N_X//2+N_X//4), (N_Y//2-N_Y//4):(N_Y//2+N_Y//4)], cmap=plt.plasma());
ax.imshow(envelope, cmap=plt.plasma());
(This shape reminds me of something... D'OH!)
Then we use the Fourier transform to actually perform the convolution (as implemented in the MotionClouds library):
x = mc.random_cloud(envelope[:, :, None], events=events_thr[:, :, None])
x = x.reshape((N_X, N_Y))
print('x.shape=', x.shape)
This has all the nice properties of the 2D Discrete Fourier Transform, see for example this textbook (in French).
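For instance, the synthesis is (circularly) shift equivariant: shifting the events shifts the image. A minimal check of this property (an illustrative sketch using np.roll, not part of the original pipeline):
shift = (10, 20) # an arbitrary shift, in pixels
x_rolled = mc.random_cloud(envelope[:, :, None], events=np.roll(events_thr, shift, axis=(0, 1))[:, :, None]).reshape((N_X, N_Y))
print(np.allclose(x_rolled, np.roll(x, shift, axis=(0, 1)))) # should print True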
fig, ax = plt.subplots(figsize=(fig_width, fig_width))
vmax = np.absolute(x).max()
ax.imshow(x, cmap=plt.gray(), vmin=-vmax, vmax=vmax);
x_max = np.max(np.abs(x))
fig, ax = plt.subplots(figsize=figsize)
ax.hist(x.ravel(), bins=np.linspace(-x_max, x_max, 100, endpoint=True))
ax.set_xlabel('luminance')
ax.set_yscale('log')
ax.set_ylabel('#');
Note: one could have also used https://en.wikipedia.org/wiki/Shot_noise
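A minimal sketch of that alternative (hypothetical, not used in the rest of this post): draw the active positions from a Bernoulli mask of probability rho and the amplitudes from the same Laplace distribution:
rng_shot = np.random.RandomState(seed) # hypothetical local generator for this sketch
events_shot = rng_shot.laplace(size=(N_X, N_Y)) * (rng_shot.rand(N_X, N_Y) < rho)
print('proportion of active coefficients =', (events_shot != 0).mean()) # should be close to rho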
All in one function:
def MC_env(N_X, N_Y, opts=dict(V_X=0., V_Y=0., B_V=0, B_theta=np.inf, sf_0=sf_0, B_sf=B_sf, alpha=0), do_norm=True, verbose=False):
fx, fy, ft = mc.get_grids(N_X, N_Y, 1)
envelope = mc.envelope_gabor(fx, fy, ft, **opts).squeeze()
envelope /= envelope.max()
if do_norm: envelope /= np.sqrt((envelope**2).sum())
if verbose: print('(envelope**2).sum()=', (envelope**2).sum())
return envelope
envelope = MC_env(N_X, N_Y, verbose=True)
%%timeit
envelope = MC_env(N_X, N_Y)
def random_cloud(envelope, events):
(N_X, N_Y) = envelope.shape
#fx, fy, ft = mc.get_grids(N_X, N_Y, N_frame)
F_events = np.fft.fftn(events)
F_events = np.fft.fftshift(F_events)
Fz = F_events * envelope
# de-centering the spectrum
Fz = np.fft.ifftshift(Fz)
#Fz[0, 0, 0] = 0. # removing the DC component
z = np.fft.ifftn(Fz).real
return z
%%timeit
random_cloud(envelope, events_thr)
def model(envelope, events, verbose=False):
if verbose: print('envelope.shape = ', envelope.shape)
if verbose: print('events.shape = ', events.shape)
N_X, N_Y = envelope.shape
x = random_cloud(envelope, events=events)
#x = x.reshape((N_X, N_Y))
if verbose: print('x.shape=', x.shape)
return x
x = model(envelope, events_thr, verbose=True)
%%timeit
model(envelope, events_thr)
Merging both functions:
def model(envelope, events, verbose=False):
if verbose: print('envelope.shape = ', envelope.shape)
if verbose: print('events.shape = ', events.shape)
N_X, N_Y = envelope.shape
F_events = np.fft.fftn(events)
F_events = np.fft.fftshift(F_events)
Fz = F_events * envelope
Fz = np.fft.ifftshift(Fz)
x = np.fft.ifftn(Fz).real
if verbose: print('x.shape=', x.shape)
return x
x = model(envelope, events_thr, verbose=True)
%%timeit
model(envelope, events_thr)
def generative_model(envelope, rho=rho, y_lambda=y_lambda, seed=seed, verbose=False):
N_X, N_Y = envelope.shape
if verbose: print('N_X, N_Y = ', envelope.shape)
rng = np.random.RandomState(seed)
events = y_lambda * rng.laplace(size=(N_X, N_Y))
threshold = np.quantile(np.absolute(events).ravel(), 1.-rho)
events = events * ((events < -threshold) + (events > threshold))
x = model(envelope, events, verbose=verbose)
return events, x
events, x = generative_model(envelope, verbose=True)
fig, ax = plt.subplots(figsize=(fig_width, fig_width))
vmax = np.absolute(x).max()
ax.imshow(x, cmap=plt.gray(), vmin=-vmax, vmax=vmax);
envelope = MC_env(N_X, N_Y)
%%timeit
events, x = generative_model(envelope)
Now, how do I retrieve the events knowing x? This is a form of deconvolution:
Deconvolution in the noiseless case: invertibility of the image to retrieve events
It's relatively easy to use the shape of the envelope to retrieve the events.
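In the Fourier domain, the convolution $x = A y$ is a pointwise product, $\hat x = \hat A \cdot \hat y$, so that a regularized inverse filter reads (here, $\hat A$ is the real, non-negative envelope):

$$ \hat y = \frac{\hat A}{\hat A^2 + \epsilon} \cdot \hat x $$

where $\epsilon$ avoids a division by zero where the envelope vanishes; the deconv function below additionally masks out the lowest and highest frequencies: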
eps = 2.e-6
thr = 5.e-2
def deconv(envelope, eps=eps, thr=thr, do_norm=False):
# mask = 1 / ( 1 + np.exp( -(envelope/envelope.max() - thr) / eps ) )# coefficients near zero
fr = mc.frequency_radius(fx, fy, ft).squeeze()
mask = 1 / ( 1 + np.exp( (fr - .45) / .025 ) ) # high frequency
mask *= 1 / ( 1 + np.exp( -(fr - thr) / .01 ) )# low frequency
F_deconv = envelope / (envelope**2 + eps*(1-mask)) # avoid division by zero
F_deconv *= mask
if do_norm: F_deconv /= np.sqrt((F_deconv**2).sum())
return F_deconv / ((envelope*F_deconv)).mean()
F_deconv = deconv(envelope, eps=eps, thr=thr)
fig, ax = plt.subplots(figsize=(fig_width, fig_width))
#cmap = ax.imshow(F_deconv[(N_X//2-N_X//4):(N_X//2+N_X//4), (N_Y//2-N_Y//4):(N_Y//2+N_Y//4)], cmap=plt.plasma())
cmap = ax.imshow(F_deconv, cmap=plt.plasma())
plt.colorbar(cmap, shrink=.8);
fig, ax = plt.subplots(figsize=(fig_width, fig_width))
cmap = ax.imshow(envelope*F_deconv, cmap=plt.plasma())
plt.colorbar(cmap, shrink=.8);
# generate
envelope = MC_env(N_X, N_Y)
events = np.zeros((N_X, N_Y))
events[N_X//2, N_Y//2] = 1.
impulse = model(envelope, events)
# invert
Fx = np.fft.fft2(impulse)
F_deconv = deconv(envelope, eps=eps, thr=thr)
# applying the filter
F_events = Fx*np.fft.ifftshift(F_deconv)
# de-centering the spectrum
impulse_pred = np.fft.ifft2(F_events).real
print('impulse_pred.max()', impulse_pred.max(), 'impulse_pred[N_X//2, N_Y//2]', impulse_pred[N_X//2, N_Y//2])
fig, axs = plt.subplots(1, 2, figsize=(fig_width, fig_width))
axs[0].imshow(impulse, cmap=plt.gray())
axs[1].imshow(impulse_pred, cmap=plt.viridis());
def invert(x, envelope, eps=eps, thr=thr):
Fx = np.fft.fft2(x)
F_deconv = deconv(envelope, eps=eps, thr=thr)
# applying the filter
# de-centering the spectrum
F_events = Fx*np.fft.ifftshift(F_deconv)
events = np.fft.ifft2(F_events).real
return events
envelope = MC_env(N_X, N_Y)
events, x = generative_model(envelope, verbose=True)
events_pred = invert(x, envelope)
fig, axs = plt.subplots(1, 2, figsize=(fig_width, fig_width))
axs[0].imshow(events, cmap=plt.viridis())
axs[1].imshow(events_pred, cmap=plt.viridis());
fig, ax = plt.subplots(figsize=figsize)
counts, bins, _ = ax.hist(events.ravel(), fc='b', bins=100, alpha=.5)
counts, bins, _ = ax.hist(events_pred.ravel(), fc='r', bins=bins, alpha=.5)
ax.set_xlabel('coefficients')
ax.set_yscale('log')
ax.set_ylabel('#');
Some statistics to show the efficiency of this method:
print('mean, std=', events.mean(), events.std())
print('mean, std=', events_pred.mean(), events_pred.std())
#print('enveloppe energy=', np.sqrt(((envelope)**2).mean()) )
def SE_y(events, events_pred):
return ((events-events_pred)**2).sum()/(events**2).sum()
print('relative Squared error on coefficients: SE_y=', SE_y(events, events_pred))
def L1_y(events, events_pred):
return np.abs(events-events_pred).sum()/np.abs(events).sum()
print('relative L1 error=', L1_y(events, events_pred))
#print('energy rec=', ((events_pred)**2).sum())
#print('energy input=', ((events.squeeze())**2).sum())
#print('ratio energy=', ((events_pred)**2).sum()/(events**2).sum())
#print('Squared ratio energy=', np.sqrt((events**2).sum()/(events_pred**2).sum()))
def SE_x(x, envelope, events_pred):
return ((x-model(envelope, events_pred))**2).sum()/(x**2).sum()
print('relative Squared image error=', SE_x(x, envelope, events_pred))
Reconstruction is perfect up to the machine's numerical precision. This is because the problem is invertible in this instance: the invert function implements a generalization of the pseudo-inverse $A^+ = (A^T A)^{-1} A^T$ of the linear transform $x = A y$, extended to the convolutional case defined in the model function. Note that the FFT and the convolution are linear operators, $A$ being a (circulant) Toeplitz matrix derived from the kernel we used.
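To see why, recall that with the periodic boundary conditions imposed by the FFT, $A$ is circulant and thus diagonalized by the Fourier basis, $A = F^\dagger \mathrm{diag}(\hat A) F$, so that (a sketch of the argument, for a real and symmetric kernel):

$$ A^+ = (A^T A)^{-1} A^T = F^\dagger \, \mathrm{diag}\!\left( \frac{\hat A}{\hat A^2} \right) F $$

that is, precisely a division by the envelope in frequency space, regularized by $\epsilon$ wherever $\hat A$ is close to zero.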
This method is fast:
envelope = MC_env(N_X, N_Y)
events, x = generative_model(envelope)
%%timeit
events_pred = invert(x, envelope)
Instead of using the standard fft, as in
%timeit np.fft.fft2(x)
... one could use the knowledge that the input signal is real:
%timeit np.fft.rfft2(x)
Specifically, it is necessary to reshape the envelope in frequency space, see https://docs.scipy.org/doc/numpy/reference/generated/numpy.fft.rfft.html :
np.fft.rfft2(x).shape
such that finally:
def invert_r(x, envelope, eps=eps, thr=thr):
Fx = np.fft.rfft2(x)
F_deconv = deconv(envelope, eps=eps, thr=thr)
F_deconv = np.fft.ifftshift(F_deconv)[:, :(envelope.shape[1]//2+1):]
# applying the filter
# de-centering the spectrum
F_events = Fx*F_deconv
events = np.fft.irfft2(F_events).real
return events
#mc.N_X, mc.N_Y = 2**7, 2**7
envelope = MC_env(N_X, N_Y)
events, x = generative_model(envelope, verbose=True)
events_pred = invert_r(x, envelope)
print('relative L1 error=', L1_y(events, events_pred))
print('relative Squared error=', SE_y(events, events_pred))
print('relative Squared image error=', SE_x(x, envelope, events_pred))
fig, axs = plt.subplots(1, 2, figsize=(fig_width, fig_width))
axs[0].imshow(events, cmap=plt.viridis())
axs[1].imshow(events_pred, cmap=plt.viridis());
envelope = MC_env(N_X, N_Y)
events, x = generative_model(envelope, verbose=True)
%%timeit
events_pred = invert_r(x, envelope)
Deconvolution in the noisy case (colored noise): (partial) failure to retrieve sparse coefficients
envelope = MC_env(N_X, N_Y)
events, x = generative_model(envelope, verbose=True)
def add_noise(x, noise, B_sf=.5, do_white=False):
if do_white:
im_noise = rng.normal(size=x.shape)
else:
env_noise = mc.envelope_gabor(fx, fy, ft, B_theta=np.inf, sf_0=sf_0, B_sf=B_sf)
im_noise = mc.random_cloud(env_noise).squeeze()
im_noise /= im_noise.std()
x_noise = x + noise * x.std() * im_noise
return x_noise
noise = .5
x_noise = add_noise(x, noise)
fig, ax = plt.subplots(figsize=(fig_width, fig_width))
vmax = np.absolute(x_noise).max()
ax.imshow(x_noise, cmap=plt.gray(), vmin=-vmax, vmax=vmax);
x_noise = add_noise(x, noise)
events_pred = invert(x_noise, envelope)
print('relative L1 error=', L1_y(events, events_pred))
print('relative Squared error=', SE_y(events, events_pred))
print('relative Squared image error=', SE_x(x, envelope, events_pred))
Error as a function of noise level:
envelope = MC_env(N_X, N_Y)
events, x = generative_model(envelope, verbose=True)
N_scan = 15
SE = np.zeros((N_scan, 2))
noises = np.logspace(-2, .5, N_scan, base=10, endpoint=True)
for i_noise, noise_ in enumerate(noises):
x_noise = add_noise(x, noise_)
events_pred = invert(x_noise, envelope)
SE[i_noise, 0] = SE_x(x, envelope, events_pred)
SE[i_noise, 1] = SE_y(events, events_pred)
# recording for future comparisons
SE_noise = {'full': SE}
fig, ax = plt.subplots(1, 2, figsize=figsize)
for i, SE_l in enumerate(['SE_x', 'SE_y']):
ax[i].loglog(noises, np.ones_like(SE[:, i]), 'k--')
ax[i].loglog(noises, SE[:, i], label='full')
ax[i].set_xlabel('noise')
ax[i].set_ylabel(SE_l);
ax[0].legend(loc='lower right');
x_noise = add_noise(x, noise)
events_pred = invert(x_noise, envelope)
print('relative L1 error=', L1_y(events, events_pred))
print('relative Squared error=', SE_y(events, events_pred))
print('relative Squared image error=', SE_x(x, envelope, events_pred))
Even for this simple model, the error grows too fast with respect to noise. We need to find a solution to handle this. A first solution comes from inspecting the histogram of the events' coefficients:
fig, ax = plt.subplots(figsize=figsize)
counts, bins, _ = ax.hist(events.ravel(), fc='b', bins=100, alpha=.5)
counts, bins, _ = ax.hist(events_pred.ravel(), fc='r', bins=bins, alpha=.5)
ax.set_xlabel('coefficients')
ax.set_yscale('log')
ax.set_ylabel('#');
Knowing the sparsity of the input, one could select only the proportion of coefficients which are large enough and remove the others:
threshold = np.quantile(np.abs(events_pred).ravel(), 1-rho)
print('threshold=', threshold)
threshold_minus = np.quantile(events_pred.ravel(), rho/2)
threshold_plus = np.quantile(events_pred.ravel(), 1.-rho/2)
print('threshold_minus=', threshold_minus, 'threshold_plus=', threshold_plus)
events_pred_thr = events_pred * ((events_pred < threshold_minus) + (events_pred > threshold_plus))
print('relative L1 error (full)=', L1_y(events, events_pred))
print('relative Squared error (full)=', SE_y(events, events_pred))
print('relative L1 error (thr)=', L1_y(events, events_pred_thr))
print('relative Squared error (thr)=', SE_y(events, events_pred_thr))
fig, ax = plt.subplots(figsize=figsize)
counts, bins, _ = ax.hist(events.ravel(), fc='b', bins=100, alpha=.5)
counts, bins, _ = ax.hist(events_pred_thr.ravel(), fc='r', bins=bins, alpha=.5)
ax.set_xlabel('coefficients')
ax.set_yscale('log')
ax.set_ylabel('#');
print('relative Squared image error (full)=', SE_x(x, envelope, events_pred))
print('relative Squared image error (thr)=', SE_x(x, envelope, events_pred_thr))
def L0_y(events, events_pred):
    error = ((1- (events==0)) * (events_pred==0)).sum() # misses: event present but predicted zero (Type II error, false negative)
    error += ((1- (events_pred==0)) * (events==0)).sum() # false alarms: event predicted where there is none (Type I error, false positive)
return error / N_X / N_Y #/ (1- (events==0)).sum()
print('relative L0 error=', L0_y(events, events_pred))
Doing so globally, with a single threshold on the absolute values, we get:
threshold = np.quantile(np.absolute(events_pred).ravel(), 1.-rho)
print('threshold=', threshold)
events_pred_thr = events_pred * ((events_pred < -threshold) + (events_pred > threshold))
print('relative L1 error (thr)=', L1_y(events, events_pred_thr))
print('relative Squared error (thr)=', SE_y(events, events_pred_thr))
print('relative Squared image error (thr)=', SE_x(x, envelope, events_pred_thr))
The result is worse for the image reconstruction: indeed, the coefficients were found using the least-squares method, for which the pseudo-inverse is the optimal solution, so deviating from that solution increases the reconstruction error in the image domain.
Yet, it is better for the extraction of the sparse coefficients, as we zeroed out those which we classified as noise using our knowledge of the sparseness of the vector.
A remaining problem is that we do not know the sparseness a priori. Let's see how this changes with different estimates, computed either on the original or the noisy image:
N_scan_rho = 45
x_noise = add_noise(x, noise)
events_pred = invert(x, envelope)
events_pred_noise = invert(x_noise, envelope)
SE_thr = np.zeros((N_scan_rho, 2))
SE_noisy = np.zeros((N_scan_rho, 2))
rhos = np.logspace(-5, -2, N_scan_rho, base=10, endpoint=True)
for i_rho, rho_ in enumerate(rhos):
threshold = np.quantile(np.absolute(events_pred).ravel(), 1-rho_)
events_pred_thr = events_pred * ((events_pred < -threshold) + (events_pred > threshold))
SE_thr[i_rho, 0] = SE_x(x, envelope, events_pred_thr)
SE_thr[i_rho, 1] = SE_y(events_pred, events_pred_thr)
threshold = np.quantile(np.absolute(events_pred_noise).ravel(), 1-rho_)
events_pred_thr = events_pred_noise * ((events_pred_noise < -threshold) + (events_pred_noise > threshold))
SE_noisy[i_rho, 0] = SE_x(x_noise, envelope, events_pred_thr)
SE_noisy[i_rho, 1] = SE_y(events_pred_noise, events_pred_thr)
fig, ax = plt.subplots(1, 2, figsize=figsize)
for i, SE_l in enumerate(['SE_x', 'SE_y']):
ax[i].loglog(rhos, np.ones_like(SE_thr[:, i]), 'k--')
ax[i].loglog(rhos, SE_thr[:, i], label='Original')
ax[i].loglog(rhos, SE_noisy[:, i], '--', label='Noisy')
ax[i].set_xlabel('rho_est')
ax[i].set_ylabel(SE_l);
ax[0].legend(loc='best');
Now assuming that we have an estimate of rho, let's see how this method compares to the previous one ('full') for varying levels of noise:
rho_est = rho
SE_thr = np.zeros((N_scan, 2))
for i_noise, noise_ in enumerate(noises):
x_noise = add_noise(x, noise_)
events_pred = invert(x_noise, envelope)
threshold = np.quantile(np.absolute(events_pred).ravel(), 1-rho_est)
events_pred_thr = events_pred * ((events_pred < -threshold) + (events_pred > threshold))
SE_thr[i_noise, 0] = SE_x(x, envelope, events_pred_thr)
SE_thr[i_noise, 1] = SE_y(events, events_pred_thr)
SE_noise['thr']= SE_thr
fig, ax = plt.subplots(1, 2, figsize=figsize)
for i, SE_l in enumerate(['SE_x', 'SE_y']):
ax[i].plot(noises, np.ones(len(noises)), 'k--')
for method in ['full', 'thr']:
ax[i].plot(noises, SE_noise[method][:, i], label=method)
ax[i].set_xlabel('noise')
ax[i].set_ylabel(SE_l)
ax[i].set_xscale('log')
ax[i].set_yscale('log')
ax[0].legend(loc='best');
Thresholding coefficients gives a much better estimate of coefficients.
If we know the level of noise, one can also use Wiener filtering.
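In its simplest form (a sketch, assuming white noise of variance $\sigma^2$ and a signal power spectrum $S_y$), the Wiener filter replaces the small constant $\epsilon$ of the inverse filter by a noise-dependent term:

$$ \hat y = \frac{\hat A}{\hat A^2 + \sigma^2 / S_y} \cdot \hat x $$

The cell below uses a simple proxy for this, setting eps to sigma_noise * x.std() in the existing deconv function: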
sigma_noise = noise
envelope = MC_env(N_X, N_Y)
events, x = generative_model(envelope, verbose=True)
events_pred = invert(x, envelope, eps=sigma_noise * x.std())
print('relative L1 error (full)=', L1_y(events, events_pred))
print('relative Squared error (full)=', SE_y(events, events_pred))
print('relative Squared image error (full)=', SE_x(x, envelope, events_pred))
fig, axs = plt.subplots(1, 2, figsize=(fig_width, fig_width))
axs[0].imshow(events, cmap=plt.viridis())
axs[1].imshow(events_pred, cmap=plt.viridis());
fig, ax = plt.subplots(figsize=figsize)
counts, bins, _ = ax.hist(events.ravel(), fc='b', bins=100, alpha=.5)
counts, bins, _ = ax.hist(events_pred.ravel(), fc='r', bins=bins, alpha=.5)
ax.set_xlabel('coefficients')
ax.set_yscale('log')
ax.set_ylabel('#');
rho_est = rho
SE_thr = np.zeros((N_scan, 2))
for i_noise, noise_ in enumerate(noises):
x_noise = add_noise(x, noise_)
events_pred = invert(x_noise, envelope, eps=noise_ * x.std())
threshold = np.quantile(np.absolute(events_pred).ravel(), 1-rho_est)
events_pred_thr = events_pred * ((events_pred < -threshold) + (events_pred > threshold))
SE_thr[i_noise, 0] = SE_x(x, envelope, events_pred_thr)
SE_thr[i_noise, 1] = SE_y(events, events_pred_thr)
SE_noise['wie']= SE_thr
fig, ax = plt.subplots(1, 2, figsize=figsize)
for i, SE_l in enumerate(['SE_x', 'SE_y']):
ax[i].plot(noises, np.ones(len(noises)), 'k--')
for method in ['full', 'thr', 'wie']:
ax[i].plot(noises, SE_noise[method][:, i], label=method)
ax[i].set_xlabel('noise')
ax[i].set_ylabel(SE_l)
ax[i].set_xscale('log')
ax[i].set_yscale('log')
ax[0].legend(loc='best');
Could we repeat this procedure more times? A first solution is to define a global cost, as in the LASSO problem:

$$ \mathcal{C} = \frac{1}{2} \| x - A y \|^2 + \lambda \| y \|_1 $$

and to apply a gradient descent technique to this cost.
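The (sub)gradient of this cost with respect to $y$ combines the back-projected residual with the sign of the coefficients:

$$ \nabla_y \mathcal{C} = - A^T (x - A y) + \lambda \cdot \mathrm{sign}(y) $$

This is what the iterations below compute, the transpose $A^T$ being implemented as a convolution with the same (symmetric) envelope.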
conjugate gradient descent algorithm
setting up variables:
envelope = MC_env(N_X, N_Y)
events = np.zeros((N_X, N_Y))
events[N_X//2, N_Y//2] = 1
impulse = model(envelope, events)
ATA = model(envelope, impulse)
ATA_pred = model(envelope**2, events)
print(np.absolute(ATA_pred).max(), np.absolute(ATA).max())
assert(((ATA_pred-ATA)**2).sum() < 1.e-16)
#fig, ax = plt.subplots(figsize=(fig_width, fig_width))
vmax = np.absolute(impulse).max()
vmax = np.absolute(ATA).max()
#ax.imshow(impulse, cmap=plt.gray(), vmin=-vmax, vmax=vmax);
fig, axs = plt.subplots(1, 2, figsize=(fig_width, fig_width/phi))
axs[0].imshow(ATA[(N_X//2-N_X//4):(N_X//2+N_X//4), (N_Y//2-N_Y//4):(N_Y//2+N_Y//4)], cmap=plt.gray(), vmin=-vmax, vmax=vmax);
axs[1].imshow(ATA_pred[(N_X//2-N_X//4):(N_X//2+N_X//4), (N_Y//2-N_Y//4):(N_Y//2+N_Y//4)], cmap=plt.gray(), vmin=-vmax, vmax=vmax);
print('impulse.max()', impulse.max())
print('envelope.mean()', envelope.mean())
print('ATA.max()', ATA.max())
print('(envelope**2).mean()', (envelope**2).mean())
# envelope = MC_env(N_X, N_Y)
events, x = generative_model(envelope, verbose=True)
x_noise = add_noise(x, noise)
fig, axs = plt.subplots(1, 2, figsize=(fig_width, fig_width/phi))
#axs[0].imshow(envelope[(N_X//2-N_X//4):(N_X//2+N_X//4), (N_Y//2-N_Y//4):(N_Y//2+N_Y//4)], cmap=plt.plasma());
#axs[1].imshow(envelope[(N_X//2-N_X//4):(N_X//2+N_X//4), (N_Y//2-N_Y//4):(N_Y//2+N_Y//4)]**2, cmap=plt.plasma());
axs[0].imshow(envelope, cmap=plt.plasma());
axs[1].imshow(envelope**2, cmap=plt.plasma());
Let's project the image back onto the dictionary (applying $A^T$, here a convolution with the same envelope):
events_pred = model(envelope, x_noise, verbose=True) #/ 26.80927 # / np.sqrt(((envelope)**2).mean())
print('relative Squared image error (full)=', SE_x(x, envelope, events_pred))
print('relative L1 error (full)=', L1_y(events, events_pred))
print('relative Squared error (full)=', SE_y(events, events_pred))
rho_est = rho
threshold = np.quantile(np.absolute(events_pred).ravel(), 1-rho_est)
events_pred_thr = events_pred * ((events_pred < -threshold) + (events_pred > threshold))
print('relative L1 error (thr)=', L1_y(events, events_pred_thr))
print('relative Squared error (thr)=', SE_y(events, events_pred_thr))
print('relative Squared image error (thr)=', SE_x(x, envelope, events_pred_thr))
Other stats for introspection:
print('Mean envelope energy=', (envelope**2).mean())
print('RMS envelope energy=', np.sqrt((envelope**2).mean()) )
print('relative Squared error=',SE_y(events, events_pred_thr))
print('energy rec=', ((events_pred)**2).sum())
print('energy input=', ((events)**2).sum())
print('ratio energy=', ((events_pred)**2).sum()/(events**2).sum())
print('Squared ratio energy=', np.sqrt((events_pred**2).sum()/(events**2).sum()))
events, x = generative_model(envelope, seed=42)
x_noise = add_noise(x, noise)
events_pred = invert(x_noise, envelope)
threshold = np.quantile(np.absolute(events_pred).ravel(), 1-rho_est)
events_pred_thr = events_pred * ((events_pred < -threshold) + (events_pred > threshold))
error = x_noise - model(envelope, events_pred)
grad = model(envelope, error)
grad_pred = model(envelope, x_noise) - model(envelope**2, events_pred)
assert(((grad-grad_pred)**2).sum() < 1.e-16)
We will include information about sparsity by modifying the gradient, following the LASSO formulation (see https://blogs.princeton.edu/imabandit/2013/04/09/orf523-conditional-gradient-descent-and-structured-sparsity/ ):
y_lambda_est = y_lambda
grad -= y_lambda_est * np.sign(events_pred)
N_iter = 1000
eta = .01
start_iter = 10
rho_est = rho
y_lambda_est = 1. # y_lambda
N_scan = 15
default = dict(N_iter=N_iter, eta=eta, start_iter=start_iter, rho_est=rho_est,
y_lambda_est=y_lambda_est)
print('default =', default)
iters = range(N_iter)
SE_iter = np.zeros((N_iter, 2))
SE_thr = np.zeros((N_iter, 2))
# initialization
norm_envelope = 1 # np.sqrt((envelope**2).mean())
#events_pred = 0. * model(envelope, x_noise) / norm_envelope
events_pred = invert(x_noise, envelope)
# threshold = np.quantile(np.absolute(events_pred).ravel(), 1-rho_est)
# events_pred_thr = events_pred * ((events_pred < -threshold) + (events_pred > threshold))
# events_pred = events_pred_thr #np.zeros_like(x_noise)
# iterations
for i_iter in iters:
# predicting
grad = model(envelope, x_noise) - model(envelope**2, events_pred)
grad -= y_lambda_est * np.sign(events_pred)
events_pred += eta / (i_iter+start_iter) * grad
# measuring efficiency
SE_iter[i_iter, 0] = SE_x(x, envelope, events_pred)
SE_iter[i_iter, 1] = SE_y(events, events_pred)
# thresholding
threshold = np.quantile(np.absolute(events_pred).ravel(), 1.-rho_est)
events_pred_thr = events_pred * ((events_pred < -threshold) + (events_pred > threshold))
SE_thr[i_iter, 0] = SE_x(x, envelope, events_pred_thr)
SE_thr[i_iter, 1] = SE_y(events, events_pred_thr)
fig, ax = plt.subplots(1, 2, figsize=figsize)
for i, SE_l in enumerate(['SE_x', 'SE_y']):
ax[i].plot(iters, SE_iter[:, i], label='full')
ax[i].plot(iters, SE_thr[:, i], '--', label='thr')
ax[i].set_xlabel('iterations')
ax[i].set_ylabel(SE_l)
#ax[i].set_xscale('log')
ax[i].set_yscale('log')
ax[0].legend(loc='best');
Let's now test this function over its different parameters:
events, x = generative_model(envelope, seed=42)
x_noise = add_noise(x, noise)
def cg(x_noise, envelope,
N_iter=N_iter, eta=eta, start_iter=start_iter, rho_est=rho_est,
y_lambda_est=y_lambda_est):
#events_pred = np.zeros_like(x_noise)
events_pred = invert(x_noise, envelope)
#threshold = np.quantile(np.absolute(events_pred).ravel(), 1-rho_est)
#events_pred *= ((events_pred < -threshold) + (events_pred > threshold))
iters = range(int(N_iter))
# iterations
for i_iter in iters:
# predicting
grad = model(envelope, x_noise) - model(envelope**2, events_pred)
grad -= y_lambda_est * np.sign(events_pred)
events_pred += eta/(i_iter+int(start_iter)) * grad
return events_pred
Testing with and without the sparsity-inducing term:
events_pred = cg(x_noise, envelope, y_lambda_est=y_lambda_est)
print('relative L1 error (full)=', L1_y(events, events_pred))
print('relative Squared error (full)=', SE_y(events, events_pred))
print('relative Squared image error (full)=', SE_x(x, envelope, events_pred))
threshold = np.quantile(np.absolute(events_pred).ravel(), 1-rho_est)
events_pred_thr = events_pred * ((events_pred < -threshold) + (events_pred > threshold))
print('relative L1 error (thr)=', L1_y(events, events_pred_thr))
print('relative Squared error (thr)=', SE_y(events, events_pred_thr))
print('relative Squared image error (thr)=', SE_x(x, envelope, events_pred_thr))
events_pred = cg(x_noise, envelope, y_lambda_est=0)
print('relative L1 error (full)=', L1_y(events, events_pred))
print('relative Squared error (full)=', SE_y(events, events_pred))
print('relative Squared image error (full)=', SE_x(x, envelope, events_pred))
threshold = np.quantile(np.absolute(events_pred).ravel(), 1-rho_est)
events_pred_thr = events_pred * ((events_pred < -threshold) + (events_pred > threshold))
print('relative L1 error (thr)=', L1_y(events, events_pred_thr))
print('relative Squared error (thr)=', SE_y(events, events_pred_thr))
print('relative Squared image error (thr)=', SE_x(x, envelope, events_pred_thr))
noise
SE = {}
SE_thr = {}
for variable in default.keys():
SE[variable] = np.zeros((N_scan, 2))
SE_thr[variable] = np.zeros((N_scan, 2))
values = np.logspace(-1, 1, N_scan, base=10, endpoint=True) * default[variable]
for i_value, value in enumerate(values):
opts = default.copy()
opts[variable] = value
print('variable', variable, '=', '%.4f' % opts[variable], ' - default =', '%.4f' % default[variable])
x_noise = add_noise(x, noise)
events_pred = cg(x_noise, envelope, **opts)
SE[variable][i_value, 0] = SE_x(x, envelope, events_pred)
SE[variable][i_value, 1] = SE_y(events, events_pred)
threshold = np.quantile(np.absolute(events_pred).ravel(), 1-opts['rho_est'])
events_pred_thr = events_pred * ((events_pred < -threshold) + (events_pred > threshold))
SE_thr[variable][i_value, 0] = SE_x(x, envelope, events_pred_thr)
SE_thr[variable][i_value, 1] = SE_y(events, events_pred_thr)
for variable in default.keys():
values = np.logspace(-1, 1, N_scan, base=10, endpoint=True) * default[variable]
fig, ax = plt.subplots(1, 2, figsize=figsize)
for i, SE_l in enumerate(['SE_x', 'SE_y']):
ax[i].plot(values, np.ones_like(SE_thr[variable][:, i]), 'k--')
ax[i].plot(values, SE[variable][:, i], label='full')
ax[i].plot(values, SE_thr[variable][:, i], '--', label='thr')
ax[i].set_xlabel(variable)
ax[i].set_ylabel(SE_l);
fig.suptitle(variable)
ax[i].set_xscale('log')
ax[i].set_yscale('log')
ax[0].legend(loc='best');
And now for different levels of noise:
rho_est = rho
SE = np.zeros((N_scan, 2))
for i_noise, noise_ in enumerate(noises):
x_noise = add_noise(x, noise_)
events_pred = cg(x_noise, envelope)
threshold = np.quantile(np.absolute(events_pred).ravel(), 1-rho_est)
events_pred_thr = events_pred * ((events_pred < -threshold) + (events_pred > threshold))
SE[i_noise, 0] = SE_x(x, envelope, events_pred_thr)
SE[i_noise, 1] = SE_y(events, events_pred_thr)
SE_noise['cg'] = SE
fig, ax = plt.subplots(1, 2, figsize=figsize)
for i, SE_l in enumerate(['SE_x', 'SE_y']):
ax[i].plot(noises, np.ones(len(noises)), 'k--')
ax[i].plot(noises, SE[:, i], 'g--')
for method in ['full', 'thr', 'cg']:
ax[i].plot(noises, SE_noise[method][:, i], label=method)
ax[i].set_xlabel('noise')
ax[i].set_ylabel(SE_l)
ax[i].set_xscale('log')
ax[i].set_yscale('log')
ax[0].legend(loc='best');
It converges to a better solution than simple thresholding, yet the convergence is slow. Another solution is deployed in iterative thresholding.
ISTA: iterative shrinkage-thresholding algorithm
Some nice pointers about ISTA:
- the FISTA paper
- lectures on FISTA http://stat.cmu.edu/~ryantibs/convexopt/lectures/prox-grad.pdf http://www.seas.ucla.edu/~vandenbe/236C/lectures/fista.pdf
- https://blogs.princeton.edu/imabandit/2013/04/11/orf523-ista-and-fista/
An implementation in Matlab: https://github.com/seunghwanyoo/ista_lasso - and one in Python on a toy problem: https://gist.github.com/agramfort/ac52a57dc6551138e89b
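In a nutshell, ISTA alternates a gradient step on the quadratic term of the LASSO cost with the proximal operator of the $\ell_1$ norm (the soft-thresholding operator $\mathcal{T}$ defined below), for a step size $\eta$:

$$ y^{(k+1)} = \mathcal{T}_{\eta \lambda} \left( y^{(k)} + \eta \, A^T (x - A y^{(k)}) \right) $$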
As we have seen, we can invert a noisy image to retrieve the events, but the error is big...
env = MC_env(N_X, N_Y)
events, x = generative_model(env, verbose=True)
x_noise = add_noise(x, noise)
events_pred = invert(x_noise, envelope)
print('relative Squared image error (full)=', SE_x(x, envelope, events_pred))
print('relative L1 error (full)=', L1_y(events, events_pred))
print('relative Squared error (full)=', SE_y(events, events_pred))
One idea is to apply the same procedure iteratively, as we know that the events matrix is sparse:
def shrinkage(events_pred, threshold, stype='soft'):
if stype=='hard':
events_pred_thr = events_pred * ((events_pred < -threshold) + (events_pred > threshold))
else:
events_pred_thr = np.sign(events_pred) * np.maximum(np.abs(events_pred)-threshold, 0.)
return events_pred_thr
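The soft branch of the shrinkage function implements the proximal operator of the $\ell_1$ norm:

$$ \mathcal{T}_\tau(u) = \mathrm{sign}(u) \cdot \max(|u| - \tau, 0) $$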
fig, ax = plt.subplots(figsize=figsize)
events_pred = np.linspace(-1, 1, 200, endpoint=True)
for stype in ['hard', 'soft']:
ax.plot(events_pred, shrinkage(events_pred, threshold=.25, stype=stype), label=stype, alpha=.5)
ax.legend(loc='best')
ax.set_xlabel('coefficient')
ax.set_ylabel('shrunk coefficient');
N_iter = 2000
eta = .005
y_lambda_est = 4.
rho_est = rho
start_iter = 2
N_scan = 15
default = dict(N_iter=N_iter, rho_est=rho_est, y_lambda_est=y_lambda_est, eta=eta, start_iter=start_iter)
print('default =', default)
iters = range(N_iter)
SE_iter = np.zeros((N_iter, 2, 2))
SE_thr = np.zeros((N_iter, 2, 2))
for i_type, stype in enumerate(['soft', 'hard']):
if True:
        # initialization - with the denoising solution:
events_pred = invert(x_noise, envelope)
# thresholding
#threshold = np.quantile(np.absolute(events_pred).ravel(), 1-rho_est)
#events_pred = shrinkage(events_pred, threshold=threshold, stype='hard')
else:
events_pred = np.zeros_like(x_noise)
AT_x_noise = model(envelope, x_noise)
envelope2 = envelope**2
# iterations
for i_iter in iters:
# predicting
        grad = AT_x_noise - model(envelope2, events_pred)
#grad = model(envelope, x_noise - model(envelope, events_pred))
#grad -= y_lambda_est * np.sign(events_pred)
eta_ = eta / np.sqrt(i_iter+start_iter)
events_pred += eta_ * grad
# shrinkage operator
events_pred = shrinkage(events_pred, threshold=eta_*y_lambda_est, stype=stype)
# measuring efficiency
SE_iter[i_iter, 0, i_type] = SE_x(x, envelope, events_pred)
SE_iter[i_iter, 1, i_type] = SE_y(events, events_pred)
# thresholding
threshold = np.quantile(np.absolute(events_pred).ravel(), 1-rho_est)
events_pred_thr = shrinkage(events_pred, threshold=threshold, stype='hard')
SE_thr[i_iter, 0, i_type] = SE_x(x, envelope, events_pred_thr)
SE_thr[i_iter, 1, i_type] = SE_y(events, events_pred_thr)
SE_iter[-1, ...]
fig, ax = plt.subplots(1, 2, figsize=figsize)
for i, SE_l in enumerate(['SE_x', 'SE_y']):
ax[i].plot(iters,np.ones_like(SE_iter[:, i, 0]), 'k--')
for i_type, stype in enumerate(['soft', 'hard']):
ax[i].plot(iters, SE_iter[:, i, i_type], label=stype + '_full')
ax[i].plot(iters, SE_thr[:, i, i_type], '--', label=stype + '_thr')
ax[i].set_xlabel('iterations')
ax[i].set_ylabel(SE_l)
#ax[i].set_xscale('log')
ax[i].set_yscale('log')
ax[0].legend(loc='best');
Now testing ISTA to explore the effect of parameters:
def ISTA(x_noise, envelope, eta=eta, start_iter=start_iter, y_lambda_est=y_lambda_est, rho_est=rho_est, N_iter=N_iter, stype='soft'):
    # initialization
events_pred = invert(x_noise, envelope)
AT_x_noise = model(envelope, x_noise)
envelope2 = envelope**2
# iterations
iters = range(int(N_iter))
for i_iter in iters:
# predicting
grad = AT_x_noise - model(envelope2, events_pred)
eta_ = eta / np.log(i_iter+start_iter)
events_pred += eta_ * grad
# thresholding
events_pred = shrinkage(events_pred, threshold=eta_*y_lambda_est, stype=stype)
return events_pred
Testing different parameters:
SE = {}
SE_thr = {}
for variable in default.keys():
SE[variable] = np.zeros((N_scan, 2))
SE_thr[variable] = np.zeros((N_scan, 2))
values = np.logspace(-1, 1, N_scan, base=10, endpoint=True) * default[variable]
for i_value, value in enumerate(values):
opts = default.copy()
opts[variable] = value
print('variable', variable, '=', '%.3f' % opts[variable], ' - default =', '%.3f' % default[variable])
x_noise = add_noise(x, noise)
events_pred = ISTA(x_noise, envelope, **opts)
SE[variable][i_value, 0] = SE_x(x, envelope, events_pred)
SE[variable][i_value, 1] = SE_y(events, events_pred)
threshold = np.quantile(np.absolute(events_pred).ravel(), 1-opts['rho_est'])
events_pred_thr = events_pred * ((events_pred < -threshold) + (events_pred > threshold))
SE_thr[variable][i_value, 0] = SE_x(x, envelope, events_pred_thr)
SE_thr[variable][i_value, 1] = SE_y(events, events_pred_thr)
for variable in default.keys():
values = np.logspace(-1, 1, N_scan, base=10, endpoint=True) * default[variable]
fig, ax = plt.subplots(1, 2, figsize=figsize)
for i, SE_l in enumerate(['SE_x', 'SE_y']):
ax[i].plot(values, np.ones_like(SE_thr[variable][:, i]), 'k--')
ax[i].plot(values, SE[variable][:, i], label='full')
ax[i].plot(values, SE_thr[variable][:, i], '--', label='thr')
ax[i].set_xlabel(variable)
ax[i].set_ylabel(SE_l);
fig.suptitle(variable)
ax[i].set_xscale('log')
ax[i].set_yscale('log')
ax[0].legend(loc='best');
SE_ista = np.zeros((N_scan, 2))
for i_noise, noise_ in enumerate(noises):
x_noise = add_noise(x, noise_)
events_pred_thr = ISTA(x_noise, envelope)
SE_ista[i_noise, 0] = SE_x(x, envelope, events_pred_thr)
SE_ista[i_noise, 1] = SE_y(events, events_pred_thr)
SE_noise['ista']= SE_ista
fig, ax = plt.subplots(1, 2, figsize=figsize)
for i, SE_l in enumerate(['SE_x', 'SE_y']):
ax[i].plot(noises, np.ones(len(noises)), 'k--')
for method in ['full', 'thr', 'cg', 'ista']:
ax[i].plot(noises, SE_noise[method][:, i], label=method)
ax[i].set_xlabel('noise')
ax[i].set_ylabel(SE_l)
ax[i].set_xscale('log')
ax[i].set_yscale('log')
ax[0].legend(loc='best');
FISTA: Fast ISTA
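FISTA (Beck & Teboulle, 2009; the FISTA paper referenced above) accelerates ISTA with a momentum term: each shrinkage step $z^{(k+1)}$ is taken from an extrapolated point $y^{(k)}$, with the classical step-size sequence

$$ t_{k+1} = \frac{1 + \sqrt{1 + 4 t_k^2}}{2}, \qquad y^{(k+1)} = z^{(k+1)} + \frac{t_k - 1}{t_{k+1}} \left( z^{(k+1)} - z^{(k)} \right) $$

The function below follows this scheme with a constant step size eta (ideally, the inverse of the Lipschitz constant of the gradient):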
def FISTA(x_noise, envelope, eta=eta, y_lambda_est=y_lambda_est, rho_est=rho_est, N_iter=N_iter, stype='soft'):
    # initialization - with the denoising solution:
    z_old = invert(x_noise, envelope)
    events_pred = z_old.copy()
    t = 1.
    AT_x_noise = model(envelope, x_noise)
    envelope2 = envelope**2
    # iterations
    for i_iter in range(int(N_iter)):
        # proximal gradient (ISTA) step, taken at the extrapolated point
        grad = AT_x_noise - model(envelope2, events_pred)
        z = shrinkage(events_pred + eta * grad, threshold=eta*y_lambda_est, stype=stype)
        # momentum: extrapolate from the two last shrinkage estimates
        t_old = t
        t = .5 * (1 + np.sqrt(1 + 4*t_old**2))
        events_pred = z + (t_old - 1) / t * (z - z_old)
        z_old = z
    # return the last (sparse) shrinkage estimate
    return z_old
N_scan = 15
default = dict(N_iter=1000, rho_est=rho)
print('default =', default)
SE = {}
SE_thr = {}
for variable in default.keys():
SE[variable] = np.zeros((N_scan, 2))
SE_thr[variable] = np.zeros((N_scan, 2))
values = np.logspace(-1, 1, N_scan, base=10, endpoint=True) * default[variable]
for i_value, value in enumerate(values):
opts = default.copy()
opts[variable] = value
print('variable', variable, '=', '%.3f' % opts[variable], ' - default =', '%.3f' % default[variable])
x_noise = add_noise(x, noise)
events_pred = FISTA(x_noise, envelope, **opts)
SE[variable][i_value, 0] = SE_x(x, envelope, events_pred)
SE[variable][i_value, 1] = SE_y(events, events_pred)
threshold = np.quantile(np.absolute(events_pred).ravel(), 1-opts['rho_est'])
events_pred_thr = events_pred * ((events_pred < -threshold) + (events_pred > threshold))
SE_thr[variable][i_value, 0] = SE_x(x, envelope, events_pred_thr)
SE_thr[variable][i_value, 1] = SE_y(events, events_pred_thr)
for variable in default.keys():
values = np.logspace(-1, 1, N_scan, base=10, endpoint=True) * default[variable]
fig, ax = plt.subplots(1, 2, figsize=figsize)
for i, SE_l in enumerate(['SE_x', 'SE_y']):
ax[i].plot(values, np.ones_like(SE_thr[variable][:, i]), 'k--')
ax[i].plot(values, SE[variable][:, i], label='full')
ax[i].plot(values, SE_thr[variable][:, i], '--', label='thr')
ax[i].set_xlabel(variable)
ax[i].set_ylabel(SE_l);
fig.suptitle(variable)
ax[i].set_xscale('log')
ax[i].set_yscale('log')
ax[0].legend(loc='best');
SE_fista = np.zeros((N_scan, 2))
for i_noise, noise_ in enumerate(noises):
x_noise = add_noise(x, noise_)
events_pred_thr = FISTA(x_noise, envelope)
SE_fista[i_noise, 0] = SE_x(x, envelope, events_pred_thr)
SE_fista[i_noise, 1] = SE_y(events, events_pred_thr)
SE_noise['fista']= SE_fista
fig, ax = plt.subplots(1, 2, figsize=figsize)
for i, SE_l in enumerate(['SE_x', 'SE_y']):
    ax[i].loglog(noises, np.ones(len(noises)), 'k--')
for method in ['full', 'thr', 'cg', 'ista', 'fista']:
ax[i].plot(noises, SE_noise[method][:, i], label=method)
ax[i].set_xlabel('noise')
ax[i].set_ylabel(SE_l)
ax[0].legend(loc='best');
Checking the efficiency for different sparseness levels:
#N_X, N_Y = 2**10, 2**10
#N_X, N_Y = 2**7, 2**7
#noise = .04
envelope = MC_env(N_X, N_Y)
SE_thr = np.zeros((N_scan_rho, 2))
SE_ista = np.zeros((N_scan_rho, 2))
SE_fista = np.zeros((N_scan_rho, 2))
rhos = np.logspace(-4, -2, N_scan_rho, base=10, endpoint=True)
for i_rho, rho_ in enumerate(rhos):
events, x = generative_model(envelope, rho=rho_)
x_noise = add_noise(x, noise)
events_pred = invert(x_noise, envelope)
threshold = np.quantile(np.absolute(events_pred).ravel(), 1-rho_)
events_pred_thr = events_pred * ((events_pred < -threshold) + (events_pred > threshold))
    SE_thr[i_rho, 0] = SE_x(x, envelope, events_pred_thr)
    SE_thr[i_rho, 1] = SE_y(events, events_pred_thr)
events_pred_thr = ISTA(x_noise, envelope, rho_est=rho_)
SE_ista[i_rho, 0] = SE_x(x, envelope, events_pred_thr)
SE_ista[i_rho, 1] = SE_y(events, events_pred_thr)
events_pred_thr = FISTA(x_noise, envelope, rho_est=rho_)
SE_fista[i_rho, 0] = SE_x(x, envelope, events_pred_thr)
SE_fista[i_rho, 1] = SE_y(events, events_pred_thr)
fig, ax = plt.subplots(1, 2, figsize=figsize)
for i, SE_l in enumerate(['SE_x', 'SE_y']):
ax[i].loglog(rhos, np.ones_like(SE_thr[:, i]), 'k--')
    ax[i].loglog(rhos, SE_thr[:, i], label='thr')
    ax[i].loglog(rhos, SE_ista[:, i], '--', label='ISTA')
    ax[i].loglog(rhos, SE_fista[:, i], '--', label='FISTA')
ax[i].set_xlabel('rho_est')
ax[i].set_ylabel(SE_l);
ax[0].legend(loc='best');
- there is obviously something wrong with ISTA / FISTA
- TODO: learn the envelope http://yann.lecun.com/exdb/publis/pdf/gregor-icml-10.pdf
- TODO: Noisy case: correlated noise
some bookkeeping for the notebook
%reload_ext watermark
%watermark -i -h -m -v -p numpy,MotionClouds,matplotlib,scipy,imageio -r -g -b