Sparse and Redundant Modeling of Image Content Using an Image-Signature-Dictionary
Authors: Michal Aharon and Michael Elad
Introduction to Sparse Representation
Background
- In sparse and redundant modeling, a signal y \in R^n is represented as a linear combination of atoms taken from a dictionary D \in R^{n \times m}, which contains m atoms d_i \in R^n. A representation of the signal y is then any vector x \in R^m satisfying y = Dx
- m > n: the representation is redundant
- The solutions to y = Dx are infinitely many, and we prefer the sparsest one, i.e., the one with the smallest ||x||_0
- The task of computing a representation of a signal can be formally described by \min_x ||x||_0 s.t. y = Dx
- However, the constraint above is often relaxed and replaced by ||y - Dx||_2 \le \epsilon. This allows for additive noise and model deviations (a toy numeric illustration follows this list)
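As a toy illustration of redundancy and sparsity (the dictionary and numbers below are made up for the example, not taken from the paper):

```python
import numpy as np

# Toy redundant dictionary: n = 2 dimensions, m = 4 atoms (m > n)
D = np.array([[1.0, 0.0, 0.6, 0.8],
              [0.0, 1.0, 0.8, 0.6]])

y = np.array([0.6, 0.8])

# Infinitely many x satisfy y = Dx; we prefer the sparsest:
x_sparse = np.array([0.0, 0.0, 1.0, 0.0])   # ||x||_0 = 1
x_dense = np.array([0.6, 0.8, 0.0, 0.0])    # ||x||_0 = 2, also valid

assert np.allclose(D @ x_sparse, y) and np.allclose(D @ x_dense, y)
```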
Problem
- Solving this problem exactly was proved to be NP-hard
- Two approximation techniques:
  - Matching-pursuit (MP) methods that find the solution one entry at a time in a greedy way
  - The basis-pursuit (BP) algorithm that replaces the L_0 norm by the L_1 norm
- Two types of dictionaries:
  - Prespecified dictionaries
  - Dictionaries obtained by a learning procedure
- This paper:
  - The image-signature-dictionary (ISD) is a 2D image in which each patch can serve as a representing atom
  - A near shift-invariance property is obtained, due to the overlap between atoms extracted from the ISD at nearby locations
  - By taking patches of varying sizes, near scale-invariance is potentially obtained and exploited
The Structure of the ISD
- D_s \in R^{\sqrt{m} \times \sqrt{m}}: the signature dictionary
- y' \in R^{\sqrt{n} \times \sqrt{n}}: an image patch, and y \in R^n its single-column (lexicographically ordered) form
- d_s \in R^m: a vector obtained by a column-lexicographic ordering of the ISD
- C_{[k,l]}: the linear operator that extracts a patch of size \sqrt{n} \times \sqrt{n} from D_s at location [k,l]
- C_{[k,l]} d_s: the extracted patch as a column vector
- A patch is thus represented as y = \sum_{[k,l]} x_{[k,l]} C_{[k,l]} d_s, where x is the concatenation of the m coefficients in the array x_{[k,l]} (a sketch of the extraction operator follows this list)
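A minimal sketch of the extraction operator C_{[k,l]} and of flattening the ISD into the equivalent explicit n x m dictionary. The cyclic wraparound at the borders and all function names are my assumptions for illustration, not the authors' code:

```python
import numpy as np

def extract_patch(Ds, k, l, patch_size):
    """C_{[k,l]}: extract a patch_size x patch_size patch from the ISD at
    location [k, l] (cyclic wraparound at the borders is an assumption)
    and return it as a column-lexicographically ordered vector."""
    rows = (k + np.arange(patch_size)) % Ds.shape[0]
    cols = (l + np.arange(patch_size)) % Ds.shape[1]
    return Ds[np.ix_(rows, cols)].flatten(order="F")

def isd_to_explicit_dictionary(Ds, patch_size):
    """Stack the m extracted patches as columns of an explicit n x m
    dictionary D, whose columns are C_{[k,l]} d_s for every [k, l]."""
    m = Ds.size
    D = np.empty((patch_size ** 2, m))
    col = 0
    for l in range(Ds.shape[1]):          # column-lexicographic over [k, l]
        for k in range(Ds.shape[0]):
            D[:, col] = extract_patch(Ds, k, l, patch_size)
            col += 1
    return D

# Example: an 8x8 ISD (m = 64 atoms) with 4x4 patches (n = 16)
Ds = np.random.randn(8, 8)
D = isd_to_explicit_dictionary(Ds, patch_size=4)      # shape (16, 64)
```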
ISD Training
Energy Function
We assume that each patch is represented by a fixed number of atoms, L. The energy function that the ISD is expected to minimize:

\hat{d_s} = \arg\min_{d_s} \sum_{i=1}^{N} \min_{x_i} ||y_i - \sum_{[k,l]} x_i[k,l] C_{[k,l]} d_s||_2^2 s.t. ||x_i||_0 = L, i = 1, ..., N
Two stages in training
- The sparse-coding stage: find \hat{x_i}
  - Assuming that d_s is known and fixed, we can solve the inner optimization problem and find the sparsest representation \hat{x_i} for each example y_i
  - This problem can be solved by any pursuit algorithm, such as the greedy orthogonal matching pursuit (OMP)
  - OMP selects at each stage the atom from the dictionary that best resembles the current residual. After each such selection, the signal is back-projected onto the span of the chosen atoms, i.e., the coefficients of all atoms selected so far are refitted by least squares, and the new residual is calculated (see the sketch below)
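A minimal OMP sketch, assuming the explicit dictionary D built above with L2-normalized columns; this is my illustration of the step, not the authors' implementation:

```python
import numpy as np

def omp(D, y, L):
    """Greedy OMP: pick L atoms one at a time, refitting all chosen
    coefficients by least squares (the 'back-projection') at each step."""
    residual = y.astype(float).copy()
    support = []
    x = np.zeros(D.shape[1])
    for _ in range(L):
        # Atom most correlated with the current residual
        # (assumes the columns of D are L2-normalized)
        j = int(np.argmax(np.abs(D.T @ residual)))
        support.append(j)
        # Back-project y onto the span of the chosen atoms
        coeffs, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coeffs
    x[support] = coeffs
    return x
```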
- The dictionary-update stage: find \hat{d_s}
  - Assuming that the sparse representation vectors \hat{x_i} have been computed, we seek the best ISD d_s to minimize \sum_{i=1}^{N} ||y_i - M_i d_s||_2^2, where M_i = \sum_{[k,l]} \hat{x_i}[k,l] C_{[k,l]}
  - The gradient of the error expression: -2 \sum_{i=1}^{N} M_i^T (y_i - M_i d_s)
  - Thus, the optimal ISD is obtained simply by setting the gradient to zero: \hat{d_s} = (\sum_{i=1}^{N} M_i^T M_i)^{-1} \sum_{i=1}^{N} M_i^T y_i
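A sketch of this closed-form update under the same assumptions as before (cyclic patch extraction, names mine); it builds each M_i explicitly from the sparse coefficients and solves the normal equations:

```python
import numpy as np

def patch_indices(isd_shape, patch_size):
    """For each atom location [k, l] (column-lexicographic order), the n
    indices into the vectorized ISD d_s that C_{[k,l]} selects (cyclic
    wraparound assumed, matching the extraction sketch above)."""
    H, W = isd_shape
    idx = []
    for l in range(W):
        for k in range(H):
            rows = (k + np.arange(patch_size)) % H
            cols = (l + np.arange(patch_size)) % W
            grid = rows[:, None] + H * cols[None, :]   # index = row + H*col
            idx.append(grid.flatten(order="F"))
    return np.array(idx)                               # shape (m, n)

def update_isd(Y, X, isd_shape, patch_size):
    """Closed-form update d_s = (sum_i M_i^T M_i)^{-1} sum_i M_i^T y_i.
    Y: (n, N) training patches as columns; X: (m, N) sparse coefficients."""
    n = patch_size ** 2
    m = isd_shape[0] * isd_shape[1]
    idx = patch_indices(isd_shape, patch_size)
    A = np.zeros((m, m))
    b = np.zeros(m)
    for i in range(Y.shape[1]):
        M = np.zeros((n, m))               # M_i = sum_{[k,l]} x_i[k,l] C_{[k,l]}
        for j in np.flatnonzero(X[:, i]):  # only the L active atoms
            M[np.arange(n), idx[j]] += X[j, i]
        A += M.T @ M
        b += M.T @ Y[:, i]
    d_s = np.linalg.solve(A, b)            # assumes enough data for A invertible
    return d_s.reshape(isd_shape, order="F")
```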
- Stochastic gradient approach (SG)
  - The algorithm above computes x_i for all examples \{y_i\}_{i=1}^{N} and only then updates the ISD
  - Another way is to update the ISD right after the computation of each x_i (SG): d_s \leftarrow d_s + \mu M_i^T (y_i - M_i d_s)
  - The step-size parameter \mu is typically set as \mu = \frac{\mu_0}{Iteration}
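A one-line sketch of that per-example step (the constant 2 from the gradient is absorbed into \mu; names are mine):

```python
def sg_update(d_s, y_i, M_i, mu0, iteration):
    """One stochastic-gradient step on a single example: move d_s along
    the negative per-example gradient of ||y_i - M_i d_s||^2, with the
    decaying step size mu = mu0 / iteration."""
    mu = mu0 / iteration
    return d_s + mu * (M_i.T @ (y_i - M_i @ d_s))
```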
- Treatment of the mean
  - Remove the mean from the training and testing data
  - Redefine C_{[k,l]} to also remove the mean of the extracted patch
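A sketch of the redefined operator, reusing the hypothetical extract_patch above: it composes the plain extraction with mean removal, i.e., applies (I - (1/n) 1 1^T) after C_{[k,l]}:

```python
def extract_patch_zero_mean(Ds, k, l, patch_size):
    """Redefined C_{[k,l]}: extract the patch and subtract its mean,
    so every atom offered by the ISD is zero-mean."""
    p = extract_patch(Ds, k, l, patch_size)
    return p - p.mean()
```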