Sparse and Redundant Modeling of Image Content Using an Image-Signature-Dictionary

论文笔记-2008-SIAM-109

Posted by icbcbicc on September 16, 2016

Sparse and Redundant Modeling of Image Content Using an Image-Signature-Dictionary

Authors: Michal Aharon and Michael Elad
PDF


Introduction of Sparse Representation

Background

  • In sparse and redundant modeling, a signal $y \in R^n$ is represented as a linear combination of some atoms taken from a dictionary $D \in R^{n*m}$ which contains $m$ atoms $d_i \in R^n$. A representation of the signal $y$ is then any vector $x \in R^n$ satisfying $y = Dx$

  • $m > n$: the representation is redundant.

  • The solutions to $y = Dx$ are infinite and we prefer the sparest one, i.e, the one with the smallest ${||x||}_0$

  • The task of computing a representation of a signal can be formly described by

  • However, the constrait above is often relaxed and replaced by ${||y-Dx||}_2 \le \epsilon$. This allows for additive noise and model deviations


Problem

  • Solving this problem was proved to be an NP-hard problem

  • Two approximation techniques :

    • Matching-pursuit (MP) methods that find the solution one entry at a time in a greedy way

    • Basis-pursuit (BP) algorithm that replaces the $L_0$ norm by the $L_1$ norm

  • Two types of dictionary

    • Prespecified dictionaries

    • Dictionaries obtained by learning procedure


This paper

  • The image-signature-dictionary (ISD), is an 2D image in which each patch can serve as a representing atom

  • A near shift-invariant property is obtained, due to the overlap between atoms extracted from the ISD in nearby locations

  • By taking patches of varying sizes, near scale-invariance is potentially obtained and exploited


The Structure of ISD

  • $D_s \in R^{\sqrt{m}*\sqrt{m}} : $ signature dictionary

  • $y’ \in R^{\sqrt{n}*\sqrt{n}} : $ imasge patch and its single column form $y \in R^n$

  • $d_s \in R^m : $ a vector obtained by a column lexicographic ordering of the ISD

  • $C_{[k,l]} : $ the linear operator that extract a patch of size $\sqrt{n}*\sqrt{n}$ from the $D_s$ in location $[k,l]$

  • $C_{[k,l]}d_s : $ the extracted patch as a column vector

  • $x$ is the concatenation of the $m$ coefficients in the array $x_{[k,l]}$


ISD Training


Energy Function

We assume that each patch is represented by a fixed number of atoms, L. The energy function that the ISD is expected to minimize :

s.t.

s.t.

Two stages in training

  • The sparse coding stage: find $\hat{x_i}$

    Assuming that $d_s$ is known and fixed, we can solve the inner optimization problem and find the sparsest representation $\hat{x_i}$ for each example $y_i$

    This problem can be solved by any pursuit algorithm, such as a greedy method——the orthogonal matching pursuit (OMP)

    The OMP selects at each stage an atom from the dictionary that best resembles the residual. After each such selection, the signal is back-projected onto the set of chosen atoms, and the new residual signal is calculated (Don’t understand)

  • Dictionary update stage: $\hat{d_s}$

    Assuming that the sparse representation vectors $\hat{x_i}$ have been computed, we seek the best ISD $d_s$ to minimize:

    The gradient of the error expression:

    where

    Thus, the optimal ISD is obtained simply by:

  • Stochastic gradient appoach (SG)

    • The algorithm above updates the $x_i$ for all examples {y_i}_{i=1}^N and then updates the ISD

    • Another way is to update the ISD after the computation of each $x_i$ (SG):

    The step-size parameter $\mu$ is typically set as $\mu = \frac{\mu_0}{Iteration}$

Treatment of the mean

  • Remove the mean in training and testing data

  • Redefine the $C_{[k,l]}$ to also remove the mean of the extrated patch