
Sparse and Redundant Modeling of Image Content Using an Image-Signature-Dictionary

Paper notes - 2008 - SIAM - 109

Posted by icbcbicc on September 16, 2016


Authors: Michal Aharon and Michael Elad


Introduction to Sparse Representation

Background

  • In sparse and redundant modeling, a signal $y \in \mathbb{R}^n$ is represented as a linear combination of atoms taken from a dictionary $D \in \mathbb{R}^{n \times m}$, whose $m$ columns are the atoms $d_i \in \mathbb{R}^n$. A representation of the signal $y$ is then any vector $x \in \mathbb{R}^m$ satisfying $y = Dx$.

  • $m > n$ : the representation is redundant.

  • The solutions to $y = Dx$ are infinitely many, and we prefer the sparsest one, i.e., the one with the smallest $\|x\|_0$.

  • The task of computing a representation of a signal can be formally described by

    $$\hat{x} = \arg\min_x \|x\|_0 \quad \text{s.t.} \quad y = Dx$$

  • However, the constraint above is often relaxed and replaced by $\|y - Dx\|_2 \le \epsilon$. This allows for additive noise and model deviations.


Problem

  • Solving this problem has been proven to be NP-hard

  • Two approximation techniques :

    • Matching-pursuit (MP) methods that find the solution one entry at a time in a greedy way

    • Basis-pursuit (BP) algorithm that replaces the $\ell_0$ norm with the $\ell_1$ norm

  • Two types of dictionary

    • Prespecified dictionaries

    • Dictionaries obtained by a learning procedure


This paper

  • The image-signature-dictionary (ISD) is a 2D image in which each patch can serve as a representing atom

  • A near shift-invariant property is obtained, due to the overlap between atoms extracted from the ISD in nearby locations

  • Taking patches of varying sizes potentially yields near scale-invariance, which can be exploited as well


The Structure of ISD

  • $D_s \in \mathbb{R}^{\sqrt{m} \times \sqrt{m}}$ : the signature dictionary

  • $y' \in \mathbb{R}^{\sqrt{n} \times \sqrt{n}}$ : an image patch, with $y \in \mathbb{R}^n$ its single-column form

  • $d_s \in \mathbb{R}^m$ : a vector obtained by a column lexicographic ordering of the ISD

  • $C_{[k,l]}$ : the linear operator that extracts a patch of size $\sqrt{n} \times \sqrt{n}$ from $D_s$ at location $[k,l]$

  • $C_{[k,l]} d_s$ : the extracted patch as a column vector

  • $x$ : the concatenation of the $m$ coefficients in the array $x_{[k,l]}$, so that a patch is represented as $y = \sum_{[k,l]} x_{[k,l]}\, C_{[k,l]} d_s$ (see the sketch below)
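To make the operator notation concrete, here is a minimal NumPy sketch of $C_{[k,l]}$ acting on the ISD. Cyclic (wrap-around) extraction is assumed so that every location $[k,l]$ in the $\sqrt{m} \times \sqrt{m}$ grid yields a valid atom, giving $m$ overlapping atoms in total; the function name and shapes are illustrative, not the paper's code.

```python
import numpy as np

def extract_patch(D_s, k, l, patch_side):
    """C_{[k,l]} d_s: pull the sqrt(n) x sqrt(n) patch at location [k, l]
    out of the ISD and return it as a column-stacked vector of length n.
    Cyclic boundaries are assumed, so all m locations yield atoms."""
    rows = np.arange(k, k + patch_side) % D_s.shape[0]
    cols = np.arange(l, l + patch_side) % D_s.shape[1]
    # column lexicographic ordering, matching the definition of d_s
    return D_s[np.ix_(rows, cols)].flatten(order='F')

# Example: a 16x16 ISD (m = 256) and 8x8 patches (n = 64)
D_s = np.random.randn(16, 16)
atom = extract_patch(D_s, 3, 5, 8)  # one of the 256 overlapping atoms
```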


ISD Training


Energy Function

We assume that each patch is represented by a fixed number of atoms, $L$. The energy function that the ISD is expected to minimize:

$$\hat{d}_s = \arg\min_{d_s} \sum_{i=1}^{N} \min_{x_i} \left\| y_i - \sum_{[k,l]} x_i[k,l]\, C_{[k,l]} d_s \right\|_2^2 \quad \text{s.t.} \quad \|x_i\|_0 = L, \; i = 1, \dots, N$$

Two stages in training

  • The sparse coding stage: find $\hat{x}_i$

    Assuming that $d_s$ is known and fixed, we can solve the inner optimization problem and find the sparsest representation $\hat{x}_i$ for each example $y_i$.

    This problem can be solved by any pursuit algorithm, for example a greedy one such as orthogonal matching pursuit (OMP).

    OMP selects at each stage the atom from the dictionary that best resembles the current residual. After each such selection, the signal is back-projected onto the set of chosen atoms, i.e., orthogonally projected onto their span via a least-squares fit that recomputes all coefficients on the support; the new residual is then the part of the signal the chosen atoms cannot explain. A sketch follows below.
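A minimal NumPy sketch of generic OMP, assuming the $m$ atoms $C_{[k,l]} d_s$ have been stacked (and normalized) as the columns of an explicit matrix `D`; the paper works with the ISD operators directly, so this only illustrates the pursuit itself.

```python
import numpy as np

def omp(D, y, L):
    """Orthogonal matching pursuit with a fixed number of atoms L.
    D: n x m dictionary with (ideally) unit-norm columns; y: signal."""
    residual = y.astype(float).copy()
    support = []
    x = np.zeros(D.shape[1])
    for _ in range(L):
        # greedy step: atom most correlated with the current residual
        j = int(np.argmax(np.abs(D.T @ residual)))
        support.append(j)
        # back-projection: least-squares fit of y on the chosen atoms,
        # i.e., orthogonal projection onto their span
        coeffs, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coeffs
    x[support] = coeffs
    return x
```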

  • The dictionary update stage: find $\hat{d}_s$

    Assuming that the sparse representation vectors $\hat{x}_i$ have been computed, we seek the best ISD $d_s$, i.e., the minimizer of

    $$f(d_s) = \sum_{i=1}^{N} \left\| y_i - \sum_{[k,l]} x_i[k,l]\, C_{[k,l]} d_s \right\|_2^2 = \sum_{i=1}^{N} \left\| y_i - A_i d_s \right\|_2^2$$

    The gradient of the error expression is

    $$\frac{\partial f}{\partial d_s} = -2 \sum_{i=1}^{N} A_i^T \left( y_i - A_i d_s \right)$$

    where

    $$A_i = \sum_{[k,l]} x_i[k,l]\, C_{[k,l]}$$

    Thus, setting the gradient to zero, the optimal ISD is obtained simply by

    $$\hat{d}_s = \left( \sum_{i=1}^{N} A_i^T A_i \right)^{-1} \sum_{i=1}^{N} A_i^T y_i$$

    A dense NumPy sketch of this closed-form update follows below.
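A dense sketch of the closed-form update, with $C_{[k,l]}$ materialized as 0/1 selection matrices; cyclic extraction and column-major flattening of the coefficient array $x_{[k,l]}$ are assumed conventions, and in practice these operators would be applied implicitly rather than stored.

```python
import numpy as np

def extraction_matrix(k, l, side_m, side_n):
    """Dense C_{[k,l]}: an n x m binary matrix picking the cyclic
    side_n x side_n patch at [k, l] from the vectorized ISD d_s."""
    C = np.zeros((side_n * side_n, side_m * side_m))
    rows = np.arange(k, k + side_n) % side_m
    cols = np.arange(l, l + side_n) % side_m
    for pj, cj in enumerate(cols):          # column lexicographic order
        for pi, ci in enumerate(rows):
            C[pj * side_n + pi, cj * side_m + ci] = 1.0
    return C

def update_isd(Y, X, side_m, side_n):
    """Closed-form ISD update for fixed sparse codes.
    Y: n x N column-stacked examples; X: m x N coefficient vectors,
    row j of X holding x_i[k, l] with j = l * side_m + k (assumed)."""
    m = side_m * side_m
    Cs = [extraction_matrix(k, l, side_m, side_n)
          for l in range(side_m) for k in range(side_m)]
    lhs, rhs = np.zeros((m, m)), np.zeros(m)
    for i in range(Y.shape[1]):
        # A_i = sum over the support of x_i of x_i[k,l] * C_{[k,l]}
        A_i = sum(X[j, i] * Cs[j] for j in np.flatnonzero(X[:, i]))
        lhs += A_i.T @ A_i
        rhs += A_i.T @ Y[:, i]
    # lstsq instead of solve, in case lhs is (near-)singular
    d_s, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
    return d_s.reshape(side_m, side_m, order='F')
```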

  • The stochastic gradient (SG) approach

    • The algorithm above updates $x_i$ for all examples $\{y_i\}_{i=1}^N$ and only then updates the ISD.

    • Another way is to update the ISD right after the computation of each $x_i$ (SG):

      $$d_s \leftarrow d_s + \mu\, A_i^T \left( y_i - A_i d_s \right)$$

    The step-size parameter $\mu$ is typically set as $\mu = \frac{\mu_0}{\text{iteration}}$; a sketch of one SG step follows below.
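A sketch of a single SG update under the same conventions as above (`Cs` holds the dense $C_{[k,l]}$ matrices from the previous snippet; the training loop and `mu0` are placeholders):

```python
import numpy as np

def sg_isd_step(d_s_vec, y_i, x_i, Cs, mu):
    """One stochastic-gradient update, applied right after coding y_i:
    d_s <- d_s + mu * A_i^T (y_i - A_i d_s)."""
    A_i = sum(x_i[j] * Cs[j] for j in np.flatnonzero(x_i))
    return d_s_vec + mu * (A_i.T @ (y_i - A_i @ d_s_vec))

# Decaying step size mu = mu_0 / iteration, as in the notes:
# for it, (y_i, x_i) in enumerate(examples, start=1):
#     d_s_vec = sg_isd_step(d_s_vec, y_i, x_i, Cs, mu0 / it)
```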

Treatment of the mean

  • Remove the mean from the training and testing data

  • Redefine $C_{[k,l]}$ to also remove the mean of the extracted patch (a sketch follows below)
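The redefined operator is just the original extraction followed by mean subtraction; a one-function sketch, reusing the cyclic-extraction convention from above:

```python
import numpy as np

def extract_patch_zero_mean(D_s, k, l, patch_side):
    """Mean-removing C_{[k,l]}: extract the cyclic patch at [k, l]
    and subtract its mean, so atoms (like the data) are zero-mean."""
    rows = np.arange(k, k + patch_side) % D_s.shape[0]
    cols = np.arange(l, l + patch_side) % D_s.shape[1]
    patch = D_s[np.ix_(rows, cols)].flatten(order='F')
    return patch - patch.mean()
```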