Back to my homepage
Fast and Exact Total Variation Minimization

This webpage briefly presents some results of my work with Marc Sigelle on total variation minimization. Many more results will be added once we receive notification for our submitted papers, so feel free to check back.

Most of the results presented here have been published in [1] and [2].


Jérôme Darbon - Marc Sigelle

Overview of our Approach

Our method deals with the minimization of the total variation under a convex data fidelity term. We propose an algorithm which computes an exact minimizer of this problem. The method relies on the decomposition of an image into its level sets. Using these level sets, we map the problem onto optimizations of independent binary Markov Random Fields. Binary solutions are found with graph-cut techniques, and we show how to derive a fast algorithm from them. We also study the special case where the fidelity term is the $ L^1$-norm. Finally, we present some experiments.


We assume $ u$ and $ v$ are two images defined on $ \Omega$, and we are interested in minimizing the following functional:

$\displaystyle E(u) = \int_\Omega f\left(u(x), v(x)\right)dx + \beta \int_\Omega \vert\nabla u\vert   .$ (1)

We assume that the data fidelity term is a convex function of $ u(\cdot)$, such as $ f\left(u(x), v(x)\right) = \vert u(x) - v(x)\vert^\alpha$ for the $ L^\alpha$ case $ (\alpha = 1, 2)$, and that the regularization parameter $ \beta$ is a positive constant. We propose a fast algorithm which computes an exact minimizer of problem (1). It relies on reformulating this problem as a family of independent binary MRFs, one attached to each level set of the image. Exact minimization of each binary problem is performed with a minimum-cost cut algorithm.

Formulation using Level Sets and MRF

We assume that $ u$ takes values in the discrete set $ [0,L-1]$ and is defined on a discrete lattice $ S$. We denote by $ u_s$ the value of the image $ u$ at the site $ s \in S$. We consider the thresholded images $ u^\lambda$ defined by $ u_s^\lambda = \mathbb{1}_{u_s \leq \lambda}$. One can reconstruct the original image from its level sets using $ u_s = \min\{\lambda : u_s^\lambda = 1\}$.
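The thresholding and reconstruction steps can be sketched in a few lines of NumPy (a minimal illustration, not the authors' code; the image is assumed to be a small integer array):

```python
import numpy as np

def threshold(u, lam):
    """Binary level set: u^lambda_s = 1 if u_s <= lambda, else 0."""
    return (u <= lam).astype(np.uint8)

def reconstruct(level_sets):
    """Recover u_s = min{lambda : u^lambda_s = 1}.
    `level_sets` maps lambda -> binary array."""
    lams = sorted(level_sets)
    u = np.full(level_sets[lams[0]].shape, lams[-1], dtype=int)
    for lam in reversed(lams):          # scan from high to low levels
        u[level_sets[lam] == 1] = lam   # keep the smallest lambda with u^lambda = 1
    return u

u = np.array([[0, 3], [2, 1]])
sets = {lam: threshold(u, lam) for lam in range(4)}   # L = 4 grey levels
assert np.array_equal(reconstruct(sets), u)
```

Scanning the levels from high to low guarantees each pixel ends up with the smallest $ \lambda$ for which its level set equals 1, which is exactly the reconstruction formula above.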

The coarea formula states that for any function $ u$ in the space of functions of bounded variation [3] one has $ tv(u) = \int_{\mathbb{R}} P(u^\lambda)\, d\lambda$. In the discrete case, we write $ tv(u) = \sum_{\lambda=0}^{L-2} P(u^\lambda)$, where $ P(u^\lambda)$ is the perimeter of $ u^\lambda$ (notice that $ u_s^{L-1} = 1$ for every $ s \in S$, which explains why the summation stops at $ L-2$). Let us denote the neighboring relationship between two sites $ s$ and $ t$ by $ s \sim t$; the associated cliques of order two are noted $ (s,t)$. This enables us to estimate the perimeter using the approach proposed in [5]. Thus we have $ tv(u) = \sum_{\lambda=0}^{L-2} \sum_{(s,t)} w_{st} \vert u_s^\lambda - u_t^\lambda\vert$, where $ w_{st}$ is set to 0.26 and 0.19 for the four- and eight-connected neighborhoods, respectively.
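The discrete coarea identity is easy to check numerically: summing the weighted perimeters of the level sets reproduces the weighted total variation exactly. A minimal NumPy sketch, using the 4-connected weight 0.26 from the text and an arbitrary test image:

```python
import numpy as np

W4 = 0.26   # weight w_st for the four-connected neighborhood, as in the text

def perimeter(b):
    """Weighted perimeter of a binary image: sum of w_st |b_s - b_t| over 4-cliques."""
    b = b.astype(int)
    return W4 * (np.abs(np.diff(b, axis=0)).sum() + np.abs(np.diff(b, axis=1)).sum())

def tv_direct(u):
    """tv(u) computed directly: sum over cliques of w_st |u_s - u_t|."""
    return W4 * (np.abs(np.diff(u, axis=0)).sum() + np.abs(np.diff(u, axis=1)).sum())

def tv_coarea(u, L):
    """tv(u) computed via the level sets: sum_{lambda=0}^{L-2} P(u^lambda)."""
    return sum(perimeter(u <= lam) for lam in range(L - 1))

u = np.array([[0, 3, 3], [2, 1, 0]])
assert np.isclose(tv_direct(u), tv_coarea(u, L=4))
```

The identity holds because, for integer values, $ \vert u_s - u_t\vert$ is exactly the number of levels $ \lambda$ at which the two indicators $ u_s^\lambda$ and $ u_t^\lambda$ differ.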

Proposition 1   The discrete version of the energy $ E(u)$ rewrites as

$\displaystyle E(u) = \sum_{\lambda=0}^{L-2} E^\lambda(u^\lambda) + C \enspace ,$ (2)

where

$\displaystyle E^\lambda(u^\lambda) = \beta \sum_{(s,t)} w_{st} \left( (1-2u_t^\lambda)\, u_s^\lambda + u_t^\lambda \right) + \sum_{s\in S} \left( g_s(\lambda+1) - g_s(\lambda) \right) \left( 1 - u_s^\lambda \right) \enspace ,$ (3)

with $ g_s(x) = f(x, v_s) \;\; \forall s \in S$ and $ C = \displaystyle \sum_{s\in S} g_s(0)$.

Proof: see [2].
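Proposition 1 can be checked numerically on a tiny image: summing the level energies $ E^\lambda$ and the constant $ C$ reproduces the full energy. A sketch assuming the $ L^1$ data term $ g_s(x) = \vert x - v_s\vert$ and the 4-connected weight from the text (the images and parameters are arbitrary):

```python
import numpy as np

BETA, W, L = 2.0, 0.26, 4   # regularization, clique weight, number of grey levels

def g(x, v):
    """L1 data term g_s(x) = f(x, v_s) = |x - v_s| (elementwise over the image v)."""
    return np.abs(x - v)

def cliques(u):
    """Sum of w_st |u_s - u_t| over 4-connected cliques."""
    return W * (np.abs(np.diff(u, axis=0)).sum() + np.abs(np.diff(u, axis=1)).sum())

def E(u, v):
    """Full discrete energy: data term plus beta * tv(u)."""
    return g(u, v).sum() + BETA * cliques(u)

def E_level(u, v, lam):
    """Level energy E^lambda of equation (3), evaluated on the level set of u."""
    b = (u <= lam).astype(int)
    return BETA * cliques(b) + ((g(lam + 1, v) - g(lam, v)) * (1 - b)).sum()

u = np.array([[0, 3], [2, 1]])
v = np.array([[1, 2], [3, 0]])
lhs = E(u, v)
rhs = sum(E_level(u, v, lam) for lam in range(L - 1)) + g(0, v).sum()   # + C
assert np.isclose(lhs, rhs)
```

The data part telescopes: summing $ (g_s(\lambda+1) - g_s(\lambda))$ over the levels $ \lambda < u_s$ gives $ g_s(u_s) - g_s(0)$, and $ C$ restores the $ g_s(0)$ terms.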

Note that each $ E^\lambda(u^\lambda)$ is a binary MRF with an Ising prior model. To minimize $ E(\cdot)$ one can minimize all the $ E^\lambda(\cdot)$ independently, which yields a family $ \{\hat u^\lambda\}$ of respective minimizers. The sum is then clearly minimized, and thus we obtain a minimizer of $ E(\cdot)$, provided this family is monotone:

$\displaystyle \hat u^\lambda \leq \hat u^\mu \qquad \forall \lambda < \mu \enspace .$ (4)

If this property holds, then the optimal solution is given by [4]: $ \hat u_s = \min\{\lambda : \hat u_s^\lambda = 1\}$ $ \forall s$. If property (4) does not hold, then the family $ \{\hat u^\lambda\}$ is not the family of level sets of any image, and no minimizer can be reconstructed from it.
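On a tiny image one can verify end to end that minimizing every $ E^\lambda$ independently and stacking the solutions yields a global minimizer. The sketch below is a toy verification, not the authors' algorithm: exhaustive search replaces the minimum-cost cut, and at each level we take the pointwise minimum of all minimizers, which by submodularity of the binary energies is itself a minimizer and gives one monotone selection:

```python
import itertools
import numpy as np

BETA, W, L = 1.0, 0.26, 4   # regularization, 4-connected weight, grey levels

def prior(b):
    """beta * sum of w_st |b_s - b_t| over 4-connected cliques."""
    return BETA * W * (np.abs(np.diff(b, axis=0)).sum()
                       + np.abs(np.diff(b, axis=1)).sum())

def level_energy(b, lam, v):
    """E^lambda for the L1 data term g_s(x) = |x - v_s|."""
    dg = np.abs(lam + 1 - v) - np.abs(lam - v)
    return prior(b) + (dg * (1 - b)).sum()

def minimal_minimizer(lam, v):
    """Pointwise-minimal minimizer of E^lambda, found by brute force
    (the real algorithm would use a minimum-cost cut instead)."""
    shape, n = v.shape, v.size
    configs = [np.array(c).reshape(shape)
               for c in itertools.product((0, 1), repeat=n)]
    best = min(level_energy(b, lam, v) for b in configs)
    mins = [b for b in configs if np.isclose(level_energy(b, lam, v), best)]
    return np.minimum.reduce(mins)   # lattice property: still a minimizer

def minimize_tv(v):
    """Solve each level independently and stack the monotone level sets."""
    u = np.full(v.shape, L - 1)
    for lam in reversed(range(L - 1)):
        u[minimal_minimizer(lam, v) == 1] = lam   # u_s = min{lam : u^lam_s = 1}
    return u
```

For instance, with $ v$ a 2x2 checkerboard of values 0 and 3, the stacked solution attains the same energy as an exhaustive search over all $ L^{|S|}$ images.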

Since the MRF posterior energy is decomposable into levels, it is useful to define the ``local neighborhood configurations'': $ N_s= \{u_t\}_{t \sim s}$ and $ N_s^\lambda = \{u_t^\lambda\}_{t \sim s}$ $ \forall \lambda \in [0, L-2]$ .

Proposition-Lemma   If the local conditional posterior energy at each site $ s$ can be written as

$\displaystyle E(u_s \vert N_s, v_s) = \displaystyle\sum_{\lambda=0}^{L-2} ( \Delta \phi_s (\lambda) u_s^\lambda  + \chi_s (\lambda)  )$ (5)

where $ \Delta \phi_s (\lambda)$ is a non-increasing function of $ \lambda $ and $ \chi_s (\lambda)$ does not depend on $ u_s^\lambda$, then one can exhibit a ``coupled'' stochastic algorithm minimizing each total posterior energy $ E^{\lambda}(u^\lambda)$ while preserving the monotone condition: $ \forall s , $ $ u_s^{\lambda} \nearrow$ with $ \lambda $ .

Proof: see [1].

In other words, given a binary solution $ u^\star$ of the problem $ E^k$, there exists at least one solution $ \hat{u}$ of the problem $ E^l$ such that $ u^\star \leq \hat{u}$, $ \forall k \leq l$.

Proposition 2   The requirements stated by the previous Lemma are equivalent to the following:
all conditional energies $ E(u_s \vert N_s, v_s)$ are convex functions of the grey level $ u_s \in [0,L-1]$, for any neighborhood configuration and any local observed data.

Proof: see [2].

Clearly both the $ L^1$ + TV and $ L^2$ + TV models enjoy this convexity property and thus satisfy the conditions of application of our Lemma.
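The convexity condition is easy to check for these two models: a function of a discrete variable is convex exactly when its successive increments $ g(\lambda+1) - g(\lambda)$ are non-decreasing. A small sketch (the observed value $ v = 7$ is an arbitrary choice):

```python
import numpy as np

def increments(g, L):
    """Delta g(lam) = g(lam + 1) - g(lam) for lam = 0 .. L-2."""
    vals = np.array([g(x) for x in range(L)], dtype=float)
    return np.diff(vals)

v = 7
for g in (lambda x: abs(x - v),        # L1 data term
          lambda x: (x - v) ** 2):     # L2 data term
    d = increments(g, 16)
    assert (np.diff(d) >= 0).all()     # convexity: non-decreasing increments
```

For $ L^1$ the increments jump from $ -1$ to $ +1$ at $ v$; for $ L^2$ they increase linearly. In both cases $ \Delta\phi_s(\lambda)$ in the Lemma is non-increasing, as required.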

A Fast Algorithm

Although the previous section proves that the monotone property holds, it does not provide an algorithm to compute a solution. Our algorithm makes use of the formulation of equation (2), which allows independent optimizations.

We use a divide-and-conquer algorithm to minimize the energy. Such an approach decomposes a problem into smaller ones, solves these subproblems, and recombines the subsolutions into an optimal solution. Our algorithm exploits the following observation. Suppose we minimize at some level $ \lambda $. Then, for every pixel of the minimizer, we know whether it lies below or above $ \lambda $. It is therefore useless to take pixels above $ \lambda $ into account in further optimizations that only allow pixels below $ \lambda $; the same holds symmetrically for pixels below $ \lambda $. Consequently, every connected component of the minimizer (these components define a partition of the image) can be optimized independently of the others. This corresponds to the decomposition of the problem into subproblems. Once the minimizers of the subproblems are computed, they are recombined to yield an optimal solution; the recombination is straightforward since the decomposition is a partition.

Figure 1: Illustration of our algorithm. From left to right, image (a), (b), (c) and (d). The partition of the image after a minimization with respect to some level $ \lambda $ is shown on (a). The connected components of the image (a) are shown on (b): it corresponds to the decomposition of the problem into subproblems. Each subproblem is solved independently and the result is depicted in (c). Finally solutions of subproblems are recombined to yield the image (d).

A good strategy for choosing the threshold level $ \lambda $ is a dichotomic process. For instance, if the minimizer is a constant image, our algorithm requires exactly $ \log_2(L)$ binary optimizations to compute it (we suppose $ L$ is a power of two). The following table presents running times (in seconds) on a Pentium 4 at 3 GHz.

Image            $ \beta=1$  $ \beta=2$  $ \beta=3$
Lena (256x256)   0.37        0.54        0.72
Lena (512x512)   1.56        2.24        2.84
Woman (522x232)  0.53        0.70        1.03

Consequently, our algorithm is quasi-linear with respect to the number of pixels of the image and proportional to $ \log_2(L)$.
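The $ \log_2(L)$ factor comes from the dichotomy itself: each binary optimization halves the range of admissible grey levels, so a region whose minimizer is constant is pinned down after exactly $ \log_2(L)$ solves. A toy recursion illustrating the count (a counting sketch only, not the actual solver):

```python
def count_binary_solves(lo, hi):
    """Depth of the dichotomic scheme on the grey-level range [lo, hi]:
    the number of binary optimizations seen by any single pixel."""
    if lo == hi:
        return 0
    mid = (lo + hi) // 2
    # one binary solve decides, for each pixel, whether u_s <= mid;
    # each pixel then continues in exactly one of the two half-ranges
    return 1 + max(count_binary_solves(lo, mid),
                   count_binary_solves(mid + 1, hi))

assert count_binary_solves(0, 255) == 8   # log2(256) solves for L = 256
```

Since each pixel participates in at most $ \log_2(L)$ binary problems and each binary problem is roughly linear in the pixels it touches, the overall cost is quasi-linear in the image size times $ \log_2(L)$, matching the timings above.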


Some results of $ L^1 + TV$ on the image "woman", for different values of $ \beta$.


Some results for $ L^2$ + TV will be added soon...


We have presented an algorithm which computes an exact minimizer of the total variation under a convex data fidelity term. The method relies on decomposing the problem into binary ones using the level sets of the image. Moreover, the algorithm is quite fast. A comparison with other algorithms in terms of time complexity remains to be made. Extension of the method to other types of regularization is in progress.


References

[1] J. Darbon and M. Sigelle.
Exact Optimization of Discrete Constrained Total Variation Minimization Problems.
In Tenth International Workshop on Combinatorial Image Analysis (IWCIA 2004), LNCS vol. 3322, pages 540-549, 2004.

[2] J. Darbon and M. Sigelle.
A Fast and Exact Algorithm for Total Variation Minimization.
In 2nd Iberian Conference on Pattern Recognition and Image Analysis (IbPRIA 2005), LNCS series, to appear, 2005.

[3] L. Evans and R. Gariepy.
Measure Theory and Fine Properties of Functions.
CRC Press, 1992.

[4] F. Guichard and J.M. Morel.
Mathematical morphology "almost everywhere".
In Proceedings of ISMM, pages 293-303. Csiro Publishing, April 2002.

[5] H.T. Nguyen, M. Worring, and R. van den Boomgaard.
Watersnakes: Energy-driven watershed segmentation.
IEEE PAMI, 23(3):330-342, 2003.


DARBON Jerome 2005-02-07