Back to my homepage
Jerome Darbon Total Variation Minimization with Data Fidelity as a Contrast Invariant Filter

Total Variation Minimization with $ L^1$ Data Fidelity as a Contrast Invariant Filter

Jérôme Darbon
EPITA Research and Development Laboratory (LRDE)
14-16 rue Voltaire F-94276 Le Kremlin-Bicêtre,France


This paper sheds new light on minimization of the total variation under the $ L^1$ -norm as data fidelity term ($ L^1+TV$ ) and its link with mathematical morphology. It is well known that morphological filters feature the property of being invariant with respect to any change of contrast. First, we show that minimization of $ L^1+TV$ yields a self-dual and contrast invariant filter. Then, we further constrain the minimization process by only optimizing the grey levels of level sets of the image while keeping their boundaries fixed. This new constraint is maintained thanks to the Fast Level Set Transform which yields a complete representation of the image as a tree. We show that this filter can be expressed as a Markov Random Field on this tree. Finally, we present some results which demonstrate that these new filters can be particularly useful as a pre-processing stage before segmentation.


Minimization of total variation (TV) under $ L^2$ data fidelity is a popular image restauration technique [24]. It is well-known that solutions live in the space of functions of bounded variation which allows for sharp boundaries [13]. However, in [7], Chan et al note that the minimizer generally presents a loss of contrast (this is mainly due to the use of $ L^2$ fidelity). As a consequence they propose to replace it by $ L^1$ . In this paper, we consider the minimization of TV under $ L^1$ fidelity. We present new theoretical results (to our knowledge) of minimizers of this functional along with some new filters.

Suppose an image is defined on a rectangle $ \Omega$ of $ \bbbr^2$ , we are interested in minimizing the following energy:

$\displaystyle E_v(u) = \int_\Omega \left\vert u(x)-v(x) \right\vert dx + \beta \int_\Omega \left\vert \nabla u\right\vert \enspace,$ (1)

where last term is the total variation of $ u$ weighted by a coefficient $ \beta$ . Note that the gradient is taken in the distributional sense. Since this energy is not strictly convex, the uniqueness of global minimizer cannot be assured. The use of $ L^1$ fidelity has already been studied by some authors. In [1,2,3], Alliney restricts his studies to the one dimensional case and to the discrete case. He provides an algorithm which converges towards a local minimum thanks to recursive median filters. In [21,22,23], Nikolova studies functionals with non-smooth priors and fidelity terms (including the one considered in this paper) and presents very good results for image denoising. In [22], she reports that the minimization of TV under a $ L^1$ -norm constraint yields an image where some pixels do not change their gray-levels. In [7], Chan et al. directly address the continuous problem. They show that the data energy is not continuous with respect to minimizers. In [8], the authors use (1) to get solutions for some non-convex minimization problems. In [4], Bouman et al. study generalized Gaussian: The considered functional is a special case of this study.

In [18], it is stated that a square cannot arise as a minimizer of (1). As a consequence, in [11,12], Dibos et al. propose an iterative algorithm to minimize TV where only gray-levels of level sets are allowed to change while their shapes remain unchanged. Their algorithm relies on the topographic map [6] (also known as FLST-tree) which is a contrast invariant representation of the image and a tree. This tree can be efficiently built using the Fast Level Sets Transform (FLST) described in [20].

In this paper, we show that minimization of (1) yields a self-dual and contrast invariant filter. Then we follow the ideas of Dibos et al. by minimizing (1) under the supplementary constraint that shapes can only change their gray-levels. The rest of this paper is as follows. In section 2, we show that TV minimization under $ L^1$ fidelity is a self-dual and contrast invariant filter. We briefly present the FLST-tree in section 3. In section 4 we reformulate $ L^1+TV$ minimization on the FLST-tree and show that it corresponds to a Markov Random Field (MRF). A fast and exact minimization algorithm and some results are presented in section  % latex2html id marker 1046
$ \ref{sec.results}$ . Finally we draw some conclusions in section % latex2html id marker 1048
$ \ref{sec.conclusion}$ .

$ L^1+TV$ as a contrast invariant filter

In this section we show that the minimization of the total variation under the $ L^1$ -norm as a data fildelity term yields a contrast invariant filter.


We first introduce the notion of level sets and change of contrast.

Definition -sets-definition   The lower and upper level sets of an image, referred to as $ L_\lambda$ and $ U ^\lambda$ respectively, are defined as follows:

$\displaystyle L_\lambda(u) = \{x, u(x) \leq \lambda\}$


$\displaystyle U ^\lambda(u) = \{x,
u(x) \geq \lambda\}\enspace.$

Lower level sets of $ u$ will also be referred to as $ u ^\lambda$ for convenience.

Note that decompositions into level sets is sufficient to reconstruct the image:

$\displaystyle u(x) = \inf\{\lambda, L_\lambda(u)(x)\} = \sup\{\lambda, U^\lambda(u)(x)\}$ (2)

We define a continous change of contrast as follows [14]:

Definition -contrast-definition   Any continuous non-decreasing function is called a continuous change of contrast.

We now introduce a lemma proved in [15].

Lemma -contrast-th   Assume $ g$ to be a continuous change of contrast and u a real function defined on $ \Omega$ . The following holds:

$\displaystyle \forall \lambda   \exists \mu     L_\lambda(g(u)) = L_\mu(u) \enspace .

The same property holds for upper level sets.

In other words, after a continuous change of contrast, the level sets of an image $ g(v)$ are some level sets of the image $ v$ .

Reformulation through level sets

We now reformulate (1) using the level sets of an image as described in [7,9]. The coarea formula states that for any function which belongs to the space of functions of bounded variation, we have for almost all $ \lambda$

$\displaystyle \int_\Omega \left\vert\nabla u\right\vert = \int_\bbbr \int_\Omeg...
...{u ^\lambda} \right\vert d\lambda = \int_\bbbr P(u^\lambda) d\lambda \enspace ,$ (3)

where $ P(E)$ and $ \chi_E$ stand for the perimeter and the characteristic of the set $ E$ , respectively. $ L^1$ fidelity can be rewritten as follows:

$\displaystyle \int_\Omega \left\vert u(x) - v(x)\right\vert \enspace dx =
...\int_\Omega \left\vert u ^\lambda(x) - v ^\lambda(x) \right\vert
dx \enspace .

Using previous equalities, equation (1) rewrites as follows:

$\displaystyle E_v(u) = \int_{\bbbr} E^\lambda_v(u ^\lambda, v^\lambda)   ,   where$ (4)

$\displaystyle E ^\lambda_v(u^\lambda, v^\lambda) = \int_{\Omega} \left( \beta
... \right\vert + \left\vert u ^\lambda(x) - v
^\lambda(x) \right\vert dx \right) $

It is important to note that any term $ E ^\lambda_v(u^\lambda, v^\lambda)$ involves thresholded images $ u ^\lambda$ and $ v ^\lambda$ only. In other words, this decomposition rewrites (1) as a summation on all gray-levels of quantities. Moreover, in [9,10], it is shown that one can independently optimizes every $ E
^\lambda_v(\cdot, v^\lambda)$ in order to minimize (1), and then to reconstruct the minimizer using (2). We are now ready to show that minimization of $ L^1+TV$ yields a contrast invariant filter.

Theorem -contrast-theorem   Let $ v$ be an observed image and $ g$ be a continuous change of contrast. Assume $ u$ to be a global minimizer of $ E_v(\cdot)$ . Then $ g(u)$ is a global minimizer of $ E_{g(v)}(\cdot)$ .

Proof: It is sufficient to prove that for any level $ \lambda$ , a minimizer for $ g(v)^\lambda$ is $ g(u)^\lambda$ . Using lemma 1, there exists $ \mu$ such that $ v^\mu = g(v)^\lambda$ . A minimizer of $ E ^\mu_v(\cdot, v^\mu)$ is $ u ^\mu$ . Thus, $ u ^\mu$ is a minimizer of $ E ^\mu_v(\cdot,
g(v)^\lambda)$ . And we have $ u ^\mu = g(u) ^\lambda$ . This concludes the proof. $ \square$

Self-dual invariance is easily obtained.

Theorem -contrast-theorem   Let $ v$ be an observed image and assume $ u$ is a minimizer of $ E_v(\cdot)$ , then $ -u$ minimizes $ E_{-v}(\cdot)$ .

Proof: It is enough to note that $ \int_\Omega \vert\nabla u\vert = \int_\Omega \vert\nabla(-u)\vert$ and that $ \int_\Omega \vert u(x) - v(x)\vert dx = \int_\Omega \vert(-u(x)) - (-v(x))\vert
dx$ . The conclusion is straightforward. $ \square$

These results of contrast invariance and self-duality are new to our knowledge.

Fast Level Set Transform

In this section we briefly review the topographic map[6] which is built using the Fast Level Sets Transform. We refer the reader to [19,20] for a complete presentation.

The topographic map relies on simple inclusions of level sets. The family of lower and upper level sets are respectively increasing and decreasing, i.e,

$\displaystyle L_\lambda(u) \subset L_\mu(u)    \forall \lambda \leq \mu   ,

$\displaystyle U ^\lambda(u) \subset U ^\mu(u)    \forall \lambda \geq \mu   ,

These inclusion properties allow for a non-redundant and complete representation of an image into a tree, which will be referred to as FLST-tree for the rest of this paper. Basically, we consider connected components of level sets whose holes have been filled, referred to as shapes. A node of the FLST-tree corresponds to a shape. The parent of a node is the smallest shape that contains it while descendants are all shapes included into it. Because the FLST-tree considers both lower and upper level sets for inclusion, each shape is tagged to know if it comes from a lower or upper level set. Figure 1 depicts such a tree on a simple example. Thus, this tree decomposes an image $ u$ into shapes $ S_1,...,S_N$ , where $ S_N$ is the root of the tree.

Many attributes associated to each shape can be computed. For the remaining of this paper we only consider attributes required to rewrite $ L^1+TV$ on the FLST-tree. Thus we need for each shape $ S_i$ the following attributes:

The gray level value of the parent of the shape $ S_i$ will be referred to as $ v^p_i$ . Of course the parent of the root is not defined.

Figure 1: A simple image and its corresponding FLST-tree. Inferior and superior shapes are denoted by inf and sup respectively.
Image flst

Finally, note that the definition of the FLST-tree only involves level sets. Thus two images which differ only by a change of constrast give the structure of the tree (only gray level of shapes differs). The same property holds for self duality.

$ L^1+TV$ on the FLST-tree

In this section we consider the FLST-tree of the observed image $ v$ . Since, this tree is a complete representation of the image, we use it to reformulate the functional (1). Thus, we will minimize $ L^1+TV$ under the following new constraint: only gray level values $ v_i$ of shapes are allowed to change. Boundaries of shapes do not move. In what follows, original and minimizer gray level value of a shape $ S_i$ will be referred to as $ v_i$ and $ {u_i}$ respectively.

We first reformulate fidelity terms and the total variation term. Then we show that it is a Markov Random Field.

Reformulation on the FLTS-tree

The idea relies on defining a partition of $ \Omega$ obtained using the FLST-tree. Such a partition is obtained as follows: for each shape $ S_i$ we have kept pixels which are in $ S_i$ but not in its descendants. Recall that such a set is denoted by $ D_i$ . As a consequence the family $ \{D_i\}$ is a partition of $ \Omega$ . Therefore data fidelity term can be rewritten as follows:

$\displaystyle \sum_{i=1}^{N} \vert D_i\vert \vert u_i - v_i\vert.

Using the shapes provided by the FLST-tree and the coarea formula (3), we rewrite the total variation as follows:

$\displaystyle \sum_{i=1}^{N-1} P_i \vert u_i - u_i^p\vert.

It only depends on the difference between the gray values of a shape and its parent weighted by the perimeter of the considered shape.

Markov Random Field

As a consequence, we are interested in finding a family $ v_0...v_i$ which minimizes the following energy:

$\displaystyle \sum_{i=1}^{N} \vert D_i\vert \vert u_i - u_i\vert + {\beta} \sum_{i=1}^{N-1} P_i \vert u_i - u_i^p\vert.$ (5)

As one can see, this is the Bayesian labelling of a Markov Random Field where pairwise interactions are considered. More precisely, sites of the random field are the nodes of the FLST-tree whose labels are the gray level values of the shapes. Neighborhoods are defined by children and parents. Cliques of order 2 are considered. One can easily adapt theorem 1 and 2 to show that this minimization yields a self-dual and contrast invariant filter. The only difference is that $ \Omega$ is now the nodes of the FLST-tree instead of a rectangle of $ \bbbr^2$ . Thus the integral over $ \Omega$ is replaced by a summation on nodes of the FLST-tree.

Minimization algorithm and results

We briefly present an exact optimization algorithm for minimizing (5) and then present some results.

Minimization Algorithms

We are interested in minimizing exactly energy (5). Many algorithms are available to minimize it since this energy is convex. However this energy is not differentiable because of the use of absolute norm. Thus is it quite difficult to get an exact minimizer using gradient descent. Note that one could take benefit from the tree structure for this minimization, and use a Viterbi-like algorithm as described in [25]. However, this is intractable because some nodes of the FLST-tree have too many children (typically more than 200).

We used the algorithm described in [10] which provides an exact global minimizer of energy (5) using a divide and conquer approach. Basically it uses the formulation in terms of level sets given in (4). Each term $ E ^\lambda_v(u^\lambda, v^\lambda)$ is discretized and it yields a MRF where only binary variables are involved. Exact solutions of these binary MRF are found thanks to minimum-cut [16]. It is also shown that if two neighboring pixels at a level $ \lambda$ are different (i.e one is above $ \lambda$ and the other below), then they do not interact with each other anymore. Thus optimizations involving these pixels can be performed independently. This algorithm requires $ log_2 L$ minimum cuts where $ L$ is the number of gray levels (typically $ L=2^8$ ). In practice, time required to perform a minimum cut is quasi-linear with respect to the number of pixels [5].

We used the implementation of the FLST algorithm available in the Megawave image processing library [17]. Time results (in seconds on a 3GHz Pentium 4) for the well-known lena image and for the two images depicted in figures 2 and 3 are presented in table 1.

Table 1: Time results (in seconds)for our filtering. Time needed to build the FLST-tree and to perform the minimization are presented in the second and the third column respectively.
Image FLST Minimization
Lena (256x256) 0.18 0.11
Lena (512x512) 1.09 1.04
Woman (522x232) 0.39 0.06
Squirrel (209x288) 0.24 0.19

As one can see, this algorithm is fast.


We demonstrate on some examples that total variation minimization with $ L^1$ data fidelity and constrained by the FLST-tree can be a good image simplification filter. In particular, experimental results show that it can be very useful as a pre-processing step for segmentation.

Figure 2 depicts minimizers for the image "woman" with different regularization coefficients $ \beta$ . As one can see, the higher $ \beta$ is the more the image is simplified. The background tends to become homogeneous and face details are removed, while the contrast is not lost. This is mainly due to the morphological behavior of the filter. Results suggest than these images can be used as good initial starts for a segmentation algorithm. Finally, only few regions remain when a very high $ \beta$ is used. Level lines of such a minimizer are depicted in Figure 2 superimposed on the original image. Note that the boundaries do not move. This result shows that this filter behaves correctly even in the limit of strong weighted coefficient.

Figure 3 depicts minimizers for a textured image. On this image, the higher $ \beta$ is the more the texture is removed. However, the background and the squirrel almost do not merge. They merge where the tail has a hole and where the tail is dark due to shading. Except from these two areas, results are very good. Note how well the separation between the two textured regions (background/squirrel) is, using a high $ \beta$ . This latter result could be a segmentation result itself except for the two problems mentioned above, and also because the eye, being too small, is missed.

Figure 2: $ L^1+TV$ minimization on the FLST results for image "woman". Original image is depicted in (a). Minimizers for $ \beta =3$ and $ \beta =15$ are shown on (b) and (c) respectively. Image (d) depicts the level lines (in white) of the minimizer for $ \beta =30$ superimposed on an attenuated version of the original image.








Figure 3: Illustration of $ L^1+TV$ minimization on the FLST-tree for a textured image. Original image is depicted in (a). Minimizers for $ \beta =1$ and $ \beta =2$ are shown on (b) and (c) respectively. Image (d) depicts the level lines (in white) of the minimizer for $ \beta =10$ superimposed on the original image.
\centerline{(d) }\medskip\end{figure}


In this paper we have shown that minimizing the total variation under the $ L^1$ norm as data fidelity yields a self-dual and contrast invariant filter. We have added a constraint such that boundaries of level sets cannot move. Experiments have shown that this filter behaves particularly well in order to simplify an image. This is mainly due to its morphological behavior. Moreover an efficient algorithm is available to perform the minimization.

Further extensions are considered. We could use other morphological trees like Min/Max trees as proposed by Salembier et al. in [26]. We are currently working on formalizing connected attribute opening/closing as a minimization process on these trees using our approach. Finally, extension of this method to vectorial self-dual mathematical morphology will be presented in a forthcoming paper.


The author would like to thank Marc Sigelle (ENST / LTCI CNRS UMR 5141) and Didier Verna (LRDE) for their proofreadings.


S. Alliney.
Digital filters as absolute norms minimizers.
IEEE Transactions on Signal Processing, 40(6):1548-1562, 1992.

S. Alliney.
An algorithm for the minimization of mixed $ l^1$ and $ l^2$ norms with application to bayesian estimation.
IEEE Transactions on Signal Processing, 42(3):618-627, 1994.

S. Alliney.
A property of the minimum vectors of a regularizing functional defined by means of the absolute norm.
IEEE Transactions on Signal Processing, 45(4):913-917, 1997.

C. Bouman and K. Sauer.
A generalized gaussian image model for edge-preserving map estimation.
IEEE Transactions on Signal Processing, 2(3):296-310, july 1993.

Y. Boykov and V. Kolmogorov.
An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(9):1124-1137, 2004.

V. Caselles, B. Coll, and J. Morel.
Topographic maps and local contrast changes in natural images.
International Journal of Computer Vision, 33(1):5-27, 1999.

T. Chan and S. Esedoglu.
Aspect of total variation regularized $ l^1$ function approximation.
Technical Report 7, UCLA, 2004.

T. Chan, S. Esedoglu, and M. Nikolova.
Algorithms for Finding Global Minimizers of Image Segmentation and Denoising Models.
Technical report, UCLA, Sept. 2004.

J. Darbon and M. Sigelle.
Exact Optimization of Discrete Constrained Total Variation Minimization Problems.
In Tenth International Workshop on Combinatorial Image Analysis, volume 3322 of Lecture Notes in Computer Science, pages 540-549, 2004.

J. Darbon and M. Sigelle.
A Fast and Exact Algorithm for Total Variation Minimization.
In In the proceedings of the second Iberian Conference on Pattern Recognition and Image Analysis (IbPria 2005), volume 3522 of Lecture Notes in Computer Science, pages 351-359, 2005.

F. Dibos and G. Koepfler.
Total variation minimization by the fast level sets transform.
In IEEE Workshop on Variational and Level Sets Methods, pages 179-185, 2001.

F. Dibos, G. Koepfler, and P. Monasse.
Geometric Level Sets Methods in Imaging, Vision, and Graphics, chapter 7- Total Variation Minimization for Scalar/Vector Regularization, pages 121-140.
Springer-Verlag, 2003.

L. Evans and R. Gariepy.
Measure Theory and Fine Properties of Functions.
CRC Press, 1992.

F. Guichard and J. Morel.
Mathematical morphology "almost everywhere".
In Proceedings of ISMM, pages 293-303. Csiro Publishing, April 2002.

F. Guichard and J.-M. Morel.
Image Iterative Smoothing and PDE s.
downloadable manuscript : please write email to, 2000.

V. Kolmogorov and R. Zabih.
What energy can be minimized via graph cuts?
IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(2):147-159, 2004.

image processing software available at

Y. Meyer.
Oscillating patterns in image processing and nonlinear evolution equations.
University Lecture Series, 22, 2001.

P. Monasse.
Constrast Invariant Representation of Digital Images and Application to Registration.
PhD thesis, Ceremade, University Paris 9-Dauphine, 2000.

P. Monasse and F. Guichard.
Fast computation of a contrast-invariant representation.
IEEE Transactions on Image Processing, 9(5):860-872, 2000.

M. Nikolova.
Local strong homogeneity of a regularized extimator.
SIAM Journal on Applied Mathematics, 61:633-658, 2000.

M. Nikolova.
Minimizers of cost-functions involving nonsmooth data-fidelity terms.
SIAM J. Num. Anal., 40(3):965-994, 2002.

M. Nikolova.
A variational approach to remove outliers and impulse noise.
Journal of Mathematical Imaging and Vision, 20:99-120, 2004.

L. Rudin, S. Osher, and E. Fatemi.
Nonlinear total variation based noise removal algorithms.
Physica D., 60:259-268, 1992.

P. Salambier and L. Garrido.
Binary partition tree as an efficient representation for image processing, segmentation, and information retrieval.
IEEE Transactions on Image Processing, 9(4):561-576, 2000.

P. Salambier, A. Oliveras, and L. Garrido.
Antiextensive connected operators for image and sequence processing.
IEEE Transactions on Image Processing, 7(4):555-570, April 1998.

About this document ...

Total Variation Minimization with $ L^1$ Data Fidelity as a Contrast Invariant Filter

This document was generated using the LaTeX2HTML translator Version 2002-2-1 (1.71)

Copyright © 1993, 1994, 1995, 1996, Nikos Drakos, Computer Based Learning Unit, University of Leeds.
Copyright © 1997, 1998, 1999, Ross Moore, Mathematics Department, Macquarie University, Sydney.

The command line arguments were:
latex2html -split=1 article.tex

The translation was initiated by Jérôme Darbon on 2006-02-22

Back to my homepage
Jerome Darbon Jérôme Darbon 2006-02-22