Abstract: This paper presents a number of model-based interpolation schemes tailored to the problem of interpolating missing regions in image sequences. These missing regions may be of arbitrary size and of random, but known, location. The problem of locating the missing regions is discussed in another paper in this issue. This problem occurs regularly with archived film material. The film is abraded or obscured in patches, giving rise to bright and dark flashes, known as “dirt and sparkle” in the motion picture industry. Both 3-D autoregressive models and 3-D Markov random fields are considered in the formulation of the different reconstruction processes. The models act along motion directions estimated using a multiresolution block matching (BM) scheme. It is possible to address this sort of impulsive noise suppression problem with median filters, and comparisons with earlier work using multilevel median filters are performed. These comparisons demonstrate the higher reconstruction fidelity of the new interpolators.

I. INTRODUCTION

THE problem of missing data in image sequences occurs regularly in archived motion picture film as well as sequences from extremely high-speed film cameras. Particles caught in the film transport mechanism can damage the image information. The missing data regions manifest as “blotches” of random intensity in the sequence, called “dirt and sparkle” in the motion picture industry. The problem can be solved by using either a global filtering strategy or a detection/interpolation approach. The global filtering strategy suffers from the drawback that the treatment is not guaranteed to leave uncorrupted regions untouched. This paper, therefore, describes processes for interpolating missing areas in the image sequence after they have been flagged for treatment by some detection process. Various detection processes have already been described in [1] and [2]. In this paper, the SDIa detector (described in [1] and [2]) is used for examining the behavior of the interpolators in a real situation.

An important point is the size of the missing data being considered in this paper. Unlike typical impulsive noise suppression applications, it is possible for blotches on motion picture film to be larger than 20 x 20 pixels. A spatial median filtering operation thus becomes less effective in the center of such distortion primarily because it is then considering many missing pixels in its output. Of course, one could design a median filter that uses more intraframe information, and this is illustrated in the section on 3-D multilevel filters.

In addressing the issue of data reconstruction for image sequences, it is necessary to recognize that a fully 3-D operation would hold much more potential for greater image fidelity than a 2-D operation. Of course, the problem then arises about estimating motion, and it becomes important to acknowledge the errors that will occur in this estimation process. Therefore, with respect to reconstruction, a good algorithm would take advantage of both spatial and temporal information and be able to emphasize one or the other in spatially or temporally inhomogeneous¹ regions of the sequence.

Although it is true that one can formulate motion estimators that use the paradigms presented in this paper, we choose instead to use a simpler motion estimator: multiresolution block matching. This brings some element of practicality to the algorithms that will be discussed since there already exist block matching (BM) estimators on silicon, which conceivably could be incorporated into multiresolution schemes. The details of the motion estimation scheme used can be found in [1] and [2]. It is sufficient to note here that the multiresolution scheme is similar to the one used by Bierling [3], and the BM itself incorporates some explicit robustness to noise as discussed by Boyce [4].

A full reconstruction system would therefore involve first motion estimation, then detection of the missing regions (which have been characterized as temporal discontinuities in [1]), and, finally, reconstruction of the detected missing regions. The paper considers three interpolators that are each representative of a class of systems.

First, a 3-D multilevel median filter that is an extension of those introduced previously [5]-[7] is presented. Although strictly not an interpolator, this type of filtering operation yields acceptable results when used as part of a detector-controlled scheme. Turning the filter on and off as required limits the “fading” effect of the median operation to just the flagged sites, thus improving the overall quality of the resulting image when compared with a globally filtered one. Controlled median operations were also considered in [2] and [8].

Two “model-based” approaches are then described. The first employs a Markov random field (MRF) model of the image, and the second considers 3-D autoregressive² (3-DAR) models of the image. Both of these models attempt to account for intensity variation in the image, the first employing Gibbs distributions and Bayesian estimation strategies, whereas the second employs a more traditional linear prediction approach. The goal in using some image model for reconstruction is to be able to provide interpolated samples that smoothly blend with the rest of the data at the fringes of the blotch as well as to be

Manuscript received March 19, 1994; revised January 10, 1995. This work was supported by the British Library and Cable and Wireless PLC. The associate editor coordinating the review of this paper and approving it for publication was A. Murat Tekalp.
The authors are with the Signal Processing and Communications Laboratory, Department of Engineering, Cambridge University, Cambridge, UK.
IEEE Log Number 9414601.
¹ Inhomogeneous due to either nontrivial motion or erroneous motion estimation.
² Noncausal multidimensional autoregressive processes are considered here. Noncausal autoregressive processes are perhaps better referred to as noncausal minimum variance processes [9].
1510 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 4, NO. 11, NOVEMBER 1995
where the model is only over the missing regions, $\{i(\vec{r}) : d(\vec{r}) = 1\}$, with $d(\vec{r}) = 0$ indicating known data at position $\vec{r}$ and $d(\vec{r}) = 1$ indicating missing data. $\mathcal{N}_{\vec{r}}$ is the spatial neighborhood of pixel $\vec{r}$, $\mathcal{T}_{\vec{r}}$ is the temporal neighborhood, and $\lambda$ is the relative weight given to the temporal neighbors. $Z_1$ normalizes the distribution. The spatial neighborhoods used were the first- and second-order neighborhoods (four and eight nearest neighbors), and the temporal neighborhood comprised either one or five pixels from each of the previous and following frames.

The Gibbs sampler [15], [16] may be used directly with the distribution of (2) to form an interpolation. At each pixel of the missing region taken in turn, a new value is drawn from the

hence, variable and those which are of known, fixed values. Performing this integral results in

[Equation (7) is illegible in the source.] (7)
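The Gibbs sampling step described above can be illustrated with a minimal sketch. It assumes quadratic clique potentials, so that the conditional distribution at each missing pixel is Gaussian; the distribution of (2) is not reproduced here and may differ. It also assumes that the previous and following frames are already motion compensated and that the temporal neighborhood is a single pixel per frame. The names `gibbs_interpolate`, `beta`, and `sweeps` are hypothetical.

```python
import random

def gibbs_interpolate(prev, cur, nxt, missing, lam=1.0, beta=0.01, sweeps=50, seed=0):
    """Resample the pixels flagged in `missing` (assumed quadratic potentials)."""
    rng = random.Random(seed)
    h, w = len(cur), len(cur[0])
    out = [row[:] for row in cur]
    sites = [(y, x) for y in range(h) for x in range(w) if missing[y][x]]
    for _ in range(sweeps):
        for y, x in sites:
            # First-order spatial neighborhood: four nearest neighbors, clipped at borders.
            spatial = [out[yy][xx]
                       for yy, xx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                       if 0 <= yy < h and 0 <= xx < w]
            # Temporal neighborhood: one pixel from each adjacent (compensated) frame.
            temporal = [prev[y][x], nxt[y][x]]
            # For the energy beta * [sum_s (v - i_s)^2 + lam * sum_t (v - i_t)^2],
            # the conditional of v is Gaussian with the mean and deviation below.
            weight = len(spatial) + lam * len(temporal)
            mean = (sum(spatial) + lam * sum(temporal)) / weight
            sigma = (1.0 / (2.0 * beta * weight)) ** 0.5
            out[y][x] = rng.gauss(mean, sigma)
    return out

# Usage: a flat 3 x 3 patch with one corrupted pixel in the current frame.
prev = [[100.0] * 3 for _ in range(3)]
nxt = [[100.0] * 3 for _ in range(3)]
cur = [[100.0] * 3 for _ in range(3)]
cur[1][1] = 255.0
missing = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
restored = gibbs_interpolate(prev, cur, nxt, missing, lam=1.0, beta=100.0, sweeps=5)
```

Only the flagged sites are resampled, so known data are left untouched. A large `beta` concentrates each draw near the weighted neighbor mean, while a small `beta` gives noisier samples.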
KOKARAM et al.: INTERPOLATION OF MISSING DATA IN IMAGE SEQUENCES 1513
for motion; therefore, the motion parameter has been omitted from the 3-D AR model. Further, a 3-D trend (of the form $\alpha i + \beta j + \gamma k$) is subtracted from the data prior to modeling to improve the prediction [2], [25]. The least squares estimation of the trend coefficients is also weighted in an identical manner to that shown here and performed as a separate step.

Minimizing the squared error $[\epsilon_w(\vec{r})]^2$ with respect to the coefficients then yields the following set of $P + 1$ equations:

$$\sum_{k=0}^{P} a_k E\left[ (w(\vec{r}))^2 \, I(\vec{r} + \vec{q}_k) \, I(\vec{r} + \vec{q}_m) \right] = 0 \quad \text{for } m = 0 \ldots P \tag{14}$$

where $a_0 = 1.0$. Therefore, these equations may be written in matrix form as

$$C_w \mathbf{a} = -\mathbf{c}_w \tag{15}$$

where $C_w$ is a P x P matrix of correlation coefficients, and $\mathbf{c}_w$ is a P x 1 vector of correlation coefficients. Equation (15) is the weighted solution for the P model coefficients. The most obvious choice for the weighting function is a binary field set to 0 for all the blotch positions and 1 otherwise. This is found to be extremely effective in practice. Note that methods for optimal weighting are available; one of these is given in [26].

D. A Practical Consideration

It is necessary to choose a region of data around the detected missing region from which to estimate the AR coefficients that are then used to interpolate the missing data. For the purposes of this paper, this region was chosen to be a square area centered on the missing region such that the missing region occupied less than 10% of the data block. Of course, when the missing region is large enough to cover many statistically differing areas, the resulting coefficients do not describe the underlying model for the particular missing region well. In such cases, the interpolation is blurred. It would be better to use

III. COMPUTATIONAL LOAD

Since the interpolation processes described here are independent of the choice of motion estimator and blotch detector, the load of those processes is not considered here. See [1] for discussion of the computational load of various detectors.

All arithmetic operations, e.g., +, -, ABS, and <, were counted as costing one operation. The exponential function evaluation was taken as costing 20 operations, and inversion of an N x N matrix was assumed to be an $N^3$ process. Estimates for the number of operations per blotched pixel for the interpolators are as follows:

MMF = 160
3-DAR = 20000 (assuming a block size of 8 x 8 pixels, a 9:0 model, and a 10% rate of corruption)
MRF = 22 operations per iteration.

With regard to the MRF interpolator, about 1000 iterations were needed in the following experiments. The 3-DAR operation estimate is not independent of the rate or spatial layout of the corruption since the process involves the inversion of matrices ($A_u$), the sizes of which are a function of the number of spatially connected missing pixels in a considered block of data.

Fig. 2. Frame 23 of WESTERN, size 256 x 256.

IV. RESULTS AND DISCUSSION

There are two factors to be considered in discussing the performance of these interpolators. First of all, given some missing patch and errors in motion estimation due to these patches, how accurate is the reconstruction? Second, in a real situation, errors in motion estimation will yield subsequent errors in detection of missing patches; how robust is the interpolator to these errors? Of course, the ultimate performance of the interpolators would be observed when the missing patches have been correctly detected and the motion estimation process has not been adversely affected. However, this does not give a realistic assessment of performance, and results for this case are not illustrated here in the interest of brevity.

The sequence WESTERN1 (60 frames of 256 x 256) is used to demonstrate the performance of the interpolators on artificially corrupted data. The probability of distortion was 0.007, and the blotches were generated as outlined in the companion paper [1]. Motion estimation was performed using the corrupted frames with a multiresolution BM algorithm employing three resolution levels: 256 x 256, 128 x 128, and 64 x 64. The details of the parameters used for the motion-estimation process are not important; it is sufficient to note that all interpolators used the same motion vectors. Integer-accurate motion estimates were used. Fig. 2 shows a full-sized picture of frame 23 of the WESTERN sequence to give a feel for the image composition.

A. Known Distortion

Fig. 3 compares the performance of various interpolators on separate frames of WESTERN based on the mean squared error (MSE) between the interpolated missing regions and the
Fig. 3. MSE of various interpolators of known distortion. (Horizontal axis: Frame.)

Fig. 4. MSE of various interpolators operating on distortion detected using the SDIa. (Horizontal axis: Frame.)
original clean frames. The missing regions have been assumed to be correctly identified in this case. The graph shows that the 9:8 AR interpolator performs best overall, with the median operation being the worst and the MRF interpolator (using first-order cliques in a 1:4:1 neighborhood with four-pixel current frame support in a + configuration with $\lambda = 1$) striking some compromise between these extremes.

To illustrate this behavior, Fig. 5 shows a zoomed portion³ of three frames from the corrupted WESTERN sequence. The original (zoomed) frame 23 is shown as the bottom right hand image in Fig. 5. The missing regions (blotches) of interest have been boxed in white in the top right hand image (frame 23). Fig. 6 shows the results of interpolating the missing regions using a 9:8 AR model, the MRF interpolator and the ML3Dex median filter. The three boxed regions show a good overview of the compromises within each interpolator. The blotch over the ‘C’ is very well removed by the median filter. The AR interpolator does not perform as well here (although texture is reconstructed) because it is unable to reject the corrupted information in that same position in the previous frame (see Fig. 5). The MRF interpolator does not do as good a job of reconstructing texture as the AR process since the interpolated region above the ‘C’ is not textured at all. The fact that the median filter reconstructs the texture in this region well is more due to the fact that it rearranges existing surrounding samples and conserves the randomness of the background texture.

Visual results from the 9:0 model are not shown since it is clear that its performance is affected by the lack of spatial support in the current frame. In this respect, it is prone to the same problems affecting ML3Dex in that the quality of interpolations depends heavily on the integrity of the motion estimates.

³ Size (128 x 128).
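The error measure used in Fig. 3 and Fig. 4, the MSE between the interpolated missing regions and the original clean frames, can be sketched as follows. This is a minimal illustration that assumes the error is averaged over the flagged pixels only; the normalization actually used for the plots is not stated in the text. The function name `masked_mse` is hypothetical.

```python
def masked_mse(restored, clean, missing):
    """Mean squared error over the pixels flagged in `missing` (1 = missing)."""
    total = 0.0
    count = 0
    for r_row, c_row, m_row in zip(restored, clean, missing):
        for r, c, m in zip(r_row, c_row, m_row):
            if m:
                total += (r - c) ** 2
                count += 1
    # With no flagged pixels there is nothing to compare; report zero error.
    return total / count if count else 0.0

# Usage: two flagged pixels with errors of 2 and 4 gray levels.
restored = [[10, 20], [30, 40]]
clean = [[10, 22], [30, 44]]
missing = [[0, 1], [0, 1]]
error = masked_mse(restored, clean, missing)  # (4 + 16) / 2 = 10.0
```

Restricting the sum to flagged sites keeps the measure from being diluted by the large uncorrupted area of each frame.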
Fig. 5. Zoom on degraded frames 22, 23 (top left, right), 24 (bottom left) of WESTERN. Zoom on original frame 23 (bottom right).

Fig. 6. Zoom on restored frame 23 using MRF (top left), 9:8 AR (top right), ML3Dex (bottom left). Original frame 23 (bottom right).

Fig. 7. Degraded frames 44, 45 (top left, right), 46 (bottom left) of WESTERN. Bottom right: Detection on frame 45 using SDIa indicated as bright white pixels.

Fig. 8. Restored frame 45 using MRF, AR 9:8 (top left, right), ML3Dex and original frame 45 (bottom left, right).
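For reference, the weighted normal equations (15) used to estimate the AR coefficients can be sketched in one dimension. This is a toy illustration, not the 3-D implementation used in the paper: sample sums replace the expectations, the support `offsets` and the helper names are hypothetical, and, as an additional assumption of this sketch, the weight is set to zero not only at the blotch position but also at positions whose support overlaps the blotch, which keeps the toy example exact.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def weighted_ar_fit(signal, weights, offsets):
    """Return a_1..a_P solving C_w a = -c_w for the model
    e[n] = x[n] + sum_k a_k x[n + offsets[k]]."""
    P = len(offsets)
    start = max(0, -min(offsets))
    stop = len(signal) - max(0, max(offsets))
    C = [[0.0] * P for _ in range(P)]
    c = [0.0] * P
    for n in range(start, stop):
        w2 = weights[n] ** 2
        for m in range(P):
            xm = signal[n + offsets[m]]
            c[m] += w2 * signal[n] * xm
            for k in range(P):
                C[m][k] += w2 * signal[n + offsets[k]] * xm
    return solve_linear(C, [-v for v in c])

# Toy data: x[n] = 0.9 x[n-1] - 0.5 x[n-2], so the true coefficients are [-0.9, 0.5].
signal = [1.0, 0.5]
for n in range(2, 40):
    signal.append(0.9 * signal[n - 1] - 0.5 * signal[n - 2])
signal[20] = 50.0  # simulated blotch
weights = [1.0] * len(signal)
for n in (20, 21, 22):  # blotch position plus positions whose support touches it
    weights[n] = 0.0
coeffs = weighted_ar_fit(signal, weights, offsets=(-1, -2))
```

With unit weights everywhere, the corrupted sample would bias the estimate heavily; zeroing the weights removes its contribution to both $C_w$ and $\mathbf{c}_w$.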
V. REAL MOVIES
Two outstanding considerations remain with respect to real
degradation in typical motion picture film. First of all, unlike
the artificial case, blotches do not have sharp edges; therefore,
it is typical for a simple detector like the SDIa to be unable to
detect the periphery of a blotch. As a result, the interpolation
process usually cannot remove the entire defect and in the
AR case often replaces the missing data with data that has
the intensity of the undetected blotch periphery. One solution
to this problem is to examine the image data in the region

Fig. 10. Frame 2 of FRANK with large blotches boxed.

Fig. 12. Detection on frame 2 of FRANK. White: both fractional and integer motion estimation. Green: additional flagged by fractional estimation. Red: additional flagged by integer estimation.

Fig. 14. Restored frame 2 using MRF.
Figs. 13-15 show interpolations of the missing data using a 9:8 AR model, MRF, and ML3Dex system, respectively. The MRF system used cliques in a 5:8:5 neighborhood with $\lambda = 2$. The five pixels used in the previous frame were arranged in a + configuration. The interpolated locations were flagged by the SDIa using fractional motion estimates. Note again how well all the systems perform where there is little textural detail. However, the blotch in the head of the figure is best interpolated by the AR system, with the MRF being somewhat blurred and the median filter giving a generally flat intensity. Again, the classic motion-estimation problem arises in the petals of the flower in the picture. It is very difficult for any motion estimation algorithm to track the almost random flutterings of one of the petals; therefore, it is partially flagged as a blotch. The performance of the interpolators in this region is worse than in other areas.

Subjective Assessment: A series of sequences with various artificial and real degradations has been processed. Informal subjective assessment of the restored sequences displayed at 25 frames/s (UK PAL television standard) was performed. It is found, in general, that it is difficult to determine any major difference in quality between the restorations at this frame rate. A closer examination allows the observer to rank the restorations in the order 3-DAR, MRF, and MMF. The AR process is more robust to motion-estimation errors and generally gives the smoothest interpolation. The MMF often
REFERENCES

R. R. Schultz and R. L. Stevenson, “A Bayesian approach to image expansion for improved definition,” IEEE Trans. Image Processing, vol. 3, no. 4, pp. 233-242, May 1994.
S. V. Vaseghi, “Algorithms for the restoration of archived gramophone recordings,” Ph.D. thesis, Cambridge Univ., Cambridge, UK, 1988.
P. Strobach, “Quadtree-structured linear prediction models for image sequence processing,” IEEE Patt. Anal. Machine Intell., vol. 11, pp. 742-747, July 1989.
S. Efstratiadis and A. Katsaggelos, “A model based, pel-recursive motion estimation algorithm,” in Proc. IEEE ICASSP, 1990, pp. 1973-1976.
R. Veldhuis, Restoration of Lost Samples in Digital Signals. Englewood Cliffs, NJ: Prentice Hall, 1980.
E. DiClaudio, G. Orlandi, F. Piazza, and A. Uncini, “Optimal weighted LS AR estimation in presence of impulsive noise,” in Proc. IEEE ICASSP, vol. E3.8, 1991, pp. 3149-3152.
J. S. Lim, Two-Dimensional Signal and Image Processing. Englewood Cliffs, NJ: Prentice-Hall, 1990.
R. Schalkoff, Digital Image Processing and Computer Vision. New York: Wiley, 1989.
B. Girod, “Motion-compensating prediction with fractional-pel accuracy,” IEEE Trans. Commun., vol. 41, pp. 604-612, 1993.

Anil C. Kokaram (S‘91-M‘92), for a photograph and biography, see this issue, p. 1508.
Robin D. Morris, for a photograph and biography, see this issue, p. 1508.
William J. Fitzgerald, for a photograph and biography, see this issue, p. 1508.
Peter J. W. Rayner, for a photograph and biography, see this issue, p. 1508.