You are on page 1of 11

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 4. NO.

I I , NOVEMBER 1995 1509

Interpolation of Missing Data in Image Sequences


Ani1 C. Kokaram, Member, IEEE, Robin D. Morris, William J. Fitzgerald, and Peter J. W. Rayner

Abstract-This paper presents a number of model based inter- would hold much more potential for greater image fidelity
polation schemes tailored to the problem of interpolating missing than a 2-D operation. Of course, the problem then arises about
regions in image sequences. These missing regions may be of estimating motion, and it becomes important to acknowledge
arbitrary size and of random, but known, location. The problem
of locating the missing regions is discussed in another paper the errors that will occur in this estimation process. Therefore,
in this issue. This problem occurs regularly with archived film with respect to reconstruction, a good algorithm would take
material. The film is abraded or obscured in patches, giving rise advantage of both spatial and temporal information and be
to bright and dark flashes, known as “dirt and sparkle” in the able to emphasize one or the other in spatially or temporally
motion picture industry. Both 3-D autoregressive models and 3-D inhomogeneous’ regions of the sequence.
Markov random fields are considered in the formulation of the
different reconstruction processes. The models act along motion Although it is true that one can formulate motion estimators
directions estimated using a multiresolution block matching (BM) that use the paradigms presented in this paper, we choose
scheme. It is possible to address this sort of impulsive noise instead, to use a simpler motion estimator-multiresolution
suppression problem with median filters, and comparisons with block matching. This brings some element of practicality to
earlier work using multilevel median filters are performed. These the algorithms that will be discussed since there already exist
comparisons demonstrate the higher reconstructionfidelity of the
new interpolators. block matching (BM) estimators on silicon, which conceiv-
ably could be incorporated into multiresolution schemes. The
I. INTRODUCTION details of the motion estimation scheme used can be found in
[I] and [ 2 ] .It is sufficient to note here that the multiresolution
T HE problem of missing data in image sequences occurs
regularly in archived motion picture film as well as
sequences from extremely high-speed film cameras. Parti-
scheme is similar to the one used by Bierling [3], and the
BM itself incorporates some explicit robustness to noise as
cles caught in the film transport mechanism can damage discussed by Boyce [41.
the image information. The missing data regions manifest as A full reconstruction system would therefore involve first
“blotches” of random intensity in the sequence, called “dirt motion estimation, then detection of the missing regions
and sparkle” in the motion picture industry. The problem (which have been characterized as temporal discontinuities
can be solved by using either a global filtering strategy or a in [l]), and, finally, reconstruction of the detected missing
detection/interpolation approach. The global filtering strategy regions. The paper considers three interpolators that are each
suffers from the drawback that the treatment is not guaranteed representative of a class of systems.
to leave uncorrupted regions untouched. This paper, therefore, First, a 3-D multilevel median filter that is an extension
describes processes for interpolating missing areas in the of those introduced previously [5]-[7]is presented. Although
image sequence after they have been flagged for treatment strictly not an interpolator, this type of filtering operation
by some detection process. Various detection processes have yields acceptable results when used as part of a detector
already been described in [ l ] and [2]. In this paper, the SDIa controlled scheme. Turning the filter on and off as required
detector (described in [ 1J and [2]) is used for examining the limits the “fading” effect of the median operation to just the
behavior of the interpolators in a real situation. flagged sites, thus improving the overall quality of the resulting
An important point is the size of the missing data being image when compared with a globally filtered one. Controlled
considered in this paper. Unlike typical impulsive noise sup- median operations were also considered in [2] and [8].
pression applications, it is possible for blotches on motion Two “model-based” approaches are then described. The first
picture film to be larger than 20 x 20 pixels. A spatial median employs a Markov random field (MRF) model of the image,
filtering operation thus becomes less effective in the center of and the second considers 3-D autoregressive’ (3-DAR) models
such distortion primarily because it is then considering many of the image. Both of these models attempt to account for
missing pixels in its output. Of course, one could design a intensity variation in the image, the first employing Gibbs
median filter that uses more intraframe information, and this distributions and Bayesian estimation strategies, whereas the
is illustrated in the section on 3-D multilevel filters. second employs a more traditional linear prediction approach.
In addressing the issue of data reconstruction for image se- The goal in using some image model for reconstruction is to be
quences, it is necessary to recognize that a fully 3-D operation able to provide interpolated samples that smoothly blend with
the rest of the data at the fringes of the blotch as well as to be
Manuscript received March 19, 1994; revised January 10, 1995. This work
was supported by the British Library and Cable and Wireless PLC. The
associate editor coordinating the review of this paper and approving it for ’Inhomogeneous due to either nontrivial motion or erroneous motion
publication was A. Murat Tekalp. estimation.
The authors are with the Signal Processing and Communications Labora- Noncausal multidimensional autoregressive processes are considered here.
tory, Department of Engineering, Cambridge University, Cambridge, UK. Noncausal autoregressive process are perhaps better referred to as noncausal
IEEE Log Number 9414601. minimum variance processes [9].

1057-7149/95$04.00 0 1995 IEEE

Authorized licensed use limited to: BIBLIOTECA D'AREA SCIENTIFICO TECNOLOGICA ROMA 3. Downloaded on October 8, 2009 at 04:32 from IEEE Xplore. Restrictions apply.
1510 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 4, NO. 11, NOVEMBER 1995

able to preserve detail such as edges across the blotch. This is


particularly useful considering the large size of blotches that
can occur.
Of course, no blotch detector is perfect, especially if, as in
[I], the only criterion for detection is a region of temporal w1 w2
discontinuity. Therefore, one expects false alarms, and the
interpolators presented in the rest of this paper are quite robust
to such problems. However, in areas of nontrivial motion,
such as rapid occlusion and uncovering, it is difficult to
w3 w4 w5
interpolate useful data unless perhaps more frames are used
for motion estimatiodinterpolation. The algorithms discussed Fig. I. Subfilter masks used for the new MMF: ML3-Dex.
here operate on three frames only but can be extended in the is similar to that of Alp et al. [7] and is called ML3Dex for
manner of [IO] and [ 111 to consider more frames. extended ML3D. The output of the filter is defined as follows:
The remainder of this paper first presents the various inter-
polators and then compares their action on known distortion in
21 = median[W~] 1 5 I25
an image sequence. Finally, examples of complete restorations, ML3Dex Filter output = median[zl,2 2 , 2 3 , 2 4 , 2 5 1 . (1)
i.e., motion estimation followed by detection and reconstruc- Two additional windows have been incorporated that con-
tion, are given to show the applicability of these interpolators tain minimal information from the current frame and extensive
to the problem of removing “dirt and sparkle” from image information from the outer frames. Consider the situation with
sequences. a large blotch covering all of the center 3 x 3 pixels. It can
11. THE INTERPOLATORS be seen that although the windows W3 and W4 would output
Blotch values inside the degraded area, the three windows
A. A 3-0Multilevel Median Filter W1, W2,W5 would still be able to yield a correction using
Median filters are not usually regarded as interpolators, but image data. In other words, provided that the blotch does not
in this case, it is clear that the median operation is quite ap- occur at the same position in the three frames, the medians
propriate both because of its reported robustness as well as its of windows W1, W2,W5 are not dominated by scratch data.
success in impulsive noise suppression for images as a whole. Furthermore, the additional information improves the scratch
Alp [7] and Arce [ 5 ] , [6] have both previously introduced rejection capacity of the filter. This is accompanied of course
3-D multilevel median filter (MMF) structures for removing by a subsequent loss of detail preservation when compared to
impulsive noise. The structures were introduced without a the Alp et al. or Arce filters.
motion-compensated implementation and without detection of B. MRF-Based Interpolators
impulses. Both Huang [12] and Martinez [13] implemented
To interpolate very large regions of missing data in an
a three-tap motion-compensated median operation with good
isolated image would require a complex, adaptive model.
results. A discussion of the advantages of motion-compensated
However, typical missing data regions caused by blotches do
application of 3-D MMF’s as opposed to a nonmotion-
not typically occur at the same motion-compensated locations
compensated application is given in [2]. Although one pays
in successive frames, and this means that the spatiotemporal
a higher computational cost for motion estimation, the gains
neighborhood of the missing region contains much good infor-
in image fidelity can make the process worthwhile. This is
important in the case of TV imagery in which the motion is mation, reducing the complexity of the interpolator required.
In this section, an interpolator based on an MRF image
not small, therefore causing the filters presented by Alp and
model [15] is proposed. Although this type of model can be
Arce to tend to a purely spatial operation.
very general and sophisticated, because of the comments in
It is important to realize the blotches that are typically
the preceding paragraph, a simple MRF was used in this work.
encountered can be quite large in practice. Blotches spanning
Only two element cliques were used, coupled with different
10 pixels are often seen. With this in mind, it becomes
neighborhood structures and the quadratic potential function.
important that the filtering operation involves information
This produces a smooth interpolation that is suitable for this
from the surrounding frames. Because of the little temporal
purpose.
information used by the filters introduced by Alp and Arce,
they cannot completely remove this degradation after one filter The nature of the problem is such that areas that are not
classified as missing must not be affected-only the missing
pass. Several passes could be employed, but doing so would
areas are to be interpolated, based on the known information
affect the rest of the uncorrupted image. Although the three-
in the spatiotemporal neighborhood of the missing region. An
tap filter of Huang and Martinez would be very effective in
MRF formulation that embodies this is
removing blotches, it is also very sensitive to erroneous motion / r
estimation. 1
As stated previously, the authors [8], [14] introduced a 3-D p ( I = i ( D = d ) = -exp
ZI
MMF that preserves detail well while being fairly robust to -
motion estimation errors and being able to reject large size
distortion. The proposed filter structure can be defined with
the help of the subfilter windows shown in Fig. 1. The MMF
+A (i(3- i ( q 2
S’E 7;
(2)
I)
KOKARAM et al.: INTERPOLATION OF MISSING DATA IN IMAGE SEQUENCES 151 I

where the model is only over the missing regions, (i(3 : hence, variable and those which are of known, fixed values.
d ( 3 = l}, d ( 3 = 0 indicating known data at position Performing this integral results in
r'; and d(?) = 1 indicating missing data. JV-Fis the spatial
neighborhood of pixel F, T- is the temporal neighborhood,
and A is the relative weight given to the temporal neighbors. r
21 normalizes the distribution. The spatial neighborhoods
used were the first- and second-order neighborhoods (four
and eight nearest neighbors), and the temporal neighborhood
comprised either one or five pixels from each of the previous
and following frames. .<€N,-:d(.+O

The Gibbs sampler [15], [16] may be used directly with the
distribution of (2) to form an interpolation. At each pixel of the
missing region taken in turn, a new value is drawn from the
+A [T/2
S'E 7<
+(m(3 - i(.9)7 ] . (7)

distribution of p ( z ( q ) , conditional on the current values in the


This result is used in (4) to find the interpolant.
neighborhood. This will converge to a sample from the joint Gradient descent can be used to find the solution of (4), the
distribution over the missing region. However, for the large gradients being
state spaces typically involved in high-quality imagery, each
iteration of the Gibbs sampler is computationally expensive,
and the interpolation converges very slowly.
This motivates the use of the mean field approximation = 4(m(3-m(.3)
[ 171, [ 181 as a more efficient, deterministic approximation
to the interpolant. The distribution of (2) is replaced by a
much simpler distribution with no spatiotemporal interaction
between the variables, giving the joint distribution over the
pixels to be interpolated as +A 2(m(3-
.<€ 7;

where again, the distinction between fixed and variable argu-


ments is apparent. Once the values of m ( 3 that minimize the
errors in the approximation have been found, the interpolation
This distribution is a function of the parameters m ( 3 . The is given by i(3 = m ( 3 at the missing pixels.
optimum interpolant is i(F) = m ( 3 , which maximizes the Recently, in [21], the Huber potential function was used in
distribution in (3). The problem is to select m(F)to minimize a gradient descent algorithm for image expansion, which is es-
the errors introduced by using the distribution of (3) rather sentially an interpolation task. The formulation in [21] is more
than the distribution of (2). complex than is needed here. As for image expansion, large
The Gibbs-Bogoliubov-Feynman bound [ 191, [20] states areas must be interpolated from little data. For interpolating
that the error of this approximation is minimized when missing data in image sequences, the areas to be interpolated
are close to good data in the temporal direction, allowing the
V,[-Tln Zo + (U - Uo),] =O (4) simpler formulation of (8) to be used successfully.
where U ( i ) is the energy function of the distribution of (2) C. The AR-Based Interpolator
given by
The interpolation method developed by Vaseghi [22] has
been very successful for audio signals, and it is this method
that is extended to three dimensions for use in this image
J sequence interpolation problem.
(5) It is assumed that the statistics of the information in a block
of L x L pixels is sufficiently stationary for a single model
and U o ( i )is similarly the energy function for the distribution to be used for that block. Within this block, assume that there
of (3). (U - U O ) , is the mean value with respect to the are M pixel locations that are corrupted. These locations are
distribution of (3), that is arbitrary; they may be connected together as in a blotch, or
they may be a set of dispersed impulses.
(U - U& = ( U ( i )- Uo(i))po(i)di (6) Further, it is assumed that the block can be modeled by a
L M
3-D AR model incorporating L x L blocks in the surrounding
where M is the number of missing pixels. The section of frames that are compensated for motion. This model has
this integral involving UO(z) is straightforward. The temporal already been discussed in [ 11, [2,] [23], and [24]. The intensity
term in U ( i ) also produces a simple integral. Because of the of the pixels in the block in the current frame may be written as
interaction between the variables, the spatial term in the U ( i ) P
integral is more involved and requires a distinction to be I(? = aJ(r'+ 4;c) +E ( 3 (9)
made between neighbors that are within missing regions and, k=l
1512 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 4, NO. 11, NOVEMBER 1995

where model coefficients. These must be estimated from the corrupt


I ( f ) pixel grey level at position r' (a 3-D position vector) data, and this estimation process is discussed next.
in the image An important point to recognize is that (IO) can be made up
ak model coefficients of error observations from blocks in frames before and after
~ ( f )model excitation (or ideal prediction error). the current one. In fact, it is sensible to incorporate as many
The P vectors i& are support vectors that point to each pixel in observations as possible that incorporate data in the missing
the neighborhood used for the AR model. Therefore, I(?+ &) regions to maximize the information used. This implies that
is the grey level of the pixel at the kth support position for the for a causal model with support in the previous frame only,
pixel at r'. It is assumed that a set of N frames are considered, observations in the K x K block in the next frame also incor-
and the data volume used has already been compensated for porate the missing pixel information i, in the current frame.
motion. Therefore, the resulting interpolator incorporates information
It is helpful to describe predictors by the number of pixels from one frame both previous to and following the current
support in each frame. There is no evidence for asymmetric frame. For a noncausal model incorporating one frame of
supports; therefore, a 9:O model refers to a model with nine support in the previous and next frames, a total of five frames
pixels in a 3 x 3 square in the previous frame acting as support. can be incorporated into the equations (two frames previous
A 9:0:9 model has twice that support: nine pixels in each of and following the considered one). In practice, however, this
the previous and next frames. In general, an a : b model has extra information does not yield significant improvements over
a pixels support in the previous frame and b in the current using just observations from the current frame.
frame; an a : b : c model has, in addition, c pixels support A useful practical point to be noted is that for a given set
in the next frame. of missing pixels, there is a maximum area of observations
Allowing for a border of pixels at the edge of the L x L around the missing region beyond which no improvement in
block in the current frame, say n (so that (?+ &) will never interpolation quality is gained. This region depends on the
result in a location outside the L x L block), an equation for model support. The reason is that the observation equations
the error at every pixel within a centered K x K block in that (i.e., the equations for the model errors) are only useful
frame can be written as follows: for interpolation if they contain at least one missing pixel.
Therefore, if there was a region of missing pixels that was of
e = Ai (10) size 1 x 1 then given a 9:O causal 3-D AR model, the spatial
where extent of the observation equations need only be (1+2) x (1+2).
i N L 2 x 1 column vector of row ordered pixels from the This result follows from examination of the interpolation (12).
N L x L blocks Estimating the Model Coefficients: To solve (12), the
e K 2 x 1 column vector of errors model coefficients are required. In a real case, however,
A matrix of coefficients satisfying the model equation at these values are unavailable. They must be estimated from the
all the considered points. degraded image sequence data. However, as has been stated
before, the block sizes that must be used are small. This is
This coefficient matrix is of size K 2 x N L 2 . The vector i
forced because of the highly nonstationary nature of image
contains intensities of both known and unknown pixels. If this
sequences, both in terms of space and time (due to errors in
vector is separated into two vectors i, (U for unknown) and i k
motion estimation). This means that the distortion can bias
(k for known), which represent the known and unknown pixel the model coefficients adversely. Because a detector is used
intensities, then (IO) can be written as
to isolate a suspected distorted area, this information can be
e = Akik + A&. (1 1)
used to suppress the bias that the distorted area would cause.
The normal equations are altered to solve for the AR
Here, Ak, A, are the coefficient matrices corresponding to parameters using weighted coefficient estimation. The model
the known and unknown data vectors. They are submatrices coefficients are normally chosen to minimize the expected
of the A matrix made by extracting the relevant columns. The value of the squared prediction error at all points in the block
length of i, is M x 1. considered. Because some of this data is now known to be
To derive an interpolation, i, must be found. Following missing, the prediction error at these points may be weighted
Vaseghi [22], this is done by minimizing the squared error to zero so that this data does not affect the estimation process.
eTe with respect to i, as follows: The general approach is to weight the prediction error by some
function w ( q , prior to minimizing the squared weighted error.
eTe = (&it+ A,i,]T[Akik + A&] For the purposes of coefficient estimation, the new predic-
= irArAkik + irArAuiu+ iTA;Akik tion equation may be written as
+ i;ATA,i, P
- - - 2A:Akik + 2ATA,i,
deTe fw(3 = w(q akI(?+ {k) (13)
diu k=O
+ i, = -[AzA,]-'ATAkik. (12) where all the symbols have their usual meaning a. = 1.0,
Therefore, the solution for the interpolated pixels is given and E,(F') is the weighted error at position ?. It is assumed
by (12). Of course, this solution implies knowledge of the that the data volume being used has already been compensated

Authorized licensed use limited to: BIBLIOTECA D'AREA SCIENTIFICO TECNOLOGICA ROMA 3. Downloaded on October 8, 2009 at 04:32 from IEEE Xplore. Restrictions apply.
KOKARAM et al.: INTERPOLATION OF MISSING DATA IN IMAGE SEQUENCES 1513

1 nonstationary AR models for these situations, but these are


not considered in this work.

111. COMPUTATIONAL
LOAD
Since the interpolation processes described here are inde-
pendent of the choice of motion estimator and blotch detector,
the load of those processes is not considered here. See [ l ] for
discussion of the computational load of various detectors.
All arithmetic operations, e.g., +- ABS < were counted
as costing one operation. The exponential function evaluation
was taken as costing 20 operations, and inversion of an N x N
matrix was assumed to be an N 3 process. Estimates for the
number of operations per blotched pixel for the detectors are
as follows:
MMF = 160
3-DAR = 20000 (assuming a block size of 8 x 8 pixels,
a 9:0 model, and a 10% rate of corruption)
MRF = 22 operations per iteration.
With regard to the MRF interpolator, about 1000 iterations
Fig. 2. Frame 23 of WESTERN, size 256 x 256
were needed in the following- experiments.
. The 3-DAR oper-
ation estimate is not independent of the rate or spatial layout
for motion; therefore, the motion parameter has been omitted of the corruption since the process involves the inversion of
from the 3-D AR model. Further, a 3-D trend (of the form matrices (Au),the sizes of which are a function of the number
+ +
ai p j yk) is subtracted from the data prior to modeling to of spatially connected missing pixels in a considered block of
improve the prediction [ 2 ] , [25]. The least squares estimation data.
of the trend coefficients is also weighted in an identical manner
to that shown here and performed as a separate step. Iv. RESULTSAND DISCUSSION
Minimizing the squared error [~,(q]' with respect to the There are two factors to be considered in discussing the
coefficients, then yields the following set of P 1 equations. + performance of these interpolators. First of all, given some
P missing patch and errors in motion estimation due to these
a / ~ E [ ( w ( q ) ~ I ( . ' +{ k ) I ( F f &)] = 0 patches; how accurate is the reconstruction? Second, in a real
k=O situation, errors in motion estimation will yield subsequent
f o r m = O . . . P. (14) errors in detection of missing patches; how robust is the inter-
polator to these errors? Of course, the ultimate performance of
where a0 = 1.0. Therefore, these equations may be written in
the interpolators would be observed when the missing patches
matrix form as
have been correctly detected and the motion estimation process
C,a = -c W (15) has not been adversely affected. However, this does not give
a realistic assessment of performance and results for this case
where C , is a P x P matrix of correlation coefficients, and c, are not illustrated here in the interest of brevity.
is a P x 1 vector of correlation coefficients. Equation (15) is The sequence WESTERN1 (60 frames of 256 x 256) is
the weighted solution for the P model coefficients. The most used to demonstrate the performance of the interpolators on
obvious choice for the weighting function is a binary field set artificially corrupted data. The probability of distortion was
to 0 for all the blotch positions and 1 otherwise. This is found 0.007, and the blotches were generated as outlined in the
to be extremely effective in practice. Note that methods for companion paper [ 11. Motion estimation was performed using
optimal weighting are available; one of these is given in [26]. the corrupted frames with a multiresolution BM algorithm em-
ploying three resolution levels, 256 x 256: 128 x 128,64 x 64.
D. A Practical Consideration The details of the parameters used for the motion-estimation
It is necessary to choose a region of data around the detected process are not important; it is sufficient to note that all
missing region from which to estimate the AR coefficients that interpolators used the same motion vectors. Integer accurate
are then used to interpolate the missing data. For the purposes motion estimates were used. Fig. 2 shows a full-sized picture
of this paper, this region was chosen to be a square area of frame 23 of the WESTERN sequence to give a feel for the
centered on the missing region such that the missing region image composition.
occupied less than 10% of the data block. Of course, when
the missing region is large enough to cover many statistically A. Known Distortion
differing areas, the resulting coefficients do not well describe Fig. 3 compares the performance of various interpolators
the underlying model for the particular missing region. In such on separate frames of WESTERN based on the mean squared
cases, the interpolation is blurred. It would be better to use error (MSE) between the interpolated missing regions and the
1514 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 4, NO. 11, NOVEMBER 1995

2.5

2
$
w
2
1.5
v)
C
a

f 1

0.5

0
IO 20 30 40 50 60
Frame
Fig. 3. MSE of various interpolators of known distortion

I I I I I
0 ‘
10 20 30 40 50 60
Frame
Fig. 4. MSE of various interpolators operating on distortion detected using the SDIa.

original clean frames. The missing regions have been assumed of the compromises within each interpolator. The blotch over
to be correctly identified in this case. The graph shows that the ‘C’ is very well removed by the median filter. The AR
the 9:8 AR interpolator performs best .overall, with the median interpolator does not perform as well here (although texture
operation being the worst and the MRF interpolator (using is reconstructed) because it is unable to reject the corrupted
first-order cliques in a 1:4:1 neighborhood with four pixel information in that same position in the previous frame (see
+
current frame support in a configuration with X = 1)striking Fig. 5). The MRF interpolator does not do as good a job of
some compromise between these extremes. reconstructing texture as the AR process since the interpolated
To illustrate this behavior, Fig. 5 shows a zoomed portion3 region above the ‘C’ is not textured at all. The fact that the
of three frames from the corrupted WESTERN sequence. The median filter reconstructs the texture in this region well is more
original (zoomed) frame 23 is shown as the bottom right hand due to the fact that it rearranges existing surrounding samples
image in Fig. 5. The missing regions (blotches) of interest have and conserves the randomness of the background texture.
been boxed in white in the top right hand image (frame 23). Visual results from the 9:O model are not shown since it
Fig. 6 shows the results of interpolating the missing regions is clear that its performance is affected by the lack of spatial
using a 9:8 AR model, the MRF interpolator and the ML3Dex support in the current frame. In this respect, it is prone to
median filter. The three boxed regions show a good overview the same problems affecting ML3Dex in that the quality of
interpolations depends heavily on the integrity of the motion
3Size (128 x 128) estimates.
KOKARAM et al.: INTERPOLATION OF MISSING DATA IN IMAGE SEQUENCES IS15

Fig. 5. Zoom on degraded frames 22, 23, (Top left, right) 24 (Bottom left) Fig. 7. Degraded frames 44, 45 (top left, right), 46 (bottom left) of
of WESTERN. Zoom on original frame 23 (bottom right). WESTERN. Bottom right: Detection on frame 45 using SDIa indicated as
bright white pixels.

Fig. 6. Zoom on restored frame 23 using MRF (top, left), 9:8 AR (top right),
M13-Dex (bottom left). Original frame 23 (bottom right). Fig. 8. Restored frame 45 using MRF, AR 9:8 (top left, right), M13-Dex and
original frame 45 (bottom left, right).

The median filter fails however, when the motion estimate


is not sufficiently accurate or the structures to be interpolated why it performs better overall when compared with the MRF
are more complicated. The upper right-hand highlighted blotch interpolator, especially with regard to texture. The lack of a
in the relevant image in Fig. 5 is a good example of this. line process in the MRF interpolator also reduces the sharpness
The diagonal structure is well reconstructed by both the MRF of the reconstructed features.
and AR processes (the MRF interpolation being somewhat
blurred as expected) but not the median process. This is the B. Unknown Distortion
basic problem with the use of the median operation in this Fig. 4 compares the performance of the interpolators when
way. Whereas the other interpolators attempt to create some the blotch locations are unknown and must be detected using,
smooth transition of data across the blotch, the median filter for instance, one of the detectors discussed in [l]. The SDIa
used in this manner rearranges the data with no regard to detector is chosen for use here because it is the cheapest
smooth transitions at the edges of the missing regions. The computationally. This is also the simplest detector for temporal
adaptive nature of the AR interpolator used also explains discontinuities [ 2 ] .
1516 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 4, NO, 1 I , NOVEMBER 1995

Fig. 4 therefore represents the performance of the in-


terpolators in a more realistic case. The interpolators used
here were left unchanged from the previous experiment. The
SDIa threshold et was set to 21.0. It shows that the relative
performance remains the same, but the absolute performance
is worse than that shown previously for the obvious reason that
there are more false alarms. A classic instance of a fast motion
induced false alarm is shown in Figs. 7 and 8. These figures
are again zoomed portions of WESTERN of size 128 x 128.
The corresponding detected missing regions are shown in
the bottom right-hand side of Fig. 7. It is clear why such a
large region is flagged as missing at the center of the picture.
That region cannot be easily matched in the previous or the
next frame. Unfortunately, only a thin ring of pixels around
the “missing” region is “undetected”; therefore, the MRF
interpolator, lacking any adaptivity in this implementation,
(Fig. 8) cannot reconstruct the “missing” region well. The AR
process performs more respectably; however, the interpolated
data is still quite blurred. The median filter also fails in
the same regions as the MRF interpolator in a more severe
Fig. 9. Frame 1 of FRANK.
manner as it introduces sharp edges at the boundaries of the
interpolated data.
In general, the MRF process is perhaps the most robust esti-
mator and provides a good tradeoff between computation and
fidelity. The AR process is much heavier computationally, but
because it is easily made adaptive, it can perform better than
the MRF interpolator presented here. The median filter does
not compare well with either of the model-based interpolators;
however, it requires by far the least computation, and provided
there is not much drastic motion in an image sequence, it will
perform acceptably.

V. REAL MOVIES
Two outstanding considerations remain with respect to real
degradation in typical motion picture film. First of all, unlike
the artificial case, blotches do not have sharp edges; therefore,
it is typical for a simple detector like the SDIa to be unable to
detect the periphery of a blotch. As a result, the interpolation
process usually cannot remove the entire defect and in the
AR case often replaces the missing data with data that has
the intensity of the undetected blotch periphery. One solution
to this problem is to examine the image data in the region Fig. 10. Frame 2 of FRANK with large blotches boxed.

of the suspected blotch and extend the detected blotch areas


if necessary. Another less intelligent, but effective, approach this sort of motion in the right forearm of FRANK. The
is to dilate [9], [27], [28] the suspected blotch locations corresponding detection fields (dilated using a 3 x 3 square
using a simple morphological operator and effectively have as a structuring element) are superimposed on the second
a pessimistic estimate of the extent of the blotch. This latter frame in Fig. 12. Bright white pixels mark the instances that
post processing stage is employed here. the SDIa flagged as corruptions with both fractional ( f 0 . 5
Although the experiments thus far have been conducted pixel) and integer accurate motion estimates. Green and red
with integer accurate motion estimates, it is typical of moving mark additional flagged pixels using fractional and integer
objects to show some degree of fractional (i.e., subpixel) accurate motion estimates, respectively. It is observed that the
motion from frame to frame. This can have a great effect area flagged as blotch from integer-accurate motion estimates
on the motion-compensated residual and cause false alarms is much larger and consists of more false alarms than that
to be flagged by the detector in a region of such motion. It of the fractional motion estimates; note the forearm. Both
is better overall to estimate motion to some fractional pixel motion estimators used a three-level multiresolution process
accuracy, such as 0.5 or 0.25 pixels [29]. The three frames as mentioned earlier, with the fractionally accurate motion
(of resolution 256 x 256) shown as Figs. 9-11 show just estimator estimating motion to f 0 . 5 pixels.
KOKARAM et al.: INTERPOLATION OF MISSING DATA IN IMAGE SEQUENCES 1517

Fig. 1 1 . Frame 3 of FRANK. Fig. 13. Restored frame 2 using 9 : 8 AR

Fig:. 12. Detection on frame 2 of FRANK. White: both fractional and integer
-
motion estimation. Green: additional flagged by fractional estimation. Red: Fig. 14. Restored frame 2 using MRF
additional flagged by integer estimation.

Figs. 13-15 show interpolations of the missing data using a flutterings of one of the petals; therefore, it is partially flagged
9:s AR model, MRF, and M13-Dex system, respectively. The as a blotch. The performance of the interpolators in this region
MRF system used cliques in a 5:8:5 neighborhood with X = 2 . is worse than in other areas.
The five pixels used in the previous frame were arranged in Subjective Assement: A series of differently, artificially,
a + configuration. The interpolated locations were flagged and real degraded sequences have been processed. Informal
by the SDIa using fractional motion estimates. Note again subjective assessment of the restored sequences displayed at
how well all the systems perform where there is little textural 25 framesls (UK PAL television standard) was performed.
detail. However, the blotch in the head of the figure is best It is found, in general, that it is difficult to determine any
interpolated by the AR system, with the MRF being somewhat major difference in quality between the restorations at this
blurred and the median filter giving a generally flat intensity. frame rate. A closer examination allows the observer to rank
Again, the classic motion-estimation problem arises in the the restorations in the order 3-DAR, MRF, and MMF. The
petals of the flower in the picture. It is very difficult for AR process is more robust to motion-estimation errors and
any motion estimation algorithm to track the almost random generally gives the smoothest interpolation. The MMF often
1518 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 4, NO. 11, NOVEMBER 1995

underlying random nature of its driving function. The choice


of a particular interpolator for a certain application is governed
by the tradeoff between fidelity and computational burden.
The AR interpolator is computationally much more intensive
than the median process, but it gives the best visual result.
The median process requires very few operations, and in
fact, despite the “rough” nature of the method, it is found
to be effective in practice. In the short term, one would
expect detector-controlled median systems to be more popular
for real time “deblotch” equipment for video, whereas the
AR and MRF systems could be used in particular instances
for high levels of degradation or where a very high quality
of reconstruction is required (in slow motion sequences for
instance).

REFERENCES

A. C. Kokaram, R. D. Moms, W. J. Fitzgerald, and P. J. W. Rayner,


“Detection of missing data in image sequences,” IEEE Trans. Image
Processing, this issue, pp. 14961508.
A. C. Kokaram, “Motion picture restoration,” Ph.D. Thesis, Cambridge
Fig. 15. Restored frame 2 using M13-Dex. Univ., Cambridge, UK, May 1993.
M. Bierling, “Displacement estimation by heirarchical block matching,”
SPIE VCIP, pp. 942-951, 1988.
J. Boyce, “Noise reduction of image sequences using adaptive motion
causes breakup of the periphery of fast moving regions. The compensated frame averaging,” in Proc. IEEE ICASSP, vol. 3, 1992,
MRF reduces the breakup of fast moving regions but cannot pp. 461264.
reproduce texture as well as the 3-DAR process. G. R. Arce and E. Malaret, “Motion preserving ranked-order filters for
image sequence processing,” IEEE Int. Con$ Circuits Syst., 1989, pp.
983-986.
A. Robust Motion Estimation G. R. Arce, “Multistage order statistic filters for image sequence pro-
cessing,’’IEEE Trans. Signal Processing, vol. 39, no. 5, pp. 1146-1 161,
The motion estimation process is adversely affected by the May 1991.
presence of large blotches. For block matching using a mean B. Alp, P. Haavisto, T. Jarske, K. Oista mo and Y. Neuvo, “Median-
absolute error criterion, the extent to which it is affected based algorithms for image sequence processing,” SPIE visual Commun.
Image Processing, pp. 122-133, 1990.
depends on the block size. To some degree, using multires-
A. C. Kokaram and P. J. W. Rayner, “A system for the removal of
olution motion estimation brings an inherent robustness to the impulsive noise in image sequences,” SPIE visual Commun. Image
estimation process. At the low-resolution levels, the size of Processing, pp. 322-331, Nov. 1992.
the corrupting blotches is quite small and affects the motion- A. K. Jain, Fundamentals of Digital Image Processing. Englewood
Cliffs, NJ: Prentice Hall, 1989.
estimation process less. At the higher resolution levels, motion M. Sezan, M. Ozkan, and S. Fogel, “Temporally adaptive filtering of
estimation would be more affected, but provided estimation at noisy image sequences using a robust motion estimation algorithm,” in
the previous levels has resulted in an estimate close to the Proc. IEEE ICASSP, vol. 3, May 1991, pp. 2429-2431.
M. Ozkan, M. I. Sezan, and A. M. Tekalp, “Adaptive motion-
actual motion, the error is small. compensated filtering of noisy image sequences,” IEEE Trans. Circuits
Unfortunately, it is quite feasible that the size of a blotch Syst. video Technol., pp. 277-290, Aug. 1993.
extends across a large enough area to cause a problem even at T. Huang, Image Sequence Analysis. New York Springer-Verlag.
1981.
the low resolution levels. These motion estimation errors do D. M. Martinez, “Model-based motion estimation and its application to
not typically cause problems in detection, but they do result restoration and interpolation of motion pictures,” Ph.D. Thesis, Mass.
in inappropriate interpolations. It would be best to address Inst. of Technol., Cambridge, 1986.
this problem specifically in the design of a motion estimator. A. C. Kokaram and P. J. W. Rayner, “Removal of impulsive noise
in image sequences,” in Proc. Singapore In?. Cont Image Processing,
However, this is not likely to be a simple task, and it is Sept. 1992, pp. 629-633.
better to consider correcting motion vectors at the sites of S. Geman and D. Geman, “Stochastic relaxation, Gibbs distributions and
suspected distortion. Although this is not addressed in this the Bayesian restoration of images,” IEEE Trans. Part. Anal. Machine
Intell., vol. PAMI-6, no. 6, pp. 721-741, Nov. 1984.
paper, operators such as the vector median or an AR-based M. A. Tanner, Toolsfor Statistical Inference, Springer Series in Statistics.
vector interpolator appear to provide useful solutions, and New York: Springer-Verlag. 1993, 2nd ed.
these are currently being pursued. H. P. Hiriyannaiah, G. L. Bilbro, and W. E. Snyder, “Restoration of
piecewise-constant images by mean-field annealing,” J. Opt. Soc. Amer.,
pp. 1901-1912, 1989.
VI. CONCLUSIONS I. M. Abdelquader, S. A. Rajala, W. E. Snyder, and G. L. Bilbro, “En-
ergy minimization approach to motion estimation,” Signal Processing,
The three interpolators presented have been shown to be vol. 28, pp. 291-309, 1992.
useful to lesser or greater extents, depending on the scenery D. Chandler, Introduction to Modem Statistical Mechanics. Oxford,
and motion present. They can all produce good interpolations U K Oxford University Press, 1987.
J. Zhang, “The application of the Gibbs-Bogoliubov-Feynman inequal-
in areas of little texture, but it is the AR interpolator that ity in mean field calculations for Markov random fields,” in SPIE visual
is best at handling texture because of its adaptivity and the Commun. Image Processing, vol. 2308, pp. 982-993, 1994.
KOKARAM et al.: INTERPOLATlON OF MISSING DATA IN IMAGE SEQUENCES 1519

R. R. Schultz and R. L. Stevenson, “A bayesian approach to image Ani1 C. Kokaram (S‘91-M‘92), for a photograph and biography, see this
expansion for improved definition,” IEEE Trans. Image Processing, vol. issue, p. 1508.
3, no. 4, pp. 233-242, May 1994.
S. V. Vaseghi, “Algorithms for the restoration of archived gramophone
recordings,” Ph.D. Thesis, Cambridge Univ., Cambridge, UK, 1988.
P. Strobach, “Quadtree-structured linear prediction models for image
sequence processing, ”IEEE Patt. Anal. Machine Intell., vol. 11, pp.
742-747, July 1989. Robin D. Morris, for a photograph and biography, see this issue, p. 1508.
S. Efstratiadis and A. Katsagellos, “A model based, pel-recursive motion
estimation algorithm,” in Proc. IEEE ICASSP, 1990, pp. 1973-1976.
R. Veldhuis, Restoration of Lost Samples in Digital Signals. Engle-
wood Cliffs, NJ: Prentice Hall, 1980.
E. DiClaudio, G. Orlandi, F. Piazza, and A. Uncini, “Optimal weighted
LS AR estimation in presence of impulsive noise,” in Proc. IEEE
ICASSP, vol. E3.8, 1991, pp. 3149-3152. William J. Fitzgerald, for a photograph and biography, see this issue, p. 1508.
J. S. Lim, Two-Dimensional Signal and Image Processing. Englewood
Cliffs, NJ: Prentice-Hall, 1990.
R. Schalkoff, Digital Image Processing and Computer Vision. New
York: Wiley, 1989.
B. Girod, “Motionxompensating prediction with fractional-pel accu-
racy,” IEEE Trans. Commun., vol. 41, pp. 604-612, 1993. Peter J. W. Rayner, for a photograph and biography, see this issue, p. 1508.

You might also like