
Fractal Volume Compression

Wayne O. Cochran John C. Hart Patrick J. Flynn


School of Electrical Engineering and Computer Science
Washington State University
Pullman, WA 99164-2752
{wcochran,hart,flynn}@eecs.wsu.edu

October 18, 1995

Abstract
This research is the first application of fractal compression to volumetric data. The
various components of the fractal image compression method extend simply and directly
to the volumetric case. However, the additional dimension increases the already
high time complexity of the fractal technique, requiring the application of sophisticated
volumetric block classification and search schemes to operate at a feasible rate. Numerous
experiments over the many parameters of fractal volume compression show it
to perform competitively against other volume compression methods, surpassing vector
quantization and approaching the discrete cosine transform.

Keywords: Data compression, fractal, iterated function system, volume visualization.

1 Introduction
The problem of managing extremely large data sets often arises in applications employing
volumetric data. This has prompted research in new techniques for economizing both storage
space and processing time. Data compression techniques reduce the size of volumetric data,
converting the array into a representation which is stored more efficiently.
Fractal techniques based on iterated function systems [11, 2] have been successfully applied
to the compression of one-dimensional signals [3, 31] and two-dimensional images [13, 8],
by finding a fractal representation that models the original signal as closely as possible, and
storing the model instead of the original data. Fractal volume compression uses analogous
techniques for the encoding of three-dimensional volumetric data.
This research in fractal volume compression is part of our recurrent modeling project,
which focuses on the abilities of the recurrent iterated function system [3] as a geometric
representation. Fractal volume compression provides a first step toward the use of linear
fractals to model arbitrary 3-D shapes.
Section 2 reviews the fractal image compression technique and summarizes four items of
previous work relating to fractal volume compression. Section 3 extends the components of
the fractal image compression method to volumetric data. Section 4 discusses optimization
of the search used by fractal compression methods, allowing fractal volume compression to
perform its search in a feasible amount of time. Section 5 briefly describes decompression
algorithms. Section 6 lists the results of various experiments and compares them with previous
results. Section 7 concludes and offers directions for further research.

2 Previous Research
This section begins by reviewing the essentials of the fractal image compression method,
mentioning elements that directly contribute to the fractal volume compression method.
Compression researchers have only recently applied their methods to the task of com-
pressing volumetric data. There appear to be only two previous extensions of sophisticated
image compression techniques (vector quantization and the discrete cosine transform) to vol-
umetric data. In addition, there is previous work in fractal compression of 3-D image data,
but only in the form of animations and multi-view imagery. The rest of this section summarizes
these previous techniques; their results appear in Section 6.7 for better comparison
to fractal volume compression.

2.1 Fractal Image Compression


Fractal image compression partitions an image into contiguous, non-overlapping square range
blocks, and imposes a second, coarser partition on the image, separating it into larger
possibly-overlapping square domain blocks [13]. The collection of domain blocks forms the
domain pool D.
Next, a class of contractive block transformations, the transformation pool T, is defined.
Each transformation consists of a value^1 component, which alters the scalar values of voxels
within the block, and a geometry component, which shuffles (permutes) the positions of
voxels within a block. An image block transformation is contractive if and only if it brings
every two pixels in a block both nearer spatially (by re-sampling it at a lower resolution)
and closer in value (by reducing its contrast).
For each range block R, the encoding process searches for the (D, T) ∈ D × T such that
T(D) best matches R. The quality of this match is determined by a distortion measure.
The L_2 distortion measure sums the squared differences between corresponding pixels of
the two w × h blocks X and Y:

    L_2(X, Y) = \sum_{x=0}^{w-1} \sum_{y=0}^{h-1} (X(x, y) - Y(x, y))^2        (1)
If the search fails to find a satisfactory match, the range block can be subdivided and some or
all of its children encoded. Storage of the range block is replaced by storage of the indices of
the domain block and transformation. This technique yielded high-fidelity image encoding
at compression rates ranging from 9:1 to 12:1 [13].
Numerous variations of the basic fractal image compression method have appeared [28]. Techniques
to reduce the large search space of transformed domain blocks are of particular interest
^1 This component was called massic in [13], presumably due to fractal image compression's roots in
iterated function systems and measure theory.

to the volumetric case due to its increased time and space complexity. A block classification
scheme [26] trims the search space by a constant factor [13], segregating searches within
simple edge, mixed edge, midrange or shade block classes. The most promising method
reduces the time complexity of the search from linear to logarithmic by adapting existing
sophisticated nearest-neighbor techniques [27].

2.2 Vector Quantization of Volumetric Data


Vector quantization (VQ) is a popular technique for compressing images as well as many
other forms of data [10]. It attains high compression rates with little coding error by finding
the representative blocks of pixels, and representing the input in terms of these blocks.
When applied to images, vector quantization partitions the input into contiguous, non-overlapping
blocks. Based on the distributions of the block contents and the desired compression
rate, a codebook is constructed containing a small but diverse subset of these representative
blocks. Each of the original blocks is then coded as an index to its nearest
representative in the codebook. The decoding algorithm simply performs a table lookup for
each stored index.
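The encode/decode pair described above can be sketched in a few lines. This is an illustrative sketch, not code from the cited work; the function names `vq_encode` and `vq_decode` are our own, and codebook construction (e.g., by the LBG algorithm) is omitted:

```python
import numpy as np

def vq_encode(blocks, codebook):
    """Map each flattened block (rows of `blocks`) to the index of its
    nearest codebook entry under squared L2 distance."""
    d2 = ((blocks[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def vq_decode(indices, codebook):
    """Decoding is a simple table lookup into the codebook."""
    return codebook[indices]
```

Encoding stores only the index stream (plus the codebook), which is where the compression comes from.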
Vector quantization extends directly to volume data [23]. Encoding the gradients as
well as the scalar values of volume data accelerates the rendering of VQ-compressed volume
data, directly from the compressed version [24]. Shading the codebook as a preprocess
reduced total shading time by a factor of 1000. Rendering the codebook as a preprocess
and compositing the resulting pixmaps reduced orthographic-projection rendering time by a
factor of 12.
High-speed VQ rendering requires a "space-filling" block arrangement, such that each
block contains enough information to interpolate the values of the cells. For example, their
128³ dataset consists of (128/2)³ = 262,144 "voxel-spanning" 2³-voxel blocks, but consists
of (127/1)³ = 2,048,383 space-filling blocks. Moreover, high-speed VQ requires the storage
of not only the scalar value at each voxel, but also its three normal components and a color.
Hence, their example "compression" for high-speed rendering of a CT dataset using 2³-voxel
blocks first expands the data by a factor of five with shading information, then compresses
the result by a factor of only five due to the space-filling arrangement of blocks, resulting in
no compression, just fast low-quality rendering.

2.3 DCT Volume Compression


The standard JPEG [25] image compression scheme, based on the discrete cosine transform
(DCT), also extends to volumetric data [32]. The scheme partitions the input volume into
8³-voxel blocks and the DCT converts these 512 spatial values into 512 frequency values,
which quantize and entropy-encode better than the original spatial data.
Discrete cosine transform algorithms have been studied exhaustively, and existing optimized
algorithms for computing DCT coefficients have yielded very fast compression times
(e.g., 132.1 seconds for a 12-bit 256³ CT head dataset).
An overlapping (space-filling) arrangement of the blocks, as in the previous example, allowed
DCT-compressed volumes to be rendered directly from the compressed version. Non-overlapping
(voxel-spanning) blocks of 8³ voxels were collected into overlapping (space-filling)
"macro-blocks" of 32³ voxels. Moreover, the overlap was increased from one voxel to three to
support gradient computation, but the redundancy was reclaimed through careful compression
of macro-block boundaries. Macro-blocks were accessed and decompressed on demand,
one at a time, during the rendering process. This scheme saved time by avoiding the unnecessary
decompression of blocks that were completely occluded, and saved space, allowing
compressed volumes to be rendered on workstations that lack the necessary memory to store
the entire uncompressed volume.
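The 3-D DCT of a cubic block is separable: the 1-D transform is applied along each axis in turn. A minimal sketch of this idea (the orthonormal DCT-II matrix construction is standard; the helper names `dct_matrix` and `dct3` are ours, and quantization/entropy coding are omitted):

```python
import numpy as np

def dct_matrix(N):
    """Orthonormal DCT-II basis matrix: C @ x transforms a length-N signal."""
    n = np.arange(N)
    C = np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * N))
    C *= np.sqrt(2.0 / N)
    C[0] *= np.sqrt(0.5)  # DC row scaled so that C is orthonormal
    return C

def dct3(block):
    """Separable 3-D DCT of a cubic block (e.g., 8x8x8): apply the
    1-D transform along each of the three axes."""
    C = dct_matrix(block.shape[0])
    return np.einsum('ia,jb,kc,abc->ijk', C, C, C, block)
```

For a constant block, all of the energy lands in the DC coefficient, which is the compaction property the quantizer exploits.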

2.4 Fractal Encoding of Moving Pictures


The fractal encoding of moving picture sequences was demonstrated in [5] as an extension
of one- and two-dimensional signal encoding. Using the temporal data as a third dimension,
4³-voxel range blocks and 12³-voxel domain blocks were defined over a series of 12 partial
video sequences (termed slabs) for 3-D fractal compression. Treating time simply as a third
dimension caused annoying artifacts in the decompressed frames. Accurately-reproduced
high-contrast edges tended to visually mask errors that occurred in the lower frequency
background. Once an edge passed from the scene, these previously-masked errors became
more obvious, causing a visual incongruity between edge and non-edge frames. Increasing
the frame rate reduced this problem, but effectively reduced the system to a still-image
coder.
As a result, this form of volume compression was abandoned and a new system was
developed that compressed the first frame using a fractal technique, whereas successive frames
simply drew upon 2-D domain blocks from the previous frame. Reasonable quality was
reported at 80 Kb/s for 352 × 288 monochrome frames at 10 frames per second.

2.5 Fractal Compression of a Multi-View Image


A fractal compression technique was devised for another form of three-dimensional image
data called a multi-view image, a collection of images obtained from viewing an object
from a discrete range of positions. Given enough images, the object can be observed in
3-D with smooth transitions between continuously changing viewpoints as motion parallax
places the object in relief. The tremendous size of these 3-D datasets necessitates compression
techniques that exploit the view-to-view coherence between neighboring images.
The encoding technique used in [22] for multi-view images is an application of the algo-
rithm presented in [13] with very few extensions to take advantage of the extra dimension.
Their 3-D dataset contained color values along the x, y, and v (view) axes, and was
partitioned into 8 × 8 × 5-voxel range blocks. Neighboring range blocks along the view axis
overlapped slightly, such that when decoded, these overlapping regions averaged together to
produce smoother transitions between reconstructed range blocks. Domain blocks consisted
of eight range blocks, and were searched in an outward spiraling path in the xy-plane,
extending from the goal range block, with the expectation that matching domain blocks are
spatially near the range block [13]. Only four isometry transformations were used.
As in [13], range and domain blocks were classified, though in this case based solely on
the variance of their brightness. Range blocks were also subdivided when necessary. After
entropy encoding of the parameters describing the fractal transformations, a 17-view color
multi-view image of a toy dog was coded at a bit rate of 0.1095 bits per pixel with a PSNR
of 37.52 dB.
Fractal coding is appreciated for its resolution independence. Since fractal transformations
simply describe a contraction from one region of the dataset to another, they can be
used at any resolution. Artificial data was interpolated from the compressed representation
by decoding at a higher resolution than the original dataset. This feature supported the
approximation of a 51-view 3-D image with a 17-view image, which effectively improved the
bit rate from 0.1095 to 0.0365 bits per pixel [22].

3 Fractal Volume Compression


Just as fractal image compression encodes images with block-restricted contractive self-transforms,
fractal volume compression likewise encodes volumes. The design issues for
creating a fractal block coding system for volumes are direct extensions of those used in
fractal image compression. The process partitions the volume into both fine range blocks
and coarse domain blocks, and then finds for each range block the transformed domain block
that best matches it.
This section begins with the characterization and notation of volumetric data, followed
by descriptions of the volumetric range, domain and transformation pools.

3.1 Volume Datasets


A volumetric dataset is a collection of voxels which represent some measurable property of
an object sampled on an integer 3-D grid. The algorithm presented here is constrained
to input volumes with scalar voxels. Volumes with vector samples may be divided into
separate datasets containing only scalar voxels and each resulting new dataset is encoded

independently.^2
The functional notation V(x, y, z) ∈ ℝ denotes the voxel located in the input volume at
the grid point (x, y, z) ∈ ℤ³. Each voxel outside the volume's region of support is defined
to be zero (i.e., a volume with dimensions W × H × D implies V(x, y, z) ≡ 0 if (x, y, z) ∉
[0, W−1] × [0, H−1] × [0, D−1]).
Each voxel is normally quantized to b bits by mapping it to the integer range 0 … 2^b − 1,
thus requiring W × H × D × b bits to store the entire volume directly.
Volumetric data typically arises from the spatial measurement of real-world data, but also
from simulated sources. Medical applications use computed-tomography (CT) or magnetic-
resonance-imaging (MRI) scans. The study of aerodynamics depends on the results of wind
tunnel data and computational fluid dynamics simulations. Computer graphics has found
situations in which a volumetric representation performs better than a surface description
[16, 15]. Volume compression makes the management of such massive amounts of volumetric
data feasible for general use in existing facilities.
The distortion metric

    L_2(X, Y) = \sum_{x=0}^{w-1} \sum_{y=0}^{h-1} \sum_{z=0}^{d-1} (X(x, y, z) - Y(x, y, z))^2        (2)

measures the similarity or "distance" between two w × h × d blocks of voxels X and Y. The
notation L_2(X, Y) will be used to represent this value.
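The distortion metric of Eq. (2) computes directly as a sum of squared voxel differences. A minimal sketch (the function name `l2_distortion` is our own):

```python
import numpy as np

def l2_distortion(X, Y):
    """Sum of squared differences between two w x h x d voxel blocks (Eq. 2)."""
    X = np.asarray(X, dtype=np.float64)
    Y = np.asarray(Y, dtype=np.float64)
    return float(np.sum((X - Y) ** 2))
```

Identical blocks yield zero distortion; this is the quantity minimized by the domain search below.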

3.2 Volumetric Range Partitioning


As with many compression algorithms, the dataset is partitioned into small spatial regions
and each region is encoded separately. The simple scheme used here dices the volume into
w × h × d non-overlapping cuboid cells (termed range blocks) and encodes each cell (i.e.,
subvolume) individually. A range block is uniquely located within the volume by a corner
grid point (x_r, y_r, z_r) taken from the set

    R = { (iw, jh, kd) | (i, j, k) ∈ [0, ⌈W/w⌉−1] × [0, ⌈H/h⌉−1] × [0, ⌈D/d⌉−1] }        (3)
^2 In image compression, color images are analogously divided into independently-compressed color planes.

which defines all the range blocks in this partitioning set. The voxels that comprise any
range block can be enumerated by the function

    R(x, y, z) = V(x_r + x, y_r + y, z_r + z),    (x, y, z) ∈ [0, w−1] × [0, h−1] × [0, d−1]        (4)

which conveniently allows us to reference voxels within a range block without regard to their
global positions within the volume.
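Equations (3) and (4) can be sketched directly; the helper names `range_corners` and `range_block` are ours, and zero padding at the boundary follows the region-of-support convention of Section 3.1:

```python
import numpy as np
from math import ceil

def range_corners(W, H, D, w, h, d):
    """Corner grid points (iw, jh, kd) of the non-overlapping range partition (Eq. 3)."""
    return [(i * w, j * h, k * d)
            for i in range(ceil(W / w))
            for j in range(ceil(H / h))
            for k in range(ceil(D / d))]

def range_block(V, corner, shape):
    """Extract R(x,y,z) = V(xr+x, yr+y, zr+z) (Eq. 4), zero-padded outside the volume."""
    (xr, yr, zr), (w, h, d) = corner, shape
    R = np.zeros(shape, dtype=np.float64)
    W, H, D = V.shape
    xs, ys, zs = min(w, W - xr), min(h, H - yr), min(d, D - zr)
    R[:xs, :ys, :zs] = V[xr:xr+xs, yr:yr+ys, zr:zr+zs]
    return R
```

The ceilings in Eq. (3) mean a volume whose dimensions are not multiples of (w, h, d) still gets fully covered, with the excess padded by zeros.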
Adaptive partitioning subdivides range blocks that fail to encode satisfactorily. Larger
range blocks yield a higher compression rate, though typically at a lower fidelity. The
encoder first attempts to find maps for large range blocks. Each range block is subdivided
into eight children. If the coding of the range block results in a distortion above a specified
threshold t_mse for any child, then that child block is coded separately (and possibly subdivided
itself). If seven or eight children require separate coding, then a short stub code replaces the
range block's code, and all eight children are encoded separately. The overhead associated
with tracking this hierarchical coding requires that each (non-stub) parent code contain child
configuration information.

3.3 Volumetric Domain Pool


A set of nw × nh × nd (n = 2, 3, …) subvolumes called domain blocks are now defined by
the set of corner lattice points

    D = { (iΔ_x, jΔ_y, kΔ_z) | (i, j, k) ∈ [0, ⌊(W − nw)/Δ_x⌋] × [0, ⌊(H − nh)/Δ_y⌋] × [0, ⌊(D − nd)/Δ_z⌋] }        (5)

which are spread evenly throughout the volume by the integer domain spacing parameters
(Δ_x, Δ_y, Δ_z). A domain block located by the grid point (x_d, y_d, z_d) ∈ D uses the function

    D(x, y, z) = V(x_d + x, y_d + y, z_d + z),    (x, y, z) ∈ [0, nw−1] × [0, nh−1] × [0, nd−1]        (6)

to reference local constituent voxels. Note that the lattice support of these domain blocks
may overlap for large domain pools (i.e., the Δ's are small) or may be widely spread apart
(i.e., the Δ's are large).
The domain block search is a minimization problem, but often the true minimum is
discarded in practice when other domain blocks yield higher compression with sufficient
fidelity. While large domain pools are effective for providing accurate maps, they can decrease
compression performance by forcing large indices to be transmitted to the decoder. Often a
small localized search can provide sufficiently accurate domain maps for many range blocks.
A common scheme for locating source domain blocks (see [22]) is an outward spiral search
emanating from the target range block. Transmitted indices along this spiral path will
usually provide low bit rates when entropy encoded [4].
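One way to sketch such a localized search is to generate candidate domain corners on the spacing lattice in order of increasing distance from the range block, which approximates an outward spiral. The helper name, the ±2-step window, and the candidate count are our own illustrative choices:

```python
def local_candidates(center, spacing, count):
    """Yield domain-block corner positions ordered by squared distance from
    `center`, approximating an outward spiral search on the domain lattice."""
    cx, cy, cz = center
    dx, dy, dz = spacing
    offsets = [(i * dx, j * dy, k * dz)
               for i in range(-2, 3) for j in range(-2, 3) for k in range(-2, 3)]
    offsets.sort(key=lambda o: o[0]**2 + o[1]**2 + o[2]**2)  # nearest first
    return [(cx + ox, cy + oy, cz + oz) for ox, oy, oz in offsets[:count]]
```

Because nearby candidates come first, the indices that are actually transmitted tend to be small, which is what makes them cheap to entropy-encode.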
Fractal volume compression utilizes a two-pass system that first checks a small number
(e.g., 64) of spatially close domain blocks (using only the identity transformation). For
those range blocks that encode poorly during this first pass, a larger (global) domain pool
is interrogated during a second pass. The first pass is very rapid (and significantly reduces
the number of searches needed during the slower second pass). The global domain pool is
extremely large, and Section 4 addresses its search.

3.4 Volumetric Transformation Pool


A set of transformations T is used to map (source) domain blocks to (target) range blocks.
Each transformation is composed of several components. The first component re-samples
domain blocks at a coarser resolution to obtain geometric contractivity. The second removes
the mean value (i.e., the DC component) from each of the domain block's voxels. Another set
of transformations alters the values of voxels within a block by scaling each voxel by a constant
and adding an offset to the result. Constraining the magnitude of the scaling constant to be
strictly less than one guarantees contractivity, but the introduction of the orthogonalization
operator appears to remove any restrictions on the size of these scaling coefficients [12]. The
diversity of the domain pool is increased by the last class of transformations, which permute
voxels within the block.
The spatial contraction operator C_n decimates an nw × nh × nd domain block so it is
re-sampled with the same spatial resolution as a w × h × d range block:

    C_n ∘ D(x, y, z) = \frac{1}{n^3} \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} \sum_{k=0}^{n-1} D(nx + i, ny + j, nz + k).        (7)

This effectively applies an n³ averaging filter to the domain block followed by a simple
subsampling operation. Here we have forced the dimensions of the domain block to be
integral multiples of the range block dimensions, but the following equivalent decimation
operator C'_n can be used to contract w′ × h′ × d′ domain blocks without this restriction:

    C'_n ∘ D(x, y, z) = \frac{1}{w'h'd'} \sum_{i=0}^{w'-1} \sum_{j=0}^{h'-1} \sum_{k=0}^{d'-1} D(⌊(xw' + i)/w⌋, ⌊(yh' + j)/h⌋, ⌊(zd' + k)/d⌋).        (8)
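For the common case where the domain dimensions are integral multiples of the range dimensions, Eq. (7) is a block mean. A minimal sketch (the function name `contract` and the reshape-based averaging trick are ours):

```python
import numpy as np

def contract(D, n):
    """C_n of Eq. (7): average each n x n x n cell of an (nw, nh, nd) domain
    block, producing a (w, h, d) block at range-block resolution."""
    nw, nh, nd = D.shape
    w, h, d = nw // n, nh // n, nd // n
    # Split each axis into (output index, within-cell index) and average the cells.
    return D.reshape(w, n, h, n, d, n).mean(axis=(1, 3, 5))
```

Each output voxel is the mean of an n³ neighborhood, i.e., the averaging filter followed by subsampling described above.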

An orthogonalization transformation O is now applied, which simply removes the contracted
domain block's DC component so that it has zero mean. Given the domain block mean

    \bar{d} = \frac{1}{whd} \sum_{x=0}^{w-1} \sum_{y=0}^{h-1} \sum_{z=0}^{d-1} C_n ∘ D(x, y, z),        (9)

this operation is described by

    O ∘ C_n ∘ D(x, y, z) = C_n ∘ D(x, y, z) − \bar{d}.        (10)
Next, the affine voxel value transformation G_{α,β} scales each voxel in a block B(·) by α and
adds β to the result:

    G_{α,β} ∘ B(x, y, z) = αB(x, y, z) + β.        (11)

When α = 0 this transformation simply defines a block with a single uniform value of β. In
place of the eight 2-D block isometries used in [13], we now have 48 3-D block isometries for
cubic partitions (i.e., w = h = d). Exactly half of them are listed in Tables 1 and 2, and
represent the rigid body rotations about thirteen axes (three face axes, six edge axes and
four vertex axes). These are doubled to include all reflections by complementing all three
coordinates in each case.
Non-cubic partitions only consider eight isometries: the identity, the three 180° rotations
about the principal axes, the reflections of the principal axes, and total reflection, shown in
Table 3.
     x    y    z     Identity
    −x   −y    z     Rotation of π about
     x   −y   −z     the three face-face
    −x    y   −z     axes.
    −y    x    z     Rotation of π/2 about
     y   −x    z     the three face-face
     x    z   −y     axes.
     x   −z    y
    −z    y    x
     z    y   −x

Table 1: Ten rigid body cubic rotations, shown in terms of their result on the input vector (x, y, z).

The transformation pool T is the set of all possible transformations of the form

    T = I_k ∘ G_{α,β} ∘ O ∘ C_n        (12)

which can be parameterized by an isometry index 0 ≤ k < 8, a "contrast scale" α and a
"luminance shift" β. For effective quantization the value of α is chosen from a finite set
{α_0, …, α_{n−1}} (determined a priori) and β is mapped to the nearest integer. We will use
the notation T(D) to denote the net effect of the transformation T ∈ T on the domain block
D ∈ D.

4 Searching
For each range block R ∈ R, a transformation T ∈ T and domain block D ∈ D are sought
that yield a sufficiently small distortion measure L_2(R, T(D)). This search dominates
the fractal compression process, and its extension to volumetric data causes this search time
     y    x   −z     Rotation of π about
     z   −y    x     the six edge-edge
    −x    z    y     axes.
    −y   −x   −z
    −z   −y   −x
    −x   −z   −y
     z    x    y     Rotation of 2π/3 about
    −y    z   −x     the four vertex-vertex
    −z    x   −y     axes.
     y   −z   −x
     y    z    x
     z   −x   −y
    −y   −z    x
    −z   −x    y

Table 2: Fourteen more rigid body cubic rotations shown in terms of their result on the input vector (x, y, z).

     x    y    z     Identity
    −x   −y    z     Rotations
    −x    y   −z
     x   −y   −z
    −x    y    z     Reflections
     x   −y    z
     x    y   −z
    −x   −y   −z

Table 3: Eight non-cubic transformations.
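The 48 cubic isometries of Tables 1 and 2 (rotations plus their reflections) are exactly the signed axis permutations of the cube. A sketch that generates them as 3 × 3 matrices and lets the rotation/reflection split be checked by the determinant (the generator name is ours):

```python
import numpy as np
from itertools import permutations, product

def cubic_isometries():
    """All 48 signed axis permutations of the cube: the 24 rigid rotations
    (det +1) plus their compositions with reflections (det -1)."""
    mats = []
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            M = np.zeros((3, 3), dtype=int)
            for row, (axis, s) in enumerate(zip(perm, signs)):
                M[row, axis] = s  # output coordinate `row` takes +/- input axis
            mats.append(M)
    return mats
```

Applying one of these matrices to voxel coordinates (about the block center) permutes the voxels within a cubic block, which is how the isometry component of the transformation pool is realized.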

to greatly lengthen.
Designing a fast encoder hinges on the efficiency of the search for matching transformed
domain blocks. This problem is approached by introducing heuristics to guide the encoder's
search towards regions of the dataset that share common characteristics and are therefore
more likely to provide self-similar mappings. A classification scheme [26] provides such
guidance [13]. This scheme used an edge-oriented classification system designed to alleviate
both the time complexity and edge degradation problems inherent to block image coders.
This system does not easily extend to the realm of 3-D block classification as a consequence of
the complexity added by the extra dimension. One previous volumetric classification scheme
[22] thresholded the sample variance of voxels within a block. This has the undesirable
effect of avoiding rapidly convergent transformations that map high contrast regions to low
contrast range blocks.
Even with classification, this search for self-affine maps still remains sequential. Instead
of segregating blocks into a set of predefined classes, associating a real-valued key with each
block replaces the linear global block search with a logarithmic multi-dimensional nearest-neighbor
search [27]. This solution is readily generalized to 3-D block encoding and allows
for much larger search spaces, which improve coding fidelity but also require larger domain
indices.

4.1 Brute Force Search


A brute force algorithm is effective for the first-pass search of the local domain pool but is
too time consuming for the global domain pool.
For each range block R ∈ R, the compression algorithm searches the domain and transformation
pools for the domain block D ∈ D and transformation T ∈ T that minimize the
distortion L_2(T(D), R), where T(D) is the result of applying the transform T to the domain
block D. The resulting set of "self-tiling" maps produces the "fractal" code that replaces the
input volume dataset. A complete search of the virtual codebook D × T involves examining
all possible parameterizations I_k, α_i, β and their effect on every domain block in the domain
pool. The algorithm in Table 4 outlines this "brute force" solution. Every range block in
the partitioning set R is examined, and for each one the entire domain pool D and transformation
pool T is scanned for the optimal transformation.
One can prune the exhaustive search of D × T by solving for the optimal gray-value
transformation coefficients. If we represent range blocks and contracted domain blocks with
the m × 1 (m = w·h·d) column vectors [r_0 … r_{m−1}]^T and [d_0 … d_{m−1}]^T, respectively, then
the optimal choice for α and β provides the least squares fit to an over-determined system
of the form Ax = b:

    \begin{bmatrix} d_0 & 1 \\ d_1 & 1 \\ \vdots & \vdots \\ d_{m-1} & 1 \end{bmatrix}
    \begin{bmatrix} \alpha \\ \beta \end{bmatrix}
    =
    \begin{bmatrix} r_0 \\ r_1 \\ \vdots \\ r_{m-1} \end{bmatrix}.        (13)

If we consider uniform domain blocks (i.e., d_0 = d_1 = ⋯ = d_{m−1}) non-admissible, then
the columns of A are linearly independent, the matrix A^T A is invertible and the unique
least-squares solution x = [α β]^T is

    x = (A^T A)^{−1} A^T b.        (14)


    for R ∈ R do begin
        dist ← ∞;
        for D ∈ D do
            for T ∈ T do begin
                D′ ← T(D);
                if L_2(R, D′) < dist then begin
                    code ← {T, D};
                    dist ← L_2(R, D′);
                end
            end;
        replace R with quantized code;
    end;

Table 4: Exhaustive search over D × T for each range block R ∈ R.

Let \bar{d} = E(d_i) = \frac{1}{m}\sum_{i=0}^{m-1} d_i be the first moment or mean of d_i (similarly for \bar{r} = E(r_i)),
let σ_d² = E(d_i²) − (E(d_i))² be the second central moment or variance of d_i, and let
σ_rd = E(r_i d_i) − E(r_i)E(d_i) be the second central moment or covariance of r_i and d_i; then

    x = \begin{bmatrix} \alpha \\ \beta \end{bmatrix} = \begin{bmatrix} \sigma_{rd}/\sigma_d^2 \\ \bar{r} - \alpha\bar{d} \end{bmatrix}.        (15)

The orthogonalization transformation O yields \bar{d} = 0. The offset value β simply becomes
the DC component \bar{r} of the range block. The value α_i closest to α from the set {α_0, …, α_{n−1}}
is selected as our contrast scaling coefficient. β is simply mapped to the nearest integer.
It is well known from transform coding that much of the energy in an image represented in
the frequency domain resides in its DC term. Therefore, it is critical that the decoder can
faithfully reproduce β from whatever quantization scheme is used.
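The closed form of Eq. (15) is a few lines of code; this sketch (the function name `grey_coeffs` is ours) computes α and β from the variance and covariance, which agrees with solving Eq. (13) by a generic least-squares routine:

```python
import numpy as np

def grey_coeffs(d, r):
    """Optimal contrast scale alpha and luminance shift beta for
    r ~ alpha * d + beta (Eq. 15)."""
    d = np.asarray(d, dtype=np.float64).ravel()
    r = np.asarray(r, dtype=np.float64).ravel()
    var_d = d.var()  # sigma_d^2; assumed nonzero (uniform blocks non-admissible)
    cov_rd = np.mean(r * d) - r.mean() * d.mean()
    alpha = cov_rd / var_d
    beta = r.mean() - alpha * d.mean()
    return alpha, beta
```

After orthogonalization the domain mean is zero, so β collapses to the range block's DC component, exactly as the text observes.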

4.2 Volumetric Block Classi cation


Block classification was used in [13] not only for the encoding speedup that was obtained
by segregating the search, but also for determining the complexity of each block. This
allowed more expressive transformations to be used for highly detailed blocks and simpler
transformations for less interesting blocks. Devoting a higher bit rate to more complex areas
of the image is worthwhile for maximum performance. Extension of the block classification
techniques borrowed from classified VQ [26] into 3-D is non-trivial and would no doubt be
computationally expensive. The final volume coder used in our experiments used a simple
variance threshold value to separate blocks into one of two classes: boring and active. Fewer
bits were devoted to contrast scaling coefficients, and no isometries (other than the identity)
were attempted for blocks belonging to the boring class.
In [6] principal component analysis (PCA) classified blocks based on their gradients. PCA
is a well known technique for finding a set of ordered orthogonal basis vectors that express the
directions of progressively smaller variance in a given dataset [21, 14]. The dimensionality of
highly correlated datasets can be reduced in this manner by expressing their values in terms
of these new basis vectors. This is often used for finding normals to planes that best fit a set
of scattered data points {x_i}. The technique of weighted PCA assigns a relative importance
or weight w_i to each point x_i in the set. A set of principal vectors was extracted from
subvolume blocks in this manner by using the block's supporting lattice for each x_i and the
corresponding voxel values for the weight w_i. While this reduced encoding time and bit rate
for the volume coder, some complex blocks were misclassified as not containing significant
gradients, thus leaving the possibility of critical blocks producing poor collage maps. Three-dimensional
block classification remains a fertile area of study that could prove beneficial to
many areas in volume visualization.

4.3 Nearest Neighbor Search


Even with classification, we are still faced with a demanding search through N admissible
domain blocks (under the set of allowed isometries), computing least squares approximations
along the way, to find the optimal encoding for each range block. This daunting task can be
reduced to a multi-dimensional nearest neighbor search which can be performed in expected
logarithmic time [27].
Using the notation of Section 4.1, the transformed domain block Ax is the projection
of r onto the column space of A, which is spanned by the orthonormal basis vectors
e = \frac{1}{\sqrt{m}}[1 … 1]^T and φ(d), where

    \phi(x) = \frac{x - \langle x, e \rangle e}{\lVert x - \langle x, e \rangle e \rVert}.        (16)

Thus the projection Ax is equivalent to

    \mathrm{Proj}_A(r) = \langle r, e \rangle e + \langle r, \phi(d) \rangle \phi(d)        (17)

for a given domain block d. Note that φ(x) simply removes the DC term of x, which is
accomplished by the operator in Equation 10. Since e and φ(x) are orthogonal, the coefficients
α and β of the gray level transformation G from Equation 11 are not correlated.
In [27], the search for the domain block d that yields minimal coding error L_2(r, Proj_A(r))
is shown to be equivalent to the search for the nearest neighbor of φ(r) in the set of 2N vectors
{±φ(d_i)}_{i=1}^N. The problem is now reduced to finding nearest neighbors in Euclidean space, for
which there are well known algorithms [9, 30]. Given a set of m-dimensional points, [9] shows
how an optimal space-dividing kd-tree can be constructed with O(mN log N) preprocessing
steps so that the nearest neighbors of a given query point can be found in
expected O(log N) time. Since this technique suffers when the dimension m becomes large,
we down-filter all blocks to 2³ (m = 8) vectors before processing. All domain blocks that are
close to uniform (e.g., σ_d² ≤ 8) are discarded, while the remaining domain block addresses are
stored in the kd-tree twice, once for each search key ±φ(d).
Since we are down-sampling r and d to compute φ(r) and φ(d), the kd-tree search will
not guarantee finding the actual nearest neighbors. Fortunately, searching the kd-tree for
several (e.g., 10 or 20) neighbors can still be done in logarithmic time for a given query
point. For each neighbor, the least squares method for finding α and β is performed at the
range block's full resolution. Also, in order to reduce the memory requirements, we perform
the search for isometries of the range block {I_k^{−1}(r)}_{k=0}^{7} instead of explicitly storing all of
the keys {φ(I_k(d))}_{k=0}^{7} for each domain block.
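The key construction of Eq. (16) and the ±φ(d) search can be sketched with a brute-force linear scan standing in for the kd-tree (the helper names are ours). Since all keys are unit vectors, taking the largest |⟨φ(d), φ(r)⟩| is equivalent to finding the nearer of the pair ±φ(d):

```python
import numpy as np

def phi(x):
    """Remove the DC component and normalize to unit length (Eq. 16)."""
    x = np.asarray(x, dtype=np.float64).ravel()
    y = x - x.mean()
    n = np.linalg.norm(y)
    return y / n if n > 0 else y

def best_domain(r, domains):
    """Index of the domain whose key +/-phi(d) is nearest to phi(r).
    (A kd-tree over the 2N keys replaces this linear scan in practice.)"""
    q = phi(r)
    keys = np.array([phi(d) for d in domains])  # one key per domain block
    scores = np.abs(keys @ q)                   # |cosine| covers the +/- pair
    return int(scores.argmax())
```

In the full coder each candidate returned by this key search would still have its α and β computed by least squares at full resolution, as described above.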
A substantial speedup is possible if we relax the constraint of finding the true nearest
neighbors to simply finding good matches. When searching for the n nearest neighbors, the
algorithm described in [9] keeps track of the dissimilarity measure r (i.e., radius) of the nth
best match found so far. It is this value r that determines whether another branch of the
space-partitioning tree must be traversed. If we scale r by some number less than 1
(e.g., 0.2) when making this decision, we can avoid inspecting large portions of the tree. This
modification was added to our encoder without significantly affecting the quality of the maps
found.
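The feature-vector search can be sketched as follows. All names here are illustrative (not the paper's code), a brute-force scan stands in for the kd-tree of [9], and 1-D lists stand in for down-filtered blocks:

```python
import math

def phi(block):
    """Normalized feature vector: subtract the mean, scale to unit length.
    Saupe's observation: minimizing the collage error over all scale/offset
    pairs reduces to a nearest-neighbor search among {+phi(d), -phi(d)}."""
    m = sum(block) / len(block)
    centered = [v - m for v in block]
    norm = math.sqrt(sum(c * c for c in centered))
    if norm == 0.0:
        return None          # near-uniform block: discarded from the pool
    return [c / norm for c in centered]

def nearest_domain(r, domains):
    """Brute-force stand-in for the kd-tree query: return the squared
    distance, index and sign of the best-matching domain block."""
    q = phi(r)
    best = (float("inf"), None, 1.0)
    for i, d in enumerate(domains):
        f = phi(d)
        if f is None:
            continue
        for sign in (1.0, -1.0):   # each block is keyed twice: +phi and -phi
            dist2 = sum((a - sign * b) ** 2 for a, b in zip(q, f))
            if dist2 < best[0]:
                best = (dist2, i, sign)
    return best
```

In the full encoder, `phi` operates on blocks down-filtered to 2³ voxels (m = 8) and the linear scan is replaced by a kd-tree query returning several approximate neighbors.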
5 Fractal Volume Decompression
Complete decompression begins with an initial volume, typically initialized with zeroes. The
next volume in the sequence is partitioned into range blocks, and the previous volume into
domain blocks. Then the appropriate domain block in the previous volume is mapped to
each range block in the next volume. At each step of the iteration, the domain pool is
refined. As this process continues, it converges to the decompressed version.
In implementation, the process actually needs only one volume, partitioned into both
range and domain blocks, which is decompressed onto itself at each step of the iteration.
Each range block is overwritten with the transformed contents of the appropriate domain
block. If this overwrites a section of the volume later accessed as a domain block, then some
or all of that domain block will appear as it would in a later iteration. Some care is necessary for
hierarchical (e.g., octree) representations to ensure that the larger, more lossy "parent"
range blocks do not overwrite previously decoded "child" blocks from a previous iteration.
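The in-place iteration can be sketched on a 1-D analogue. The names and code layout below are assumptions for illustration; each stored code consists of a range offset and length, a domain offset, and the scale and offset coefficients:

```python
def decode(codes, size, iterations=6):
    """Iteratively apply the stored maps to a single buffer.  Each range
    block is overwritten with a scaled, offset copy of its down-filtered
    domain block read from the same buffer, so later ranges may already
    see partially updated domain data, as described in the text."""
    v = [0.0] * size                       # start from an all-zero volume
    for _ in range(iterations):
        for (r_off, r_len, d_off, scale, offset) in codes:
            # down-filter the domain block (twice the range size) by
            # averaging adjacent pairs of voxels
            dom = [(v[d_off + 2 * i] + v[d_off + 2 * i + 1]) / 2.0
                   for i in range(r_len)]
            for i in range(r_len):
                v[r_off + i] = scale * dom[i] + offset
    return v
```

With |scale| < 1 every map is contractive, so successive iterations converge to the same fixed point regardless of the initial volume.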
Faster techniques exist for decompressing fractal-coded images [20], and could be easily
extended to the volumetric case. However, these techniques are typically designed for animation
playback, and volume visualization systems are not yet fast enough to render dynamic
volume datasets in real time.
The fractal volume compression algorithm can be adjusted to permit direct rendering of
compressed data, avoiding a complete decompression step. Separating a volumetric dataset
into a set of possibly overlapping \macro-blocks" supports on-demand decompression during
rendering [32]. Such macro-blocks may be incorporated into fractal volume compression by
selecting domain blocks within the macro-block containing the range block. In this fashion,
each macro-block of the compressed volume may be decompressed independently.
6 Results
The fractal volume compression algorithm was tested on a variety of popular, publicly-
available datasets, and its performance measured using a variety of metrics, to promote
better comparison with other existing and future volume compression methods.
20
6.1 Measurement
Several quantitative methods exist for indicating the fidelity of a compression algorithm. In
the following, V is the original volume with dimensions W × H × D, and Ṽ is the decompressed
volume. The function ε(x) represents the error between V(x) and Ṽ(x):

    ε(x) = V(x) − Ṽ(x).                                    (18)
The mean-square error (mse) is one numerical measure for determining the accuracy of
a compressed volume, and is defined

    mse = ( Σ_x ε²(x) ) / ( W · H · D ).                   (19)
Another way to express the difference or "noise" between V and Ṽ is the signal-to-noise
ratio (SNR). Several versions of SNR exist, which can cause confusion when comparing
results. The signal-to-noise ratio used in [24], denoted SNR_f, measures the ratio of the
signal variance to the error variance:

    SNR_f = 10 log₁₀ ( var Ṽ(x) / var ε(x) )
          = 10 log₁₀ ( (E(Ṽ²(x)) − E²(Ṽ(x))) / (E(ε²(x)) − E²(ε(x))) ).    (20)
If ε(x) and V(x) each have a mean of zero then SNR_f is equivalent to the mean-squared
signal-to-noise ratio

    SNR_ms = 10 log₁₀ ( Σ_x Ṽ²(x) / L₂(V, Ṽ) ).            (21)

Even for datasets that are not zero-mean (as is the case here) this is often used to measure
the quality of the reconstructed signal.
The peak-to-peak signal-to-noise ratio, PSNR, is defined

    PSNR = 10 log₁₀ ( P² / var ε(x) ),                     (22)

where P² is the square of the dynamic range or peak signal value, the distance between the
largest and smallest voxel values, and the denominator is the variance of ε(x), which is often
simply approximated by the mse.
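These metrics can be computed directly; a minimal sketch (the helper name `fidelity` is an assumption, and the volume is taken as a flattened list of voxel values):

```python
import math

def fidelity(original, decoded):
    """Return (mse, SNR_ms, PSNR), the latter two in dB, following
    Equations 18, 19, 21 and 22."""
    n = len(original)
    err = [v - w for v, w in zip(original, decoded)]    # Eq. 18
    sse = sum(e * e for e in err)                       # squared L2 distance
    mse = sse / n                                       # Eq. 19
    signal = sum(w * w for w in decoded)                # Eq. 21 numerator
    snr_ms = 10.0 * math.log10(signal / sse)
    peak = max(original) - min(original)                # dynamic range P
    psnr = 10.0 * math.log10(peak * peak / mse)         # Eq. 22, var approximated by mse
    return mse, snr_ms, psnr
```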
6.2 CT Medical Data
A 2562  113 12-bit CT head dataset (containing voxel values ranging from ,1117 to 2248)
was use to test the fractal volume compression algorithm over several di erent range block
sizes w  h  d and octree subdivision down to an edge size no less than two voxels. An mse
distortion threshold value of tmse (the maximum allowed average mse per voxel) determined
which child blocks were recursively encoded. The domain pool blocks were spaced x = 4;
y = 4; and z = 2 apart, yielding approximately 230; 000 domain block records for each
octree partition level. Encoding times were measured on a dual-processor HP-9000/J200
with 256 MB of RAM, and do not include le I/O time.
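The octree-driven encoding loop can be sketched generically. All names below are hypothetical: `code_fn` stands for the domain search that produces a block's fractal code, and `err_fn` for the per-voxel mse of the resulting map:

```python
def encode_block(origin, size, code_fn, err_fn, t_mse, min_size=2):
    """Keep a block's fractal code if its per-voxel mse is within t_mse;
    otherwise subdivide into eight octants and recurse, stopping once
    the edge size reaches min_size."""
    code = code_fn(origin, size)
    if size <= min_size or err_fn(origin, size, code) <= t_mse:
        return [(origin, size, code)]
    half = size // 2
    codes = []
    for dz in (0, half):
        for dy in (0, half):
            for dx in (0, half):
                child = (origin[0] + dx, origin[1] + dy, origin[2] + dz)
                codes += encode_block(child, half, code_fn, err_fn,
                                      t_mse, min_size)
    return codes
```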
    w×h×d    t_mse    comp.      SNR_ms   PSNR    time
    2³       4,608    10.99:1    27.62    42.57     134 s
    4³       1,024    14.17:1    30.15    45.09     973 s
    4³       2,048    18.41:1    28.74    43.68     711 s
    4³       4,608    26.18:1    26.68    41.63     519 s
    8³       1,024    15.93:1    29.70    44.64   1,607 s
    8³       2,048    22.00:1    27.97    42.91   1,285 s
    8³       4,608    34.37:1    25.50    40.43   1,081 s
    3²×2     4,608    20.83:1    25.14    40.09     249 s
    6²×4     4,608    39.21:1    21.08    38.06     616 s

Table 5: Encoding results for a 12-bit 256² × 113 CT head using varying range block sizes and mse octree partitioning thresholds t_mse.
To demonstrate the fast convergence of the decoder, several reconstructed 12-bit CT
heads were created using one to six iterations of the stored transformations. The fidelity of
these reconstructed volumes (using the maps corresponding to the seventh row of Table 5)
is given in Table 6. One advantage of introducing the orthogonalization operator O (see
Equation 10) is quick and guaranteed convergence to the attractor in a fixed number of
iterations [12]. In this example, only five iterations of the simple method outlined in Section 5
were required to decode the data. Each iteration took approximately 15 seconds.
    iterations   SNR_ms   PSNR
    1            13.24    28.08
    2            21.09    36.03
    3            24.99    39.34
    4            25.47    40.40
    5            25.50    40.43
    6            25.50    40.43

Table 6: SNR for the decoded 12-bit CT head using 1 to 6 iterations of the stored transformations.
This same dataset was reduced to 8-bit voxels by mapping each original voxel v to the
range {0, ..., 255} by round(255 × (v + 1117)/3365). The 8-bit volume was compressed using
various mse child threshold values t_mse to determine when octree subdivision would occur.
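The quantization step is just the affine map above; as a sketch (the function name is illustrative):

```python
def to_8bit(v, lo=-1117, hi=2248):
    """Map a 12-bit CT voxel in [lo, hi] onto {0, ..., 255}."""
    return round(255 * (v - lo) / (hi - lo))
```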
Whereas images are generally compressed in their rendered state, volume data requires
further processing, such as classification, shading and rendering [18, 7], before being displayed
as an image. Hence errors introduced through compression yield different artifacts in the
volumetric case than in the image case. An isovalue surface rendering, such as Figure 2,
puts volume compression techniques to an extreme test. Moreover, faces are typically used
to analyze compression fidelity since humans have developed visual skills specifically for
interpreting subtle changes in facial expression.
6.3 MRI Medical Data

Fractal volume compression was also tested on an 8-bit 256² × 109 MRI dataset of a human
head. The noise in MRI datasets challenges compression algorithms more than CT datasets.
    w×h×d    t_mse    comp.      SNR_f    PSNR    time
    2³       18       11.02:1    25.90    41.49     166 s
    3²×2     18       13.92:1    24.53    40.10     241 s
    4²×2      9       24.61:1    22.04    37.63     448 s
    4²×2     18       29.07:1    21.52    37.12     298 s
    4³       18       18.02:1    25.30    40.87     607 s
    6²×4      9       20.78:1    22.81    38.39     830 s
    6²×4     18       27.51:1    21.74    37.32     640 s
    8²×4     18       55.35:1    14.41    30.13     592 s
    8³       18       22.27:1    24.18    39.75   1,003 s
    12²×8     9       24.08:1    20.69    36.29   1,362 s

Table 7: Encoding results for an 8-bit 256² × 113 CT head using varying mse octree partitioning thresholds.
Figure 1: Slice #56 of the CT head volume dataset: Original (upper left), 18:1 4³-voxel
range blocks (upper right), 22:1 6×4²-voxel range blocks (lower left), 22:1 8³-voxel range
blocks (lower right).
Figure 2: Isovalue surface rendering of the skin (isovalue = 50) of the CT head volume
dataset: Original (upper left), 18:1 4³-voxel range blocks (upper right), 22:1 6×4²-voxel
range blocks (lower left), 22:1 8³-voxel range blocks (lower right).
For example, a simple run-length encoding reduces the CT head dataset to 68% of its
original size, whereas RLE failed to compress the MRI head dataset at all.
    t_mse   comp.       mse      SNR_f    PSNR
     36      20.23:1    11.116   18.332   37.672
     64      29.82:1    17.939   16.281   35.692
    100      42.72:1    25.813   14.520   34.014
    144      60.80:1    36.058   12.977   32.563
    169      85.30:1    48.355   11.611   31.289
    256     118.23:1    61.501   10.480   30.246
    324     160.06:1    75.232    9.527   29.372
    400     211.97:1    89.530    8.696   28.617
    900     728.85:1   152.33     5.949   26.305

Table 8: Coding results for an 8-bit 256² × 109 MRI of a human head with varying octree partitioning threshold values t_mse. Range block size varied dynamically from 16³ to 2³. A domain spacing of Δ_{x,y,z} = 4 was used along each axis.
Figure 3 demonstrates the fidelity of the fractal volume compression on a slice of the MRI
dataset. Even at the extremely high 729:1 rate, the skin and bone edges are reproduced, but
the textured detail is obscured.
6.4 CT Engine Data

Table 12 contains results from an 8-bit 256² × 110 CT scan of an engine block. The engine
contains many sharp edges, on which fractal volume compression performs particularly well.
A side-by-side comparison (Figure 4) shows the compressed version to be indistinguishable
from the original.
Figure 3: Slice #54 of the MRI head volume dataset: Original (upper left), 20:1 (upper
center), 25:1 (upper right), 30:1 (lower left), 43:1 (lower center), 729:1 (lower right).
    block size   comp.      SNR_f (dB)   PSNR (dB)
    2³            8.59:1    21.31        40.60
    2²×3         11.09:1    20.24        39.53
    3²×2         15.35:1    18.99        38.30
    4³           12.14:1    20.98        40.26
    4²×6         16.29:1    19.08        38.39
    8³           13.59:1    20.72        39.99
    16³          13.91:1    20.47        39.76
    32³          13.96:1    20.00        39.30

Table 9: Comparison of compression rate and fidelity versus range block size and hierarchy depth for the MRI head dataset, using an mse octree partitioning threshold t_mse = 18 and domain spacing Δ_{x,y,z} = 4 along each axis.
    partition     total         local search   admissible global
    range size    range codes   codes found    domain blocks
    16³             1,792         1,109        47,703
    8³              4,302           144        42,992
    4³             26,724         1,037        36,445
    2³            104,521        75,606        29,218
    Total codes transmitted = 137,339.

Table 10: Coding statistics for an 8-bit 256² × 109 MRI of a human head at various levels in the octree, using a partitioning threshold value t_mse = 36 and domain spacing Δ_{x,y,z} = 4 along each axis.
    partition     total         local search   admissible global
    range size    range codes   codes found    domain blocks
    16³             1,792         1,443        47,703
    8³                744            96        42,992
    4³                879           356        36,445
    2³                274           273        29,218
    Total codes transmitted = 3,689.

Table 11: Coding statistics for an 8-bit 256² × 109 MRI of a human head at various levels in the octree, using a partitioning threshold value t_mse = 900 and domain spacing Δ_{x,y,z} = 4 along each axis.
    direct storage                 7,208,960 bytes
    RLE compressed                 5,413,292 bytes
    FVC compressed                   216,460 bytes
    compression time               552 s
    compression rate               33.3:1
    compression rate (over RLE)    25.0:1
    mse                            4.17
    PSNR                           41.93
    SNR_ms                         29.10
    SNR_f                          28.13

Table 12: Results from compressing an 8-bit 256² × 110 CT scan of an engine block. 8³-voxel range blocks (octree partitioned down to 2³ voxels) were used with a partitioning threshold of t_mse = 18. A domain spacing Δ_{x,y,z} = 4 was used along all three axes.
Figure 4: Original (left) and compressed (right) renderings of the engine block CT dataset.
6.5 Slicing: Comparison with Fractal Image Compression

Volumetric encoding performs better than slice-by-slice encoding and also offers other benefits,
such as direct rendering from the compressed version.
A fractal image coder was used to compress each cross section of some of the datasets.
Independent compression of the individual slices of the 8-bit CT head resulted in a volume
compressed by 18.06:1 with a PSNR of 39.21 dB, but Table 7 shows that similar fidelity
(39.75 dB) resulted from fractal volume compression at a better rate of 22.27:1. Slice-by-slice
compression of the 8-bit MRI resulted in a 30.74:1 rate at a PSNR of 33.717 dB, whereas
Table 8 shows the volumetric version compressed at a better fidelity (34.014 dB) and a higher
ratio of 42.72:1. As the compression rate increases, the benefits of volumetric compression
of volumetric data appear to grow.
6.6 Dicing: Integration of Macro-Blocks

Fractal volume compression supports the rendering of a volume directly from the compressed
version by first dicing the volume into macro-blocks [32] (groups of blocks), and then
compressing each macro-block independently. Each macro-block may then be decompressed
independently. Hence large datasets may be rendered on workstations lacking sufficient memory
to contain the entire dataset, and only visible sections of the volume need be decompressed.
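Dicing is a straightforward partition of the voxel grid; a minimal sketch (names assumed) that enumerates macro-block origins so each block can be encoded and decoded on its own:

```python
def dice(dims, mb=32):
    """Return the origin of every mb^3 macro-block covering a volume of
    the given (width, height, depth); edge blocks may be partial."""
    w, h, d = dims
    return [(x, y, z)
            for z in range(0, d, mb)
            for y in range(0, h, mb)
            for x in range(0, w, mb)]
```

For the 256² × 113 CT head this yields 8 × 8 × 4 = 256 macro-blocks; the domain pool for each range block is then restricted to the macro-block that contains it.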
The 8-bit CT head was also diced into 32³-voxel macro-blocks, which were compressed
individually, to measure any loss of fidelity due to the limited domain pools. Since the
macro-blocks were significantly smaller than the entire volume, a tighter domain spacing
was allowed, resulting in an impressive compressed volume fidelity PSNR of 41.28 dB
(SNR_ms of 27.50 dB). The compression rate for such a volume remains under investigation,
but will surely beat the 11:1 ratio in Table 7.
6.7 Comparison with Previous Results

The CT head dataset results compare with other volume compression algorithms based on
vector quantization and the discrete cosine transform. The VQ algorithm was tested on
an 8-bit 128³ version of the dataset [24] whereas the DCT algorithm was tested on a 12-bit
256³ version [32]. As the resolution of the original dataset is 256² × 113, both results are
guilty of some form of interpolation or padding, and such added redundancy can inflate the
compression ratio by as much as a factor of two.
Using voxel-spanning blocks, the VQ algorithm was capable of compressing the 128³
volume by 40:1 at an SNR_f of 17.6 dB [24]. At about 40:1, fractal volume compression yields
a measurably better volume of 21.08 dB.
The performance of this fractal volume compression implementation falls short of the
mark set by the DCT volume compression algorithm [32]. Some have produced fractal
image compression algorithms that outperform the DCT [4], and one would expect similar
performance from fractal volume compression once fractal techniques receive the same level
of attention and optimization as the DCT currently enjoys.
7 Conclusion

Fractal image compression extends simply and directly to three dimensions. As in the
image case, the technique still matches domain blocks to range blocks, but the additional
dimension produces many more isometries, yielding a richer transformed-domain pool, a higher
compression rate and better fidelity.

The increased search time caused by the additional dimension is overcome by several
sophisticated classification schemes. In particular, the nearest-neighbor method [27], which
is the only one to reduce the time complexity of the search, performed best for fractal volume
compression.
The performance of fractal volume compression beats VQ and rivals DCT volume
compression. There is evidence that fractal image compression can beat DCT image compression
[4], and there are many common cases where fractal compression is the preferred technique.
The fractal volume compression system used for these results is available at:
    ftp.eecs.wsu.edu:/pub/hart/fvc.tar.gz
7.1 Applications

Contrary to its name, fractal image compression performs better on sharp edges and worse
in textured areas. Hence, fractal volume compression performs better on clearly delineated
regions, such as bone and skin, but worse on tissue and finely detailed areas. The DCT volume
compression method blurs both edges and fine detail [32]. Hence, fractal volume compression
appears well suited for the compression of classified, filtered datasets, and also of synthesized
datasets.
Fractal compression techniques tend to perform better at high compression rates compared
to other methods. For medical volume compression applications, any data loss due
to compression is likely unacceptable. Hence, volume compression in general may find its
most suitable application in medicine for the high-rate, low-fidelity indexing, previewing and
presentation of volume datasets, for which fractal compression techniques are the best choice.
7.2 Future Research

Fast rendering algorithms based on the fractal representation appear feasible. As [23] preceded
[24], further research on fractal volume compression will likely produce a fast rendering
algorithm.

Fractal compression research is much newer than other compression techniques such as
the DCT or VQ, and the technique is still not well understood nor fully developed. As new
enhancements appear for fractal image compression, they will likely improve fractal volume
compression as well.

The Bath fractal transform [19] operates without a domain search, and could be easily
extended to 3-D and applied to volumetric data. This technique compensates for the loss
of block diversity by augmenting the affine block transformation with linear, quadratic and
even cubic functions.

Compression of volumes sampled on an irregular grid would require a more sophisticated
partitioning and block-transformation scheme.
7.3 Acknowledgments

Several volume visualization systems were used during the development of fractal volume
compression [1, 17, 29]. Data for the CT head was obtained from the University of North
Carolina at Chapel Hill volume rendering test data set.

This research is part of the recurrent modeling project, which is supported in part by a
gift from Intel Corp. The research was performed using the facilities of the Imaging Research
Laboratory, which is supported in part under grants #CDA-9121675 and #CDA-9422044.
The second author is supported in part by NSF Research Initiation Award #CCR-9309210.
The third author is supported in part by the NSF under grants #IRI-9209212 and #IRI-9506414.
References

[1] Ricardo S. Avila, Lisa M. Sobierajski, and Arie E. Kaufman. Towards a comprehensive volume visualization system. In Proc. of IEEE Visualization '92, pages 13–20, Oct. 1992.

[2] Michael F. Barnsley and Stephen G. Demko. Iterated function systems and the global construction of fractals. Proceedings of the Royal Society A, 399:243–275, 1985.

[3] Michael F. Barnsley, John H. Elton, and D. P. Hardin. Recurrent iterated function systems. Constructive Approximation, 5:3–31, 1989.

[4] Kai Uwe Barthel, Thomas Voye, and Peter Noll. Improved fractal image coding. In Proc. of Picture Coding Symposium, March 1993.

[5] J. M. Beaumont. Image data compression using fractal techniques. BT Technology Journal, 9(4), October 1991.

[6] Wayne O. Cochran, John C. Hart, and Patrick J. Flynn. Principal component analysis for fractal volume compression. In Proc. of Western Computer Graphics Symposium, pages 9–18, March 1994.

[7] Robert A. Drebin, Loren Carpenter, and Pat Hanrahan. Volume rendering. Computer Graphics, 22(4):65–74, Aug. 1988.

[8] Yuval Fisher. Fractal image compression. In Przemyslaw Prusinkiewicz, editor, Fractals: From Folk Art to Hyperreality. SIGGRAPH '92 Course #12 Notes, 1992. To appear: Data Compression, R. Storer (ed.), Kluwer.

[9] Jerome H. Friedman, Jon Louis Bentley, and Raphael Ari Finkel. An algorithm for finding best matches in logarithmic expected time. ACM Transactions on Mathematical Software, 3(3), September 1977.
[10] Alan Gersho and Robert M. Gray. Vector Quantization and Signal Compression. Kluwer, Boston, 1992.

[11] J. Hutchinson. Fractals and self-similarity. Indiana University Mathematics Journal, 30(5):713–747, 1981.

[12] Geir Egil Øien. L2-Optimal Attractor Image Coding with Fast Decoder Convergence. PhD thesis, Norwegian Institute of Technology, 1993.

[13] Arnaud E. Jacquin. Image coding based on a fractal theory of iterated contractive image transformations. IEEE Transactions on Image Processing, 1(1):18–30, Jan. 1992.

[14] I. T. Jolliffe. Principal Component Analysis. Springer-Verlag, New York, 1986.

[15] James T. Kajiya and Timothy L. Kay. Rendering fur with three dimensional textures. Computer Graphics, 23(3):271–280, July 1989.

[16] Arie Kaufman. Efficient algorithms for 3D scan conversion of parametric curves, surfaces, and volumes. Computer Graphics, 21(4):171–179, July 1987.

[17] Philippe Lacroute and Marc Levoy. Fast volume rendering using a shear-warp factorization of the viewing transformation. In Computer Graphics, Annual Conference Series, pages 451–458, July 1994. Proc. of SIGGRAPH '94.

[18] Marc Levoy. Display of surfaces from volume data. IEEE Computer Graphics and Applications, 8(3):29–37, 1988.

[19] Donald M. Monro and Frank Dudbridge. Fractal approximation of image blocks. In Proc. of ICASSP, volume 3, pages 485–488, 1992.

[20] Donald M. Monro and Frank Dudbridge. Rendering algorithms for deterministic fractals. IEEE Computer Graphics and Applications, 15(1):32–41, Jan. 1995.

[21] Donald F. Morrison. Multivariate Statistical Methods. McGraw-Hill Book Company, New York, 1976.
[22] Takeshi Naemura and Hiroshi Harashima. Fractal encoding of a multi-view 3-D image. In Bob Werner, editor, Proceedings ICIP-94, volume 1. IEEE Computer Society Press, November 1994.

[23] Paul Ning and Lambertus Hesselink. Vector quantization for volume rendering. In Proc. of 1992 Workshop on Volume Visualization, pages 69–74. ACM Press, 1992.

[24] Paul Ning and Lambertus Hesselink. Fast volume rendering of compressed data. In Gregory M. Nielson and Dan Bergeron, editors, Proc. of Visualization '93, pages 11–18. IEEE Computer Society Press, 1993.

[25] William B. Pennebaker and Joan L. Mitchell. JPEG Still Image Data Compression Standard. Van Nostrand Reinhold, New York, 1993.

[26] B. Ramamurthi and A. Gersho. Classified vector quantization of images. IEEE Transactions on Communications, 34, Nov. 1986.

[27] Dietmar Saupe. Accelerated fractal image compression by multi-dimensional nearest neighbor search. In J. A. Storer and M. Cohn, editors, Proceedings DCC'95 Data Compression Conference. IEEE Computer Society Press, March 1995.

[28] Dietmar Saupe and Raouf Hamzaoui. A guided tour of the fractal image compression literature. In John C. Hart, editor, SIGGRAPH '94 Course #13 Notes: New Directions for Fractal Models in Computer Graphics, pages 5-1–5-21. ACM SIGGRAPH, 1994.

[29] Barton T. Stander and John C. Hart. A Lipschitz method for accelerated volume rendering. In Proc. of Volume Visualization Symposium '94, Oct. 1994. To appear.

[30] Pravin M. Vaidya. An O(n log n) algorithm for the all-nearest-neighbors problem. Discrete & Computational Geometry, 4:101–115, 1989.

[31] Greg Vines. Signal Modeling with Iterated Function Systems. PhD thesis, Georgia Institute of Technology, May 1993.
[32] Boon-Lock Yeo and Bede Liu. Volume rendering of DCT-based compressed 3D scalar
data. IEEE Transactions on Visualization and Computer Graphics, 1(1), March 1995.