
TWO-DIMENSIONAL ORTHOGONAL DCT EXPANSION

IN TRIANGULAR AND TRAPEZOID REGIONS

ABSTRACT
It is known that the 2-D DCT basis is complete and
orthogonal in a rectangular region. In this paper, we
introduce a way to generate a complete and
orthogonal 2-D DCT basis in a trapezoid region or a
triangular region without using the complicated
Gram-Schmidt method. Moreover, since a polygon can be
decomposed into several triangular regions, the proposed
method is also suitable for polygonal regions. Our
algorithm greatly generalizes the JPEG algorithm:
instead of dividing an image into 8×8 blocks, we can
divide an image into trapezoid or triangular regions
and then transform and code each of them. In addition
to the DCT basis, our method can also be used for
generating the 2-D complete and orthogonal DFT
basis, KLT basis, Legendre basis, Hadamard (Walsh)
basis, and polynomial basis in trapezoid and
triangular regions.

... Karhunen-Loève transform (KLT) and has higher ability for
decorrelation, after performing the DCT, most of the energy is
concentrated in the low-frequency region, which is very
helpful for compression.
Although the DCT in (1) is popular in image
compression, it has a limitation: it is only
orthogonal in an M×N rectangular region. For
a region with another shape, it may not be orthogonal.
For these types of regions, we can use the
Gram-Schmidt algorithm to convert the DCT basis into an
orthogonal basis, but this is very time-consuming and
round-off error may accumulate during the computation.
In this paper, we find that, with some modification,
the DCT basis can also be complete and orthogonal in a
triangular region, a trapezoidal region, or their twisted
forms.
Furthermore, since a polygon can be viewed as a
combination of triangles, the proposed method can also be
applied to a polygonal region. We can first divide an
n-sided polygon into n-2 triangular regions (instead of 8×8
blocks) and then perform DCT expansion for each triangular
region.
Therefore, with the proposed method, we can perform
DCT expansion for an arbitrary polygonal region. This
makes the JPEG algorithm much more flexible.
Moreover, in addition to the DCT basis, the proposed
method can also be applied to other discrete orthogonal
bases with even and odd symmetries, such as the KLT
basis, the DFT basis, the Hadamard (Walsh) basis, the
discrete Legendre basis, and other discrete orthogonal
polynomial bases. With the proposed method, we can
convert them into a complete and orthogonal vector set in
the trapezoid and triangular regions.
2. COMPLETE AND ORTHOGONAL DCT BASIS IN
THE TRAPEZOID REGION
Here, we define the trapezoid as a region that has M rows
(or columns) and, if the number of pixels in the mth row (m
= 0, 1, ..., M-1) is denoted by K(m), then
K(m) + K(M-1-m) is a constant.
(4)
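Condition (4) is easy to test for a region described by its row widths. The following is a minimal sketch, not the authors' code: the function name `satisfies_trapezoid_condition` and the example width lists are illustrative assumptions. When the constant in (4) is called N, as in the later counting argument, the region contains MN/2 pixels.

```python
def satisfies_trapezoid_condition(K):
    """Return True if K[m] + K[M-1-m] is the same constant for every row m.

    K is the list of row widths K(0), ..., K(M-1) (hypothetical input format).
    """
    M = len(K)
    pair_sums = {K[m] + K[M - 1 - m] for m in range(M)}
    return len(pair_sums) == 1


# Illustrative row widths (assumed examples, not taken from the figures).
triangle_like = [1, 3, 5, 7]   # K(0) = 1: a triangular special case
not_trapezoid = [1, 2, 4, 8]   # pair sums 9 and 6 differ

for widths in (triangle_like, not_trapezoid):
    print(widths, satisfies_trapezoid_condition(widths))

# When the condition holds with constant N, the pixel count is M * N / 2.
M = len(triangle_like)
N = triangle_like[0] + triangle_like[-1]
print(sum(triangle_like), M * N // 2)
```

Because the rows pair up as (m, M-1-m), summing K(m) + K(M-1-m) over all m counts every pixel twice, which is where the factor 1/2 comes from.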
Fig. 1: A trapezoid region that satisfies (4) and whose
rows (from the 0th row to the (M-1)th row) all start at
the same column. Black dots mean the pixels in the
trapezoid region.

[Fig. 2, panels (a) and (b). Recoverable labels: rows m = 0, 1, 2, ..., M-2, M-1; columns n = 0, ..., N-1; Region A; Region B; "rotation by 180°"; "Rectangular region".]

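$$\underbrace{\frac{M}{2}\cdot\frac{N}{2}}_{\text{both } p \text{ and } q \text{ are even}} + \underbrace{\frac{M}{2}\cdot\frac{N}{2}}_{\text{both } p \text{ and } q \text{ are odd}} = \frac{MN}{2}. \tag{17}$$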

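Case 2: M is odd and N is even: Since there are (M+1)/2
even p and (M-1)/2 odd p, the number of (p, q) that
satisfy (14) is:

$$\underbrace{\frac{M+1}{2}\cdot\frac{N}{2}}_{\text{both } p \text{ and } q \text{ are even}} + \underbrace{\frac{M-1}{2}\cdot\frac{N}{2}}_{\text{both } p \text{ and } q \text{ are odd}} = \frac{MN}{2}. \tag{18}$$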

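Case 3: M is even and N is odd:

$$\underbrace{\frac{M}{2}\cdot\frac{N+1}{2}}_{\text{both } p \text{ and } q \text{ are even}} + \underbrace{\frac{M}{2}\cdot\frac{N-1}{2}}_{\text{both } p \text{ and } q \text{ are odd}} = \frac{MN}{2}. \tag{19}$$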

Note that it is impossible for both M and N to be odd. In
that case, from (5), taking m = (M-1)/2 gives
2K((M-1)/2) = N, so K((M-1)/2) = N/2, which is not an integer.
Therefore, in all the cases, we can obtain MN/2 DCT
bases from Theorem 2, which is equal to the number of
points in the trapezoid region A. Thus, the DCT bases
obtained from Theorem 2 form a complete and
orthonormal set in the trapezoid region A.
#
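The counting argument above can be checked numerically. The sketch below assumes, as the case analysis indicates, that the index pairs (p, q) selected by (14) are exactly those in which p and q are both even or both odd, with 0 <= p < M and 0 <= q < N; the function name is an illustrative choice.

```python
def count_same_parity_pairs(M, N):
    """Count pairs (p, q), 0 <= p < M, 0 <= q < N, where p and q have equal parity."""
    return sum(1 for p in range(M) for q in range(N) if (p - q) % 2 == 0)


# One example for each admissible case: (even, even), (odd, even), (even, odd).
for M, N in [(4, 8), (5, 8), (4, 7)]:
    count = count_same_parity_pairs(M, N)
    print(M, N, count, 2 * count == M * N)
```

In each admissible case the count equals MN/2, the number of pixels in trapezoid region A. For M and N both odd the count would be (MN+1)/2, which is consistent with that case being excluded.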
In Fig. 3, we give an example. Fig. 3(a) is a trapezoid
region. We use (14)-(16) to derive its complete and
orthonormal DCT set (consisting of 16 bases); the results
are shown in Fig. 3(b).

Fig. 3: The complete and orthonormal 2-D DCT basis in a
trapezoid region. (a) The trapezoid region. (b) The 16
bases C0,0, C0,2, C0,4, C0,6, C1,1, C1,3, C1,5, C1,7,
C2,0, C2,2, C2,4, C2,6, C3,1, C3,3, C3,5, C3,7.

3. EXTENDING TO GENERALIZED TRAPEZOID,
TRIANGULAR, AND POLYGONAL REGIONS
We have derived the complete and orthonormal DCT basis
for the trapezoid region whose first pixels in each row are
aligned at the same column, as in Fig. 1. In fact, our
results can also be applied to other types of regions.
First, our results can be applied to any trapezoid
region that satisfies (4), even if the first pixels in each row
are not aligned at the same column. For a region as in
Fig. 4(a), we can first shear it into the form in Fig. 4(b),
then use the method in Section 2 to find the complete
orthogonal DCT bases, and then shear the bases back.

Fig. 4: Shearing (a) a region that satisfies (5) into (b) the
trapezoid region whose first pixels in each row are
aligned at the same column.

Furthermore, our method can also be applied to
trapezoid regions that are rotated forms of Fig. 1 or
Fig. 4(a).
Moreover, since a triangular region can be viewed
as a special case of a trapezoid region whose number of
pixels in the first (or the last) row is 1 (i.e., in (5), K(0) = 1
or K(M-1) = 1), as in Fig. 5, the method in Theorem
2 can also be used for the triangular region.
Furthermore, since an n-sided polygonal region can be
viewed as a combination of n-2 triangular regions, we can
also use our method to perform DCT expansion for a
polygonal region.

Fig. 5: A triangular region can be viewed as a special
case of the trapezoid region where K(0) = 1 or K(M-1)
= 1 in (4).
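The decomposition of an n-sided polygon into n-2 triangles can be illustrated with a fan triangulation. This is a minimal sketch under the assumption that the polygon is convex and its vertices are listed in boundary order; the paper does not state which triangulation it uses, and the names `fan_triangulate` and `polygon_area` are illustrative.

```python
def fan_triangulate(vertices):
    """Split a convex polygon (vertices in boundary order) into n-2 triangles.

    Every triangle shares vertices[0]; this simple scheme is only valid for
    convex polygons (assumption of this sketch).
    """
    n = len(vertices)
    if n < 3:
        raise ValueError("a polygon needs at least 3 vertices")
    return [(vertices[0], vertices[i], vertices[i + 1]) for i in range(1, n - 1)]


def polygon_area(points):
    """Area of a simple polygon by the shoelace formula."""
    n = len(points)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Illustrative convex hexagon (n = 6), so n - 2 = 4 triangles are expected.
hexagon = [(0, 0), (4, 0), (6, 2), (4, 4), (0, 4), (-2, 2)]
triangles = fan_triangulate(hexagon)
print(len(triangles))
print(polygon_area(hexagon), sum(polygon_area(t) for t in triangles))
```

The triangle areas add up to the polygon area, so the triangles cover the polygon without overlap. A non-convex polygon would need a more general triangulation, such as ear clipping.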
However, in real image compression it is hard to find a
trapezoid that matches an arbitrary shape accurately.
Instead of looking for a perfectly matched trapezoid, we
find the approximate trapezoid with the largest area that
is contained inside the arbitrary shape.
To obtain a higher compression ratio, we want the
trapezoid to be contained inside the shape, so that most of
the pixels in the trapezoid region have similar
characteristics (grey level values). In other
words, the energy in this trapezoid region is mostly
concentrated in the low frequency region, which is helpful
for image compression. Fig. 6 shows an example of finding an
approximate trapezoid in an arbitrary region. Fig. 6(a) is
an arbitrary shape and Fig. 6(b) is one of the ways to find
the approximate trapezoid region. We can see that the
trapezoid cannot exactly match the shape in Fig. 6(a).
Therefore, we may find more trapezoids of smaller size
in the rest of the region to cover the entire shape. In
Section 5, we will show how to deal with this problem. In
fact, a small number of missing points is tolerable. They can
be easily recovered by pixel interpolation in
post-processing.
Fig. 6: Finding (b) an approximate trapezoid region in (a)
an arbitrary shape.
4. EXTENDING TO OTHER SYMMETRIC
ORTHOGONAL BASIS
In Sections 2 and 3, we discussed how to derive the
complete and orthogonal DCT basis in a triangular or a
trapezoid region. In fact, our method is also suitable for
other types of bases. Since Theorem 2 was derived based
on (8), if a basis set is complete and orthogonal in a
rectangular region and has the even / odd symmetric
relation as in (8), we can also use Theorem 2 to convert it
into a complete and orthogonal basis set in the
triangular and the trapezoid regions.
For example, in digital signal processing [5], the basis
sets of the 2-D discrete Fourier transform (DFT), the 2-D
discrete Hartley transform, the 2-D number theoretic
transform (NTT), the 2-D discrete Legendre transform,
the 2-D discrete orthogonal polynomial expansion, and
the 2-D Hadamard (Walsh) transform all have the
even / odd symmetric relation as in (8). Therefore, we can
use Theorem 2 to convert them into complete and
orthonormal basis sets in a triangular or a trapezoid
region.
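The even / odd symmetry can be checked numerically for two of the bases named above. The sketch below assumes that the relation in (8) means each 1-D basis vector is either symmetric or antisymmetric under index reversal (n to N-1-n). The helper names, the unnormalized DCT-II form, and the Sylvester ordering of the Hadamard matrix are choices made for this illustration only.

```python
import math


def dct_basis(N):
    """Rows of the unnormalized 1-D DCT-II basis: cos(pi * (2n + 1) * k / (2N))."""
    return [[math.cos(math.pi * (2 * n + 1) * k / (2 * N)) for n in range(N)]
            for k in range(N)]


def hadamard_basis(N):
    """Hadamard matrix in Sylvester (natural) order; N must be a power of two."""
    H = [[1]]
    while len(H) < N:
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H


def symmetry_sign(vec, tol=1e-9):
    """Return +1 if vec is even under index reversal, -1 if odd, else None."""
    rev = vec[::-1]
    if all(abs(a - b) < tol for a, b in zip(vec, rev)):
        return 1
    if all(abs(a + b) < tol for a, b in zip(vec, rev)):
        return -1
    return None


print([symmetry_sign(v) for v in dct_basis(8)])
print([symmetry_sign(v) for v in hadamard_basis(4)])
```

Every vector in both sets comes out as either even or odd, which is the property this sketch assumes Theorem 2 relies on. For the DCT the sign alternates with the frequency index k.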
We give an example of deriving the complete orthogonal
Hadamard (Walsh) basis set for the triangular region as in
Fig. 7(a). Following the method in Fig. 2(b), we first
convert it into a 4×4 rectangular region. The 2-D
orthogonal Hadamard basis for the 4×4 rectangular region
is [6]:

5. APPLICATIONS IN IMAGE COMPRESSION AND
SIGNAL ANALYSIS
This section is divided into three parts. First, we will
discuss the proposed method used in a trapezoid region.
Section 5.2 introduces the new segmentation and
compression algorithms. Section 5.3 shows the
compression procedure for the entire image.

5.1. Proposed method in a specific trapezoid region
The proposed method provides an efficient way to
transform and code a trapezoid or triangular shape object.
In Figs. 9 and 10, we show a simulation.

Fig. 9: (a) A laboratory image. (b) In a 2-D image, the
door always has the shape of a trapezoid (the door region
is marked).

Although a door has the shape of a rectangle, in a 2-D
image it always takes a trapezoid form, as in Fig.
9(b). We use three methods to transform and code
the door region in Fig. 9(b): (a) the proposed method, (b)
the DCT basis orthogonalized by the Gram-Schmidt
method, and (c) applying the 1-D DCT along the x-axis and
the y-axis, as in the method used in MPEG-4 [4]. Their
running times are:
(a) proposed: 0.0364 sec
(b) Gram-Schmidt: 1032.87 sec
(c) the 1-D DCT method in MPEG-4: 0.0701 sec. (23)
Then, in Fig. 10, we show the normalized partial sums
of the energies of the largest DCT coefficients of the three
methods.

Fig. 10: Normalized partial sums P(j) (see (24)), which
measure the performance of energy concentration,
using (a) the proposed method, (b) the DCT obtained
by the Gram-Schmidt method, and (c) the two
directional 1-D DCT in MPEG-4.

From (23), the proposed method is much faster than
the Gram-Schmidt method and its energy concentration is
as good as that of the Gram-Schmidt method (see
Fig. 10). Moreover, compared with the shape adaptive
DCT method in MPEG-4, since our method performs the
DCT with a fixed number of points for each row and
column, our method has both less computation time and
better energy concentration than the 1-D DCT method in
MPEG-4.

5.2. New Segmentation and Compression Algorithms
With the proposed method, the algorithm for image
compression can become much more general. In the
existing JPEG algorithm, an image is first divided into
several 8×8 blocks, as in Fig. 11(a). Now, with the proposed
method, we can divide an image into several trapezoid,
rectangular, or triangular blocks instead of 8×8
rectangular blocks, as in Fig. 11(b).
Fig. 11: (a) The existing JPEG cuts an image into several
8×8 rectangular blocks. (b) With the proposed
method, we can divide an image into rectangular,
trapezoid, or triangular blocks.
Compared with the original JPEG algorithm, the
method in Fig. 11(b) is more flexible. Since the
boundaries between two blocks can have directions not
parallel to the x- and y-axes, we can make them match the
edges of the objects. Then, the YCbCr values in a block
will be more uniform, which is good for compression.
To make the block exactly match the shape of the
object, as is done in MPEG-4, we need extra data
to record the edges of the objects, which is not good for
compression.
Using the method in Fig. 11(b) avoids this problem.
Since the boundary consists of straight lines, to record the
shape of a block, we only have to record its corners.
Moreover, from Section 4, since Theorem 2 can also
be used for deriving the 2-D complete and orthogonal
DFT, NTT, and Hadamard bases in a trapezoid or
triangular region, the proposed method is also
useful for signal analysis, filter design, CDMA, and other
signal processing applications.
5.3. Image Compression with proposed method
Section 5.1 shows the compression in a specific trapezoid
region. However, for general images we can hardly find a
trapezoid which can exactly match the shape of the object.
Therefore, finding the appropriate trapezoids is very
important in our proposed method.
Images are divided into four regions: lower frequency
regions, higher frequency regions, border regions, and
the corner and boundaries part. The lower frequency
regions are trapezoids; they are depicted in Fig. 12(b). We
divide this image into eight low frequency parts. The lines
in Fig. 12(b) denote the boundaries of the trapezoid
regions. The trapezoid DCT is used in the lower frequency
regions, and the corner and boundaries part is coded by
geometric coding techniques. The arbitrary shape DCT using
the Gram-Schmidt method is used in the higher frequency
regions and the border regions because their sizes are small
enough that it does not cost too much processing time.
Fig. 12: (a) A fruit image. (b) The lower frequency regions
found in the fruit image.
We try to find the largest trapezoid that is contained inside
each lower frequency region, so that a higher
compression ratio can be obtained in the compression
process. However, when dividing the objects into many
trapezoid regions, the optimal solution is difficult to find.
There are two problems in the dividing procedure:
overlapped trapezoids and missing points. Missing points
mean that there are gaps between the trapezoids we found.
This can be dealt with by pixel interpolation. The overlapped
trapezoids problem arises when we divide the object into
larger trapezoids. It can be easily removed by simply choosing
the average value or just dropping one of the points. Missing
points may cause larger error, so we prefer to process
more data (the overlapped trapezoids problem) rather than
have missing points between the regions.
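The two fixes described above, averaging overlapped pixels and interpolating missing ones, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: each decoded region is represented as a dictionary from (row, column) to pixel value, missing pixels are filled with the mean of their available 4-neighbours, and every uncovered pixel of the array is treated as missing. The function names are hypothetical.

```python
def merge_regions(height, width, regions):
    """Combine decoded regions into one image.

    regions: list of dicts mapping (row, col) -> pixel value (assumed format).
    Pixels covered by several regions are averaged; uncovered pixels are None.
    """
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for region in regions:
        for (r, c), value in region.items():
            total[r][c] += value
            count[r][c] += 1
    return [[total[r][c] / count[r][c] if count[r][c] else None
             for c in range(width)] for r in range(height)]


def fill_missing(image):
    """Fill None pixels with the mean of their available 4-neighbours."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(h):
        for c in range(w):
            if image[r][c] is None:
                neighbours = [image[rr][cc]
                              for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                              if 0 <= rr < h and 0 <= cc < w and image[rr][cc] is not None]
                if neighbours:  # pixels with no known neighbour stay None
                    out[r][c] = sum(neighbours) / len(neighbours)
    return out


# Tiny example: pixel (0, 0) is overlapped, pixel (0, 1) is missing.
region_a = {(0, 0): 10.0}
region_b = {(0, 0): 20.0, (0, 2): 30.0}
merged = merge_regions(1, 3, [region_a, region_b])
print(merged)
print(fill_missing(merged))
```

A practical version would restrict the filling step to pixels inside the object mask; that mask is omitted here to keep the sketch short.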
Fig. 13 is the flowchart of our proposed compression
method. An image is divided into four regions as
mentioned before. The trapezoid DCT is applied to
the low frequency regions; in other words, the low
frequency regions must be divided into trapezoids. The
arbitrary shape DCT using Gram-Schmidt (GS)
orthogonalization is applied to the rest of the regions.

[Flowchart: Input image -> Lower frequency region -> DCT in trapezoid regions -> Coding; Input image -> Other region -> ASDCT using GSO process -> Coding.]
Fig. 13: The flowchart of our proposed image compression
method using DCT in trapezoid regions.
As mentioned, it is hard to find the optimal way of
dividing the lower frequency region into trapezoids. We
propose a method to resolve the problem. For each
object in the image, we do the following. The
dividing procedure has two main steps: slice the object
into several stripes, then find the inscribed trapezoid in
each stripe. The dividing procedure is as follows:
Step 1. Find the corners of the object.
Step 2. According to the corners, the object is sliced into
several stripes at the positions of the corners. If
the corners are too close, we merge the
stripes.
Step 3. Find the inscribed trapezoid in each stripe. The
endpoints of the trapezoid are initialized to the
endpoints of the upper side and the lower side.
Step 4. By moving the legs inward, we obtain the
inscribed trapezoid.
Step 5. Record the endpoints of the inscribed trapezoid.
Fig. 14 shows an example of finding inscribed
trapezoid regions according to this process. Fig. 14(a)
shows how we slice the object into stripes. Note that if the
corners are too close, we merge the two stripes.
Fig. 14(b) shows the process of finding the inscribed
trapezoid. We move the legs of the trapezoid until the legs
are all inside the object.
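Steps 3 to 5 can be sketched for a single stripe given as a binary mask. This is a simplified illustration, not the authors' algorithm: it assumes the stripe is a list of equal-length rows of 0/1 values, that each leg is a straight line sampled once per row, and that a leg is moved inward by shifting both of its endpoints one pixel at a time. The names `leg_columns` and `inscribed_trapezoid` are hypothetical.

```python
def leg_columns(top_col, bottom_col, rows):
    """Column of a straight leg in each row, by rounded linear interpolation."""
    if rows == 1:
        return [top_col]
    return [round(top_col + (bottom_col - top_col) * r / (rows - 1))
            for r in range(rows)]


def inscribed_trapezoid(mask):
    """Find an inscribed trapezoid in one stripe of an object.

    mask: list of equal-length rows of 0/1 values (1 = object pixel).
    Returns ((top_left, top_right), (bottom_left, bottom_right)) as column
    indices of the endpoints, or None if no trapezoid is found.
    """
    rows = len(mask)
    top = [c for c, v in enumerate(mask[0]) if v]
    bottom = [c for c, v in enumerate(mask[-1]) if v]
    if not top or not bottom:
        return None
    # Step 3: initialise the endpoints to the ends of the upper and lower sides.
    tl, tr, bl, br = top[0], top[-1], bottom[0], bottom[-1]

    def inside(cols):
        return all(0 <= c < len(mask[r]) and mask[r][c] for r, c in enumerate(cols))

    # Step 4: move the left leg rightwards, then the right leg leftwards,
    # until every sampled pixel of each leg lies inside the object.
    while tl <= tr and bl <= br and not inside(leg_columns(tl, bl, rows)):
        tl, bl = tl + 1, bl + 1
    while tl <= tr and bl <= br and not inside(leg_columns(tr, br, rows)):
        tr, br = tr - 1, br - 1
    if tl > tr or bl > br:
        return None
    # Step 5: record the endpoints.
    return (tl, tr), (bl, br)


stripe = [
    [0, 0, 1, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
]
print(inscribed_trapezoid(stripe))
```

In this example the initial left leg crosses a background pixel in the third row, so it is shifted inward by one pixel, while the right leg is already inside the object. Checking only the legs is sufficient when every row of the stripe is a single run of object pixels; a stripe with holes would need an interior check as well.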
Fig. 14: (a) Slicing the object into several stripes
according to the corners (stripes are merged where the
corners are too close). (b) Finding the inscribed
trapezoid by moving inward the legs of the initial
trapezoid.
Fig. 15: (a) The reconstructed fruit image using the JPEG
compression standard (692 bytes). (b) The
reconstructed fruit image using our proposed method
(165 bytes).
Fig. 15 shows the fruit image reconstructed using the
JPEG compression standard and using our proposed method.
We can see some black points inside the apple in Fig. 15(b).
They are caused by the missing point problem, which we have
not fixed yet. The distortions are mainly in the high
frequency regions and the border regions, but they are
tolerable for human vision because human eyes are more
sensitive to lower frequency distortion.
In Fig. 15, our proposed method uses 165 bytes
with an RMSE of 4.7286, whereas the JPEG standard costs
692 bytes with an RMSE of 2.1198. The data amount
of our proposed method is about one fourth of that of the
JPEG standard, while the reconstructed images look similar.
If we use a smaller quantization step, we obtain an
RMSE smaller than that of the JPEG standard. This costs
more bytes, but still only about two thirds of the data
amount of the JPEG compression standard.
Furthermore, if we compress the image using the JPEG
compression standard with the same amount of data as our
proposed method, it causes a severe block effect. So our
proposed method can also avoid the block effect.
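For reference, the two quantities compared above can be computed as follows. This is a generic sketch of the usual RMSE definition, which the paper is assumed to use; the byte counts are the ones reported for Fig. 15, and the small arrays are made-up test values.

```python
import math


def rmse(original, reconstructed):
    """Root-mean-square error between two equally sized 2-D pixel arrays."""
    squared = [(a - b) ** 2
               for row_a, row_b in zip(original, reconstructed)
               for a, b in zip(row_a, row_b)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up 2x2 example, only to show the call.
print(rmse([[0, 0], [0, 0]], [[3, 4], [0, 0]]))

# Byte counts reported for Fig. 15.
proposed_bytes, jpeg_bytes = 165, 692
print(round(proposed_bytes / jpeg_bytes, 3))
```

The ratio 165/692 is roughly 0.24, which matches the statement that the proposed method needs about one fourth of the data.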

Fig. 16: The reconstructed fruit image using the JPEG
compression standard (233 bytes), RMSE = 4.2173.
Fig. 16 is an example of the fruit image compressed with the
JPEG compression standard using 233 bytes, with an RMSE
of 4.2173. It is obvious that the
block effect becomes severe when fewer bytes are used to
encode the image. In comparison, our proposed method in
Fig. 15(b) costs only 165 bytes and has no block effect.
Moreover, the processing time is much less than that of the
arbitrary shape DCT using the Gram-Schmidt method. Our
proposed method takes only 4.930688 seconds,
while the Gram-Schmidt method needs much
more processing time.
In summary, compared to the conventional JPEG
compression standard, our proposed method has the
following advantages:
(a) It needs a smaller amount of data.
(b) It avoids the block effect.
(c) It compresses the image according to its characteristics.
Compared to the arbitrary shape DCT using the
Gram-Schmidt method, our proposed method has the following
advantages:
(a) It greatly reduces the computation time.
(b) Its energy concentration is as good as that of the
Gram-Schmidt method.
7. CONCLUSION
In this paper, we describe a way to generate the
complete and orthonormal DCT basis in a trapezoid or a
triangular region efficiently without using the
Gram-Schmidt method. With the proposed method, the JPEG
compression algorithm can become much more general
and we can divide an image into trapezoid or triangular
blocks instead of 8×8 blocks. Moreover, our method can
also be applied to the DFT basis, the Hadamard (Walsh)
basis, or any other bases with even and odd symmetric
relations. With the new segmentation and compression
algorithm we proposed, the block effect problem in the
JPEG compression standard can be resolved even when a
smaller amount of data is used to compress the image.
Moreover, the computation time of our proposed method
is much lower than that of the arbitrary shape DCT using
the Gram-Schmidt method.
