Hong Chen1∗, Nan-Ning Zheng1, Lin Liang2, Yan Li2, Ying-Qing Xu2, Heung-Yeung Shum2
1 Xi'an Jiaotong University
2 Microsoft Research, Asia
ABSTRACT
In this paper, we present PicToon, a cartoon system which can generate a personalized cartoon face from an input picture. PicToon is easy to use and requires little user interaction. Our system consists of three major components: an image-based Cartoon Generator, an interactive Cartoon Editor for exaggeration, and a speech-driven Cartoon Animator. First, to capture an artistic style, cartoon generation is decoupled into two processes: sketch generation and stroke rendering. An example-based approach is taken to automatically generate sketch lines which depict the facial structure. Inhomogeneous non-parametric sampling combined with a flexible facial template is employed to extract the vector-based facial sketch. Various styles of strokes can then be applied. Second, with the pre-designed templates in Cartoon Editor, the user can easily make the cartoon exaggerated or more expressive. Third, a real-time lip-syncing algorithm is developed that recovers a statistical audio-visual mapping between the character's voice and the corresponding lip configuration. Experimental results demonstrate the effectiveness of our system.

Categories and Subject Descriptors
I.3.3 [Computer Graphics]: Picture/Image Generation; I.6.3 [Computer Graphics]: Methodology and Techniques; I.4.8 [Image Processing and Computer Vision]: General

General Terms
Applications

Keywords
User interfaces, multi-modal interaction and integration, example-based learning, non-parametric sampling, lip-syncing.

1. INTRODUCTION
People love cartoons. Cartoons are humorous, satirical, and at times opinionated. Drawing cartoons is, however, not easy. Only those well-trained artists who possess this great skill can do it well. Recently, many technologies have been developed to make it possible for a skilled cartoonist to work entirely on the computer. Such technologies include stroke rendering [10, 16, 17, 6] and tone control [16, 17, 6, 21]. By integrating these rendering technologies, various animation systems [11, 20, 13, 19] have been developed for interactive cartoon design. Although these systems provide various drawing templates, it is still difficult for an ordinary user to create a "personalized cartoon", i.e., a cartoon resembling a particular person.

The PicToon system is designed to let ordinary people create a personalized cartoon easily. There are two design goals for PicToon:

• Creating personalized cartoons
• Making the system easy to use

To create a personalized cartoon, we adopt an image-based approach. Specifically, we propose an example-based learning approach to generate a cartoon sketch from an input face image, by observing how a particular artist would draw cartoons from training face images. Because it is difficult to describe the rules for how an artist draws, we use a non-parametric sampling scheme that captures the relationship between the training images and their corresponding sketches drawn by the same artist.

The generated cartoon sketch can then be enhanced by adding stroke styles with a stroke model like the one in [10]. The cartoon face can then be easily exaggerated by interactively applying pre-designed facial expression templates. To animate the cartoon face, a real-time lip-syncing algorithm is applied to automatically generate cartoon animation. Different facial expressions can be combined with lip-syncing to make the animation more lively and believable.

The rest of this paper is organized as follows. In the next section we give an overview of related work. The system architecture is presented in Section 3. We then introduce the key technologies of our system, followed by a set of examples and demos. Finally, we sum up the features and benefits of our system and outline some future work.

∗This work was done while Hong Chen was a visiting student at Microsoft Research Asia.
2. RELATED WORK
Many cartoon creation systems [11, 20, 13, 19] focus on low-level
editing and control tools for easy and flexible cartoon drawing and
animation. Inkwell [13], for example, developed effective techniques for exploiting layers and grouping/hierarchy of components, and for manipulating motion functions. Similarly, the
CharToon system [20] provided special skeleton-driven components,
[Figure: system diagram with components Cartoon Generator, Cartoon Editor, Exaggerated Templates Library, Exaggerated Cartoon, Cartoon Animator, and Interactive Hair Contour Extraction.]
Figure 3: Our cartoon generation approach consists of two steps: sketch generation and stroke rendering.
4. CARTOON GENERATOR
4.1 Decoupling Cartoon Generation
According to artistic drawing books [12, 15, 7], there are two key aspects of a face drawing: the line sketch and the stroke style. As shown in Figure 2, the sketch suggests the fundamental visual perception characteristics of a face: simple lines depict the global facial structure and highlight subtle but important facial features such as double eye lines. Note that the sketch is drawn in a plain style without shape exaggeration. On the other hand, stroke styles, such as pencil, ink or brush, are used by the artist to describe his or her perception in different ways.

Thus we naturally decouple cartoon generation into two stages: sketch generation and stroke rendering, as shown in Figure 3.

Sketch generation creates a vector-based sketch. The sketch lines encode the drawing language of the artist: where and how facial lines should be drawn to describe the facial structure. Since there is no precise rule of grammar in such a language, we have chosen to take an example-based learning approach.

Stroke rendering turns the sketch into a cartoon using "artistic strokes". Existing stroke rendering techniques can be applied; in this paper, a stroke model similar to [10] is used.

Our decoupling approach has several advantages. By separating the sketch from the stroke styles, it is easy to generate a faithful sketch with an example-based learning scheme. Moreover, it is easy to create different stylistic cartoons from the same sketch simply by applying different stroke styles.

4.2 Sketch Generation
4.2.1 An Example-based Approach
To construct a sketch, our example-based approach requires a set of training examples: face images and their corresponding sketches drawn by an artist, as shown in Figure 3.

In this work, an inhomogeneous Markov Random Field (MRF) model is used to describe the statistical relationship between the sketch and the facial image. The probability distribution of a sketch point depends on its neighborhood in the facial image. To capture the inhomogeneity of facial features, the probability distribution is also related to the pixel's position.

To construct such a distribution, an inhomogeneous non-parametric sampling strategy is employed: only the points at the same facial location as a sketch point, taken from different training images, are sampled.

Since the input image and the training images are usually not aligned with each other, we warp all images to the mean shape to determine the correspondence of facial points. A constrained AAM model [5] is employed to locate the feature points required by the warping process.

Once the probability distribution for each pixel is obtained, we integrate the distributions to get the "expected sketch image"; template fitting is then employed to extract the vector-based sketch.

The hair of different people does not have consistent statistical properties or a clear correspondence, so the hair is not part of the example-based approach; it is added in a post-processing step by tracing the hair contour. The whole procedure is shown in Figure 3. Detailed algorithms are explained in the following sections.

4.2.2 Interactive face alignment
Current alignment algorithms are not robust enough to locate the feature points properly in a fully automatic way, but a little user interaction greatly improves the result. Thus a constrained AAM model [5] is adopted in our system. The user can modify the alignment result by dragging facial points to the expected positions. These constraints are then added as an additional energy term in the posterior energy, and the optimal solution is searched for again by a gradient descent algorithm. User-input constraints help the algorithm escape from local minima. With little user interaction, much better and more robust alignment results can be obtained.
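The user-constrained fitting just described can be sketched numerically: a quadratic penalty pulls user-dragged points toward their targets while the model's own energy is re-minimized by gradient descent. This is a minimal illustration under stated assumptions, not the actual constrained AAM of [5]; `model_energy`, `lam`, and all values below are toy stand-ins.

```python
import numpy as np

def model_energy(points):
    # Toy stand-in for the AAM posterior energy: pulls all points
    # toward a hypothetical mean shape at the origin.
    return 0.5 * np.sum(points ** 2)

def constrained_energy(points, constraints, lam=10.0):
    """Model energy plus a quadratic user-constraint term.

    constraints: {point_index: (x, y)} targets from user dragging.
    """
    e = model_energy(points)
    for idx, target in constraints.items():
        e += lam * np.sum((points[idx] - np.asarray(target)) ** 2)
    return e

def refine(points, constraints, lr=0.01, steps=500, eps=1e-5):
    """Re-minimize the constrained energy by numerical gradient descent."""
    p = np.asarray(points, dtype=float).copy()
    for _ in range(steps):
        grad = np.zeros_like(p)
        base = constrained_energy(p, constraints)
        flat = grad.ravel()
        for i in range(p.size):
            dp = p.ravel().copy()
            dp[i] += eps
            flat[i] = (constrained_energy(dp.reshape(p.shape),
                                          constraints) - base) / eps
        p -= lr * grad
    return p
```

With the penalty weight `lam` large relative to the model term, a dragged point settles near its user-given target while unconstrained points follow the model, mirroring how the constraints steer the fit out of local minima.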
Figure 4: Generated Cartoon. (a) Original image; (b) Generated sketch; (c) and (d) Cartoon rendered with two different strokes.
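To make the inhomogeneous non-parametric sampling strategy described above concrete, the sketch below builds the distribution for one sketch position from neighborhoods taken at the same facial location in the training images: the k nearest neighborhoods are weighted by exp(−d/T) and normalized, as in Eq. (1). All arrays, names, and the squared-distance metric are illustrative assumptions (the paper computes the neighborhood distance via cross-correlation).

```python
import numpy as np

def sampling_distribution(query_patch, train_patches, train_sketch_vals,
                          k=3, T=0.1):
    """Return (values, weights): a discrete distribution over the sketch
    values observed at this facial location in the training set."""
    query = query_patch.ravel()
    # Neighborhood distance; plain squared distance is a stand-in for
    # the cross-correlation-based distance used in the paper.
    dists = np.array([np.sum((p.ravel() - query) ** 2)
                      for p in train_patches])
    nearest = np.argsort(dists)[:k]            # k nearest neighborhoods
    weights = np.exp(-dists[nearest] / T)      # Eq. (1), un-normalized
    weights /= weights.sum()                   # divide by Z_q
    vals = np.asarray(train_sketch_vals, dtype=float)
    return vals[nearest], weights
```

Sampling a sketch value from this discrete distribution (or synthesizing a 3 × 3 patch at a time, as the paper does for speed) then fills in the sketch point by point.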
4.2.3 Non-parametric sampling
Learning the above MRF model parameters is very complex. Inspired by a non-parametric sampling method successfully used in texture synthesis [8], we employ an inhomogeneous non-parametric sampling scheme.

At a point q in sketch S, we want to construct the distribution p_q(S(q) | N_I(q)) given its neighborhood in the facial image. First, the sampling set Ω_q is constructed from the training exemplars at the corresponding position. Then the k nearest neighbors in Ω_q are selected to construct p_q(S(q) | N_I(q)). The probability α_{qi} of the sketch point q taking sketch pixel v_{qi}'s value is inversely proportional to the neighborhood distance:

\alpha_{qi} = \frac{1}{Z_q} \exp\left( -\, d(N_I(q), N_I(v_{qi})) / T \right) \quad (1)

where the square neighborhood distance d(N_I(q), N_I(v_{qi})) is calculated using cross-correlation, Z_q is the normalizing constant, and the temperature T controls the smoothness of the distribution. Thus we get the non-parametric distribution:

p_q(S(q) \mid I) = \sum_{i=1}^{k} \alpha_{qi} \, \delta(q - v_{qi}) \quad (2)

where δ(·) is the Delta function.

Inspired by a patch-pasting method successfully used in texture synthesis [23], we extend the synthesized unit from a pixel to a small 3 × 3 square patch to accelerate the sampling procedure.

4.2.4 Template Fitting
We obtain an "expected sketch image" by integrating the distribution of each point, and then apply template fitting to extract the final vector-based facial sketch. To preserve the artist's drawing style, template fitting is based on a flexible sketch model. The sketch model is represented as a set that contains a fixed number of lines. Each line is defined by an on-off switch and the positions of its control points. The on-off switch is necessary to capture the subtle differences of facial features, such as the presence of double eye lines.

To fit the sketch model to the "expected sketch image", we formulate the task as the minimization of an energy function, which consists of two components: prior energy and likelihood energy.

We first build a prior model from the training sketches. Similar to the ASM model [4], the coordinates of the control points in each line are modelled by a Gaussian distribution. For the on-off switch, we have defined three types of lines in our system: always appearing; probably appearing but independent of other lines; and dependent on other lines. We learn the probability of each type from the training examples. We define the difference between the "expected sketch image" and the generated sketch as the likelihood energy.

During the fitting procedure, we directly sample from the prior model, followed by a local search to determine each line's parameters more precisely. A final generated facial sketch is shown in Figure 4(b).

4.2.5 Interactive Hair Contour Extraction
For hair contour extraction, we first apply color segmentation to find the hair region, then use edge tracing to get the hair contour.

To determine the color distributions of the hair region and the background more robustly, two brushes are provided for the user to mark each region interactively. According to the marked regions, the color distributions of the hair and background regions are calculated respectively. The pixels with smaller Mahalanobis distance to the hair color distribution are then connected into one hair region by the floodfill algorithm. Each color distribution is represented by a mixture of Gaussians; the number of kernels and their parameters are updated adaptively to the marked samples. With the two brushes, the user can modify the segmentation result interactively and easily. Figure 4(b) shows one hair contour extraction result automatically combined with the face.

4.3 Stroke Rendering
Stroke rendering places stylized strokes along a newly-created sketch. After consulting with cartoonists, we propose the following attributes for each stroke to simulate different styles:

• Texture: a reference texture image corresponding to a particular stroke style.
• Width: the width of the whole stroke, representing its thickness.
• Direction: the drawing direction of a stroke.
• Path: the skeleton of a stroke. The generated sketch lines are taken as stroke paths.

Stroke textures are obviously important. As shown in Figure 4, by applying different artistic strokes, the cartoon appearance can be significantly changed. Like [10], we can use stroke textures to simulate pen, ink or brush, and also produce various stroke styles, such as taper, flare, wiggle and so on.

Drawing styles are also reflected in the line width. For example, the face and hair contours are often rendered with thick strokes, while the strokes depicting the eyes, mouth and nose can be a little more slender. Other supporting lines, such as double-eye lines, are the thinnest.

The direction of a stroke is related to the selected stroke texture and the artist's habit. Stroke direction is introduced to properly place the texture. This is important for some stroke styles, such as a stroke texture with just one tapering end. As shown in Figure 5, different stroke texture directions generate different visual effects.

Figure 5: Cartoon face rendered with different stroke directions. (a) Stroke texture; (b) and (c) Rendered cartoon faces.

Usually a sketch line is rendered as one stroke, but in some cases a single sketch line may be drawn with multiple strokes by the artist to enhance the expressive power. For example, when rendering a cartoon face with the style shown in Figure 4(c)(d), the facial contour line is rendered with three separate strokes. For the hair, we automatically break the hair contour at strong corners and render it with the resulting short strokes. The hair appears more natural this way.

To stretch a texture map over the length of the stroke, each stroke path is resampled into even segments and described by a Catmull-Rom spline. The selected stroke texture is warped along the skeleton by a local parametric coordinate transformation and texture mapping [10].

5. CARTOON EDITOR
Cartoon Editor is a graphical editor that lets the user edit the cartoon face freely, i.e., change its size or shape, or redraw some facial parts. For convenience, various pre-designed exaggerated templates are supplied, including expressions such as smiling, rage, grimacing, etc. A tool to generate in-between expression states is also provided. Figure 6 shows the user interface of our cartoon editor.

Figure 6: Cartoon editor user interface.

Cartoon Editor groups facial lines into seven parts: face contour, mouth, nose, right eye, left eye, right eyebrow and left eyebrow. The editing is performed on the sketch generated by Cartoon Generator. To edit each facial part, the following manipulations are designed:

• Modify: The user can drag the control points of the sketch to modify its shape or scale, and rotate or move the selected facial part.
• Delete Lines: Delete original lines from the sketch.
• Add Lines: Add additional lines not defined in the sketch template.
• Apply Pre-designed Template: The user can select pre-designed templates to change the expression of the cartoon face or to exaggerate it.
• In-between Edit: The user can also generate in-between states by dragging a scroll bar.

Cartoon Editor records the changes for each facial part separately. To save the edited results as templates for reuse, three kinds of transformation modes can be set for each facial part:

• Replace: The absolute coordinates of the edited shape are saved in the template. When the template is reused, the facial part is totally replaced with the new shape.
• Add Changes: The difference between the new shape and the original shape is saved in the template. When using this mode, the shape difference is added to the cartoon face.
• Erase: The facial part will not be drawn.

With these simple tools, the user can design expressions for the cartoon or exaggerate it. Saved templates can easily be applied to any cartoon face and can create impressive effects. Figure 7 shows some results generated using pre-designed templates.

6. CARTOON ANIMATOR
To let users animate cartoons easily, we have developed lip-syncing technology which Cartoon Animator uses to drive cartoons by speech. The lip movements are automatically synthesized by our algorithm. In addition, expressions designed with Cartoon Editor are used as key frames in the animation.

These pre-designed expressions are all saved in the cartoon template library. During animation, Cartoon Animator selects appropriate expression templates from the library as key frames. Since all the cartoon faces are vector-based, facial animation is generated by morphing between these key frames.

The key to our speech-driven cartoon animator is a real-time lip-syncing algorithm. Instead of a conventional phoneme-viseme mapping (e.g., [3, 14, 22]), our algorithm uses an acoustic feature vector (e.g., the MFCC features used in speech recognition) as system input. The advantage of using the acoustic feature vector is that different languages (e.g., Chinese and English) do not require training different models.

... which is associated with a proto-lip template. We further assume that the random vector a_i for class i has a Gaussian distribution and that each dimension of the vector is distributed independently. By regression, we can compute the mean ā_ij and covariance σ_ij for each Gaussian model (for class i, dimension j). After the training process, we have the following model parameters:

• Proto-lip templates µ_i and their relative proportions π_i (i = 1...n).
• Means ā_ij and covariances σ_ij for the j-th dimension of the i-th class of the acoustic feature vector (i = 1...n; j = 1...18).

6.1.2 Audio to Visual Mapping
Given a new audio clip, we first segment the audio signal into frames of 40 ms each. Then the acoustic feature vector a for each frame is calculated as the system input. Since we assume each dimension of the acoustic feature vector is distributed independently, the likelihood p(a | µ_i) can be represented by:

p(a \mid \mu_i) = \prod_{j=1}^{18} \frac{1}{\sqrt{2\pi}\,\sigma_{ij}} \exp\left( -\frac{(a_j - \bar{a}_{ij})^2}{2\sigma_{ij}^2} \right) \quad (4)

According to Bayesian estimation, the posterior probability is

p(\mu_i \mid a) = \frac{p(a \mid \mu_i)\, p(\mu_i)}{\sum_{i=1}^{n} p(a \mid \mu_i)\, p(\mu_i)} \quad (5)

where p(µ_i) = π_i is the prior. Then the mapping result becomes
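Eqs. (4) and (5) can be evaluated directly for each audio frame; the sketch below does so in log space to avoid numerical underflow when the 18 per-dimension densities are multiplied. Function and variable names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def lip_posterior(a, means, sigmas, priors):
    """Posterior over proto-lip templates for one audio frame.

    a:      (D,) acoustic feature vector (D = 18 in the paper).
    means:  (n, D) per-class means   (a_bar_ij).
    sigmas: (n, D) per-class spreads (sigma_ij).
    priors: (n,)  class proportions  (pi_i).
    Returns p(mu_i | a) as an (n,) vector.
    """
    # Eq. (4) in log form: the product of independent 1-D Gaussians
    # becomes a sum of log-densities.
    log_lik = -0.5 * np.sum(
        np.log(2.0 * np.pi * sigmas ** 2) + (a - means) ** 2 / sigmas ** 2,
        axis=1)
    # Eq. (5): Bayes' rule with p(mu_i) = pi_i, normalized over classes.
    log_post = log_lik + np.log(priors)
    log_post -= log_post.max()   # stabilize before exponentiating
    post = np.exp(log_post)
    return post / post.sum()
```

The mouth shape for the frame can then be taken as a posterior-weighted blend of the proto-lip templates, or simply the template with the highest posterior.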
[7] B. Edwards. The New Drawing on the Right Side of the Brain. J. P. Tarcher, 1999.