
Paper ID - 25

A Dynamic Hand Gesture Recognition System for Controlling VLC Media Player

Manuj Paliwal, Gaurav Sharma, Dina Nath, Astitwa Rathore, Himanshu Mishra, Soumik Mondal

Abstract— In this paper we discuss a low-cost system which uses a dynamic hand gesture recognition technique to control the VLC media player. The application contains a central computation module which segments the foreground part of each frame using skin detection and the approximate median technique. Gestures are recognized with a Decision Tree built over various features extracted from the segmented part. This hand gesture recognition technique introduces a new, natural way to interact with computers.

Index Terms—Gesture Recognition; Human Computer Interaction; VLC media player; Decision Trees

I. INTRODUCTION

Computers and computerized devices have become an eminent element of our society. They increasingly influence many aspects of our lives; for example, the way we communicate, the way we perform our actions, and the way we interact with our environment. Thus a new concept of interaction emerged: Human Computer Interaction (HCI). Although computers have made tremendous advancements, common HCI still relies on input devices such as the keyboard, mouse, and joystick. Through this underlying prototype, users express their intent to the computer by using their hands to click buttons, position the mouse, and press keys. This is a rather unnaturally restrictive way of interacting with end-user systems. With the increasing presence of computers in our daily lives, it would be worthwhile to have a Perceptual User Interface (PUI) [1, 2] that lets us interact with computers the way humans interact with each other. Vision-based gesture recognition is an important technology for a friendly human-computer interface, and it has received more and more attention in recent years [3, 4, 10]. Applications designed for gesture recognition generally require a restricted background, a set of gesture commands, and a camera for capturing images. A gesture used for performing an action must represent the action being performed, and it must also be logically explainable [4, 11]; thus, for controlling a media player like VLC, dynamic hand gestures can be more intuitive and natural [5].

In this paper we present an application which uses dynamic hand gestures as input to control the VLC media player. We consider single-handed gestures, whose direction of motion defines a gesture for the application. Image acquisition is done using a webcam. Some functions in the VLC media player are used more frequently than others, so the application controls those functions through predefined gestures. Fig. 1 shows the defined gestures and their corresponding VLC player control functions.

The rest of the paper is organized as follows. Section II gives the architecture design of our system, Section III contains an overview of the video processing and recognition techniques used in the system, Section IV discusses the data capture phases for training and testing the system, Section V shows the performance results of the system, and Section VI concludes and outlines future aspects of this research.

Manuscript received November 5, 2012.

M. Paliwal is with the Hindustan College of Sc. & Technology, Farah, Uttar Pradesh, India (e-mail: manuj.45@gmail.com).
G. Sharma is with the Hindustan College of Sc. & Technology, Farah, Uttar Pradesh, India (e-mail: gauravsharma240290@gmail.com).
D. Nath is with the Hindustan College of Sc. & Technology, Farah, Uttar Pradesh, India (e-mail: dinanathkarn@gmail.com).
A. Rathore is with the Hindustan College of Sc. & Technology, Farah, Uttar Pradesh, India (e-mail: astitwa5690rathore@gmail.com).
H. Mishra is with the Hindustan College of Sc. & Technology, Farah, Uttar Pradesh, India (e-mail: himanshu.ims2009026@gmail.com).
S. Mondal is with the Hindustan College of Sc. & Technology, Farah, Uttar Pradesh, India (e-mail: mondal.soumik@gmail.com).

Fig. 1. Different gestures used for different commands.

II. ARCHITECTURE DESIGN

In this system we have used different image preprocessing techniques, feature extraction, and a classification tool to recognize the gesture in real time and issue the appropriate command to the VLC player. Fig. 2 shows the block diagram of the system according to its different phases, which are described below.
• Data acquisition: Data acquisition for training and testing of our system was done with the inbuilt webcam of the laptop.
• Segmentation: Image segmentation was done by two techniques; the complete process is described in the next section.
  o Skin Detection Model for detection of the hand region.
  o Approximate Median Technique for subtraction of the background.
It has been observed that using both methods for segmentation yields a much better blob for further processing.
• Feature extraction: In our work we have used the motion direction of the hand region as a feature.
• Recognition phase: A Decision Tree was used as the classification tool for recognition of gestures.
• VLC interaction: The appropriate command is given to the VLC player according to the recognized gesture.

III. TECHNIQUES FOR GESTURE RECOGNITION

In this section we discuss the gesture recognition process step by step.
1. Capture 10 seconds of video data from the webcam.
2. Split the video into RGB frames.
3. For each RGB frame, the skin area was detected using the Skin Detection Model [6, 8] and a new image was formed with the skin area marked in red. Fig. 3 shows an example of the image produced by the Skin Detection Model.
4. Approximate Median based segmentation [7, 9] was performed on the frame and a binary image was obtained. Fig. 4 shows an example of the image produced by the Approximate Median technique.
5. An AND operation was performed on the skin-detection image and the binary image to obtain the final binary image for feature extraction. Fig. 5 shows an example image produced after the AND operation.
6. Various features were extracted from the final binary image:
   dox - direction of movement along the X coordinate.
   doy - direction of movement along the Y coordinate.
   area - area of the region containing the hand of the user.
   xstart - starting position of the hand gesture in the X coordinate.
   xend - ending position of the hand gesture in the X coordinate.
   ystart - starting position of the hand gesture in the Y coordinate.
   yend - ending position of the hand gesture in the Y coordinate.
7. These features were used in a Decision Tree for recognition of the gestures in real time. The algorithm of the decision tree is given in Fig. 6.
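The segmentation of steps 3-5 can be sketched as follows. This is a minimal NumPy sketch: the simple RGB skin rule, the difference threshold, and the function names are our illustrative assumptions, not the paper's exact models [6-9].

```python
import numpy as np

def skin_mask(frame_rgb):
    # Illustrative heuristic RGB skin rule; the paper's statistical
    # Skin Detection Model [6, 8] is more elaborate than this.
    r = frame_rgb[..., 0].astype(np.int16)
    g = frame_rgb[..., 1].astype(np.int16)
    b = frame_rgb[..., 2].astype(np.int16)
    return ((r > 95) & (g > 40) & (b > 20) &
            (r - np.minimum(g, b) > 15) & (r > g) & (r > b))

def approx_median_foreground(gray_frames, threshold=30):
    # Approximate Median background model [7, 9]: nudge the running
    # background estimate by +/-1 toward each new frame, then mark
    # pixels differing from it by more than `threshold` as moving.
    bg = gray_frames[0].astype(np.int16)
    fg = np.zeros(bg.shape, dtype=bool)
    for frame in gray_frames[1:]:
        f = frame.astype(np.int16)
        bg += np.sign(f - bg)
        fg = np.abs(f - bg) > threshold
    return fg

def hand_blob(frame_rgb, motion_mask):
    # Step 5: final binary image = skin mask AND motion mask.
    return skin_mask(frame_rgb) & motion_mask
```

The incremental ±1 update is what makes the background estimate an *approximate* median: it converges toward the temporal median of each pixel without storing a frame history.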

Fig. 3. Image produced by the Skin Detection Model.

Fig. 2. Architecture of the system.

Fig. 4. Image produced by the Approximate Median technique.
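The features listed in step 6 could be computed, for example, from the per-frame binary hand blobs. The centroid-based formulation below is our assumption; the paper does not give the exact computation.

```python
import numpy as np

def gesture_features(masks):
    # `masks` is the sequence of final binary images (one per frame).
    def centroid(mask):
        ys, xs = np.nonzero(mask)
        return float(xs.mean()), float(ys.mean())

    xstart, ystart = centroid(masks[0])
    xend, yend = centroid(masks[-1])
    return {
        "dox": int(np.sign(xend - xstart)),  # +1 right, -1 left, 0 none
        "doy": int(np.sign(yend - ystart)),  # +1 down (image rows grow downward)
        "area": int(masks[-1].sum()),        # hand blob area in pixels
        "xstart": xstart, "xend": xend,
        "ystart": ystart, "yend": yend,
    }
```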


Fig. 5. Image produced after the AND operation.

Fig. 6. Decision Tree algorithm.

In Fig. 6, commandg is a C++ application which drives the VLC media player with the input command given by the Decision Tree algorithm. The commands issued by the Decision Tree are the integer values assigned to each respective function in the VLC player.

Fig. 7. Results obtained for different gestures in real time.
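The decision logic of Fig. 6 can be sketched as a hand-written tree over the motion-direction features. The gesture-to-direction mapping and the integer command codes below are placeholders of ours; the paper's actual assignments come from Fig. 1 and Fig. 6.

```python
# Placeholder command codes; the paper only says each VLC function
# is assigned some integer, not which one.
PLAY_PAUSE, STOP, VOL_UP, VOL_DOWN, CROP = 1, 2, 3, 4, 5

def classify(features):
    # Split first on horizontal motion, then on vertical motion.
    if features["dox"] > 0:      # rightward sweep
        return PLAY_PAUSE
    if features["dox"] < 0:      # leftward sweep
        return STOP
    if features["doy"] < 0:      # upward sweep
        return VOL_UP
    if features["doy"] > 0:      # downward sweep
        return VOL_DOWN
    return CROP                  # no dominant motion
```

The returned integer would then be handed to commandg, which forwards it to the VLC player.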
IV. DATA ACQUISITION

In our work we captured the different gestures shown in Fig. 1 with the inbuilt webcam of the laptop. The data was captured from four different persons (all male, aged 20-23) against different backgrounds. Five video samples were collected for each gesture from each person.
We divided our data into 30% for training the system, 20% for validating the system, and the remaining 50% for testing the system.

V. RESULTS

The system was tested in real time and we achieved very promising results. We tested our system 10 times per gesture with different persons. Fig. 7 shows some snapshots of our system. In Table I we show the recognition percentage of the different gestures. The variation in results comes from the different noisy backgrounds and different light illumination.

Table I: Recognition rate of different gestures.
Gesture            Recognition Rate (%)
Play / Pause       80
Stop               80
Volume Increase    70
Volume Decrease    70
Crop               90

VI. CONCLUSION

In today's world human-computer interaction is largely limited to device-based input. This paper shows how we can use dynamic hand gestures as a means for novel human-computer interaction in a much more natural and intuitive way. The application defines gestures for performing various operations of the VLC media player, and users can perform the respective gesture for the desired function. In our work we have used very simple features and recognition techniques to make this system work in a real-time environment. This paper shows that there are endless possibilities for improving the way we interact with computers.

The present application is less robust in the recognition phase; better algorithms can be employed to improve the efficiency, and an attached infrared camera could substitute for the skin detection part. Instead of using a beep to mark the start of a gesture, a better probabilistic approach can be used to detect the start and end of the gesture. As a future prospect of this research we are also going to investigate a larger number of gestures with different persons and the effect of the users' age and sex on the performance of the system. We are also going to generalize our system so that it can be useful for the other media players available in the market.

ACKNOWLEDGMENT
This work was fully supported by Department of Computer
Sc. & Engineering, Hindustan College of Sc. & Technology,
UP, India.

REFERENCES
[1] M. Turk and G. Robertson, “Perceptual user interfaces,”
Communications of the ACM, vol. 43(3), March 2000.
[2] A. Nandy, S. Mondal, J. S. Prasad, P. Chakraborty and G. C. Nandi,
“Recognizing & interpreting Indian Sign Language gesture for Human
Robot Interaction,” 2010 International Conference on Computer and
Communication Technology (ICCCT), pp. 712-717, IEEE, Sept. 2010.
[3] Y. Wu and T. S. Huang, "Vision-Based Gesture Recognition: A
Review," Lecture Notes in Computer Science, vol. 1739, pp. 103-115,
1999.
[4] V. Pavlovic, R. Sharma and T. S. Huang, “Visual Interpretation of Hand
Gestures for Human-Computer Interaction: A Review,” IEEE Trans. On
Pattern Analysis and Machine Intelligence, vol. 19(7), pp. 677-695,
1997.
[5] S. S. Rautaray and A. Agrawal, “A Vision based Hand Gesture Interface
for Controlling VLC Media Player,” International Journal of Computer
Applications, vol. 10(7), pp. 11–16, November 2010.
[6] C. Ó. Conaire, N. E. O'Connor and A. F. Smeaton, “Detector adaptation
by maximizing agreement between independent data sources,” IEEE
Computer Society Conference on Computer Vision and Pattern
Recognition, pp. 1-6, June 2007.
[7] N. J. B. McFarlane and C. P. Schofield, “Segmentation and tracking of
piglets in images,” Machine Vision and Applications, vol. 8, issue 3, pp.
187-193, 1995.
[8] M. J. Jones and J. M. Rehg, “Statistical color models with application to
skin detection,” IEEE Computer Society Conference on Computer
Vision and Pattern Recognition, vol. 1, pp. 274-280, 1999.
[9] S. Battiato, D. Cantone, D. Catalano and G. Cincotti, "An Efficient
Algorithm for the Approximate Median Selection Problem,” Algorithms
and Complexity Lecture Notes in Computer Science, Springer, vol.
1767, pp. 226-238, 2000.
[10] A. Nandy, J. S. Prasad, S. Mondal, P. Chakraborty and G. C. Nandi,
“Recognition of Isolated Indian Sign Language Gesture in Real Time,”
Information Processing and Management, Communications in Computer
and Information Science, Springer, vol. 70, pp. 102-107, 2010.
[11] A. Nandy, J. S. Prasad, P. Chakraborty, G. C. Nandi, S. Mondal,
“Classification of Indian Sign Language In Real Time,” International
Journal on Computer Engineering and Information Technology
(IJCEIT), vol. 10, issue 15, 2010.
