10EE35001 2
This project aims to develop a methodology to identify and recognize the different actions performed in
various YouTube videos that either display an expression in each action performed, or show a sport such as
gymnastics, which is a performance of exercises with strict postures and moves.
Bharata Natyam dance consists of a series of postures called Karanas. Karanas are the 108 key
transitions in classical Indian dance described in the Natya Shastra. Karana is a Sanskrit verbal noun
meaning "doing".
Gymnastics is a very complex competition involving the performance of exercises requiring physical
strength, flexibility, power, agility, coordination, grace, balance and control. It typically involves the
women's events of uneven bars, balance beam, floor exercise, and vault.
Objective
Our prime objective is to estimate the pose of the key subject from 2D image frames of a video and
to recognize the action performed using machine learning techniques.
Fig 3. (a) Input image. (b) Soft labelling of pixels to body parts or background. Red indicates torso, green
upper arms, blue lower arms and head. Brighter pixels are more likely to belong to a part. (c)
Stickman representation of pose, obtained by fitting straight line segments to the segmentations in
(b). For enhanced visibility, the lower arms are in yellow and the head is in purple.
The exact image regions covered by the parts have to be found. To estimate the 2D pose in individual
video frames, we used the image parsing technique of Ramanan.
Image parsing: A person is represented as a pictorial structure composed of body parts tied
together in a tree-structured conditional random field. Parts, l_i, are oriented patches of fixed size,
and their position is parameterized by location and orientation. The posterior of a configuration of
parts L = {l_i} given an image I can be written as a log-linear model.
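The equation itself does not appear in this copy of the report. In the pictorial-structures literature, a log-linear posterior of this kind is commonly written as below; the symbols E (the set of edges between parts), Ψ (pairwise potential, a spatial prior on the relative position of two connected parts) and Φ (unary potential, the local image evidence for a part) are assumptions taken from that standard formulation and are not defined in the text above.

```latex
P(L \mid I) \;\propto\; \exp\!\Bigg( \sum_{(i,j) \in E} \Psi(l_i, l_j) \;+\; \sum_{i} \Phi(l_i) \Bigg)
```

Because E forms a tree, inference over the part configurations can be carried out exactly with belief propagation.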
Fig 4. Single-frame models. Each node represents a body part (h: head, t: torso, left/right
upper/lower arms lua, rua, lla, rla). (a) The kinematic tree includes edges between every two body
parts which are physically connected in the human body. (b) The repulsive model extends the
kinematic tree with edges between opposite-sided arm parts.
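The two graph structures described in the caption of Fig 4 can be written down as plain edge lists. The sketch below is illustrative only: the node names come from the caption, but the exact pairs chosen for the repulsive edges (upper arm to upper arm, lower arm to lower arm) are an assumption, and `neighbours` is a hypothetical helper.

```python
# Node names follow the caption of Fig 4.
PARTS = ["h", "t", "lua", "rua", "lla", "rla"]

# (a) Kinematic tree: one edge per pair of physically connected parts.
KINEMATIC_EDGES = [
    ("t", "h"),      # torso - head
    ("t", "lua"),    # torso - left upper arm
    ("t", "rua"),    # torso - right upper arm
    ("lua", "lla"),  # left upper arm - left lower arm
    ("rua", "rla"),  # right upper arm - right lower arm
]

# (b) Repulsive model: the kinematic tree plus edges between
# opposite-sided arm parts (ASSUMED pairing, see the note above).
REPULSIVE_EDGES = KINEMATIC_EDGES + [("lua", "rua"), ("lla", "rla")]


def neighbours(part, edges):
    """Return the sorted list of parts that share an edge with `part`."""
    return sorted({b if a == part else a for a, b in edges if part in (a, b)})


print(neighbours("lua", KINEMATIC_EDGES))
print(neighbours("lua", REPULSIVE_EDGES))
```

Note that the extra edges turn the tree into a graph with loops, so exact tree-based inference no longer applies directly to model (b).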
Fig 5. Estimated pose using the 2D articulated full-body detector for Bharata Natyam dancers in different
poses.
(c) Detected skin
Algorithm:
Step 1: Load an RGB image and convert it to double precision.
Step 2: Compute the skin likelihood for each pixel.
Step 3: Threshold the likelihood to detect skin.
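The three steps above can be sketched as follows. The report does not say which skin likelihood model was used, so this sketch assumes a single Gaussian in normalized (r, g) chromaticity space. The values of `SKIN_MEAN`, `SKIN_COV` and the threshold are hypothetical placeholders that would need to be estimated from labelled skin pixels.

```python
import numpy as np

# HYPOTHETICAL skin model in normalized (r, g) chromaticity space.
# These numbers are illustrative placeholders, not values from the report.
SKIN_MEAN = np.array([0.45, 0.31])
SKIN_COV = np.array([[0.0030, -0.0010],
                     [-0.0010, 0.0015]])


def skin_likelihood(rgb_uint8):
    """Return an unnormalized Gaussian skin likelihood in (0, 1] per pixel."""
    # Step 1: convert the 8-bit RGB image to double precision in [0, 1].
    rgb = np.asarray(rgb_uint8).astype(np.float64) / 255.0

    # Step 2: compute the likelihood of each pixel under the assumed model.
    total = rgb.sum(axis=2) + 1e-8            # avoid division by zero on black
    chroma = np.stack([rgb[..., 0] / total,   # r = R / (R + G + B)
                       rgb[..., 1] / total],  # g = G / (R + G + B)
                      axis=-1)
    diff = chroma - SKIN_MEAN
    inv_cov = np.linalg.inv(SKIN_COV)
    mahalanobis_sq = np.einsum("...i,ij,...j->...", diff, inv_cov, diff)
    return np.exp(-0.5 * mahalanobis_sq)


def detect_skin(rgb_uint8, threshold=0.5):
    """Step 3: threshold the likelihood to obtain a binary skin mask."""
    return skin_likelihood(rgb_uint8) >= threshold


# Tiny synthetic example: one skin-toned pixel, one blue, one white, one black.
image = np.array([[[200, 140, 110], [0, 0, 255]],
                  [[255, 255, 255], [0, 0, 0]]], dtype=np.uint8)
mask = detect_skin(image)
```

Working in chromaticity space discards overall brightness, which makes the mask less sensitive to lighting changes than a threshold on raw RGB values.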
These binary images can be used to detect the action in any given image with HOG (Histogram of
Oriented Gradients) descriptors. The technique counts occurrences of gradient orientation in localized portions
of an image. This method is similar to that of edge orientation histograms, scale-invariant feature
transform descriptors, and shape contexts, but differs in that it is computed on a dense grid of
uniformly spaced cells and uses overlapping local contrast normalization for improved accuracy.
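To make the description concrete, here is a minimal HOG sketch in NumPy. It is a simplified illustration rather than the implementation used in this project. It assumes 8x8-pixel cells, 9 unsigned orientation bins, 2x2-cell blocks with a stride of one cell, hard bin assignment (no interpolation between bins) and L2 block normalization. The function name `hog_descriptor` and its parameters are hypothetical.

```python
import numpy as np


def hog_descriptor(image, cell=8, bins=9, eps=1e-6):
    """Simplified HOG: dense cell histograms plus overlapping block L2 norm."""
    image = np.asarray(image, dtype=np.float64)

    # Image gradients (np.gradient returns the axis-0 derivative first).
    gy, gx = np.gradient(image)
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in [0, 180) degrees.
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0
    bin_idx = np.minimum((angle / (180.0 / bins)).astype(int), bins - 1)

    # One magnitude-weighted orientation histogram per cell on a dense grid.
    n_cy, n_cx = image.shape[0] // cell, image.shape[1] // cell
    hist = np.zeros((n_cy, n_cx, bins))
    for cy in range(n_cy):
        for cx in range(n_cx):
            rows = slice(cy * cell, (cy + 1) * cell)
            cols = slice(cx * cell, (cx + 1) * cell)
            hist[cy, cx] = np.bincount(bin_idx[rows, cols].ravel(),
                                       weights=magnitude[rows, cols].ravel(),
                                       minlength=bins)

    # Overlapping 2x2-cell blocks, each normalized to unit L2 length.
    blocks = []
    for cy in range(n_cy - 1):
        for cx in range(n_cx - 1):
            block = hist[cy:cy + 2, cx:cx + 2].ravel()
            blocks.append(block / np.sqrt(np.sum(block ** 2) + eps ** 2))
    return np.concatenate(blocks)


# Example: a binary mask with a single vertical edge down the middle.
mask = np.zeros((32, 32))
mask[:, 16:] = 1.0
descriptor = hog_descriptor(mask)
```

For a 32x32 input this yields 4x4 cells and 3x3 overlapping blocks, each contributing 4 x 9 values. Because neighbouring blocks share cells, each cell histogram appears several times in the descriptor under different normalizations, which is the "overlapping local contrast normalization" mentioned above.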
The final step in action recognition using Histogram of Oriented Gradients descriptors is to feed the
descriptors into a recognition system based on supervised learning. The Support Vector
Machine (SVM) classifier is a binary classifier which looks for an optimal hyperplane as a decision function.
Once trained on images containing a particular action, the SVM classifier can decide which
activity, such as a straddle or a bridge, is performed in additional test images.
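The following sketch shows the idea of learning a separating hyperplane from labelled descriptors. It is not the classifier used in this project. To stay dependency-free it trains a linear SVM with plain full-batch subgradient descent on the regularized hinge loss instead of calling a library solver. The two-dimensional toy vectors stand in for real HOG descriptors, and the mapping of +1 and -1 to two actions is hypothetical.

```python
import numpy as np


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Linear SVM via full-batch subgradient descent on the hinge loss.

    X: (n_samples, n_features) descriptors, y: labels in {-1, +1}.
    Returns the hyperplane parameters (w, b).
    """
    X = np.asarray(X, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    w = np.zeros(X.shape[1])
    b = 0.0
    n = len(y)
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1  # samples inside the margin or misclassified
        # Subgradient of lam/2 * ||w||^2 + mean(max(0, 1 - margin)).
        grad_w = lam * w - (X[viol].T @ y[viol]) / n
        grad_b = -y[viol].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predict(X, w, b):
    """Classify by the side of the hyperplane w.x + b = 0."""
    return np.where(np.asarray(X, dtype=np.float64) @ w + b >= 0, 1, -1)


# Toy stand-ins for descriptors of two actions (+1 and -1 are hypothetical
# labels, e.g. "straddle" versus "bridge").
X_train = np.array([[2, 2], [3, 3], [2, 3],
                    [-2, -2], [-3, -3], [-2, -3]])
y_train = np.array([1, 1, 1, -1, -1, -1])
w, b = train_linear_svm(X_train, y_train)
labels = predict([[2.5, 2.5], [-2.5, -2.5]], w, b)
```

Since the SVM is a binary classifier, recognizing more than two actions would require combining several such classifiers, for example one per action against all the others.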
The following results were obtained from skin detection on the gymnastics dataset; they
correctly identify the shape of the gymnast.