
Compressed-Sensing-Enabled Video Streaming for Wireless Multimedia Sensor Networks

This article presents the design of a networked system for joint compression, rate control and
error correction of video over resource-constrained embedded devices based on the theory of
compressed sensing. The objective of this work is to design a cross-layer system that jointly
controls the video encoding rate, the transmission rate, and the channel coding rate to maximize
the received video quality. First, compressed sensing based video encoding for transmission over
wireless multimedia sensor networks (WMSNs) is studied. It is shown that compressed sensing
can overcome many of the current problems of video over WMSNs, primarily encoder
complexity and low resiliency to channel errors. A rate controller is then developed with the
objective of maintaining fairness among video streams while maximizing the received video
quality. It is shown that the rate of compressed sensed video can be predictably controlled by
varying only the compressed sensing sampling rate. It is then shown that the developed rate
controller can be interpreted as the iterative solution to a convex optimization problem
representing the optimization of the rate allocation across the network. The error resiliency
properties of compressed sensed images and videos are then studied, and an optimal error
detection and correction scheme is presented for video transmission over lossy channels.

Existing System:

In existing layered protocol stacks based on the IEEE 802.11 and 802.15.4 standards, frames are
split into multiple packets. If even a single bit is flipped by channel errors, the cyclic
redundancy check fails and the entire packet is dropped at the final or an intermediate receiver. This can cause
the video decoder to be unable to decode an independently coded (I) frame, thus leading to loss
of the entire sequence of video frames.

Disadvantages:

Ideally, when one bit is in error, the effect on the reconstructed video should be
imperceptible, with minimal overhead; in addition, the perceived video quality should degrade
gracefully and proportionally with decreasing channel quality. Existing layered stacks provide
neither property.

Proposed System:

With the proposed controller, nodes adapt the rate of change of their transmitted video quality
based on an estimate of the impact that a change in the transmission rate will have on the
received video quality. While the proposed method is general, it works particularly well for
security videos. In contrast, existing techniques require that the encoder have access to the
entire video frame (or even multiple frames) before encoding the video.

Advantages:

The proposed CSV encoder is designed to: i) encode video at low complexity; and ii) exploit the
temporal correlation between frames.

Modules:-

1. CS Video Encoder (CSV)

The CSV video encoder uses compressed sensing to encode video by exploiting the spatial and
temporal redundancy within the individual frames and between adjacent frames, respectively.

• Sensing the channel: methods that detect congestion by sensing the channel incur higher energy
consumption and are therefore not suitable for WMSNs.

• Using extra packets: retransmitting dropped packets requires both the retransmission request
and the retransmission of the packet itself; these methods waste a great amount of energy on
congestion detection in sensor nodes.

• Low cost: some methods require no extra cost for congestion detection. These methods
are the most suitable for congestion detection in WMSNs.
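The CS acquisition step at the heart of the CSV encoder can be sketched as follows. This is an illustrative Python fragment, assuming a random Gaussian measurement matrix; the function name `cs_sample` and the chosen sampling rate are hypothetical, not taken from the paper:

```python
import numpy as np

def cs_sample(frame, sampling_rate, seed=0):
    """Take M = sampling_rate * N random measurements of a frame (y = Phi x).

    Illustrative sketch of compressed-sensing acquisition; the paper's
    actual measurement matrix and sampling rates may differ.
    """
    x = frame.reshape(-1).astype(float)             # vectorize the frame
    n = x.size
    m = max(1, int(sampling_rate * n))              # number of CS samples
    rng = np.random.default_rng(seed)
    phi = rng.standard_normal((m, n)) / np.sqrt(m)  # random measurement matrix
    return phi @ x                                  # incoherent samples

frame = np.arange(16).reshape(4, 4)
y = cs_sample(frame, sampling_rate=0.5)
print(len(y))  # 8 samples for a 16-pixel frame at a 50% sampling rate
```

Because the sampling rate directly sets the number of transmitted samples, varying it gives the predictable rate control exploited later by the rate controller.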

2. Rate Change Aggressiveness Based on Video Quality:

With the proposed controller, nodes adapt the rate of change of their transmitted video quality
based on an estimate of the impact that a change in the transmission rate will have on the
received video quality. The rate controller uses the estimated received video quality directly in
the rate control decision. If the sending node estimates that the received video quality is high,
and round-trip time measurements indicate that the current network congestion condition would
allow a rate increase, the node increases its rate less aggressively than a node estimating lower
video quality with the same round-trip time. Conversely, if a node is sending low-quality video,
it gracefully decreases its data rate, even if the RTT indicates a congested network. This is
obtained by basing the rate control decision on the marginal distortion factor, i.e., a measure of
the effect of a rate change on video distortion.
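A minimal sketch of this quality-aware behavior, assuming a marginal distortion factor normalized to [0, 1] and a simple RTT congestion threshold; the function name, the linear update rule, and the step size are all illustrative, not the paper's exact controller:

```python
def next_rate(rate, rtt, rtt_thresh, marginal_distortion, step=1000.0):
    """Quality-aware rate update (illustrative, not the paper's controller).

    marginal_distortion is assumed normalized to [0, 1]: values near 1 mean
    a rate change still affects received quality strongly (low-quality
    stream); values near 0 mean quality is already high.
    """
    if rtt <= rtt_thresh:
        # Uncongested: increase, but less aggressively when quality is
        # already high (small marginal distortion -> small step).
        return rate + step * marginal_distortion
    # Congested: back off, but gently for low-quality streams whose
    # distortion would grow quickly with a large rate cut.
    return rate - step * (1.0 - marginal_distortion)

print(next_rate(10000.0, 0.05, 0.2, 0.9))  # low quality, uncongested: large increase
print(next_rate(10000.0, 0.05, 0.2, 0.1))  # high quality, same RTT: gentle increase
```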

3. Video Transmission Using Compressed Sensing:

We develop a video encoder based on compressed sensing. We show that, by using the
difference between the CS samples of two frames, we can capture and compress the frames
based on the temporal correlation at low complexity without using motion vectors.
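Because the measurement operator is linear, differencing can be done directly on the sample vectors, which is what makes motion-vector-free temporal compression possible. A small numerical check of this linearity, with an illustrative Gaussian matrix and hypothetical sizes:

```python
import numpy as np

# Since CS sampling is linear, the difference of two frames' sample
# vectors equals the samples of the frame difference, so temporal
# redundancy can be exploited without motion estimation.
rng = np.random.default_rng(1)
n, m = 64, 16
phi = rng.standard_normal((m, n))   # illustrative measurement matrix

frame1 = rng.random(n)
frame2 = frame1.copy()
frame2[:4] += 0.1                   # small change between adjacent frames

y1, y2 = phi @ frame1, phi @ frame2
diff_of_samples = y2 - y1
samples_of_diff = phi @ (frame2 - frame1)

print(np.allclose(diff_of_samples, samples_of_diff))  # True: linearity
```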

4. Adaptive Parity-Based Transmission:

For a fixed number of bits per frame, the perceptual quality of video streams can be further
improved by dropping erroneous samples that would otherwise contribute incorrect information
to the image reconstruction. Comparing the reconstructed image quality with and without the
samples containing errors shows that, if the receiver knows which samples are in error, there is a
very large possible gain in received image quality when those samples are removed.

We studied adaptive parity with compressed sensing for image transmission, where we showed
that since the transmitted samples constitute an unstructured, random, incoherent combination of
the original image pixels, in CS, unlike traditional wireless imaging systems, no individual
sample is more important for image reconstruction than any other. Instead, the number of
correctly received samples is the dominant factor in determining the quality of the received
image.
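A toy sketch of the sample-level parity idea, assuming one even-parity bit per sample; the function names and bit layout are illustrative, not the paper's exact scheme:

```python
def parity_bit(sample_bits):
    """Even parity over one sample's bits (illustrative)."""
    return sum(sample_bits) % 2

def filter_samples(received):
    """Keep only samples whose stored parity still matches their bits.

    `received` is a list of (bits, parity) pairs. Samples failing the
    parity check are dropped individually, instead of discarding the
    whole packet, since in CS no single sample matters more than any
    other -- only the count of correct samples does.
    """
    return [bits for bits, p in received if parity_bit(bits) == p]

# One sample corrupted in transit: its bit flip breaks the parity check.
good = ([1, 0, 1, 1], 1)
corrupted = ([1, 0, 0, 1], 1)   # one bit flipped after encoding
print(len(filter_samples([good, corrupted])))  # 1
```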

Hardware Required:

• System : Pentium IV 2.4 GHz

• Hard Disk : 40 GB

• Floppy Drive : 1.44 MB

• Monitor : 15" VGA color

• Mouse : Logitech

• Keyboard : 110 Keys enhanced

• RAM : 512 MB

Software Required:

• O/S : Windows XP

• Language : C#.Net

Local Directional Number Pattern for Face Analysis: Face and Expression Recognition
This paper proposes a novel local feature descriptor, local directional number pattern (LDN), for
face analysis, i.e., face and expression recognition. LDN encodes the directional information of
the face’s textures (i.e., the texture’s structure) in a compact way, producing a more
discriminative code than current methods. We compute the structure of each micro-pattern with
the aid of a compass mask that extracts directional information, and we encode such information
using the prominent direction indices (directional numbers) and sign—which allows us to
distinguish among similar structural patterns that have different intensity transitions. We divide
the face into several regions, and extract the distribution of the LDN features from them. Then,
we concatenate these features into a feature vector, and we use it as a face descriptor. We
perform several experiments in which our descriptor performs consistently under illumination,
noise, expression, and time lapse variations. Moreover, we test our descriptor with different
masks to analyze its performance in different face analysis tasks.

EXISTING SYSTEM:

In the literature, there are many methods for the holistic class, such as, Eigenfaces and
Fisherfaces, which are built on Principal Component Analysis (PCA); the more recent 2D PCA,
and Linear Discriminant Analysis are also examples of holistic methods. Although these
methods have been studied widely, local descriptors have gained attention because of their
robustness to illumination and pose variations. Heisele et al. showed the validity of
component-based methods and how they outperform holistic methods. The local-feature
methods compute the descriptor from parts of the face, and then gather the information into one
descriptor. Among these methods are Local Features Analysis, Gabor features, Elastic Bunch
Graph Matching, and Local Binary Pattern (LBP). The last, originally designed for texture
description, was later applied to face recognition. LBP achieved better performance than
previous methods; thus it gained popularity and was studied
extensively. Newer methods tried to overcome the shortcomings of LBP, like Local Ternary
Pattern (LTP), and Local Directional Pattern (LDiP). The last method encodes the directional
information in the neighborhood, instead of the intensity. Also, Zhang et al. explored the use of
higher order local derivatives (LDeP) to produce better results than LBP. Both methods use other
information, instead of intensity, to overcome noise and illumination variation problems.
However, these methods still suffer in non-monotonic illumination variation, random noise, and
changes in pose, age, and expression conditions. Although some methods, like Gradientfaces,
have a high discrimination power under illumination variation, they still have low recognition
capabilities for expression and age variation conditions. Some methods have explored different
features, such as infrared, near-infrared, and phase information, to overcome the illumination
problem while maintaining performance under difficult conditions.

DISADVANTAGES OF EXISTING SYSTEM:


• Both methods use other information, instead of intensity, to overcome noise and illumination
variation problems.

• These methods still suffer under non-monotonic illumination variation, random noise, and
changes in pose, age, and expression conditions.

• Although some methods, like Gradientfaces, have high discrimination power under
illumination variation, they still have low recognition capabilities under expression and age
variation conditions.

PROPOSED SYSTEM:

In this paper, we propose a face descriptor, Local Directional Number Pattern (LDN), for robust
face recognition that encodes the structural information and the intensity variations of the face’s
texture. LDN encodes the structure of a local neighborhood by analyzing its directional
information. Consequently, we compute the edge responses in the neighborhood, in eight
different directions with a compass mask. Then, from all the directions, we choose the top
positive and negative directions to produce a meaningful descriptor for different textures with
similar structural patterns. This approach allows us to distinguish intensity changes (e.g., from
bright to dark and vice versa) in the texture. Furthermore, our descriptor uses the information of
the entire neighborhood, instead of using sparse points for its computation like LBP. Hence, our
approach conveys more information into the code, yet it is more compact, as it is only six bits long.
Moreover, we experiment with different masks and resolutions of the mask to acquire
characteristics that may be neglected by just one, and combine them to extend the encoded
information. We found that the inclusion of multiple encoding levels produces an improvement
in the detection process.
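The 6-bit code construction described above can be sketched as follows, assuming the eight compass-mask (e.g., Kirsch) edge responses have already been computed for a 3x3 neighborhood; `ldn_code` is a hypothetical helper name, not from the paper:

```python
def ldn_code(responses):
    """Illustrative LDN encoding: pack the index of the top positive edge
    response and the index of the top negative edge response into one
    6-bit code (3 bits each). Keeping both signs lets the descriptor
    distinguish intensity transitions (bright-to-dark vs. dark-to-bright)
    that similar structural patterns would otherwise confuse.
    """
    i_max = max(range(8), key=lambda i: responses[i])  # prominent positive direction
    i_min = min(range(8), key=lambda i: responses[i])  # prominent negative direction
    return (i_max << 3) | i_min                        # 6-bit directional number

code = ldn_code([5, -2, 9, 0, -7, 3, 1, -1])
print(code)  # max at direction 2, min at direction 4 -> (2 << 3) | 4 = 20
```

Per-pixel codes like this are then histogrammed region by region and concatenated into the face descriptor, as described above.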

ADVANTAGES OF PROPOSED SYSTEM:

1) The coding scheme is based on directional numbers, instead of bit strings, which encodes the
information of the neighborhood in a more efficient way;

2) The implicit use of sign information: in comparison with previous directional and derivative
methods, we encode more information in less space while discriminating more textures; and

3) The use of gradient information makes the method robust against illumination changes and
noise.

SYSTEM ARCHITECTURE:

MODULES:

1. Face Recognition

2. Histogram Generation

3. Expression Recognition

4. Face Retrieval

MODULES DESCRIPTION:

1. Face recognition:

In the first module, the user first indexes the image dataset folder; once the index is built, the
system shows the number of images in the indexed folder. The user then selects a query image.
The LH and MLH are used during the face recognition process. The objective is to compare the
encoded feature vector of one person with every candidate's feature vector using the Chi-square
dissimilarity measure, computed between two feature vectors F1 and F2 of length N. The face
corresponding to the feature vector with the lowest measured value is the match.
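The Chi-square dissimilarity used for matching can be sketched as follows; this is the common form of the statistic (the paper may weight face regions differently), and `eps` is an illustrative guard against division by zero:

```python
def chi_square(f1, f2, eps=1e-10):
    """Chi-square dissimilarity between two histogram feature vectors F1
    and F2 of equal length N; the candidate with the lowest value is the
    best match."""
    return sum((a - b) ** 2 / (a + b + eps) for a, b in zip(f1, f2))

same = chi_square([1, 2, 3], [1, 2, 3])
diff = chi_square([1, 2, 3], [3, 2, 1])
print(same < diff)  # True: identical vectors give the smallest dissimilarity
```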

2. Histogram generation:

In this module, the histogram is generated based on the query image selected from the image
dataset. The horizontal axis of the graph represents the tonal variations, while the vertical
axis represents the number of pixels in that tone. The left side of the horizontal axis
represents black and dark areas, the middle represents medium grey, and the right-hand side
represents light and pure-white areas. The vertical axis represents the size of the area that is
captured in each one of these zones. Thus, the histogram for a very dark image will have the
majority of its data points on the left side and center of the graph. Conversely, the histogram for
a very bright image with few dark areas and/or shadows will have most of its data points on the
right side and center of the graph.
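The tonal histogram described above amounts to a per-tone pixel count over the flattened 8-bit grayscale image; a minimal sketch (function name illustrative):

```python
def gray_histogram(pixels, bins=256):
    """Count how many pixels fall in each tone: index = tonal value,
    value = pixel count. Dark images pile up on the left (low indices),
    bright images on the right."""
    hist = [0] * bins
    for p in pixels:
        hist[p] += 1
    return hist

# A mostly dark 2x2 image: three dark pixels, one bright one.
hist = gray_histogram([10, 10, 12, 200])
print(hist[10], hist[200])  # 2 1
```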

3. Expression Recognition:

We perform the facial expression recognition by using a Support Vector Machine (SVM) to
evaluate the performance of the proposed method. SVM is a supervised machine learning
technique that implicitly maps the data into a higher dimensional feature space. Consequently, it
finds a linear hyperplane, with a maximal margin, to separate the data in different classes in this
higher dimensional space. After the histogram is generated in the previous module, we extract
all the features automatically and store them separately. Based on the extracted features,
the expression is recognized.
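As a one-dimensional illustration of the maximal-margin idea (not the actual SVM training used in the project, which would run a library solver over the full LDN feature vectors), the separating threshold for two separable classes lies midway between the closest opposite-class examples, i.e., the support vectors:

```python
def max_margin_threshold(neg, pos):
    """1-D sketch of the maximal-margin principle behind SVMs: the
    decision boundary sits midway between the closest examples of the
    two classes (the support vectors), assuming neg < pos and the
    classes are separable. Names and data are illustrative.
    """
    sv_neg = max(neg)             # closest negative example to the boundary
    sv_pos = min(pos)             # closest positive example to the boundary
    return (sv_neg + sv_pos) / 2  # maximal-margin decision threshold

t = max_margin_threshold([1, 3], [9, 12])
print(t)  # 6.0: midway between support vectors 3 and 9
```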

4. Face Retrieval:

In this module, we retrieve similar images based on the expression recognized in the previous
module. The efficiency of the descriptor depends on its representation and the ease of extracting
it from the face. Ideally, a good descriptor should have high variance among classes (between
different persons or expressions), but little or no variation within classes (the same person or
expression under different conditions). These descriptors are used in several areas, such as
facial expression and face recognition.

SYSTEM CONFIGURATION:-
HARDWARE REQUIREMENTS:-

• Processor - Pentium IV

• Speed - 1.1 GHz

• RAM - 256 MB (min)

• Hard Disk - 20 GB

• Key Board - Standard Windows Keyboard

• Mouse - Two or Three Button Mouse

• Monitor - SVGA

SOFTWARE REQUIREMENTS:

• Operating system : Windows XP

• Coding Language : C#.Net
