BITS Pilani, Hyderabad Campus
Multimedia Computing
Research Trends and Review of Key Terms
Module 10 (of 10)
Abhishek Thakur, CSIS
Modules
1. Introduction
2. Data Compression
4. Video / Audio Fundamentals
5. Video Compression
6. Audio and Synchronization
7. Storage and Communication Basics
8. Multimedia Communication
9. Modern Multimedia
10.1 HEVC
High Efficiency Video Coding
Aims for a 50% bit-rate reduction at similar video quality compared to Advanced Video Coding (AVC / H.264 / MPEG-4 Part 10)
Alternatively, a 25% bit-rate reduction with a 50% reduction in complexity at the same quality
Feasible because of faster processors, larger buffers, and more GPU cores
Ref: Wikipedia and Gary Sullivan's "Overview of the High Efficiency Video Coding (HEVC) Standard" (http://goo.gl/6aXHK3 and http://goo.gl/SoLBEY)
Multimedia Computing, 10/4/16
HEVC Needs
Higher-resolution video content and displays (up to 8K)
Higher frame rates (60 or 120 fps, e.g. for super slow motion)
Medical and other fields needing higher color depth (say 10-bit instead of 8-bit luminance)
More CPU/GPU cores need better parallelization of the coding logic
Multi-view coding (multiple video feeds / 3D, though partially done in AVC)
Version 1 of HEVC was formally ratified in April 2013
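To see why these needs call for a stronger codec, a quick back-of-the-envelope calculation (the parameter values here are illustrative, not from the slides) gives the raw, uncompressed data rate of an 8K, 120 fps, 10-bit stream:

```python
# Illustrative arithmetic: raw data rate of an 8K, 120 fps, 10-bit 4:2:0
# video stream, before any compression. Numbers chosen for illustration.
width, height = 7680, 4320          # 8K UHD resolution
fps = 120                           # high frame rate
bits_per_sample = 10                # 10-bit color depth
samples_per_pixel = 1 + 2 * 0.25    # 4:2:0: luma + two quarter-res chroma
raw_bps = width * height * samples_per_pixel * bits_per_sample * fps
print(raw_bps / 1e9)                # roughly 59.7 Gbit/s uncompressed
```

Nearly 60 Gbit/s of raw data is far beyond practical storage and network capacity, which is why a codec roughly twice as efficient as AVC matters at these settings.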
Other changes
Wavefront parallel processing (WPP)
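The idea behind WPP can be sketched as a scheduling problem: each row of coding tree units (CTUs) can start once the row above is two CTUs ahead, because the entropy-coding context is inherited from the second CTU of the row above. A minimal sketch of that dependency pattern (the two-CTU lag is the standard WPP rule; the code itself is illustrative):

```python
# Sketch of wavefront parallel processing (WPP) scheduling: CTU (x, y)
# depends on its left neighbour and on CTU (x+1, y-1) in the row above,
# so row y can start two CTUs behind row y-1 and rows run in parallel.
def wpp_schedule(rows, cols):
    """Earliest parallel time step at which each CTU can be coded."""
    step = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            deps = []
            if x > 0:
                deps.append(step[y][x - 1])                      # left CTU
            if y > 0:
                deps.append(step[y - 1][min(x + 1, cols - 1)])   # row above
            step[y][x] = 1 + max(deps, default=-1)
    return step

s = wpp_schedule(3, 6)
# Each row starts 2 steps after the one above, so the whole frame
# finishes in cols + 2*(rows-1) steps instead of rows*cols.
```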
Sometimes internal (implicit) metadata can be used, e.g. the title and other details within the container (file header fields), the time and location of capture/edit, and the tools used for capture/edit.
The next level is to search content beyond the headers, e.g. subtitles (transcripts or closed-caption info), audio search, story search, image search, scene search, etc.
How the search engine is driven is also important: is it pure text search, sample-image-based search, or does it interpret implicit user intent?
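The first level, searching only implicit metadata, amounts to filtering on container header fields. A hypothetical sketch (the field names and sample records here are invented for illustration):

```python
# Hypothetical sketch: first-level search over implicit container
# metadata (title, capture time, tool), before any content-based search.
# Field names and records are invented for illustration.
videos = [
    {"title": "Campus tour", "captured": "2016-03-01", "tool": "GoPro"},
    {"title": "Lab demo",    "captured": "2016-04-02", "tool": "OBS"},
]

def metadata_search(videos, **criteria):
    """Return videos whose header fields contain all given substrings."""
    return [v for v in videos
            if all(val.lower() in v.get(key, "").lower()
                   for key, val in criteria.items())]

hits = metadata_search(videos, title="lab")
```

Anything beyond this (subtitle, audio, or scene search) requires indexing the content itself rather than the headers.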
Search against a given input: e.g. search across videos to match the image of a missing person or pet
Video Analysis
There are three key steps in video analysis:
detection of interesting moving objects,
tracking of such objects from frame to frame, and
analysis of object tracks to recognize their behaviour.
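The three steps above can be sketched as a pipeline skeleton. The detector and matcher here are placeholders for illustration; real systems would plug in the detection and tracking methods discussed on the following slides:

```python
# Skeleton of the three-step pipeline: detect -> track -> analyse.
# `detect` and `match` are placeholder callables, not real algorithms.
def analyse_video(frames, detect, match):
    tracks = {}                     # track id -> list of detections
    next_id = 0
    for frame in frames:
        for obj in detect(frame):           # step 1: detect moving objects
            tid = match(tracks, obj)        # step 2: frame-to-frame tracking
            if tid is None:                 # unmatched -> start a new track
                tid, next_id = next_id, next_id + 1
            tracks.setdefault(tid, []).append(obj)
    return tracks                   # step 3: analyse tracks for behaviour

# Toy usage: one object per frame, always matched to the first track.
frames = [[(0, 0)], [(1, 1)]]
tracks = analyse_video(frames, lambda f: f,
                       lambda tracks, obj: 0 if tracks else None)
```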
Tracking - phases
Tracking task:
In the simplest form, tracking can be defined as the problem of
estimating the trajectory of an object in the image plane as it
moves around a scene. In other words, a tracker assigns
consistent labels to the tracked objects in different frames of a
video. Additionally, depending on the tracking domain, a tracker can
also provide object centric information, such as orientation, area, or
shape of an object.
How - Two subtasks:
Build some model of what you want to track
Use what you know about where the object was in the previous
frame(s) to make predictions about the current frame and restrict the
search
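The second subtask, predicting from previous frames and restricting the search, can be sketched under the common constant-velocity assumption (the window radius below is an arbitrary illustrative value):

```python
# Sketch: use the object's positions in the two previous frames to
# predict its position in the current frame (constant-velocity
# assumption), then search only a window around that prediction.
def predict(prev2, prev1):
    """Constant-velocity prediction from the last two positions."""
    vx, vy = prev1[0] - prev2[0], prev1[1] - prev2[1]
    return (prev1[0] + vx, prev1[1] + vy)

def search_window(center, radius=16):
    """Bounding box to search instead of scanning the whole frame."""
    cx, cy = center
    return (cx - radius, cy - radius, cx + radius, cy + radius)

guess = predict((100, 50), (110, 52))   # object moving right and down
box = search_window(guess)              # restrict matching to this box
```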
Tracking : Complexity
Tracking objects can be complex due to:
loss of information caused by projection of 3D world on 2D image
noise in images
complex object shapes / motion
non-rigid or articulated nature of objects
partial and full object occlusions
scene illumination changes
real-time processing requirements
Constraints to Simplify:
Almost all tracking algorithms assume that the object motion is smooth with no
abrupt changes
The object motion is assumed to be of constant velocity
Prior knowledge about the number and the size of objects, or the object
appearance and shape
Object representations: (a) centroid, (b) multiple points, (c) rectangular patch, (d) elliptical patch, (e) part-based multiple patches, (f) object skeleton, (g) complete object contour, (h) control points on object contour, (i) object silhouette.
Object Detection
Either at beginning or when an object first appears in the video
Point detectors: find interest points in images which have an
expressive texture in their respective localities
Segmentation: partition the image into perceptually similar
regions
Background subtraction: build a representation of the scene, called the background model, then find deviations from the model for each incoming frame
Supervised classifiers: prior training on pre-classified sample images (SVM, neural networks, etc.)
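Background subtraction is simple enough to sketch directly. A minimal version, assuming grayscale frames flattened to lists of pixel intensities and a running-average background model (the threshold and learning rate are illustrative):

```python
# Sketch of background subtraction: maintain a per-pixel running-average
# background model and flag pixels that deviate from it by more than a
# threshold as foreground in each incoming frame.
def update_background(bg, frame, alpha=0.05):
    """Exponential running average of per-pixel intensities."""
    return [(1 - alpha) * b + alpha * f for b, f in zip(bg, frame)]

def foreground_mask(bg, frame, threshold=30):
    """1 where the frame deviates from the background model, else 0."""
    return [1 if abs(f - b) > threshold else 0 for b, f in zip(bg, frame)]

bg = [10.0, 10.0, 10.0, 10.0]          # learned background (flat scene)
frame = [10, 200, 11, 9]               # a bright object covers pixel 1
mask = foreground_mask(bg, frame)      # only pixel 1 flagged
bg = update_background(bg, frame)      # slowly absorb scene changes
```

The slow update lets the model adapt to gradual illumination changes, one of the complexities listed on the earlier tracking slide.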
Detection Examples
Object Segmentation
Object Tracking
Audio Video Fundamentals
Analog video: connectivity (separate color and audio etc.), scan lines, interlacing, color and audio subcarriers
Digital video: chroma subsampling, screen resolution, frame rate
Digitization of sound: sampling and quantization, aliasing and the Nyquist sampling rate, SNR, logarithmic amplitudes (dB/dBm), non-linear transform before encoding, human voice vs. music, band-limiting / low-pass filtering for voice, synthetic sound, PCM, DPCM, Adaptive Delta Modulation, Adaptive DPCM
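Two of the review items above combine into a quick worked example: the Nyquist rate fixes the sampling frequency, and the bit depth fixes the quantization SNR via the standard rule of thumb:

```python
# Rule of thumb: each quantization bit adds about 6 dB of SNR
# (SNR ~ 6.02*N + 1.76 dB for a full-scale sine with N bits/sample).
def quantization_snr_db(bits):
    return 6.02 * bits + 1.76

# Nyquist: audio content up to 20 kHz needs a sampling rate above
# 40 kHz, which is why CD audio uses 44.1 kHz with 16-bit samples.
cd_snr = quantization_snr_db(16)    # about 98 dB for CD audio
```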
Synchronization
Temporal sync between audio, video, pointer, etc.; logical data units (LDUs); compensating for loss of sync
Synchronization specification models: interval based, timeline/axis based, hierarchical flow control (serial/parallel), reference-point-based flow control, event based
Thank You !!