
MULTIMEDIA

SIGNAL PROCESSING
MMSP
SGN-5016
Irek Defe
Tietotalo TF 316
irek.defee@tut.fi

Course info
Lectures: Room TB 219
Tue and Wed 10.15-12
Exercises mandatory
Exam written

Course info
Course Web page
http://www.cs.tut.fi/~defee/mulsp.html
Course material is regularly updated;
please use only the updated material

Exercises for SGN-5016
Multimedia Signal Processing
Petri Hirvonen
petri.hirvonen@tut.fi
http://www.cs.tut.fi/~hirvone2/5016_exercises.htm

Exercises

TC303
Group1: 10:15-12:00, TC 303 25.02
Group2: 10:15-12:00, TC 303 26.02
You can participate in one or both of the exercise groups
if there is space; if not, attend one group.
A written report is returned by e-mail after each exercise.
The details about the report are included in the exercise
material.

MULTIMEDIA SIGNAL PROCESSING


WHAT IS THIS COURSE ABOUT???
1. WHAT IS MULTIMEDIA (MM) ?
2. WHAT IS THE TOPIC OF MULTIMEDIA
SIGNAL PROCESSING?
(THIS AREA IS NOT WELL DEFINED YET)

WHAT IS MULTIMEDIA?
COMPOSED OF MULTI+MEDIA
MEDIA = MEDIUM OF COMMUNICATION
WE COMMUNICATE NATURALLY:
VISUALLY, BY SPEECH, BY TOUCH
WE COMMUNICATE BY TECHNOLOGY:
RADIO (MOBILE PHONES), TV, PRESS,
CINEMA, BOOKS

PEOPLE USE VARIOUS COMMUNICATION
MEDIA: SPEECH, VISION, TOUCH.
IN THE PAST WHEN PEOPLE
COMMUNICATED THEY HAD TO USE
THOSE MEDIA DIRECTLY.
IN PRESENT CIVILISATION THERE ARE
MANY TECHNOLOGIES WHICH
EXTEND HUMAN COMMUNICATION

GENERAL MODEL OF HUMAN COMMUNICATION


PRODUCER OF INFORMATION: HUMAN
RECEIVER OF INFORMATION: HUMAN

COMMUNICATION MEDIUM, NATURAL
(E.G. VOICE, TOUCH): WE USE A SPECIFIC
PHYSICAL MEDIUM, E.G. AIR, PLUS PRODUCTION OF
SPECIALLY ENCODED SIGNALS FOR CONVEYING
INFORMATION
COMMUNICATION MEDIUM, INDIRECT VIA
TECHNOLOGY (E.G. CINEMA, RADIO, PRESS, TV)

MORE RECENT IS A MODEL OF
HUMAN-MACHINE
COMMUNICATION, OR EVEN
MACHINE-MACHINE COMMUNICATION
WHEN WE USE COMPUTERS, WE
COMMUNICATE WITH A MACHINE.
THE COMMUNICATION MEDIA ARE:
TOUCH/GESTURE <-> KEYBOARD, MOUSE
VISION <-> DISPLAY
HEARING <-> SOUND

HUMANS CAN USE SEVERAL DIFFERENT
MEDIA FOR COMMUNICATION,
E.G. SPEECH, TOUCH, VISUAL SYSTEM
HUMANS OFTEN USE SEVERAL
MEDIA SIMULTANEOUSLY, OR IN OTHER
WORDS MULTIPLE MEDIA = MULTIMEDIA
FOR EXAMPLE: WHEN WE TALK WITH
SOMEBODY WE USE GESTURES AND FACIAL
EXPRESSIONS

IN FACT PEOPLE PREFER TO USE
MULTIPLE MEDIA = MULTIMEDIA
- WE CAN USE A SINGLE MEDIUM, E.G.
SPEECH WHEN TALKING ON THE PHONE
BUT SEEING EACH OTHER WHEN
TALKING ENHANCES THE CONTACT
- WE CAN LISTEN TO THE RADIO, E.G.
NEWS, BUT TV IS PREFERRED EVEN IF
WE JUST SEE A PERSON READING THE
NEWS
- MULTIMEDIA IS MORE NATURAL FOR
PEOPLE

THERE IS ANOTHER USE OF THE WORD
MEDIA, IN THE SENSE OF
THE MEDIA INDUSTRY
THE MEDIA INDUSTRY DEALS WITH
PRODUCING, DISTRIBUTING AND
SELLING INFORMATION ADDRESSING
THE HUMAN MEDIA SYSTEM
MULTIMEDIA INFORMATION IS VERY
IMPORTANT FOR THE INDUSTRY
THERE ARE MANY ENGINEERING
PROBLEMS IN DEALING WITH
MULTIMEDIA INFORMATION

WHAT IS MULTIMEDIA SIGNAL
PROCESSING (MMSP)?
IT IS ABOUT PROCESSING,
COMMUNICATION AND UTILIZATION
OF INFORMATION USED BY HUMANS
ONE CAN CONSIDER THREE
SCENARIOS OF USAGE:
1. HUMAN-HUMAN
2. HUMAN-MACHINE
3. MACHINE-MACHINE

WHY IS MULTIMEDIA SIGNAL PROCESSING
POSSIBLE? THIS IS BECAUSE WE HAVE
MEANS FOR DIGITAL REPRESENTATION
AND PROCESSING OF ANY TYPE OF
INFORMATION.
IF WE TALK ON THE PHONE, LISTEN TO
MUSIC FROM AN MP3 PLAYER, WATCH
A MOVIE FROM A DVD DISC, OR TAKE A PICTURE
WITH A CAMERA, WE KNOW THAT
THE INFORMATION IS REPRESENTED BY BITS
AND PROCESSED DIGITALLY
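
As a rough illustration of "information represented by bits", the sketch below samples a sine tone and quantizes each sample to an 8-bit code. The tone frequency, the sample rate, the bit depth and all function names are assumptions chosen for this example only; they are not taken from the course material, and real audio systems use more elaborate formats.

```python
import math

def sample_and_quantize(freq_hz, sample_rate_hz, n_samples, bits=8):
    """Sample a unit-amplitude sine tone and quantize each sample to `bits` bits."""
    levels = 2 ** bits
    codes = []
    for n in range(n_samples):
        x = math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)  # value in [-1, 1]
        codes.append(round((x + 1) / 2 * (levels - 1)))           # integer in 0 .. levels-1
    return codes

def to_bits(codes, bits=8):
    """Concatenate the integer codes into one string of '0' and '1' characters."""
    return "".join(format(c, f"0{bits}b") for c in codes)

def reconstruct(codes, bits=8):
    """Map the integer codes back to approximate sample values in [-1, 1]."""
    levels = 2 ** bits
    return [c / (levels - 1) * 2 - 1 for c in codes]

# Hypothetical parameters: a 440 Hz tone sampled at 8000 Hz, 16 samples.
codes = sample_and_quantize(440.0, 8000.0, 16)
bitstream = to_bits(codes)
print(len(bitstream), "bits:", bitstream[:32], "...")
```

The reconstructed values differ from the original samples by at most half a quantization step, which is the price paid for a finite number of bits per sample.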

WHAT WE NEED ARE ALGORITHMS
FOR PROCESSING THE SIGNALS
DIGITALLY

MULTIMEDIA SIGNAL PROCESSING
IS ABOUT ALGORITHMS FOR THE
PROCESSING OF SIGNALS WHICH ARE
USED BY HUMANS FOR COMMUNICATION
WITH OTHER PEOPLE OR MACHINES, OR
FOR DEALING WITH THE WORLD AROUND THEM

WHAT ARE THE MEDIA SIGNALS?


MEDIA SIGNALS ARE THOSE SIGNALS
WHICH ARE ACCESSIBLE TO THE HUMAN
INFORMATION PROCESSING SYSTEM
ONE OF THE ISSUES IN MULTIMEDIA
SIGNAL PROCESSING IS WHAT TYPE OF
SIGNALS AND WHAT KIND OF
COMBINATIONS OF SIGNALS CAN BE
USED. FOR EXAMPLE: ACOUSTICAL
SIGNALS: SOUNDS, SPEECH-LANGUAGE,
MUSIC
WE CONVERT THOSE SIGNALS TO
DIGITAL FORMAT AND USE

EXAMPLE: DIGITAL MUSIC (CD, MP3,
DVD, INTERNET RADIO)
EXAMPLE: DIGITAL VIDEO (DVD, BLU-RAY,
INTERNET TV)
THESE ARE SYSTEMS FOR TRANSFERRING
CONTENT PRODUCED BY ARTISTS TO
PEOPLE. THESE SYSTEMS USE SPECIFIC
DIGITAL ENCODING AND COMPRESSION
OF INFORMATION TO RECORD THE
CONTENT.
THE QUESTION IS HOW TO DO THIS.
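
The idea of encoding and compression can be illustrated with a toy sketch. It assumes an 8-bit synthetic tone as the "content" and uses the general-purpose lossless compressor from Python's standard zlib module as a stand-in. Real music and video systems such as MP3 or DVD use specialized, mostly lossy, codecs; this example only shows that redundancy in a signal allows it to be stored in fewer bytes and recovered afterwards. The function name and all parameters are hypothetical.

```python
import math
import zlib

def make_tone_bytes(n_samples=8000, freq_hz=440.0, sample_rate_hz=8000.0):
    """Build one second of an 8-bit sine tone as raw bytes (illustrative signal only)."""
    return bytes(
        round((math.sin(2 * math.pi * freq_hz * n / sample_rate_hz) + 1) / 2 * 255)
        for n in range(n_samples)
    )

raw = make_tone_bytes()
packed = zlib.compress(raw, 9)      # lossless compression at the highest level
restored = zlib.decompress(packed)  # exact recovery of the original bytes

print("raw bytes:", len(raw))
print("compressed bytes:", len(packed))
print("lossless:", restored == raw)
```

A periodic tone is highly redundant, so it compresses well; natural music and video are less regular, which is one reason dedicated codecs exist.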

BUT WITH SUCH SYSTEMS A NEW
PROBLEM EMERGES:
HOW TO PROTECT MEDIA INFORMATION
FROM UNAUTHORIZED USE
(FOR EXAMPLE ILLEGAL COPYING)?


How to represent media information in
the most pleasing way?
Examples are High Definition technologies:
- Flat Displays
- HD DVD, Blu-ray discs, HDTV

THE SECOND MAIN ASPECT OF MMSP:
2. HUMAN-MACHINE COMMUNICATION
HOW TO MAKE INTERACTION WITH
COMPUTERS (AND OTHER MACHINES)
MORE NATURAL? NATURAL MEANS E.G. MORE
SIMILAR TO HUMAN-HUMAN INTERACTION,
MORE INTUITIVE, MORE PLEASING,
ATTRACTIVE.

THAT ALSO INCLUDES HOW TO MAKE
MACHINES MORE INTELLIGENT:
FOR EXAMPLE, INSTEAD OF TYPING
WE COULD TALK TO COMPUTERS AND
INSTEAD OF COMPUTERS PRINTING
ANSWERS ON SCREEN THEY WOULD TALK
TO US.
OR, IF COMPUTERS COULD SEE US
USING CAMERAS, THEY POSSIBLY
COULD REACT MORE LIKE PEOPLE.
BUT TODAY WE STILL USE KEYBOARD
AND MOUSE. WHY?

WE USE KEYBOARD AND MOUSE
BECAUSE WE DO NOT HAVE BETTER
TECHNOLOGY: WE DO NOT KNOW HOW
TO PROCESS SPEECH AND VISUAL
INFORMATION AS EFFECTIVELY AS
PEOPLE ARE ABLE TO DO
BUT WE MAY THINK OF COMPUTERS
WITH CAMERAS AND MICROPHONES
WHICH WILL BE ABLE TO DO SO
THIS MAY BECOME POSSIBLE BECAUSE
OF FAST PROGRESS IN DEVELOPMENT OF
ALGORITHMS AND PROCESSORS

THIS PROGRESS CAN BE ILLUSTRATED WITH
MANY EXAMPLES
- COMPARE A PC TODAY AND 10 YEARS AGO
(TODAY WE HAVE MULTICORE
PROCESSORS AND THE NUMBER OF CORES
IS GROWING FAST)
- COMPARE A MOBILE DEVICE TODAY AND
A MOBILE PHONE 10 YEARS AGO
(TODAY THE TELEPHONE FUNCTION IS
JUST ONE ADDITION TO MULTIPLE MEDIA
PROCESSING: MUSIC, VIDEO, CAMERA,
TOUCH, ORIENTATION)
EXTRAPOLATE THIS TO THE NEXT 10 YEARS!

WE CAN EXPECT IN THE FUTURE:
COMPUTERS, MOBILE DEVICES, AND ALL KINDS
OF OTHER DEVICES WILL BE MORE AND
MORE CLEVER (=INTELLIGENT?)
THESE SYSTEMS WILL RELY
ON INCREASINGLY SOPHISTICATED
MULTIMEDIA SIGNAL PROCESSING
CAPABILITIES

WE HAVE THUS TWO MAIN AREAS TO
COVER IN MMSP:
1. MEDIA INFORMATION PROCESSING
IN MULTIMEDIA SYSTEMS
2. MEDIA COMPUTER INTERFACE FOR
HUMAN-COMPUTER INTERACTION
THESE ARE THE TOPICS OF
THE MMSP COURSE

Please note, however, that our Multimedia Signal
Processing course is matched to the study program
at TUT, especially to the Multimedia Major.
We have many courses specialized in single media
processing: Digital Audio, Image Processing, Video
Processing, Video Compression, Pattern
Recognition.
We avoid overlapping with those courses. We are
also not going into algorithms which were proposed
by researchers but are not in wider use yet;
these are covered in other courses and seminars.
Other universities may not have so many
specialized courses, so their course content is different.
There is one absolutely basic observation:


MANY MULTIMEDIA SIGNAL PROCESSING
TASKS ARE ALREADY IMPLEMENTED IN
BIOLOGICAL SYSTEMS, ESPECIALLY IN
THE HUMAN INFORMATION PROCESSING
SYSTEM
FOR EXAMPLE: VISUAL AND ACOUSTICAL
COMMUNICATION BETWEEN PEOPLE,
USING VISUAL INFORMATION IN
RECOGNIZING OBJECTS. BIOLOGICAL
SYSTEMS DO IT PERFECTLY BUT WE DO
NOT KNOW HOW, THAT IS, WE DO NOT KNOW THE ALGORITHMS

IN THE FIRST PART OF THIS COURSE
WE SHALL COVER BASIC KNOWLEDGE
RELATED TO
HUMAN INFORMATION PROCESSING
THIS SYSTEM PROCESSES MEDIA
INFORMATION AND IT DOES IT IN
A FANTASTIC WAY. IF WE KNEW HOW
IT DOES IT, IT COULD HELP US TO
MAKE BETTER MEDIA INFORMATION
PROCESSING (BETTER MMSP ALGORITHMS)

BUT BEFORE WE GO FURTHER LET US GIVE
AN OVERVIEW OF MEDIA TECHNOLOGY, SHOWING
WHERE MULTIMEDIA SIGNAL PROCESSING
WILL BE USEFUL IN THE FUTURE

MULTIMEDIA SIGNAL PROCESSING
ALLOWS FOR NEW CLASSES OF DEVICES
AND SYSTEMS:
MORE SOPHISTICATED COMMUNICATION,
MORE ADVANCED INTERFACES

THEY ARE ILLUSTRATED NEXT

Mobile Multimedia Devices Examples

WHAT DO THESE MOBILE DEVICE EXAMPLES
SHOW US?
- DEVICES HAVE MULTIPLE SENSORS AND
MULTIPLE MEDIA PROCESSING CAPABILITIES
- TAKE ONE EXAMPLE - TOUCH

The device is controlled by fingers, e.g. changing picture size
or even playing guitar

What is still missing?

Maybe makeup, but this is a joke

ANOTHER EXAMPLE: DIGITAL CAMERAS


Digital cameras perform a lot of processing
for best picture quality. But recent cameras
have new features related to analysis of
visual information.
Face Detection: automatically detects a face in the frame and
adjusts focus, exposure, contrast, and skin complexion so it
turns out perfectly.
Face Recognition: a feature that remembers faces from
previous shots. When a familiar face is recorded several times,
the camera will prompt the user to register the face. Once
registered, if the face appears in the frame again, the camera
will display the name specified for that person and prioritize
focus and exposure for the face.
To make such a feature, a fast and reliable algorithm for
face detection and recognition is needed

COMPLETELY NEW TYPES OF DEVICES ARE
POSSIBLE: EXAMPLE Wii

Game & fitness accessories

Wii by Nintendo
Dancing pad, Balance board

Controllers have
motion sensors

Sports game, Music performance

Completely New Types of Devices

AIBO DOG PERSONAL ROBOT WITH SENSES

IT HAS SENSES:
MICROPHONE,
CAMERA, TEMPERATURE,
DISTANCE, ACCELERATION,
BALANCE, TOUCH
IT HAS INSTINCTS
AND BEHAVIORS

Completely New Types of Devices


"Is this a real cat?" A
robot cat you can bond
with like a real pet - NeCoRo is born

Omron ready to test demand for robo-cat

Via a learning function, personality traits
such as selfishness and the need for
attention will change in response to the
owner
Equipped with Omron's proprietary MaC (Mind and
Consciousness) technology, feelings are generated
according to recognition feedback, which is dependent on
configurations based on psychological concepts, leading
to cognitive decisions and actions determined by these
feelings (applicable patent acquired)
Feelings of satisfaction, anger, and uneasiness generated
based on recognition feedback

Desires to sleep or be cuddled
generated according to
physiological rhythms

PERSONAL ROBOTS
START APPEARING ...

Fujitsu has developed a new miniature
humanoid robot, named HOAP-1,
designed for wide application in research
and development of robotic technologies.
Fujitsu Automation will begin domestic
sales of the robot from today and hopes to
sell 100 units within three years.
Weighing 6kg and standing 48cm tall, the
light and compact HOAP-1 and
accompanying simulation software can be
used for developing motion control
algorithms in such areas as two-legged
walking, as well as in research on human-to-robot communication interfaces.
The basic simulation software and user-developed programs are designed to run
on RT-Linux on an operating command
PC, which communicates with the
robot through a USB interface. The robot's
internal sensors and actuators (motors)
also use a USB interface and can be easily
expanded according to needs

The two-legged walking
technology developed by
Honda represents a unique
approach to the challenge of
autonomous locomotion. Using
the know-how gained from
these prototypes, research and
development began on new
technology for actual use.
ASIMO represents the fruition
of this pursuit.

Menagerie of devices

Progress of technology is fast: even the old
television is changing. In 2010 three-dimensional
television, 3D TV, will start

3D TV set

Glasses

And also a first TV controlled
by hand gestures will be
available (but very expensive)

What do we see from these examples?
We can see that devices are developing to
have
- More complexity
- More intelligence
- More natural interaction with people
To add even more such features one needs
algorithms for multimedia signal processing.
Many of these algorithms should have
capabilities similar to biological systems.
