
Representing Time in Multimedia Systems

Thomas Wahl, Kurt Rothermel

Universität Stuttgart
Institut für Parallele und Verteilte Höchstleistungsrechner
{rothermel, wahl}@informatik.uni-stuttgart.de

Abstract¹

As multimedia systems integrate a variety of temporally interrelated media items, synchronization is an important issue in those systems. One part of synchronization is the representation of temporal information. Time models are needed to specify temporal interrelations. With the emerging interactive media, deterministic models are being replaced by indeterministic models with more expressiveness. Therefore, this paper classifies and evaluates a selection of the most common existing models by their expressiveness. Additionally, a temporal model based on temporal operators is proposed, providing high-level abstractions and a high degree of expressiveness for multimedia systems.

1 Introduction

Multimedia systems integrate a variety of media with different temporal characteristics, e.g. time dependent media, such as video, audio or animation, and time independent media, such as text, graphics and images [Stei90]. In monomedia environments, all media show the same basic temporal behavior, so time does not need any particular attention. With the arrival of multimedia systems, the various temporal interrelations between media items become more and more important.

Assuring the correct temporal appearance of the media items is called synchronization. The issue of synchronization is twofold. First, the temporal appearance of the presentation items, including their interrelations, has to be specified. The temporal specification has to be represented for reviewing by the user, for presentation planning by the system and for storing purposes. Secondly, the multimedia system has to guarantee the temporal constraints when presenting the media items. This paper focuses on the first issue of representing time in multimedia environments.

To represent interval relations between multimedia presentation items, temporal abstractions are needed. A set of temporal abstractions is called a temporal model in this paper. E.g. if the notion of time is a number of totally ordered events in a one-dimensional time space, the time line model might be used to represent temporal relations.

In any specific context, it is desired that all multimedia scenarios are representable by the applied temporal model. A temporal model is complete in a context if all scenarios are expressible within the model. So, which are complete temporal models? A number of temporal models have been proposed by various authors. To decide whether they are appropriate for multimedia or not, all scenarios would have to be enumerated in all contexts. Since this is not possible, we chose a different approach. First, we asked: What are all possible interrelations between temporal objects? In Section 2, a summary of all temporal relations in two different frameworks is given. Then, the question 'Which out of all possible temporal relations might be needed for multimedia?' is examined in Section 3. For a number of commonly used time models, the representable temporal relations are extracted in Section 4. Combining the results of Sections 3 and 4 with the context constraints, the completeness of a temporal model can be determined.

Analyzing the temporal models, we found that interval-based models generally do not use all interval relations although interval relations represent higher-level abstractions than point-based relations. So, we developed in Section 5 a complete operator set to represent a set of relevant interval relations.

Expressiveness is only one important criterion when choosing a temporal model. A second is the intuitiveness and nature of the temporal abstractions. Although formal models in temporal logics generally have a powerful expressiveness, they are not very intuitive and not easy to handle for multimedia users who are inexperienced in formal specification techniques. Thus, intuitive and natural operators were chosen to cover the interval relation space.

1. The research described in this paper was supported by the Deutsche Forschungsgemeinschaft (DFG) under grant Tiempo 1

0-8186-5530-5/94 $3.00 © 1994 IEEE

2 Basic temporal frameworks
To systematically develop a complete set of temporal in-
terrelations for multimedia, we examine how many and
which relations are theoretically possible. Depending on
their elementary units, two basic classes of temporal mod-
els can be distinguished [vBee92], [Rich89]. In the first
class, time is expressed by means of points in a one-dimen-
sional time space [ViKa86] whereas, in a second model
class, intervals are the atomic units of the time space
[Alle83]. This section introduces the basic temporal frame-
works, their elementary units and the relations between
them.

            number of PRs    number of IRs
basic       3                13
indefinite  2^3 = 8          2^13 = 8192
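
The two "indefinite" counts follow from treating an indefinite relation as a disjunction, i.e. a subset, of the basic relations. The following is a minimal sketch of that counting argument, not code from the paper; the names `BASIC_PRS`, `BASIC_IRS`, `PR_SYMBOLS` and `indefinite_relations` are chosen here for illustration only.

```python
from itertools import combinations

BASIC_PRS = ("<", "=", ">")
# The 13 basic interval relations, written with the symbols used in the text.
BASIC_IRS = ("<", ">", "m", "mi", "o", "oi", "s", "si", "d", "di", "f", "fi", "=")

# Symbol used in the text for each disjunction of basic point relations.
PR_SYMBOLS = {
    frozenset(): "∅",
    frozenset({"<"}): "<",
    frozenset({"="}): "=",
    frozenset({">"}): ">",
    frozenset({"<", "="}): "≤",
    frozenset({"=", ">"}): "≥",
    frozenset({"<", ">"}): "≠",
    frozenset({"<", "=", ">"}): "?",
}


def indefinite_relations(basic):
    """Return every subset (disjunction) of the given basic relations."""
    return [
        frozenset(subset)
        for size in range(len(basic) + 1)
        for subset in combinations(basic, size)
    ]


indefinite_prs = indefinite_relations(BASIC_PRS)
print(len(indefinite_prs))              # 8 indefinite PRs
print([PR_SYMBOLS[r] for r in indefinite_prs])
print(2 ** len(BASIC_IRS))              # 8192 indefinite IRs, counted without listing them
```

Both counts include the empty (inconsistent) relation, which is why later sections speak of 8191 consistent IRs.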

2.3 Translations between the frameworks


As any interval can be characterized by its beginning and end, any basic IR can be represented by a conjunction of PRs on its margins. E.g. (Bx < By) ∧ (By < Ex) ∧ (Ex < Ey) ∧ (Bx < Ey) is the equivalent of 'x overlaps y'. Also, a number of indefinite IRs can be represented by a PR conjunction on their margins. Table 3 summarizes how many indefinite IRs are representable by PR conjunctions.
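
The counts reported for this translation can be reproduced by brute force. The sketch below is not taken from the paper: it assumes that each basic IR is characterized by the PRs holding between the four endpoint pairs (Bx,By), (Bx,Ey), (Ex,By), (Ex,Ey), given Bx < Ex and By < Ey, and it names the relations from the point of view of x. `BASIC_IRS`, `PR_SETS` and `generated_irs` are illustrative names.

```python
from itertools import product

# Basic IR -> PRs between (Bx,By), (Bx,Ey), (Ex,By), (Ex,Ey),
# assuming Bx < Ex and By < Ey.
BASIC_IRS = {
    "<":  "<<<<",
    "m":  "<<=<",
    "o":  "<<><",
    "fi": "<<>=",
    "di": "<<>>",
    "s":  "=<><",
    "=":  "=<>=",
    "si": "=<>>",
    "d":  "><><",
    "f":  "><>=",
    "oi": "><>>",
    "mi": ">=>>",
    ">":  ">>>>",
}

# Each (possibly indefinite) PR, written as the basic PRs it allows.
PR_SETS = {
    "<": "<", "=": "=", ">": ">",
    "<=": "<=", ">=": "=>", "!=": "<>", "?": "<=>",
}


def generated_irs(base):
    """Distinct consistent IRs expressible as a conjunction of PRs from `base`."""
    result = set()
    for combo in product(base, repeat=4):
        allowed = frozenset(
            name
            for name, signature in BASIC_IRS.items()
            if all(signature[i] in PR_SETS[combo[i]] for i in range(4))
        )
        if allowed:  # skip inconsistent conjunctions
            result.add(allowed)
    return result


for base in (
    ["<", "=", ">"],
    ["<", "=", ">", "?"],
    ["<", "<=", "=", ">=", ">", "!=", "?"],
):
    print(base, len(generated_irs(base)))
```

Under these assumptions the three bases yield 13, 29 and 187 distinct IRs, matching Table 3; replacing the base by `["<", "<=", "=", ">=", ">", "?"]` gives the 82 IRs quoted for point nets in Section 4.2.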
point relation base       number of consistent indefinite IRs
<, =, >                   13
<, =, >, ?                29
≤, <, =, >, ≥, ≠, ?       187

TABLE 3. Indefinite IRs generated by point relations

relation       symbol   inverse   PR notation          class
x before y     <        >         Bx < Ex < By < Ey    sequential
x overlaps y   o        oi        Bx < By < Ex < Ey    parallel
y finishes x   f        fi        Bx < By < Ex = Ey    parallel
y during x     d        di        Bx < By < Ey < Ex    parallel
x starts y     s        si        Bx = By < Ex < Ey    parallel
x equals y     =        =         Bx = By < Ex = Ey    parallel

3 Temporal relations in multimedia

From Section 2, we learned that 8 indefinite PRs exist. However, are all of them necessary to specify time in multimedia? Obviously, the basic PRs <, =, > occur in multimedia because presentation events might be before, simultaneous to or after other events.

To evaluate the indefinite PRs ≤, ≥, ≠, ?, we have to consider the fact that small inaccuracies are tolerated in multimedia. E.g. in a video-audio presentation, the audience does not notice the skew introduced if the audio is presented too early or too late by some milliseconds [Stei92],

[LiKo92], [RoDe92]. So, we do not need to specify the temporal behavior at exactly one point in time; rather, it is sufficient to specify the temporal behavior close to each point in time. This implies that there is no perceptible difference in the presentation if somebody specifies for two events e1 and e2 that e1 < e2 or, in the second case, e1 ≤ e2. This holds because the audience cannot distinguish whether e1 is simultaneous to e2 or e1 is 1 millisecond before e2. Therefore, it is sufficient to be able to express only one of the relations < or ≤. In this paper, we operate with the relations < and >, and do not need the relations ≤ and ≥. Analogously, the relation ≠ differs from the ?-relation only in one point in time. Since there is no perceptible difference between the two relations, we do not need the relation ≠ if we have the relation ?. Observe that we need the relation ? if any basic PR can hold between two events. This indefinite relation often occurs during the specification and planning process when not all events are known yet.

Finally, the PR ∅ might be used to specify inconsistencies because ∅ means that no temporal relation between 2 events exists. Consequently, only one of the two events can occur. In this paper, we focus on consistent scenarios. So, we do not consider the inconsistent relation ∅.

To summarize, the relations <, =, > and ? are the relevant PRs in multimedia environments. Powerful point-based temporal models should be able to express at least this set of relations.

According to Section 2.3, the PRs <, =, > and ? generate 29 interval relations. An enumeration of the 29 IRs is given in [WaRo93]. Since at most 187 (Table 3) out of 2^13 - 1 = 8191 consistent IRs have a point-based conjunctive notation, we qualified only 187 - 29 = 158 as irrelevant for multimedia by looking at the PR notation. So, what is the relevance of the 8191 - 187 = 8004 indefinite IRs that cannot be represented by a PR conjunction? We found that not all of them are irrelevant. One of the relevant IRs is the relation 'not parallel', which is needed when limited resources are shared. For example, if there is only one loudspeaker, then two incompatible audio sequences should not be presented simultaneously. Therefore, we specify the audio sequences not to be parallel. This is expressed by the indefinite IR {<, m, mi, >}. Represented by PRs, a disjunction is needed: Ex ≤ By ∨ Ey ≤ Bx. Consequently, 'not parallel' cannot be represented in point-based systems that do not allow disjunctions.

4 Evaluation of multimedia time models

To specify the temporal interrelations of media items, several representation techniques based on different temporal models have been proposed by various authors. Each technique uses abstractions about time. When choosing a temporal model, one of the criteria is its expressive power, which is defined by the multimedia scenarios that are representable within the model. Since it is not possible to enumerate all potential scenarios, we characterize the expressive power of a number of commonly used temporal models by the temporal relations that are expressible. Our analysis determines the PRs and IRs of the selected temporal models and classifies each model as mainly point-based or interval-based. The latter question is not always easy to answer because some temporal models use intervals as their basic units but their relations address at most one end-point of each interval. Such models have essentially the same characteristics as point-based approaches.

4.1 Time-line

FIGURE 1. Time line model

The time line model is applied by [BHL91], [Gibb91], [Appl91], [Drap93] and in HyTime [HyTi92]. In the time line model, all events are aligned on a time axis (time-line) as shown in Figure 1. Since events are the atomic units, the time line model is point-based. All events are totally ordered on a time line. So, exactly one of the PRs <, =, > holds between any pair of events on a single time line. As all events are totally ordered, it is impossible not to define a relation between any two events. This means that the relation '?' cannot be expressed in the time-line model. This lack of flexibility is a major disadvantage of the time-line model. With <, = and > being the only possible PRs in the time-line model, we can conclude that the 13 basic IRs are the only IRs that are expressible in the time-line model.

4.2 Temporal point nets

FIGURE 2. Temporal point net

[BuZe93] use a point net to represent time specifications (Figure 2). Relations address events, establishing temporal equalities (=) and temporal inequalities (<, >). Although [BuZe93] does not mention it, a fourth relation (?) can be specified, meaning: the relation between two time points is not restricted. The ?-relation adds a flexibility to the model that cannot be found in the time-line model. Using the PRs <, =, > and ?, 29 IRs can be represented including the 13 basic IRs.

[BuZe93] also defines a relation construct 'before by at

least δ' where δ is a delay parameter describing the temporal distance between two events. This is why the point relations ≤ and ≥ can be specified as well. In total, the PRs <, ≤, =, ≥, >, ? are representable in the point net model, generating 82 IRs.

4.3 Timed petri-nets

FIGURE 3. Petri nets

A timed petri net model is proposed by [LiGh90] and [Hoep91]. The petri net of [Hoep91] is a mapping of the path notation onto petri nets and will be analyzed together with the path notation in Section 4.4. In this section, we essentially follow the petri net definition of [LiGh90]. There, intervals are represented by places and relations by transitions. In order to avoid ambiguities, we need the additional assumption that petri nets in this context are conflict-free. The basic units of the model are intervals. Therefore, this model is classified as interval-based although transitions refer only to end-points of intervals.

The relation '?' is specified if two places are not connected by any transition. As shown in Figure 3, <, =, > can be modelled by a transition in conjunction with a delay place δ. The delay place represents an idle time δ ⊆ ℝ₀⁺. If δ is in ℝ⁺, the corresponding relations are < and >. The relation = is modelled if δ = {0}. In this case, the place can be omitted, as is done in Figure 3. If δ is unrestricted in ℝ₀⁺, then ≤ or ≥ is expressed.

In petri nets, the PRs ≤, <, =, >, ≥ can be represented. Since Figure 3 assures that any combination of interval end-points can be connected by a relation, the petri net model is as powerful as the point net model. This means that 82 IRs can be expressed although [LiGh90] described only the 13 basic IRs.

4.4 Path expressions

Path expressions were introduced by [CaHa74] for procedure level synchronization and adapted by [Hoep91] for multimedia presentation systems. Path expressions include three operators to represent temporal relations: sequence, parallel-first and parallel-last.

FIGURE 4. Path expressions

The basic units of path expressions are intervals. However, all three operators express only IRs that can be described by a single PR. The sequence operator models a relation between the end-point of the first and the beginning of the second interval. The IRs that can be expressed by the sequence operator are {m} and {mi}. Using a delay interval [Hoep91], it is also possible to represent {<} and {>}.

For this classification, the operators parallel-first and parallel-last are identical because the attributes first and last give reference points for subsequent operators, which do not have any impact on our relation analysis. The parallel operators establish a relation between the start-points of two intervals. 3 indefinite IRs are expressible by the parallel operators: {s, =, si}, {di, o, fi, m, <} and {>, mi, oi, f, d}. To summarize, path expressions are only able to represent 7 IRs: 4 basic IRs {m}, {mi}, {<}, {>} and 3 non-basic indefinite IRs {s, =, si}, {di, o, fi, m, <}, {>, mi, oi, f, d}.

MHEG (Multimedia Hypermedia Expert Group) [MHEG92], [KrCa92] is a standardization group establishing a new standard for multimedia objects. MHEG uses two temporal operators, sequential and parallel, similarly to the path expression model. We will show in Section 5 that both operators are elements of a more general interval operator set.

4.5 Summary of the evaluation

Table 4 summarizes the multimedia time models, their basic types and the corresponding IRs that can be represented. It can be observed that none of the examined temporal models exceeds the expressive power of the point-based framework, not even those models that operate on intervals.

                                     number of interval relations representable
time model         type              total   basic   by the PRs <, =, >, ?
time-line          point-based       13      13      13
point nets         point-based       82      13      29
petri nets         interval-based    82      13      29
path expressions   interval-based    7       4       7

TABLE 4. Summary: Multimedia time models

5 An interval-based time representation

For multimedia, the point-based framework is sufficiently covered by the point net and petri net model. However, to get higher-level abstractions of temporal relations, it might be useful to develop operators for the interval-based framework because natural presentation actions appear as temporal intervals as they have a non-zero, finite

duration. Also, the interval-based framework covers more
relations than the point-based. Therefore, a complete set of
interval-based abstractions for multimedia is developed in
this section.

5.1 Modelling presentation actions


First, we have to introduce the notion of a presentation action. Any multimedia presentation is composed of single media items. The process of presenting a single media item is called a presentation action. Any action can be characterized by its margins, the beginning and the ending, and the duration δ, which describes the time that is required when presenting a media item. The duration δ has a specific fixed value for any real presentation. However, in the process of planning a presentation, the final duration might not be known. Therefore, the duration is described as a subset of the non-negative numbers ℝ₀⁺ [KeLo91] indicating the potential values of the duration. So, the duration can be a single real number, a range within the real numbers or totally unrestricted in ℝ₀⁺. E.g. the duration of a 90-minute video that might be stopped on user interaction is written as [0 min, 90 min] ⊆ ℝ₀⁺ because the real duration is 90 minutes or less depending on the user interaction. In the other case, the duration is denoted as [90 min, 90 min] = {90 min} ⊆ ℝ₀⁺, meaning the duration cannot be modified and has a fixed value.

A delay is a time span which passes without presenting any audio-visual output, and thus it is distinct from a presentation action with a perceivable output. On the other hand, the temporal characteristics are similar to those of presentation actions. So, a delay can be described as a subset of the non-negative real numbers ℝ₀⁺.

Note that, in this paper, it suffices to characterize a presentation action only by its temporal behavior. Other attributes, including those specifying the location, the quality or associated media of a presentation, are not subject of our investigation.

5.2 Enhanced interval-based model

Section 3 showed that the 4 multimedia-relevant PRs generate 29 IRs. So, an interval-based model should cover at least the 29 IRs [WaRo93]. If an interval operator is defined for each IR, the 29 operators represent a complete model. However, using 29 different operators seems to complicate a presentation specification. Therefore, it would be good to have fewer, more powerful operators which are still intuitive to multimedia authors.

Fortunately, the number of operators can be reduced by exploiting regularities between the IRs. Then, several IRs can be combined into one operator and the number of operators can be reduced from the original 29 to 10. Figure 5 shows the generic pattern for each of the operators. Formal definitions can be derived from the patterns. For example, the operator x before(δ1) y is defined by Ex + δ1 = By, i.e. the beginning of the interval y is δ1 time units after the end of the interval x.

FIGURE 5. Basic IR patterns and their generic operators

The first regularity is that some relations are inverse to each other. E.g., 'x meets y' is the inverse of 'y meets x'. So, we can use the operator before(δ1) to specify both relations: x before(0) y for 'x meets y' and x before⁻¹(0) y for 'y meets x'. In graphical notations, the inverse is expressed by an inverted edge.

The second regularity is that some relations differ from others only by an offset. E.g., 'x meets y' and 'x < y' are distinct only in so far as there is a non-zero time span between x and y in the case of 'x < y' and a zero time span in the case of 'x meets y'. IRs that differ only in offsets are combined into the same operator. Then, the IRs can be distinguished by the delay parameter of the operators. In the given example, we specify x before(0) y for 'x meets y' and x before(+) y for 'x < y'. As introduced in 5.1, the delay parameter may be any subset of ℝ₀⁺. We use the notation '0' if the delay is zero, '+' if the delay has a positive value, and '*' if the delay is positive or zero.

To avoid having several specification methods for the same IR, we require δ1 ≠ {0} for some of the operators in Figure 5. Then, the 10 operators are a complete set to specify any of the 29 IRs generated by <, =, > and ?.

The construction of the interval operators yields different types of operators taking 1, 2 or 3 delay parameters. The 1-parameter operators are before, cobegin, beforeendof and coend. Operators with 2 parameters are while, delayed, startin, endin and cross. Finally, overlaps is an operator that takes 3 parameters.

A delay or a duration parameter is fixed if only one value is admitted, e.g. a full length video with its natural presentation speed has a fixed duration of 90 min = [90 min, 90 min]. When specifying an IR, one has to specify the duration of the two presentation actions and up to 3 delay parameters. Hence, specifying 3 fixed values for the delay or the duration totally determines the final presentation sequence. Therefore, at most 3 fixed delay or duration parameters are allowed to avoid over-constrained specifications.
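
As an illustration of how one operator with a delay parameter covers several IRs, the before operator can be sketched as a check on concrete intervals. This is a minimal sketch, not the authors' implementation: it covers only before and its inverse, the one operator whose definition (Ex + δ1 = By) is stated in the text, and it assumes that a delay set can be modelled as a single range. The names `Delay`, `Interval`, `ZERO`, `POSITIVE`, `ANY`, `before` and `before_inverse` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Delay:
    """A set of admissible non-negative delays, modelled here as one range."""
    low: float
    high: float
    low_open: bool = False  # True excludes `low` itself

    def admits(self, value: float) -> bool:
        if value < self.low or value > self.high:
            return False
        return not (self.low_open and value == self.low)


ZERO = Delay(0.0, 0.0)                          # '0': the delay is zero
POSITIVE = Delay(0.0, math.inf, low_open=True)  # '+': the delay is positive
ANY = Delay(0.0, math.inf)                      # '*': the delay is positive or zero


@dataclass(frozen=True)
class Interval:
    begin: float
    end: float


def before(x: Interval, y: Interval, delay: Delay) -> bool:
    """x before(delay) y: Ex + d = By holds for some admissible d."""
    return delay.admits(y.begin - x.end)


def before_inverse(x: Interval, y: Interval, delay: Delay) -> bool:
    """x before^-1(delay) y is y before(delay) x."""
    return before(y, x, delay)


a, b, c = Interval(0, 10), Interval(10, 25), Interval(12, 20)
print(before(a, b, ZERO))          # 'a meets b'
print(before(a, c, POSITIVE))      # 'a < c'
print(before_inverse(b, a, ZERO))  # 'a meets b', stated from b's side
```

With this reading, before(0) corresponds to {m}, before(+) to {<}, and before(*) to the indefinite IR {m, <}.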

The 10 interval operators are a complete set to represent the 29 relations generated by <, =, > and ?. But this does not imply that all operators are needed to define a complete model for a multimedia environment. Sometimes, only a selection of the operators is necessary. E.g., if the duration of all media items is preknown and fixed, the temporal model may be restricted to the operators taking at most 1 delay parameter. Note that the requirements 'preknown and fixed duration' are very strict and prohibit any kind of interaction or flexibility. With the emerging interactive multimedia systems, it is expected that a larger subset of the interval operators is needed because interactive media items introduce a huge number of unpredictable durations.

5.3 Example

slide x while(3 sec, *) talk x,   x ∈ {1,..,n};
slide x before(0) slide x+1,   x ∈ {1,..,n-1};

FIGURE 6. Interactive slide show scenario

A multimedia presentation scenario is used to demonstrate how the interval operators work. Figure 6 shows how an author specifies a slide show applying interval operators. Any slide and its corresponding talk are presented simultaneously. Before each talk, there is a silence of 3 seconds to give the spectator a first impression of the slide. After a talk is finished, the user can look at the slide silently or continue to the next slide interactively.

In contrast, this scenario is not representable in the time-line model because the end-points of the slides are determined interactively. This means that the end-points are not known ahead of time. However, we need the end-point of the previous slide to specify the beginning of the next slide. We would have to pick a point on the time-line although we do not know when this point in time will be. This specification problem of the time-line model is caused by its lack of flexibility, i.e. the time-line requires a total specification of all temporal relations between media items, not admitting any indeterminism. Consequently, the time-line model is not appropriate for partial specifications or interactive media environments.

6 Conclusion

The very intuitive and natural time-line model is sufficient for specifying deterministic scenarios. However, as soon as interactive media are added, indeterministic temporal relations are present, requiring more expressiveness. Advanced point-based models such as the petri net or the point net might be applied. In cases when the expressive power is still insufficient or higher-level abstractions are needed, interval operators or a subset such as the path expressions are applicable. With the emerging interactive media, it is expected that the advanced models will be used more frequently.

References

[Alle83] J. F. Allen. Maintaining Knowledge about Temporal Intervals. Comm. ACM, 26(11):832-843, 11 1983.
[Appl91] Apple Computer Inc. QuickTime Developer's Guide. Developer technical publications, 1991.
[BDF+92] Ingo Barth, Gabriel Dermler, Franz Fabian, Kurt Rothermel, Frank Sembach, Robert Erfle, and Johannes Rueckert. Multimedia Document Handling - A Survey of Concepts and Methods. Technical report, IPVR - IBM, 9 1992.
[BHL91] G. Blakowski, J. Huebel, and U. Langrehr. Tools for Specifying and Executing Synchronised Multimedia Presentations. 2nd Intl. Workshop on Network and Operating System Support for Digital Audio and Video, Heidelberg, 11 1991.
[Bruc72] B. C. Bruce. A Model for Temporal Reference and its Application in a Question Answering Program. Artificial Intelligence, 3:1-12, 1972.
[BuZe93] M. Cecilia Buchanan and Polle T. Zellweger. Automatic Temporal Layout Mechanisms. In 1st ACM Intl. Conference on Multimedia, pages 341-350, 8 1993.
[CaHa74] R. H. Campbell and A. N. Habermann. The Specification of Process Synchronisation by Path Expressions. In G. Goos and J. Hartmanis, editors, Lecture Notes in Computer Science No. 16, Operating Systems, pages 89-102. Springer Verlag, 1974.
[Drap93] George D. Drapeau. Synchronization in the MAEstro Multimedia Authoring Environment. In 1st Intl. ACM Conference on Multimedia, pages 331-340, 8 1993.
[Gibb91] Simon Gibbs. Composite Multimedia and Active Objects. Proc. OOPSLA'91, pages 97-112, 1991.
[Hoep91] P. Hoepner. Synchronisation der Praesentation von Multimediaobjekten - Modell und Beispiele. In J. Encarnacao, editor, Informatik-Fachberichte 293, Telekommunikation und multimediale Anwendungen der Informatik, pages 455-464. GI, Springer-Verlag, 10 1991.
[HyTi92] HyTime. Information technology - Hypermedia/Time-based Structuring Language (HyTime). ISO/IEC DIS 10744, 8 1992.
[KeLo91] Somnuk Keretho and Rasiah Loganantharaj. Qualitative and Quantitative Time Interval Constraint Networks. Proc. of ACM, San Antonio, 1991.
[KrCa92] Francis Kretz and Francoise Calaitis. Standardizing Hypermedia Information Objects. IEEE Communications Magazine, 5 1992.
[LiGh90] T. D. C. Little and A. Ghafoor. Synchronisation and Storage Models for Multimedia Objects. IEEE Journal on Selected Areas in Communications, 8(3):413-427, 3 1990.
[LiKo92] T. D. C. Little and F. Koa. An Intermedia Skew Control System for Multimedia Data Presentation. In 3rd Intl. Workshop on Network and Operating Systems Support for Digital Audio and Video, pages 121-132, 11 1992.
[MHEG92] ISO/IEC/WD MHEG. Information Technology - Coded Representation of Multimedia and Hypermedia Information Objects. Working Draft 5, ISO/IEC, 3 1992.
[Rich89] Michael M. Richter. Prinzipien der Kuenstlichen Intelligenz. B. G. Teubner Stuttgart, 1989.
[RoDe92] K. Rothermel and G. Dermler. Synchronization in Joint-Viewing Environments. In 3rd Intl. Workshop on Network and Operating Systems Support for Digital Audio and Video, 1992.
[Stei90] Ralf Steinmetz. Synchronisation Properties in Multimedia Systems. IEEE Journal on Selected Areas in Communications, 8(3):401-412, 4 1990.
[Stei92] Ralf Steinmetz. Multimedia Synchronisation Techniques: Experiences Based on Different System Structures. In Multimedia Communications, Monterey, USA CA, 4 1992. 4th IEEE COMSOC International Workshop.
[vBee92] Peter van Beek. Reasoning about qualitative temporal information. Artificial Intelligence, 8(956):297-326, 12 1992.
[ViKa86] M. Vilain and H. A. Kautz. Constraint propagation algorithms for temporal reasoning. In AAAI-86 Philadelphia, PA, pages 132-144, 1986.
[WaRo93] Thomas Wahl and Kurt Rothermel. Representing Time in Multimedia Systems. Technical Report 12, Universitaet Stuttgart, 10 1993.
