Manual - Reference Speech Signals

SwissQual...
Diversity
Reference Speech Signals for SQuad Measurements
Manual
Test & Measurement
Reference Speech Signals 01
The firmware of the instrument makes use of several valuable open source software packages. For information, see the "Open
Source Acknowledgement" on the user documentation CD-ROM (included in delivery).
Rohde & Schwarz would like to thank the open source community for their valuable contribution to embedded computing.
SwissQual AG
Allmendweg 8, 4528 Zuchwil, Switzerland
Phone: +41 32 686 65 65
Fax: +41 32 686 65 66
E-mail: info@swissqual.com
Internet: http://www.swissqual.com/
Printed in Germany. Subject to change. Data without tolerance limits is not binding.
R&S is a registered trademark of Rohde & Schwarz GmbH & Co. KG.
Trade names are trademarks of the owners.
SwissQual has made every effort to ensure that any instructions contained in the document are adequate and free of errors and omissions. SwissQual will, if necessary, explain issues which may not be covered by the documents. SwissQual's liability for any errors in the documents is limited to the correction of errors and the aforementioned advisory services.
Copyright 2000 - 2013 SwissQual AG. All rights reserved.
No part of this publication may be copied, distributed, transmitted, transcribed, stored in a retrieval system, or translated into any
human or computer language without the prior written permission of SwissQual AG.
Confidential materials.
All information in this document is regarded as commercially valuable, protected and privileged intellectual property, and is provided under the terms of existing Non-Disclosure Agreements or as commercial-in-confidence material.
When you refer to a SwissQual technology or product, you must acknowledge the respective text or logo trademark somewhere in
your text.
SwissQual, Seven.Five, SQuad, QualiPoc, NetQual, VQuad, Diversity as well as the following logos are registered trademarks of SwissQual AG.
Diversity Explorer™, Diversity Ranger™, Diversity Unattended™, NiNA+™, NiNA™, NQAgent™, NQComm™, NQDI™, NQTM™, NQView™, NQWeb™, QPControl™, QPView™, QualiPoc Freerider™, QualiPoc iQ™, QualiPoc Mobile™, QualiPoc Static™, QualiWatch-M™, QualiWatch-S™, SystemInspector™, TestManager™, VMon™, VQuad-HD™ are trademarks of SwissQual AG.
The following abbreviations are used throughout this manual: R&S®___ is abbreviated as R&S ___.
Contents
1 Pre-Filtering of Reference Speech Material
  1.1 Narrowband (Telephony) Applications
  1.2 Wideband (Telephony) Applications
    1.2.1 Disclaimer
2 SQuad-LQ Speech Quality Measurements
  2.1 Basics
    2.1.1 SQuad-LQ Speech Design of Samples
    2.1.2 SwissQual Speech Material Narrowband
    2.1.3 SwissQual Speech Material Wideband
3 SQuad-NS Noise Suppression Measurement
  3.1 Basics
  3.2 SQuad-NS Speech Material
  3.3 SwissQual Speech Material
4 SQuad-AEC (Passive) Passive Echo Disturbance Measurement
  4.1 Basics
  4.2 SQuad-AEC (Passive) Speech Material
  4.3 SwissQual Speech Material
5 SQuad-AEC (Active) Active Echo Disturbance Measurement
  5.1 Basics
  5.2 SQuad-AEC (Active) Speech Material
  5.3 SwissQual Speech Material
6 SQuad-RTT Round Trip Time Measurement
  6.1 Basics
  6.2 Speech-Like Sequences
  6.3 SwissQual Speech Material
1 Pre-Filtering of Reference Speech Material

The Quality of Service (QoS) measurements that you perform with SwissQual equipment are designed to reflect the quality that a subscriber experiences.
For best results, SwissQual recommends that you use human speech references for
most of your measurements and speech-like test signals for special measurements.
Only human speech samples allow for an in-band transmission of the reference signal
and ensure that the transmission component reacts correctly.
For best results, use the following guidelines when you create a speech sample:
- Record in a low noise environment with high quality equipment
- Avoid long reverberation times in the complete frequency range of the speaker's environment
- Include male and female voices as well as human utterances that are typical for telephone conversations
- Ensure that the text is well-balanced from a phonological point of view
SwissQual equipment is designed to be connected to the electrical interface of the sending side. Accordingly, the acoustical behaviour of a sending device has to be modelled by the measurement equipment in use.
1.1 Narrowband (Telephony) Applications

Conventionally shaped handsets tend to show a weak high-pass characteristic, or pre-emphasis, in the sending direction, which means that the terminal filters the real spoken voice at the microphone before the signal is transmitted. If the sending interface of the measurement equipment is the network termination point (two-wire analog or ISDN), the filtering that is normally done by the handset must be modelled by the measurement equipment. To create this model, the ITU-T recommends an IRS (Intermediate Reference System) characteristic within Recommendations P.48 and P.830.
The IRS (send) filter is defined for traditional narrowband applications (up to 3.4 kHz) and for traditional wideband applications (up to 7 kHz).
In narrowband scenarios (traditional telephony band), the usual behaviour of a handset is similar to the IRS (send). To realize a normative input signal, SwissQual recommends that you use the IRS pre-filtered signals as the input signal to the headset connector. Along with a built-in filter in a Diversity MCM (Mobile Connect Module), this connector can be considered as flat. When you use an IRS (send) pre-filtered input signal, exactly one IRS handset is emulated.
For this reason, SwissQual provides pre-filtered reference files. These files can be sent
directly from the electrical interface of the connection to emulate a microphone.
These pre-filtered speech signals should also be used as a high quality reference for the SQuad-LQ measurements in narrowband scenarios. In this way, the measurement takes into account the differences between the optimal speech signal in a telephone connection (IRS pre-filtered, but completely undistorted) and the transmitted signal. An MCM (Radio-Interface-Manager) provides an interface that is similar to a 4-wire network termination point and can also be used with IRS pre-filtered speech signals.
1.2 Wideband (Telephony) Applications

The ITU-T also defines an IRS (send) filter for traditional wideband scenarios. However, typical test cases for wideband telephony services tend to prefer unfiltered flat signals.
To serve wideband and super-wideband (up to 14 kHz) test cases, SwissQual provides all reference speech material for the speech-Wideband test type. This material has a sampling frequency of 32 kHz and an effective audio bandwidth of 50 Hz to 14 kHz. You can feed this signal directly into the electrical headset connector of a Diversity MCM.
For special wideband applications, the use of IRS (send) filtered speech material might be required. For such scenarios, SwissQual provides wideband IRS (send) signals.
1.2.1 Disclaimer
You can only use SwissQual speech material with SwissQual products such as Diversity and QualiPoc. Use of this material with non-SwissQual products as well as further
distribution or deployment is not permitted.
2 SQuad-LQ Speech Quality Measurements

The following sections describe the use of SQuad-LQ in speech quality measurements.
2.1 Basics
The measurement of listening quality is based on a comparison between two signals: a high-quality, undegraded speech sample, which is used as the input signal, and the transmitted and possibly distorted signal that is recorded at the output of the connection. A psychoacoustic model is applied to both signals, after which all perceptible differences are measured. The result of these measurements forms the overall listening quality score. Because linear distortions (frequency response) also influence the score, the selection of the input signal can also depend on the sending interface that is used.
2.1.1 SQuad-LQ Speech Design of Samples

The SQuad-LQ algorithm calculates the listening quality for any arbitrary speech signal. However, the characteristics of the speech material that you use can influence the end result.
To obtain representative and reproducible measurements, the speech sample should reflect typical human utterances in a telephone conversation.
For auditory tests that are in accordance with ITU-T P.800, short sentence pairs are used. Both sentences of a pair are spoken by the same speaker. The mean opinion score is the average of the scores that are given to a number of these sentence pairs.
For network measuring purposes, the transmission of separate files over a longer period is not an acceptable solution. For this reason, SwissQual recommends speech clips that contain at least two sentences from a male and a female native speaker. The sentences are selected to avoid QoS dependencies on the text.
Table 2-1: Description of the settings for an SQuad-LQ measurement

| Setting | Description |
|---|---|
| Length | 6.0 s |
| Speech Activity | Approximately 70 % |
| Structure | Two sentences, pause between sentences > 0.5 s |
| Speaker | Male and female native speakers |
| Sampling frequency | 16 kHz (for narrowband telephony); 32 kHz (for wideband telephony) |
| File Format | WAVE, 16 bit, INTEL |
| Level | -26.0 dB OVL |
| Pre-Filtering | ITU-T Rec. P.830, mod. IRS (send) |

If you want to use your own speech material, SwissQual strongly recommends a minimum sample length of 5 seconds of which at least 50% contains speech activity.
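As an illustration, the basic file properties from table 2-1 can be checked before a custom sample is used. The following Python sketch is not part of any SwissQual tool; the function name `check_reference_wav` and its parameters are hypothetical. It reads only the WAVE header with the standard `wave` module and checks the sample width, the sampling frequency, and the recommended minimum length of 5 seconds. Speech activity, level, and pre-filtering are not checked.

```python
import wave


def check_reference_wav(path, expected_rate=16000, min_length_s=5.0):
    """Return a list of problems found in the WAVE header (empty list = none found).

    expected_rate: 16000 for narrowband or 32000 for wideband material (table 2-1).
    min_length_s:  recommended minimum length for custom speech material.
    """
    problems = []
    # The wave module reads little-endian (Intel byte order) PCM WAVE files.
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width_bytes = wav.getsampwidth()
        frames = wav.getnframes()

    if width_bytes != 2:
        problems.append(f"sample width is {8 * width_bytes} bit, expected 16 bit")
    if rate != expected_rate:
        problems.append(f"sampling frequency is {rate} Hz, expected {expected_rate} Hz")

    length_s = frames / rate
    if length_s < min_length_s:
        problems.append(f"length is {length_s:.2f} s, expected at least {min_length_s} s")
    return problems
```

A call such as `check_reference_wav("my_sample.wav", expected_rate=32000)` would be the wideband variant of the same check; the file name is only a placeholder.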
2.1.2 SwissQual Speech Material Narrowband

For illustration purposes, SwissQual provides speech material in different languages in
accordance with ITU-T P.800 and ITU-T P.862.3 recommendations. For consistent
results, SwissQual recommends the IRS pre-filtered speech samples in table 2-2 for
narrowband telephony scenarios.
Table 2-2: Description of the IRS pre-filtered reference speech samples for narrowband scenarios

| Reference Sample | Description |
|---|---|
| am_fm_IRS.wav | American English, male+female |
| ar_fm_IRS.wav | Arabic, male+female |
| ch_fm_IRS.wav | German, Swiss pronunciation, male+female |
| cn_fm_IRS.wav | Chinese Mandarin, male+female |
| en2_fm_IRS.wav | British English, male+female. The en2_fm_IRS.wav file replaces the en_fm_IRS.wav file, which is also included for existing deployments. For new deployments, use the en2_fm_IRS.wav reference sample. SwissQual would like to thank Psytechnics Ltd, UK, for their kind permission to use their British English source material to generate the new speech sample. |
| fr_fm_IRS.wav | French, male+female. This sample systematically yields slightly lower MOS values than the other language reference samples in comparable situations. This discrepancy might be the result of the generation and recording process of the French source material. |
| ge_fm_IRS.wav | German, male+female |
| gr_fm_IRS.wav | Greek, male+female |
| hu_fm_IRS.wav | Hungarian, male+female |
| it_fm_IRS.wav | Italian, male+female |
| jp_fm_IRS.wav | Japanese, male+female |
| pl_fm_IRS.wav | Polish, male+female |
| pt_fm_IRS.wav | Portuguese, male+female |
| ru_fm_IRS.wav | Russian, male+female |
| sp_fm_IRS.wav | Spanish, male+female |
| tk_fm_IRS.wav | Turkish, male+female |

On request, SwissQual can provide all speech material for narrowband as unfiltered (flat) source material. In addition to the 6 s samples, an 11 s sample in American English (AM_CallQual_IRS.wav) is also provided, which you can use for Call Quality measurements.
2.1.3 SwissQual Speech Material Wideband

For illustration purposes, SwissQual provides speech material in different languages in accordance with ITU-T P.800 and ITU-T P.862.3 recommendations. SwissQual recommends the speech samples without IRS pre-filtering in table 2-3 for wideband telephony scenarios.
Table 2-3: Description of the non-IRS pre-filtered reference speech samples for wideband scenarios

| Reference Sample | Description |
|---|---|
| am_fm_wide.wav | American English, male+female |
| du_fm_wide.wav | Dutch, male+female |
| ch_fm_wide.wav | German, Swiss pronunciation, male+female |
| en_fm_wide.wav | British English, male+female. SwissQual would like to thank Psytechnics Ltd, UK, for their kind permission to use their British English source material to generate the new speech sample. |
| ge_fm_wide.wav | German, male+female |
| it_fm_wide.wav | Italian, male+female |

For special purposes, all speech material in wideband is also available with WB-IRS
(send) pre-filtering.
The use of wideband material with IRS pre-filtering can lead to a recognizable limitation in audio bandwidth. In speech wideband test cases, this limitation is scored as a
degradation.
Table 2-4: Description of the WB-IRS pre-filtered reference speech samples for wideband scenarios

| Reference Sample | Description |
|---|---|
| am_fm_IRS_wide.wav | American English, male+female |
| du_fm_IRS_wide.wav | Dutch, male+female |
| ch_fm_IRS_wide.wav | German, Swiss pronunciation, male+female |
| en_fm_IRS_wide.wav | British English, male+female. SwissQual would like to thank Psytechnics Ltd, UK, for their kind permission to use their British English source material to generate the new speech sample. |
| ge_fm_IRS_wide.wav | German, male+female |
| it_fm_IRS_wide.wav | Italian, male+female |

3 SQuad-NS Noise Suppression Measurement

The following sections describe the use of SQuad-NS in speech quality measurements.
3.1 Basics
The main parameter of the SQuad Noise Suppression (NS) measurement assesses
the improvement or degradation of the noisy speech sample during transmission by
comparing the input of the noisy speech sample to the output sample. Other parameters assess the noise suppression and level deviations.
The SQuad-NS measurement requires the following reference signals:
- Noise-free reference speech signal
- The same noise-free signal mixed with an additive noise signal
3.2 SQuad-NS Speech Material

You can use an arbitrary speech signal in combination with a noise signal to run SQuad-NS. However, a proper SQuad-NS measurement requires both the noise-free (clean) reference sample and the sample with the background noise (noisy sample). Furthermore, the speech signal must conform to the signals that are described for SQuad-LQ. The initial pause of the signal must be a minimum of 2.0 s. To account for the so-called Lombard effect, the speech level must be 3 dB above the recommended value for SQuad-LQ.
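The level adjustment can be sketched as follows: the SQuad-LQ level of -26.0 dB OVL plus 3 dB gives a target of -23.0 dB OVL. This Python example is only an approximation. It uses a plain RMS over all samples instead of an active speech level measurement, and it assumes that 0 dB OVL corresponds to an RMS value equal to the 16-bit full scale (32768). Both choices, as well as the function names, are assumptions for illustration and are not taken from SwissQual documentation.

```python
import math

FULL_SCALE = 32768.0  # assumption: 0 dB OVL = RMS equal to 16-bit full scale


def rms_level_db_ovl(samples):
    """Plain RMS level of 16-bit samples in dB relative to FULL_SCALE."""
    if not samples:
        raise ValueError("no samples")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10.0 * math.log10(mean_square / FULL_SCALE ** 2)


def gain_to_target_db(samples, target_db_ovl=-23.0):
    """Gain in dB that moves the measured RMS level to the target level."""
    return target_db_ovl - rms_level_db_ovl(samples)


def apply_gain(samples, gain_db):
    """Scale the samples by gain_db and clip the result to the 16-bit range."""
    factor = 10.0 ** (gain_db / 20.0)
    return [max(-32768, min(32767, round(s * factor))) for s in samples]
```

For a real speech file, the pauses lower a plain RMS value, so a level that is measured this way is expected to differ from a level that is measured only over the active speech parts.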
Table 3-1: Description of the settings for an SQuad-NS measurement

| Setting | Description |
|---|---|
| Length | 8.0 s |
| Speech Activity | Approximately 50 % |
| Structure | Two sentences; initial pause > 2.0 s; pause between the sentences > 0.5 s |
| Speaker | Male and/or female native speakers |
| Sampling frequency | 16 kHz |
| File Format | WAVE, 16 bit, INTEL |
| Speech level | -23.0 dB OVL |
| Noise level | -23.0 to -50.0 dB OVL (recommended) |
| Pre-Filtering | |

3.3 SwissQual Speech Material

SwissQual provides IRS pre-filtered speech material in different languages. This material is based on the reference samples that are recommended for SQuad-LQ.
The reference speech samples are available with the following types of background noise:
- In-car (stationary)
- Street noise (non-stationary)
Each of these noise types is mixed with each of the speech samples in four different level steps:
- Noise level of -26 dB OVL (SNR = 3 dB)
The filenames of the reference speech samples provide a description of the file content. For example, the Am_fm_IRS_16k_car_32.wav file is the American English
speech sample that has been mixed with car noise at 32 dB. The
Am_fm_IRS_16k_car_32_clean.wav file is the corresponding speech sample without the background noise.
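The naming scheme can be handled programmatically, for example to pair each noisy file with its clean companion. The following Python sketch is hypothetical: the regular expression is inferred only from the file names that appear in this manual, and the function names are made up for illustration. It also assumes that the number in the file name is the magnitude of the noise level in dB OVL, which is consistent with the -26 dB OVL step (SNR = 3 dB at a speech level of -23.0 dB OVL) but is not stated explicitly.

```python
import re

# Pattern inferred from names such as Am_fm_IRS_16k_car_32.wav (assumption).
_NS_NAME = re.compile(
    r"^(?P<language>[A-Za-z]+)_fm_IRS_16k_(?P<noise>car|str)_(?P<level>\d+)"
    r"(?P<clean>_clean)?\.wav$"
)

SPEECH_LEVEL_DB_OVL = -23.0  # SQuad-NS speech level from table 3-1


def parse_ns_filename(name):
    """Split an SQuad-NS reference file name into its components."""
    match = _NS_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name}")
    # Assumption: the number is the noise level magnitude, for example 26 -> -26 dB OVL.
    noise_level = -float(match.group("level"))
    return {
        "language": match.group("language"),
        "noise": match.group("noise"),  # "car" or "str" (street)
        "noise_level_db_ovl": noise_level,
        "snr_db": SPEECH_LEVEL_DB_OVL - noise_level,
        "is_clean": match.group("clean") is not None,
    }


def clean_companion(name):
    """Return the name of the noise-free file that belongs to a noisy file."""
    parse_ns_filename(name)  # raises ValueError for unexpected names
    if name.endswith("_clean.wav"):
        return name
    return name[: -len(".wav")] + "_clean.wav"
```

For a clean file, the noise type and level in the result describe the noisy file that it is paired with, not the content of the clean file itself.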
For measurements in English, the following reference samples are recommended:
- Am_fm_IRS_16k_car_26.wav / Am_fm_IRS_16k_car_26_clean.wav
- Am_fm_IRS_16k_car_44.wav / Am_fm_IRS_16k_car_44_clean.wav
The same speech signals are also available mixed with street noise:
- Am_fm_IRS_16k_str_26.wav / Am_fm_IRS_16k_str_26_clean.wav
- Am_fm_IRS_16k_str_44.wav / Am_fm_IRS_16k_str_44_clean.wav
The following table lists the other languages for which SwissQual provides similar reference samples.
Table 3-2: Description of the prefix for a reference speech sample

| Reference Sample | Language |
|---|---|
| En_*.wav | English |
| Ge_*.wav | German |
| Gr_*.wav | Greek |
| It_*.wav | Italian |
| Jp_*.wav | Japanese |
| Ru_*.wav | Russian |
| Sp_*.wav | Spanish |

For best results, use the SwissQual reference sample files.
You can use custom reference material if the material fulfills the defined requirements.
4 SQuad-AEC (Passive) Passive Echo Disturbance Measurement

The following sections describe the use of SQuad-AEC (Passive) in speech quality
measurements.
4.1 Basics
The SQuad-AEC (passive) algorithm searches for reflections (echoes) of a sent speech signal. If a reflection is present, the algorithm calculates the delay of the reflection with respect to the sent signal and the echo return loss of the reflected signal. Side-tones, that is, reflections with a delay of less than 20 ms, are ignored by both of the algorithms, but are still indicated.
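To make the terms delay, echo return loss, and side-tone concrete, the following Python sketch estimates them with a textbook cross-correlation search. It is not the SQuad-AEC algorithm, whose internals are not described here. The function name, the `max_lag` parameter, and the least-squares gain estimate are assumptions for illustration; only the 20 ms side-tone limit is taken from the text.

```python
import math

SIDE_TONE_LIMIT_MS = 20.0  # from the text: shorter delays count as side-tone


def estimate_echo(sent, received, sample_rate, max_lag):
    """Return (delay_ms, erl_db, is_side_tone) for the strongest reflection.

    Returns None if the received signal shows no correlation with the sent one.
    The search covers lags from 0 to max_lag samples.
    """
    best = None  # (absolute correlation, lag in samples, estimated echo gain)
    for lag in range(max_lag + 1):
        n = min(len(sent), len(received) - lag)
        if n <= 0:
            break
        corr = sum(sent[i] * received[i + lag] for i in range(n))
        energy = sum(sent[i] * sent[i] for i in range(n))
        if energy > 0 and (best is None or abs(corr) > best[0]):
            # Least-squares gain of the sent signal inside the received signal.
            best = (abs(corr), lag, corr / energy)

    if best is None or best[0] == 0:
        return None
    _, lag, gain = best
    delay_ms = 1000.0 * lag / sample_rate
    erl_db = -20.0 * math.log10(abs(gain))  # echo return loss in dB
    return delay_ms, erl_db, delay_ms < SIDE_TONE_LIMIT_MS
```

The brute-force search is quadratic in the signal length, so it is suitable only for short excerpts. A practical implementation would typically use an FFT-based correlation.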
4.2 SQuad-AEC (Passive) Speech Material

You can use an arbitrary speech signal to measure the passive echo disturbance with
the SQuad-AEC (passive) algorithm.
Table 4-1: Description of the settings for an SQuad-AEC (Passive) measurement

| Setting | Description |
|---|---|
| Length | > 12.0 s |
| Speech Activity | > 90 % |
| Structure | Continuous speech |
| Speaker | |
| Sampling frequency | 16 kHz |
| File Format | WAVE, 16 bit, INTEL |
| Speech level | -23.0 to -29.0 dB OVL |
| Pre-Filtering | |

4.3 SwissQual Speech Material
squad_aec.wav
5 SQuad-AEC (Active) Active Echo Disturbance Measurement

The following sections describe the use of SQuad-AEC (Active) in speech quality
measurements.
5.1 Basics
The SQuad-AEC (active) algorithm searches for reflections (echoes) of a sent speech signal that is generated actively by the far-end side and calculates the echo delay as well as the echo return loss of the residual echo. Side-tones (reflections with a delay of less than 20 ms) are ignored by both of the algorithms, but are still indicated.
5.2 SQuad-AEC (Active) Speech Material

Due to the complexity of the measurement, the file length for the active measurement is shorter than for the passive measurement. You can use an arbitrary speech signal with a length of exactly 6 s to measure the echo disturbance with SQuad-AEC (active).
Table 5-1: Description of the settings for an SQuad-AEC (Active) measurement

| Setting | Description |
|---|---|
| Length | 6.0 s |
| Speech Activity | > 90 % |
| Structure | Continuous speech |
| Speaker | |
| Sampling frequency | 16 kHz |
| File Format | WAVE, 16 bit, INTEL |
| Speech level | -26.0 dB OVL |
| Pre-Filtering | |

SwissQual also provides 6-second speech clips to generate double talk at the far-end side. The speech activity of these clips is lower than that of the default speech clips and is concentrated in speech bursts. Although you can use the default clips (length = 6 s), some echo mis-detection can occur.
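Speech activity, which appears as a setting in all of the tables in this manual, can be estimated roughly from frame energies. The following Python sketch is a simplified illustration and not the method that SwissQual uses: the 20 ms frame length, the fixed -50 dB threshold relative to the 16-bit full scale, and the function name are all assumptions.

```python
def speech_activity_percent(samples, sample_rate, frame_ms=20, threshold_db=-50.0):
    """Percentage of complete frames whose RMS level exceeds threshold_db.

    samples:      16-bit sample values as integers
    threshold_db: level in dB relative to the 16-bit full scale (assumed value)
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    if frame_len <= 0 or len(samples) < frame_len:
        raise ValueError("signal is shorter than one frame")

    # Convert the dB threshold to a mean-square value for a direct comparison.
    threshold_mean_square = (32768.0 ** 2) * 10.0 ** (threshold_db / 10.0)

    active = 0
    total = 0
    # Only complete frames are evaluated; a trailing partial frame is ignored.
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        mean_square = sum(s * s for s in frame) / frame_len
        total += 1
        if mean_square > threshold_mean_square:
            active += 1
    return 100.0 * active / total
```

A fixed threshold works only for recordings with a very low noise floor. For noisy recordings, a threshold relative to the measured speech level would be the more robust choice.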
Table 5-2: Description of the settings for a double-talk SQuad-AEC (Active) measurement

| Setting | Description |
|---|---|
| Length | 6.0 s |
| Speech Activity | 10 to 50 % |
| Structure | Isolated utterances |
| Speaker | Male or female native speakers |
| Sampling frequency | 8 kHz (!) |
| File Format | WAVE, 16 bit, INTEL |
| Speech level | -26.0 dB OVL |
| Pre-Filtering | |

5.3 SwissQual Speech Material
SQuadAECact.wav
SwissQual strongly recommends that you use the default reference sample files as
they are optimally adjusted to avoid interactions between the files.
Table 5-3: Description of the double-talk reference speech samples

| Reference Sample | Language |
|---|---|
| dt_10_8kHz.wav | (10% speech activity, female Croatian) |
| dt_25_8kHz.wav | |
| dt_50_8kHz.wav | |

6 SQuad-RTT Round Trip Time Measurement

The following sections describe the use of SQuad-RTT in speech quality measurements.
6.1 Basics
The measurement of the round trip time is based on an in-band transmission of short
voice-like sequences. During the measurement, one sequence is sent repeatedly from
the A-side to B-side and after the signal is received, a different sequence is sent back
from the B-side to A-side. SwissQual strongly recommends that you use the default
reference speech samples RTTvoice_A.wav and RTTvoice_B.wav.
6.2 Speech-Like Sequences

For the in-band RTT measurement, two different sequences are necessary, where
each sequence must fulfil the technical characteristics in table 6-1.
Table 6-1: Description of the characteristics for a SQuad-RTT measurement

| Setting | Description |
|---|---|
| Length | 0.5 s to 0.6 s |
| Speech Activity | > 80 % |
| Sampling frequency | 16 kHz |
| File Format | WAVE, 16 bit, INTEL |
| Level | -27.0 to -23.0 dB OVL |
| Pre-Filtering | |

6.3 SwissQual Speech Material
RTTvoice_A.wav
RTTvoice_B.wav