Diversity
Reference Speech Signals for SQuad
Measurements
Manual
The firmware of the instrument makes use of several valuable open source software packages. For information, see the "Open
Source Acknowledgement" on the user documentation CD-ROM (included in delivery).
Rohde & Schwarz would like to thank the open source community for their valuable contribution to embedded computing.
SwissQual AG
Allmendweg 8, 4528 Zuchwil, Switzerland
Phone: +41 32 686 65 65
Fax: +41 32 686 65 66
E-mail: info@swissqual.com
Internet: http://www.swissqual.com/
Printed in Germany. Subject to change. Data without tolerance limits is not binding.
R&S is a registered trademark of Rohde & Schwarz GmbH & Co. KG.
Trade names are trademarks of the owners.
SwissQual has made every effort to ensure that any instructions contained in the document are adequate and free of errors and
omissions. SwissQual will, if necessary, explain issues which may not be covered by the documents. SwissQual's liability for any
errors in the documents is limited to the correction of errors and the aforementioned advisory services.
Copyright 2000 - 2013 SwissQual AG. All rights reserved.
No part of this publication may be copied, distributed, transmitted, transcribed, stored in a retrieval system, or translated into any
human or computer language without the prior written permission of SwissQual AG.
Confidential materials.
All information in this document is regarded as commercially valuable, protected and privileged intellectual property, and is provided
under the terms of existing Non-Disclosure Agreements or as commercial-in-confidence material.
When you refer to a SwissQual technology or product, you must acknowledge the respective text or logo trademark somewhere in
your text.
SwissQual, Seven.Five, SQuad, QualiPoc, NetQual, VQuad, Diversity as well as the following logos are registered trademarks of SwissQual AG.
Diversity Explorer™, Diversity Ranger™, Diversity Unattended™, NiNA+™, NiNA™, NQAgent™, NQComm™, NQDI™, NQTM™,
NQView™, NQWeb™, QPControl™, QPView™, QualiPoc Freerider™, QualiPoc iQ™, QualiPoc Mobile™, QualiPoc Static™, QualiWatch-M™, QualiWatch-S™, SystemInspector™, TestManager™, VMon™, VQuad-HD™ are trademarks of SwissQual AG.
The following abbreviations are used throughout this manual: R&S___ is abbreviated as R&S ___.
SwissQual... Diversity
Contents
1 Pre-Filtering of Reference Speech Material
1.1
1.2
1.2.1 Disclaimer
2.1 Basics
2.1.1
2.1.2
2.1.3
3.1 Basics
3.2
3.3
4.1 Basics
4.2
4.3
5.1 Basics
5.2
5.3
6.1 Basics
6.2
Speech-Like Sequences
6.3
Avoid long reverberation times in the complete frequency range of the speaker's environment
Include male and female voices as well as human utterances that are typical for telephone conversations
1.2.1 Disclaimer
You can only use SwissQual speech material with SwissQual products such as Diversity and QualiPoc. Use of this material with non-SwissQual products as well as further
distribution or deployment is not permitted.
2.1 Basics
The measurement of listening quality is based on a comparison between a high-quality,
undegraded speech sample, which is used as the input signal, and the transmitted and
possibly distorted signal that is recorded at the output of the connection. A psychoacoustic
model is applied to both signals, after which all perceptible differences
are measured. The result of these measurements forms the overall listening quality
score. Because linear distortions (frequency response) also influence the score, the
selection of the input signal can also depend on the sending interface that is used.
Setting               Description
Length                6.0 s
Speech Activity       Approximately 70 %
Structure
Speaker
Sampling frequency
File Format
Level                 -26.0 dB OVL
Pre-Filtering
If you want to use your own speech material, SwissQual strongly recommends a minimum sample length of 5 seconds, of which at least 50 % contains speech activity.
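The recommendation above can be checked roughly before a custom sample is deployed. The following Python sketch is only an illustration: the 20 ms frame length, the -50 dBFS activity threshold, and the function names are assumptions made for this example and are not taken from SwissQual software. It treats the samples as 16-bit PCM values and counts a frame as speech when its RMS level exceeds the threshold.

```python
import math

MIN_LENGTH_S = 5.0      # minimum sample length recommended in the text
MIN_ACTIVITY = 0.50     # minimum share of the sample that contains speech

# Assumptions for this sketch, not values from the manual:
FRAME_MS = 20           # analysis frame length in milliseconds
ACTIVE_DBFS = -50.0     # a frame louder than this counts as speech


def speech_activity(samples, sample_rate):
    """Return (length_s, activity) for 16-bit PCM samples given as ints."""
    frame_len = int(sample_rate * FRAME_MS / 1000)
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    active = 0
    for frame in frames:
        rms = math.sqrt(sum(s * s for s in frame) / frame_len)
        # Level relative to 16-bit full scale; silence maps to -infinity.
        level = 20 * math.log10(rms / 32768.0) if rms > 0 else -math.inf
        if level > ACTIVE_DBFS:
            active += 1
    length_s = len(samples) / sample_rate
    activity = active / len(frames) if frames else 0.0
    return length_s, activity


def meets_recommendation(samples, sample_rate):
    """Check the 5 s / 50 % recommendation with the simple detector above."""
    length_s, activity = speech_activity(samples, sample_rate)
    return length_s >= MIN_LENGTH_S and activity >= MIN_ACTIVITY
```

A simple energy threshold of this kind overestimates activity for noisy recordings, so the result is only a first plausibility check.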
Reference Sample      Description
am_fm_IRS.wav
ar_fm_IRS.wav         Arabic, male+female
ch_fm_IRS.wav
cn_fm_IRS.wav
en2_fm_IRS.wav
fr_fm_IRS.wav         French, male+female (see the note after this table)
ge_fm_IRS.wav         German, male+female
gr_fm_IRS.wav         Greek, male+female
hu_fm_IRS.wav         Hungarian, male+female
it_fm_IRS.wav         Italian, male+female
jp_fm_IRS.wav         Japanese, male+female
pl_fm_IRS.wav         Polish, male+female
pt_fm_IRS.wav         Portuguese, male+female
ru_fm_IRS.wav         Russian, male+female
sp_fm_IRS.wav         Spanish, male+female
tk_fm_IRS.wav         Turkish, male+female

Note on fr_fm_IRS.wav: This sample systematically yields slightly lower MOS values
than the other language reference samples in comparable situations. This discrepancy might be the result of the generation and
recording process of the French source material.
On request, SwissQual can provide all narrowband speech material as unfiltered
(flat) source material. In addition to the 6 s samples, an 11 s sample in American English (AM_CallQual_IRS.wav) is also provided, which you can use for Call Quality
measurements.
Reference Sample      Description
am_fm_wide.wav
du_fm_wide.wav        Dutch, male+female
ch_fm_wide.wav
en_fm_wide.wav
ge_fm_wide.wav        German, male+female
it_fm_wide.wav        Italian, male+female
For special purposes, all speech material in wideband is also available with WB-IRS
(send) pre-filtering.
The use of wideband material with IRS pre-filtering can lead to a recognizable limitation in audio bandwidth. In speech wideband test cases, this limitation is scored as a
degradation.
Table 2-4: Description of the WB-IRS pre-filtered reference speech samples for wideband scenarios
Reference Sample      Description
am_fm_IRS_wide.wav
du_fm_IRS_wide.wav    Dutch, male+female
ch_fm_IRS_wide.wav
en_fm_IRS_wide.wav
ge_fm_IRS_wide.wav    German, male+female
it_fm_IRS_wide.wav    Italian, male+female
3.1 Basics
The main parameter of the SQuad Noise Suppression (NS) measurement assesses
the improvement or degradation of the noisy speech sample during transmission by
comparing the input of the noisy speech sample to the output sample. Other parameters assess the noise suppression and level deviations.
The SQuad-NS measurement requires the following reference signals:
Setting               Description
Length                8.0 s
Speech Activity       Approximately 50 %
Structure             Two sentences; initial pause > 2.0 s; pause between the sentences > 0.5 s
Speaker
Sampling frequency    16 kHz
File Format
Speech level          -23.0 dB OVL
Noise level
Pre-Filtering
In-car (stationary)
Street-noise (non-stationary)
Each of these noise types is mixed with each of the speech samples in four different level steps:
The filenames of the reference speech samples provide a description of the file content. For example, the Am_fm_IRS_16k_car_32.wav file is the American English
speech sample that has been mixed with car noise at 32 dB. The
Am_fm_IRS_16k_car_32_clean.wav file is the corresponding speech sample without the background noise.
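The naming scheme described above lends itself to automatic parsing. The Python sketch below is an illustration only: the regular expression is inferred from the example filenames in this section, and the function and key names are hypothetical. It extracts the language prefix, the noise type (car or str), and the level from a filename, and derives the name of the corresponding clean sample.

```python
import re

# Pattern inferred from the example filenames in the text, for instance
# Am_fm_IRS_16k_car_32.wav and Am_fm_IRS_16k_car_32_clean.wav.
_NAME = re.compile(
    r"^(?P<language>[A-Za-z]+)_fm_IRS_16k_"
    r"(?P<noise>car|str)_(?P<level>\d+)(?P<clean>_clean)?\.wav$"
)


def parse_ns_sample(filename):
    """Split a noise-suppression reference filename into its parts."""
    match = _NAME.match(filename)
    if match is None:
        raise ValueError(f"unexpected reference sample name: {filename}")
    return {
        "language": match.group("language"),   # for example "Am"
        "noise": match.group("noise"),         # "car" or "str"
        "level_db": int(match.group("level")), # level given in the name
        "clean": match.group("clean") is not None,
    }


def clean_counterpart(filename):
    """Return the name of the matching sample without background noise."""
    if parse_ns_sample(filename)["clean"]:
        return filename
    return filename[:-len(".wav")] + "_clean.wav"
```

For example, `clean_counterpart("Am_fm_IRS_16k_car_32.wav")` returns `"Am_fm_IRS_16k_car_32_clean.wav"`.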
For measurements in English, the following reference samples are recommended:
Am_fm_IRS_16k_car_26.wav / Am_fm_IRS_16k_car_26_clean.wav
Am_fm_IRS_16k_car_44.wav / Am_fm_IRS_16k_car_44_clean.wav
The same speech signals are also available mixed with street noise:
Am_fm_IRS_16k_str_26.wav / Am_fm_IRS_16k_str_26_clean.wav
Am_fm_IRS_16k_str_44.wav / Am_fm_IRS_16k_str_44_clean.wav
The following table lists the other languages for which SwissQual provides similar reference samples.
Table 3-2: Description of the prefix for a reference speech sample
Reference Sample      Language
En_*.wav              English
Ge_*.wav              German
Gr_*.wav              Greek
It_*.wav              Italian
Jp_*.wav              Japanese
Ru_*.wav              Russian
Sp_*.wav              Spanish
You can use custom reference material if the material fulfills the defined requirements.
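Two of the settings listed for the SQuad-NS reference signals, the 16 kHz sampling frequency and the 8.0 s length, can be verified directly from the WAV header. The following Python sketch uses the standard `wave` module. The function name and the length tolerance are assumptions for this example, since the manual states no tolerance, and the sketch does not cover the remaining requirements such as speech activity or level.

```python
import wave

# Values taken from the SQuad-NS settings in this section.
REQUIRED_RATE_HZ = 16000
REQUIRED_LENGTH_S = 8.0
LENGTH_TOLERANCE_S = 0.05   # assumption: the manual states no tolerance


def check_ns_wav(path):
    """Return a list of problems found in the WAV header (empty if none)."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        length_s = wav.getnframes() / rate
    problems = []
    if rate != REQUIRED_RATE_HZ:
        problems.append(
            f"sampling frequency is {rate} Hz, expected {REQUIRED_RATE_HZ} Hz")
    if abs(length_s - REQUIRED_LENGTH_S) > LENGTH_TOLERANCE_S:
        problems.append(
            f"length is {length_s:.2f} s, expected {REQUIRED_LENGTH_S} s")
    return problems
```

An empty list means only that the header matches these two settings.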
4.1 Basics
The SQuad-AEC (passive) algorithm searches for reflections (echoes) of a sent
speech signal. If a reflection is present, the algorithm calculates its delay with respect to the
sent signal and its echo return loss. Side-tones,
that is, reflections with a delay of less than 20 ms, are ignored by both of the algorithms, but are still reported.
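The quantities named above (echo delay, echo return loss, and the 20 ms side-tone limit) can be illustrated with a generic cross-correlation search. The Python sketch below is not the SQuad-AEC algorithm, whose internals this manual does not describe. It is a textbook approach under simplifying assumptions: a single reflection, no additive noise, and a hypothetical function name and search range.

```python
import math

SIDE_TONE_LIMIT_MS = 20.0   # from the text: shorter delays count as side-tone


def find_reflection(sent, received, sample_rate, max_delay_ms=50.0):
    """Return (delay_ms, erl_db, kind) for the strongest reflection, or None.

    `received` should cover len(sent) plus the longest delay of interest.
    """
    sent_energy = sum(s * s for s in sent)
    if sent_energy == 0:
        return None
    max_lag = int(sample_rate * max_delay_ms / 1000)
    best_lag, best_corr = 0, 0.0
    for lag in range(max_lag + 1):
        # Correlate the sent signal with the received signal shifted by lag.
        corr = sum(s * r for s, r in zip(sent, received[lag:]))
        if abs(corr) > abs(best_corr):
            best_lag, best_corr = lag, corr
    gain = abs(best_corr) / sent_energy    # least-squares echo amplitude
    if gain == 0:
        return None
    delay_ms = 1000.0 * best_lag / sample_rate
    erl_db = -20.0 * math.log10(gain)      # attenuation of the reflection
    kind = "side-tone" if delay_ms < SIDE_TONE_LIMIT_MS else "echo"
    return delay_ms, erl_db, kind
```

This sketch always returns the lag with the largest correlation, even when no reflection exists. A practical detector would also need a decision threshold before reporting an echo.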
Setting               Description
Length                > 12.0 s
Speech Activity       > 90 %
Structure             Continuous speech
Speaker
Sampling frequency    16 kHz
File Format
Speech level
Pre-Filtering
squad_aec.wav
5.1 Basics
The SQuad-AEC (active) algorithm searches for reflections (echoes) of a sent speech
signal that is generated actively by the far-end side and calculates the echo delay as
well as the echo return loss of the residual echo. Side-tones (reflections with a delay of
less than 20 ms) are ignored by both of the algorithms, but are still reported.
Setting               Description
Length                6.0 s
Speech Activity       > 90 %
Structure             Continuous speech
Speaker
Sampling frequency    16 kHz
File Format
Speech level          -26.0 dB OVL
Pre-Filtering
SwissQual also provides 6-second speech clips to generate double talk at the far-end
side. The speech activity of these clips is lower than that of the default speech clips and is
concentrated in speech bursts. Although you can use the default clips (length = 6 s), some echo mis-spotting
can occur.
Table 5-2: Description of the settings for a double-talk SQuad-AEC (Active) measurement
Setting               Description
Length                6.0 s
Speech Activity       10 % to 50 %
Structure             Isolated utterances
Speaker
Sampling frequency    8 kHz (!)
File Format
Speech level          -26.0 dB OVL
Pre-Filtering
SQuadAECact.wav
SwissQual strongly recommends that you use the default reference sample files as
they are optimally adjusted to avoid interactions between the files.
Table 5-3: Description of the double-talk reference speech samples
Reference Sample      Language
dt_10_8kHz.wav
dt_25_8kHz.wav
dt_50_8kHz.wav
6.1 Basics
The measurement of the round trip time is based on an in-band transmission of short
voice-like sequences. During the measurement, one sequence is sent repeatedly from
the A-side to B-side and after the signal is received, a different sequence is sent back
from the B-side to A-side. SwissQual strongly recommends that you use the default
reference speech samples RTTvoice_A.wav and RTTvoice_B.wav.
Setting               Description
Length                0.5 s to 0.6 s
Speech Activity       > 80 %
Sampling frequency    16 kHz
File Format
Level
Pre-Filtering
RTTvoice_A.wav
RTTvoice_B.wav