OpenSpeech Dialog 1.4

Developers Guide

Notice
OpenSpeech Dialog 1.4 Developers Guide
Copyright 2001-2007 Nuance Communications, Inc. All rights reserved.
Published by Nuance Communications, Inc., One Wayside Road, Burlington, Massachusetts, 01803, U.S.A.
Last updated July 16, 2007.

Nuance Communications, Inc. provides this document without representation or warranty of any kind. The information in this document is subject to change without notice and does not represent a commitment by Nuance Communications, Inc. The software and/or databases described in this document are furnished under a license agreement and may be used or copied only in accordance with the terms of such license agreement. Without limiting the rights under copyright reserved below, and except as permitted by such license agreement, no part of this document may be reproduced or transmitted in any form or by any means, including, without limitation, electronic, mechanical, photocopying, recording, or otherwise, or transferred to information storage and retrieval systems, without the prior written permission of Nuance Communications, Inc.

Nuance, the Nuance logo, DialogModules, and RealSpeak are trademarks or registered trademarks of Nuance Communications, Inc. or its affiliates in the United States and/or other countries. All other trademarks referenced herein are the property of their respective owners.

Contents
Preface
  Introduction
  Restrictions and limitations
  Audiences and objectives
  Abbreviations
  Available documentation
    Related standards and third-party documents

Chapter 1. Introduction
  Overview of OSD applications
    Dialog configurations with xHMI
    VoiceXML runtime platforms with the OSD framework
    Application Java code
  Sample applications
  Running the pizza application

Chapter 2. OSDM node configuration (<osdm>)
  <OSDM>

Chapter 3. Dynamic property framework (DPF)
  Using DPF
    Performance guidelines
    Error handling
  Setting up the DPF tree
  Storing HTTP parameters in the DPF tree
  DPF facades
    Generic facade
    OSDM-like facade

Chapter 4. Creating an OSD component
  Introduction
    Example PIN component
    Details for the example component
  Implementing OSD components
    Create the directory structure
    Configure the DPF tree
      Defining the DPF (dpfInit.xml)
      Enabling DPF (appconfig.xhmi)
    Create xHMI file for the component (pin.xhmi)
      The collection node example
      The confirmation node example
      The error <catch> example
    Write a wrapper for the component (appconfig.xhmi)
  Calling OSD components from VoiceXML applications
  Calling OSD components from xHMI applications
    Create the application structure
    Create the application configuration (appconfig.xhmi)
      Example parameter list
      Example reference to a component
      Example call to a component
      Example result handling
      Example error handling

Chapter 5. TransitionNode configuration

Chapter 6. Server-side event handling
  Handling events on application servers
    Overview
    Performance considerations
    Enabling server-side event handling
    Counting events as they occur
    How to catch events on the application server
    Examples of event handling on the application server
      Restarting a node and varying the output
      Using conditions to vary the output
      Setting the maximum retries of a node
      Running scripts inside event handlers
  Complete event-handling example

Chapter 7. OSD logging
  About OSD logging
    Diagnostic logging
    Page logging
  Application logging
    Log message formats
    Scoping of log messages
    Nesting log events
    Log events, parameters, and values
      Generic log events
      Transaction events
      Node transitions
      Catch handlers
      Final transitions
      Database interactions
      Caller segmentation
      Transfers
  Turning application logging on and off

Chapter 8. OSD administration
  Deployment to a web server
    Providing an XML parser
    Starting a session
  Operation administration & management (OA&M)
    Using JMX in Tomcat
    Configuring the JMX connector
      Set JMX parameters in web.xml
      Load the configurator class
    Managing configuration
    Balancing system loads
    Controlling shutdown and update operations
      The IOAM interface
    OA&M event notifications
      Default error messages
      Extending error messages
      Example
    Localizing OA&M notifications
    Routing calls to the application
      Background concepts
      Registering OSD applications for routing
      Configuring the routing servlet
      Setting the persistent storage directory
      Setting the initial routing table
      Enabling the routing servlet
      Configuring the web application server
      Static routing
      Dynamic routing
      Using the routing servlet

Chapter 9. Application development topics
  Using skip lists to avoid recognizing specific words
    Key facts about skip lists
      When skip list processing occurs
      Controlling where skip list processing occurs
      OSD automatically adds homophones to skip lists
      Sample skip list grammar
  Dynamic prompts
    Writing prompt text to an attribute
    Adding output to the StepResponse
  Working with dates and times programmatically
    Setting dates and times
    Getting dates
    Details on date and time formats
      Exceptions for invalid timestamps
  Creating grammars dynamically (at runtime)
    Comparison of dynamic and static grammars
    Overview of OSD support
    Example of a dynamic grammar
  Reading the <config> content of a node
    Add the elements by extending the DTD
    Use the new element in your xHMI configuration
    Write classes to access the custom node
  Extending the application object
  Rendering
    Extending the rendering system
      Create a custom node
      Configure the custom node in xHMI
      Change the <vuiforward> map in xHMI
      Create a jsp page
      Using custom Render Data objects

Chapter 10. The OSD Datamodel
  Overview of variables and data storage
    Datamodel error events
    Accessing variables with xHMI
    Accessing variables with ECMAScript
      Passing variables to subdialogs
    Writing your own bean
    Accessing variables with Java code
      Access from a custom node
      Access from an update rule
      Access from an attribute facade
      Access from a custom bean
  ISessionFrame methods
    Differences between a.best() and s.best(a)
  AttributeBean methods
  Writing a factory
    When do you need a factory?
    Steps for writing a factory
    Implementing IDataElementFactory
      The factory lifecycle
    Configuring factories
    Using the OSD datamodel interface
    Implementing IDataModelAccess
    Example user-defined JNDI factory

Chapter 11. FAQ
  Evaluating variables and making logic decisions
  Accessing variables in ECMAScript expressions
  Declaring custom classes in xHMI
  Creating attributes via Java
  Configuring OSDM parameters dynamically
  Timing of updates in the SessionFrame
  Changing the provided rendering jsp
  Recognizing long utterances with robust parsing grammars

Chapter 12. Getting started with development
  Speech application development lifecycle
  Application design
    Design the callflow
    Design the prompts and speech grammars
  Application development
    Create a directory structure
    Configure the application (create xHMI files)
      Validating with a DTD
      Validating with a W3C schema
      Implementing custom nodes
    General activities
    Test the application callflow
  Application deployment and tuning

Chapter 13. Testing

Appendix A. Predefined properties
  Miscellaneous properties
    Ambiguous recognition results
    Skip list processing
  Properties for OpenSpeech Insight logging (OSI)

Appendix B. Command line tools
  Summary of command line tools
    Prerequisites
  Recording list tool (listing prompts for the recording studio)
  Grammar List tool (lists grammars in an xHMI file)
  Validate tool (validating xHMI configuration files)

Appendix C. Timestamp abbreviations

Appendix D. Negative confirmations

Nuance Proprietary

Preface

Introduction
OpenSpeech Dialog (OSD) is an open environment that helps platform vendors and application developers shorten development time for speech applications, lower the total cost of application deployment, and provide a higher level of service to customers. OSD interprets the xHMI configuration language, an open XML specification language for designing and specifying dialog applications at a high level. (The name xHMI is an abbreviation for Extensible Human Machine Interface.)

Together, OSD and the xHMI language address the complexity of implementing user interfaces that personalize telephone calls for each customer and that offer natural-language dialogs with broad vocabularies. Without OSD and xHMI, building these advanced applications with VoiceXML or JSP pages requires considerable design and programming skill, and the resulting cost is often too high to justify development. OSD and xHMI change this cost structure by speeding development, improving reliability, automating operational reports, and reducing the amount of tuning needed for initial phases of deployment.


In addition, environments that integrate OSD gain access to Nuance products that also use OSD. These products include:

SpeechPAKs (vertical application suites for healthcare, finance, utilities, etc.)
DirectoryAssistant (telephone directory solutions)
SpeechAttendant (automated telephone attendant solutions)
X|Mode (multi-modal applications)
Custom applications built by the Nuance Professional Services organization

Restrictions and limitations


OSD 1.4 is compatible with xHMI 1.4.

OSD compatibility with OSR

For its natural language capabilities, OSD requires version 3.0.4 or higher of the OpenSpeech Recognizer (OSR). This includes the following OSD capabilities:

Robust parsing grammars (first available in OSR 3.0.3, but only with sentence-based confidence scores)

Slot-based confidence scores. By default, OSR provides confidence scores for entire sentences and not for individual grammar (attribute) slots. To enable slot-based scores, you must configure OSR to add the special grammar key SWI_attributes to the recognition result. You can do this with the swirec_extra_nbest_keys parameter in an OSR user configuration file. For example, in a configuration file the parameter might appear as follows:

<!-- Add a Nuance grammar key to the XML result. -->
<param name="swirec_extra_nbest_keys">
  <value>SWI_meaning</value>
  <value>SWI_literal</value>
  <value>SWI_grammarName</value>
  <value>SWI_utteranceSNR</value>
  <value>SWI_attributes</value>
</param>
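Before relying on slot-based scores, it can be useful to confirm that a configuration fragment really lists SWI_attributes. The following is a minimal sketch that uses only the standard JAXP DOM API. The class name NbestKeyCheck, its method names, and the <config> wrapper element used as a document root are assumptions made for illustration; they are not part of OSR or OSD, and a real OSR user configuration file may use a different root element.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper (not an OSD or OSR class): inspects a configuration
// fragment for the keys listed under a named <param> element.
public class NbestKeyCheck {

    // Returns the <value> entries of every <param> whose name attribute
    // matches paramName; returns an empty list if no such <param> exists.
    public static List<String> paramValues(String xml, String paramName) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        List<String> values = new ArrayList<>();
        NodeList params = doc.getElementsByTagName("param");
        for (int i = 0; i < params.getLength(); i++) {
            Element param = (Element) params.item(i);
            if (!paramName.equals(param.getAttribute("name"))) {
                continue;
            }
            NodeList entries = param.getElementsByTagName("value");
            for (int j = 0; j < entries.getLength(); j++) {
                values.add(entries.item(j).getTextContent().trim());
            }
        }
        return values;
    }

    // True if SWI_attributes is among the extra n-best keys.
    public static boolean slotScoresEnabled(String xml) throws Exception {
        return paramValues(xml, "swirec_extra_nbest_keys").contains("SWI_attributes");
    }

    public static void main(String[] args) throws Exception {
        // The <config> root is an assumption; only the <param> block
        // comes from the example above.
        String xml =
            "<config>"
            + "<param name=\"swirec_extra_nbest_keys\">"
            + "<value>SWI_meaning</value>"
            + "<value>SWI_literal</value>"
            + "<value>SWI_attributes</value>"
            + "</param>"
            + "</config>";
        System.out.println("Slot-based scores requested: " + slotScoresEnabled(xml));
    }
}
```

The check only confirms that the key is listed in the fragment. Whether OSR then returns slot-based scores still depends on the recognizer version and on the configuration file actually being loaded.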

Audiences and objectives


Application developers

Application developers are responsible for building speech applications that meet their customers' business needs. With xHMI configurations and the OSD runtime environment, application developers can provide truly conversational speech interfaces. The xHMI configuration defines the conversational callflow of applications, and the OSD runtime provides interfaces, classes, and methods for developing xHMI applications. To develop OSD applications, you also need the xHMI Reference Guide.

Abbreviations
Abbreviation   Description
ASP            Application Services Platform
ASR            Automatic Speech Recognition
CCXML          Call Control eXtensible Markup Language
DPF            Dynamic Property Framework working draft (from the W3C)
DTD            Document Type Definition
FIA            Form Interpretation Algorithm
HTTP           Hypertext Transfer Protocol
J2EE           Java 2 Platform, Enterprise Edition
JSP            Java Server Pages
MVC            Model-View-Controller
NL             Natural Language
OSD            OpenSpeech Dialog
RDO            Render Data Object
SALT           Speech Application Language Tags
SRGS           Speech Recognition Grammar Specification (from the W3C)
SSFT           ScanSoft (the former name of Nuance Communications, Inc.)
SSML           Speech Synthesis Markup Language (from the W3C)
TTS            Text to Speech
VCL            Verification Candidate List
VoiceXML       Voice Extensible Markup Language
W3C            World Wide Web Consortium
xHMI           eXtensible Human-Machine Interface
XML            eXtensible Markup Language

Available documentation
The documentation set for the OSD product includes the following:

OSD Migration Guide: Topics for applications and platforms built on previous releases of OSD and xHMI.
xHMI Reference Guide: Overview, architecture, and the xHMI language.
OSD Integration Guide: How to add OSD to an existing application development platform.
OSD Developers Guide: How to use OSD to build applications.

Related standards and third-party documents


DPF                        http://www.w3.org/TR/DPF/
ECMAScript                 http://www.ecma-international.org/publications/files/ecma-st/ECMA-262.pdf
HTTP                       http://www.ietf.org/rfc/rfc2616.txt
J2EE                       http://java.sun.com/j2ee/
Jakarta Struts             http://struts.apache.org/
Language tags (RFC 3066)   http://www.ietf.org/rfc/rfc3066.txt
RFC 2806 (Telephone URLs)  http://www.ietf.org/rfc/rfc2806.txt
SALT                       http://www.saltforum.org
SRGS                       http://www.w3.org/TR/speech-grammar/
SSML                       http://www.w3.org/TR/speech-synthesis/
Velocity template engine   http://jakarta.apache.org/velocity/
VoiceXML 2.0               http://www.w3.org/TR/voicexml20/
W3C Schema                 http://www.w3.org/TR/xmlschema-0/
                           http://www.w3.org/TR/xmlschema-1/
                           http://www.w3.org/TR/xmlschema-2/
XML 1.0                    http://www.w3.org/TR/2004/REC-xml-20040204/
XML Namespaces             http://www.w3.org/TR/REC-xml-names/

Chapter 1

Introduction

This guide helps Java application developers to build speech applications using the OSD runtime framework with xHMI configurations.

Overview of OSD applications


An OSD (OpenSpeech Dialog) application is a set of dialogs configured in the xHMI language and run as a web application on a platform that has integrated the OSD framework. Each dialog has one or more dialog nodes that are implemented in Java. Only one dialog is active at a time.

Dialog configurations with xHMI


xHMI (the eXtensible Human-Machine Interface) is an XML markup language that configures the application callflow. The callflow consists of dialogs and their nodes; the callflow progresses as one node transitions to the next. The configuration also specifies prompts, grammars, and slots for recognition results. Application developers can write the xHMI configuration manually, generate it automatically with a tool (such as Nuance V-Builder), or combine the two approaches (for example, by generating a skeleton application automatically and then coding the details manually).

VoiceXML runtime platforms with the OSD framework


OSD is an application framework that simplifies the building of Java web applications (speech and multi-modal applications). These applications read xHMI configurations that define dialog behavior and the transitions between dialogs (for example, the prompts, confirmations, speech grammars, and so on).

Most OSD installations are integrated into a runtime platform that is partnered with Nuance. You can also use OSD as provided directly by Nuance.

OSD is a flexible and extensible framework providing several integration APIs for customizing your Java applications. OSD provides interfaces, classes, and methods for controlling dialogs and interacting with the session states configured via xHMI.

OSD renders VoiceXML pages to your browser platform. The pages define everything needed for playing output to users and communicating with speech recognizers and text-to-speech engines.

OSD writes detailed log messages for application analysis and tuning, and it provides an interface for systems operations.

OSD does not provide telephony, operational control, speech recognition, text-to-speech, or the logging of log messages. We assume these services are controlled by your browser platform.

Application Java code


Here is a partial list of the responsibilities of your Java code:

Implementing decision logic based on the current status of a session (the SessionFrame status) and on external information (for example, using a customer database to personalize the application callflow).

Accessing back-end databases and performing transactions (for example, a money transfer in a bank application).

Sample applications
OSD installations include sample code. Here is the path to the samples:
installdir\OpenSpeech Dialog\samples\

The pizza application
The ./pizza directory contains a simple application for ordering pizza. The application illustrates basic xHMI configuration concepts and Java coding practices.

The restaurant guide application
The ./restguide directory contains a simple application for selecting a restaurant and booking a table. The application illustrates the use of advanced natural language features.

The controller example
The ./sourcecode/servlet directory contains the source code for the front controller for handling events that affect the model or views.

The transfer example
The ./transfer directory contains code that demonstrates how to use the Transfer node for telephony transfers (typically to human agents).

The routing example
The ./routing directory contains the OSD routing servlet. Use the servlet for routing incoming telephone calls to OSD applications, and for performing rolling updates of applications. For an overview, see "Routing calls to the application" in Chapter 8.

Dialog node examples
The ./sourcecode/nodes directory contains source code of the OSD implementation of the xHMI primitive nodes: Output, Collection, and Transfer. The sources are provided for educational purposes and serve as a reference for application developers building custom nodes.

OSDM samples
These samples are useful if you plan to use OpenSpeech DialogModules (OSDMs) in your application. OSDMs perform specific callflow and prompting actions such as collecting a telephone number from a user.
The ./dateofbirth directory contains an xHMI application that uses the date OSDM to ask for a date. Running this example requires an OSDM installation.
The ./appconfig/OSDM directory contains fragments of xHMI configurations that show how to use OpenSpeech DialogModules in OSD. Use TmplAppConfig.fragment.xml as a template for other OSDM nodes.

Running the pizza application


After installation of OSD, the pizza sample application is ready for deployment and is found in the following directory:
installdir\SpeechWorks\OpenSpeech Dialog\samples\pizza\

Before installing this (or any OSD application) on an application server, you must copy dialogmanager-shared-1.4.jar to the shared library folder of the application server:
Application server    Shared library folder
Tomcat                TomcatInstallDir\shared\lib
WebSphere             WebSphereInstallDir\lib


To install the sample on the application server, perform these steps:

1 Deploy the pizza.war file.
  Tomcat       Copy pizza.war to TomcatInstallDir\webapps\ and start deployment on the management console.
  WebSphere    In the Administrative Console, click Applications > Install New Applications. Follow the standard procedures to install a war file.

2 Ensure that the jar files included in the sample application have precedence.
  Tomcat       No action required.
  WebSphere    In the Administrative Console, click Applications > Enterprise Applications. Select the pizza application. On the next page, make the following changes:
               Classloader Mode          PARENT_LAST
               WAR Classloader Policy

3 Ensure that you are using an XML parser that is compatible with JAXP 1.3. OSD installs the Apache Xerces parser, which you can use. OSD installs the parser in the java/lib/ext directory. For Tomcat 5.0, copy the XML parser libraries (xercesImpl.jar and xml-apis.jar) to the common/endorsed folder of the Tomcat installation, and then restart Tomcat.

4 Change the properties in installdir\webapps\pizza\WEB-INF\global.prop. Most importantly, you must specify your servers:
  set grammar_server_ext    Location of grammars
  prompt_server_ext         Location of prompts
  osdm_server_ext           Location of dialog modules


Chapter 2

OSDM node configuration (<osdm>)

The <osdm> element invokes sub-components and enables the creation of re-usable, modular building blocks for applications. You can call any OpenSpeech DialogModule (OSDM) provided by Nuance or another vendor, and you can implement your own sub-components. (For details and an example, see Creating an OSD component on page 23.) You can run sub-components as client-side or server-side executions: a client-side invocation runs as a VoiceXML <subdialog>, and a server-side invocation runs inside the application itself. You cannot switch arbitrarily between client-side and server-side; you must use one or the other, but not both. Server-side sub-components must be OSD applications configured with xHMI. (See Server-side versus client-side sub-components below.)

<OSDM>
Defines a modular, re-usable dialog executed as an application sub-component.
Scope
Scope                     node
Parent                    <config>
Allowed child elements    <fills>


Attributes

Attribute (1)    Description
srcexpr          Required (except when you use src). An ECMAScript expression containing the URL of an OSDM or an OSD application.
src              Required (except when you use srcexpr). For client-side, this is the URL of the sub-component's VoiceXML subdialog. (For client-side, the <osdm> tag is a simple wrapper around a VoiceXML <subdialog>.) For server-side, this is the URL of the sub-component's xHMI configuration file. The path can be relative or absolute.

1. Either srcexpr or src is required; you cannot specify both.

Implemented tags

The OSDMNode and ServerSideOSDMNode node classes implement these tags:


Config element    Description
<osdm>            On the client-side, specifies an OSDM address. On the server-side, specifies the name of an xHMI file.
<fills>           Specifies elements to map the return values to xHMI attributes.
<property>        Specifies sub-component parameters. A child of <config>.

Server-side versus client-side sub-components

Each <osdm> element in your xHMI configuration invokes a client-side or server-side component:

The server-side capability enables complete encapsulation of the called module within a single application context. (You can use the server-side class to call any OSD application configured with xHMI.)

The client-side capability enables partial encapsulation of non-OSD applications. You can use the client-side class to call OpenSpeech DialogModules (OSDMs) created by Nuance and any other component that can run as a <subdialog> on the VoiceXML page.

There are performance differences when calling an OSD component with OSDMNode or ServerSideOSDMNode. We recommend implementing components as described in Creating an OSD component and initially calling them with OSDMNode. If you suspect degraded performance, change to ServerSideOSDMNode and compare the results.


Here are the basic differences:

Client-side: When you execute an OSD component using OSDMNode, OSD renders a VoiceXML page with a <subdialog> that calls the component. When the component exits, the OSDMNode regains control and continues. OSDMNode adds a roundtrip communication with the browser platform. The cost depends mostly on the number of parameters passed between the application and the components: if the application passes many parameters to components, or the components return many parameters to the application, the performance cost increases.

Server-side: In comparison, when you execute an OSD component on the server-side, the ServerSideOSDMNode creates a component handler without generating a VoiceXML page to get to the component. There is no roundtrip communication with the browser platform. ServerSideOSDMNode adds a validation procedure and uses more memory. The cost depends on the size of the component's xHMI configuration files: larger files add load.

The configuration differences are as follows:

The node class:


Client-side class    com.scansoft.osd.nodes.OSDMNode
Server-side class    com.scansoft.osd.nodes.ServerSideOSDMNode

The src location:


Client-side src    URL to the OSDM subdialog. This URL will be rendered as <subdialog src="myURL"/> on the VoiceXML page.
Server-side src    URL to an xHMI configuration file. Can be absolute or relative. A relative URL is resolved as follows:

If the URL starts with a forward slash (/), the location is relative to the application context root. For example, if the root is http://server/app, then these src values are equivalent: /subapps/getPIN.xhmi (relative) and http://server/app/subapps/getPIN.xhmi (absolute).

If the URL does not start with a forward slash, the file location is relative to the parent xHMI file location (the calling application's xHMI file). For example, if the parent xHMI is at /xhmi/callPIN.xhmi, then a src value of subapps/getPIN.xhmi resolves to http://server/app/xhmi/subapps/getPIN.xhmi.
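The two resolution rules for a server-side src can be illustrated with a short Java sketch. This is not the OSD implementation: the class XhmiSrcResolver and its resolve method are hypothetical names, and the sketch assumes only the behavior described above (a leading slash means "relative to the context root"; anything else is relative to the parent xHMI file).

```java
import java.net.URI;

// Hypothetical helper, not part of OSD: mirrors the src resolution rules described in the text.
final class XhmiSrcResolver {

    // contextRoot: for example "http://server/app"
    // parentXhmiPath: path of the calling xHMI file below the context root, for example "/xhmi/callPIN.xhmi"
    // src: the value of the src attribute
    static String resolve(String contextRoot, String parentXhmiPath, String src) {
        URI srcUri = URI.create(src);
        if (srcUri.isAbsolute()) {
            // Absolute URLs (with a scheme) are used unchanged.
            return src;
        }
        if (src.startsWith("/")) {
            // A leading slash is relative to the application context root, not to the server root.
            return contextRoot + src;
        }
        // Otherwise resolve against the location of the parent xHMI file.
        URI parent = URI.create(contextRoot + parentXhmiPath);
        return parent.resolve(srcUri).toString();
    }

    public static void main(String[] args) {
        System.out.println(resolve("http://server/app", "/xhmi/callPIN.xhmi", "/subapps/getPIN.xhmi"));
        System.out.println(resolve("http://server/app", "/xhmi/callPIN.xhmi", "subapps/getPIN.xhmi"));
    }
}
```

Note that the leading-slash case cannot be delegated to URI.resolve, because standard URI resolution would treat the slash as relative to the server root rather than to the context root.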


Events:
Client-side    The sub-component must handle all events.
Server-side    Events are automatically forwarded to the calling application if not handled by the sub-component. This allows applications to write top-level event handlers (for example, a single configuration to handle hang-up events in all components).

Sharing data:
Client-side    The calling application can use <property> to configure the VoiceXML page that runs the OSDM subdialog. It cannot set attributes in the sub-component. The subdialog can return attribute values.
Server-side    The calling application can use <property> to set attribute values in the sub-component. The sub-component can return values from any of its top-level (global-scope) attributes. Both the application and the sub-component share the same DPF tree (see Dynamic property framework (DPF) on page 15).

Return values:
Client-side    The calling application evaluates explicit return values when performing transitions to the <next> target.
Server-side    The calling application evaluates the expr return value. (See examples below.)

Global transitions in the calling application are not available to the sub-component. (This applies to both client- and server-side sub-components.)

Example client-side invocation

This example invokes a client-side OSDM.


<!--Abbreviated header information-->
<xhmi xml:lang="en-US" root="PinDialog">
  <dialog id="PinDialog" root="PIN">
    <var-list>
      <var name="PINReturnValue" type="attribute"/>
      <var name="PINReturnCode" type="attribute"/>
      <var name="PINConfidence" type="attribute"/>
    </var-list>
    <!--This example invokes a client-side OSDM-->
    <node id="PIN" class="com.scansoft.osd.nodes.OSDMNode">
      <config>
        <!-- Configure the called sub-component-->
        <property-list>
          <property name="length" value="4"/>
        </property-list>
        <!--Specify the OSDM location-->
        <osdm src="%{osdm_server}osdm3-pin/osd-component">
          <!--Map returned values to attributes-->
          <fills name="PINReturnValue" slot="returnvalue"/>
          <fills name="PINReturnCode" slot="returncode"/>
          <fills name="PINConfidence" slot="confidencescore"/>
        </osdm>
      </config>
      <transition>
        <!--Transition based on OSDM returncode-->
        <next name="SUCCESS">
          <target path="playbackResult"/>
        </next>
        <next name="FAILURE">
          <target path="/ErrorHandler(PINReturnValue)"/>
        </next>
      </transition>
    </node>
    <!--...-->
  </dialog>
</xhmi>

The OSDMNode class allows you to specify transitions directly to other nodes by setting transition properties to the values returned by the OSDM. In the example above, the <next name="SUCCESS"> and <next name="FAILURE"> transitions use the returned properties.
Example server-side invocation

The next example invokes a server-side sub-component. It is nearly identical to the previous client-side example; the differences are highlighted with comments:
<!--Abbreviated header information-->
<xhmi xml:lang="en-US" root="PinDialog">
  <dialog id="PinDialog" root="PIN">
    <var-list>
      <var name="PINReturnValue" type="attribute"/>
      <var name="PINReturnCode" type="attribute"/>
      <var name="PINConfidence" type="attribute"/>
    </var-list>
    <!--This example invokes a server-side OSDM-->
    <!-- specify the server-side class -->
    <node id="PIN" class="com.scansoft.osd.nodes.ServerSideOSDMNode">
      <config>
        <!-- Configure the called sub-component-->
        <property-list>
          <!-- set global attribute; creating it if necessary -->
          <property name="length" value="4"/>
        </property-list>
        <!-- Specify location; getPIN must be an OSD application -->
        <!-- Relative path = same dir as the calling application -->
        <osdm src="getPIN.xhmi">
          <!-- Map returned values to attributes in calling appn-->
          <fills name="PINReturnValue" slot="returnvalue"/>
          <fills name="PINReturnCode" slot="returncode"/>
          <fills name="PINConfidence" slot="confidencescore"/>
        </osdm>
      </config>
      <transition>
        <!-- Transition targets based on the returned expr -->
        <next name="expr">
          <target condexpr="PINReturnCode == 'SUCCESS'" path="playbackResult"/>
          <target condexpr="PINReturnCode == 'FAILURE'" path="/ErrorHandler(PINReturnValue)"/>
        </next>
      </transition>
    </node>
    <!--...-->
  </dialog>
</xhmi>

Returning server-side values to the calling application

The calling application can define attributes to be filled when the sub-component returns (see the <fills> elements in the Example server-side invocation). To accomplish the return, the sub-component must define the return path and assign values to the returning attributes. The following example shows how the getPIN.xhmi sub-component might be implemented:
<!--...-->
<transition>
  <next name="expr">
    <!--If PIN is verified, then return-->
    <target condexpr="s.ver('pin')" path="return">
      <assign name="returnvalue" value="s.best('pin')"/>
      <assign name="returncode" value="'SUCCESS'"/>
      <assign name="confidencescore" value="s.conf('pin')"/>
    </target>
    <!--Confirm PIN (not shown) if defined, but not verified-->
    <target condexpr="s.def('pin')" path="confirmPIN"/>
    <!--If retry counter exceeded, then return failure-->
    <target condexpr="maxRetries == 3" path="return">
      <assign name="returnvalue" value="''"/>
      <assign name="returncode" value="'FAILURE'"/>
      <assign name="confidencescore" value="0"/>
    </target>
    <!--If PIN not defined, then retry (not shown)-->
    <target path="getPIN"/>
  </next>
</transition>
<!--...-->

Note: The sub-component can only return data in globally-scoped attributes. In the example, returnvalue, returncode, and confidencescore must be declared as top-level attributes in the sub-component (or declared in the attributes attribute of the sub-component root dialog).

Passing parameters to sub-components

The calling application uses properties to configure attributes in sub-components. The attributes are globally scoped (they are mapped to top-level attributes in the sub-component; you cannot set local attributes in dialogs or nodes). The <property> elements in the calling application reset the values of the corresponding attributes (if they exist) in the sub-component. Thus, you can define default values in the sub-component, and override those attributes as needed from the calling application.

You cannot configure properties on the VoiceXML platform with server-side components. The <property> configurations are identical for client-side and server-side components, but the effect is different: for client-side sub-components, OSD renders the properties as VoiceXML properties.

Note: A good coding practice is to declare expected attributes at the top of the sub-component. This is recommended but not required (since OSD will create the attributes if they do not already exist).


Setting parameters at runtime

To use dynamic content in sub-components, use ECMAScript expressions. For example:


<property name="collection_parallelgrammar1" expr="s.get('nameOfCommandgrammar')"/>

Example call to the Date OSDM

This example shows a client-side invocation of the Date OSDM:


<!--Abbreviated header information-->
<xhmi xml:lang="en-US" root="Horoscope">
  <dialog id="Horoscope" root="nodeAskForDate">
    <var-list>
      <!--Global variables-->
      <var name="osdmReturnCode" type="attribute"/>
      <var name="osdmDateReturnValue" type="attribute"/>
      <var name="osdmInputMode" type="attribute"/>
      <var name="osdmDateYear" type="attribute"/>
    </var-list>
    <!--This example invokes a client-side OSDM-->
    <node id="nodeAskForDate" class="com.scansoft.osd.dm.nodes.OSDMNode">
      <config>
        <!--Specify the sub-component location-->
        <osdm src="http://osdmserver:8080/osdm2-core/date">
          <!--Map returned values to attributes-->
          <fills name="osdmReturnCode" slot="nodeAskForDate.returncode"/>
          <fills name="osdmReturnValue" slot="nodeAskForDate.returnvalue"/>
          <fills name="osdmReturnInputMode" slot="nodeAskForDate.inputmode"/>
          <fills name="osdmReturnDateYear" slot="nodeAskForDate.returnkeys.Year"/>
        </osdm>
        <!-- Configure the called sub-component-->
        <property name="propertiesfile1" value="http://myserver/params/application.properties"/>
        <property name="collection_parallelgrammar1" value="command.grxml" />
      </config>
      <transition>
        <!-- The OSDM node sets the transition property with the OSDM
             return code nodeAskForDate.returncode. The actual jump
             destination depends on it. -->
        <next name="SUCCESS">
          <target path="#nodeOnSuccess"/>
        </next>
        <next name="AMBIGUOUS">
          <target path="#nodeOnSuccess"/>
        </next>
        <next name="COMMAND">
          <target condexpr="s.best('osdmReturnValue')== 'Help'" path="#nodeOnCmdHelp"/>
          <target condexpr="s.best('osdmReturnValue')== 'Guided Tour'" path="#nodeOnCmdGuidedTour"/>
          <target path="#nodeOnCmdOther"/>
        </next>
      </transition>
      <vuiforward>
        <forward name="callOSDM" path="/outputOSDM.jsp" />
      </vuiforward>
    </node>
  </dialog>
</xhmi>


Chapter 3

Dynamic property framework (DPF)

The Dynamic Property Framework (DPF)1 enables changing and passing property values during runtime execution. You can set properties statically or dynamically with as many DPF trees as needed in your OSD applications and components. One use for DPF is to pass parameters when calling OSD components from a client application (see Creating an OSD component on page 23).

Using DPF
After you set up a DPF tree, the DPF object and its properties are available to all ECMAScript expressions in the xHMI configuration. For example, myDPF.mycomponent.daily is a reference to the daily property, which appears in mycomponent in the root tree named myDPF. Use the root name as a prefix in property paths. For example, consider this script:
<output id="advertisement">
  <audio srcexpr="myDPF.mycomponent.daily"/>
</output>

Assuming that daily has the value http://server/usedcars.wav, the system automatically renders this VoiceXML fragment:
<prompt><audio src="http://server/usedcars.wav"/></prompt>

1. The dynamic property framework is modeled according to the W3C working draft DPF from November 2004.

Performance guidelines
For all DPF structures (DPF trees and DPF facades):

Loading large DPF trees can affect runtime performance. Because OSD creates DPF structures when it encounters <var> elements, the best location to define DPF trees is at the global scope.

Applications must treat DPF structures as read-only when sharing those structures with other applications. Otherwise, one application can corrupt property values in other applications by writing to the shared DPF tree.

Avoid defining property names that might cause problems with your ECMAScript interpreter. Do not define a property name that is also the name of a DPF tree or the String Function.

Error handling
When DPF errors occur, OSD throws error events for datamodel errors just as it would for any other error. For a description of events, see <catch> in the xHMI Reference Guide. Applications should always catch the error.datamodel root event or at least the general error event.

Setting up the DPF tree


To use dynamic properties in an OSD application or component, you must define a DPF tree in xHMI. It's best to do this at global scope. For example:
<xhmi>
  <!-- ...heading elements... -->
  <var-list>
    <var name="myDPF" type="dpf" expr="'myComponent'"/>
    <!-- ...other global variables... -->
  </var-list>
  <!-- ...dialogs and nodes... -->
</xhmi>

The example above creates a tree named myDPF. The tree automatically reads these properties files, in order:
1 /dpf/dpfInit.xml initialization file. This file is required.
2 /all.properties file. This file is optional; if it does not exist, the system writes a warning.
3 /myComponent.properties file. This file is optional; if it does not exist, the system writes a warning.

Typically, the name you provide for myComponent is the same name you provide for <mycomponent> in the dpfInit.xml file described below.

You can create as many DPF trees as needed. Define the contents of each tree in an xml file, then refer to this file in the configuration. This example creates a tree named myDPF that is defined in myDPFInit.xml:
<var name="myDPF" type="dpf:/dpf/myDPFInit.xml"/>

For example, the contents of myDPFInit.xml might look like this:


<?xml version="1.0" encoding="UTF-8"?>
<DPF>
  <mycomponent>
    <todays_ad> http://server/usedcars.wav </todays_ad>
  </mycomponent>
</DPF>

In the example above, the content is static. You can supply dynamic values programmatically by writing a Java class and linking it to the property:
<?xml version="1.0" encoding="UTF-8"?>
<DPF>
  <mycomponent>
    <todays_ad class="com.mycorp.myclassname"> </todays_ad>
  </mycomponent>
</DPF>

Your class must extend com.scansoft.osd.dpf.DPFProperty, and supply the value of the property. The class must override the following method:
public String getValue()

You must avoid name conflicts when designing property names; this is especially important for reusable components. One way to accomplish this is to use a subtree in DPF which has the same name as the component (as shown by <mycomponent> in the previous example).

The framework creates the DPF tree whenever it encounters the dpf variable. Because the variable is scoped, the tree is also scoped. For example, when a <node> contains a <var> that loads a DPF tree, the framework re-creates the tree every time the callflow enters the node (the name is always the same, as is the content).

You can create a DPF that is available globally to more than one application. When applications define this tree, they share a single instance of the tree instead of creating new instances.

To create a global DPF, use the static identifier as follows:


<var name="myCallDPF" type="dpf:static:/dpf/myCallDPFInit.xml"/>

Use this for read-only properties. Applications should not write to these DPF trees because any change immediately affects all applications using the tree.
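As a rough illustration of the dynamic property class described earlier in this section (a class that extends com.scansoft.osd.dpf.DPFProperty and overrides getValue()), consider the following Java sketch. Because the OSD library is not available here, the sketch uses a stand-in base class named DpfPropertyStandIn; the real base class may declare additional members or constructors. The class name TodaysAdProperty, the weekend rule, and the weekend.wav URL are all hypothetical.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;

// Stand-in for com.scansoft.osd.dpf.DPFProperty (assumption: only getValue() matters for this sketch).
abstract class DpfPropertyStandIn {
    public abstract String getValue();
}

// Hypothetical dynamic property: selects the advertisement audio by day of the week.
final class TodaysAdProperty extends DpfPropertyStandIn {
    private final DayOfWeek day;

    // No-argument constructor, assuming the framework instantiates the class by name.
    public TodaysAdProperty() {
        this(LocalDate.now().getDayOfWeek());
    }

    // Constructor that takes the day explicitly, which makes the class easy to test.
    public TodaysAdProperty(DayOfWeek day) {
        this.day = day;
    }

    @Override
    public String getValue() {
        boolean weekend = day == DayOfWeek.SATURDAY || day == DayOfWeek.SUNDAY;
        // usedcars.wav is the value used in the static example; weekend.wav is made up.
        return weekend ? "http://server/weekend.wav" : "http://server/usedcars.wav";
    }
}
```

In a real component, the class would extend DPFProperty directly and be referenced from the class attribute of the property element in the DPF initialization file.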

Storing HTTP parameters in the DPF tree


When calling OSD components, applications pass parameters using HTTP requests. The components store these parameters in DPF trees (this is discussed in Creating an OSD component on page 23).2 The parameters are inserted as children of the DPF property that has the name of the component. For example, assume the following dpfInit.xml file:
<DPF>
  <mycomponent>
    <a>40</a>
    <b>41</b>
  </mycomponent>
</DPF>

If you send a parameter myparam=42 to mycomponent, it is inserted into the DPF as follows:
DPF
  mycomponent
    a = 40
    b = 41
    myparam = 42

Note: Avoid using underscores in parameter names. Using the underscore (_) character in parameter names will create hierarchies in the DPF tree. This feature was designed for special purposes and should be avoided in your applications. For every underscore in the parameter name, a new level in the DPF tree is created. For example, if you send a_b_c=42 to mycomponent, the DPF tree looks like this:
DPF
  mycomponent
    todays_ad class="com.mycorp.myclassname"
    a
      b
        c=42

2. OSD components allow insertion of HTTP parameters; OSD applications do not.
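The effect of underscores in parameter names can be modeled with a small Java sketch. It is only an approximation of the behavior described above, using nested maps in place of real DPF properties; the class DpfParamInsertSketch and its insert method are hypothetical. What OSD does when an intermediate name already holds a plain value is not stated in this guide; the sketch simply replaces it with a subtree.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of HTTP parameter insertion: every underscore adds one tree level.
final class DpfParamInsertSketch {

    @SuppressWarnings("unchecked")
    static void insert(Map<String, Object> component, String paramName, String value) {
        String[] parts = paramName.split("_");
        Map<String, Object> node = component;
        // Walk (or create) one nested level for each name segment except the last.
        for (int i = 0; i < parts.length - 1; i++) {
            Object child = node.get(parts[i]);
            if (!(child instanceof Map)) {
                // Assumption: an existing leaf value is replaced by a new subtree.
                child = new LinkedHashMap<String, Object>();
                node.put(parts[i], child);
            }
            node = (Map<String, Object>) child;
        }
        // The last segment becomes the property that holds the value.
        node.put(parts[parts.length - 1], value);
    }

    public static void main(String[] args) {
        Map<String, Object> mycomponent = new LinkedHashMap<>();
        insert(mycomponent, "myparam", "42");
        insert(mycomponent, "a_b_c", "42");
        System.out.println(mycomponent);
    }
}
```

A name without underscores, such as myparam, lands directly below the component, whereas a_b_c produces three levels.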

DPF facades
You can group several DPF trees into a facade, and then reference properties in the facade instead of the specific trees where the properties reside. When OSD looks up a property, it searches the trees in the facade in an order that you define. Accessing DPF facade entries with scripts is the same as for DPF trees. For example, consider the ECMAScript expression in this configuration:
<output id="advertisement">
  <audio srcexpr="myDPF.mycomponent.daily"/>
</output>

Generic facade
This is a generic (list-like) approach that could be used to implement the following OSDM-like DPF facade.
<var name="myDPF" type="dpf_facade"/>

To add DPF trees to this DPF facade, execute an ECMAScript expression such as the following:
<script> myDPF.addTree(dpfA); </script>

This DPF facade has the following methods:
addNamedTree(String name, IDPFProperty dpf): adds a DPF tree with the specified name.
addTree(IDPFProperty dpf): adds a DPF tree.
removeTree(int index): removes the tree specified by the index.
setTree(int index, IDPFProperty dpf)
getTree(int index): returns the tree specified by the index.
size(): returns the number of DPF trees in the DPFFacade.


OSDM-like facade
This facade is tailored to the usage needed in packaged applications like OSDMs. Filling the DPF facade works like this:
<var name="myDPF" type="dpf_facade_osdm">
  <param name="all" expr="dpfAll"/>
  <param name="component" expr="dpfComponent"/>
  <param name="language" expr="dpfEnglish"/>
  <param name="instance" expr="dpfInstance"/>
</var>

The individual DPF trees can be instantiated as follows:

<var name="dpfAll" type="dpf:/dpf/all.xml"/>
<var name="dpfComponent" type="dpf:/dpf/component.xml"/>
<var name="dpfEnglish" type="dpf:/dpf/language.xml"/>
<var name="dpfInstance" type="dpf:/dpf/instance.xml"/>

The DPF tree instantiation can also be done using the mechanism described in the Outside Bean Creation document. The dpf_facade type allows renewing the references at runtime like in the following example for language. The other references can be renewed similarly.
<script> myDPF.language = dpfGerman; </script>

When the DPF facade is accessed like a normal DPF tree, the referenced DPF trees are searched in the following order:
1 instance DPF tree
2 language DPF tree
3 component DPF tree
4 all DPF tree

This DPF facade has the following methods:


setAll(IDPFProperty)
getAll()
setComponent(IDPFProperty)
getComponent()
setLanguage(IDPFProperty)
getLanguage()
setInstance(IDPFProperty)
getInstance()
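The search order of the OSDM-like facade (instance, then language, then component, then all) can be pictured with the following Java sketch. It is a simplified model, not the OSD implementation: plain maps keyed by property path stand in for DPF trees, and the class FacadeLookupSketch is a hypothetical name.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical model of the OSDM-like facade: the first tree that defines a property wins.
final class FacadeLookupSketch {
    private final List<Map<String, String>> searchOrder;

    FacadeLookupSketch(Map<String, String> instance,
                       Map<String, String> language,
                       Map<String, String> component,
                       Map<String, String> all) {
        // Fixed order taken from the text: instance, language, component, all.
        this.searchOrder = Arrays.asList(instance, language, component, all);
    }

    // Returns the value from the first tree that contains the path, or null if none does.
    String lookup(String path) {
        for (Map<String, String> tree : searchOrder) {
            String value = tree.get(path);
            if (value != null) {
                return value;
            }
        }
        return null;
    }
}
```

With this ordering, a value set in the instance tree overrides the same property in the language, component, and all trees, which allows general defaults to live in the all tree.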


Chapter 4

Creating an OSD component

This chapter shows how to write an OSD component that you can call from OSD applications or directly from a platform browser. It gives a step-by-step description for writing and calling the component.

Introduction
An OSD component is a web application. Conceptually, it is a re-usable building block for speech applications. You can call a component from VoiceXML or OSD applications:

By calling OSD components from existing VoiceXML applications, you can use the components without fully re-implementing the applications. By adding a <subdialog> to the VoiceXML application, the application benefits without being configured with xHMI.

By calling OSD components from OSD applications, you increase modularity and re-usability. Applications call components as if they were nodes (using the OSDMNode class). The node generates a VoiceXML page that uses <subdialog>.

An OSD component is similar to an OSD application except for these differences:

Components run inside the calling application; they are not standalone applications. For example, this has implications for logging since the component is part of a session whereas an OSD application comprises the entire session.

When a component exits (<target path="return"/>), the calling application regains control of the session. When a standalone OSD application exits, the session ends.

When you call a component, the URI ends with /osd-component. When you call a standalone OSD application, the URI ends with /osd.

When calling a component, applications pass property values using HTTP parameters.

When components return to their calling application, they fill values in a formal parameter list defined in the application's root dialog. The list reinforces the purpose of a component to perform a specific activity, and ensures that the component returns a specific amount of information.

Example PIN component


Consider an application that requests a personal identification number (PIN) for security purposes. We have a client application and a re-usable component:

The application configures the collection prompts, and defines the PIN length. Then it calls the component, gets the result, and continues. The component collects the PIN, confirms it if necessary, and returns the result. The component uses xHMI to configure the prompts, grammars, confirmations, and so on, needed to perform the collection. The example is kept simple to demonstrate the key features. For example, there is no configuration for nomatch, noinput, and help.

Here is a sample conversation:


System    Please enter your 4-digit PIN
User      one-two-three-four
System    I think you said 1234. Correct?
User      Yes [SUCCESS returned]

Another example call flow:


System    Please enter your 3-digit PIN
User      one-eight-three
System    I think you said 123. Correct?
User      No
System    Please enter your 3-digit PIN
User      one-eight-three
System    I think you said 183. Correct?
User      Yes [SUCCESS returned]

This figure shows the callflow for the example OSD component:

[Figure: callflow with the states Initial Prompt, Collection, Confirmation, Success, and Failure, and the transitions "Nothing heard or Low confidence", "Something recognized", "Rejected by user", and "Too many retries".]

Details for the example component


For implementation and discussion of each piece of the example PIN component, see these details:

Implementing OSD components
Calling OSD components from VoiceXML applications
Calling OSD components from xHMI applications


Implementing OSD components


The following subsections show how to create an OSD component by implementing the Example PIN component. You must:

Create the directory structure
Configure the DPF tree
Create xHMI file for the component (pin.xhmi)
Write a wrapper for the component (appconfig.xhmi)

Create the directory structure


OSD components use the same directory structure as OSD applications. Follow the conventions in Create a directory structure on page 138. Below, we show key pieces of the PIN example; the following subsections describe dpfInit.xml, pin.xhmi, and appconfig.xhmi in detail. The parent directory is osdm3-pin:
osdm3-pin WEB-INF dpf dpfInit.xml xhmi app inc pin.xhmi appconfig.xhmi dtd ...dtd files... all.properties ...rendering jsp files... pin.properties return.jsp ...

Configure the DPF tree


OSD components define and enable DPF trees for receiving variables from their calling applications. This enables the applications to dynamically insert variables into the trees when invoking components. Steps:
1 Defining the DPF (dpfInit.xml)
2 Enabling DPF (appconfig.xhmi)

For detailed DPF information, see Dynamic property framework (DPF) on page 15.

Defining the DPF (dpfInit.xml)

You must define a DPF tree so applications can set parameters for OSD components to use in their callflows. You should define default values for every entry in the DPF tree. The examples below use the length property to demonstrate setting and changing the default value.

For the Example PIN component, the application sets the PIN length. Below, we create the <length> property in the dpfInit.xml file as well as properties for prompts and various counters. The example configures a single element called pin below the DPF root element. The system automatically includes all properties in the pin.properties file under the pin element.
<?xml version="1.0" encoding="UTF-8"?>
<DPF>
  <pin>
    <length>4</length>
    <collection_initialprompt>
      <![CDATA[ Please state your <value expr="DPF.pin.length"/>-digit PIN]]>
    </collection_initialprompt>
    <collection_maxnoinputs>3</collection_maxnoinputs>
    <collection_maxnomatches>3</collection_maxnomatches>
    <collection_maxretries>3</collection_maxretries>
    <confirmation_initialprompt>I think you said </confirmation_initialprompt>
    <confirmation_initialprompt2>Correct? </confirmation_initialprompt2>
    <confirmation_maxnoinputs>3</confirmation_maxnoinputs>
    <confirmation_maxnomatches>3</confirmation_maxnomatches>
    <confirmation_maxretries>3</confirmation_maxretries>
  </pin>
</DPF>

You can reference DPF properties using DPF.pin.name syntax. For example: <value expr="DPF.pin.length"/>. In a component, this xHMI fragment assigns the length from the DPF:
<property-list>
  <property name="length" expr="DPF.pin.length"/>
</property-list>


The component can access all its properties with the asterisk symbol (*) as a wildcard. The following example references all pin properties:
<property-list>
  <property-set expr="DPF.pin.*"/>
</property-list>

To see DPF in a more complete example, see The collection node example on page 30.
Enabling DPF (appconfig.xhmi)

The OSD component must enable DPF in its xHMI configuration. Typically, you configure the DPF in a global <var-list>. For the Example PIN component, this is done in the appconfig.xhmi file as follows:
<var-list>
  <var name="DPF" type="dpf" expr="'pin'"/>
</var-list>

See Write a wrapper for the component (appconfig.xhmi) on page 34.

Create xHMI file for the component (pin.xhmi)


This section shows the complete structure of a configuration file for an OSD component including the initial header and placeholders for the important blocks of code. This file, pin.xhmi, implements the Example PIN component:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE dialog SYSTEM "../../dtd/xhmi.dtd">
<dialog id="pin" root="collection" attributes="returncode returnvalue confidencescore">
  <var-list>
    <!--The pin attribute will contain the collected PIN -->
    <var name="pin" type="attribute"/>
  </var-list>
  <!--see The collection node example on page 30... -->
  <!--see The confirmation node example on page 31... -->
  <!--see The error <catch> example on page 33... -->
</dialog>

Note: Do not enclose the OSD component with <xhmi> </xhmi> elements.

The header defines a dialog called pin whose root node is collection. The dialog declares three parameters: returncode, returnvalue, and confidencescore. The parameters are implicitly defined in the component, so they need no <var> definitions.


The collection node example

The PIN component first executes the collection node, which makes extensive use of the DPF tree defined in Configure the DPF tree. The user hears the prompt: Please state your n-digit PIN, and the value of n is set dynamically. Here is the collection node configuration in pin.xhmi. It defines properties, prompts, grammars, and transitions; and it gets these values from the DPF tree (the values previously set by the calling application):
<node id="collection" class="Collection">
  <config>
    <property-list>
      <!--Get properties set by the calling application -->
      <property name="_maxnoinputs" expr="DPF.pin.collection_maxnoinputs"/>
      <property name="_maxnomatches" expr="DPF.pin.collection_maxnomatches"/>
      <property name="_maxretries" expr="DPF.pin.collection_maxretries"/>
    </property-list>
    <output-list>
      <initial>
        <output>
          <!--Get prompt set by the application -->
          <value expr="DPF.pin.collection_initialprompt"/>
        </output>
      </initial>
    </output-list>
    <grm-list>
      <!--Use the digits built-in grammar, and use the length set by the application -->
      <grm srcexpr="'builtin:grammar/digits?minlength='+DPF.pin.length+';maxlength='+DPF.pin.length">
        <!--NONE is a required keyword for built-in grammars -->
        <fills name="pin" slot="NONE"/>
      </grm>
    </grm-list>
    <!--Perform recognition and fill slot with the result -->
    <understand namelist="pin"/>
  </config>


<transition> <next name="expr"> <!--If pin is verified, return SUCCESS, score, and pin --> <target condexpr="pin.ver()" path="return"> <assign name="returncode" value="SUCCESS"/> <assign name="confidencescore" expr="pin.conf()"/> <assign name="returnvalue" expr="pin.best()"/> </target> <!--If pin is defined, confirm with the user --> <target condexpr="pin.def()" path="confirmation"/> <!--The SessionFrame automatically records executions. If pin is not defined after 2 attempts, return failure --> <target condexpr="s.nodeVisits()>2" path="return"> <assign name="returncode" value="FAILURE"/> <assign name="returnvalue" value="noValueInCaseOfFailure"/> <assign name="confidencescore" value = "0"/> </target> <!--If the pin slot was not filled, try again to collect --> <target path="collection"/> </next> <!--Only one <next> since collection always returns expr --> </transition> </node>

The collection node maps values to several predefined OSD properties: _maxnoinputs, _maxnomatches, and _maxretries. OSD automatically renders these properties to configure the browser. To prepare for recognition, the node defines a grammar. In this example, we use the built-in digits grammar, which accepts parameters for the minimum, expected, and maximum number of digits;1 the example sets the minimum and the maximum. The values for these parameters are retrieved from the DPF tree (where they were set by the calling application). The grammar fills the pin attribute with the recognition result.
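To make the grammar expression concrete, the fragment below shows the static grammar URI that the collection node aims to build when DPF.pin.length holds the default value 4 from the DPF tree. This is only an illustrative sketch of the evaluated string, assuming the expression simply concatenates the length into the URI; the component itself keeps the dynamic srcexpr so that the calling application can change the length.

```xml
<!-- Illustration only: intended evaluated form of the srcexpr when DPF.pin.length is 4 -->
<grm src="builtin:grammar/digits?minlength=4;maxlength=4">
  <!-- NONE is a required keyword for built-in grammars -->
  <fills name="pin" slot="NONE"/>
</grm>
```

Because the minimum and maximum are equal, the grammar would accept only four-digit PINs in this case.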
The confirmation node example

The confirmation node executes when pin is defined, but not verified. This means the dialog collected a valid PIN, but the confidence score was too low to automatically verify. When the user confirms, the node returns the PIN to the calling instance of the PIN component, and the instance returns to the calling application.

1 The documentation for built-in grammars is provided in the OSR Language Supplement for the recognized language.

Here is the configuration of the confirmation node in pin.xhmi:


<node id="confirmation" class="Collection">
  <var-list>
    <!-- Scope the YESNO attribute inside the node -->
    <var name="YESNO" type="attribute">
      <param name="temporary" expr="true"/>
    </var>
  </var-list>
  <config>
    <property-list>
      <!--Get values set by the calling application -->
      <property name="_maxnoinputs" expr="DPF.pin.confirmation_maxnoinputs"/>
      <property name="_maxnomatches" expr="DPF.pin.confirmation_maxnomatches"/>
      <property name="_maxretries" expr="DPF.pin.confirmation_maxretries"/>
    </property-list>
    <output-list>
      <initial>
        <output>
          <!--Define prompt start -->
          <value expr="DPF.pin.confirmation_initialprompt"/>
        </output>
        <output>
          <!--Speak the collected PIN as individual digits -->
          <say-as interpret-as="number" format="digits">
            <value expr="pin.best()"/>
          </say-as>
        </output>
        <output>
          <!--Define prompt end -->
          <value expr="DPF.pin.confirmation_initialprompt2"/>
        </output>
      </initial>
    </output-list>
    <grm-list>
      <!-- Use built-in boolean grammar to fill YESNO attribute-->
      <grm src="builtin:grammar/boolean">
        <!-- NONE keyword is required -->
        <fills name="YESNO" slot="NONE"/>
      </grm>
    </grm-list>
    <!--Perform recognition and fill slot with the result -->
    <understand namelist="YESNO"/>
  </config>
  <transition>
    <next name="expr">
      <!--If user confirmed, return success, score, and pin -->
      <target condexpr="YESNO.best()=='true'" path="return">
        <assign name="returncode" value="SUCCESS"/>
        <assign name="returnvalue" expr="s.best('pin')"/>
        <assign name="confidencescore" expr="pin.conf()"/>
      </target>
      <!--If not confirmed after 2 attempts, return failure -->
      <target condexpr="s.nodeVisits()>2" path="return">
        <assign name="returncode" value="FAILURE"/>
        <assign name="returnvalue" value="noValueInCaseOfFailure"/>
        <assign name="confidencescore" value="0"/>
      </target>
      <!--If caller rejected, collect a new PIN -->
      <target condexpr="YESNO.best()=='false'" path="collection"/>
      <!--If not confirmed, try again -->
      <target path="confirmation"/>
    </next>
  </transition>
</node>

The error <catch> example

The PIN component must catch and handle any errors that occur during collection and confirmation. This is configured by the following block at the dialog scope in pin.xhmi:
<catch> <!-- Handler for disconnected sessions --> <target event="connection.disconnect" path="return"> <assign name="returncode" value="FAILURE"/> <assign name="returnvalue" value="noValueInCaseOfFailure"/> <assign name="confidencescore" value = "0"/> </target> <!-- Handler for speech that is too long -->


<target event="maxspeechtimeout" path="return"> <assign name="returncode" value="FAILURE"/> <assign name="returnvalue" value="maxspeechtimeout"/> <assign name="confidencescore" value = "0"/> </target> <!-- Handler when no speech is detected too many times --> <target event="maxnoinputs" path="return"> <assign name="returncode" value="FAILURE"/> <assign name="returnvalue" value="maxnoinputs"/> <assign name="confidencescore" value = "0"/> </target> <!-- Generic handler for all remaining error events --> <target event="error" path="return"> <assign name="returncode" value="FAILURE"/> <assign name="returnvalue" value="error"/> <assign name="confidencescore" value = "0"/> </target> <!-- Generic handler for all remaining events--> <target event="." path="return"> <assign name="returncode" value="FAILURE"/> <assign name="returnvalue" value="defaultEventHandlerActivated"/> <assign name="confidencescore" value = "0"/> </target> </catch>

Write a wrapper for the component (appconfig.xhmi)


The final step in building an OSD component is to create a wrapper that loads the component as a web application. This is done in an appconfig.xhmi file (see Create the directory structure). Once the wrapper is complete, the implementation of the component is done. You can package the component into a WAR or EAR file and install it on any servlet container.


This example wraps the OSD component described in Example PIN component:
<?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE xhmi SYSTEM "../dtd/xhmi.dtd"> <xhmi root="pin" xmlns="http://www.scansoft.com/2004/xhmi" xmlns:xi="http://www.w3.org/2001/XInclude"> <vuiforward> <!--Define jsp's for generating VoiceXML --> <forward name="_collection" path="/collection.jsp"/> <forward name="_error" path="/error.jsp"/> <forward name="_return" path="/return.jsp"/> </vuiforward> <!--Enable the DPF --> <var-list> <var name="DPF" type="dpf" expr="'pin'"/> </var-list> <!--Include the pin dialog --> <xi:include href="inc/pin.xhmi"/> </xhmi>

Above, the wrapper defines an xHMI application with the pin root dialog. This structure (using <xi:include>) is recommended because it enables you to also use the pin.xhmi file directly in an OSD application.


Calling OSD components from VoiceXML applications


Here is a VoiceXML fragment that calls a subdialog named mypin, and passes the PIN length as an HTTP parameter. The subdialog is actually an OSD component named osdm3-pin that collects a personal identification number. When the component returns, the application stores the returned value in the result variable. The example does not show what the VoiceXML application does next, and it omits error handling for brevity:
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form>
    <var name="result"/>
    <var name="length" expr="4"/>
    <subdialog name="mypin"
               src="http://somehost:8080/osdm3-pin/osd-component"
               namelist="length">
      <filled>
        <assign name="result" expr="mypin.returnvalue"/>
      </filled>
    </subdialog>
  </form>
</vxml>

The VoiceXML application can use the HTTP request to set properties because the OSD component configures a DPF tree to receive them. For an example, see Implementing OSD components. Above, the <subdialog> inserts length into the tree, and the component accesses the property using syntax like DPF.pin.length. For an example, see The collection node example on page 30.
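The fragment above stores the return value without inspecting the return code. As a hedged sketch, a caller could branch on the returncode that the PIN component assigns (SUCCESS or FAILURE) before using the value. The prompt text and the failure behavior below are illustrative assumptions, not part of the sample application; the fragment assumes it sits inside the same <form> that declares the result and length variables.

```xml
<subdialog name="mypin"
           src="http://somehost:8080/osdm3-pin/osd-component"
           namelist="length">
  <filled>
    <if cond="mypin.returncode == 'SUCCESS'">
      <!-- The component verified or confirmed the PIN -->
      <assign name="result" expr="mypin.returnvalue"/>
    <else/>
      <!-- Hypothetical failure handling; a real application might retry or transfer the call -->
      <prompt>Sorry, I could not collect your PIN.</prompt>
    </if>
  </filled>
</subdialog>
```

On FAILURE, returnvalue carries a reason string (for example, maxnoinputs) rather than a PIN, so the caller should not treat it as a collected value.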


Calling OSD components from xHMI applications


The following subsections show how to use OSD components. This example creates a client application that calls the component described in the Example PIN component on page 24. The application configures the PIN length, and decides what to do after a valid PIN is collected and verified.

Create the application structure


The directory structure follows the conventions described in Create a directory structure on page 138. Here is the directory structure for the example client application. The parent directory is named secured:
secured
  WEB-INF
    xhmi
      app
        appconfig.xhmi
      dtd
        ...dtd files...
  ...rendering jsp files...
  ...

Create the application configuration (appconfig.xhmi)


OSD applications call OSD components as if they were nodes. In this example, the OSDM node configures and calls the OSD component (itself an OSDM node). The code below shows the complete xHMI file including a header followed by placeholders for the important blocks of code (which are shown later). Here is the application configuration:
<?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE xhmi SYSTEM "../dtd/xhmi.dtd"> <xhmi root="Main" xmlns="http://www.scansoft.com/2004/xhmi" xmlns:xi="http://www.w3.org/2001/XInclude"> <vuiforward> <!--Define jsp's for generating VoiceXML --> <forward name="_collection" path="/collection.jsp"/> <forward name="_outputExit" path="/outputExit.jsp"/> <forward name="callOSDM" path="/outputOSDM.jsp"/> <forward name="_error" path="/error.jsp"/> </vuiforward>


  <dialog id="Main" root="intro">
    <!--
      Insert Main dialog; see Example parameter list...
      Insert intro node; see Example reference to a component...
      Insert pinOSDM node; see Example call to a component...
      Insert presentResult node; see Example result handling...
      Insert ErrorHandler dialog; see Example error handling...
    -->
  </dialog>
  <catch>
    <!-- Handle the most common expected events -->
    <target event="connection.disconnect" path="exit"/>
    <target event="error" path="exit"/>
    <target event="." path="exit"/>
  </catch>
</xhmi>

Example parameter list

To receive return values from OSD components, the OSD application must declare a formal list of attributes in its root dialog. Below is the root dialog of the application that calls the Example PIN component. The Main dialog defines intro as the root node, and declares attributes to contain the return values. Here is the beginning of the Main dialog:
<dialog id="Main" root="intro"> <var-list> <var name="osdmPINReturnValue" type="attribute"/> <var name="osdmPINReturnCode" type="attribute"/> <var name="osdmPINConfidence" type="attribute"/> </var-list>

Example reference to a component

The OSD application calls components as if they were nodes. Below, intro is the first node called by the Main dialog of the application that uses the Example PIN component. The node adds a prompt to the output queue and calls the pinOSDM node. Here is the configuration:
<!--The output class sends the prompt to an output queue -->
<node id="intro" class="Output">
  <config>
    <output-list>
      <initial>
        <output>Welcome to the PIN demonstrator</output>
      </initial>
    </output-list>
  </config>
  <transition>
    <next name="expr">
      <!--Call the next node, in this case an OSD component -->
      <target path="pinOSDM"/>
    </next>
    <!--Only one <next> because the target is unconditional-->
  </transition>
</node>

Example call to a component

This example defines a node that invokes an OSD component. You can use the OSDMNode or ServerSideOSDMNode classes for the definition. Below is the configuration for the pinOSDM node, which calls the Example PIN component. Like all nodes using the OSDMNode class, pinOSDM uses a namelist to send properties to the OSDM (in this example, we set the PIN length). The system writes property values to a DPF tree provided by the OSD component, and the values override any defaults defined in the component.
<node id="pinOSDM" class="com.scansoft.osd.nodes.OSDMNode"> <config> <property-list> <property name="length" value="3"/> </property-list> <!--Call the PIN component --> <osdm src="%{osdm_server}osdm3-pin/osd-component"> <!--Fill slots with values of returned parameters --> <fills name="osdmPINReturnValue" slot="returnvalue"/> <fills name="osdmPINReturnCode" slot="returncode"/> <fills name="osdmPINConfidence" slot= "confidencescore"/> </osdm> </config>


<transition> <next name="SUCCESS"> <target path="presentResult"/> </next> <next name="FAILURE"> <target path="/ErrorHandler(osdmPINReturnValue)"/> </next> </transition> </node>

The <transition> block defines what happens after the PIN component returns. On SUCCESS, the application calls the presentResult node. On FAILURE the application calls the ErrorHandler dialog (and passes the return value as a parameter for use in the error message).
Example result handling

A real application would do something useful with the collected PIN, but this sample merely repeats the collected PIN to the user and then exits. Here is the configuration for the presentResult node:
<node id="presentResult" class="Output"> <config> <output-list> <initial> <output> Your pin is <say-as interpret-as="number" format="digits"> <value expr="osdmPINReturnValue.best()"/> </say-as> Goodbye </output> </initial> </output-list> </config> <transition> <next name="expr"> <target path="exit"/> </next> </transition> </node> </dialog>

Note how the presentResult node controls the text-to-speech output: because the format is declared as digits, the TTS engine will speak the PIN as individual digits (for example, one-two-three-four) instead of combining digits into some natural number expression (for example, twelve thirty-four or one thousand two hundred thirty-four).


Example error handling

The ErrorHandler dialog is called when an error occurs.2 This example error handler simply reports the error to the caller (for testing purposes), and then the application exits. A real application would do something more useful; for example, it might transfer the telephone call to a human operator. Here is the configuration for the ErrorHandler dialog:
<dialog id="ErrorHandler" root="errorHandler" attributes="resval"> <node id="errorHandler" class="Output"> <config> <output-list> <initial> <output>An error occurred: <value expr="s.best('resval')"/> </output> <output>please try again later</output> </initial> </output-list> </config> <transition> <next name="expr"> <target path="exit"/> </next> </transition> </node> </dialog>

2 For this example, the error handler is implemented as a dialog instead of a node. This choice emphasizes that a re-usable dialog is more useful when the example performs a more complicated transaction.

Chapter 5

TransitionNode configuration

The TransitionNode node class provides a way to specify transitions in a node. Using a Transition node allows you to centralize and re-use transition handling instead of repeating transition configurations in every node.
Description

<config>
Element            Description
<property-list>    Specifies predefined properties that steer the transition to the next node.

TransitionNode ignores all elements in <config> except for the predefined properties (see Predefined properties on page 147). Your Transition nodes can log activities just as any other node: use <log> inside the <node> and its <target> elements.
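As a sketch of the logging just described, the fragment below places <log> elements inside a Transition node and inside its <target> elements. The attribute name serviceType and the dialog paths /Weather and /Help are borrowed from the example in this chapter; the log message texts are made up for illustration.

```xml
<node class="com.scansoft.osd.nodes.TransitionNode" id="transit">
  <!-- Logged each time the Transition node is entered -->
  <log>transit node entered</log>
  <transition>
    <next name="expr">
      <target condexpr="s.best('serviceType')=='WEATHER'" path="/Weather">
        <!-- Logged only when this target is taken -->
        <log>routing caller to the Weather dialog</log>
      </target>
      <target condexpr="true" path="/Help">
        <log>no known service selected; routing to Help</log>
      </target>
    </next>
  </transition>
</node>
```

Keeping the log statements in the shared Transition node means the routing decision is recorded in one place, regardless of which node handed control to it.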
Example

Below is an example Transition node. It defines three targets that could be used by several nodes in the application:
<node class="com.scansoft.osd.nodes.TransitionNode" id="transit"> <transition> <next name="expr"> <target condexpr="s.best('serviceType')=='WEATHER'" path="/Weather"/> <target condexpr="s.best('serviceType')=='NEWS'" path="/News"/> <target condexpr="true" path="/Help"/> </next> </transition> </node>


Here is an example of a node that uses the Transition node:


<var-list>
  <var name="serviceType" type="attribute"/>
</var-list>
<node class="Collection" id="welcome">
  <config>
    <output-list>
      <initial>
        <output>Welcome, what service do you want to use?</output>
        <output>You can have information about the WEATHER or NEWS</output>
      </initial>
    </output-list>
    <grm-list>
      <grm src="serviceNames.grxml">
        <fills name="serviceType"/>
      </grm>
    </grm-list>
    <understand namelist="serviceType"/>
  </config>
  <transition>
    <next name="expr">
      <!-- Hand over to the shared Transition node -->
      <target path="#transit"/>
    </next>
  </transition>
</node>


Chapter 6

Handling events with application servers

When events arise, application designers and developers can handle them client-side or server-side:

Client-side event handling means handling events on the VoiceXML page.
Server-side event handling means handling events on the application server.

By default, event handling is done client-side. To change the default, see Enabling server-side event handling on page 46.

Overview
Prior to OSD 1.4, applications handled user-defined events on the application server and handled predefined recognition events (nomatch, noinput and help) on the VoiceXML page. As of OSD 1.4, applications can define a single event-handling mechanism located on the server-side. When recognition events arise, and the application handles them client-side, the only possible actions are to play a prompt and restart the node to collect the information. Many more actions are possible on the server-side:

<assign>
<clear>
<output>
<log>
<script>


Performance considerations
Server-side event handling shifts processing requirements from client to server. Applications are likely to demand more processing on the server-side (because more actions are possible there). A disadvantage to server-side event handling is that it adds one roundtrip of messages between the VoiceXML interpreter and the web application server for each recognition event that occurs. If you anticipate high counts of recognition events (nomatch, noinput and help), then you should consider the additional network load of roundtrip messages for your application.

Enabling server-side event handling


By default, server-side event handling is disabled. This is done to preserve compatibility with previous OSD releases. Applications can enable server-side event handling by setting the following predefined properties to false:

_renderNoMatchOutputs
_renderNoInputOutputs
_renderHelpOutputs

You can set the properties at the global, dialog, or node scope. (Thus, applications can switch between client- and server-side event handling on a node-by-node basis.)

Note: Applications can use client-side or server-side event handling for each type of recognition event. For example, you can handle noinput events on the client and nomatch and help on the server. (However, you cannot mix client and server for the same type of event; for example, you cannot handle noinput events on both the client and server.) Setting _renderNoMatchOutputs to false disables automatic playing of the nomatch output and sends the nomatch event to the server.
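The mixed setup mentioned in the note (noinput on the client, nomatch and help on the server) could be sketched as the following <property-list>, placed at whichever scope you want it to apply. This is a minimal sketch that assumes an unset property keeps its default client-side behavior, as described above.

```xml
<property-list>
  <!-- nomatch and help: handled on the application server -->
  <property name="_renderNoMatchOutputs" value="false"/>
  <property name="_renderHelpOutputs" value="false"/>
  <!-- _renderNoInputOutputs is left unset, so noinput keeps the default client-side handling -->
</property-list>
```

With this configuration, only nomatch and help events cost an extra roundtrip to the application server.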


Counting events as they occur


Applications can use the following xHMI attributes to track when events occur in the current node:
Predefined attribute    Description
__helpCounter           Counts how many times the help event has been raised. Resets to zero (0) when exiting the current node.
__noinputCounter        Counts how many times the noinput event has been raised. Resets to zero (0) when exiting the current node.
__nomatchCounter        Counts how many times the nomatch event has been raised. Resets to zero (0) when exiting the current node.

For examples showing counters, see Restarting a node and varying the output on page 48 and Using conditions to vary the output on page 49.

How to catch events on the application server


To catch events on the server-side, use the <catch> element. You can place <catch> configurations at the global, dialog, or node scope. The <catch> configuration contains one child <target> for each handled event. This example handles the nomatch and noinput events. Both targets invoke the collect node after performing their actions:
<catch> <target event="nomatch" path="#collect"> <log>nomatch log message with <value expr="'values'"/></log> <clear name="slot_to_be_cleared"/> <assign name="slot_to_be_assigned_to" expr="'some value'"/> <output>I did not understand, please repeat</output> </target> <target event="noinput" path="#collect"> <log>noinput log message with <value expr="'values'"/></log> <output>Please answer the question carefully</output> ... </target> </catch>
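The same pattern should extend to the third recognition event, help. The sketch below assumes that _renderHelpOutputs is set to false for the node, that the event is named help (following the event names listed in this chapter), and that the node to restart has the id collect; the prompt and log texts are illustrative.

```xml
<catch>
  <!-- Assumed event name "help"; requires server-side handling of help events -->
  <target event="help" path="#collect">
    <log>help event occurred</log>
    <output>You can say the name of the item you want.</output>
  </target>
</catch>
```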


Examples of event handling on the application server


This section shows snippets and a complete example of xHMI configurations for handling recognition events (noinput, nomatch, and help). The examples show how to count iterations of node executions and customize output prompts based on the number of retries experienced by the end-user.

Restarting a node and varying the output


This example collects one item and transitions to the next node (which is always the bye node). If a recognition event arises, the handler restarts the node.
<node class="Collection" id="collect"> <config> <var name="counter" type="int" expr="0"/> <property-list> <!-- enable server-side event handling --> <property name="_renderNoMatchOutputs" value="false"/> <property name="_renderNoInputOutputs" value="false"/> <property name="_renderHelpOutputs" value="false"/> </property-list> <output-list> <initial> <output>This is a collection</output> </initial> </output-list> <grm-list> <grm src="collection.grxml"> <fills name="collectedItem"/> </grm> </grm-list> <understand namelist="collectedItem"/> </config> <transition> <next name="expr"> <!-- Optional max retry counter below. See Setting the maximum retries of a node on page 50 --> <target condexpr="counter > 17" path="Help()"/> <target condexpr="collectedItem.def()" path="#bye"/> </next>

</transition> <!-- The simplest event handler restarts the node --> <catch> <target event="nomatch" path="#collect"/> <target event="noinput" path="#collect"/> </catch> </node>

Using conditions to vary the output


The next example uses conditions to vary the <output> elements each time the node is restarted. Because the conditions require processing that cannot execute on the VoiceXML page, the recognition event handling must happen server-side. The previous example (Restarting a node and varying the output on page 48) used <output count=... to implement its conditional logic. Here, user-defined attributes serve the same purpose. The behavior of this node is the same as that shown in Restarting a node and varying the output on page 48. The difference is that this node controls the contents of the counters. The catch handlers contain executable content that computes the counter values:
<node class="Collection" id="collect">
  <var-list>
    <!-- initialize user-defined counters -->
    <var name="nomatches" type="attribute" expr="0"/>
    <var name="noinputs" type="attribute" expr="0"/>
    <var name="counter" type="int" expr="0"/>
  </var-list>
  <config>
    <property-list>
      <!-- enable server-side event handling -->
      <property name="_renderNoMatchOutputs" value="false"/>
      <property name="_renderNoInputOutputs" value="false"/>
      <property name="_renderHelpOutputs" value="false"/>
    </property-list>
    <output-list>
      <initial>
        <output condexpr="counter==0">This is a collection</output>
        <!-- use next prompt on the retry -->
        <output condexpr="counter==1">This is still a collection</output>
      </initial>
    </output-list>
    <grm-list>
      <grm src="collection.grxml">
        <fills name="collectedItem"/>
      </grm>
    </grm-list>
  </config>
  <transition>
    <next name="expr">
      <target condexpr="collectedItem.def()" path="#bye"/>
    </next>
  </transition>
  <catch>
    <!-- if nomatch arises, increment counters and restart node-->
    <target event="nomatch" path="#collect">
      <assign name="nomatches" expr="Number(s.best('nomatches'))+1"/>
      <assign name="counter" expr="Number(s.best('counter'))+1"/>
    </target>
    <!-- if noinput arises, increment counters and restart node-->
    <target event="noinput" path="#collect">
      <assign name="noinputs" expr="Number(s.best('noinputs'))+1"/>
      <assign name="counter" expr="Number(s.best('counter'))+1"/>
    </target>
  </catch>
</node>

(In the preceding example, we omit <understand> assuming that it is defined at a higher scope.)

Setting the maximum retries of a node


This example implements a maximum number of restarts for a node. (A node restart is a retry state for the user: the node begins again after an unsuccessful collection of information from the user.) The counter in this example is the total of all retry causes.


Define the maximum retry check as the first or last transition target in the node. Here, it is the first transition target:
<node> <!-- ... --> <transition> <target condexpr="s.best('counter')>17" path="Help"> <log>counter reached maximum number of events</log> </target> <!-- remaining targets go here --> <!-- ... --> </transition> <!-- ... --> </node>

To see where this <target> fits inside the node, see the example in Restarting a node and varying the output on page 48.

Running scripts inside event handlers


The next example adds the powerful scripting feature. (This feature is sometimes called scripts as a child of <target>.) Upon catching any event (not just recognition events, but also user-defined events), the application can run a script. Because the event handler is server-side, the scripts have full access to the data in the OSD session. This example catches only the nomatch event (the <catch> is at the bottom):
<node class="Collection" id="collect"> <config> <property-list> <property name="_renderNoMatchOutputs" value="false"/> <property name="_renderNoInputOutputs" value="false"/> <property name="_renderHelpOutputs" value="false"/> </property-list> <output-list> <initial> <output>This is a collection</output> <output count="2">This is still a collection</output> </initial> </output-list> <grm-list> <grm src="collection.grxml"> <fills name="collectedItem"/> </grm>

</grm-list> <understand namelist="collectedItem"/> </config> <transition> <next name="expr"> <target condexpr="collectedItem.def()" path="#bye"/> </next> <!-- ... <next> ... --> </transition> <catch> <!-- if nomatch happened, restart collect --> <target event="nomatch" path="#collect"> <output>Please say something I can understand!</output> <script> a = 'executed every time this target is used'; </script> </target> </catch> </node>

The first execution of the node sends the following output: This is a collection. The first nomatch event, and any subsequent nomatch, sends this output: Please say something I can understand! This is still a collection.


Complete event-handling example


This example shows a simple, complete application for reference, using features provided by default in OSD. There is a single dialog (Main) and several nodes (intro, collect, and bye). When noinput and nomatch recognition events arise, the example handles them on the application server. This allows for logging, scripting, and interactions with the session data. This example application refers to counters that are managed internally by OSD, and it has one user-defined counter for the total number of nomatch and noinput events:
<?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE xhmi SYSTEM "../dtd/xhmi.dtd"> <xhmi root="Main" xml:lang="en-US" xmlns="http://www.scansoft.com/2004/xhmi" xmlns:xi="http://www.w3.org/2001/XInclude"> <vuiforward> <forward name="_outputExit" path="/outputExit.jsp"/> <forward name="_collection" path="/collection.jsp"/> </vuiforward> <dialog root="intro" id="Main"> <var-list> <var name="collectedItem" type="attribute"/> <var name="counter" type="int" expr="0"/> </var-list> <node class="Output" id="intro"> <config> <output-list> <initial> <output>Welcome to the demo</output> </initial> </output-list> </config> <transition> <next name="expr"> <target path="#collect"/> </next> </transition> </node>


    <node class="Collection" id="collect">
      <log>collection entered</log>
      <log>counter is '<value expr="counter"/>'</log>
      <config>
        <property-list>
          <!-- do not handle reco events on client-side -->
          <property name="_renderNoMatchOutputs" expr="'false'"/>
          <property name="_renderNoInputOutputs" expr="'false'"/>
          <property name="_renderHelpOutputs" expr="'false'"/>
        </property-list>
        <output-list>
          <initial>
            <output>This is a collection</output>
          </initial>
        </output-list>
        <grm-list>
          <grm src="collection.grxml">
            <fills name="collectedItem"/>
          </grm>
        </grm-list>
        <understand namelist="collectedItem"/>
      </config>
      <transition>
        <next name="expr">
          <!-- Check the total retry counter. -->
          <target condexpr="counter > 17" path="#bye">
            <log>
              maximum retries and events exceeded;
              leaving unsuccessful collection
            </log>
          </target>
          <!-- if data was collected, go to next node -->
          <target condexpr="collectedItem.def()" path="#bye">
            <log>leaving successful collection</log>
            <clear name="counter"/>
          </target>
          <!-- by default, retry the node -->
          <target condexpr="true" path="#collect">
            <log>executing node again</log>
            <assign name="counter"
                    expr="Number(s.best('counter'))+1"/>
          </target>
        </next>
      </transition>
      <catch>
        <!-- handle nomatch and noinput recognition events -->
        <target event="nomatch" path="#collect">
          <log>
            nomatch event occurred; the number of times this node
            has tried to collect is <value expr="counter.best()"/>
          </log>
          <script>counter=counter+1;</script>
          <clear name="collectedItem"/>
        </target>
        <target event="noinput" path="#collect">
          <log>
            noinput event occurred; the number of times this node
            has tried to collect is <value expr="counter.best()"/>
          </log>
          <script>counter=counter+1;</script>
          <clear name="collectedItem"/>
        </target>
      </catch>
      <!-- The final is triggered when the node is exited without an
           explicit transition at the dialog or global level. -->
      <final>
        <log>non node transition used to leave collection</log>
      </final>
    </node>
    <node id="bye" class="Output">
      <config>
        <output-list>
          <initial>
            <output>Thanks for calling</output>
          </initial>
        </output-list>
      </config>
      <transition>
        <next name="expr">
          <target path="exit"/>
        </next>
      </transition>
    </node>
    <final>
      <!-- this <final> is for the dialog scope -->
      <log>
        Dialog exited without transitioning to another dialog
      </log>
    </final>
  </dialog>
  <catch>
    <!-- global handler for hangup events -->
    <target event="session.connection.disconnect" path="exit"/>
  </catch>
</xhmi>

Chapter 7

OSD logging

This chapter describes OSD logging mechanisms, including these topics:

• About OSD logging, an overview of the available logging streams.
• Application logging, a description of xHMI configuration and critical events your applications should log.
• Turning application logging on and off

About OSD logging


OSD generates log events and writes log files for various purposes. OSD implements these types of logging:

• Diagnostic logging: For debugging and monitoring system operations, OSD uses log4j, an open-source utility that is a project of the Apache Software Foundation.
• Page logging: For debugging, OSD writes copies of every VoiceXML page it renders.
• Application logging: For analysis and tuning of your applications, OSD writes log files to a documented file system. You have control of the content of log messages, and you can choose more than one format of records in the files. For example, the most common format is the one used by the OpenSpeech Insight (OSI) tuning tool.

Diagnostic logging
For debugging and monitoring system operations, OSD uses log4j, an open-source Java utility that is a project of the Apache Software Foundation. The utility has six logging levels:

• FATAL
• ERROR
• WARN
• INFO
• DEBUG
• TRACE

You control the logging mainly through a log4j property file. The contents of the file control the format of the log entries and the amount of logged information. For each OSD application, you point to the log4j property file with the log4jPropertyFileName context parameter in the application's web.xml. For example:
<web-app>
  <context-param>
    <param-name>log4jPropertyFileName</param-name>
    <param-value>WEB-INF/log4j.properties</param-value>
    <description>
      Location of log4j property file for diagnostic logging.
    </description>
  </context-param>
</web-app>

If you omit this parameter, the default location of the property file is /log/OSD.log inside the installed application. For concepts, reference, and configuration details, see this website: http://logging.apache.org/log4j/docs/

Performance tips:

• Disable console output in the log4j.properties file.
• In a production system, do not use DEBUG or INFO levels.
• Limit logging for each class if specific DEBUG or INFO logs are required.

Page logging
For debugging, OSD can write a copy of every VoiceXML page it renders to a local directory. When you enable this feature, OSD creates a directory named pages, and writes the pages there. Page logging adds a substantial load to your computer system. Do not use it during normal operations. When you use it, enable it for one HTTP session (web service) at a time, because the code running in the background is not thread safe.

Note: Do not enable page logging on production systems that are already operating near capacity.

To enable page logging, add the following <filter> and <filter-mapping> elements to your application's web.xml file:
<web-app>
  ...
  <filter>
    <filter-name>PageLog</filter-name>
    <filter-class>
      com.scansoft.osd.servlet.FilterPageLog
    </filter-class>
  </filter>
  <filter-mapping>
    <filter-name>PageLog</filter-name>
    <servlet-name>osdservlet</servlet-name>
  </filter-mapping>
</web-app>

To disable page logging, convert the section to a comment or delete it.

Application logging
Application logging writes information about the dialog flow as it occurs during each session: you write messages with the <log> element, and your subsequent analysis of the logs reveals the performance of the application, its overall success, and the success of its discrete parts. Although OSD automatically writes some application logs, application developers provide the majority of logs for the OSD applications and components they write. By using <log> in the appropriate locations of your xHMI configuration, you control when the application writes messages and the content of those messages (for example, you might indicate the success or failure of a transaction when leaving a dialog).
Log message formats


You can write any text in a log message. For example:
<log>This message is simple text. </log>

However, it is better to write event-based messages so that the logs can merge with those of other Nuance speech products. Each speech product has a different role during sessions, and their logs describe different aspects of runtime events. During a session, the products write logs to different directories and machines. The files are complementary because you can assemble all the logs for analysis by a single tuning tool. Here is the format of an event-based message:
<log>
  EVNT=event-name
  |parameter-name=value
  <!--Insert more parameter/value pairs here-->
</log>

You can use any number of <log> elements and parameter/value pairs. The field EVNT is required; it classifies the event. Log message values can be scripts or constants. Use ECMAScript to provide dynamic values that are only available at runtime. For example, a script can access the SessionFrame:
<log>
  EVNT=OSDInfo|INFO=pizzatopping: <value expr="pizzatopping.best()"/>
</log>

In this example, the expression writes the name of the pizza topping collected from the caller. The name is the first hypothesis in the recognition result.
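The event-based format can be illustrated with a small Java sketch that assembles and reads back a record of the form EVNT=name|PARAM=value. This is only an illustration of the field layout described above; the class and method names are hypothetical, and it assumes that the written record mirrors the body of the <log> element.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: builds and parses "EVNT=name|KEY=value" records. */
final class EventLogLine {

    /** Joins the event name and its parameters with '|' separators. */
    static String format(String event, Map<String, String> params) {
        StringBuilder sb = new StringBuilder("EVNT=").append(event);
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append('|').append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    /** Splits a record into fields; whitespace around keys and values is trimmed. */
    static Map<String, String> parse(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String part : line.split("\\|")) {
            int eq = part.indexOf('=');
            if (eq < 0) {
                continue; // skip fragments that have no key
            }
            fields.put(part.substring(0, eq).trim(), part.substring(eq + 1).trim());
        }
        if (!fields.containsKey("EVNT")) {
            // The EVNT field is required because it classifies the event.
            throw new IllegalArgumentException("EVNT field is required: " + line);
        }
        return fields;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("INFO", "pizzatopping: ham");
        String line = format("OSDInfo", params);
        System.out.println(line);
        System.out.println(parse(line).get("INFO"));
    }
}
```

A parser like this is only useful for ad hoc inspection of logs; the OSI tuning tool remains the intended consumer of the records.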

Scoping of log messages


You can write logs in these locations in the xHMI configuration file:

• <dialog>
• <node>
• <transition>
• <catch>
• <final>

The scope where you insert a <log> element is important. For example, if you log an event at the <dialog> level, the system writes the message (once) when invoking any node inside that dialog. Alternatively, if you log an event inside a <node>, the event is logged only when invoking that node.

Nesting log events


Some events indicate the start or end of a situation. For example, a transaction starts, processing occurs, and the transaction ends. During the processing, you can nest additional log events inside the started event. Conceptually, nested events work like this:
dialog transaction [startA]
  node transaction [startB]
  node transaction [endB]
  node transaction [startC]
  node transaction [endC]
dialog transaction [endA]
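The nesting idea can be checked mechanically. The following Java sketch walks a sequence of start and end events and verifies that each end closes the most recently opened start of the same name. It uses the SWItrxb and SWItrxe event names described later in this chapter; the class name is hypothetical, and strict nesting is an assumption made for the illustration, not a documented OSD requirement.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** Hypothetical checker for start/end pairing of transaction events. */
final class TransactionNesting {

    /**
     * Each record is a two-element array: {event, name}.
     * Returns true when every SWItrxe closes the most recently opened
     * SWItrxb with the same NAME and no transaction remains open.
     */
    static boolean isProperlyNested(List<String[]> records) {
        Deque<String> open = new ArrayDeque<>();
        for (String[] record : records) {
            if ("SWItrxb".equals(record[0])) {
                open.push(record[1]);
            } else if ("SWItrxe".equals(record[0])) {
                if (open.isEmpty() || !open.pop().equals(record[1])) {
                    return false;
                }
            }
            // Other events (for example OSDInfo) do not affect nesting.
        }
        return open.isEmpty();
    }
}
```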

Log events, parameters, and values


Below are the log events used by the OSI tuning tool. By using these events and their associated parameters, you ensure that OSI can analyze and report on application performance in a standard way:
EVNT      Purpose
OSDError  Logs an error situation.
OSDInfo   Logs any general information.
SWIcllr   Categorizes the caller into a user population.
SWIdbrx   Ends a database transaction.
SWIdbtx   Starts a database transaction.
SWItrfr   Starts a transfer.
SWItrxb   Starts a transaction.
SWItrxe   Ends a transaction.

Each log event accepts one or more parameters:


Event     Allowed parameters
OSDError  INFO
OSDInfo   INFO
SWIcllr   CLID, GRP1, GRP2, GRP3, GRP4
SWIdbrx   INFO, NAME, RSLT
SWIdbtx   NAME, SERV
SWItrfr   INFO, NAME, RESN
SWItrxb   NAME
SWItrxe   INFO, NAME, RESN, RSLT

Here are definitions of the parameters:

• CLID: a caller ID. For example, the telephone number of the caller.
• GRP1, GRP2, GRP3, GRP4: a category for the caller.
• INFO: any additional information about the event.
• NAME: the name of the event.
• SERV: the name of a database server.
• RESN: the reason for the RSLT.
• RSLT: the result of the event. The values are:
  • FAIL indicates a failed event, for example, in which all the required information was not collected from the caller due to a recognition or user interface problem.
  • SUCC indicates a successful situation, for example, where all the required information was collected from the caller.
  • UNKN indicates an unknown situation. For example, the caller disconnected or inexplicably requested a transfer to a human agent.
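The two tables above can be turned into a simple consistency check for log messages before deployment. The Java sketch below encodes the allowed parameters per event and the three RSLT values exactly as listed; the class name and the idea of validating messages this way are assumptions for illustration and are not part of OSD.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical validator built from the event and parameter tables. */
final class EventParameterCheck {

    private static final Map<String, Set<String>> ALLOWED = new HashMap<>();

    static {
        allow("OSDError", "INFO");
        allow("OSDInfo", "INFO");
        allow("SWIcllr", "CLID", "GRP1", "GRP2", "GRP3", "GRP4");
        allow("SWIdbrx", "INFO", "NAME", "RSLT");
        allow("SWIdbtx", "NAME", "SERV");
        allow("SWItrfr", "INFO", "NAME", "RESN");
        allow("SWItrxb", "NAME");
        allow("SWItrxe", "INFO", "NAME", "RESN", "RSLT");
    }

    private static final Set<String> RESULTS =
            new HashSet<>(Arrays.asList("FAIL", "SUCC", "UNKN"));

    private static void allow(String event, String... params) {
        ALLOWED.put(event, new HashSet<>(Arrays.asList(params)));
    }

    /**
     * Returns true if the event is known, every parameter is allowed for it,
     * and RSLT (when present) is FAIL, SUCC, or UNKN.
     */
    static boolean isValid(String event, Map<String, String> params) {
        Set<String> allowed = ALLOWED.get(event);
        if (allowed == null || !allowed.containsAll(params.keySet())) {
            return false;
        }
        String result = params.get("RSLT");
        return result == null || RESULTS.contains(result);
    }
}
```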

Generic log events

Use the OSDInfo event for any generic message, for example at the beginning and end of dialogs and nodes, and after successful transfers. This example shows the beginning of a dialog:
<dialog id="Main" root="intro">
  <!-- ... -->
  <log>
    EVNT=OSDInfo|INFO=Gathering user input
  </log>

This example shows the beginning of a node:


<node id="pizzaSize" class="Collection">
  <log>EVNT=OSDInfo|INFO=Node pizzaSize entered.</log>
  <!-- ... -->

Use the OSDError event for any generic error message. For example:
<log>
  EVNT=OSDError|INFO=An error occurred, going to last anchor.
</log>

OSDInfo and OSDError are also useful for certain transfer situations. See Transfers.
Transaction events

One use for log messages is to signal the start and end of application transactions such as the accessing of records in a database or the execution of a group of nodes. A transaction consists of one or more collections from a caller, and can also include database interactions. For example, applications that identify users might define a transaction to include these parts:

• Collect caller ID
• Collect password
• Validate password (database interaction)

To measure the success of an application, your logs must report each transaction. For example, in a banking application users identify themselves, view account balances, transfer currency among accounts, and make payments. The application associates transactions with each of these tasks and tracks them using the <log> element. Because transactions begin and end, their associated log messages must also begin and end. You can start and end transactions anywhere in your xHMI configuration, but in general a <dialog> starts a transaction and a <node> starts a task (or a sub-transaction) within a transaction. Here are recommendations:

• Write start and end events for each <dialog>.
• Write start and end events for each <node> in the <dialog>.

Use SWItrxb at the start of a transaction. For example:


<node id="pizzaSize" class="Collection">
  <log>EVNT=SWItrxb|NAME=get_size_from_user</log>
  <!-- ... -->

Use SWItrxe in the transitions at the end of a transaction. The NAME must match the name of a previous SWItrxb event:
<!--transition-->
  <!--next-->
    <!--target-->
      <log>
        EVNT=SWItrxe|NAME=get_size_from_user|RSLT=SUCC
        |INFO=prompts queued
      </log>
    </target>
  </next>
</transition>

Node transitions

A node transition is always the result of gathering new information (for example, from a recognizer or a database). Depending on the status of the dialog (resulting from the new information applied to its attributes), the transition chooses the next target. For transitions that are transfers, see Transfers on page 67. Generally, a start transaction has already been logged, and the transition needs an end transaction event. Because transitions have the characteristics of an if-then-else decision, you need to log information for each possible outcome, including messages for:

• Each <target> in the node's <transition>
• Each <target> in the dialog's <transition>
• Each <target> in a global <transition>
• Each <target> in a thrown event

Here's an example for successful outcomes. When the transaction for the node is successful (target condition=true), configure this log in the <transition> element:
<log> EVNT=SWItrxe | NAME=get_pizzasize |RSLT=SUCC </log>

When the transaction is a failure, configure the log message in the <final> element. See Final transitions.
Catch handlers

Catch handlers should log the thrown events. The <catch> element enables the application to react to specific situations. Your logs can provide details and frequencies of those situations. Using <log> in a global catch handler (the <catch> inside the <xhmi> element) defines global log messages. OSD processes catch handlers before node transitions. When the system throws an event, and the <catch> has a true transition, the system invokes the target and never visits the node's transition. This behavior ensures that events are caught immediately when they occur.

Final transitions

Use <log> inside <final> to report final status before the application exits, but remember that applications do not visit <final> at the end of every session. For example, events such as failing transactions are good candidates for final logging. The logging is done in these cases:

• OSD calls the node's <final> element when leaving the node without using a node transition. This occurs when there is an exception during execution, and when using a <transition> or <catch> at the dialog or global scope.
• OSD calls the dialog's <final> element when leaving the dialog without using a dialog or node target. This occurs when there is an exception during execution, and when using a global <transition> or <catch>.

Here's an example:
<final>
  <log>
    EVNT=SWItrxe
    |NAME=get_pizzasize
    |RSLT=FAIL
    |RESN=targetmissing
    |INFO=Node left without a target used. User data not complete.
  </log>
</final>

Database interactions

You can log database interactions as individual transactions or nest them in other transactions. Because it is logged as a transaction, a database transaction has a start and end event log. Use SWIdbtx at the beginning of a database transaction. For example:
<node id="Your_databaseAccess_node_name"
      class="com.YourName.nodes.Database">
  <log>EVNT=SWIdbtx|NAME=go_fetch_my_data</log>
  <!-- ... -->

Use SWIdbrx at the end of a database transaction. The NAME must match the name of a previous SWIdbtx event:
<!--transition-->
  <!--next-->
    <!--target-->
      <log>
        EVNT=SWIdbrx
        |NAME=go_fetch_my_data
        |RSLT=SUCC
      </log>
    </target>
  </next>
</transition>

Caller segmentation

You can use logs to group user populations into categories. For example, you can identify particular aspects of a call (its ID number) and the callers (what they say and what they want). You can categorize types of callers, types of products, or the parts of a product a caller wants. To accomplish this, log the SWIcllr event in a <dialog> whenever a collection reveals new information that characterizes the caller. If you write the same information more than once, OSI uses the latest value logged. Log caller information as soon as it becomes available, and log it again if it changes. You do not need to repeat previously logged data that has not changed. This example identifies the user with the session ID from the servlet container, which is stored in a predefined OSD property. The message categorizes the user by the size of pizza ordered and the desired topping:
<log>
  EVNT=SWIcllr
  |CLID=s.best('_sessionId')
  |GRP1=s.best('pizzasize')
  |GRP2=s.best('pizzatopping')
</log>

Below is a more detailed example. GRP1 should be filled with the pizza size and GRP2 should be filled with the pizza topping requested by the caller.

1 At the beginning of the application, the first information logged is the call ID stored in the SessionFrame:

<!--Global scope near the top of the xHMI file-->
<log>EVNT=SWIcllr|CLID=${s.best("__callId")}</log>

In the log file, this might appear as:

EVNT=SWIcllr|CLID=1234

2 Then the application asks the caller about the pizza size. Assume the caller answers "large". The next SWIcllr event fills GRP1 with the answer:

<log>EVNT=SWIcllr|GRP1=s.best('pizzasize')</log>

In the log file, this might appear as follows. Note that the CLID remains known to the system:

EVNT=SWIcllr|CLID=1234|GRP1=large

3 Next, the application asks the caller about the topping. Assume the caller answers "ham". The next SWIcllr event fills GRP2 with the answer:

<log>EVNT=SWIcllr|GRP2=s.best('pizzatopping')</log>

In the log file, this might appear as follows. Again, the previously logged values remain known:

EVNT=SWIcllr|CLID=1234|GRP1=large|GRP2=ham
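The accumulation of caller data across several SWIcllr events can be pictured with the following Java sketch. It keeps the latest value per parameter and renders the merged record, which matches the three steps above. The class is hypothetical and only models the behavior described in this section; it does not reproduce how OSD or OSI implement it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model of caller data accumulated from SWIcllr events. */
final class CallerProfile {

    // Insertion-ordered so that CLID, GRP1, GRP2 appear in the order first logged.
    private final Map<String, String> fields = new LinkedHashMap<>();

    /** Records a parameter; a later value for the same parameter replaces the earlier one. */
    void record(String parameter, String value) {
        fields.put(parameter, value);
    }

    /** Renders everything known so far as one SWIcllr record. */
    String mergedLine() {
        StringBuilder sb = new StringBuilder("EVNT=SWIcllr");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            sb.append('|').append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }
}
```

Recording CLID=1234, then GRP1=large, then GRP2=ham yields the final line shown in step 3; recording GRP1 again with a new value replaces only that field.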
Transfers

Logging transfers is another type of transaction logging. Because a speech application should result in a minimum of transfers, logging them provides critical information for tuning an application. OSD makes three types of transfers:

• Blind transfer
• Silent bridge transfer
• Bridge transfer

It is very important to log the SWItrfr event when the application transfers the caller to another application or to a human operator. This is the last opportunity to add session information, and it enables analysis of how sessions are ending. You can write transfer log messages in either of these locations:

• In the node from which the caller is transferred
• In the Transfer node when the transfer is happening

Use SWItrfr to log the critical part, that is, the actual transfer, inside the <transition> element. Example for blind transfers:
<log>
  EVNT=SWItrfr
  |NAME=blind
  |RESN=caller demanded a blind transfer
  |INFO=Transferring caller blindly
</log>

Example for silent bridge transfers:
<log>
  EVNT=SWItrfr
  |NAME=silent
  |RESN=caller demanded a silent bridge transfer
  |INFO=Transferring caller to %{_transferDestination}
</log>

Example for bridge transfers:
<log>
  EVNT=SWItrfr
  |NAME=bridge
  |RESN=caller demanded an interruptible bridge transfer
  |INFO=Transferring caller to %{_transferDestination}
</log>

Use OSDInfo and OSDError to log transfer events other than the actual transfer. For example:
<transition>
  <target condexpr="s.best('typeOfTransfer')=='hangup'"
          path="#nodeBye">
    <log>
      EVNT=OSDInfo
      |INFO=Caller does not want a transfer. Transferring caller to %{_transferDestination}
    </log>
  </target>
</transition>

Turning application logging on and off


OSD provides a logging web service as a web archive file (osd-osilogger.war). To use the service, update the configuration in its web.xml file and deploy it on an application server. The deployment is the same as for any other web application. OSD uses an EventLogger class to write log messages. The logging is done on the server side, which helps to centralize the location of logged data while minimizing load on client browsers. By default, logging is turned off. You turn it on by setting the OSILogServer property to true in the application's xHMI configuration file. To configure the logger, use the INF/classes/com/nuance/log/config.properties file:
# FILTER CONFIGURATION
#
# filter=XYZ   The filter selection, XYZ, needs to be defined
#              using filter.XYZ.class=...
#
# PLEASE NOTE: use ONE filter only
#
# Configure each filter XYZ as follows:
#
# [REQUIRED] filter.XYZ.class
#   The implementation of com.nuance.log.IFilter
#
# [OPTIONAL] filter.XYZ.logdir
#   The directory to write log files. The default is: logs/
#
# [OPTIONAL] filter.XYZ.filepattern
#   The pattern for generating the logfile. The default pattern
#   uses the sessionid in the file name: sid_%{SESSIONID}.log
#
filter=OSI_SIMPLIFIED
filter.OSI_SIMPLIFIED.class=com.nuance.log.OSISimplifiedFilter
filter.OSI_SIMPLIFIED.logdir=log
filter.OSI_SIMPLIFIED.logmerge=classic
filter.OSI_SIMPLIFIED.maxBackupIndex=100
filter.OSI_SIMPLIFIED.bufferSize=52428800
# Here is the default filepattern
# filter.OSI_SIMPLIFIED.filepattern=coreEvents.log
# Here is the filepattern used by OSD
filter.OSI_SIMPLIFIED.filepattern=http://host:8080/osd-osilogger/log

Chapter 8

OSD administration

Operators can configure OSD applications as web applications in various ways:

• You can copy and edit the global.prop file that resides in the WEB-INF directory of any OSD sample application. The file contains properties that the application can reference in the xHMI configuration file (see the xHMI Reference Guide for details on properties).
• You can edit the web.xml to add certain services as shown in this chapter.
• You can dynamically configure each session of an OSD application by including specific URL parameters in the first request to the application.

Deployment to a web server


It is simple to deploy an OSD application to a web application server. The support for this task depends heavily on which web application server you use. For example, the Tomcat server uses a web-based configuration manager that allows several options:

• Uploading a war file for installation.
• Copying the war file into the webapps directory and restarting the server.
• Automatically unpacking the war file without restarting the server.

If your web application does not use a war file, you can copy the whole directory structure to the web application server for deployment. See your web application server documentation for information about deployment.

Providing an XML parser


You must provide an XML parser in your runtime environment (the application server). The parser must be compatible with JAXP 1.3. OSD installs the Apache Xerces parser for this purpose in the java/lib/ext directory. To use this parser, copy the libraries (xercesImpl.jar and xml-apis.jar) to the appropriate location for your server. For Tomcat, the location is the common/endorsed folder of the Tomcat installation. After you copy the libraries, you might need to restart the server. Then, the OSD samples and all your OSD web applications will use the parser.

Starting a session
The first request of a client to an OSD application creates a session on the server. You can configure the session using URL parameters appended to the request, but you must do this in the initial request and not in subsequent requests. OSD accepts the following parameters in the first request:

• callId
• callerId
• calledId
• session ID (you can define this parameter as described below)

For example, you could send this start request to the sample pizza application:
http://server:port/pizza/osd?xhmiCallerId=123&xhmiCalledId=456&xhmiCallId=789

OSD initializes the session with the parameter values and stores them in predefined attributes named __callerId, __calledId, and __callId. You can access the attributes in the same way as any variable, for example: __callerId.best(). The rendering system uses these properties when rendering markup for your browser. If you replace the OSD rendering system, you can access the properties as _xhmiCallerId, _xhmiCalledId, and _xhmiCallId. To define a session ID, add a global property to the xHMI configuration file:
<property name="_aliasUrlParamUserSessionId" value="SID"/>

Then, the request to the OSD application can contain a SID parameter:
http://server:port/pizza/osd?xhmiCallerId=123&xhmiCalledId=456&xhmiCallId=789&SID=abc

The session ID becomes available to xHMI and the rendering system as follows:
xHMI attributes: __userSessionIdName and __userSessionIdValue
Rendering property: _xhmiUserSessionId (This corresponds to __userSessionIdValue. The name is contained in the property _aliasUrlParamUserSessionId that is set in the xHMI configuration.)

You can rename the callId, callerId, and calledId parameters by adding these global properties to the xHMI configuration file:
<property name="_aliasUrlParamCallerId" value="myCallerId"/>
<property name="_aliasUrlParamCalledId" value="myCalledId"/>
<property name="_aliasUrlParamCallId" value="myCallId"/>

With the redefined parameter names, the example request URL looks like:
http://server:port/pizza/osd?myCallerId=123&myCalledId=456&myCallId=789&mySID=abc
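A client that starts sessions programmatically has to assemble these URLs itself. The following Java sketch builds a start request from a base URL and an ordered map of parameters, so it works with the default names and with renamed ones. The class name is hypothetical, and the host and port in the usage are placeholders; URL-encoding the values is an assumption about what a careful client would do, not a documented OSD requirement.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that assembles the first request to an OSD application. */
final class StartRequest {

    /** Appends the parameters to the base URL as a query string. */
    static String build(String baseUrl, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(baseUrl);
        char separator = '?';
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append(separator)
              .append(encode(e.getKey()))
              .append('=')
              .append(encode(e.getValue()));
            separator = '&';
        }
        return sb.toString();
    }

    private static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("xhmiCallerId", "123");
        params.put("xhmiCalledId", "456");
        params.put("xhmiCallId", "789");
        // "server:8080" is a placeholder for your own host and port.
        System.out.println(build("http://server:8080/pizza/osd", params));
    }
}
```

Because the parameters configure the session only in the initial request, a client would use a URL like this once per call and send later requests without them.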

Operation administration & management (OA&M)


OSD provides an Operations, Administration & Management interface (OA&M) to control installed OSD applications. To use the interface, your servlet container must be JMX compliant. Some web application servers automatically include a JMX configuration (JBoss, for example), while others require manual setup (Tomcat 5, for example). This documentation describes the JMX installation for Tomcat, with general comments for other web application servers. To use a different server, see the server vendor's documentation. To use a management framework that is not JMX compliant, you must implement the framework in a manner similar to the provided OA&M interface. See the OSD Integration Guide for information about integrating different management frameworks.

Using JMX in Tomcat


Using JMX in Tomcat differs slightly from other web application servers. Tomcat does not provide a pre-configured JMX system. You must install a compatible JMX version and implementation for your Tomcat server. We assume Tomcat 5.0.x with an implementation of the JMX API v1.2 and the JMX Remote API v1.0.
Configuring the JMX connector


OSD provides a JMX configurator that you must configure in the web.xml. Doing this allows JMX Management Consoles to connect to the web application servers. Different web application servers do this differently. Configuration requirements:

• Set JMX parameters in web.xml
• Load the configurator class

For Tomcat, set the servlet-listener to add the OA&M MBean. You must do this even if other existing web applications already use the JMX configurator and the web application server is already configured for it. The JMX configurator then detects the new configuration and only adds an MBean.
Set JMX parameters in web.xml

Add the following configuration to the same web.xml that contains the JMXConfigurator listener (described in Load the configurator class on page 75). This example shows default values. If you do not need to change values, you can omit them from the configuration:
<context-param>
  <param-name>jmxProtocol</param-name>
  <param-value>jmxmp</param-value>
</context-param>
<context-param>
  <param-name>jmxHost</param-name>
  <param-value>localhost</param-value>
</context-param>
<context-param>
  <param-name>jmxPort</param-name>
  <param-value>1099</param-value>
</context-param>
<context-param>
  <param-name>jmxUrlPath</param-name>
  <param-value></param-value>
</context-param>
<context-param>
  <param-name>mbeanDomain</param-name>
  <param-value>OSD</param-value>
</context-param>

The protocol, host, port, and URL path parameters form the service URL:

service:jmx:jmxmp://localhost:1099

The mbeanDomain parameter names the MBean in the ObjectName:

OpenSpeech Dialog:MBean=Instrumentation

For details, see the javadoc for the JMXServiceURL and ObjectName classes.
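To see how the four connector parameters combine, the following Java sketch builds the address with the standard javax.management classes named above. JMXServiceURL and ObjectName are part of the JDK; the wrapper class and its method names are hypothetical, and the MBean=Instrumentation key is taken from the example above. The sketch only constructs the names and does not open a connection, which for the jmxmp protocol would need a connector implementation such as the one described for Tomcat.

```java
import java.net.MalformedURLException;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;
import javax.management.remote.JMXServiceURL;

/** Hypothetical helper that mirrors the web.xml JMX parameters. */
final class JmxAddressing {

    /** Combines jmxProtocol, jmxHost, jmxPort, and jmxUrlPath into a service URL. */
    static JMXServiceURL serviceUrl(String protocol, String host, int port, String urlPath) {
        try {
            return new JMXServiceURL(protocol, host, port, urlPath);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Builds the ObjectName of the Instrumentation MBean under the given domain. */
    static ObjectName instrumentationName(String mbeanDomain) {
        try {
            return new ObjectName(mbeanDomain + ":MBean=Instrumentation");
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Values below are the defaults shown in the web.xml example.
        System.out.println(serviceUrl("jmxmp", "localhost", 1099, ""));
        System.out.println(instrumentationName("OSD").getDomain());
    }
}
```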
Load the configurator class

To enable JMX in Tomcat, insert the following lines into the web.xml:
<listener>
  <listener-class>
    com.scansoft.osd.oam.jmx.JMXConfigurator
  </listener-class>
</listener>

This configuration loads the JMXConfigurator class as a web application context listener, and enables the JMX server components.

Note: All OSD web applications share this setting. Any web application that uses this configurator might overwrite settings from previously loaded web applications. We strongly recommend using a single web application to set up JMX.

Managing configuration
After you complete the JMX configuration, an MBean named Instrumentation becomes available to JMX Management consoles (under the mbeanDomain, configured as OSD by default). You can use any JMX management console.

Balancing system loads


When you deploy OSD applications as web applications, you can use a resource management service to balance loads on CPU and memory. OSD does not provide a load balancer, but you can use third-party routers and load balancers. Requirements:

• The load balancer must direct each OSD session to a single server. To accomplish this, the service uses the JSESSIONID cookie to balance loads. The cookie is described in the Java Servlets Specification.
• The VoiceXML browser platform must support and use this cookie for each call when communicating with application servers running OSD applications.

Controlling shutdown and update operations


System operators need to update OSD applications periodically without interrupting service to application users. We recommend that operators use the routing servlet when performing these updates. You can also use the methods described here to support an alternative update process. OSD provides management operations to stop applications from accepting new sessions (graceful shutdown) and to force the end of sessions that exceed a reasonable duration. OSD does not provide complete program logic for application updates; this must be implemented (integrated), probably as a script running on your management console. You can also implement graceful shutdown by blocking new sessions, eventually having zero instances of an application running. Once this status is reached, you can safely update and restart the application without losing sessions. Another mechanism is to kill an application, which immediately terminates all running sessions. This technique should be a last resort, and will result in an error when using any method of the IDialogManagerInvocation interface.
The IOAM interface

The following table shows the management operations accessible through the class IOAM and exposed by the JMX MBean server. For details on the methods, see the OSD Integration Guide and javadoc.
Method                Description
acceptNewSessions     Cancels a previous call to blockNewSessions or kill.
blockNewSessions      Prevents new sessions.
getMaxActiveSessions  Gets the maximum number of sessions allowed.
getNumSessions        Gets the current number of sessions.
getProperties         Gets all context properties for an application.
getStatistics         Gets statistical information about an application.
kill                  Aborts all sessions immediately.
listApplications      Lists the names of all running applications.
sendNotification      Sends a message to the OAM management framework.
setMaxActiveSessions  Sets the maximum number of sessions allowed.
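The graceful shutdown described earlier (block new sessions, wait until no sessions remain, and kill only as a last resort) might be scripted along the following lines. This Java sketch is written against a hypothetical SessionControl interface whose method names echo the table above; the real IOAM signatures are not shown in this guide, so the parameter lists and return types here are assumptions. Check the OSD Integration Guide and javadoc before adapting it.

```java
/** Hypothetical drain-then-update helper for a management script. */
final class GracefulShutdown {

    /**
     * Assumed subset of management operations. The names follow the table
     * above; the signatures are illustrative only.
     */
    interface SessionControl {
        void blockNewSessions();
        int getNumSessions();
        void kill();
    }

    /**
     * Blocks new sessions and polls until none remain.
     * Returns true if the application drained on its own, or false if the
     * poll limit was reached and kill() had to be used.
     */
    static boolean drain(SessionControl app, int maxPolls, long pollMillis) {
        app.blockNewSessions();
        for (int i = 0; i < maxPolls; i++) {
            if (app.getNumSessions() == 0) {
                return true;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        if (app.getNumSessions() == 0) {
            return true;
        }
        app.kill(); // last resort: terminates all running sessions
        return false;
    }
}
```

After a successful drain, the script can update and restart the application; calling acceptNewSessions is only needed if the same instance should resume service.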

OA&M event notifications


Operational event notifications are messages sent to a management framework. They differ from event logging, which records all events: usually only a subset of events is sent as notifications, to avoid information overflow at the console. OSD automatically sends notifications in exceptional situations. In addition, your OSD applications can send notification events. Each event contains a type, a message, and userdata. If you need more fields (for example, sessionid and tenantid), your implementation must add them when writing notification events.

A notification type has the following predefined hierarchy:

• error.application
• error.system
• warn.application
• warn.system
• info.application
• info.system

Your events must conform to this hierarchy. The OSD framework sends error.system, warn.system, and info.system. Your applications can use an API call to send the others. You can also define subtrees of events under error.application, warn.application, and info.application.

The message is a free-format string that contains any information to be displayed at the management console. The userdata field contains key-value pairs delimited by semicolons. For example:
sessionid=42;application=pizza;callerId=0815;calledId=4711;

To send OA&M application events, use the sendNotification method:


sendNotification(String type, String message, String userData)
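
As a minimal sketch of assembling the userdata string, the following Java helper joins key-value pairs with semicolons in the format shown above. The class UserDataBuilder is hypothetical and not part of the OSD API; only the sendNotification(String type, String message, String userData) signature comes from this guide, and the call itself is shown as a comment because it depends on the OSD classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of OSD): formats userdata as "key=value;" pairs.
final class UserDataBuilder {
    // LinkedHashMap keeps the pairs in insertion order.
    private final Map<String, String> pairs = new LinkedHashMap<>();

    UserDataBuilder put(String key, String value) {
        pairs.put(key, value);
        return this;
    }

    // Joins all pairs; every pair, including the last, ends with a semicolon.
    String build() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : pairs.entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String userData = new UserDataBuilder()
                .put("sessionid", "42")
                .put("application", "pizza")
                .build();
        // In an OSD application the string would then be passed on, for example:
        // sendNotification("warn.application", "Order backend is slow", userData);
        System.out.println(userData);
    }
}
```

The sketch does not escape semicolons or equals signs inside values; whether OSD expects any escaping is not stated in this guide.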

OSD adds a timestamp and a prefix of predefined key-value pairs to the userdata. The pairs are taken from the create method of the DialogManager Invocation Interface:

Key          Value
Sessionid    Usually the ID generated by the application server (jsessionid)
appname      The application name
CallerId     The caller's telephone number (if available)
CalledId     The called number (if available)
Called       A unique identifier of the call (if available)

Default error messages

The messages.dtd file contains keys for the predefined error messages. OSD uses the keys to find message text in messages.xml. Here is a list of keys:
Message key                          Message parameters
CONFIGURATION_ERROR                  none
VUIFORWARDMAP_UNDEFINED              {0} name of node
TRANSITION_FAILED                    none
NO_NEW_SESSIONS                      none
SHUTTING_DOWN                        none
CONTEXT_NOT_REGISTERED               {0} name of application
NODE_EXECUTION_FAILED                {0} name of node
PROCESSING_ERROR                     {0} information about failed processing
PROTOCOL_ERROR                       none
PROTOCOL_ERROR_DIALOG_NOT_STARTED    none
PROTOCOL_ERROR_ONLY_HANGUP           none
PROTOCOL_ERROR_SMEX                  none
BROWSER_ERROR_EVENT                  {0} event message

Each message describes a single error. OSD fills the {0} construct at runtime. For a description of how this works, see the javadoc for the ErrorMessages class.
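
The {0} placeholder resembles the syntax of the standard java.text.MessageFormat class. Whether OSD uses MessageFormat internally is an assumption (the ErrorMessages javadoc is the authority); the sketch below only illustrates how such a placeholder can be filled. The class PlaceholderDemo and the node name "hello" are made up for the illustration.

```java
import java.text.MessageFormat;

// Hypothetical demo class: fills {0}-style placeholders with java.text.MessageFormat.
final class PlaceholderDemo {
    // Replaces {0}, {1}, ... in the template with the given arguments.
    static String fill(String template, Object... args) {
        return MessageFormat.format(template, args);
    }

    public static void main(String[] args) {
        String template = "VUIForwardMap undefined in node {0}.";
        System.out.println(fill(template, "hello"));
    }
}
```

Note that MessageFormat treats the single quote as a special character, so message texts containing apostrophes would need them doubled under this assumption.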
Extending error messages

You can define new error messages and use them in any custom nodes you create. (OSD does not use the messages in the predefined nodes.) The DialogManager API has a method for sending the custom error messages from within your Java code.

Extending the error messages is simple:

1 Copy messages.dtd, messages_custom.dtd, and messages.xml from their installation location (in the OSD system folder) to a temporary location for editing. Do not change the originally installed versions of these files.
2 Edit the messages_custom.dtd file to define the new messages. (OSD automatically appends the file to messages.dtd.)
3 Edit the messages.xml file to write the text of the messages.
4 Copy the dtd files to the WEB-INF/dtd folder of every web application that will use the custom error messages.

Example

Here is a fragment of messages.dtd. It shows how the custom dtd is appended:


<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT messages ANY>
<!ELEMENT CONFIGURATION_ERROR (#PCDATA)>
<!ELEMENT VUIFORWARDMAP_UNDEFINED (#PCDATA)>
<!--...more definitions...-->
<!ENTITY % messages_custom.dtd SYSTEM "messages_custom.dtd">
%messages_custom.dtd;

Here is an example of messages_custom.dtd. It shows the key for a new message called HELLO_WORLD:
<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT HELLO_WORLD (#PCDATA)>

Here is an example of messages.xml. It shows the text added for the new message:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE messages SYSTEM "dtd/messages.dtd">
<messages>
  <CONFIGURATION_ERROR>
    Configuration error. Check the appconfig.xhmi for errors.
  </CONFIGURATION_ERROR>
  <VUIFORWARDMAP_UNDEFINED>
    VUIForwardMap undefined in node {0}. Check VUI forward maps in the appconfig.xhmi
  </VUIFORWARDMAP_UNDEFINED>
  <!--...more definitions...-->
  <HELLO_WORLD>Hello World Message</HELLO_WORLD>
</messages>

Your application can return the new message from the ErrorMessages class, which is available through the DialogNode class (see the javadoc).

Localizing OA&M notifications


You can localize the default OA&M notification messages to any language by creating xml files containing your definitions. OSD automatically uses your localized message definitions when you place the files in the WEB-INF folder of the web application (see Create a directory structure on page 138).

You can only localize the OA&M error messages defined by the DialogManager and its components. You cannot use this mechanism to localize values (such as prompts) in the xHMI configuration.

To localize messages, use the following templates for the xml filenames. You can substitute other languages using the same pattern:

messages.xml          Messages in the default language.
messages_en.xml       Messages in English.
messages_en_US.xml    Messages in US English.

When you include the full or partial language code, you can provide any number of files with different language codes. (For example, you could have Spanish files messages_es.xml, messages_es_ES.xml, and messages_es_CO.xml.) At runtime the system chooses messages in this order:

1 From the given locale
2 From the default locale
3 From the messages.xml
4 From the internal defaults
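
The lookup order above can be sketched as a list of candidate file names. This is a hedged illustration, not OSD's implementation: the class MessageFileLookup is hypothetical, and trying the language-only file (for example messages_es.xml) after the language-and-country file is an assumption modeled on common Java resource lookup. The internal defaults (step 4) would apply when none of the candidate files exists.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch of the fallback order; not part of the OSD API.
final class MessageFileLookup {
    // File names for one locale, most specific first
    // (assumed naming: messages_<language>_<COUNTRY>.xml, then messages_<language>.xml).
    static List<String> namesFor(Locale locale) {
        List<String> names = new ArrayList<>();
        String lang = locale.getLanguage();
        String country = locale.getCountry();
        if (!lang.isEmpty()) {
            if (!country.isEmpty()) {
                names.add("messages_" + lang + "_" + country + ".xml");
            }
            names.add("messages_" + lang + ".xml");
        }
        return names;
    }

    // Candidates in lookup order: given locale, default locale, then messages.xml.
    static List<String> candidates(Locale requested, Locale defaultLocale) {
        Set<String> ordered = new LinkedHashSet<>(); // keeps order, drops duplicates
        ordered.addAll(namesFor(requested));
        ordered.addAll(namesFor(defaultLocale));
        ordered.add("messages.xml");
        return new ArrayList<>(ordered);
    }

    public static void main(String[] args) {
        System.out.println(candidates(Locale.forLanguageTag("es-CO"), Locale.US));
    }
}
```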

OSD sends messages from these definitions to the OA&M client in notification messages. Using the localized messages.xml and messages.dtd files, you can overwrite the default messages defined in the ErrorMessages class. Because the default notification messages have built-in defaults, no messages.xml file is needed. OSD provides the dtd in its system folder. You must copy the dtd to every web application that uses custom error messages. (See Extending error messages on page 78.)

Routing calls to the application


This section describes using the routing servlet to route incoming requests to OSD applications.
Background concepts

OSD provides a routing servlet that accepts and forwards requests from voice browsers to OSD web applications. The routing servlet offers an easy way to update web applications and to perform management operations for application installation, update, and removal. The routing servlet lets you map requests to specific web applications using telephone numbers or application names. Importantly, the servlet lets you remove applications from service so they can be updated without interrupting sessions. To remotely manage the routing, the management framework provides a set of management operations. OSD installs an example implementation of a routing application (which uses JMX) in the routing.war that is installed in the samples folder. For additional discussion of the routing servlet, see the OSD Integration Guide.

Registering OSD applications for routing

There are no special steps to register an OSD application for routing. At startup, every OSD application registers itself to the OSD framework and is therefore available for routing. The framework knows which context of an OSD application is the latest (and stores this knowledge in the oamDataDirectory). If a server restarts, the applications are configured to use the latest context when receiving start requests.

Configuring the routing servlet

There are two ways to configure the routing servlet:

• Configure static or dynamic routing.
• Configure the routing servlet dynamically at runtime.

Static routing is configured in the web.xml of the OSD application.


Setting the persistent storage directory

The servlet needs a persistent storage location; you must configure this location in the web application deployment descriptor (web.xml). Do this in only one web.xml, and the setting will be used by every OSD web application. To configure this setting, add the following context parameter to the web.xml:

<context-param>
  <param-name>oamDataDirectory</param-name>
  <param-value>C:\OSDSavedData</param-value>
</context-param>

Setting the initial routing table

The web.xml also configures the initial entry set of the routing table. Use this example as a model to specify the set:
<context-param>
  <param-name>routingTableEntry001</param-name>
  <param-value>123,HelloWorld</param-value>
</context-param>
<context-param>
  <param-name>routingTableEntry017</param-name>
  <param-value>456,HelloWeb</param-value>
</context-param>

Each entry is named routingTableEntryXYZ where XYZ is a unique string or number. Every parameter name that starts with routingTableEntry is an entry for the routing table. In the example, the phone number 123 maps to the HelloWorld application, and 456 maps to HelloWeb. In all entries, the application names must match the <display-name> entry in the web.xml of the web applications.
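
To illustrate the naming rule, here is a minimal sketch that collects every context parameter whose name starts with routingTableEntry and splits its value into a telephone number and an application name. The class RoutingTableParser is hypothetical; it works on a plain map of parameter names to values and does not show how OSD itself reads the web.xml.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: builds a routing table from web.xml context parameters.
final class RoutingTableParser {
    static final String PREFIX = "routingTableEntry";

    // contextParams maps param-name to param-value; other parameters are ignored.
    static Map<String, String> parse(Map<String, String> contextParams) {
        Map<String, String> table = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : contextParams.entrySet()) {
            if (!e.getKey().startsWith(PREFIX)) {
                continue; // not a routing table entry
            }
            // Value format from the example above: "<telephone number>,<application name>"
            String[] parts = e.getValue().split(",", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("Malformed entry " + e.getKey());
            }
            table.put(parts[0].trim(), parts[1].trim());
        }
        return table;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("routingTableEntry001", "123,HelloWorld");
        params.put("routingTableEntry017", "456,HelloWeb");
        System.out.println(parse(params));
    }
}
```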
Enabling the routing servlet

The <servlet> and <servlet-mapping> elements are required in the web.xml for starting the RoutingServlet. These elements accept the parameters described later in this section.
<servlet>
  <servlet-name>routingservlet</servlet-name>
  <servlet-class>com.scansoft.osd.servlet.RoutingServlet</servlet-class>
</servlet>
<servlet-mapping>
  <servlet-name>routingservlet</servlet-name>
  <url-pattern>/router</url-pattern>
</servlet-mapping>

The order of the entries in the web.xml file is significant (and is dictated by the dtd used for the file). Consult the documentation for your web application server for details.
Configuring the web application server

To use the routing servlet, the web application server must allow the routing servlet to access different web contexts. For example, Tomcat requires a configuration such as the following in the server.xml file inside the <host> element:
<DefaultContext reloadable="true" crossContext="true"/>

Static routing

The routing configurator sets up the RoutingServlet using parameters from the web.xml. These parameters include settings for the initial routing table. To enable routing in Tomcat, insert the following lines in the web.xml:
<listener>
  <listener-class>com.scansoft.osd.oam.jmx.RoutingServiceConfigurator</listener-class>
</listener>

This configuration reads the routing table entries from web.xml and sets up the routing table with these values. With static routing, the routing table does not change after this initial setup. With dynamic routing, changes are possible at runtime.
Dynamic routing

For dynamic routing, you must set up JMX as described in Set JMX parameters in web.xml on page 74. After specifying the web.xml entries described there, make the settings needed for Static routing. After you set up dynamic routing, a new MBean is available under the JMX mbeanDomain OSD for changing the routing table at runtime. For information, see Managing configuration on page 75.

Using the routing servlet

To use the configured routing servlet, copy the routing.war file into the web applications folder of your web application server. Sending requests to the routing servlet is similar to sending requests to the OSD servlet. Here is a URL request to the routing servlet for starting a dialog with the pizza application on MyServer port 8080:
http://MyServer:8080/routing/router?application=Pizza

Here is a request using telephone number 123 and a routing table as identifiers:
http://MyServer:8080/routing/router?calledId=123

If the request uses both application name and telephone number, the telephone number takes precedence (assuming the routing table has an appropriate entry).

The returned page carries the real context that handled the forwarded request; this allows all subsequent requests to be sent directly to the serving context instead of the routing servlet. The URL used in the request to the routing servlet is forwarded to the target application without change. Therefore, the start request can carry additional information to configure the OSD application.
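
The precedence rule for requests that carry both identifiers can be sketched as follows. This is an illustration under stated assumptions, not the RoutingServlet code: the class RouteResolver is hypothetical, the routing table is modeled as a map from telephone number to application name, and falling back to the application parameter when the table has no entry is inferred from the phrase "assuming the routing table has an appropriate entry".

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of choosing the target application for a start request.
final class RouteResolver {
    // Returns the application to route to, or null if neither identifier yields one.
    static String resolve(Map<String, String> routingTable, String calledId, String application) {
        if (calledId != null) {
            String byNumber = routingTable.get(calledId);
            if (byNumber != null) {
                return byNumber; // telephone number takes precedence
            }
        }
        return application; // fall back to the application name, which may be null
    }

    public static void main(String[] args) {
        Map<String, String> table = new HashMap<>();
        table.put("123", "HelloWorld");
        // Corresponds to a request such as .../routing/router?calledId=123&application=Pizza
        System.out.println(resolve(table, "123", "Pizza"));
    }
}
```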

Chapter 9

Application development topics

This chapter describes:


• Using skip lists to avoid recognizing specific words
• Dynamic prompts
• Working with dates and times programmatically
• Creating grammars dynamically (at runtime)
• Reading the <config> content of a node
• Extending the application object
• Rendering

Using skip lists to avoid recognizing specific words


A skip list is a list of recognition hypotheses that should be skipped. This feature supports the concept of not asking the same question twice. The purpose is to prevent illogical conversations such as this:
System    What time do you want to go to Boston?
User      Not Boston, BOLTON!
System    Okay, what time do you want to go to Boston?

In the example above, the negative confirmation "Not Boston" is understood, but then "BOLTON!" is misrecognized again as Boston. Users get angry when applications make mistakes like this; skip lists solve the problem.

Key facts about skip lists


• OSD maintains a skip list for each attribute. When a caller rejects an attribute value during a verification step, that value is added to the skip list for that attribute. In other words, any values on the skip list are automatically removed from the recognizer's next results.
• Skip lists are lists of values only. They are attached to single attributes. For example, an attribute named count can have a skip list containing one,two,three so that OSD would never again understand one of those values.
• Skip lists do not contain combined values. When a user rejects more than one slot in a single utterance, the system does not update any skip list. For example, if the user rejects "do you want to go to Boston tomorrow?" the skip lists for the destination and time attributes are not changed.
• Skip lists are driven by the content of the verification candidate list (VCL) and are only used when a verification question is answered negatively. (A verification question means using, for example, <verify yesno="YESNO"/> instead of <understand namelist="YESNO"/> in xHMI.) A negative answer is the value false for the YESNO attribute.
• Unless you clear a skip list value, the system retains it in the attribute until the end of the session. Use these IAttribute methods to clear or change the skip list:
    clearSkipList()
    addToSkipList()
    getSkipList()
    setSkipList(<new skiplist>)
    skip(<new value>)
• Application developers can choose where the system processes skip lists: either by the recognition engine or by OSD. Thus, skip lists are processed either automatically by the OSD framework or within ECMAScript inside speech grammars. When you select processing on the recognizer side, OSD automatically adds the ECMAScript to the grammar. Use these OSD properties to control where processing occurs:
    asrSideSkipList
    serverSideSkipList
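
The effect of a skip list on recognition results can be sketched as a simple filter over an n-best list. The class SkipListFilter below is hypothetical and does not use the IAttribute interface; it only models the behavior described for server-side processing, where skipped values are removed and the remaining hypotheses keep their order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: removes skipped values from a list of recognition hypotheses.
final class SkipListFilter {
    // Returns the hypotheses that are not on the skip list, preserving their order.
    static List<String> filter(List<String> nBest, Set<String> skipList) {
        List<String> kept = new ArrayList<>();
        for (String hypothesis : nBest) {
            if (!skipList.contains(hypothesis)) {
                kept.add(hypothesis);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // The caller has already rejected "Boston" for the destination attribute.
        Set<String> skipList = new HashSet<>(Arrays.asList("Boston"));
        List<String> nBest = Arrays.asList("Boston", "Bolton", "Austin");
        System.out.println(filter(nBest, skipList));
    }
}
```

With this filter, the rejected value can no longer be offered again, so the next-best hypothesis moves to the front.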

When skip list processing occurs

For background information, this figure shows when skip list processing occurs:

Controlling where skip list processing occurs

By default skip lists are processed by OSD (which is known as server-side processing) and disabled on the speech recognizer (which is known as ASR-side processing). This is equivalent to the following setting of the properties serverSideSkipList and asrSideSkipList:
<property-list>
  <property name="serverSideSkipList" value="true"/>
  <property name="asrSideSkipList" value="false"/>
</property-list>

These are the default settings. If you want server-side processing, do not change the defaults. Alternatively, you can process skip lists on the speech recognizer instead of using OSD. Set the parameters as follows:
<property-list>
  <property name="serverSideSkipList" value="false"/>
  <property name="asrSideSkipList" value="true"/>
</property-list>

To disable skip list processing, use the following settings:


<property-list>
  <property name="serverSideSkipList" value="false"/>
  <property name="asrSideSkipList" value="false"/>
</property-list>

You can enable skip list processing on both sides (recognizer and OSD), but this adds load to processing without increasing performance. The default is to process skip lists on the OSD machine. This is recommended for these reasons:

• If you are not using the Nuance Recognizer, and your recognizer does not support ECMAScript in speech grammars that can be activated via parameter grammars, you must keep the default processing location.
• Performance: If your recognition server is already heavily loaded, the additional ECMAScript processing for skip lists might be undesirable. Normally, the additional load is minimal.
• There is a small cost savings (CPU and network activity) to process with OSD. For example, there is less rendering for the voice browser and less data transfer to the recognizer. This is not likely a significant factor for changing the default processing location.

But processing on the recognizer is also useful because it returns more accurate results that are easier to work with. (The recognizer replaces skipped hypotheses with new possibilities and re-adjusts confidence levels, whereas OSD simply removes next-best entries. With OSD, removed slots are not re-filled and confidence scores are not re-adjusted.)
OSD automatically adds homophones to skip lists

If any of the items on the skip list are associated with homophones, OSD automatically adds the homophones to the list. This is done because the default implementation of xHMI works with semantic values, and it could unknowingly prompt with an item that has a different semantic value but the same pronunciation. To avoid this, OSD reads the contents of the skip list, detects homophones, and appends them to the list. Assume the following dialog:

System    What is the person's name?
User      Cager.
System    Meyer, correct?
User      No.
System    Please say the name again.

Above, when the caller says no, OSD has a skip item containing Meyer. Because Meyer has homophones such as Mayer, Meier, and Mayor, OSD also adds these names to the skip list.
Sample skip list grammar

Below is a skip list grammar (as created automatically by OSD):


<?xml version="1.0"?>
<SWIparameter version="1.0" id="set_grammar_script" precedence="1"
    ignore_unknown_parameters="0">
  <parameter name="swirec_grammar_script">
    <value>
      if(typeof(origin) != 'undefined' &amp;&amp; origin == 'London') {SWI_disallow=1;}
    </value>
  </parameter>
</SWIparameter>

Dynamic prompts
To present dynamic content to the caller, the application developer can do either of the following:

• Write the presentation text (the text to be presented to the caller) to an attribute and then use an ECMAScript expression in the <output> configuration.
• Add the output to the StepResponse that is passed to the rendering system.

To describe these alternatives (below), we use the Hello World example (originally presented in the xHMI Reference Guide). In these examples, the prompt text is hard-coded in Java. This is done for simplicity; in a real application, the dynamic prompts would be generated from database content.

Writing prompt text to an attribute


The following xHMI configuration shows the Hello World written so that the greeting prompt depends on the time of day:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xhmi SYSTEM "dtd/xhmi.dtd">
<xhmi root="HelloWorld" xml:lang="en-US" xmlns="http://www.scansoft.com/2004/xhmi">
  <dialog id="HelloWorld" root="hello">
    <var-list>
      <var name="dynamicPrompt" type="attribute" expr="Default prompt"/>
    </var-list>
    <node class="com.scansoft.osd.tutorial.DynamicPrompt1" id="hello">
      <config>
        <output-list>
          <initial>
            <output>
              <value expr="s.best('dynamicPrompt')"/>
            </output>
          </initial>
        </output-list>
      </config>
      <transition>
        <next name="expr">
          <target condexpr="true" path="exit"/>
        </next>
      </transition>
    </node>
  </dialog>
  <catch>
    <target event="error" path="exit">
      <log>An error occurred.</log>
    </target>
  </catch>
  <vuiforward>
    <forward name="_outputExit" path="/outputExit.jsp"/>
    <forward name="_exit" path="/exit.jsp"/>
  </vuiforward>
</xhmi>

Java code for the custom node

package com.scansoft.osd.tutorial;

import java.util.Calendar;
import java.util.GregorianCalendar;

import com.scansoft.xhmi.DialogException;
import com.scansoft.xhmi.IStepRequest;
import com.scansoft.xhmi.IStepResponse;
import com.scansoft.xhmi.nodes.Output;

public class DynamicPrompt1 extends Output {
    public void execute(IStepRequest request, IStepResponse response)
            throws DialogException {
        // calculate prompt text from the current hour (24-hour clock)
        String greetingPrompt = "Hello!";
        Calendar calendar = new GregorianCalendar();
        int hour = calendar.get(Calendar.HOUR_OF_DAY);
        if (hour < 10) {
            greetingPrompt = "Good morning!";
        } else if (hour > 17) {
            greetingPrompt = "Good evening!";
        }
        // write prompt text to the attribute read by the <value> expression
        getSessionFrame().addAttribute("dynamicPrompt", greetingPrompt);
        // let the standard Output node render the configured output
        super.execute(request, response);
    }
}

(An alternative implementation to the above is discussed in Adding output to the StepResponse on page 92.)

Adding output to the StepResponse


In contrast to the preceding example, this xHMI configuration file does not contain an <output> element. Instead the initial output is added to the StepResponse (shown after the configuration):
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xhmi SYSTEM "dtd/xhmi.dtd">
<xhmi root="HelloWorld" xml:lang="en-US" xmlns="http://www.scansoft.com/2004/xhmi">
  <dialog id="HelloWorld" root="hello">
    <node class="com.scansoft.osd.tutorial.DynamicPrompt2" id="hello">
      <transition>
        <next name="expr">
          <target condition="true" path="exit"/>
        </next>
      </transition>
    </node>
  </dialog>
  <catch>
    <target event="error" path="exit">
      <log>An error occurred.</log>
    </target>
  </catch>
  <vuiforward>
    <forward name="_outputExit" path="/outputExit.jsp"/>
    <forward name="_exit" path="/exit.jsp"/>
  </vuiforward>
</xhmi>

The initial output in the StepResponse

package com.scansoft.osd.tutorial;

import java.util.Calendar;
import java.util.GregorianCalendar;

import com.scansoft.osd.config.XInitial;
import com.scansoft.osd.config.XOutput;
import com.scansoft.xhmi.DialogException;
import com.scansoft.xhmi.IStepRequest;
import com.scansoft.xhmi.IStepResponse;
import com.scansoft.xhmi.nodes.Output;

public class DynamicPrompt2 extends Output {
    public void execute(IStepRequest request, IStepResponse response)
            throws DialogException {
        // calculate prompt text from the current hour (24-hour clock)
        String greetingPrompt = "Hello!";
        Calendar calendar = new GregorianCalendar();
        int hour = calendar.get(Calendar.HOUR_OF_DAY);
        if (hour < 10) {
            greetingPrompt = "Good morning!";
        } else if (hour > 17) {
            greetingPrompt = "Good evening!";
        }
        XInitial initial = new XInitial();
        initial.add(new XOutput(greetingPrompt));
        // write output to step response
        response.putInitialOutputList(initial);
        super.execute(request, response);
    }
}

Working with dates and times programmatically


In xHMI, the IDateTime interface defines methods for working with dates and times. It allows interactions with any part of a complete timestamp, which in turn allows applications to work with timestamps in any desired format. It also allows conditionalized expressions based on dates. For example, the application can check whether the current time is morning or evening.

OSD provides a default implementation for this interface. The implementation is available via the com.scansoft.osd.date.DateTime class.

The following sections describe the use of the interface and the default implementation, but these are not the only features available for handling date and time information. Applications can also use the following features:

• Update rules: OSD provides sample date and time update rules for evaluating timestamp information in recognition results, improving the next-best entries, and filling slot attributes in the SessionFrame.
• Attribute facades: When discussing dates and times with users, one challenge is to collect separate slots of timestamp information, and then use them together. For example, the application might allow: date, day, day of week, month, year, time of day, hour of day, and so on. Internally, the application translates these slots into a concise timestamp object. Externally, the application needs to formulate output using any combination of the slots. To simplify the challenge, applications can use attribute facades. For example, date and time attribute facades can detect when timestamps are incomplete (i.e. when slots are missing), and automatically formulate follow-up questions in output to users. To collect each slot, the needed grammars are automatically referenced, and the application does not need to control the chaotic activation and deactivation of individual grammars in the recognizer.

Setting dates and times


The interface IDateTime has two methods to set dates:

set(String date)
set(String date, String format, String assume)

With set, you can specify any part of the date (using the notation described in Details on date and time formats on page 96), and the system will automatically add any parts that you omit. To do this, the system assumes future dates when resolving ambiguities. You can specify dates in the following combinations of fields:
Complete timestamp date time timezone Partial timestamp combinations date time date time timezone date

The three-argument set method provides additional control. The format string allows various timestamp formats with these values:

osd:datetime
vxml:date
vxml:time

The assume string changes the assumptions used when filling in omitted parts of the timestamp. It takes these values:

ASSUMEPAST
ASSUMEFUTURE
ASSUMECLOSEST
ASSUMENOTHING

For example, to set a timestamp in the past (the first day of the current month):
set("--01","osd:datetime","ASSUMEPAST")

Getting dates
Use the get method of IDateTime to retrieve dates in any desired format. You can get complete dates or parts of dates. The signature is:

String get(String format)

For example, the following format string prints a four-digit year (zero padded when only three digits are available), a two-digit month, and so on. This format is similar to the ISO 8601 definition for dates and times:

YYYY-MM-dd HH:mm:ss Z

The format string can contain the following arguments:


Argument    Description
y           One digit of a year; for a four-digit year use yyyy
M           One digit of a month; for a two-digit month use MM
d           One digit of a day; for a two-digit day use dd
h           One digit of a 12hr day; for a two-digit hour use hh
H           One digit of a 24hr day; for a two-digit hour use HH
m           One digit of a minute; for a two-digit minute use mm
s           One digit of a second; for a two-digit second use ss
Z           Gets the +hh:mm timezone designator
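
For comparison only, the same timestamp layout can be produced with the standard java.time API. This sketch does not use IDateTime or the OSD DateTime class, and the pattern letters differ slightly: in java.time, yyyy is the calendar year (YYYY means week-based year) and xxx prints the offset as +hh:mm. The class TimestampFormatDemo is hypothetical.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical comparison sketch using java.time, not the OSD IDateTime interface.
final class TimestampFormatDemo {
    // Layout equivalent to "YYYY-MM-dd HH:mm:ss Z" as described in this guide.
    static final DateTimeFormatter LAYOUT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss xxx", Locale.ROOT);

    static String format(OffsetDateTime timestamp) {
        return LAYOUT.format(timestamp);
    }

    public static void main(String[] args) {
        // 10 February 2005, 10:00:00, one hour ahead of GMT
        OffsetDateTime t = OffsetDateTime.of(2005, 2, 10, 10, 0, 0, 0, ZoneOffset.ofHours(1));
        System.out.println(format(t)); // prints 2005-02-10 10:00:00 +01:00
    }
}
```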

Details on date and time formats


The com.scansoft.osd.date.DateTime class uses the following format for the date portion of a timestamp:

YYYY-MM-dd

You can abbreviate values so long as your abbreviation is not ambiguous when the specification is parsed (from left to right). When your specification omits parts of the date, OSD fills those parts using the current date in a way that the future is assumed. Examples:

Specification    Date
2005-02-10       10 February 2005
2005-02          February 2005
2005             Year 2005
02               Year 02
-02              February
-02-10           10th February
--10             10th day
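
The left-to-right reading of abbreviated dates can be sketched as a split on the hyphen, where an empty field means that the part was omitted. The class DateSpecParser is hypothetical and is not the OSD DateTime parser; it only separates the fields and does not fill omitted parts from the current date or validate ranges.

```java
// Hypothetical sketch of reading an abbreviated date specification.
final class DateSpecParser {
    static final int MISSING = -1;

    // Returns {year, month, day}; omitted fields are MISSING.
    static int[] parse(String spec) {
        // Limit -1 keeps empty fields, so "--10" yields ["", "", "10"].
        String[] parts = spec.split("-", -1);
        if (parts.length > 3) {
            throw new IllegalArgumentException("Too many fields: " + spec);
        }
        int[] fields = {MISSING, MISSING, MISSING};
        for (int i = 0; i < parts.length; i++) {
            if (!parts[i].isEmpty()) {
                fields[i] = Integer.parseInt(parts[i]);
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        int[] f = parse("-02-10"); // month and day given, year omitted
        System.out.println(f[0] + " " + f[1] + " " + f[2]);
    }
}
```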

The com.scansoft.osd.date.DateTime class uses the following format for the time portion of a timestamp:

HH:mm:ss Z

The Z is a timezone designator showing an offset relative to Greenwich Mean Time (GMT). It can take the values -23:59 to +23:59. For example:

2005-02-10 10:00:00 +01:00

The timezone must contain a plus sign (+) or a minus sign (-). The format is:

(+|-)HH:mm

In the timezone format, HH is a two-digit number between 0 and 23, and mm is a two-digit number between 0 and 59.

Times are always based on a 24-hour clock (not 12-hour). You cannot specify am or pm. You can abbreviate values when specifying the time. The class fills abbreviations using zeros:

Specification    Time
13:15            13:15:00 (this is 1:15 pm)
13               13:00:00
:30              00:30:00
::55             00:00:55
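
The zero-filling of abbreviated times can be sketched in a few lines. The class TimeSpecParser is hypothetical and is not the OSD DateTime implementation; it handles only the HH:mm:ss portion (no timezone designator) and rejects values outside the 24-hour clock.

```java
import java.util.Locale;

// Hypothetical sketch of zero-filling an abbreviated time specification.
final class TimeSpecParser {
    // Returns the time as HH:mm:ss; omitted fields become 00.
    static String normalize(String spec) {
        // Limit -1 keeps empty fields, so "::55" yields ["", "", "55"].
        String[] parts = spec.split(":", -1);
        if (parts.length > 3) {
            throw new IllegalArgumentException("Too many fields: " + spec);
        }
        int[] fields = {0, 0, 0}; // hour, minute, second
        for (int i = 0; i < parts.length; i++) {
            if (!parts[i].isEmpty()) {
                fields[i] = Integer.parseInt(parts[i]);
            }
        }
        if (fields[0] < 0 || fields[0] > 23 || fields[1] < 0 || fields[1] > 59
                || fields[2] < 0 || fields[2] > 59) {
            throw new IllegalArgumentException("Invalid time: " + spec);
        }
        return String.format(Locale.ROOT, "%02d:%02d:%02d", fields[0], fields[1], fields[2]);
    }

    public static void main(String[] args) {
        System.out.println(normalize(":30")); // prints 00:30:00
    }
}
```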

For a description of format abbreviations, see Timestamp abbreviations on page 157.


Exceptions for invalid timestamps

The set method rejects invalid dates and times by throwing an exception that states the error. For example, the date 2005-02-40 is invalid (because there is no month having more than 31 days). The smallest possible date is 1-1-1 (the first day of the first month in year 1). Although there is a zero time (00:00:00 is midnight), there is no zero date: a value of 0-0-0 will throw an exception.

Creating grammars dynamically (at runtime)


A dynamic grammar is a grammar created at runtime (during a session) because it depends on previous user input, content of a database, or some other information that is unknown during application development. Dynamic grammars should be small or medium-sized. Large dynamic grammars add load to your system.

Comparison of dynamic and static grammars


Application developers have a choice when designing and building speech grammars: the grammars can be written in advance, perhaps compiled in advance too, and then stored on a server until needed for loading into the recognizer; or they can be written dynamically (just before loading in the recognizer) at the moment they are needed. Pre-written grammars are called static grammars. Runtime grammars are dynamic grammars. (Because dynamic grammars are written into memory as a string, they are known informally as string grammars.)

Benefits: static grammars can be stored in cache (thus saving runtime resources); dynamic grammars can recognize information that is only available at runtime (for example, the geographical location of the caller).

Most applications use a combination of static and dynamic grammars:

• Large vocabularies of known information (item lists, natural language grammars, robust parsing grammars, and so on) are built as static grammars.
• Smaller, customized vocabularies (personalized information that is specific to the session) are built as dynamic grammars (typically, by jsp pages).
• Static and dynamic grammars can be complementary. For example, two grammars can be activated in parallel where the larger grammar is static, and the smaller grammar is dynamically generated to extend the coverage of the recognized speech.

Overview of OSD support


For convenience, the OSD Collection node provides an extension point to create dynamic grammars. Basically, you implement the addAllDynamicGrms and removeDynamicGrms methods in your extended node. See the Example of a dynamic grammar. Your custom dialog node executes the following steps:

1 Create dynamic grammar.
2 Store the grammar.
3 Add and use the grammar.
4 Release unused grammars. This is optional to improve performance.

To store and release grammars, the com.scansoft.xhmi.nodes.DialogNode class provides the storeGrammar and releaseGrammar methods. OSD adds the dynamic grammar to the generated page so it is activated for the next recognition.

Example of a dynamic grammar


This is an example implementation customized to use a simple grammar and a simple mechanism to tell the node which attribute name is to be used for the return value. Please note that the method addAllDynamicGrms is called during node execution and removeDynamicGrms is called when the node is finished on transitioning out using a transition.
public class DynamicGrammarCollection extends Collection {

    final String name = "myDynamicGrammar";

    // the following method is called from DialogNode#execute
    protected void addAllDynamicGrms(IStepResponse response, IUnderstandList ul)
            throws DialogException {
        IPropertyList pl = getLocalProperties();
        String fills = pl.getValue("attributeToFill");
        String grammar =
            "<?xml version=\"1.0\"?>\n"
            + "<grammar version=\"1.0\" xml:lang=\"en-US\" xmlns=\"http://www.w3.org/2001/06/grammar\""
            + " mode=\"voice\" tag-format=\"semantics/1.0\" root=\"SIZE\">\n"
            + " <rule id=\"SIZE\" scope=\"public\">\n"
            + " <one-of>\n"
            + " <item>one<tag>DYNAMICSLOT='one';</tag></item>\n"
            + " <item>two<tag>DYNAMICSLOT='two';</tag></item>\n"
            + " <item>three<tag>DYNAMICSLOT='three';</tag></item>\n"
            + " </one-of>\n"
            + " </rule>\n"
            + "</grammar>";
        // store new grammar
        DynamicGrammarDesc dynamicGrammar =
            storeGrammar(name, grammar, "application/srgs+xml", false);
        dynamicGrammar.addMapping(fills, "DYNAMICSLOT");
        // add grammar to set of used grammars
        addDynamicGrammar(response, ul, dynamicGrammar);
    }

    // the following method is called from DialogNode#transition
    protected void removeDynamicGrms(IStepResponse response) throws DialogException {
        releaseGrammar(name); // release grammar after using it
    }
}
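The example above hard-codes its three grammar items. In practice a dynamic grammar is usually built from data that is only known at run time. The following is a minimal sketch of how such a grammar string could be assembled from a list of phrases. The class SrgsGrammarBuilder and its methods are hypothetical helpers, not part of OSD; the resulting string would be passed to storeGrammar as in the example above.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of OSD): builds an SRGS grammar string from a list of phrases. */
final class SrgsGrammarBuilder {

    private SrgsGrammarBuilder() { }

    /** Escapes the characters that are special in XML text content. */
    static String escapeXml(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Returns a grammar with one public rule whose items all fill the given slot. */
    static String build(String ruleId, String slotName, List<String> phrases) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\"?>\n");
        sb.append("<grammar version=\"1.0\" xml:lang=\"en-US\"")
          .append(" xmlns=\"http://www.w3.org/2001/06/grammar\"")
          .append(" mode=\"voice\" tag-format=\"semantics/1.0\" root=\"")
          .append(ruleId).append("\">\n");
        sb.append(" <rule id=\"").append(ruleId).append("\" scope=\"public\">\n");
        sb.append(" <one-of>\n");
        for (String phrase : phrases) {
            // Escape backslashes and single quotes so the value stays a valid quoted literal.
            String literal = phrase.replace("\\", "\\\\").replace("'", "\\'");
            sb.append(" <item>").append(escapeXml(phrase))
              .append("<tag>").append(slotName).append("='")
              .append(escapeXml(literal)).append("';</tag></item>\n");
        }
        sb.append(" </one-of>\n");
        sb.append(" </rule>\n");
        sb.append("</grammar>");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(build("SIZE", "DYNAMICSLOT", Arrays.asList("one", "two", "three")));
    }
}
```

Keeping the grammar construction in a separate helper keeps addAllDynamicGrms short and makes the generated text easy to check outside the dialog framework.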

Use the node in your xHMI configuration. You must configure the <understand> element with any attribute names that are filled by the dynamic grammars. Otherwise, the system cannot copy recognized slots to attributes:
<node class="com.examples.nodes.DynamicGrammarCollection" id="collect">
    <config>
        <property-list>
            <property name="attributeToFill" value="dummy"/>
        </property-list>
        <output-list>
            <initial>
                <output>please state value for 'dummy'</output>
            </initial>
        </output-list>
        <understand namelist="dummy"/>
    </config>
    <transition>
        <next name="expr">
            <target path="#waiting"/>
        </next>
    </transition>
</node>

Reading the <config> content of a node


Custom nodes might require custom child elements in the <config> element. Let's revisit the Hello World example once more to show the steps required to extend the <config> element. This time we want a list of typed parameters that configure the prompt texts, the hour until which the good-morning prompt is played, and the hour after which the good-evening prompt is played. Parameters can also take default values.

To extend the <config>, perform these steps:

1. Add the elements by extending the DTD.
2. Use the new element in your xHMI configuration.
3. Write classes to access the custom node.

Add the elements by extending the DTD


In xhmi_mai.dtd, add the new, custom element. In the following example, we add <parameter-list> as child to the <config> element of xHMI:
<!-- for stricter dtd check use this-->
<!ELEMENT config (xi:include|osdm?|understand?|verify?|property-list?|output-list?
    |grm-list?|transfer-result?|parameter-list?)*>

Put the definition of the custom element into the custom.dtd file. For example:
<!ELEMENT parameter-list (parameter+)>
<!ELEMENT parameter EMPTY>
<!ATTLIST parameter
    name    NMTOKEN #REQUIRED
    type    NMTOKEN #REQUIRED
    value   CDATA   #IMPLIED
    default CDATA   #IMPLIED
>

Use the new element in your xHMI configuration


The following example shows the new element <parameter-list> in the Hello World application. The DTD now allows multiple greeting prompts that are parameterized by time of day:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xhmi SYSTEM "dtd/xhmi.dtd">
<xhmi root="HelloWorld" xml:lang="en-US"
      xmlns="http://www.scansoft.com/2004/xhmi">
    <dialog id="HelloWorld" root="hello">
        <node class="com.scansoft.osd.tutorial.CustomConfig" id="hello">
            <config>
                <parameter-list>
                    <parameter name="helloPrompt" value="Hello!" type="String"/>
                    <parameter name="morningPrompt" value="Good morning!"
                               type="String" default="Hello"/>
                    <parameter name="eveningPrompt" value="Good evening!"
                               type="String" default="Hello"/>
                    <parameter name="morning" value="10" type="Integer"/> <!-- 10am -->
                    <parameter name="evening" value="17" type="Integer"/> <!-- 5pm -->
                </parameter-list>
            </config>
            <transition>
                <next name="expr">
                    <target condition="true" path="exit"/>
                </next>
            </transition>
        </node>
    </dialog>
    <catch>
        <target event="error" path="exit">
            <log>An error occurred.</log>
        </target>
    </catch>
    <vuiforward>
        <forward name="_outputExit" path="/outputExit.jsp"/>
        <forward name="_exit" path="/exit.jsp"/>
    </vuiforward>
</xhmi>

Write classes to access the custom node


To access the new <parameter-list> element, the custom node must read the custom elements from the DOM. To enable this, you must write classes that implement the IDomElementReader interface. Then, the node can retrieve the custom elements using the DialogNode.getConfig method.

Here is the Java code:

package com.scansoft.osd.tutorial;

import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.HashMap;
import javax.xml.transform.TransformerException;

import org.apache.xpath.XPathAPI;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import com.scansoft.osd.DialogManagerConstants;
import com.scansoft.osd.config.XInitial;
import com.scansoft.osd.config.XOutput;
import com.scansoft.xhmi.DialogException;
import com.scansoft.xhmi.IDomElementReader;
import com.scansoft.xhmi.IStepRequest;
import com.scansoft.xhmi.IStepResponse;
import com.scansoft.xhmi.XMLException;
import com.scansoft.xhmi.nodes.Output;

public class CustomConfig extends Output {

    class XParameterList extends HashMap implements IDomElementReader {
        public static final String XML_ELEMENT_NAME = "parameter-list";
        public static final String TAG_ATTRIBUTE = XParameter.XML_ELEMENT_NAME;

        public String getXMLElementName() {
            return XML_ELEMENT_NAME;
        }

        public void readFromDomElement(Element element, String namespacePrefix,
                Node namespaceNode) throws XMLException {
            String xpath = "./" + namespacePrefix + ":" + TAG_ATTRIBUTE;
            try {
                try {
                    NodeList nodeList = XPathAPI.selectNodeList(
                        element, xpath, namespaceNode);
                    if (nodeList != null) {
                        for (int n = 0; n < nodeList.getLength(); n++) {
                            XParameter param = new XParameter();
                            param.readFromDomElement(
                                (Element) nodeList.item(n),
                                namespacePrefix, namespaceNode);
                            put(param.getName(), param);
                        }
                    }
                } catch (TransformerException e) {
                    throw new DialogException("Cannot find: " + xpath, e);
                }
            } catch (Exception e) {
                throw new XMLException("Cannot get XParameterList", e);
            }
        }
    }

    class XParameter implements IDomElementReader {
        public static final String XML_ELEMENT_NAME = "parameter";
        public static final String XML_ATTRIB_NAME = "name";
        public static final String XML_ATTRIB_TYPE = "type";
        public static final String XML_ATTRIB_VALUE = "value";
        public static final String XML_ATTRIB_DEFAULT = "default";

        private String name_ = null;
        private String type_ = null;
        private Object value_ = null;
        private Object default_ = null;

        public String getXMLElementName() {
            return XML_ELEMENT_NAME;
        }

        public String getAttribute(Node node, String attributeName) {
            NamedNodeMap map = node.getAttributes();
            Node n = map.getNamedItem(attributeName);
            if (n == null) {
                return null;
            } else {
                return n.getNodeValue();
            }
        }

        public void readFromDomElement(Element element, String namespacePrefix,
                Node namespaceNode) throws XMLException {
            name_ = getAttribute(element, XML_ATTRIB_NAME);
            type_ = getAttribute(element, XML_ATTRIB_TYPE);
            if (type_.equalsIgnoreCase("String")) {
                value_ = new String(getAttribute(element, XML_ATTRIB_VALUE));
                if (null != getAttribute(element, XML_ATTRIB_DEFAULT)) {
                    default_ = new String(
                        getAttribute(element, XML_ATTRIB_DEFAULT));
                }
            } else if (type_.equalsIgnoreCase("Integer")) {
                value_ = new Integer(Integer.parseInt(
                    getAttribute(element, XML_ATTRIB_VALUE)));
                if (null != getAttribute(element, XML_ATTRIB_DEFAULT)) {
                    default_ = new Integer(Integer.parseInt(
                        getAttribute(element, XML_ATTRIB_DEFAULT)));
                }
            }
        }

        public String getName() {
            return name_;
        }

        public void setName(String string) {
            name_ = string;
        }

        public Object clone() {
            Object clone = null;
            try {
                clone = super.clone();
            } catch (CloneNotSupportedException e) {
                // should never happen
                // because we have implemented clone,
                // so it is supported
                throw new RuntimeException(e);
            }
            return clone;
        }
    }

    public void execute(IStepRequest request, IStepResponse response)
            throws DialogException {
        XParameterList parameterList = new XParameterList();
        // read parameter list from xHMI
        IDomElementReader reader = getConfig(
            getApplication().getNamespacePrefix() + ":"
                + XParameterList.XML_ELEMENT_NAME,
            parameterList, DialogManagerConstants.SCOPE_NODE, true);

        // access parameters (this sample does not
        // make use of the default value)
        XParameter param = (XParameter) parameterList.get("morning");
        int morning = ((Integer) param.value_).intValue();
        param = (XParameter) parameterList.get("evening");
        int evening = ((Integer) param.value_).intValue();
        param = (XParameter) parameterList.get("morningPrompt");
        String goodMorning = (String) param.value_;
        param = (XParameter) parameterList.get("eveningPrompt");
        String goodEvening = (String) param.value_;
        param = (XParameter) parameterList.get("helloPrompt");
        String hello = (String) param.value_;

        // calculate prompt text
        String greetingPrompt = new String(hello);
        Calendar calendar = new GregorianCalendar();
        int hour = calendar.get(Calendar.HOUR_OF_DAY);
        if (hour < morning) {
            greetingPrompt = goodMorning;
        } else if (hour > evening) {
            greetingPrompt = goodEvening;
        }
        XInitial initial = new XInitial();
        initial.add(new XOutput(greetingPrompt));
        // write output to step response
        response.putInitialOutputList(initial);
        super.execute(request, response);
    }
}
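The prompt selection in the execute method above depends on the system clock, which makes it awkward to verify. The following sketch isolates the same comparison (hour < morning, hour > evening) in a pure function and adds a fallback to the default attribute, which the sample reads but does not use. The class GreetingSelector and both method names are illustrative assumptions, not OSD classes.

```java
/** Hypothetical helper (not part of OSD): the greeting logic of the sample as pure functions. */
final class GreetingSelector {

    private GreetingSelector() { }

    /** Returns the configured value, or the default when no value was configured. */
    static String valueOrDefault(String value, String defaultValue) {
        return value != null ? value : defaultValue;
    }

    /**
     * Mirrors the sample: before the morning hour the morning prompt is used,
     * after the evening hour the evening prompt, otherwise the plain greeting.
     */
    static String select(int hour, int morning, int evening,
                         String hello, String goodMorning, String goodEvening) {
        if (hour < morning) {
            return goodMorning;
        } else if (hour > evening) {
            return goodEvening;
        }
        return hello;
    }

    public static void main(String[] args) {
        for (int hour : new int[] {9, 12, 18}) {
            System.out.println(hour + " -> "
                + select(hour, 10, 17, "Hello!", "Good morning!", "Good evening!"));
        }
    }
}
```

Note that with evening set to 17 the comparison hour > evening switches to the evening prompt at 18:00, not at 17:00.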

Extending the application object


Sometimes it is useful for an application to set up data at load time that is shared by all sessions of the application at run time. You can do this by extending the supplied class com.scansoft.osd.Application and overriding its init method. In init, call super.init first to start OSD's regular initialization. The application class is loaded when the web application starts. To tell the framework about your class, set a parameter in the web.xml of your web application:
<context-param>
    <param-name>ApplicationClassName</param-name>
    <param-value>com.scansoft.osd.MyApplication</param-value>
</context-param>

You can access the application from a node simply by calling getApplication and casting the result to your application class. For example:
MyApplication app = (MyApplication)getApplication();

Do not store any dynamic session data in the application object because the data will be lost in the case of a session failure and re-initialization. Instead, store all dynamic data in the SessionFrame.


Rendering
OSD renders VoiceXML 2.0 pages and sends them to your browser. The rendering system is implemented with jsp pages stored in installDir\voiceXML\jsp. The jsp pages use a JSP tag library described by the xhmi-voicexml.tld descriptor file, which OSD stores in installDir\voiceXML\jsp\WEB-INF. Applications must copy the file to their WEB-INF folder. OSD provides these jsp pages:

- collection.jsp: plays prompt and collects user input
- error.jsp: used internally by OSD
- outputExit.jsp: optionally plays prompts and exits
- outputOSDM.jsp: optionally plays prompts and calls an OSDM
- outputSync.jsp: plays a prompt and triggers VoiceXML generation
- return.jsp: handles the return from a component to the calling application
- root.jsp: used internally by OSD
- root-nocache.jsp: used internally by OSD
- start.jsp: used internally by OSD
- submit.jsp: invokes a transfer to another VoiceXML application
- transfer.jsp: invokes a call transfer

These pages correspond to the predefined <vuiforward> properties _collection, _outputExit, _transfer, plus the vui forward key defined by the class OSDMNode: callOSDM. In addition to the Render Data objects defined by xHMI, OSD uses the following Render Data object:

Key                  Object
RenderOsdmGeneric    OSDMCallDesc

In this table, the key is an entry in the hash map of Render Data objects contained in the step response.

Extending the rendering system


Application developers can extend the rendering system provided with OSD. (Substituting a complete rendering system, for example to adopt a different markup language, is described in the OSD Integration Guide.) For example, you could supply additional jsp pages when your platform exports additional functionality with a VoiceXML <object> element. To use the functionality, an application needs to generate an appropriate VoiceXML page. For example, assume you want to use an object called X, and a call to this object needs two parameters a and b. The following example shows the desired rendering of VoiceXML when the parameter values are 42 and 43 (values chosen arbitrarily for this example):
<object name="X">
    <param name="a" value="42"/>
    <param name="b" value="43"/>
</object>

To extend the rendering system, do the following:

1. Create a custom node.
2. Configure the custom node in xHMI.
3. Change the <vuiforward> map in xHMI.
4. Create a jsp page.

Create a custom node

We need a custom node that sets a vuiforward key to select the new rendering component. We define an arbitrary key (named "callX"). The node's execute method looks like this:
public void execute(IStepRequest request, IStepResponse response)
        throws DialogException {
    setVuiForward(response, "callX");
    response.setCommit(true);
}

The step response must be committed to invoke immediate page rendering.


Configure the custom node in xHMI

The node can set required parameters (a and b in the example scenario) because the runtime framework passes all properties that are visible to a node into the StepResponse as a Render Data object. Therefore, your xHMI configuration defines properties in the <config> section of the custom node. For example:
<config>
    <property-list>
        <property name="a" value="42"/>
        <property name="b" value="43"/>
    </property-list>
</config>

Change the <vuiforward> map in xHMI


You must map the vuiforward key created by the custom node to the jsp page that renders the VoiceXML. Any existing mappings are unchanged (even if they are not used in the new custom node). Assume that the name of the jsp page we want to create is callObjectX.jsp:
<vuiforward>
    <forward name="_collection" path="/collection.jsp"/>
    <forward name="_outputExit" path="/outputExit.jsp"/>
    <forward name="callX" path="/callObjectX.jsp"/>
</vuiforward>

This example assumes the new page resides in the root directory of the web application.

Create a jsp page

In our example scenario, the jsp page needs to render a VoiceXML document that makes a call to the object. It is beyond the scope of this document to explain this process in detail. However, we explain how JSP code can access the properties we need. For Render Data objects that contain properties, the runtime framework uses the key RenderPropertyList. The object returned for the key implements the interface IPropertyList. The Render Data object can be found in a StepResponse object. First, we retrieve this object from the HTTP request, and then we extract the property list. The following JSP fragment shows how to do this:
<%@page contentType="text/xml;charset=UTF-8"
    errorPage="/error.jsp"
    import="com.scansoft.osd.StepResponse"
    import="com.scansoft.xhmi.renderbeans.RenderBeanKeys"
%>
<%
    StepResponse stepResponse = (StepResponse) request.getAttribute(
        RequestAttributeNames.STEP_RESPONSE);
    IPropertyList props = (IPropertyList) stepResponse.get(
        RenderBeanKeys.RENDER_PROPERTY_LIST);
%>
... more VoiceXML here
<object name="X">
    <param name="a" value="<%=props.getValue("a")%>"/>
    <param name="b" value="<%=props.getValue("b")%>"/>
</object>
... more VoiceXML here

Note the use of the getValue function to access a property in the list. For more rendering information, study the jsp pages supplied with the OSD installation.
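The fragment above writes property values directly into XML attributes. If a value can contain characters such as & or a double quote, the generated VoiceXML would no longer be well-formed. The following is a minimal sketch of an escaping helper that a jsp page could call on each value before writing it. The class XmlAttr is a hypothetical utility and is not part of OSD; whether the OSD tag library already offers equivalent escaping is not covered here.

```java
/** Hypothetical helper (not part of OSD): escapes text for use inside an XML attribute value. */
final class XmlAttr {

    private XmlAttr() { }

    /** Replaces the five characters that are unsafe in attribute values; null becomes "". */
    static String escape(String value) {
        if (value == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder(value.length() + 16);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("Tom & \"Jerry\""));
    }
}
```

In a jsp page, the helper would wrap the property access, for example XmlAttr.escape(props.getValue("a")).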

Using custom Render Data objects

The example scenario above shows how to use a system-defined Render Data object to create a custom render component. You can also use arbitrary objects as your own Render Data object. The only requirement is that the key for storing these objects in the StepResponse must not conflict with any of the render keys defined by xHMI (see the appendix of reserved names in the xHMI Reference Guide). To add your object to the response, make a call like this inside the execute function of a node:
response.put("myKey", new MyClass());


Chapter 10

The OSD Datamodel

This chapter describes the model for handling application data in OSD and xHMI, including how to declare and use variables, access data from application components, and define new datatypes.

Overview of variables and data storage


The xHMI configuration uses the <var> element to define variables. OSD provides the necessary runtime classes to make variables available to any ECMAScript or Java code in your application. OSD and xHMI store the following kinds of data:

Recognition data (results from the recognizer). In xHMI, you define this data as attribute variables. For example:
<var name="destination" type="attribute" expr="'London'">
    <param name="temporary" expr="true"/>
    <param name="verified" expr="false"/>
    <param name="homophone" expr="true"/>
</var>

Application data of common datatypes: integers, doubles, booleans, and strings. For example, this includes counters for purposes like retries, no-match recognitions, and how often a node has been entered. You can also store static strings to hold presentation data (such as the application name). In xHMI, you define this data as typed variables. For example:
<var-list>
    <var name="maxIndex" type="int" expr="17" />
    <var name="counter" type="int" expr="maxIndex + 2" />
    <var name="pi" type="double" expr="3.1415" />
    <var name="isValid" type="boolean" expr="true" />
    <var name="version" type="string" expr="'MyApp Version 5'" />
</var-list>

Application data of complex datatypes: you can define Java classes for any complex type, and then define xHMI variables of these types. For example:
<var name="rc" type="class:org.examples.beans.MyComplexType">
    <param name="age" expr="1 billion years"/>
    <param name="message" expr="'hello World'"/>
    <param name="version" expr="1.0"/>
</var>

ECMAScript variables. You can define and use data within ECMAScript, and you can use xHMI variables in ECMAScript expressions. See Accessing variables with ECMAScript.

Datamodel error events


OSD throws error events for datamodel errors just as it would for any error. For a description of events, see <catch> in the xHMI Reference Guide. Applications should always catch the error.datamodel root event or at least the general error event. (Some severe errors, for example errors in xHMI configuration files, throw the general error event. The event message provides details. When enabled, the diagnostic log (osd.log) contains an exception trace.)

Accessing variables with xHMI

Use the <var> element to define variables in your xHMI configuration. Within the <var>, use ECMAScript expressions to assign initial values, or omit the expressions so the variables take default values. For details on <var>, see the xHMI Reference Guide. Here is an integer variable initialized to the number 42:
<var name="myCounter" type="int" expr="42"/>

Here is a recognition attribute variable:
<var name="myAttribute" type="attribute">
    <param name="temporary" expr="true"/>
</var>

Below is an attribute facade variable. This example refers to a fictitious class org.example.MyFacade as an example of a user-defined Java extension to OSD. The class accepts the parameter namelist, which is defined in the extension. The namelist is a string with attribute names separated by white space:
<var name="myFacade" type="facade:org.example.MyFacade">
    <param name="namelist" expr="'myAttribute'"/>
</var>
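An extension class that receives such a namelist has to split it into individual attribute names. The following sketch shows one way to do that so that leading blanks, tabs, and repeated spaces do not produce empty names. NameListParser is a hypothetical helper, not an OSD class.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of OSD): splits a whitespace-separated list of names. */
final class NameListParser {

    private NameListParser() { }

    /** Returns the names in order; a null or blank input yields an empty list. */
    static List<String> parse(String namelist) {
        List<String> names = new ArrayList<>();
        if (namelist == null) {
            return names;
        }
        String trimmed = namelist.trim();
        if (trimmed.isEmpty()) {
            return names;
        }
        // "\\s+" treats any run of whitespace as a single separator.
        for (String name : trimmed.split("\\s+")) {
            names.add(name);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(parse("  myAttribute   otherAttribute\tthird "));
    }
}
```

Splitting on a single whitespace character ("\\s") instead would return empty strings for inputs with repeated or leading whitespace, so trimming first and splitting on runs is the safer choice.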

Accessing variables with ECMAScript


Your ECMAScript expressions can access any variable of any datatype created with the <var> element. (Of course, the expressions can define new variables too.) For example, counter is an integer variable defined by the <var> element:
<script>
    if (counter == 0) {
        // ...
    }
    counter = counter + 1;
</script>

Below, the ECMAScript appears in a conditional target. If myAttribute is defined, the path is taken:
<target condexpr="myAttribute.def()" path="#next"/>

Here is a guard condition. The node is entered if myAttribute is not yet defined:
<node id="n1" class="Collection">
    <guard condexpr="myAttribute.undef()"/>
    ...
</node>

Here is a script that creates the ECMAScript variable z and assigns the top n-best results associated with the xHMI variable myAttribute:
<script>
    var z = myAttribute.best();
    ...
</script>

The following example uses a variable of a complex datatype, called agent. You could create such a variable by declaring the type org.example.beans.MyComplexType with the identifier agent. You must define the variable before using it in a script; for example:
<var name="agent" type="class:org.example.beans.MyComplexType"/>

The example sets three properties of the agent: age, version, and message. (Not shown are the variable declarations, which must be done before the script executes.)
<script>
    agent.age = 39;
    agent.version = 3.0;
    agent.message = 'hello';
</script>

The following example is equivalent to the previous one. It gets the same result with different syntax, using setter methods:
<script>
    agent.setAge(39);
    agent.setVersion(3.0);
    agent.setMessage('hello');
</script>

Passing variables to subdialogs

When your application transitions to a subdialog, you can pass variables to the subdialog as parameters. This example passes two variables (myAttribute and myCounter):
<target path="MySubdialog(myAttribute myCounter)"/>

To receive the parameters, the target dialog defines a parameter list:


<dialog id="MySubdialog" attributes="attrib1 count1" ...>
    <!--...Access variables with ECMAScript...-->
</dialog>

You can pass any <var> of any type, but when using a variable for recognition purposes it must be an attribute datatype. In other words, the following xHMI elements require attribute variables:

<fills name="attrib1"/>
<understand namelist="attrib1"/>
<verify actor="attrib1"/>

The passed variables are available in the target subdialog under the names defined in the dialog's parameter list. Above, myAttribute and myCounter become known as attrib1 and count1. Changes to these local variables also change the values of the higher-scoped variables. For example, setting count1 to 5 in the subdialog results in myCounter=5 in the parent dialog.
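The behavior described above is aliasing: the local name and the parent name refer to the same storage, so an assignment through either name is visible through the other. The following sketch models that idea with plain Java maps. It is only an illustration of the semantics; the classes Cell and Scope are hypothetical and do not reflect how OSD implements its datamodel.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustration only (not OSD code): parameters passed to a subdialog alias parent variables. */
final class ScopeAliasDemo {

    private ScopeAliasDemo() { }

    /** A mutable holder for one variable value, shared between scopes. */
    static final class Cell {
        Object value;
        Cell(Object value) { this.value = value; }
    }

    static final class Scope {
        private final Map<String, Cell> vars = new HashMap<>();

        void declare(String name, Object value) {
            vars.put(name, new Cell(value));
        }

        /** Makes localName in the child scope refer to the same cell as parentName here. */
        void bindInto(Scope child, String parentName, String localName) {
            Cell cell = vars.get(parentName);
            if (cell == null) {
                throw new IllegalArgumentException("Unknown variable: " + parentName);
            }
            child.vars.put(localName, cell);
        }

        Object get(String name) { return vars.get(name).value; }

        void set(String name, Object value) { vars.get(name).value = value; }
    }

    public static void main(String[] args) {
        Scope parent = new Scope();
        parent.declare("myCounter", 42);
        Scope subdialog = new Scope();
        parent.bindInto(subdialog, "myCounter", "count1");
        subdialog.set("count1", 5);
        System.out.println("myCounter in parent: " + parent.get("myCounter"));
    }
}
```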

Writing your own bean


This section shows the implementation of a bean that enables definition of variables such as the following:
<var name="rc" type="class:org.examples.beans.MyComplexType">
    <param name="age" expr="1 billion years"/>
    <param name="message" expr="'hello World'"/>
    <param name="version" expr="1.0"/>
</var>

Here is the implementation:


package org.examples.beans;

public class MyComplexType {
    private int age = 0;
    private String message = "";
    private float version = 0.1f;

    public void setAge(Number age) {
        this.age = age.intValue();
    }

    public Number getAge() {
        return new Integer(this.age);
    }

    public void setVersion(Number version) {
        this.version = version.floatValue();
    }

    public Number getVersion() {
        return new Float(this.version);
    }

    public void setMessage(String message) {
        this.message = message;
    }

    public String getMessage() {
        return this.message;
    }
}

Accessing variables with Java code


This section describes how to access xHMI and ECMAScript data from any Java class created by your application or development platform. You can create classes for any purpose, including:

- Complex datatypes
- Custom nodes
- Attribute facades
- Update rules
- Extensions to the nodes, facades, and update rules provided with OSD

When creating a class, use the IDataModel interface to gain access to the datamodel. In general, this means the following:
IDataModel model = ...; // get the datamodel instance
Object varObj = model.get("myAttribute");
AttributeBean attr = (AttributeBean) varObj;
// use the attribute

Access from a custom node

To get the datamodel from a custom node, extend DialogNode and use the getDataModel method. For example:
public class MyNode extends DialogNode {
    public void execute(IStepRequest request, IStepResponse response) {
        IDataModel model = super.getDataModel(); // the model is accessed
        // now get any data in the model
        ...
    }
}

Above, DialogNode is the base class for all nodes. Use getDataModel to return the current scope of the IDataModel instance. The getDataModel method has the signature:
public IDataModel getDataModel();

Access from an update rule

To write to the datamodel from an update rule, implement the IUpdateRule interface. For simplicity, also implement the IDataModelAccess interface to automatically reference the datamodel in the instantiated rule before the rule is initialized. (For information on update rules, see the xHMI Reference Guide.) For example:
public class MyUpdateRule implements IUpdateRule, IDataModelAccess {
    private IDataModel model = null;
    private String attrName = null;

    // OSD calls this method after you create an instance of this
    // update rule and before it calls the init method.
    public void setDataModel(IDataModel model) {
        this.model = model;
    }

    // Initialize this instance of the update rule, taking a list
    // of attribute names separated by whitespace.
    // This implementation only expects one attribute on the list.
    public void init(String variableNames) throws DialogException {
        this.attrName = variableNames.split("\\s")[0];
    }

    // Define behavior when update rule runs as a pre-update rule.
    public void preUpdate(INbestResult nbestResult, ISessionFrame sf)
            throws DialogException {
        AttributeBean attr = (AttributeBean) model.get(attrName);
        // work with the attribute bean
        ...
    }

    public void postUpdate(INbestResult nbestResult, ISessionFrame sf)
            throws DialogException {
    }
}

Access from an attribute facade

To get the datamodel from an attribute facade, implement the IAttributeFacade interface. For simplicity, also implement the IDataModelAccess interface to automatically reference the datamodel in the instantiated facade before using the facade. For example:
public class MyAttributeFacade implements IAttributeFacade, IDataModelAccess {
    private IDataModel model = null;

    // OSD calls this method after you create an instance of this
    // attribute facade and before the facade is used.
    public void setDataModel(IDataModel model) {
        this.model = model;
    }

    // A user-defined method for this attribute facade.
    public String someFacadeMethod() {
        AttributeBean attr = (AttributeBean) model.get("myAttribute");
        return attr.best();
    }
}

If you extend the base class DefaultAttributeFacade instead, use getDataModel to return the current scope of the IDataModel instance. The getDataModel method has the signature:
public IDataModel getDataModel();

Access from a custom bean

Your JavaBeans must implement the interface IDataModelAccess. OSD uses dependency injection to provide IDataModel references to instances of each bean. For example, this bean implements the interface, sets the model, and gets an xHMI variable named myAttribute:
public class MyComplexType implements IDataModelAccess {
    private IDataModel model = null;

    // OSD calls this method after it creates an instance of this bean.
    public void setDataModel(IDataModel model) {
        this.model = model;
    }

    public String getAttrValue() {
        AttributeBean attr = (AttributeBean) model.get("myAttribute");
        // Return the best choice from the recognition result.
        return attr.best();
    }

    // ... see MyComplexType in Writing your own bean
}

The getAttrValue method is only defined in this class (for bean instances). You can use it in xHMI as follows:
<script>rc.getAttrValue()</script>

ISessionFrame methods
- amb(java.lang.String qname): returns true if the specified attribute's first best value is ambiguous, false otherwise. Returns a boolean.
- best(java.lang.String qname): returns the first best value of an attribute or null if it does not exist in the current node, one of the dialogs on the stack or in global scope. Returns a java.lang.String.
- conf(java.lang.String qname): returns the confidence (0..1.0) of the first best value of an attribute or 0 if it does not exist in the current node, one of the dialogs on the stack or in global scope. Returns a double.
- def(java.lang.String qname): checks whether attribute is defined. Returns a boolean.
- hom(java.lang.String qname): returns true if the specified attribute's first best value is homophone, false otherwise. Returns a boolean.
- nbest(java.lang.String qname, int order): returns the n-th best value of an attribute or null if the attribute does not exist. Returns a java.lang.String.
- nconf(java.lang.String qname, int order): returns the confidence of the n-th best item of an attribute or 0 if the attribute does not exist. Returns a double.
- nsay(java.lang.String qname, int order): returns the representation of the best items in a format that is suitable for output generation. Returns a java.lang.String.
- say(java.lang.String qname): returns the representation of the best items in a format that is suitable for output generation. Returns a java.lang.String.
- undef(java.lang.String qname): checks whether an attribute is not defined. Returns a boolean.
- unver(java.lang.String qname): checks whether attribute is not verified. Returns a boolean.
- ver(java.lang.String qname): checks whether attribute is verified. Returns a boolean.

Differences between a.best() and s.best(a)


The following expressions have the same result:
mySlot.best()
s.best('mySlot')

The differences between the expressions are:

- The expression mySlot.best() uses simpler syntax. If the variable is undefined, the expression throws error.script.execution.
- You can use s.best('mySlot') even when mySlot is not defined (in which case the ISessionFrame instance returns null).
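The difference between the two access styles can be pictured as a lookup through nested scopes that either fails loudly or returns null. The sketch below models this with a list of maps ordered from the innermost scope (the current node) to the outermost (global scope), following the search order described for ISessionFrame.best. It is an illustration under that assumption only; ScopedLookupDemo is not an OSD class, and IllegalStateException merely stands in for the error.script.execution event.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustration only (not OSD code): throwing versus null-returning variable lookup. */
final class ScopedLookupDemo {

    private ScopedLookupDemo() { }

    /** Like s.best('name'): returns null when no scope defines the name. */
    static String bestOrNull(List<Map<String, String>> scopes, String name) {
        for (Map<String, String> scope : scopes) {
            if (scope.containsKey(name)) {
                return scope.get(name);
            }
        }
        return null;
    }

    /** Like name.best(): fails when no scope defines the name. */
    static String bestOrThrow(List<Map<String, String>> scopes, String name) {
        for (Map<String, String> scope : scopes) {
            if (scope.containsKey(name)) {
                return scope.get(name);
            }
        }
        throw new IllegalStateException("error.script.execution: " + name + " is undefined");
    }

    public static void main(String[] args) {
        Map<String, String> node = new HashMap<>();
        Map<String, String> global = new HashMap<>();
        global.put("mySlot", "London");
        List<Map<String, String>> scopes = Arrays.asList(node, global);
        System.out.println(bestOrNull(scopes, "mySlot"));
        System.out.println(bestOrNull(scopes, "otherSlot"));
    }
}
```

The null-returning form suits guard conditions that run before a variable is known to exist; the throwing form surfaces configuration mistakes early.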


AttributeBean methods
Method                Returns    Description
clear()               void       like <clear name/>
amb()                 boolean    like s.amb
best()                String     like s.best
nbest(int)            String     like s.nbest
say()                 String     like s.say
nsay(int)             String     like s.nsay
conf()                double     like s.conf
nconf(int)            double     like s.nconf
def()                 boolean    like s.def
undef()               boolean    like s.undef
ver()                 boolean    like s.ver
unver()               boolean    like s.unver
getAmbiguousCount()   int        like s.amb
skip(String)          void       add value to skip list of this attribute bean, like IAttribute#skip()

Writing a factory
You can define a complex variable in the xHMI configuration, and then use a factory in OSD to instantiate a JavaBean object of the desired class in the datamodel. OSD provides the following predefined factories:

- Class factory
- Facade factory

In addition, you can create factories of your own, and use these factories to create classes conveniently without knowing their concrete type.


When do you need a factory?


You need a factory when the <var> element does not provide the needed datatype. For example:

- If you anticipate creating objects with many parameters, then a factory simplifies and significantly reduces the xHMI configuration.
- If you need access to an external registry like JNDI or Spring, a factory creates a bridge between the registry and the contents of the xHMI datamodel. This is possible because a factory can contain arbitrary Java code that accesses xHMI variables and interacts with the external entity.

Steps for writing a factory


Steps for writing and using factories:

1. Implement the interface IDataElementFactory. See Implementing IDataElementFactory on page 123.
2. Add the new class to the osd-config.xml configuration file. See Configuring factories on page 124.
3. In the xHMI configuration, create variables with the complex datatype pointing to the class. See the example in Overview of variables and data storage on page 113, or see the <var> element in the xHMI Reference Guide.

Implementing IDataElementFactory
To write a factory, implement the interface IDataElementFactory. The interface defines a create method that your implementation must provide to create objects of the desired type. Any JavaBeans or factories that you create must have a default constructor in the implementation. (Other constructors are also allowed.) OSD uses the default constructor to initialize variables defined with <var name="a" type="z"/>. In other words:

- When you do not define expr, OSD uses the default constructor.
- When you define expr, OSD uses the constructor for the type of the expr result. If this constructor is not present, OSD throws the error.datamodel event.

The interface is defined as:


public interface IDataElementFactory {
    /**
     * This method creates an instance based on a specific scheme.
     * For example, the scheme 'class' would call the class loader.
     * This loads the class and creates an instance
     * of the loaded class.
     */
    public Object create(String uri, IDataModel dm);
}

Here is an example implementation of a class factory:


package com.scansoft.osd.datamodel;

public class ClassFactory implements IDataElementFactory {
    public Object create(String uri, IDataModel dm) throws Exception {
        return getClass().getClassLoader().loadClass(uri).newInstance();
    }
}

You can create an instance of the class in xHMI as follows:

<var name="a" type="class:org.example.beans.MyComplexType"/>

The factory lifecycle

The framework instantiates factories when executing their corresponding <var> elements, and then removes them after building the variable. Therefore, your factory implementations must not store data for individual application sessions.
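
As an illustration, the following sketch shows a user-defined factory that builds a java.util.Locale from the scheme-specific part of a type such as locale:en-US. The locale scheme name, the LocaleFactory class, and the two stand-in interfaces are assumptions made so that the sketch compiles on its own; a real factory would implement the OSD interfaces instead. The factory has no fields, in line with the lifecycle rule above.

```java
import java.util.Locale;

// Stand-ins for the OSD interfaces so the sketch compiles on its own (assumption).
interface IDataModel { }

interface IDataElementFactory {
    Object create(String uri, IDataModel dm);
}

// Hypothetical factory for types such as "locale:en-US".
class LocaleFactory implements IDataElementFactory {
    // No instance fields: the framework discards the factory after building the variable.
    public Object create(String uri, IDataModel dm) {
        // uri is assumed to be the scheme-specific part, for example "en-US".
        return Locale.forLanguageTag(uri);
    }
}
```

If such a factory were registered in osd-config.xml under the hypothetical name locale, a variable could be declared as <var name="loc" type="locale:en-US"/>.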

Configuring factories
After creating a new IDataElementFactory class, you must configure it in the /WEB-INF/lib/osd-config.xml configuration file. During runtime initialization, the osd-config.xml configuration file defines the available datatypes and their classes. For example:
<osd-config>
    <var-types>
        <!-- predefined datatypes provided with OSD -->
        <var-type name='int' class='java.lang.Integer'/>
        <var-type name='string' class='java.lang.String'/>
        <var-type name='double' class='java.lang.Double'/>
        <var-type name='boolean' class='java.lang.Boolean'/>
        <var-type name='attribute' class='com.scansoft.osd.datamodel.AttributeBean'/>
    </var-types>
    <var-factories>
        <!-- predefined factories provided with OSD -->
        <var-factory name='class' class='com.scansoft.osd.datamodel.ClassFactory'/>
        <var-factory name='facade' class='com.scansoft.osd.datamodel.FacadeFactory'/>
        <!-- example user-defined factory -->
        <var-factory name='record' class='org.examples.beans.MyComplexType'/>
    </var-factories>
</osd-config>

Above, you can create new types and classes, and insert them into the configuration file. Also, you can replace any class with an implementation of your own. For example, if you implement a variant for integers, you can map int to a class other than java.lang.Integer.

Using the OSD datamodel interface


To access the OSD datamodel, use the interface IDataModel in your Java code. The OSD datamodel serves declared data elements to other modules of the running system, and IDataModel gives access to the data elements stored in it. (See Accessing variables with Java code on page 117.)

Available methods in IDataModel:

- contains(name): Indicates whether the named variable is defined or not.
- get(name): Gets the value of name. Returns null if name does not exist.
- getNames(): Gets the names of all defined variables.
- getType(name): Gets the datatype of name.
- set(name, value): Sets the value of name.
- size(): Gets the number of defined variables.
- store(name, datatype, value): Creates a variable of the given type and stores the value. Once stored, you can update the value with the set method.

In addition, to ensure correct back processing and correct operation of the servlet, your application must also implement the following: Cloneable and Serializable.
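
For unit tests it can help to picture the datamodel as a typed map. The sketch below is a hypothetical stand-in, not the OSD implementation: the class name MapDataModel and all parameter and return types are assumptions, because the text above gives only the method names. It mirrors the described behavior: get returns null for an unknown name, and store creates a variable that set can update later.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical map-backed stand-in for the IDataModel methods (types are assumptions).
class MapDataModel {
    private final Map<String, Object> values = new LinkedHashMap<>();
    private final Map<String, String> types = new LinkedHashMap<>();

    boolean contains(String name) { return values.containsKey(name); }
    Object get(String name) { return values.get(name); }   // null if name does not exist
    Set<String> getNames() { return values.keySet(); }
    String getType(String name) { return types.get(name); }
    int size() { return values.size(); }

    // Creates the variable with its datatype and initial value.
    void store(String name, String datatype, Object value) {
        types.put(name, datatype);
        values.put(name, value);
    }

    // Updates an existing variable; rejecting unknown names is an assumption of this sketch.
    void set(String name, Object value) {
        if (!values.containsKey(name)) {
            throw new IllegalArgumentException("Unknown variable: " + name);
        }
        values.put(name, value);
    }
}
```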

Implementing IDataModelAccess
Applications must implement this interface in any object (such as a bean, update rule, attribute facade, or class factory) that accesses the OSD datamodel and SessionFrame interfaces. In response, OSD automatically injects the datamodel into the object when the object is created.
public interface IDataModelAccess {
    public void setDataModel(IDataModel dm) throws Exception;
}

The injected instance of the datamodel is appropriately scoped, and all variables are available. This includes OSD variables such as __callId, __callerId, and __calledId (as described in the xHMI Reference Guide).
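
The following minimal sketch shows a bean that receives the datamodel through this interface and reads the __callerId variable. The CallInfoBean class and the reduced IDataModel stand-in (only get) are assumptions for illustration; in a deployed application OSD performs the injection.

```java
// Reduced stand-ins for the OSD interfaces so the sketch compiles on its own (assumption).
interface IDataModel {
    Object get(String name);
}

interface IDataModelAccess {
    void setDataModel(IDataModel dm) throws Exception;
}

// Hypothetical bean that reads an OSD variable through the injected datamodel.
class CallInfoBean implements IDataModelAccess {
    private IDataModel dataModel;

    // Called once when the bean is created; the bean only keeps the reference.
    public void setDataModel(IDataModel dm) {
        this.dataModel = dm;
    }

    // Returns the caller ID, or an empty string when the variable has no value.
    public String getCallerId() {
        Object value = dataModel.get("__callerId");
        return value == null ? "" : value.toString();
    }
}
```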

Example user-defined JNDI factory


A convenient way of sharing data among OSD applications is to package the data in a JNDI object and access the data with a JNDI factory in each application. OSD does not provide a JNDI factory, but the following example shows how to create one. JNDI is the Java Naming and Directory Interface provided by Sun Microsystems, Inc. For details, see these links:

http://java.sun.com/products/jndi/
http://java.sun.com/j2se/1.4.2/docs/api/javax/naming/package-summary.html

Create the JNDI factory class:

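import javax.naming.InitialContext;
import javax.naming.NamingException;

public class JNDIFactory implements IDataElementFactory {
    // Matches the create method of IDataElementFactory; uri is the name to look up.
    public Object create(String uri, IDataModel dm) {
        try {
            // Create the context per call; factories must not keep state.
            return new InitialContext().lookup(uri);
        } catch (NamingException e) {
            throw new RuntimeException("JNDI lookup failed for " + uri, e);
        }
    }
}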

Add the factory to the osd-config.xml configuration file:

<osd-config>
    <var-types>
        ...
    </var-types>
    <var-factories>
        ...
        <var-factory name="jndi" class="com.scansoft.osd.datamodel.JNDIFactory"/>
        ...
    </var-factories>
</osd-config>

Defining a JNDI bean in xHMI:


<var name="myBean" type="jndi:weatherforecast"/>


Above, the example gets weatherforecast from the JNDI local context, which must contain an instance of the org.examples.jndi.WeatherService class. Then, you can use myBean to access the methods of the class as follows (we assume the class implements a temperature method):

<script>
    var t = myBean.temperature();
</script>


Chapter 11

FAQ

This chapter provides answers to typical questions that might arise.

Evaluating variables and making logic decisions


Q: Can I create a node that doesn't do any prompting or recognition but only tests a variable and branches accordingly?

A: There are two ways to do that:

a. Write a Java node with an empty execute function and a transition function like this:

public String transition(StepResponse response) throws DialogException {
    return ("expr");
}

In the xHMI <transition>, use the "condexpr" attribute of <target> to write conditions using ECMAScript. This approach has the advantage that you need only one Java class (independent of the conditions), because the conditions are expressed in xHMI. The disadvantage is that the expressions can sometimes become too complicated to write and debug in ECMAScript.

b. Write a Java node with an empty execute function and a transition function that checks your condition:


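public String transition(StepResponse response) throws DialogException {
    // Read the value once; best() returns null when the variable is not defined.
    String value = getSessionFrame().best("varname");
    if ("G".equals(value)) {
        return "good";
    } else if ("B".equals(value)) {
        return "bad";
    }
    // Fallback so that every path returns a value; "other" is a placeholder name.
    return "other";
}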

In the <transition> you can branch on the transition property:


<transition>
    <next name="good">
        <target path="#node_g"/>
    </next>
    <next name="bad">
        <target path="#node_b"/>
    </next>
</transition>

Q: How can I specify branching conditions in XML when a node always returns the same transition property?

A: You can use more than one <target> element within a <next> element, each with a condition. See the xHMI Reference Guide for details.

Accessing variables in ECMAScript expressions


Q: How can I access dialog variables in ECMAScript expressions used in xHMI? Can I have my own objects there?

A: OSD does not give application developers direct access to the ECMAScript context. Instead, all data of an application must be stored in the SessionFrame. This is required so that the application state can be made persistent for session fail-over. The SessionFrame can be accessed with the ECMAScript variable name "s". In the Java code of a node, it can be accessed with getSessionFrame. The variables stored in the SessionFrame are of class type "Attribute". An Attribute has next-best lists of strings together with their confidences. The SessionFrame object has convenience methods that allow a shorthand notation for access to the best value of an Attribute or its confidence. For example:
s.best('varname')

gives you the value of the first best attribute. This is equivalent to:
s.get('varname').getFirstBest().getText()

You can also put arbitrary objects into the SessionFrame with getObject and putObject. These are stored in a hashtable. In ECMAScript, no cast is needed to access them:
s.getObject('myname').getSomeMember()

Declaring custom classes in xHMI


Q: Can I declare custom classes in xHMI and use them as dialog session data?

A: No, you can only declare attributes. These correspond to the com.scansoft.xhmi.IAttribute interface.

Creating attributes via Java


Q: Can attributes be added from the Java code?

A: Yes, they can. Use the addAttribute and assignAttribute methods of class
com.scansoft.xhmi.ISessionFrame

To access the SessionFrame, call the getSessionFrame method. The following method adds a new attribute and assigns a value to it:
public void addAttribute(String qname, String value)

Configuring OSDM parameters dynamically


Q: How can I set a parameter for an OSDM call dynamically?

A: You can either use an ECMAScript expression in the value of a <property> or do it in Java. To do it in Java, create a new node that extends com.scansoft.osd.nodes.OSDM and override the following method:

updateProperties(OSDMCallDesc osdmCallDesc)

In this method, you can use getProperty and putProperty of OSDMCallDesc to change properties.

Timing of updates in the SessionFrame


Q: When does the update of the attributes in the SessionFrame occur?

A: Short answer: After the call to execute and before the call to transition.

In more detail: A node renders a VoiceXML page via a JSP; the VoiceXML browser executes the page, gathers input from the caller, and submits the recognition result back to the server. The HTTP request that reaches the server is transformed into a StepRequest for the DialogManager. The DialogManager reads the semantic attributes in the StepRequest and uses them to update the SessionFrame. After the status has been updated, the framework checks whether an event was raised and calls a <catch> handler if a matching handler is present. The framework then calls the transition function of the current node. With the result of the transition function and the content of the global and node-specific transition maps in the XML, the framework decides which node to call next.

Changing the provided rendering jsp


Q: Can I change the provided rendering jsp?

A: The OSD rendering engine uses jsp pages to create dynamic VoiceXML pages. One typical question from application developers working with OSD 1.0 is whether they can change the provided jsp pages. The answer is this: the jsp pages supplied should be sufficient for any type of application. Although it is possible for application developers to create custom jsp pages (for reasons described below), we strongly recommend that you do not adapt the provided jsp pages:

- Your custom pages will not benefit from feature updates and bug fixes in future OSD releases.
- Your custom jsp pages are required to support OSI logging. This is an error-prone task that easily leads to a situation where problems occur only very late in the development process (for example, only when logged data is analyzed).

In rare circumstances, application developers might want to create their own pages. Example reasons:

- To use a VoiceXML object tag to trigger non-VoiceXML functions of the browser platform.
- To use custom ECMAScript on the generated page.
- To generate different mark-up such as SALT.

Recognizing long utterances with robust parsing


Q: My application uses a robust parsing grammar, but the recognizer seems to reject all long utterances. What is wrong?

A: With robust parsing grammars and OSR versions 3.0.3 or 3.0.4, you should set the reserved property _confidenceForMatch to 0.0. Otherwise, your VoiceXML browser might always reject long utterances.

Background information: often, the recognizer assigns a sentence confidence of 0.0 or just above 0.0 to long utterances. If the application specifies a _confidenceForMatch threshold, then these sentences will have a confidence that is below the threshold. If you do not have access to the slot confidences (OSR 3.0.3), OSD automatically uses the sentence confidence as confidence for each of the slots. In that case you also need to set _confidenceForDefined to 0.0.


Chapter 12

Getting started with development

An xHMI speech or multi-modal application typically consists of one or more xHMI files, a web deployment descriptor file (web.xml), grammar files, audio files, and Java files for custom nodes and backend logic. It is beyond the scope of this document to describe the development process for such an application in detail. However, it is assumed that the development process roughly follows the process in the picture below. Usually such a process is iterative: a version of the application is created with a subset of the functionality, and it is then tested and refined repeatedly until the application meets all test criteria.


Speech application development lifecycle


We assume a simple, iterative development lifecycle consisting of Design, Development, and Deployment phases. Speech applications also require tuning iterations during deployment. The following figure shows the lifecycle in detail:

In this document, we focus on the aspects of the lifecycle that are closely related to xHMI. (These are depicted in the white boxes).

Application design
Application design includes the specification of every possible interaction with end-users: the prompts, the speech grammar coverage, and the expected recognition results. All these specifications are reflected in the application's xHMI configuration.

At the highest level of organization, the xHMI configuration consists of dialogs and nodes: major branches of the application are represented by <dialog> elements; individual interactions are represented by <node> elements.

The callflow (as embodied in dialogs and nodes) can be highly conditionalized. That is, the nodes that are visited and the prompts that are played can change depending on the current state of the session with the caller. This is a key strength of OSD applications; they are not limited to linear (directed-dialog) callflow models. Instead, OSD applications are most versatile when employed for information-driven or state-driven models.

For example, you might design a directed dialog as the default callflow:
System: Do you want ice cream or pizza?
User: Ice cream.
System: Cone or cup?
User: Cup.
System: Large, medium, or small?
User: Small.

The default callflow can be conditionalized (in the xHMI configuration instead of the application code) depending on the data collected (an information-driven dialog):
System: Do you want ice cream or pizza?
User: Ice cream cone.
System: Large, medium, or small?

OSD relies on a design that defines the individual pieces of information needed, that collects the pieces, and that changes depending on which pieces have been collected. For example, the previous example might have evolved as follows:
System: Do you want ice cream or pizza?
User: Small ice cream.
System: Cone or cup?

Design the callflow


The UI designer specifies the conversational flow (the callflow) of user sessions. This includes the introductory prompts, the possible main branches of the callflow, the collection of information, the sequence of that collection, and the expected vocabularies to be verified. The following list shows rules of thumb when designing application callflows. The terms in parentheses show the xHMI configuration elements associated with these User Interface (UI) design tasks:

- Define the branches of the application, the tasks to be accomplished in each branch, and the primitive activities that comprise each task.
    - Each branch of the application is a single, self-contained dialog (<dialog>).
    - Each primitive activity in the callflow is a node (<node>). Each node collects information, sends information, interacts with a database, or calls an object such as a SpeechPAK or an OpenSpeech DialogModule (OSDM).
  One goal is to determine the optimal size of a task by balancing tasks that are small enough to be generic and reusable yet large enough to perform a meaningful transaction.
- Define the root dialog of the application, and the root nodes of each dialog:
    - Define a root dialog, the first dialog that runs at the start of every session.
    - Define a root node inside each dialog, the first node that runs in the dialog.
- Specify the transitions (<transition>) between the nodes. A transition happens when all callflow activity is completed inside the current dialog and node.
    - Define the possible states (success, failure, and so on) at the end of each node.
    - Define the possible targets (<target>) for the next dialog or node.
    - Define the conditions for choosing the correct target when transitioning from the current node or dialog.
- Define how error situations are handled (<catch>). Consider an operator fallback.
- Define the information collected by each node, and the likely vocabulary words used by callers. Specify a label for each piece of data (<var>).
- Define an OpenSpeech Insight (OSI) transaction for each high-level application branch, mid-level application task, and detailed-level node. To prove that the callflow meets application objectives, there should be at least one transaction for each application requirement. Transactions are logged (<log>), added to OSI databases, and used for generating reports (for example, transaction success rates) and tuning (for example, locating and correcting dialogs and nodes with high failure rates).

Design the prompts and speech grammars


For each node, specify the initial prompt, retry prompts, success and failure prompts, and any other expected prompts. Whenever the application collects information, specify the utterances that you expect to collect from callers. Categorize the vocabulary as follows:

- Globally recognized (for example, for commands and shortcuts).
- Statically recognized (the expected utterances are known in advance and the speech grammar can be written before the runtime session begins).
- Dynamically recognized (the expected utterances are identified during the session and the speech grammar must be written at runtime by the application).

You can write highly-constrained or minimally-constrained speech grammars. This includes robust parsing grammars (grammars that distinguish meaningful phrases within longer utterances) and natural language grammars (grammars that categorize each utterance based on large samplings of possible sentences and their intended meanings).


Application development
Steps for coding an xHMI application:

1. Create a directory structure.
2. Configure the application (create xHMI files).
3. Test the application callflow.
4. Implement grammars. (Not described in this guide.)
5. Create recordings. (Not described in this guide.)

Create a directory structure


For new xHMI applications, we recommend a directory structure as follows:
myapp
    audio
    grammars
    log
    xhmi
        app
        dtd
        inc
    WEB-INF
        lib
        src
        test
            src

Here are details:

- myapp: Storage of jsp pages (or links to the pages). The jsp pages from install-dir\voiceXML\jsp need to be copied (or linked) into the application root directory. If you use Ant as build tool, the Ant build.xml goes into this directory.
- myapp/audio: Contains waveforms (caller utterances) collected by the application.
- myapp/grammars: Contains speech and DTMF grammars.
- myapp/log: Contains application-dependent log files. Use this directory for diagnostic log files created by log4j. The log4j.properties file supplied in the /samples directory shows how to set the log file location. For more information, see Diagnostic logging on page 58.
- myapp/xhmi/app: Contains appconfig.xhmi (the xHMI configuration file) and global.prop (global properties).
- myapp/xhmi/dtd: Contains a copy (or link) to the xHMI DTD file.
- myapp/xhmi/inc: Contains xHMI include files.
- myapp/WEB-INF: Contains files needed by the application including classfiles, libraries, taglib tld files, and files for any custom nodes such as a database access node. Some specific files in the directory:
    - log4j.properties: The log4j configuration properties.
    - messages.xml: Localized alarm messages in different languages.
    - web.xml: The web application configuration file.
    - A copy (or link) of the tag descriptor file: install-dir\voiceXML\jsp\WEB-INF\xhmi-voicexml.tld
- myapp/WEB-INF/lib: Contains copies (or links) to all OSD jar files. The files are located in the following directories:
    install-dir\lib
    install-dir\Shared\java\lib\ext
- myapp/WEB-INF/src: The source code for any customization needed by the application (for example, for a custom database access node).
- myapp/WEB-INF/test: Contains all code for unit tests (source code, dtds, data, etc.). See Testing on page 143.
- myapp/WEB-INF/test/src: Contains only the source code for unit test classes. See Testing on page 143.
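
The recommended layout can be created by hand or scripted. The following sketch creates the empty directory skeleton with the standard java.nio.file API. The class name AppSkeleton and the default target directory are assumptions, and the sketch does not copy any jsp, jar, or DTD files.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper that creates the recommended directory skeleton.
class AppSkeleton {
    static final String[] DIRS = {
        "audio", "grammars", "log",
        "xhmi/app", "xhmi/dtd", "xhmi/inc",
        "WEB-INF/lib", "WEB-INF/src", "WEB-INF/test/src"
    };

    // Creates all directories below appRoot; existing directories are left untouched.
    static void create(Path appRoot) {
        try {
            for (String dir : DIRS) {
                Files.createDirectories(appRoot.resolve(dir));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Uses the first argument as application root, or "myapp" by default.
        create(Paths.get(args.length > 0 ? args[0] : "myapp"));
    }
}
```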

Configure the application (create xHMI files)


You can use any text editor to create xHMI files. Because the syntax is XML, you can minimize application load errors by using an XML editor that supports DTD or W3C Schema validation.
Validating with a DTD

The DTD (Document Type Definition) for xHMI configurations can be found in <installdir>/system/xhmi.dtd. It consists of two entities: xhmi_main.dtd and custom.dtd. We recommend that you copy these three files into a directory relative to the location of your xHMI file. In your application's xHMI file, you can then use a DOCTYPE declaration such as the following:

<!DOCTYPE xhmi SYSTEM "dtd/xhmi.dtd">


The DTD can be extended by declaring new elements in the custom.dtd file. A typical use case is when a node requires a new XML element inside <config>. The DTD for these elements accepts "any" element, so declaring them in custom.dtd will suffice.
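
If your editor does not validate, a small Java check can catch DTD violations before the application is loaded. The sketch below uses the standard JAXP DocumentBuilderFactory with validation switched on; the class name DtdCheck is an assumption. Passing a file lets the parser resolve a relative system identifier such as dtd/xhmi.dtd against the location of the xHMI file.

```java
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical helper: returns true when the document is valid against its DOCTYPE.
class DtdCheck {
    static boolean isValid(InputSource source) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);   // validate against the declared DTD
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Turn validation errors into exceptions; by default they are only reported.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXParseException { throw e; }
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            builder.parse(source);
            return true;
        } catch (Exception e) {
            System.err.println("Validation failed: " + e.getMessage());
            return false;
        }
    }

    static boolean isValid(File xhmiFile) {
        // The file URI is the base for resolving the relative DTD path.
        return isValid(new InputSource(xhmiFile.toURI().toString()));
    }
}
```

A build or unit test could call DtdCheck.isValid(new File("xhmi/app/appconfig.xhmi")) and fail when the result is false.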
Validating with a W3C schema

A mechanism similar to the DTD is provided for schema validation. The schema files (*.xsd) can also be found in <installdir>/system. For schema validation, declare attributes in the xHMI root element (instead of using DOCTYPE) as shown in the following example:

<xhmi root="Main"
      xmlns:xi="http://www.w3.org/2001/XInclude"
      xml:lang="en-US"
      xmlns="http://www.scansoft.com/2004/xhmi"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.scansoft.com/2004/xhmi xhmi.xsd">

Implementing custom nodes

If the application needs to access backend logic, such as a transactional system or a database, custom nodes are used to implement the desired behavior. Sometimes an application developer might also choose to implement decision logic in Java instead of building the logic with ECMAScript expressions in xHMI. Any Java IDE, such as Eclipse, can be used for this step. See also Application development topics on page 85.

General activities
Design the behavior of the application, and divide the main user tasks into separate modules. In xHMI, use a separate <dialog> for each module, and within each <dialog>, use a <node> for each phase of interaction with the user. For example, a corporate speech application might have these modules:

- greetings and security checks
- corporate information and promotions
- employee directory
- technical support
- good-byes and follow-up

Each module has a node for each subtask. For example, greetings and security checks might have nodes for public greeting, internal greeting, and security password. Within each <node>, define the needed configuration, prompts, slots, grammars, and transitions. To re-use definitions, define them at the dialog or global scopes.


Test the application callflow


An advantage of the xHMI and OSD architecture is the ability to test the callflow of an application before completing the grammars and prompts. These tests can be done as a test against a Java API (see Testing on page 143).

Application deployment and tuning


Once all parts of an application are completed, the application can be deployed to a web server. See Deployment to a web server on page 71 for details.


Chapter 13

Testing

Testing can be divided into several sub processes:

- grammar testing: testing that the grammar delivers the expected semantic results with a variety of different utterances
- call-flow test: testing that the application behaves in the way expected with a variety of semantic inputs
- backend interface test: testing the access to a backend system
- system test: testing the complete system, usually by making telephony calls
- acceptance test: testing the system at the customer's premises

A speech application test is often executed by making phone calls into a live system. The system comprises a number of complex sub-systems, such as the telephony switch, a VoiceXML gateway with a browser, an application server, and a backend system such as a database. Testing an application under development in this way can be cumbersome: the application will show errors, and after each fix the process of deploying and calling has to be repeated. With OSD you simplify the process by executing the call-flow tests at a Java API level, without the need to deploy the application. In this way it is possible to create automatic regression tests for an application.


From the xHMI architecture, recall the separation of the Front-end Controller and the DialogManager:
[Figure: xHMI architecture. Labels: *.xHMI, Application, Session Frame (update), CONTROLLER with Front Controller and Dialog Manager joined by the execute call (a), Dialog Node, and MODEL with Render Data Objects (create).]

At the IDialogManagerInvocation interface (a), many aspects of an application can be examined. We will use this interface for callflow testing:
[Figure: the Tester calls the Dialog Manager through interface (a).]

The general flow of such a test is the following:

- Create and initialize an application object. The xHMI file is read during initialization.
- Create a DialogManager, passing in the application object.
- Make a start request by calling the nextStep method.
- Test that the StepResponse contains the expected Render Data objects and test their content.
- Set up the semantic input, that is, the results that would have been created by the recognizer.
- Make a further request supplying the results.

During execution of the nextStep method, the DialogManager executes one or several nodes. If one of these nodes accesses the backend system, this access is included in call-flow testing.

The StepResponse object contains the Render Data Objects that would have been used to render the markup in a deployed system. Information that can be retrieved from these objects includes, for example, a list of activated grammars and a list of prompts. The OSD library contains a support class that facilitates access to these objects. See class com.scansoft.osd.test.TstSupport.

A test program can also access the SessionFrame object with:

ISessionFrame sessionFrame = dialogManager.getSessionFrame();

From the SessionFrame, the current dialog and node names can be read as well as the SessionFrame attributes and user objects.

The following is a small example of what such a test looks like. You can find a more comprehensive example in a subfolder of the pizza sample application:

\samples\pizza\test\src\com\scansoft\osd\samples\pizza\TestPizzaOrder

Both examples use JUnit, a popular unit test framework (see www.junit.org). However, any other suitable test software can also be used.

public void testFirstInitialOutput() {
    try {
        String configFileName = appRoot + "/output.xhmi";
        Application app = new Application();
        app.init(appRoot, configFileName);
        DialogManager dm = new DialogManager(app, "ABC", "usrid", "1235",
                "callerId", "calledId", "callId", null);
        dm.init();
        StepRequest request = new StepRequest();
        StepResponse response = dm.nextStep(request);
        assertEquals("__exitDialog", dm.getSessionFrame().getCurrentDialogName());
        assertEquals("__exitNode", dm.getSessionFrame().getCurrentNodeName());
        assertEquals(2, TstSupport.getNumberOfQueuedOutputs(response));
        assertEquals("Hello World from Nuance", TstSupport.getFirstQueuedOutput(response));
        assertEquals("GoodBye", TstSupport.getQueuedOutput(response, 1));
        assertTrue(dm.isTerminated());
    } catch (Exception e) {
        fail(e.getMessage());
    }
}


Appendix A

Predefined properties

This appendix lists property names that are predefined by OSD.

Miscellaneous properties
Ambiguous recognition results
Applications can use the following properties to handle ambiguous recognition results for speech that has more than one meaning:

- ambiguousSeparatorChar
- ambiguousGroupSeparatorChar

For example, consider this conversation:


System: Where will you pick up the car in Germany?
User: In Frankfurt.

Above, the user's response is ambiguous; it could refer to Frankfurt Oder or Frankfurt Main. A well-written speech grammar will detect ambiguous meanings and provide all possibilities in the recognition result; the grammar concatenates the meanings. For example, the top next-best item might appear as follows:
0 Frankfurt|frankfurt oder#frankfurt main


The application can configure guard conditions to control how the ambiguous meanings should be distinguished. Above, the pound sign (#) and pipe symbol (the vertical bar, |) are used inside the speech grammar as delimiters:

The pound sign (#) is a delimiter between ambiguous items (frankfurt oder and frankfurt main). To use a different separator, you must configure that delimiter with the following OSD property: ambiguousSeparatorChar

The pipe symbol (|) is a delimiter between an ambiguous group (Frankfurt) and the ambiguous items. To use a different separator, you must configure that delimiter with the following OSD property: ambiguousGroupSeparatorChar
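
The delimiter scheme described above can be illustrated with a short Java sketch. This is not part of the OSD API: the class AmbiguousResult and its parse method are hypothetical names introduced here, and the sketch assumes the application already has the meaning string (without the leading n-best index) and knows the two configured separator characters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not an OSD class. */
final class AmbiguousResult {
    final String group;        // text before the group separator, e.g. "Frankfurt"
    final List<String> items;  // the ambiguous meanings, e.g. "frankfurt oder"

    private AmbiguousResult(String group, List<String> items) {
        this.group = group;
        this.items = items;
    }

    /**
     * Splits a meaning such as "Frankfurt|frankfurt oder#frankfurt main".
     * groupSep plays the role of ambiguousGroupSeparatorChar and
     * itemSep plays the role of ambiguousSeparatorChar.
     */
    static AmbiguousResult parse(String meaning, char groupSep, char itemSep) {
        int cut = meaning.indexOf(groupSep);
        if (cut < 0) {
            // No group separator: treat the whole string as one unambiguous item.
            List<String> single = new ArrayList<String>();
            single.add(meaning);
            return new AmbiguousResult(meaning, single);
        }
        String group = meaning.substring(0, cut);
        String rest = meaning.substring(cut + 1);
        // Quote the separator so characters such as '|' are not read as regex syntax.
        String[] parts = rest.split(Pattern.quote(String.valueOf(itemSep)));
        return new AmbiguousResult(group, new ArrayList<String>(Arrays.asList(parts)));
    }

    public static void main(String[] args) {
        AmbiguousResult r = parse("Frankfurt|frankfurt oder#frankfurt main", '|', '#');
        // prints: Frankfurt -> [frankfurt oder, frankfurt main]
        System.out.println(r.group + " -> " + r.items);
    }
}
```

Passing the separators as parameters mirrors the fact that both delimiters are configurable; a grammar that uses other characters only needs different arguments.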

Skip list processing


The asrSideSkipList and serverSideSkipList properties control where skip list processing occurs: on the OSD server or on the recognition server. For details, see "Controlling where skip list processing occurs" on page 87.

Properties for OpenSpeech Insight logging (OSI)


All time values are specified in milliseconds. All real numbers are specified in a simple format such as "1.0"; exponential formats are not supported.
Property                  Description
OSIEventLogDestination    The file or path for the log file. Use this for server-side OSI logging.
OSILogMode                Selects the type of OSI logging that is done automatically. Values are system (default) or app. When set to system, a more complete set of events is automatically logged. When set to app, only basic events are written.
OSILogServer              When true, this property enables server-side OSI logging. The default is "false".
OSIVarNameUniqueCallID    This property defines the name of the VoiceXML variable that holds a unique call ID at any time during page execution. There is no default value. OSD checks for an ID set via a shadow variable in the platform adaptor and also checks for a global xHMI property.
OSIUrlCallStart           This property selects the web application that handles the start of call event for execution of server-side OSI logging requests. Typically, this is a log OSDM. This parameter is mandatory for server-side OSI logging. See example below.
OSIUrlCallEnd             This property selects the web application that handles the end of call event for execution of server-side OSI logging requests. Typically, this is a log OSDM. This parameter is mandatory for server-side OSI logging. See example below.
OSIUrlLogApplication      This property selects the web application that handles the general application event for execution of server-side OSI logging requests. Typically, this is a log OSDM. This parameter is mandatory for server-side OSI logging. See example below.

Here are example property definitions:


<property name="OSIUrlCallStart" value="%{osdm_server}/osd-osilogger/sessionstart"/>
<property name="OSIUrlCallEnd" value="%{osdm_server}/osd-osilogger/sessionend"/>
<property name="OSIUrlLogApplication" value="%{osdm_server}/osd-osilogger/log"/>


Appendix B

Command line tools

This appendix summarizes available command line tools.

Summary of command line tools


OSD provides the following command line tools:
Tool              File        Purpose
Recording List    rl          Extracts all prompts from an xHMI configuration.
Grammar List      gl          Extracts all grammar filenames from an xHMI configuration.
Validate          validate    Validates the configuration described in xHMI files.

Prerequisites
To run the command line tools, your system requires the following:

Java SDK version 1.4.2 or higher with java.exe in the current path.
OSD version 1.1. This release sets the environment variable SWIOSD to the OSD base directory and adds SWIOSD\bin to the path.

Recording list tool (listing prompts for the recording studio)


This command line tool extracts all prompts from the xHMI configuration and writes them to a text file. For example, the list might be used when making audio files in a recording studio.


Windows command:
rl.bat (-i|--inputfile) <infile> (-o|--outputfile) <outfile>

Linux command:
rl.sh (-i|--inputfile) <infile> (-o|--outputfile) <outfile>

The tool reads the inputfile and extracts all outputs. Any output that has an id and an <audio> tag is written to the outputfile. The input and output parameters are required. The <infile> is the name of the xHMI configuration file. The <outfile> is the name of the recording list that will be created.

For example, suppose the configuration file /xyz/appconfig.xhmi contains the following output definition at node scope:
<output id="id1"><audio src="uri">text</audio></output>

(If there is no audio file at the URI, then the text is played.) The recording list generator generates the following file:
file                   dialog     node        output-id    soundfile    text
/xyz/appconfig.xhmi    DlgName    NodeName    id1          uri          text

This file now serves as input for the recording of the outputs. If the output is not specified in a node/dialog then the node/dialog fields contain the value none. Here is a more interesting example:
<output id="bla">
    <audio src="problems.wav">
        We could not understand you.
    </audio>
    <audio src="operator.wav">
        An operator will assist you soon.
    </audio>
</output>


Screen output:
Warning: Found an <output> element with multiple <audio> elements
Warning: Found name mismatch: output id is 'bla', but audio file is called 'problems.wav'
Warning: Found name mismatch: output id is 'bla', but audio file is called 'operator.wav'

Recording file:
file              dialog              node          output-id    soundfile       text
appconfig.xhmi    OperatorTransfer    doTransfer    bla          problems.wav    We could not understand you.
appconfig.xhmi    OperatorTransfer    doTransfer    bla          operator.wav    An operator will assist you soon.
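
The core of this extraction (one row per <audio> element inside an <output> that has an id) can be sketched in Java with the standard DOM parser. This is only an illustration of the idea, not the implementation of the rl tool: the class RecordingListSketch and its rows method are hypothetical, and the sketch omits the file, dialog, and node columns that the real tool writes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical sketch; the real rl tool may work differently. */
final class RecordingListSketch {

    /** Returns one tab-separated row (output-id, soundfile, text) per <audio> element. */
    static List<String> rows(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        List<String> rows = new ArrayList<String>();
        NodeList outputs = doc.getElementsByTagName("output");
        for (int i = 0; i < outputs.getLength(); i++) {
            Element output = (Element) outputs.item(i);
            String id = output.getAttribute("id");
            if (id.isEmpty()) {
                continue; // only outputs with an id are listed
            }
            NodeList audios = output.getElementsByTagName("audio");
            for (int j = 0; j < audios.getLength(); j++) {
                Element audio = (Element) audios.item(j);
                // Collapse line breaks and indentation inside the prompt text.
                String text = audio.getTextContent().trim().replaceAll("\\s+", " ");
                rows.add(id + "\t" + audio.getAttribute("src") + "\t" + text);
            }
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<output id=\"bla\">"
                + "<audio src=\"problems.wav\"> We could not understand you. </audio>"
                + "<audio src=\"operator.wav\"> An operator will assist you soon. </audio>"
                + "</output>";
        for (String row : rows(xml)) {
            System.out.println(row);
        }
    }
}
```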


Grammar List tool (lists grammars in an xHMI file)


This command line tool extracts all grammar filenames from the xHMI configuration and writes them to a text file as tab-separated items. For example, the list might be included in a functional specification for the person who writes the grammars.

Windows command:
gl.bat (-i|--inputfile) <infile> (-o|--outputfile) <outfile>

Linux command:
gl.sh (-i|--inputfile) <infile> (-o|--outputfile) <outfile>

The input and output parameters are required. The <infile> is the name of the xHMI configuration file. The <outfile> is the filename of the grammar list that will be created.

This example shows a sample output file (the count of slots can vary).
dialog id    node id    grm id      grm name          slot 0          slot 1          grammar path
none         none       restart     restart.grxml                                     path/to/grammars
none         none       help        __help.grxml      __help                          path/to/grammars
none         none       starkey     starkey.grxml     __help                          path/to/grammars
none         none       shortcut    shortcut.grxml    pizzasize       pizzatopping    path/to/grammars
Order        size       size        size.grxml        pizzasize                       path/to/grammars
Order        topping    none        topping.grxml     pizzatopping                    path/to/grammars
Order        confirm    yes_no      Boolean           yesNo                           builtin:grammar


Validate tool (validating xHMI configuration files)


This command line tool validates the configuration files of an xHMI application. The validation is performed on all xHMI files included in the application (via the xi:include statement).

Windows command:
validate.bat <filename>

Linux command:
validate.sh <filename>

The <filename> is the name of the xHMI configuration file. Errors are written to stderr; warnings are written to stdout. The tool performs numerous validations; not all are documented.

File-level validations:

Validation of the file against the DTD
Validation of the reference to the root dialog (ensures the root dialog is defined)
Validation of all references to root nodes in dialogs (ensures root nodes exist)
Validation of path attributes of all nodes (ensures the specified paths exist)

Within the file, the validation includes:

Check for duplicate symbols.
Validate all <fills> elements to ensure that the name attributes used are previously declared in that scope.
Validate all <verify> elements to ensure that the actor attributes used are previously declared in that scope.
Validate all <verify> elements to ensure that the vcl attributes used are previously declared in any scope.
Validate all <verify> elements to ensure that all names in the vcl attribute are activated by grammars in visible scopes.
Validate all <understand> elements to ensure that the attributes used in the namelist are previously declared in that scope.
Check all <property> elements for suspicious names. For example:
    When their names suggest that they are meant to override xHMI system properties but are missing the leading underscore character (_).
    When their names match a discontinued OSI logging property name.
Validate values of properties.
Check that each grammar (<grm> element) has a <fills> element as a child, except when the <fills> element is optional (when the <grm> element has an event attribute).
Check consistency of all instances of verifyOutput against the VCL. This ensures that the attributes or facades used exist in one of the verify-output-list elements (on different scopes).
For <verify>, ensure that the vcl and appendvcl attributes are not used at the same time; also, ensure that any appended attributes are declared.

This example shows the output of a successful validation:


> validate appconfig.xhmi
Validation result: 0 fatal errors,0 errors,0 warnings

This example shows the output of a failed validation:


> validate appconfig.xhmi
Error: Duplicated definition of node 'ask_for_location' (previously declared in file 'appconfig.xhmi' in line 128)
Error: The attribute 'undefined_attribute' cannot be resolved.
Validation result: 0 fatal errors,2 errors,0 warnings
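
A build script that runs the validate tool might want to fail when the summary line reports errors. The following Java sketch parses a summary line of the form shown in the examples above. It is an illustration only: the class ValidationSummary is hypothetical, and the sketch assumes the caller has already captured the tool's output as a string (this guide does not state which stream carries the summary line).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for scripts that post-process validate output. */
final class ValidationSummary {
    // Matches e.g. "Validation result: 0 fatal errors,2 errors,0 warnings".
    private static final Pattern SUMMARY = Pattern.compile(
            "Validation result:\\s*(\\d+) fatal errors,\\s*(\\d+) errors,\\s*(\\d+) warnings");

    /** Returns {fatalErrors, errors, warnings}, or null if no summary line is found. */
    static int[] parse(String toolOutput) {
        Matcher m = SUMMARY.matcher(toolOutput);
        if (!m.find()) {
            return null;
        }
        return new int[] {
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3))
        };
    }

    /** True only if a summary was found and it reports no fatal errors and no errors. */
    static boolean passed(String toolOutput) {
        int[] counts = parse(toolOutput);
        return counts != null && counts[0] == 0 && counts[1] == 0;
    }

    public static void main(String[] args) {
        String failed = "Error: The attribute 'undefined_attribute' cannot be resolved.\n"
                + "Validation result: 0 fatal errors,2 errors,0 warnings";
        System.out.println(passed(failed)); // prints: false
    }
}
```

Treating a missing summary line as a failure is a deliberate choice in this sketch, so that a crashed or misconfigured tool run is not mistaken for a clean validation.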


Appendix C

Timestamp abbreviations

When using the OSD-provided classes for date and time, dates can be abbreviated as defined in the following grammar:
S        -> fulldate | datetime | datez | timez | date | time
fulldate -> date ' ' time ' ' z
datetime -> date ' ' time
datez    -> date ' ' z
timez    -> time ' ' z
date     -> year date2 | date2
date2    -> '-' month date3 | date3
date3    -> '-' day | *eps*
time     -> hour time2 | time2
time2    -> minute time3 | time3
time3    -> second | *eps*
year     -> ('0' | ... | '9') year | ('0' | ... | '9')
month    -> ('0' | ... | '9') ('0' | ... | '9')
day      -> ('0' | ... | '9') ('0' | ... | '9')
hour     -> ('0' | ... | '9') ('0' | ... | '9')
minute   -> ('0' | ... | '9') ('0' | ... | '9')
second   -> ('0' | ... | '9') ('0' | ... | '9')
z        -> ( '+' | '-' ) ( hour ':' minute | ':' minute | hour | hour ':' )

Above, *eps* is the empty word and terminal symbols are quoted. Other symbols are rule references.


Appendix D

Negative confirmations

When using robust parsing grammars (sometimes called open grammars), the ROOT rule is not a rule but a set of concepts. Although the following grammar is complete, the example does not show the additional files needed for robust parsing (fsm, wordlist, and userdict). See the OSR Grammar Developers Guide for details. Instead of a single ROOT rule, the ROOT rule is divided into individual rules; each rule is called by a rule-ref tag inside a concept tag. The concept tag then uses ECMAScript to copy the values into the returned slots.
<?xml version="1.0" encoding="UTF-8"?>
<grammar version="1.0" xmlns="http://www.w3.org/2001/06/grammar"
         xml:lang="en-US" mode="voice" root="concepts">
  <meta name="swirec_user_dict_name" content="my.userdict"/>
  <meta name="swirec_fsm_grammar" content="some.fsm"/>
  <meta name="swirec_fsm_wordlist" content="some.wordlist"/>
  <conceptset id="concepts" xmlns="http://www.scansoft.com/grammar">
    <concept>
      <ruleref uri="#r_origin"/>
      <tag>
        origin = r_origin.origin;
      </tag>
    </concept>
    <concept>
      <ruleref uri="#r_origin_destination"/>
      <tag>
        origin = r_origin_destination.origin;
        destination = r_origin_destination.destination;
      </tag>
    </concept>
    <concept>
      <ruleref uri="#neg_origin_destination"/>
      <tag>
        var tmp = '';
        if (neg_origin_destination.origin != undefined) {
          tmp += (tmp!='' ? '**' : '') + 'origin*' + neg_origin_destination.origin;
        }
        if (neg_origin_destination.destination != undefined) {
          tmp += (tmp!='' ? '**' : '') + 'destination*' + neg_origin_destination.destination;
        }
        if (tmp != '') {
          NEG_GROUP = tmp;
        }
      </tag>
    </concept>
    <concept>
      <ruleref uri="#neg_origin"/>
      <tag>
        NEG_origin = neg_origin.NEG_origin;
      </tag>
    </concept>
  </conceptset>
  <rule id="neg_origin_destination">
    <item>
      <ruleref uri="#r_not"/>
      <ruleref uri="#r_origin_destination"/>
      <tag>
        origin = r_origin_destination.origin;
        destination = r_origin_destination.destination;
      </tag>
    </item>
  </rule>
  <rule id="neg_origin">
    <item>
      <ruleref uri="#r_not"/>
      <ruleref uri="#r_origin"/>
      <tag>
        NEG_origin = r_origin.origin;
      </tag>
    </item>
  </rule>
  <rule id="r_not">
    <one-of>
      <item>not</item>
    </one-of>
  </rule>
  <rule id="r_origin">
    <item>
      from <ruleref uri="#r_cities"/>
      <tag>origin=r_cities.v;</tag>
    </item>
  </rule>
  <rule id="r_origin_destination">
    <item>
      from <ruleref uri="#r_cities"/>
      <tag>origin=r_cities.v;</tag>
      to <ruleref uri="#r_cities"/>
      <tag>destination=r_cities.v;</tag>
    </item>
  </rule>
  <rule id="r_cities">
    <one-of>
      <item>boston<tag>v='boston';</tag></item>
      <item>austin<tag>v='austin';</tag></item>
      <item>houston<tag>v='houston';</tag></item>
    </one-of>
  </rule>
</grammar>

The order of the concepts is defined by the training of the voice model for a robust parsing grammar and not by the order of the tags.
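
In the grammar above, the tag for the neg_origin_destination concept packs the negated slots into a single NEG_GROUP string: each pair is written as name*value, and pairs are joined with **. If application Java code ever needs to unpack such a string, a decoder might look like the following sketch. The class NegGroupDecoder is hypothetical and not part of OSD, and the sketch assumes that slot names and values do not themselves contain an asterisk.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical decoder for the NEG_GROUP string built by the grammar tag. */
final class NegGroupDecoder {

    /** Turns "origin*boston**destination*austin" into an ordered name-to-value map. */
    static Map<String, String> decode(String negGroup) {
        Map<String, String> slots = new LinkedHashMap<String, String>();
        if (negGroup == null || negGroup.isEmpty()) {
            return slots;
        }
        // "**" separates pairs; the first "*" inside a pair separates name and value.
        for (String pair : negGroup.split("\\*\\*")) {
            int cut = pair.indexOf('*');
            if (cut < 0) {
                continue; // skip malformed pairs that have no name/value separator
            }
            slots.put(pair.substring(0, cut).trim(), pair.substring(cut + 1).trim());
        }
        return slots;
    }

    public static void main(String[] args) {
        // prints: {origin=boston, destination=austin}
        System.out.println(decode("origin*boston**destination*austin"));
    }
}
```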
