January 2013
Agenda
KTM for Forms Kofax Capture Add On Features Technology Enhancements Trainable Document Separation LimLoc Enhancements Kofax Search and Matching Server Mix Print Detection Clustering Utility
Knowledge Base Conflict Management Project Merge Tool Project Builder Test Documents New Xdoc Browser
Productivity Enhancements Users Localisation Thin Client Enhancements Field and Table Drop Down Lists Sticky Notes (Annotations) Advanced Routing Docking and Zooming
This and That Recostar 5 Normalization Format Locator Enhancements Locator Dialog - Testing Script Rotation
Automation?
(i.e. how much data is extracted automatically)
User productivity?
(i.e. how many docs can a user process per hour)
deal is being fought over features and tech, and not business value
Pan European Wholesaler Before Kofax After 3 months of Automation After 2 weeks of user productivity
solution?
The goal is not Perfect OCR Perfect UI
Build a Benchmark
Add the Fields you need
Classify (F5)
Validate (F8)
Save Xdocs (
Tools/ExtractionBenchmark
/AllClasses
Save Benchmark
Open in Microsoft Excel
bad data leaving Kofax
3. Reduce False Negatives: user pressing ENTER
4. Few True Negatives: OCR accuracy, database problems & learning
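The benchmark outcomes can be tallied with a short script. The following is a minimal sketch, not part of the KTM tooling: it assumes each benchmarked field has been reduced to two flags, `valid` (KTM marked the field as valid) and `correct` (the extracted value matches the truth). Under that assumption a false positive is wrong data that leaves Kofax unchecked, and a false negative is a correct field the user still has to confirm with ENTER. The function names and the sample data are hypothetical.

```python
from collections import Counter

def classify(valid: bool, correct: bool) -> str:
    """Map one benchmarked field to an outcome label (assumed definitions)."""
    if valid and correct:
        return "TP"  # correct data passes without user action
    if valid and not correct:
        return "FP"  # bad data leaving Kofax
    if not valid and correct:
        return "FN"  # user only has to press ENTER
    return "TN"      # wrong and flagged: OCR, database or learning issue

def summarize(fields):
    """Count outcomes for an iterable of (valid, correct) pairs."""
    counts = Counter(classify(valid, correct) for valid, correct in fields)
    return {label: counts.get(label, 0) for label in ("TP", "FP", "FN", "TN")}

# Hypothetical sample: five fields taken from a benchmark export.
sample = [(True, True), (True, True), (True, False), (False, True), (False, False)]
summary = summarize(sample)
print(summary)
```

Running the same tally before, during and after tuning a project makes the shift between the four outcomes visible.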
Benchmark Before
Benchmark During
Benchmark After
Database Matching
classification
documents
KPSG or partner uses Utility to sort documents from customer Understanding what are the biggest subsets of documents in a
enhancing a KTM project Customer adds new classes to project and needs samples for
classification
Requirements
Kofax Clustering Utility works with XDocuments
XDocuments must be created with KTM OCR Server tool
KTM (5.5 SP2) must be installed to use Clustering Utility
Requirements
Using the KTM OCR Server reduces the KTM base volume count
Eval licenses supported
Hardware requirements same as for KC/KTM
Files to be clustered should be local for performance
Need write access to file location
proper language
Leave rest at default
Running the KTM OCR Server: Simply press the Start button
KTM
Setting this up manually and finding/organizing the proper training documents takes hours or days. With the Kofax Clustering Utility, this example took 20 minutes.
improves ROI.
More powerful and flexible validation interface (with Xtrata you have to
update-en.pdf
Features
Layout-based classification
Unlimited extraction fields
Advanced Zone Locator
Barcode Locator
ABBYY FineReader OCR
Document Review (thick client)
Validation, Verification, Correction (thick and thin client)
Scan
Business Processes
Fax
Export Connector
Folder
Web service
Original Format
Supports Color
Supports Advanced Binarization for full compatibility with all KTM functions
Supports PDF
Separation Benchmark
Separation Benchmark
Golden Batch
False Positive
GFV = Golden File Value (perfect file) Super Work Work False positives
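A separation benchmark of this kind boils down to comparing the split points a project produces against those of the golden batch. The sketch below is an illustration only and not the Separation Benchmark tool itself. It assumes each batch is represented as a list with one boolean per page, `True` meaning "this page starts a new document". The representation, the function name and the sample data are all hypothetical.

```python
def compare_separation(golden, predicted):
    """Return (false_positive_pages, missed_split_pages) as 0-based page indexes.

    golden and predicted hold one boolean per page: True if the page
    starts a new document (assumed representation).
    """
    if len(golden) != len(predicted):
        raise ValueError("both batches must contain the same number of pages")
    pairs = list(zip(golden, predicted))
    # Split predicted where the golden batch has none.
    false_positives = [i for i, (g, p) in enumerate(pairs) if p and not g]
    # Split present in the golden batch but not predicted.
    missed_splits = [i for i, (g, p) in enumerate(pairs) if g and not p]
    return false_positives, missed_splits

# Hypothetical six-page batch.
golden = [True, False, False, True, False, True]
predicted = [True, False, True, True, False, False]
false_positives, missed_splits = compare_separation(golden, predicted)
print("false positive splits at pages:", false_positives)
print("missed splits at pages:", missed_splits)
```

Reporting the two error types separately matters because they cost differently: a false positive split breaks one document into two, while a missed split merges two documents into one.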
Sorting
By Column Content
By Status
Refresh Document List
Previous / Next Document
Previous / Next Page
Zooming
Highlighting On/Off
Annotations On/Off
AC_BATCH_NAME - Batch Name (read-only)
AC_BATCH_CLASS_NAME - Batch Class Name (read-only)
AC_BATCH_PRIORITY - Priority (read/write)
AC_BATCH_DIRECTORY - ImageDirectory (read-only)
AC_BATCH_EXTERNAL_BATCHID - ExternalBatchID (read-only)
AC_BATCH_GUID - BatchGUID (read-only)
AC_BATCH_CREATIONDATETIME - BatchCreationDateTime (read-only)
AC_BATCH_CREATIONSITENAME - CreationSiteName (read-only)
AC_BATCH_CREATIONUSERID - CreationUserID (read-only)
AC_BATCH_OPERATORUSERID - OperatorUserID (user ID of last batch history entry) (read-only)
AC_BATCH_USERID - UserID (read-only)
AC_BATCH_USERNAME - UserName (read-only)
AC_BATCH_WINDOWSUSERNAME - WindowsUserName (read-only)
AC_FIELD - Kofax Capture document fields (read-only)
AC_TABLE - Kofax Capture table fields (read-only)
AC_FORMTYPE - Kofax Capture form type (read-only)
AC_CSS - Kofax Capture Custom Storage String at document level
AC_CSS_PAGE<n> - Kofax Capture Custom Storage String at page level
AC_REJECTED_DOCUMENT - Indicates if the document has been rejected in Kofax Capture
AC_REJECTED_DOCUMENT_NOTE - The rejection note
AC_REJECTED_PAGE<n> - Rejected page
AC_REJECTED_PAGE_NOTE<n> - Rejected page note
pXDoc.Locators.ItemByName("LineItems").Alternatives(0).Table
English
German
10 Swedish
Documentation
(runtime modules and Userguide.pdf)
Primary language
en en-UK en-US
Secondary language
Yes
No
Yes
No Use default value for display name Use translation value for display name
End
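One plausible reading of this decision flow is a language fallback: use the translation for the primary language if one exists, otherwise the one for the secondary language, otherwise the default display name. The sketch below illustrates that reading only. It is not KTM's actual resolution logic, and the function, its parameters and the sample data are hypothetical.

```python
def display_name(default, translations, primary, secondary=None):
    """Resolve a display name with primary/secondary language fallback.

    translations maps a language code to a translated name. An empty or
    missing entry counts as "no translation" (assumption).
    """
    for language in (primary, secondary):
        if language and translations.get(language):
            return translations[language]  # use translation value
    return default                         # use default value

# Hypothetical field with a German translation only.
translations = {"de": "Rechnungsnummer"}
print(display_name("InvoiceNumber", translations, primary="de"))
print(display_name("InvoiceNumber", translations, primary="sv", secondary="de"))
print(display_name("InvoiceNumber", translations, primary="sv", secondary="fr"))
```

The three calls cover the three exits of the flow: a primary-language hit, a secondary-language hit, and the fall-through to the default value.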
Summary
KTM Graphical User Interface language
KTM Server language
Project language (Project.ActiveLanguage)
Project.Resources.GetString("Error_Example")
Additional languages
Default language
Validation Form Layouts
Annotations
Additional Batch Editing Operations
User Settings
Advanced Login Capabilities
Combo-boxes With Descriptions
Combo-boxes Inside Tables
Other Small Things
Different font types and sizes
Mini-viewers
Custom buttons
Location of fields
Anchoring
Layout localization
Display annotations created by KTM modules
Create new annotations inside Thin Client
Edit annotations
Delete annotations
Move annotations
Hide/Display annotations
User name at login screen
Batch Open dialog box: size, columns, sorting settings
Panels: size, expanded states
Zoom settings: fit width, fit height, custom zoom
Annotation settings: hide/display annotations
Display descriptions, values or both
Support empty strings consistently for all combo-boxes
Paging control for over 100 items
Type-ahead filtering capabilities
New script events to initialize scripted combo-boxes
Batch loading performance improvements (project caching)
PDF support
Reject/Unreject documents support scripting on the server
Thin Client Server can be installed on top of a previous version
User changes in config files are propagated to a new version
KTM_DOCUMENTROUTING_NEWBATCHCLASS_<PlaceHolder>
AFC or SVM
TDS Separation
Algorithm unchanged
Re-use training sets
SVM Last AFC
1st Middle
Re-build model
Training Set
30,000 docs 100 doc. types
SVM
AFC
Similar accuracy, but the AFC produces fewer missed splits
AFC allows for more frequent benchmarking
Multi PO discovery
Online Learning
Release Matching information to ERP
Getting more data
Multi PO discovery
KTM Server
Marked for Learning
Validation clerk
KTM 5.5
KTM Server
Marked for Learning
Validation clerk
Marked for Learning
etc. are now stored in new global column for Match Remarks
LIM Loc as input to Table Locator
Table Header pack for column detection
Business Value
Faster client startup time
Instant feedback
Access large enterprise DBs
Fast response time
Industry standard connectivity
Low maintenance
New in KTM 5.5
Client/server instead of local copy (no loading delay, no local memory usage)
Unlimited DB size due to 64-bit support (50 million records tested)
Multithreaded design with full support of multi-core architecture
MS SQL, Oracle, ODBC and CSV
Automatic DB update scheduler in background
Technical Background
KTM Validation / KTM Server KSMS
Administrator
Technical Background
Instant access, no loading time
Automatic update
Direct access to databases
Made for 64-bit systems and big databases
Load balancing available
Multiple KSMS Server
Security
Active Directory support
Secure communication
Administration through KTM remote or KTM local client possible
Separate installer
OCR recognition profile full page only
ICR recognition profile full page or zonal
Mixed OCR and ICR
OCR or ICR
Threshold for ICR
LongTag = 1: ICR
LongTag = 2: Signature
Boxes are not created for OCR