
SAP NetWeaver How-To Guide

Good to know topics for a smooth SAP NetWeaver MDM 7.1 implementation

Applicable Releases: SAP NetWeaver MDM 7.1 (SP01-SP05)

Topic Area: Information Management
Capability: Master Data Management

Version 1.02 December 2010

Copyright 2010 SAP AG. All rights reserved. No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP AG. The information contained herein may be changed without prior notice. Some software products marketed by SAP AG and its distributors contain proprietary software components of other software vendors. Microsoft, Windows, Outlook, and PowerPoint are registered trademarks of Microsoft Corporation. IBM, DB2, DB2 Universal Database, OS/2, Parallel Sysplex, MVS/ESA, AIX, S/390, AS/400, OS/390, OS/400, iSeries, pSeries, xSeries, zSeries, z/OS, AFP, Intelligent Miner, WebSphere, Netfinity, Tivoli, Informix, i5/OS, POWER, POWER5, OpenPower and PowerPC are trademarks or registered trademarks of IBM Corporation. Adobe, the Adobe logo, Acrobat, PostScript, and Reader are either trademarks or registered trademarks of Adobe Systems Incorporated in the United States and/or other countries. Oracle is a registered trademark of Oracle Corporation. UNIX, X/Open, OSF/1, and Motif are registered trademarks of the Open Group. Citrix, ICA, Program Neighborhood, MetaFrame, WinFrame, VideoFrame, and MultiWin are trademarks or registered trademarks of Citrix Systems, Inc. HTML, XML, XHTML and W3C are trademarks or registered trademarks of W3C, World Wide Web Consortium, Massachusetts Institute of Technology. Java is a registered trademark of Sun Microsystems, Inc. JavaScript is a registered trademark of Sun Microsystems, Inc., used under license for technology invented and implemented by Netscape. MaxDB is a trademark of MySQL AB, Sweden. SAP, R/3, mySAP, mySAP.com, xApps, xApp, SAP NetWeaver, and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and in several other countries all over the world. All other product and service names mentioned are the trademarks of their respective companies. 
Data contained in this document serves informational purposes only. National product specifications may vary.

These materials are subject to change without notice. These materials are provided by SAP AG and its affiliated companies ("SAP Group") for informational purposes only, without representation or warranty of any kind, and SAP Group shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP Group products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty. These materials are provided as is without a warranty of any kind, either express or implied, including but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement. SAP shall not be liable for damages of any kind including without limitation direct, special, indirect, or consequential damages that may result from the use of these materials. SAP does not warrant the accuracy or completeness of the information, text, graphics, links or other items contained within these materials. SAP has no control over the information that you may access through the use of hot links contained in these materials and does not endorse your use of third party web pages nor provide any warranty whatsoever relating to third party web pages. SAP NetWeaver How -to Guides are intended to simplify the product implementation. While specific product features and procedures typically are explained in a practical business context, it is not implied that those features and procedures are the only approach in solving a specific business problem using SAP NetWeaver. Should you wish to receive additional information, clarification or support, please refer to SAP Consulting. Any software coding and/or code lines / strings (Code) included in this documentation are only examples and are not intended to be used in a productive system environment. 
The Code is only intended to better explain and visualize the syntax and phrasing rules of certain coding. SAP does not warrant the correctness and completeness of the Code given herein, and SAP shall not be liable for errors or damages caused by the usage of the Code, except if such damages were caused by SAP intentionally or by gross negligence. Disclaimer: Some components of this product are based on Java. Any code change in these components may cause unpredictable and severe malfunctions and is therefore expressly prohibited, as is any decompilation of these components. Any Java Source Code delivered with this product is only to be used by SAP's Support Services and may not be modified or altered in any way.

Document History
Document Version - Description
1.00 - First official release of this guide for SAP NetWeaver MDM 7.1
1.01 - Changed chapter 4.4 System landscape for parallel and serial imports and moved it to chapter 6.11 Parallel Import; changed chapter 9.1.2 Disable stemming for matching to 9.1.2 Stemming for matching
1.02 - Changed chapter 5.10 Number of images

Typographic Conventions
Type Style - Description
Example Text - Words or characters quoted from the screen. These include field names, screen titles, pushbutton labels, menu names, menu paths, and menu options. Cross-references to other documentation.
Example text - Emphasized words or phrases in body text, graphic titles, and table titles.
Example text - File and directory names and their paths, messages, names of variables and parameters, source text, and names of installation, upgrade, and database tools.
Example text - User entry texts. These are words or characters that you enter in the system exactly as they appear in the documentation.
<Example text> - Variable user entry. Angle brackets indicate that you replace these words and characters with appropriate entries to make entries in the system.
EXAMPLE TEXT - Keys on the keyboard, for example, F2 or ENTER.

Icons
Icon - Description
o Caution
o Note or Important
o Example
o Recommendation or Tip


Table of Contents
1. Business Scenario
2. Related Information
3. Prerequisites
4. Architecture/Landscape
   4.1 Overview
   4.2 Network
   4.3 Number of servers
5. MDS
   5.1 MDM Server Architecture
   5.2 Concurrent users
   5.3 Memory usage
   5.4 Repository load time
   5.5 Accelerator files
   5.6 MDM Server Client Requests
       5.6.1 Process Flow
       5.6.2 Queue Handling Locking
   5.7 MDS.ini file parameter: CPU Count
   5.8 MDS.ini file parameter: "Max Threads Per Operation"
   5.9 MDS.ini file parameter: Extra DBConnection Validation
   5.10 Number of images
   5.11 Performance for AIX
6. Import and MDIS
   6.1 Steps during update and import
   6.2 MDIS.ini File Parameters
       6.2.1 Interval
       6.2.2 Chunk Size
       6.2.3 Inter-Chunk Delay MS
       6.2.4 No. Of Chunks Processed In Parallel
   6.3 Slicing
   6.4 Registry parameters
   6.5 Bulk_Import_Silo
   6.6 Cartesian Product
   6.7 How to import lookup main
   6.8 Importing single source field to multiple display fields
   6.9 Strategies for Attribute/Value Import
   6.10 Excel Import
   6.11 Parallel Import
   6.12 Transport of maps (value mapping)
   6.13 Import XML (XSD.exe)
   6.14 How to avoid disappearing value mappings
   6.15 Difference between Value Exceptions and Import Exceptions
   6.16 Limitation on number of source fields
   6.17 Limitation on import file size
   6.18 Port Sequence
   6.19 Access, Excel, complex XML not supported on Win64
   6.20 Maximum multi-record value
7. Syndicator and MDSS
   7.1 Steps during Syndication
   7.2 MDSS.ini Parameters
   7.3 Syndication remote keys
   7.4 Suppressing Initial Syndication
   7.5 Field triggers for syndication
   7.6 Changed Records are not being syndicated
8. Data Model
   8.1 Performance considerations
   8.2 Number of main table fields
   8.3 Lookup tables
   8.4 Nested lookups
   8.5 Qualified tables
   8.6 Lookup main uni-/bidirectional relationship
   8.7 Tuples
   8.8 Validation Tuples vs. Validation on qualified table
   8.9 Key mapping
   8.10 Change tracking
   8.11 Calculated fields
   8.12 Keyword index
   8.13 Multi-Lingual Fields
   8.14 Sort index
   8.15 Display fields
   8.16 Taxonomy
   8.17 Relationships
   8.18 Workflow
9. Core Features
   9.1 Matching
       9.1.1 Matching Performance
       9.1.2 Stemming for matching
       9.1.3 Matching Strategy: Match attribute values
   9.2 Stemming
   9.3 Workflow
       9.3.1 Accessing the Workflows Table
       9.3.2 Import and Workflow
       9.3.3 Completed Workflows
       9.3.4 Workflow Thread
10. JAVA-API
    10.1 Notifications (Events)
11. Portal Integration
    11.1 Portal MDM connectivity
    11.2 Garbage Connections
    11.3 How to find MDM 7.1 Portal Content build version
    11.4 MDM Web Services
    11.5 MDM Web Dynpro Components
12. PI Adapter
    12.1 Communication with MDS
        12.1.1 Import - MDM PI Adapter to MDS


1. Business Scenario

This document describes challenges often experienced in SAP NetWeaver Master Data Management (SAP NetWeaver MDM) implementation projects. It explains "technical" topics such as multithreading, locking, and scalability, discusses "application"-specific topics such as matching, import steps, and syndication steps, and provides data modeling recommendations. The goal is to make your own implementation project smoother and to give you an understanding of the internal architecture of SAP NetWeaver MDM.

As the way MDM handles some of the described topics will change in future releases, please check for further versions of this document. In addition, please check the upcoming release notes for SAP NetWeaver MDM 7.1 SP06 for planned innovations to optimize performance, stability, and supportability.

2. Related Information
SAP MDM 7.1 Documentation Center: service.sap.com/installMDM

How-to guides: www.sdn.sap.com/irj/sdn/howtoguides (Information Management / Data Unification):
o Developing Applications on Top of SAP NetWeaver MDM - An Architect's Guide
  http://www.sdn.sap.com/irj/scn/index?rid=/library/uuid/7012ea38-6069-2c10-0097a95ad0b9fc21
o Best Practices Workflow for Master Data Management
  http://www.sdn.sap.com/irj/scn/index?rid=/library/uuid/50f1c01b-972d-2c10-3d9d90887014fafb
o Best Practices for Repository Migration from SAP NetWeaver MDM 5.5 to SAP NetWeaver MDM 7.1
  http://www.sdn.sap.com/irj/sdn/index?rid=/library/uuid/80765a21-78f3-2b10-74a2dc2ab57a1bd2
o How to Optimize an MDM Matching Process
  http://www.sdn.sap.com/irj/sdn/go/portal/prtroot/docs/library/uuid/408b8031-6fc42b10-c18b-a77abefb75b9
o How to Activate Field Triggers for Syndication
  http://www.sdn.sap.com/irj/sdn/go/portal/prtroot/docs/library/uuid/60a2b4e4-69932a10-0e9a-c6d720f1571b
o How to Avoid Problems with Your Data Model in SAP NetWeaver MDM - Do's and Don'ts
  http://www.sdn.sap.com/irj/sdn/go/portal/prtroot/docs/library/uuid/d0d8aa53-b11d2a10-aca2-ff2bd42a8692

3. Prerequisites
SAP NetWeaver MDM 7.1


4. Architecture/Landscape

4.1 Overview
SAP NetWeaver MDM is based on a client-server architecture with several server components and a variety of client components. User interfaces are provided for administrative tasks, repository design, and working with master data.

MDM Server
The MDM Server (MDS) is the core component of SAP NetWeaver MDM. It is a standalone application implemented in C++ and does not run inside the SAP NetWeaver Application Server.

MDM Repository
An MDM repository includes a database of information consisting of text, images, PDFs, and other data about each record.

MDM Client
Users can access the MDM functions using the MDM portal UI or a variety of client applications for the different MDM activities.

MDM Import Server
The Master Data Import Server (MDIS) automates the task of importing master data into MDM repositories. It allows you to import data automatically in conjunction with predefined inbound ports and import maps.

MDM Syndication Server
The Master Data Syndication Server (MDSS) automates the task of syndicating master data from MDM repositories. It allows you to export data automatically in conjunction with predefined outbound ports and outbound maps.


MDM Architecture Components:

4.2 Network
All MDM and DBMS servers should be installed in one LAN with a backbone of 1 Gbps. The network connection between MDM servers and MDM clients should be at least 100 Mbps. All clients (Console, Data Manager, and so on) should reside inside the LAN. For clients connected via slower networks, the recommendation is to install the client software on a Windows Terminal Server, which should be in the fast network environment of the servers.

4.3 Number of servers


It is recommended that you separate the DB server, MDS, and MDIS/MDSS. It is also recommended that you separate MDIS and MDSS, but this is not a precondition. These recommendations are based on test experience: the different modules compete for RAM and CPU if they run on one physical server. For more information, see the sizing guide for SAP NetWeaver MDM 7.1.
...


5. MDS

5.1 MDM Server Architecture


The following chapter gives an overview of the MDM Server architecture.

The MDM server contains core MDM functionality such as the engines for searching, matching, and validating. It is a standalone application implemented in C++ and does not run inside the SAP NetWeaver Application Server.

In general, MDS is multithreaded: multiple threads are used to listen for incoming requests, and multiple worker threads are used for performing MDM tasks. Any given request is typically allocated one thread by the server for its processing, allowing multiple requests to run in parallel on multiple threads/CPUs. (The only exception is write requests, which reserve the repository exclusively in order to ensure data integrity.)

The MDM repositories are stored in relational databases. One MDM server can access multiple MDM repositories that may be persisted on one or on multiple database servers. It is also possible to store a single MDM repository using multiple (predefined) physical partitions that are stored in separate databases. Inside the MDM server there are the core MDM services that provide the functionality and a database abstraction layer that encapsulates the details of the different relational database management systems.

To achieve high performance, the MDM server uses an in-memory cache into which complete MDM repositories (that is, their indices) are loaded. Before a repository can be accessed by clients, it must first be loaded into the cache of the MDM server by an administrator. The in-memory architecture is required to enable the fast searching and matching provided by the MDM server, but it also has an impact on hardware requirements and server operation: the server hardware needs enough main memory to completely load all repositories assigned to that server, and loading very big repositories takes considerable time, which must be added to the start-up time after a server restart or repository maintenance.

Clients access the MDM server using an MDM-specific network protocol based on TCP/IP.


5.2 Concurrent users


The number of concurrent users is an influencing parameter for sizing. It is the number of read and write connections to the repository, where a connection is any instance of an application that is connected to the repository. Clients such as the GUI client, API clients, the Portal, and so on hold parallel connections to the MDM server. The Portal uses a connection pool for simultaneous accesses to the MDM server. For further information about calculating the concurrent users, see the sizing guide for SAP NetWeaver MDM 7.1.
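As an illustration of the connection-pool idea mentioned above, here is a minimal sketch in Python. `MDMConnection`, the pool size, and all other names are hypothetical stand-ins for whatever pooling mechanism the Portal actually uses; they are not MDM API names:

```python
import queue

class MDMConnection:
    """Hypothetical stand-in for one client connection to the MDM server."""
    def __init__(self, conn_id):
        self.conn_id = conn_id

class ConnectionPool:
    """Minimal pool: hands out pre-opened connections, blocks when exhausted."""
    def __init__(self, size):
        self._pool = queue.Queue(maxsize=size)
        for i in range(size):
            self._pool.put(MDMConnection(i))

    def acquire(self, timeout=None):
        return self._pool.get(timeout=timeout)   # blocks if all are in use

    def release(self, conn):
        self._pool.put(conn)

pool = ConnectionPool(size=3)
conn = pool.acquire()
# ... perform repository reads/writes over conn ...
pool.release(conn)
```

Each pooled connection counts as one concurrent user for sizing purposes, which is why the pool size matters more than the number of Portal end users.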

5.3 Memory usage


Memory is calculated from the actually used/allocated memory, not from the memory that would be needed according to the field definition. Example: a text field of width 255.

For the memory usage, the defined width of 255 is not relevant; the real-life fill rate is. This means that fields with a large field width definition do not lead to high memory usage if the real-life fill rate is low.


Example for an MS SQL database: tables are defined as nvarchar.

nchar and nvarchar: character data types that are either fixed-length (nchar) or variable-length (nvarchar) Unicode data, using the UNICODE UCS-2 character set.

nvarchar(n): variable-length Unicode character data. n can be a value from 1 through 4,000. The storage size, in bytes, is two times the number of characters entered + 2 bytes. The data entered can be 0 characters in length. nvarchar(2000) therefore corresponds to a storage size of about 4,000 bytes of Unicode character data for each record.
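Taking the storage-size rule quoted above literally (two bytes per character plus two bytes of overhead), the occupied bytes can be computed with a small helper. This is a sketch of the documented MS SQL formula, not an SAP utility; note that by this formula a fully filled nvarchar(2000) value occupies 4,002 bytes, i.e. the 4,000-byte figure rounds the 2-byte overhead away:

```python
def nvarchar_storage_bytes(chars_entered):
    """Storage size of an nvarchar value in MS SQL Server: UCS-2 uses
    2 bytes per character, plus 2 bytes of overhead per value."""
    if chars_entered < 0:
        raise ValueError("character count cannot be negative")
    return 2 * chars_entered + 2

print(nvarchar_storage_bytes(2000))  # 4002
print(nvarchar_storage_bytes(0))     # 2 (an empty value still costs 2 bytes)
```

Because storage follows the characters actually entered, this is the database-side analogue of the fill-rate point above: a wide field definition costs little if values are short.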

5.4 Repository load time


There is a significant difference in loading time between loading with and without indices. Once the indices have been created, there is normally no need to load the repository with indices again; the indices only have to be recreated if an error occurs. Loading with indices can increase the load time several times compared to loading without indices.


What can be done if the load process crashes?

1) Load without creating indices. The missing indices will nevertheless be created.

Only if step 1 fails:

1) Restart MDS.
2) Verify, repair, and load with rebuilt indices.

In MDM 7.1, the load-immediate and load-with-indices algorithms were reviewed and modified to enable shorter load times:
o Load immediate is about 2-4 times faster (on average)
o Load with indices is about 2-3 times faster (on average)

The improvements to the load times are highly repository dependent. The greatest improvements will likely be seen if any of the following apply:
o Usage of multiple CPUs and a large number of indexed fields
o Usage of a large number of key mappings
o Usage of a large number of small flat lookup tables

This could result in even faster load times than mentioned above.

5.5 Accelerator files


The MDS in-memory data representation and indexes are updated and asynchronously flushed to accelerator files on disk. If the repository is loaded immediately, the data representation and indices are loaded from the accelerator files instead of being regenerated by reading them from the DB.

Not all data is loaded into memory; MDM needs continuous access to the DB. For example, PDF documents and images are loaded from the database on request, not during repository loading.
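The on-demand loading of large objects described above can be pictured as a lazy cache. `fetch_blob_from_db` below is a hypothetical placeholder for the database access, not a real MDM call:

```python
class RecordImageCache:
    """Indices live in memory; large objects (PDFs, images) are fetched
    from the database only when a client first requests them."""
    def __init__(self, fetch_blob_from_db):
        self._fetch = fetch_blob_from_db   # hypothetical DB accessor
        self._cache = {}

    def get_image(self, record_id):
        if record_id not in self._cache:   # not loaded at repository start
            self._cache[record_id] = self._fetch(record_id)
        return self._cache[record_id]

db_calls = []
def fetch_blob_from_db(record_id):
    db_calls.append(record_id)             # count real database round trips
    return b"<image bytes>"

cache = RecordImageCache(fetch_blob_from_db)
cache.get_image(42)
cache.get_image(42)                        # second access served from memory
```

The design choice this illustrates: keeping BLOBs out of the start-up load keeps repository load time and memory bounded by the index size, at the cost of a DB round trip on first access.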


Load Repository Immediate
o Loads indexes from accelerator files instead of regenerating them by reading the data from the DB and creating the indexes.
o Automatically rebuilds indexes stored in out-of-date accelerator files.
o Is usually fast.

Load Repository with Update Indices
o Rebuilds all MDM index files by reading and indexing data stored in the DB. The results are flushed to disk as accelerator files.
o Necessary after repository schema changes or if any error occurs.
o Might take a significant amount of time to complete.
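Conceptually, "Load Repository Immediate" behaves like a cache-validity check on the accelerator files. The decision logic might be sketched as follows; the file path and the `load_from_accelerator`/`rebuild_from_db` callables are illustrative assumptions, not actual MDS internals:

```python
import os

def load_repository_immediate(accelerator_path, db_last_change_time,
                              load_from_accelerator, rebuild_from_db):
    """Sketch of the 'Load Repository Immediate' decision: use the on-disk
    accelerator file when it is at least as new as the last DB change;
    otherwise rebuild the affected indexes from the database."""
    if (os.path.exists(accelerator_path)
            and os.path.getmtime(accelerator_path) >= db_last_change_time):
        return load_from_accelerator(accelerator_path)  # fast path
    return rebuild_from_db()  # accelerator missing or out of date
```

This mirrors the bullet points above: the fast path only works while the accelerator files are current, and an out-of-date or missing file silently falls back to the slow rebuild.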

5.6 MDM Server Client Requests


5.6.1 Process Flow
The following chapter shows how the MDM server processes client requests:
o Each client (MDM Data Manager, Portal client, MDIS, MDSS) communicates with the MDM server via TCP/IP sockets.
o A listener thread continuously scans all sockets and assigns requests to worker threads.
o The worker thread first performs de-serialization, transforming the byte stream into C++ classes.
o If the request requires read operations on a repository, the worker thread tries to acquire a shared lock on the repository.
o If the request requires write operations on a repository, the worker thread tries to acquire an exclusive lock on the repository.
o The request is processed. If repository data was changed, notifications are sent.
o The lock is released.
o The response is sent and the thread is returned to the worker thread pool.
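The process flow above can be sketched as a listener feeding a pool of worker threads. Socket handling and de-serialization are reduced to string parsing here, a single mutex stands in for the shared/exclusive repository lock, and none of the names are real MDS symbols:

```python
import queue
import threading

requests = queue.Queue()       # the listener thread would fill this per socket
responses = []
repo_lock = threading.Lock()   # simplification: one mutex instead of shared/exclusive

def worker():
    while True:
        raw = requests.get()
        if raw is None:                    # shutdown signal
            break
        op, payload = raw.split(":", 1)    # "de-serialize" the byte stream
        with repo_lock:                    # acquire the repository lock
            responses.append(f"processed {op} request for {payload}")
        requests.task_done()               # response sent, thread back to pool

workers = [threading.Thread(target=worker) for _ in range(4)]
for w in workers:
    w.start()

for message in ["read:record42", "write:record7"]:
    requests.put(message)                  # the listener assigns work to workers
requests.join()                            # wait until all requests are processed
for _ in workers:
    requests.put(None)
for w in workers:
    w.join()
```

The real server distinguishes shared from exclusive acquisition, which is exactly what the next section describes.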


5.6.2 Queue Handling Locking


In general, MDS is multithreaded: multiple threads listen for incoming requests, multiple threads are used for reading, and multiple background threads are used for workflow.

A situation might occur in which it is not possible to work with the Console, Import Manager, or Data Manager. What is the reason for this behavior?

As described, MDS is multithreaded and handles multiple read requests in parallel, taking advantage of multiple CPUs on the system. Any given request is typically allocated one thread by the server for its processing, allowing multiple requests to run in parallel on multiple threads/CPUs. The only exception is write requests, which reserve the repository exclusively in order to ensure data integrity.

All MDM requests to a given repository are placed in that repository's queue. Whereas read requests can be executed in parallel, a write request requires exclusive access to the MDM repository. If a write operation is the next task in the request queue, its execution waits until the previous read requests have been worked off. Then the write operation is initiated, and subsequent operations have to wait until it is finished.
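The shared/exclusive queue behavior described above corresponds to a classic readers-writer lock. The following generic Python sketch shows the semantics (many shared holders or exactly one exclusive holder); queue fairness and MDS specifics are omitted:

```python
import threading

class RepositoryLock:
    """Readers-writer lock: many shared holders OR one exclusive holder."""
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_shared(self):
        with self._cond:
            while self._writer:                  # reads wait for the writer
                self._cond.wait()
            self._readers += 1

    def release_shared(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()          # a waiting writer may proceed

    def acquire_exclusive(self):
        with self._cond:
            while self._writer or self._readers: # wait until repository is idle
                self._cond.wait()
            self._writer = True

    def release_exclusive(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()
```

In MDS the queue additionally preserves request order, so a queued write blocks later reads as well; this sketch only shows the lock compatibility itself.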


Multiple CPUs are useful for read processes, since reads occur in parallel. If one CPU is working at full capacity, another is still available to service read-only requests.

Details of MDM locking: all MDM operations place either a shared or an exclusive lock on the repository. This results in the following blocking behavior:
o Shared lock operations can occur in parallel with other shared lock operations.
o Shared lock operations block exclusive lock operations.
o Exclusive lock operations block shared lock operations.
o Exclusive lock operations block exclusive lock operations.

MDM operations can be broken into short-duration and long-duration operations. The important ones here are the long-duration operations, because they may block other operations.

Example
A short-duration exclusive lock operation blocks all other operations. But this is a minor concern: since the operation is short, the wait time for the other operations will also be short. What matters are the long-duration operations, since they may block other operations. Whether they block depends on the above matrix of shared vs. exclusive locks.


The list of long-duration MDM operations and their lock types follows:
o Import - Exclusive
o Syndicate - Exclusive
o Modify many records - Exclusive
o Delete many records - Exclusive
o Add many records to workflow - Exclusive
o Check out, check in, roll back many records - Exclusive
o Match many records - Shared

All other operations are considered short duration operations. The ones that modify data require an Exclusive lock; the ones that only retrieve data require a Shared lock. To determine whether two operations can run in parallel, check the duration and lock type (Shared vs. Exclusive) of each operation and use the above matrix to determine whether they will block each other.
Example
Match many records can run simultaneously with Retrieve records, since both are Shared lock operations. Match many records cannot run simultaneously with Import, since one is Shared and one is Exclusive. In this case one will run first and the other second.
Note
In a syndication process, the read process ends in a write process when the syndication date is written back to the database. This leads to an exclusive lock on the repository. (Syndication actually consists of multiple operations, and most time is spent in Shared operations; but since some operations require an Exclusive lock, the entire syndication is treated as an Exclusive operation.)
Note
Check-in and check-out performance was enhanced. Please see note 1415628.
Note
A record lock left behind by a dead session (e.g. the client application lost its network connection, finished without destroying its session, or the session timed out) is only released when another user tries to access the locked record. This means invalidated locks are not actually released on disconnect/expiration. Instead, they stay in place until another application wants to lock the affected record. At that time, the locking code checks whether the existing lock has a valid owner. If it does, locking fails. If not, the old lock is discarded and the record is locked by the new application.
Important
As the lock of a repository only lasts a very short period of time, users will typically notice this behavior only during parallel imports, syndication of high volumes, or check-in/check-out of large volumes of records.
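The blocking matrix and the list of long duration operations above can be combined into a small predicate. The following is an illustrative sketch only (the operation names follow the list above; this is not an MDM API):

```python
# Lock type per operation, per the list above. "Retrieve records" is a
# short duration, read-only operation added here for the example.
LOCK_TYPE = {
    "Import": "Exclusive",
    "Syndicate": "Exclusive",
    "Modify many records": "Exclusive",
    "Delete many records": "Exclusive",
    "Add many records to workflow": "Exclusive",
    "Check out/in, roll back many records": "Exclusive",
    "Match many records": "Shared",
    "Retrieve records": "Shared",
}

def can_run_in_parallel(op_a, op_b):
    """Two operations block each other unless both take a Shared lock."""
    return LOCK_TYPE[op_a] == "Shared" and LOCK_TYPE[op_b] == "Shared"

print(can_run_in_parallel("Match many records", "Retrieve records"))  # True
print(can_run_in_parallel("Match many records", "Import"))            # False
```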


The performance of import and syndication has been tremendously improved, so the probability of noticing this effect is decreasing.
Example 1: Single processing capability during import and syndication
The import locks the repository exclusively during the import of each batch. Therefore you cannot work on the repository during an import, and you cannot perform a parallel import.
MDM server: CPU usage:

Data Manager on client server 1:


Data Manager on client server 2:

Recommendation
Large imports, updates, or syndications should be scheduled at times when they will not interrupt regular users.
Example 2: The following picture shows the activity monitor of Solution Manager Diagnostics.


1. Opening of huge maps by user Smith caused a read lock (please see the "get matched numbers" entries).
2. User C3VTOMDM tries to add an image at the same time. This requests a write lock. The write lock will be granted when task 1 is finished; until then, user C3VTOMDM waits.
3. All other read repository locks for users C2VTOMDM and C4VTOMDM now wait for the 2nd request, which holds the write lock, to finish.


Example 3: Request for a server/repository read lock timed out
Under the current locking model, when a user, for example, adds a field to a table, both the server and the repository get locked. When another user attempts to add a field to a table of another repository, that user is unable to lock the server. Until MDM 5.5 SP06 patch 3, users waited as long as necessary to acquire the locks. Starting with MDM 5.5 SP06 patch 3 and MDM 7.1, lock attempts time out after 2 minutes to prevent Consoles from locking up for long periods. This is not an error and does not require the server to be restarted.

5.7 MDS.ini file parameter: CPU Count


The parameter CPU Count has been replaced by the new parameter Max Threads Per Operation.

5.8 MDS.ini file parameter: "Max Threads Per Operation"


In MDM 7.1 the old mds.ini option "CPU Count" has been replaced by "Max Threads Per Operation". The parameter defines the maximum number of threads that a single operation is allowed to use. The default value is Auto, which limits the number of threads an operation can use based on the number of logical cores on the MDM Server host (if the server has more than two cores, the maximum number of threads MDS will use per operation equals the number of cores minus one). Potential values for "Max Threads Per Operation" in the MDS.INI file are:
o An empty value ("Max Threads Per Operation="): nothing is set; MDS will set the value to Auto.
o The automatic option ("Max Threads Per Operation=Auto"): MDS will automatically determine the number of cores in the system. If the system has more than two cores, MDS will use up to the number of cores less one thread per operation. Otherwise, it will use one thread per operation.
o A fixed number of threads (an integer greater than 0): MDS will use up to the specified number of threads per operation.
o Any other value (text other than Auto, negative numbers, 0, etc.) is considered an error, and MDS will fall back to Auto.


Note
This parameter does not limit the total number of threads used by MDS.
Example
A multi-core processor combines two or more independent cores into a single package; each individual core is normally a CPU. A dual-core processor contains two cores, and a quad-core processor contains four cores.
o Number of cores = 2: MDS will use at most 1 thread per operation
o Number of cores = 4: MDS will use at most 3 threads per operation
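The resolution rules for this parameter can be summarized in a short sketch (illustrative only; the actual parsing happens inside MDS):

```python
def max_threads_per_operation(ini_value, logical_cores):
    """Resolve the "Max Threads Per Operation" mds.ini value as described above."""
    value = (ini_value or "Auto").strip() or "Auto"   # empty value -> Auto
    if value.lower() != "auto":
        try:
            n = int(value)
            if n > 0:
                return n                              # fixed positive number
        except ValueError:
            pass                                      # invalid text -> Auto
    # Auto: cores minus one if more than two cores, otherwise one thread
    return logical_cores - 1 if logical_cores > 2 else 1

print(max_threads_per_operation("Auto", 4))  # 3
print(max_threads_per_operation("", 2))      # 1
print(max_threads_per_operation("-5", 8))    # invalid -> Auto -> 7
print(max_threads_per_operation("6", 8))     # 6
```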

5.9 MDS.ini file parameter: Extra DBConnection Validation


The parameter makes sure that the DBMS connection is alive prior to every DBMS request and silently restores it if necessary. It is useful for the small minority of MDM installations where the network connection between the MDM Server and the DBMS is unreliable and frequently lost. It improves reliability but slows down the MDM Server; therefore the default setting is False. The idea is that if the network is faulty to the degree that MDM cannot perform, then the network needs to be repaired. The setting (True) exists for customers who need the reliability while they are waiting for the repair of their network.
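In mds.ini the parameter would be set as follows (the default is shown; switch it to True only while an unreliable network is being repaired):

```ini
; mds.ini - default: skip the extra connection check before each DBMS request
Extra DBConnection Validation=False
```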

5.10 Number of images

If images are not categorized under data groups, each CRUD operation needs to show the thumbnails of all images, which can be slow. Usually, however, images are categorized so that only the relevant data group is shown at a time; this is very fast even with an overall high number of images.

5.11 Performance for AIX

With MDM 7.1 SP03, SAP ships a build for AIX which is optimized for performance. It is based on an updated C++ runtime library shipped by IBM, and customers are encouraged to update their landscape to the relevant AIX C++ RTE release. This fix may significantly improve performance on AIX for single-user operations. You can find more details at the following link:
http://www-01.ibm.com/support/docview.wss?rs=2239&uid=swg21110831
Section: C++ Runtime Environment AIX 5.3 TL6 - AIX 6.1: XL C++ RTE for AIX, V10.1 July 2009 English international 64-bit versions of the released UNIX platforms.
MDM on AIX requires AIX service pack 5.3.0.50, which can be downloaded from the IBM Web site. MDM also requires the IBM C++ Runtime Environment Components for AIX version 8 or higher, contained in the Runtime package. This package can be found at:
http://www-306.ibm.com/software/awdtools/xlcpp/
Please see note 1342611.


6. Import and MDIS

6.1 Steps during update and import


Generally, the following interdependent steps occur during updates to the fields of a record:
o Automatic validations are performed and all constraints are checked, including rights, unique constraints, and data integrity
o DBMS tables are updated
o The MDS in-memory data representation and indexes are updated (and asynchronously flushed to accelerators on disk)
o Calculated fields are refreshed
o Workflows are checked and triggered
o Notifications are generated

Generally, the following interdependent steps occur during import, where the process becomes more complex. Import Server (MDIS) delivers data to MDS in batches, whose size is configured in the MDIS.INI file:
o The source file is parsed to fill a batch
o The data is normalized, transformed, and mapped to MDM structures
o The batch of records is matched against the repository, and MDIS determines the import action for each record
o Within each batch, Import Server may also create data not provided in the simple update process described above: new lookup records (including remote keys), and new attributes and text attribute values (including remote keys)
o MDIS sends the batch of records to MDS, while Import Server continues processing source data in parallel
MDIS does not import records directly from import files. Instead, it imports a transformed version of the original record called the virtual extended record. This transformation is part of the overall process through which source data is imported into an MDM repository. MDS locks the repository as described in the previous chapters and processes the data; it releases its locks between batches in order to allow clients to refresh.
MDIS processes import files in several steps:
1. Scanning Ports: MDIS regularly triggers a port scan, checking whether files exist in the port folder structure that need to be imported. If the port scan finds files, it triggers an import. The actual import starts by splitting a file into several chunks if its row count exceeds the chunk size defined in mdis.ini. This step usually takes only a few milliseconds.
2. Structural Transformation Stage: The import file or import chunk undergoes structure and value mapping, in which the source structure is mapped to the target structure and source values are replaced by target values. The number of executions corresponds to the number of chunks or files transformed. The time spent on the transformation depends strongly on the size of the import files. The source tables are converted to a single virtual table following the rules of the import map: during the structural transformation stage, MDIS flattens all source tables in the import file into one virtual extended table. For text files, there is a 1:1 ratio of source table records to virtual extended table records (although the number of fields per record increases on the virtual extended table).


For XML files, there can be a 1:many ratio of original records to virtual extended records, depending on the way record data is contained in the XML schema (please see the chapter Cartesian Product). Also during the structural transformation stage, MDIS applies any field transformation operations (cloning, splitting, pivoting, etc.) specified in the import map. If there are discrepancies between the structure of the tables in the import file and the structure in the import map (e.g. missing fields), MDIS is unable to proceed. In this case, MDIS logs a structural error and moves the source file to a separate folder for manual processing in Import Manager.
3. Value Transformation Stage: Source values on the virtual table are converted according to the rules of the import map.
4. Trigger Import: The transformed virtual records are imported into the MDM repository. The number of executions corresponds to the number of chunks or files imported.
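The four steps above can be sketched as a simple pipeline. All names below are hypothetical; this is an illustration of the stage ordering, not the actual MDIS implementation:

```python
# Illustrative sketch of the MDIS processing stages described above.
CHUNK_SIZE = 3  # mdis.ini "Chunk Size" (tiny here for demonstration)

def scan_port(files):
    """Step 1: split each file into chunks if it exceeds the chunk size."""
    for rows in files:
        for i in range(0, len(rows), CHUNK_SIZE):
            yield rows[i:i + CHUNK_SIZE]

def structural_transform(chunk):
    """Step 2: flatten source rows into 'virtual extended records'."""
    return [{"virtual": row} for row in chunk]

def value_transform(records, value_map):
    """Step 3: replace source values by target values per the import map."""
    return [{"virtual": value_map.get(r["virtual"], r["virtual"])} for r in records]

def trigger_import(records, repository):
    """Step 4: hand the transformed records over to MDS for import."""
    repository.extend(records)

repository = []
source_files = [["A", "B", "C", "D"]]      # one file with 4 rows
for chunk in scan_port(source_files):      # -> chunks of size 3 and 1
    recs = structural_transform(chunk)
    recs = value_transform(recs, {"A": "A-mapped"})
    trigger_import(recs, repository)

print(len(repository))  # 4
```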

6.2 MDIS.ini File Parameters


The parameter Interval defines the number of seconds MDIS waits between scans of the MDM Server. The following parameters can be used to adjust the responsiveness of MDS during an import:
o Chunk Size={number of records for each import chunk, default is 50000}
o Inter-Chunk Delay MS={delay between chunks in milliseconds, default is 0}
o No. Of Chunks Processed In Parallel


6.2.1 Interval

Interval= The number of seconds MDIS waits between scans of the MDM Server. In MDM 7.1, files are processed immediately once they are put into a Ready folder; MDIS picks up the file directly, and no special configuration is required. The Interval parameter is kept in the MDIS.ini file as a safeguard: should MDIS miss a notification, the port will still be scanned at the latest after the defined interval.

6.2.2 Chunk Size

Chunk Size={number of records for each import chunk, default is 50000}
The Chunk Size parameter defines the number of virtual extended records to include in a chunk. Typically, the smaller the chunk size the better, as smaller chunks can be processed faster within MDIS and also require less wait time in transactions between MDIS and the MDM Server. However, setting the chunk size too low can reduce overall performance, because there is a fixed overhead per chunk.
Example
The import task downloads its source file as a whole and processes it. If the chunk size is set to 20000, a file containing 50000 records is split into three chunks (20000/20000/10000).
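The splitting in the example can be sketched as follows (a minimal illustration, not MDIS code):

```python
def chunk_sizes(record_count, chunk_size):
    """Return the sizes of the chunks a file is split into."""
    full, rest = divmod(record_count, chunk_size)
    return [chunk_size] * full + ([rest] if rest else [])

print(chunk_sizes(50000, 20000))  # [20000, 20000, 10000]
```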


Recommendation
Chunk Size=50000 (default)
Tweak this value slowly: the larger the chunk size, the more memory is consumed by MDIS. Setting this parameter smaller reduces the time it takes MDS to process each chunk of records; setting it larger reduces the total time it takes MDS to process all the chunks. Since MDIS uses a multithreaded pipeline to process chunks, you should set the chunk size so that large imports are broken up into multiple chunks.

6.2.3 Inter-Chunk Delay MS

Inter-Chunk Delay MS={delay between chunks in milliseconds, default is 0}
The number of milliseconds MDIS waits between processing chunks of records, to allow other clients access to MDS.
Note
Increasing this parameter gives MDS more time to process other client requests in between chunks.


6.2.4 No. Of Chunks Processed In Parallel

No. Of Chunks Processed In Parallel={the number of chunks processed simultaneously during streaming import}
The chunks are created in parallel, so multiple CPUs help create chunks concurrently. The final import step in MDS is performed sequentially and cannot make use of multiple CPUs. The No. Of Chunks Processed In Parallel parameter optimizes the way chunks are processed within MDIS: rather than waiting for one chunk to pass through all three processing stages before starting work on the next chunk, MDIS moves chunks from stage to stage as each chunk is processed. This parameter determines how many chunks MDIS can manage simultaneously, counting one chunk per stage plus chunks queued between stages.
Recommendation
No. Of Chunks Processed In Parallel=5-10
The optimal setting depends on the available memory and processing power of the machine running MDIS. Minimum requirements: 4 GB of available memory and a dual-core processor. Increase the value from 5 towards 10 in proportion to the actual system specifications.
Example
A file with 10k records has to be imported. The chunk size is set to 2k and the number of parallel chunks to 5. When the structural transformation thread is finished with the first chunk, the chunk travels through the pipeline to be value transformed and imported. Meanwhile, the structural transformation thread breaks down the next 2k records and prepares them for the pipeline. If the number of parallel chunks were set to 1, the structural transformation thread would have to wait until the first chunk had been imported and its slot was ready for re-use.

6.3 Slicing
When importing records into an MDM repository, the MDM Server (MDS) locks the repository from all other client activities in order to safeguard the integrity of the repository's data. To maintain the safe and timely import of data without sacrificing performance for other MDM clients, a balance must be struck between the amount of time MDS devotes to importing records and the time it allows for processing requests from other clients. To help you optimize this balance, MDM 7.1 introduces the concepts of import slices and import sleep time.
Import slices divide a set of import records into smaller groups (slices) for MDS to process. By breaking up a large import job into multiple slices, other MDM clients no longer have to wait until the entire import job is complete before their requests can be answered by MDS. Instead, MDS can set aside time between processing each import slice to respond to requests from other clients. During this period, called the import sleep time, MDS becomes available to all other client activities. Once the import sleep time ends, MDS starts processing the next import slice. MDS continues alternating between import slices and import sleep time until the entire import job is completed.
Both the number of records to include in each import slice and the length of the import sleep time can be customized by entering configuration parameters in the mds.ini file.
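The alternation between import slices and import sleep time can be sketched as follows. This is an illustrative sketch with invented parameter values, not MDS internals; the slice size is taken as the average of the Min and Max parameters, which is the behavior the note further below describes for the current release:

```python
# Illustrative sketch: process one chunk in slices, releasing the write
# lock (and sleeping "Import Sleep Time Max") between slices.
SLICE_SIZE_MIN = 50
SLICE_SIZE_MAX = 150
SLICE_SIZE = (SLICE_SIZE_MIN + SLICE_SIZE_MAX) // 2  # MDS uses the average

def import_chunk(chunk):
    imported = 0
    slices = 0
    while imported < len(chunk):
        batch = chunk[imported:imported + SLICE_SIZE]
        # ... acquire the exclusive lock, write `batch` to the DBMS ...
        imported += len(batch)
        slices += 1
        # ... release the lock and sleep for the import sleep time, so
        #     that queued client requests can be served in between ...
    return slices

print(import_chunk(list(range(2000))))  # 2000 records / 100 per slice = 20
```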


Example
MDIS sends 50K records (the chunk size) to MDS for importing. On the MDS side, the 50K chunk is sub-divided into 2K slices (the slice size). MDS processes the import in small 2K slices, and the write lock is released between the import slices throughout the entire 50K chunk import; it therefore does not block other incoming requests for an extended period, and those requests can be handled in between the processing of the import slices.
The Import Slice Size Min and Import Slice Size Max parameters determine the average number of records to include in each slice. The more records in a slice, the more memory is required on the server hosting MDS and the longer the response time becomes for client requests to MDS. Too few records in a slice, however, can make the import process less efficient, as there is some overhead involved in each slice. Factors to consider when determining the optimal import slice size include:
o Available memory and processing power on the server hosting MDS
o Capabilities of the underlying DBMS
o The complexity of the target repository's data model
The Import Sleep Time Max parameter dictates the length of the pause MDS takes between processing each import slice. The higher this value is set, the longer it takes to complete the import job. The lower it is set, however, the more likely it is that client requests are delayed. These configuration parameters and their recommended values are summarized in the following table.


Note
In future versions of MDM, import slice sizes will alternate between the Min and Max values based on MDS activity levels. Currently, MDS always uses the average of these two values.
Note
Each of these parameters must be added manually to the server-level settings in the mds.ini file. If these parameters are not added to mds.ini or have no values set, their default values are used.
MDS Import Slice Size and MDIS Chunk Size
Do not confuse the Import Slice parameters with the MDIS parameter Chunk Size.

The Chunk Size value tells MDIS how many records from an aggregated file set to process at a time. Once MDIS has processed a chunk of records, it immediately sends this chunk to MDS for import into the repository. When MDS receives the chunk, the Import Slice Size setting controls how many records from the chunk it processes at a time (several slices may be required to process an entire chunk). You will want to experiment with both chunk and slice settings to achieve optimal import performance for your environment.
Note
Please see note 1329424, section MDM Import Server: Optimal MDIS Chunk and Slice size: the optimal MDIS chunk size to use in MDM is 2000, and the MDS slice size is 100.
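Expressed as ini fragments, the recommendation from note 1329424 might look like this. Setting both slice parameters to 100 is an assumption made here to hit the recommended slice size, since MDS currently uses the average of Min and Max:

```ini
; mdis.ini
Chunk Size=2000

; mds.ini (server-level settings, must be added manually)
Import Slice Size Min=100
Import Slice Size Max=100
```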


Import Slice Size and Record Checkouts
If you have configured MDM to check out records automatically upon import, MDS will check out all records in an import slice as part of processing that slice.
Import Slice Size and Workflows
If you are importing records into a workflow, the workflow is not launched until all import slices from an import package have been processed. An import package refers to the complete set of records received:
o From Import Manager or an Import Records API function; or
o From a set of import files aggregated by MDIS
Note
The number of files which MDIS aggregates into an import package is controlled by a port's File Aggregation Count.

6.4 Registry parameters


As you've just read, the import slice size and the import sleep time affect how responsive MDS is to incoming requests from MDM clients (e.g. Data Manager, Import Manager, Java API) while MDS is importing data. While these settings have a major impact on how quickly MDS responds to client requests, Data Manager can still appear unresponsive. The reason is that Data Manager constantly polls MDS for any changes that occur to the data, in order to keep the data being shown as up to date as possible. This is usually a desired feature, but it becomes problematic during imports, because large amounts of data are being modified. All these record modifications cause MDS to create notifications that Data Manager retrieves and processes in order to refresh the data. Since there are so many notifications during a mass import, Data Manager does not have time to respond to user requests (e.g. search, create, modify) unless the import sleep time is set to a large value, which is not recommended. Instead, a registry setting was introduced to control how often Data Manager retrieves the notifications. The setting is named Retrieve Modifications Delay Seconds and has a default value of 10. The smaller the value, the more quickly Data Manager is refreshed; the bigger the value, the more responsive Data Manager is during an import. The recommended value is 10, but it can be changed if warranted by performance tests.

6.5 Bulk_Import_Silo
Bulk Import Silo=True/False. Increases import performance by optimizing SQL access methods. The mds.ini parameter BULK_IMPORT_SILO_LEVEL is deprecated. Instead, the following two mds.ini parameters are used:
o Bulk Import Silo
o Safe Silo Mode

Bulk Import Silo is a Boolean value with default value True. During bulk import, if this option is set to True, MDS groups multiple similar SQL statements into one silo and executes them later.


Example
MDS can group multiple insert statements into one bulk-insert operation, which runs much faster than individual inserts.
When Bulk Import Silo is True, Safe Silo Mode (a Boolean value, default False) can be set to True or False. If Safe Silo Mode is True, MDS flushes (executes) the silo whenever the SQL statement type changes (insert/update/delete). If Safe Silo Mode is False, MDS keeps the silos until the end of the import and flushes them all together. This is by far the most efficient way of bulk importing.
So there are three levels of optimization:
1. Bulk Import Silo=False: no optimization
2. Bulk Import Silo=True and Safe Silo Mode=True: partial optimization
3. Bulk Import Silo=True and Safe Silo Mode=False: full optimization
Recommendation
We recommend option 3.
Note
Please also see note 1329424.
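The effect of Safe Silo Mode can be illustrated with a small counting sketch (an invented statement stream, not actual MDS code). With Safe Silo Mode=True the silo is flushed on every change of statement type; with False, flushing is deferred to the end of the import (modeled here, as a simplification, as one combined flush):

```python
def count_flushes(statement_types, safe_silo_mode):
    """Count how often a silo of grouped SQL statements is executed."""
    flushes = 0
    current = None
    for stype in statement_types:
        if safe_silo_mode and current is not None and stype != current:
            flushes += 1          # statement type changed -> flush the silo
        current = stype
    return flushes + 1            # final flush at the end of the import

stream = ["insert", "insert", "update", "insert", "delete"]
print(count_flushes(stream, safe_silo_mode=True))   # 4
print(count_flushes(stream, safe_silo_mode=False))  # 1
```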


6.6 Cartesian Product


An import of a very small number of records takes a very long time and consumes a lot of memory. What are the reasons for this behavior?
Example
Two main table records (e.g. material master), with several segments per record. Sample XML structure, and two sample records for your orientation:


Based on the sample the following number of records will be imported:

In a real-life example with 2 main table records and 5 nested structures with different occurrences per record, this leads to the import of 1860 records!

The increase in the number of records is caused by adding joins and lookups. Because this topic is more visible in MDM 5.5 (in MDM 7.1 it happens rather under the hood), we will explain the root cause based on screenshots from MDM 5.5.


In MDM 5.5, the record data contained in source XML files was staged in MS Access before being imported into MDM (right after connecting to the source). This staging process occurred in a virtual workspace and consisted of decomposing the hierarchically structured data in the XML file into a series of flat Access tables ("flattening"); users would later have to manually recompose the original data structure in Import Manager. In Import Manager, this flattening happens while creating joins and lookups.


In MDM 7.1, XML is NOT converted into MS Access. The XML files are now flattened into an MDM-internal data format during the import (i.e. after the user presses the Import button). The XML is converted into a staging area which preserves the hierarchy data. But right before import, this hierarchy data is converted into records ("flattened"). This is necessary because any source fields, regardless of how deeply they are nested, can be mapped to any destination fields, and the only way to know the relationship between these source fields is to convert them into records. To avoid converting XML into records, heavy restrictions during field mapping would be needed, which would allow the import to directly translate the source hierarchy into the repository. So staging still happens, but it is no longer visible to the user.


Recommendation
o Perform several imports with simplified import maps: split a single complex import map into individual maps - one to load the main table, and additional maps for each qualified lookup table
o Split import files into several files with fewer segments: one for the main table only, and one for each segment in the XSD structure

Whenever possible, load all the reference data (flat/hierarchy/taxonomy tables) before loading the main table.

Applied to our real-life example, this recommendation leads to the import of 43 records instead of 1860!
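The arithmetic behind such comparisons can be sketched with invented occurrence counts (the actual counts behind the 43 vs. 1860 figures are not reproduced here): flattening produces, per main record, the Cartesian product of its segment rows, whereas loading each segment with its own simplified map imports only the sum of the rows:

```python
from math import prod

def virtual_record_count(segment_occurrences_per_record):
    """Virtual extended records produced by flattening: for each main
    record, the product of its segment row counts."""
    return sum(prod(occs) for occs in segment_occurrences_per_record)

# Hypothetical example: 2 main records, each with 3 nested segments
records = [(3, 2, 4), (5, 2, 1)]
print(virtual_record_count(records))  # 3*2*4 + 5*2*1 = 34

# Split maps instead import the segment rows plus the main records:
print(sum(sum(occs) for occs in records) + len(records))  # 9 + 8 + 2 = 19
```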


6.7 How to import lookup main


Import Manager handles main table lookup fields (Lookup [Main]) differently than other MDM lookup fields. Specifically, Import Manager does not display the complete set of display field values of the records of the underlying lookup table. Instead, the values it displays for a main table lookup field are limited by both the key mappings for the lookup table and the values in the source file. Also, Import Manager does not automatically display the values of a Lookup [Main] destination field in the Destination Values grid when you select the field in the Destination Fields grid. Instead, for a main table lookup field value to appear in the Destination Values grid, all of the following conditions must be met:
o The lookup table must have key mapping enabled
o The lookup field must be mapped to a source field
o The source field must contain key values for the lookup table
o The destination value must have a key on the current remote system
o The destination value's key must match a source field value
Important
Source values are added to text display fields only. If the lookup table has no text display fields, the new record is entirely blank.
Note
At the moment, lookup main cannot be used as a matching field in Import Manager.
Example
Product lookup main Supplier
The lookup table must have key mapping enabled

The lookup field must be mapped to a source field


The source field must contain key values for the lookup table
The destination value must have a key on the current remote system
The destination value's key must match a source field value


Example
Product lookup main Supplier (multivalued)
The source field contains multi-value keys for the lookup table.

The destination values' keys do not match a source field value. Therefore, set a Field Mapping Delimiter:


The destination values' keys are now matched to the source field values.
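The five visibility conditions from this chapter can be read as a simple conjunction. Below is a sketch using hypothetical data structures (this is not the Import Manager API; names like `remote_keys` are invented for illustration):

```python
def lookup_main_value_visible(lookup_table, source_field, remote_system):
    """All five conditions must hold for a Lookup [Main] value to appear
    in the Destination Values grid (hypothetical model of the rules above)."""
    return (
        lookup_table["key_mapping_enabled"]                          # 1
        and lookup_table.get("mapped_source_field") == source_field["name"]  # 2
        and bool(source_field["values"])                             # 3
        and remote_system in lookup_table["remote_keys"]             # 4
        and any(v in lookup_table["remote_keys"][remote_system]
                for v in source_field["values"])                     # 5
    )

suppliers = {
    "key_mapping_enabled": True,
    "mapped_source_field": "SupplierKey",
    "remote_keys": {"ERP_100": {"S-001", "S-002"}},
}
src = {"name": "SupplierKey", "values": ["S-001"]}
print(lookup_main_value_visible(suppliers, src, "ERP_100"))  # True
```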

6.8 Importing single source field to multiple display fields


A main table field with a reference to a lookup table containing two display fields has to be imported.

The lookup table has key mapping disabled; the initial population of the lookup table used both display fields as the matching key. During import into the main table, only incomplete information regarding the lookup values is provided (only the ISO code, but no country name).

This means that automapping of values does not work here. How can an efficient automated import of the values be performed?


For a better understanding of the problem in detail:

Solution: Enable key mapping for the lookup table and use the remote key for matching during the initial population.

For the import into the main table, the mapping in Import Manager is now much more comfortable.


6.9 Strategies for Attribute/Value Import


Main table records, including taxonomy and attribute information, are to be imported from an ERP system. How can the import mapping for attributes be performed as conveniently as possible?

Solution 1: The XSD file definition has each attribute in its own segment
Challenges:
o Each attribute is defined as its own element, so the XSD definition can get very large
o A new attribute in the taxonomy leads to an enhancement of the XSD definition, and the repository needs to be unloaded for the update
o The more attributes you have, the slower the import works
o In Import Manager, each attribute needs to be mapped individually


Solution 2:
o The XSD file definition has the attributes in their own segment
o Attributes are defined abstract
o Attributes are unbounded in the definition


Example for a XML-File with values:

What has to be done in Import Manager?


Advantages of solution 2:
o The XSD structure is very simply structured
o It does not change when additional attributes need to be created, so no repository maintenance is required
o Import mapping is very simple: only one field needs to be mapped to handle all attribute/value combinations

6.10 Excel Import
o What is the influence of the regional settings?
o How should the different decimal separators "," and "." be handled in the import maps?
o How are the different formats of Excel handled in Import Manager?

In the following business case Excel sheets are imported from different regions.


Example: Import Price information

Currency (General, Number)

Text - Different settings of decimal point


In the current release it is not supported to parse two different settings of the decimal point (and thousands separator) in source tables with the same map, since by design only one setting per map is possible.

Recommendation
o Use only numeric Excel formats for the prices, as described in the previous slide
o If the Text format must be kept, use two import maps, provided the source is always known
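A pre-processing step outside MDM is another option: normalize all price strings to one convention before import, so a single map suffices. This is a sketch under the assumption that the regional format of each source is known:

```python
# Sketch (not MDM functionality): normalize region-specific price strings
# so that a single import map only ever sees one decimal-point convention.
def normalize_price(text, decimal_sep):
    """Convert a price string with a known regional format to a float."""
    if decimal_sep == ",":           # e.g. German format "1.234,56"
        text = text.replace(".", "").replace(",", ".")
    else:                            # e.g. US format "1,234.56"
        text = text.replace(",", "")
    return float(text)
```

Both "1.234,56" (decimal comma) and "1,234.56" (decimal point) then yield the same numeric value.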

Text - Text interpreted as Numeric

Why is a price formatted as Text in Excel interpreted as NUMERIC in Import Manager? For the interpretation in Import Manager, not the cell format but the internal data type Excel is using is relevant. How to check the internal Excel data type:


6.11 Parallel Import

You can place multiple repositories on the same server if sizing requirements are met.

Parallel imports to different ports of the same repository are not supported. However, parallel imports to different repositories on the same server are supported. For parallel import/syndication, a separate storage location should be used for each repository to improve performance. No additional hardware RAID controller is needed.

A redundant array of inexpensive (or independent) disks (RAID) is an umbrella term for data storage schemes that divide and/or replicate data among multiple hard drives. Depending on the scheme, they offer increased data reliability and/or throughput. A RAID controller is a device which manages the physical storage units in a RAID system and presents them to the computer as logical units. External disk arrays are usually purchased as an integrated subsystem of RAID controllers, disk drives, power supplies, and management software. RAID adapters for use inside a PC or server are sold either as embedded systems in the computer or as separate expansion cards.


6.12 Transport of maps (value mapping)

Exporting an Import Map in DEV and Importing the map in QA might lead to wrong value mappings during Import. What is going on?

The problem exists only when value mapping is used. In this case MDM uses internal IDs, which can differ from system to system, to store the mapping information.

System behavior in MDM 5.5
If an import map was created, the structural mapping and the value mapping were based on the internal IDs of the fields and the values. If the map was then transported from a development to a quality system, the internal IDs might differ. As a result, the maps had to be reworked. In MDM 5.5 it could also happen that field mappings, not only value mappings, were lost.

System behavior in MDM 7.1
The structural mapping is now based on the field code. As there are no codes for the values, the value mapping is still based on the internal IDs and not on the values themselves. This means that if the map is transported from a development to a quality system, the internal IDs might differ again and, as a result, the maps have to be reworked.

Recommendation
o Activate key mapping for the respective table => the mapping will then be performed not on values but on remote keys (the decrease in the file size of the maps by switching to key mapping is a positive side effect), or
o Manually adjust the value mapping
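The effect of the recommendation can be illustrated with a toy model (not MDM internals; all IDs, remote keys, and values are made up): an ID-based value map breaks when the internal IDs differ in the target system, while a remote-key-based lookup still resolves correctly.

```python
# Illustration: why ID-based value maps break on transport, while
# remote-key-based matching survives.

# Lookup table as loaded in DEV vs. QA: same values, different internal IDs.
dev_lookup = {101: {"remote_key": "BER", "value": "Plant Berlin"},
              102: {"remote_key": "MUC", "value": "Plant Munich"}}
qa_lookup  = {207: {"remote_key": "BER", "value": "Plant Berlin"},
              208: {"remote_key": "MUC", "value": "Plant Munich"}}

def resolve_by_id(lookup, mapped_id):
    """Resolve a mapped internal ID; None means the mapping is lost."""
    rec = lookup.get(mapped_id)
    return rec["value"] if rec else None

def resolve_by_remote_key(lookup, remote_key):
    """Resolve via the remote key, independent of internal IDs."""
    for rec in lookup.values():
        if rec["remote_key"] == remote_key:
            return rec["value"]
    return None
```

An ID 101 saved in a DEV map resolves fine in DEV but finds nothing in QA; the remote key "BER" resolves in both systems.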


6.13 Import XML (XSD.exe)

Starting with MDM 7.1, an additional file, xsd.exe, is required in order to process XML files with the Import Manager. To enable the MDM Import Manager to generate XML schemas from XML files upon import, xsd.exe must reside in the same folder as the Import Manager executable. xsd.exe is part of the Microsoft .NET Framework SDK (Software Development Kit) 2.0, which can be downloaded from the download center of the Microsoft web site.

6.14 How to avoid disappearing value mappings

During an import with Import Manager, value mappings are performed and saved in an already existing map. Loading the map again shows that some value mappings disappeared from the map.

There are two ways to save a map:
1) Save - saves only the current mappings and disregards value mappings done in the past
2) Save Update - keeps previously made value mappings and adds the new ones to the map


6.15 Difference between Value Exceptions and Import Exceptions


Below are some examples in order to better understand the difference between value exceptions and import exceptions during imports using the Import Server.

Example of a value exception
In an import map, incoming text values are mapped to an integer field. The values ("A", "B", "C") are mapped to the values (1, 2, 3).

Incoming Value | Value in Repository
A              | 1
B              | 2
C              | 3

A new value "D" is part of the import file, but has no integer value mapped to it. In this case the system doesn't know how to handle that value and, as a consequence, generates a value exception.

Example of an import exception
In the MDM Console a field is defined as unique. There are three records stored in the repository:
o For record 1 the field has the value "17"
o For record 2 the field has the value "18"
o For record 3 the field has the value "21"

During an import a new record has to be created; however, the unique field has, e.g., the value "18" - a value which already exists for that field in the repository. This would violate the uniqueness criterion for the field, and thus the Import Server would generate an import exception. Import exceptions are basically raised for all exceptions other than value exceptions and structural exceptions.
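The two examples above can be condensed into a toy classifier (purely illustrative, not MDM code): an unmapped incoming value yields a value exception, a uniqueness violation yields an import exception.

```python
# Toy classifier mirroring the two exception examples above.
value_map = {"A": 1, "B": 2, "C": 3}          # value mapping from the map
existing_unique_values = {"17", "18", "21"}    # values already in the repository

def check_record(source_value, unique_field_value):
    """Return the list of exceptions this record would raise."""
    exceptions = []
    if source_value not in value_map:
        exceptions.append("value exception")    # e.g. incoming "D"
    if unique_field_value in existing_unique_values:
        exceptions.append("import exception")   # e.g. duplicate "18"
    return exceptions
```

A record with source value "D" raises a value exception; one that repeats "18" in the unique field raises an import exception; a clean record raises neither.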

6.16 Limitation on number of source fields

MDM 7.1 removed the limitation on the number of source fields. There is now no limit, whereas in MDM 5.5 SP6 it was limited to 255 fields because staging was processed using MS Access (MDM 7.1 no longer uses MS Access as a staging area).

6.17 Limitation on import file size

Please see note 1274858.

6.18 Port Sequence

Lookup [Main] fields provide the capability to relate two main table records using a unidirectional relationship from the origin record to the target record.


Example: Material Lookup main [Suppliers]

It is necessary to import the suppliers first so that references to them can be imported later on. Therefore there is a need to control which ports are processed first.

Port Sequence:
Port 1: Suppliers
Port 2: Materials

Important
The following scenario could happen, for which a (manual) exception handling has to be set up in your project. MDIS is permanently polling the ports. At present, the default behavior is to exhaust all import files in a given port prior to scanning the next port. So MDIS starts with port 1 but does not find any files and goes on to port 2. Before MDIS polls port 2, input files are placed in the corresponding ready folders:
Port 1: Suppliers

Port 2: Materials

Now MDIS would go on importing files from port 2. But in this case the import could fail (depending on the map configuration), as the dependent supplier information is missing. In this example, the import of the file ZZ_Product_E.xml will fail because the file with the supplier information, SN239877523xxx.xml, will not have been imported in advance.
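The timing pitfall can be simulated with a small sketch (illustrative only; MDIS itself works differently internally): port 1 is scanned while still empty, the files then arrive, and port 2 is exhausted next, so the material file is imported before its supplier file.

```python
# Sketch of the MDIS polling pitfall described above: ports are exhausted
# in sequence, so a file arriving in port 1 after port 1 was scanned is
# only picked up on the NEXT polling cycle.
def polling_order(port1_files, port2_files, arrive_after_port1_scan):
    """Simulate one MDIS pass: scan port 1, then files arrive, then port 2.

    arrive_after_port1_scan: (late files for port 1, late files for port 2).
    Returns (import order of this cycle, files left for the next cycle)."""
    order = list(port1_files)                 # port 1 scanned first
    late_p1, late_p2 = arrive_after_port1_scan
    order += list(port2_files) + list(late_p2)  # port 2 exhausted next
    next_cycle = list(late_p1)                # late supplier file must wait
    return order, next_cycle

order, next_cycle = polling_order(
    [], [], (["SN239877523xxx.xml"], ["ZZ_Product_E.xml"]))
```

The material file ends up in this cycle's import order while the supplier file waits for the next cycle, which is exactly the dependency violation the text warns about.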


6.19 Access, Excel, complex XML not supported on Win64


When importing Access, Excel, or complex XML files (where the XML includes external joins between elements) through a Win64 Import Server (MDIS), the import task fails with the error message "Error: COM error 80040154 Class not registered" in the log file. The reason is that Microsoft has not released a 64-bit version of the JET Engine library, which the 64-bit MDIS relies upon to process Access, Excel, and complex XML files. As a consequence, all import tasks that require this library fail with this error message.

MDM version 5.5: Customers who need to import Access, Excel, or complex XML files are advised to switch to the Win32 version of the Import Server on the Windows 64-bit OS. Another alternative is to eliminate the external join within the XML file / import map.

MDM version 7.1: MDM servers (e.g. MDS, MDIS) for the Windows 32-bit platform are no longer released in MDM 7.1; please see the MDM PAM (Product Availability Matrix) on the Service Marketplace. In order to import Excel/Access source files, you would need to use the MDM Import Manager GUI or, alternatively, extract/transform the data from the Excel/Access source files into XML files for import with the 64-bit MDIS. Please see note 1009016.

6.20 Maximum multi-record value

Performance in the SAP NetWeaver MDM Data Manager can be changed dramatically by altering the parameter "Maximum multi-record value display in Record Detail pane" in the Data Manager configuration options. Low values such as 100 or less should be set here.


7. Syndicator and MDSS

7.1 Steps during Syndication


The syndication is implemented as a separate worker thread in the Syndication Server. At the given interval, the main MDSS thread will get the list of currently running repositories and start or stop syndication tasks for each repository as necessary. For each port with a syndication map assigned, it executes the syndication query.

Task handling of MDSS:

o The first step in the syndication process is the initialization of the table cache (this may take a considerable amount of time depending on the number of lookup table records)
o Remote keys for the main table records are generated in a separate protocol command before the first chunk is retrieved from MDS; remote keys are generated in one step for the entire set of records selected by the query
o Then the syndication port is locked; port locks are held in MDS memory, not in the repository itself


o Updating record timestamps for successful syndication happens inside MDS as part of saving files to the distribution ready folder (after the file blobs are successfully transmitted, unwrapped, and persisted in the ready folder, the timestamps are updated)
o Temporary syndication files are then deleted
o After the syndication is completed (succeeded or failed), the port is unlocked

The syndication chunk size is the number of source table records (and all the nested sub-records for mapped lookups) read from MDS in one transaction. The default is 100 records, but it can be configured for MDSS in its ini file (BLOCK_READ_SIZE entry) and for the Syndicator in the registry.

Note The smaller the chunk size, the more responsive MDS is for other clients' requests, especially those with data modifications. With bigger chunk sizes (such as 5000), MDS may block the repository while the chunk data is prepared inside the transaction, although the delay highly depends on the syndication map and repository structure.
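The chunk-size trade-off can be quantified with a back-of-the-envelope sketch: a smaller BLOCK_READ_SIZE means more, shorter read transactions (more responsive MDS); a larger one means fewer, longer ones.

```python
# Rough estimate of the number of read transactions for a syndication run,
# given the BLOCK_READ_SIZE chunk size (default 100, per the text above).
import math

def read_transactions(total_records, chunk_size=100):
    """Number of chunked reads needed to fetch all selected records."""
    return math.ceil(total_records / chunk_size)
```

For 1,000 records, the default chunk size of 100 yields 10 short transactions, while a chunk size of 5,000 yields a single long transaction during which MDS may block the repository.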

Note The record timestamps for a particular syndication request are updated only after all its result files (there may be many if the output is split by records) have been successfully moved to the ready folder. If MDS runs out of disk space, the incomplete syndication results are removed and the record timestamps are not updated.


The log file is copied to MDSS memory, goes via TCP/IP to MDS, and is then saved in the distribution port's log folder. After that, the temporary log file is deleted.

7.2 MDSS.ini Parameters


MDSS has been redesigned in MDM 7.1 to improve performance by reducing port-processing delays. MDSS now assigns up to three port processing tasks to every remote system on an MDM Server. Each task is dedicated to a specific type of port on the remote system, as described below.


Note MDSS does not assign a task to a remote system if there are no corresponding ports for that task type on that remote system.

The Automated task syndicates only to Automatic|Continuous ports. It works in a continuous loop, syndicating to each continuous port on a remote system on a repository-by-repository basis. After it finishes syndicating to all of a repository's ports, the Automated task waits the number of seconds specified in that repository's "MDSS Auto Syndication Task Delay" property before continuing. The setting can be specified in the global or the repository-specific section.

Auto Syndication Task Delay (seconds) = the number of seconds MDSS's automatic syndication task waits after syndicating to all ports associated with the repository. The value is copied from the corresponding property in MDM Console.

The Scheduled task syndicates only to Automatic ports with Hourly, Daily, or Weekly values in their Port Processing Interval properties. Instead of looping, it sleeps until a syndication time is due for a port.

The Manual task syndicates only to Manual ports which have syndication job requests from a workflow or API waiting on them. It works in a continuous loop, scanning each Manual port on a remote system for syndication requests to fulfill. Like the Automated task, it works on a repository-by-repository basis. After it finishes syndicating all of a repository's manual jobs, the Manual task waits the number of seconds specified in that repository's "MDSS Manual Syndication Task Delay" property before continuing. The setting can be specified in the global or the repository-specific section.

Manual Syndication Task Delay (seconds) = the number of seconds MDSS's manual syndication task waits after syndicating all requests from ports associated with the repository. The value is copied from the corresponding property in MDM Console.

Note If more than one syndication request is waiting on a port, MDSS fulfills all requests on that port before it resumes scanning.

Recommendation
For performance reasons, the number of ports for each repository should be limited as far as possible.

The MDM Server Check Interval (sec) is used to monitor repository start/stop events. At the given interval, the main MDSS thread gets the list of currently running repositories and starts or stops syndication tasks for each repository as necessary. The default value is currently 20 seconds.

Note The parameter Auto Syndication Task Enabled is obsolete; with MDM 7.1, auto syndication is always enabled.


The parameters can also be adjusted in the MDM Console on server level

or repository level

7.3 Syndication remote keys


Enabling the creation of remote keys can affect performance, as it requires an exclusive lock on the repository.

7.4 Suppressing Initial Syndication


An initial syndication should be suppressed.

Important
"Suppress Unchanged Records" doesn't help: MDM memorizes updated systems by remote system, not by map or port.

Please use the following procedure:
o Use the MDM Syndicator GUI and create a simple map, e.g. containing only one field
o Perform a syndication for the remote system and choose your hard disk as the destination
o After the syndication, the remote system is considered syndicated
o Now start the Syndication Server => from now on, only delta changes will be syndicated for the remote system


7.5 Field triggers for syndication


A record should be syndicated only when certain fields, e.g. Description or Material Type, have been changed. The motivation is to:
o Perform syndication only when important fields are changed
o Avoid triggering unnecessary workflow approvals in the client system

A workflow does the trick:
o The workflow is called immediately after Record Update, Record Add, or Record Import
o A validation checks whether specific fields have changed
o One branch (called when the validation returns true) triggers syndication (marks the record for syndication)
o One branch stops the workflow (no action in this case)
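The decision made by the validation step can be sketched as a simple field comparison (illustrative only; the field names are examples from the text, and the actual check is an MDM validation expression, not Python):

```python
# Sketch of the trigger decision: compare old and new record versions on
# the "important" fields only; any difference selects the syndication branch.
TRIGGER_FIELDS = ("Description", "Material Type")

def needs_syndication(old_record, new_record):
    """True if any trigger field changed => take the syndication branch."""
    return any(old_record.get(f) != new_record.get(f) for f in TRIGGER_FIELDS)
```

A change to a non-trigger field (e.g. a weight) returns False and the workflow simply stops; a change to Description returns True and the record is marked for syndication.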


For further information please see the How-to Guide "How to Activate Field Triggers for Syndication": http://www.sdn.sap.com/irj/sdn/go/portal/prtroot/docs/library/uuid/60a2b4e4-6993-2a10-0e9ac6d720f1571b

7.6 Changed Records are not being syndicated


MDM tracks record syndications on the remote system level, not by port or map. As such, the first time a changed record is syndicated to a remote system, MDM considers that record to be unchanged for all future syndications to that remote system. For this reason, if you have multiple ports or maps which syndicate to the same remote system, only the first syndication will include the changed records.
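A toy model of this tracking behavior (illustrative, not MDM internals; record and system names are made up) shows why a second port to the same remote system sees no changed records:

```python
# Illustration of remote-system-level change tracking: once a changed
# record is syndicated to a remote system, further syndications to that
# SAME system (via any port or map) no longer see it as changed.
def syndicate(changed, syndicated_to, remote_system):
    """Return the records to send and mark them as syndicated."""
    to_send = [r for r in changed if r not in syndicated_to[remote_system]]
    syndicated_to[remote_system].update(to_send)
    return to_send

syndicated_to = {"ERP": set(), "CRM": set()}
first  = syndicate(["MAT-1", "MAT-2"], syndicated_to, "ERP")  # both sent
second = syndicate(["MAT-1", "MAT-2"], syndicated_to, "ERP")  # nothing sent
other  = syndicate(["MAT-1", "MAT-2"], syndicated_to, "CRM")  # both sent again
```

The second call to the same remote system returns nothing, while a different remote system still receives the full delta.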


8.

Data Model

8.1 Performance considerations


By design, access to tables and fields in MDM is optimized for the main table, which leads to performance issues when other tables such as lookup tables or qualified lookup tables contain large numbers of records. Repositories should be always designed in such a way that the majority of records are kept in the main table, with the sub-tables containing only a minor part of the records in the repository. If a repository shows a different distribution of records, then this might be an indicator that the main table object should be different from the one chosen or that there are other issues with the design.

The data model is a cornerstone and one of the most important success factors of an MDM implementation. It needs to be planned carefully. The following formulas give a hint of the influencing parameters and how they affect memory usage and CPU consumption.


8.2 Number of main table fields


Best practice: try to limit the number of main table fields to fewer than 100. A number above 500 should absolutely be avoided.

Data modeling decision: # of main table fields
o Memory usage (server): (# of records) * (# of fields) * (average size of field value)
o Memory usage (client): caches only 100 result set records, so the number of fields has a minimal impact
o CPU consumption (server): minor impact when result set records are returned
o CPU consumption (client): minor impact for displaying records in the results grid and detail pane
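The server-side formula can be applied directly as a rough estimator (the inputs, especially the average field size, are assumptions you must supply yourself):

```python
# The server memory formula from the table above, as a rough estimator.
def main_table_memory_bytes(num_records, num_fields, avg_field_size):
    """(# of records) * (# of fields) * (average size of field value)."""
    return num_records * num_fields * avg_field_size

# e.g. 1M records with 100 fields averaging 20 bytes each:
estimate = main_table_memory_bytes(1_000_000, 100, 20)
```

That example yields roughly 2 GB of field data, which illustrates why the field-count guidance above matters for sizing.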

8.3 Lookup tables


Flat lookup tables in MDM store reference data. They are used to enforce data consistency for specific data domains which are rather static. A flat lookup table acts as a picklist that defines the set of valid values of the corresponding lookup field for data entry. Typical examples of lookup tables are Plant in the Material repository or Partner Type in the Business Partner repository. These examples make it pretty clear that a lookup table should have a limited number of records. Review also the business process: the user should still be able to select a record in a lookup table, which might be impossible if the lookup table contains thousands of records.

Lookup tables that are used for lookup referencing from other tables should not host thousands of records, and tables with thousands of entries should be carefully evaluated. Lookup tables and qualified lookup tables should have a very limited number of fields and records. Lookup tables that are not referenced should be eliminated. Lookup tables with between 10,000 and 100,000 records should be avoided; lookup tables with more than 100,000 records should absolutely be avoided.


Data modeling decision: # of lookup table records
o Memory usage (server): in general, main table-related data tends to overshadow lookup record memory usage; otherwise: (# of records) * (# of fields) * (average size of field value)
o Memory usage (client): GUI clients cache lookup record display fields plus internal record information; more records use more memory
o CPU consumption (server): more CPU cycles are required to produce the limited set of lookup records during search execution
o CPU consumption (client): minor impact on building picklists; minor impact receiving limited lookup table results

8.4 Nested lookups


A very large number of nested lookup tables will lead to performance issues.

8.5 Qualified tables


A qualified table in MDM stores a set of lookup records and also supports qualifiers: subfields that apply not to the qualified table record by itself, but rather to each association of a qualified table record with a main table record. The limiting element in the usage of qualified lookup tables is the set of lookup records (the non-qualifier part), not the qualifier fields. Here the same principle as for normal lookup tables applies: the overall number of lookup records should be limited. This means the qualified lookup table should not contain more than 10,000 records.

Example 1: Non-qualifiers and qualifiers
Here we have quality management settings for materials. The settings (inspection percentage and max. inspection time) are specified depending on the plant and the inspection type.

Material | Plant   | Inspection type | Inspection percentage | Max. inspection time
1001     | Berlin  | Goods Receipt   | 2                     | 24 hours
1001     | Berlin  | Production     | 10                    | 2 hours
1001     | Munich  | Shipping       | 1                     | 6 hours
1001     | Munich  | Goods Receipt   | 2                     | 72 hours
2001     | Chicago | Goods Receipt   | 1                     | 0.5 hours

The qualified table associated with this qualified lookup field will have two fields, Plant and Inspection type, as non-qualifiers, and Inspection percentage and Max. inspection time as qualifiers. The different combinations of plants and inspection types are clearly limited in their number. In addition, choosing the max. inspection time as a non-qualifier would increase the number of records in the lookup table dramatically, besides the fact that, from a logical/process point of view, it should not be defined as a non-qualifier. QM settings:

Non-qualifying fields (Plant - Inspection type) | Qualifier: Inspection percentage | Qualifier: Max. inspection time
Berlin - Goods Receipt                          | 2                                | 24 hours
Berlin - Production                             | 10                               | 2 hours
Munich - Shipping                               | 1                                | 6 hours
Munich - Goods Receipt                          | 2                                | 12 hours
Chicago - Goods Receipt                         | 1                                | 0.5 hours

Based on this design, the structure of the main table records with their qualified links looks as shown below:

Material | Lookup [QM Settings]
1001     | Berlin - Goods Receipt - 2 - 24 hours
         | Berlin - Production - 10 - 2 hours
         | Munich - Shipping - 1 - 6 hours
         | Munich - Goods Receipt - 2 - 72 hours
1002     | Chicago - Goods Receipt - 1 - 0.5 hours

Example 2: Non-qualifying entries
Qualified tables can hold millions of qualified links, but the non-qualifying entries (the qualified lookup table itself) should be regarded as flat tables in terms of data volume.
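The split between the shared non-qualifier entries and the per-link qualifiers can be sketched as a data structure (illustrative only; field names like "pct" are made up, the data comes from the QM example above):

```python
# Data-structure sketch of the QM example: the qualified lookup table holds
# each (plant, inspection type) combination once; every main table record
# stores qualified links pairing a lookup entry with per-link qualifiers.
qm_lookup = {                       # non-qualifiers, stored once
    1: ("Berlin", "Goods Receipt"),
    2: ("Berlin", "Production"),
    3: ("Munich", "Shipping"),
    4: ("Munich", "Goods Receipt"),
    5: ("Chicago", "Goods Receipt"),
}

materials = {                       # main records: (lookup id, qualifiers)
    "1001": [
        (1, {"pct": 2,  "max_time": "24 hours"}),
        (2, {"pct": 10, "max_time": "2 hours"}),
        (3, {"pct": 1,  "max_time": "6 hours"}),
        (4, {"pct": 2,  "max_time": "72 hours"}),
    ],
    "1002": [(5, {"pct": 1, "max_time": "0.5 hours"})],
}

def qualifiers_for(material, plant, inspection_type):
    """Find the per-link qualifier values for one combination."""
    for lookup_id, quals in materials[material]:
        if qm_lookup[lookup_id] == (plant, inspection_type):
            return quals
    return None
```

The lookup table stays small (five combinations) while the qualified links, which carry the qualifier values, scale with the number of main table records.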


Data modeling decision: qualified lookup fields
o Memory usage (server): additional indexes are required on top of simple lookup fields; cached qualifier fields are similar to main table fields in memory usage, but are multiplied by the average number of qualified links for each record
o Memory usage (client): minor impact; only the currently selected record's qualified links are cached
o CPU consumption (server): additional processing is required to process searches on qualifier fields and to limit qualifier lookup fields; if any qualifier field is not cached, SQL is used in the search and limiting, which is significantly slower than the MDM in-memory indices
o CPU consumption (client): minor impact; only the currently selected record's qualified links are cached

8.6 Lookup main uni-/ bidirectional relationship


SAP NetWeaver MDM 7.1 supports the following options:
o Multiple business objects (main tables) can reside in a single repository
o Multiple main tables can reference the same lookup tables
o Main table objects can reference each other using a Lookup [Main] field type (e.g. a supplier can be related to several products)
o Main table objects can recursively reference themselves (e.g. Employees reference Employees)


Lookup [Main] fields provide the capability to relate two main table records using a unidirectional relationship from the origin record to the target record. When the lookup [main] field resides in a tuple, additional fields can be used to qualify the relationship. The relationship is unidirectional in that the lookup [main] field belongs to the origin record and is not visible in the target record.

For data ownership and editing, some of this functionality needs to be bidirectional. A bidirectional link between main tables is planned to be supported with the next major release. For the current release there are two workarounds.

A) One lookup [main] field from table Products into Suppliers and one lookup [main] field from table Suppliers into Products (as part of a tuple construct). Shortcoming: there is no automatic update of the relationship from both sides, i.e., the second lookup is not updated and needs to be maintained manually. This endangers data consistency.


B) A dedicated main table storing only relationship information, e.g. a table Product_Supplier_Relation containing two lookup fields into the tables Products and Suppliers. This way you can search in both directions, and the relationship is valid and updated for both sides. Shortcoming: this only works if the data volume to be managed is not too high. Keep in mind the multiplications that can appear in the mapping table. Besides the volume factor, the redundancy in the mapping main table is another shortcoming of this workaround.


8.7 Tuples
A tuple is a record template that groups together a set of related fields into a reusable object or type definition that describes or composes a particular object or type.

Impact of using Tuples instead of Qualified tables on sizing


The Business Partner Extended Repository with 1M records in the main table was used to test the effect of the conversion process; 11 of the 12 qualified tables in the repository were converted into tuples. The following table presents the results of the test regarding the sizing parameters which were affected by the conversion:

The increase in memory and accelerator size is due to a new type of keyword index (N-Gram) that was added in version 7.1. These indices are created alongside the original keyword indices for tuple fields which have the keyword index switched on. For fields in a qualified table with the keyword index switched on, only the original keyword indices are built. The sizing requirements for hardware are therefore higher when using tuples instead of qualified tables; see the above table for estimates. The sizing formulas will be modified to reflect the impact and will be published in the next version of the guide after tests on other types of repositories are complete. For further information please see the sizing guide.


8.8 Validation Tuples vs. Validation on qualified table


Qualified lookup table fields can be used within validations for a main table. Since tuples can be reused within multiple record fields, the validation scope has been enhanced: validations using tuple records or tuple member fields can be defined either for a main table or for a tuple itself. The latter is called a tuple validation.

A validation on a tuple runs for all instances of the tuple, i.e., for all fields which reference the same tuple. You need to define a validation on the main table if you want to run the validation for one tuple field only, or if you need to run a validation comparing two fields that reference the same tuple.

For a validation on the main table, the record itself ([record]) and all tuple member fields are supported for single-valued tuple fields only. A validation on a multi-valued tuple field can use the record only. This limitation does not apply to tuple validations, i.e., all tuple member fields can be used within the validation regardless of whether the field is single-valued or multi-valued.

Example: Validation on a tuple
Assume you have two different address fields in your main table which refer to the same Address tuple, namely a ship-to address and a bill-to address. It is required that whenever an address is maintained, the postal code is filled, so you have to run a validation to check this. Here, you would define a tuple validation. Whenever you run the tuple validation, it checks both address fields at the same time.

Tuple: Supplier Address


To create a validation on a tuple, select the Validations tab and the corresponding tuple from the menu in MDM Data Manager:

Create the validation expression. As stated above, all member fields can be used:

Examples:

Ship-To and Bill-To Addresses are not filled


Ship-To and Bill-To Addresses are filled; Postal Code is missing for Bill-To Address

Ship-To and Bill-To Addresses are filled

Example: Validation on the main table
Shipments to foreign countries are handled differently from domestic shipments, so the respective validation should run on the ship-to address only. Here, you would need a validation on the main table. To create a validation on the main table, switch to the Validations tab in MDM Data Manager and select the main table.


As stated above, for multi-valued tuple fields only the record is supported:

For single-valued tuple fields both are supported, record and member fields: Note The same restrictions as described above apply to assignments, i.e., only the record of multi-valued tuple fields can be used in assignments whereas all member fields can only be used for single-valued tuple fields. Note A callable tuple validation can only be called from within another tuple validation. It cannot be called from within a validation. For more details about validations for tuples, please refer to the MDM 7.1 Data Manager Reference Guide, available on SAP Service Marketplace http://service.sap.com/installmdm71.

Validation behavior
CAUTION
Validations on multi-valued tuple fields and on qualified lookup table fields might behave differently; see the example below. Please test all your validations after migrating.


Example
In the following example we check whether the qualifier Manufacturer Part Number is filled, using the function IS NOT NULL in the validation expression.

For the qualified lookup table field, the validation result is TRUE if at least one record is not null; it is FALSE only if all records are empty. For the tuple field, the validation result is TRUE only if all entries are not null; it is FALSE if at least one entry is empty.
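Reduced to its core, the behavioral difference is an any() versus all() semantics. This sketch is an analogy, not the MDM expression engine:

```python
# The IS NOT NULL difference described above, reduced to any() vs. all():
# qualified lookup fields pass if AT LEAST ONE entry is filled,
# tuple fields pass only if EVERY entry is filled.
def qualified_is_not_null(entries):
    return any(e is not None for e in entries)

def tuple_is_not_null(entries):
    return all(e is not None for e in entries)

mixed = ["MPN-123", None]   # one filled entry, one empty entry
```

For the mixed case, the qualified-field check returns TRUE while the tuple-field check returns FALSE, which is exactly the migration pitfall the CAUTION warns about.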

Validation failure
Assume you created a validation using a multi-valued field, based either on a tuple or on a qualified lookup table. In case of a validation failure, you won't see which tuple record or qualified table record caused the error. Furthermore, for tuple fields where the same tuple is used multiple times, you won't see which tuple field caused the error.

8.9 Key mapping


Key mapping is used to map objects of a remote system to master data objects within MDM. Remote keys increase the loading time of the repository. Key mapping information is always read from the database and is therefore not held in memory; consequently, it is not possible to search on remote keys.

Data modeling decision: key mapping

Memory usage (server): (# of records) * (# of remote systems) * (average size of key)
Memory usage (client): No impact.
CPU consumption (server): Similar performance to text fields.
CPU consumption (client): No impact.

If key mapping is enabled but the remote keys are not filled, it has no impact on performance and sizing.
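The memory formula above lends itself to a quick back-of-the-envelope sizing estimate (an illustrative sketch, not an official sizing tool):

```python
def key_mapping_memory_bytes(num_records, num_remote_systems, avg_key_size_bytes):
    """Memory estimate per the formula above:
    (# of records) * (# of remote systems) * (average size of key)."""
    return num_records * num_remote_systems * avg_key_size_bytes

# e.g. 1,000,000 records mapped to 3 remote systems with 20-byte keys
estimate = key_mapping_memory_bytes(1_000_000, 3, 20)
```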


8.10 Change tracking

The Change Tracking table tells the Master Data Server (MDS) which data modifications to track. Each entry set to Yes causes the MDS to write one or more rows to the History table in the DBMS, depending on the operation. Set entries to No when tracking is not needed. For example, setting a field such as Products -> Name to Yes causes one row to be written when a record is added or deleted, or when that particular field is modified; setting all n fields in a table to Yes causes n rows to be written when a record is added or deleted. Moreover, if you have marked the taxonomy lookup with Delete=Yes, a record is written to the Change Tracking table for each linked attribute.
Tip
For best practices on archiving the MDM change tracking table A2I_CM_HISTORY, please see SAP Note 1343132.
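The row counts described above can be modeled as follows (a simplified approximation of the tracking behavior, not the actual MDS logic):

```python
def history_rows_written(tracked_fields, operation, modified_field=None):
    """Rows written to the history table for one record operation:
    add/delete writes one row per tracked field; a modify writes one
    row only if the changed field itself is tracked."""
    if operation in ("add", "delete"):
        return len(tracked_fields)
    if operation == "modify":
        return 1 if modified_field in tracked_fields else 0
    return 0

# Tracking Name, Price, and Vendor: a delete writes three rows,
# while changing only an untracked field writes none.
```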

8.11 Calculated fields

Calculated fields are recalculated each time a repository is loaded; the fewer calculated fields a repository has, the less CPU this consumes during repository load. Calculated fields are updated when a record is added or updated.
Note
The import process does not update calculated fields. You must manually trigger a recalculation after import.
Recommendation
The number of calculated fields has an impact on performance, depending on the complexity of the calculations. A calculated field with a simple calculation requires about the same processing as a non-calculated field. Calculated fields often use a search parameter, so check whether callable expressions are an alternative. When creating expressions, avoid string functions and nested If statements.

8.12 Keyword index

We do not recommend using a keyword index on unique fields. The index would be held in memory and provides no additional benefit, as the field itself is unique and therefore needs no keyword index.


Example
A Description field is a good candidate for keywords: even though the total number of words across all records is large, the number of distinct words is small, because they are most likely all dictionary words. A Part Number field is an example of a field you should not keyword, because it is likely to contain a different value for every record. Also consider using free-form search instead of enabling fields for keyword search.

Data modeling decision: keyword index

Memory usage (server): Mostly dependent on the number of records, since the number of unique keywords tends to be small (fewer than 20,000). However, setting a unique field to keyword=Normal will flood the keyword list and use a large amount of memory (400 bytes per unique word).
Memory usage (client): No impact.
CPU consumption (server): Minor impact on create, read, update, and delete of keyworded objects (attributes and records). Setting a unique field to keyword=Normal will flood the keyword list, making updates and searches slow.
CPU consumption (client): No impact.

Defining Keyword = Normal for your fields has a big impact on the repository load process and on the virtual memory (VM) size used at runtime, and may cause overhead when adding, deleting, or updating records. Set Keyword = None wherever there is no need for a keyword definition or where the number of unique values in the field is high. For example, Part Number is definitely NOT a candidate for Keyword = Normal. A lookup table display field is not a candidate either; it is set to Normal by default, but there is no need for this.
Recommendation
Eliminate the keyword index for fields where it is not necessary, and do not use a keyword index on unique fields.

8.13 Multi-Lingual Fields

The shipped standard repositories are designed to support numerous languages. These languages add overhead to the system, so it is recommended that you delete all unnecessary languages; remember that a language can always be added to a repository later on. Likewise, for newly designed repositories, only the required languages should be activated. If a field is set to multilingual, its data is replicated for each language layer. By default, this attribute is set for all text fields.


Note
Multilingual fields add server memory overhead, since data storage and sort index storage are multiplied by the number of languages, and server CPU overhead, since text field updates increase due to language inheritance. Indexes need to be updated for all languages, even if only a single language layer is changed.
Recommendation
MDM performance is affected by the number of language-specific fields (the product of the number of languages and the field count). It is therefore advisable to minimize the number of multilingual fields as much as possible.

Data modeling decision: multilingual

Memory usage (server): Multiplies the data storage and sort index storage for multilingual fields by the number of languages.
Memory usage (client): No impact.
CPU consumption (server): Moderate increase when updating text fields due to language inheritance. Indexes need to be updated for all languages, even if only a single language layer is changed.
CPU consumption (client): No impact.

8.14 Sort index

The sort index property allows a field to be sorted in the Data Manager's Records mode and via the APIs. Sorting requires indexing, which consumes additional memory for the indices. The indices are recreated when a repository is loaded and thus affect repository load time as well as import activities.
Recommendation
Since sort indexes affect MDS performance, avoid them wherever possible.

Data modeling decision: sort index

Memory usage (server): 4 bytes * (# of records) * (# of sort indexes). Note that multilingual fields use one index per language.
Memory usage (client): No impact.
CPU consumption (server): Medium impact on record CRUD (soon to be improved). Free-form "equal to" or "starts with" searches should only be used on fields with a sort index.
CPU consumption (client): No impact.
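The sort index memory formula, including the per-language multiplier for multilingual fields, can be sketched like this (illustrative only):

```python
def sort_index_memory_bytes(num_records, monolingual_indexes,
                            multilingual_indexes=0, num_languages=1):
    """4 bytes * (# of records) * (# of sort indexes), where each
    multilingual sort-indexed field contributes one index per language."""
    effective_indexes = monolingual_indexes + multilingual_indexes * num_languages
    return 4 * num_records * effective_indexes

# 500,000 records with two monolingual sort indexes, plus one
# multilingual sort-indexed field in a three-language repository.
estimate = sort_index_memory_bytes(500_000, 2, multilingual_indexes=1, num_languages=3)
```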


8.15 Display fields

A display field for a table is a field whose value is used as: (1) the corresponding lookup field value for each record; (2) the node name for the record in hierarchy trees; (3) the name of the record in the Product Relationships popup window. When a table has multiple display fields, the value used for each record is the combination of the display field values, separated by commas (,). Display fields create a non-technical key for the main object and the sub-objects and determine the fields shown to the user in case of a lookup relationship.
Recommendation
Carefully evaluate which fields to use as display fields. The combination of the display fields should be a unique key in the system; treat the display field definition like a unique-field definition. It is recommended that you use no more than three display fields. The UI clients cache all display fields of non-main table records, so more display fields require more memory.
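The combined display string for a table with multiple display fields can be sketched as follows (only an illustration of the comma-separated combination described above; the exact separator rendering in the MDM clients may differ):

```python
def combined_display_string(record, display_fields):
    """Combine the values of all display fields, comma-separated,
    as used for lookups, hierarchy nodes, and relationship popups."""
    return ", ".join(str(record[field]) for field in display_fields)

# Hypothetical lookup record with two display fields:
supplier = {"Name": "ACME", "City": "Berlin", "Country": "DE"}
label = combined_display_string(supplier, ["Name", "City"])
```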

Data modeling decision: display field

Memory usage (server): No impact.
Memory usage (client): The UI clients cache all display fields of non-main table records; more display fields require more memory.
CPU consumption (server): Minor impact at UI client login to construct the combined display string for all lookup records. Minor impact for API searches that request the combined display string, since it is generated for each request.
CPU consumption (client): Minor impact for Data Manager import/export, since display fields are parsed.

8.16 Taxonomy

The memory usage of MDS depends mostly on the number of record-attribute values, which is (# of main records) * (average # of attributes linked to a category). The GUI clients cache all attribute definitions, including feature text values; if you have many of them, client login time increases accordingly. The caches are filled with all fields and values of the repository. Each attribute definition, including all its feature values, exists only once and can be linked to multiple nodes in a hierarchy. You can think of an attribute definition as a table, and its set of feature values as records.


A high number of attributes and text feature values should be avoided.

Data modeling decision: taxonomies

Memory usage (server): Depends mostly on the number of record-attribute values, which is (# of main records) * (average # of attributes linked to a category).
Memory usage (client): The GUI clients cache all attribute definitions, including feature text values.
CPU consumption (server): Additional processing is required to retrieve record attribute values. Search processing is similar to lookup fields. Limiting attribute values depends mostly on the number of unique values, similar to limiting a lookup field.
CPU consumption (client): Additional time is needed to load and construct attributes at startup. Minor impact on managing the attribute grid in Taxonomy mode.

8.17 Relationships

Data modeling decision: relationships

Memory usage (server): Minor impact; only the counts are cached.
Memory usage (client): No impact.
CPU consumption (server): Minor impact on record delete, in order to remove relationships.
CPU consumption (client): No impact.

8.18 Workflow

Workflows are kept in memory even when their status is Completed. Delete workflow items on a regular basis. Please see also chapter 9.3 Workflow.


9. Core Features

9.1 Matching
9.1.1 Matching Performance

A matching run can take a very long time. What are the influencing parameters, and which matching strategies can be used to improve performance? How is an operation calculated? The parameters influencing matching time are:
- Number of matching rules
- Function Equals vs. Token Equals
- Number of source records
- Number of target records
Example
The number of operations for 3,000 x 50,000 records is 3,000 multiplied by the number of rules in the matching strategy. The strategy MDM_Persons contains 4 rules, so the number of operations (displayed when the matching is run) is 12,000.
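The operation count from the example can be reproduced directly (a one-line model of the displayed figure, not MDS code):

```python
def matching_operations(num_source_records, num_rules):
    """Operations displayed when a matching run starts: one pass
    per source record per rule in the matching strategy."""
    return num_source_records * num_rules

# The guide's example: 3000 source records, 4 rules in MDM_Persons
count = matching_operations(3000, 4)
```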

Note
It takes less time to perform an operation if the function Equals is used, since fewer comparisons are required than with Token Equals.


Recommendation
- Use a workflow to schedule nightly matching runs and persist the matching results.
- Try to reduce the number of tokens:
  - By using transformations in matching rules. Matching then runs on the transformed field. BUT: transformations are performance-intensive if a large number of records is involved.
  - By creating an additional matching field in the data model and filling it via import using value conversion filters (Replace (, ), AG, Corp, .). Matching then runs on the additional field.


In addition, please consider the following topics:
- Matching is done asynchronously. If the Data Manager is killed by the user during a matching run, the matching task still runs on the MDM server.
- A very large matching run might freeze the Data Manager (UI freeze) that started it, but it is still possible to log on from another Data Manager.
- A performance issue associated with executing very large matching strategies in real time might impact the usability for other logged-in users if only one CPU is available.
- The result of the matching is stored in memory, but the scores are calculated when the source record is selected. This means that a change to a record during the matching run is considered and might lead to a different result if the values of the matching fields were changed after the record was processed in the matching run.
Note
With SAP NetWeaver MDM 7.1 SP02 (and 5.5.64), multithreaded matching is enabled using the following mds.ini setting: Multithreaded Matching=True. Multithreading means a single matching request can be spread across multiple threads. The current multithreading:
1. Does not help the transformation process when starting a matching request.
2. Is essentially blocked in 5.5 when using Token Equals.
3. Reserves one core for other system requests (you need more than 2 cores to use this feature).

9.1.2 Stemming for matching

Since MDM 5.5 SP04, the stemming engine can also be used for matching (MDM tokenizing). MDM tokenizing for matching can be done either with the Stemmer algorithm or with the internal MDM algorithm. The Stemmer algorithm depends on the language and preconfigured linguistic rules; it is therefore difficult to predict the matching results. The internal MDM tokenizing algorithm uses non-alphanumeric characters as token delimiters; when a token contains numeric characters (with or without alphabetic characters), the following characters are allowed to be part of the token: - . , / To activate the internal MDM algorithm for tokenizing, disable the Stemmer by setting the Stemmer Install Dir parameter in mds.ini to an empty value:


Stemmer Install Dir=
Note
For further information, please see SAP Note 1452067.
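The internal algorithm's delimiter rule can be approximated as follows (a rough sketch of the described behavior for illustration only; the actual MDS implementation may differ in edge cases):

```python
import re

def mdm_tokenize(text):
    """Approximation of the internal MDM tokenizing rule described above:
    non-alphanumeric characters delimit tokens, except that '-', '.', ','
    and '/' may remain inside tokens that contain digits."""
    # Candidate tokens: runs of alphanumerics plus the four special characters.
    candidates = re.findall(r"[0-9A-Za-z][0-9A-Za-z\-.,/]*", text)
    tokens = []
    for cand in candidates:
        if any(ch.isdigit() for ch in cand):
            # Numeric tokens keep embedded special characters.
            tokens.append(cand.rstrip("-.,/"))
        else:
            # Purely alphabetic: the special characters still delimit.
            tokens.extend(part for part in re.split(r"[-.,/]", cand) if part)
    return tokens
```

For example, a part description like "Bolt M6-1.0, steel/zinc" keeps "M6-1.0" as one token because it contains digits, while "steel/zinc" is split in two.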

9.1.3 Matching Strategy: Match attribute values

This chapter describes how to create a matching strategy that matches attribute values. To identify duplicates, the taxonomy values need to be compared; comparing the category alone might not be sufficient.

Therefore, the attribute values have to be compared.

Attribute usage and fill rates must be analyzed before implementing any of these methods, to minimize the list of attributes matched as part of the strategy. Listed below are alternative methods that can be combined to achieve the best matching results and performance. The general idea is to use the Equals function instead of Token Equals to improve performance.
- Import the attribute values into the matching fields (in addition to the taxonomy attributes).
  Advantage: Repository loading time is not increased by calculated fields.
  Disadvantages: The list of matched attributes is not dynamic; it is embedded in the repository schema. Import maps have to be modified.
- Calculated fields holding distinct attribute values.
  Advantage: The list of matched attributes is dynamic; it is determined by the calculated field expressions.


  Disadvantage: Repository loading time is higher because of calculated fields.
- Calculated field holding concatenated mandatory attribute values.
  Advantages: Only a single field is matched for all attributes. The list of matched attributes is dynamic; it is determined by the calculated field expressions.
  Disadvantages: Repository loading time is higher because of calculated fields. The concatenated attribute values must not be NULL.
Example: Calculated field holding concatenated mandatory attribute values
For matching on attributes, an additional field is populated with the mandatory taxonomy attribute values.
1. Add a taxonomy match field to the main table.

2. Add callable validations to calculate the Matching Attributes field.
To identify the mandatory attributes within a repository, perform the following steps:
1. Select the taxonomy branch in the Data Manager search pane.
2. Use the attributes search pane to find attributes with no NULL values.
3. Add these attributes to the calculation expression.


Note
Be aware that an attribute is no longer mandatory once a new record is imported that has a NULL value for that attribute.

9.2 Stemming
When searching for keywords using the Progressive operator, MDM uses the Inxight stemming engine (if installed) to extract the stem (or base form) of the entered search words. Base forms are simply the form of the keyword found in the dictionary. With keyword stemming, MDM finds records containing the search words you entered plus records containing any variants of those words, saving you the hassle of performing multiple searches, or of simply not finding all the records you were looking for. Three preconditions exist:
1. For the relevant field, the keyword setting has to be set to Normal.

2. In the free form search you have to use progressive search.

3. The Inxight stemming engine has to be installed under the path defined in mds.ini.



The Inxight stemming engine is included as part of the MDM installation. The stemmer path in the ini file should point to the location of the lang directory. If you open the lang directory, you can see the stemmer files for all the different languages.

In MDM 7.1, the keyword setting is defined as Stemmer instead, but this is just a renaming; it is renamed automatically during an upgrade from MDM 5.5.

(Screenshots: the keyword/stemmer setting in MDM 5.5 and in MDM 7.1.)

Using stemming makes searching easier, since you don't have to remember the exact word. For instance, if records contain the values see, saw, and seen in the keyword-enabled field, then a keyword progressive search for see returns the records containing saw and seen as well. Whether this behavior is desired is, of course, also a business decision!
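Conceptually, a progressive keyword search compares stems rather than surface forms. A toy illustration with a hand-made stem table (the real engine derives stems from Inxight's linguistic rules, not from a lookup dictionary like this):

```python
# Hypothetical mini stem table for illustration only.
STEMS = {"see": "see", "sees": "see", "saw": "see", "seen": "see"}

def stem(word):
    """Return the base form of a word, or the word itself if unknown."""
    return STEMS.get(word.lower(), word.lower())

def progressive_match(search_word, keyword):
    """Match if the search word and the stored keyword share a stem."""
    return stem(search_word) == stem(keyword)
```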

9.3 Workflow
9.3.1 Accessing the Workflows Table

It is possible, under certain conditions, that the Workflows table cannot be accessed in the Data Manager:
1. You haven't installed the Workflow component (MDM 5.5 only).
2. The operating system user you are logged in with does not have write access to the following registry keys:
On a 32-bit operating system:
HKEY_LOCAL_MACHINE/SOFTWARE/SAP/MDM 5.5
HKEY_LOCAL_MACHINE/SOFTWARE/SAP/MDM 7.1

On a 64-bit operating system:
HKEY_LOCAL_MACHINE/SOFTWARE/Wow6432Node/SAP/MDM 5.5
HKEY_LOCAL_MACHINE/SOFTWARE/Wow6432Node/SAP/MDM 7.1

This problem and the solution are also explained in SAP Note 1300651.


9.3.2 Import and Workflow

A record can only be in one workflow job at a time. Consider the following example, where a workflow is triggered via an import:
1. File 1 includes record A and record B.
2. File 1 is imported and a workflow is started.
3. File 2 includes record A and record C.
4. File 2 is imported.
This leads to a conflict, because record A is already part of an active workflow job. File 2 is therefore put into an exception folder. It is not split or automatically retried; it is handled like any other file that fails to import.

9.3.3 Completed Workflows

Neither MDM 5.5 nor MDM 7.1 deletes workflows with status "Completed" automatically. As a consequence, the list of workflow jobs with status "Completed" can grow long. These completed jobs are loaded into memory when the repository is loaded; they not only take up memory but also affect performance. The number of completed workflow jobs should therefore be kept low. In the current release there are only two options:
1. Delete the workflows with status "Completed" manually.
2. Use the MDM Java API utility to delete workflows with status "Completed"; packed into a JAR file, it can be run with a scheduler (please see SAP Note 1240587).

9.3.4 Workflow Thread

When MDM auto-launches a job based on a max-records or max-time threshold setting, expect up to a five-minute delay before the job is actually launched, since the workflow thread sleeps and wakes up every five minutes. If a syndication step is triggered within the workflow, the workflow listens for the notification coming back from the syndication process; the job proceeds immediately after receiving the notification, without delay.


10. JAVA-API
10.1 Notifications (Events)
The MDM Java API allows you to build an application that receives notifications about events (e.g. record added, record updated) that occur in MDS. An example use case is performing complex validations on a newly created record that cannot be expressed using MDM validations.
CAUTION
Note that these notifications are not guaranteed to be delivered to the applications registered to receive them. Once the Master Data Server sends the notification that an event has occurred, it has no way of knowing whether the notification was received, and it will not resend the notification under any circumstances.


11. Portal Integration


11.1 Portal MDM connectivity

Before 7.1, the connection between the portal and the MDM Server worked with user mapping only. As of 7.1, user mapping is no longer obligatory: all MDM Java applications connect to MDS via the Java API and the MDM Connector, using either a Trusted Session or an Authenticated Session.

Trusted Session
- The MDS is configured with a list of trusted connections, specified by their IP addresses.
- Authentication: only user names are passed to MDM.
- Change tracking based on the portal user is possible.
- For further information, please see the MDM 7.1 Security Guide.

Authenticated Session
- Authentication: user name and password are passed to MDM (user mapping only).
- Either the portal user itself or the user group is mapped to MDM repository user(s); in the latter case, the mapping data is inherited from the group. The same can be achieved by using roles.

Recommendation
Create user groups and map the portal user group to an MDM repository user.

Trusted Session
Trusted connections enable users from safe machines to access MDM servers and repositories using their sign-on credentials only, without having to additionally provide MDM server and repository passwords.
Note
Trusted connections are currently available for SAP system usernames only.
Note
Users must still provide a DBMS password for operations that require a connection to the DBMS.


There are three basic elements in a trusted connection:
- Trusted system: the machine sending the connection request.
- Authenticated user: the username signed on to the trusted system.
- MDM Server: the MDM server receiving the connection request.

Trusted systems are identified by IP address in a special file (allow.ip). In this file, you can enter IP addresses for individual machines or for an entire subnet. Requests from IP addresses not included in this file are automatically denied. An optional file, deny.ip, lets you block specific IP addresses.
Note
You must create the allow.ip and deny.ip files; they are not created automatically by MDM.
By default, the MDM server looks for allow.ip and deny.ip in the folder containing the MDM server executable (mds.exe). You can change this location by modifying the TrustFilesDir parameter in the MDM server configuration file (mds.ini). For users to connect to an MDM repository over a trusted connection, their usernames must exist in the MDM repository's Users table. Alternatively, if the MDM server is configured for LDAP use, the username must exist in the LDAP directory referenced in the MDM server configuration file. If no matching username is found in the Users table or LDAP directory, access to the MDM repository is denied. Once connected, users are granted access to MDM repository data and functions based on the MDM role(s) assigned to their MDM or LDAP usernames.

SSO Support
The connection to MDM via the SAP logon ticket is supported for iViews and Web services only. The ticket evaluation is done by the portal, not by the MDS, so this is not real SSO; the user is extracted from the ticket and propagated to MDS.
Prerequisites:
- Trusted connection AS-MDS configured
- Same users in the portal and in MDS: LDAP directory or user replication
- The user management properties are not added to the MDM System object
Example
1. The MDS is configured with a list of trusted connections by adding the IP address of the portal AS to the allow.ip file of the MDM server.
2. A user logs in to the portal with user ID and password.
3. The portal validates the user against LDAP or user management.
4. An MDM iView is called.
5. The portal sends the user ID, but NO password, to MDM for login. Authentication: only user names are passed to MDM; no user mapping is necessary.
6. MDM recognizes the AS/portal as a trusted server.
7. MDM validates the user ID.
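As a sketch, an allow.ip file might look like the fragment below. The addresses are placeholders, and the partial-address subnet notation in the last line is an assumption; please check the MDM 7.1 Security Guide for the authoritative syntax.

```
10.17.64.25
10.17.64.26
10.17.80.
```

The first two entries would admit individual machines (for example, the portal application servers); the third is intended to admit an entire subnet.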


...

11.2 Garbage Connections

Looking at the MDM Console, you might see many connections remaining even after logging off from the portal. These dead connections do not have a significant performance impact.

11.3 How to find MDM 7.1 Portal Content build version


You may want to know the build number of the MDM 7.1 SCA portal software components (the MDM_JAVA_API, BP_MDM_TECHNOLOGY, and BP_MDM_APPLICATION .sca files) that you would like to install, or that are currently installed in your system.
1. In any of these .sca files, locate the SAP_MANIFEST.MF file and, within it, the detailed version information after the "keycounter" key.
2. Alternatively, check the detailed version information of the MDM SCA portal software components visible at e.g. http://yourportal.com/index.html -> System Information -> all components -> the info next to MDM_JAVA_API, BP_MDM_TECHNOLOGY, BP_MDM_APPLICATION.
The detailed information has the following format: 1000.7.1.aa.bb.yyyymmddHHMMSS


How to read this detailed information:
aa - Service Pack level
bb - build number
yyyy - year of creation (of the .sca file)
mm - month of creation (of the .sca file)
dd - day of creation (of the .sca file)
HHMMSS - time of creation (of the .sca file)
For example, the detailed information 1000.7.1.2.59.20090413174341 means you are using 7.1 SP02, build 7.1.02.59, created on 13/04/2009 at 17:43:41. Please see SAP Note 1361165.
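The version string format above lends itself to a small parser (a sketch based on the stated format; the function name is made up for illustration):

```python
from datetime import datetime

def parse_sca_version(version):
    """Split 1000.7.1.<SP>.<build>.<yyyymmddHHMMSS> into its parts."""
    parts = version.split(".")
    return {
        "release": parts[1] + "." + parts[2],                   # e.g. "7.1"
        "sp": int(parts[3]),                                    # Service Pack level
        "build": int(parts[4]),                                 # build number
        "created": datetime.strptime(parts[5], "%Y%m%d%H%M%S"), # .sca creation time
    }

info = parse_sca_version("1000.7.1.2.59.20090413174341")
```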

11.4 MDM Web Services


CAUTION This is the current state of planning and may be changed by SAP at any time for any reason without notice. (August 2009)

Web services for MDM are open interfaces to the MDM server, based on the Simple Object Access Protocol (SOAP) and Web Services Description Language (WSDL) standards.
Key features:
- Data management capabilities (create, read, update, and delete)
- Access to central key mapping (create, read)
- Synchronous access to MDM

The Web services solution consists of two modules:
- Design time: the Web Services Generator
- Runtime: the generated Web services

The following graphic is an overview of the possible Web services components:


The MDM Web Service design-time software component must be deployed on:
- a J2EE engine of an SAP NetWeaver 7.0 installation (minimum SP 15), or
- a JEE5 engine of an SAP NetWeaver 7.1 installation (EHP1)
CAUTION
There are two versions of the Web Services Generator; make sure each version is deployed on the appropriate NetWeaver Java AS.
Required in addition: an MDM 7.1 server (minimum 7.1.02.51).

The MDM Web Service runtime software component must be deployed on:
- a J2EE engine of an SAP NetWeaver 7.0 installation (minimum SP 15), or
- a JEE5 engine of an SAP NetWeaver 7.1 installation (EHP1)
CAUTION
There are two different software component archives (SCA files): one compatible with Java AS 7.0 and one compatible with Java AS 7.1.
Required in addition: an MDM 7.1 server (minimum 7.1.02.51).


11.5 MDM Web Dynpro Components


CAUTION This is the current state of planning and may be changed by SAP at any time for any reason without notice. (August 2009)

MDM Web Dynpro components are ready-made, granular UI building blocks that are configurable and reusable and reduce the effort needed to create custom applications.
Planned MDM Web Dynpro components:
- Result Set
- Item Details (CRUD)
- Search
Planned use case scenarios:
- Building a stand-alone Web Dynpro application
- Consuming an MDM Web Dynpro component in an SAP NetWeaver Business Process Management (BPM) process

A Web Dynpro project is configured prior to configuring an MDM Web Dynpro component (Item Details, Result Set, or Search); the project acts as a container for the configured components. At runtime, the MDM Web Dynpro component is displayed according to its configuration. MDM Web Dynpro components can be consumed by other custom Web Dynpro components to create a Web Dynpro application with the flexibility to run stand-alone or in a portal environment. MDM Web Dynpro components can also be consumed directly in an SAP NetWeaver Business Process Management (BPM) process.


12. PI Adapter
12.1 Communication with MDS

The MDM PI Adapter uses the MDM Java API to get data from and send data to MDS. It also uses the MDM Java API to receive notifications of events, such as the completion of an import or a syndication.
Note
The MDM Java API must be deployed on the same physical server as the MDM PI Adapter.

12.1.1 Import - MDM PI Adapter to MDS


When the MDM PI Adapter receives an XML message from the Integration Engine, it calls the MDM Java API, which takes the XML message and writes it to the Ready folder for processing. Which Ready folder it writes to depends on how the communication channel is configured. Once the file has been imported, MDS sends an event notification to the MDM Java API, which forwards it to all registered event listeners. Since the MDM PI Adapter is a registered event listener, it receives the notification and updates the message status in the Communication Channel Monitoring. This completes the process of sending data from PI to MDS.
Note
As previously noted (10.1 Notifications (Events)), delivery of event notifications is not guaranteed. As such, it is possible that the status of a message in the Communication Channel Monitoring does not reflect the actual status because a notification wasn't received by the MDM PI Adapter.

12.1.2 Syndication - MDS to MDM PI Adapter

When the Syndicator or MDSS completes syndication to a port, MDS sends an event notification to the MDM Java API, which forwards it to all registered event listeners. Since the MDM PI Adapter is a registered event listener, it receives the notification, retrieves the syndicated files, and sends them to the Integration Engine. This completes the process of sending data from MDS to PI.
Note
As previously noted (10.1 Notifications (Events)), delivery of event notifications is not guaranteed. As such, it is possible that files generated from a syndication are not picked up by the MDM PI Adapter. If the event notification isn't received because the MDM PI Adapter is stopped, or unavailable because PI is unavailable, the MDM PI Adapter retrieves all files in the Ready folder when it is restarted.

