TECHNICAL REPORT
KEY WORDS
Zoonoses, XML, BIOMO Data Models, XSD, DCF, Excel.
Correspondence: zoonoses@efsa.europa.eu
SUMMARY
This user manual provides guidance on the use of the BIOMO mapping tool for the
submission of 2012 data by the Member States and other reporting countries.
The manual gives a general introduction to the mapping tool and covers all
aspects of its management and use. In particular, detailed
guidelines are provided for mapping standard terminology and creating a
locally validated XML file for submission of data to EFSA via the DCF (Data Collection
Framework).
Instructions are also given on updating the EFSA standard terminology dictionaries
and on incorporating these updates in the mapping tool.
This manual is specifically aimed at guiding the mapping of Member State specific standard
terminology to that published by EFSA, thus creating a much needed documented and
centralized mapping of controlled terminology.
1 Introducing the Mapping Tool
The Mapping Tool's main goal is to provide a simple yet usable platform for Member
States to map their country specific standard terminology to that published by EFSA,
producing an XML file for the submission of sample based or aggregated zoonoses
monitoring data via the DCF.
This User Manual introduces the basic functionalities available in the Mapping Tool, such
as:
Generating and validating XML via the published EFSA data model XSD
This manual also includes some suggestions on how to proceed when you want to profile
your data. These are simply suggestions on our behalf and do not need to be followed to
the letter.
The mapping tool is essentially an Excel workbook which has three default worksheets,
which are colour coded, as shown in the diagram below. The
red tab worksheet (number 1 in the diagram), whose name contains the text
ZOO_FACT_(MODEL_NAME), is where you put your own data. This is the first step,
in which you fit your current data into the structure of the data model in question.
This worksheet is the format in which Member States must organise their data for
successful mapping and submission of data to EFSA.
The next worksheet of importance is coloured green and is called CODED (number 2 in the
diagram). This worksheet reads the data inserted into the red ZOO_FACT worksheet and
then searches for the correct mapped term in the various pick list worksheets included in
the Excel workbook (more on this later).
The last coloured worksheet (number 3 in the diagram) is called DCF_FORMATTED_DATA
(XML in the old mapping tool). This is where you will eventually paste your finished
data, either to export as XML via the published BIOMO XSD schema or to leave as is in
Excel format; remember that the DCF supports both Excel and XML formats. It is very
important that the Excel workbook is saved in Excel 97-2003 Workbook (*.xls) format, as
the DCF only supports Excel files of this type. Also, when a workbook submitted to the DCF
contains more than one worksheet, only the first worksheet in the
workbook is read. This is the reason why DCF_FORMATTED_DATA is the first sheet in
the series.
Figure 1
The CODED worksheet contains the LookUpDicTerm function, which queries the
mapping you create in each of the pick list worksheets. The function returns a
mapped EFSA code, so it is important to make sure that the mappings you have created
are correct. In figure 2 below, highlighted by number two, a cell is selected under the
ZOO_FACT_AMR_ISOLATE_AST data model element zoonosis; as you can
see at number three, this cell contains a formula calling LookUpDicTerm.
LookUpDicTerm accepts as its first value the name of a pick list to search, in this
case ZOO_CAT_PARAM_AMR, and a value to look up. The value is taken from the red
worksheet ZOO_FACT_AMR_ISOLATE_AST and must have been mapped to a
pick list term in the ZOO_CAT_PARAM_AMR pick list. If not, the function returns
NOT MAPPED, which means the value you are trying to use has no valid mapping. All
NOT MAPPED values must be fixed before a valid submission can be made.
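The lookup behaviour described above can be sketched outside Excel. The following Python is a minimal illustration only, not the workbook's actual implementation: the function name `look_up_dic_term`, the dictionary layout, and the sample terms and codes are all hypothetical.

```python
NOT_MAPPED = "NOT MAPPED"

# Hypothetical pick list mappings: pick list name -> {Member State term: EFSA code}.
# The terms and codes below are invented for illustration only.
PICK_LISTS = {
    "ZOO_CAT_PARAM_AMR": {
        "Salmonella spp.": "CODE_A",
        "E. coli": "CODE_B",
    },
}


def look_up_dic_term(pick_list_name, value, pick_lists=PICK_LISTS):
    """Return the mapped code for value, or NOT MAPPED if no mapping exists."""
    mapping = pick_lists.get(pick_list_name, {})
    return mapping.get(value, NOT_MAPPED)


print(look_up_dic_term("ZOO_CAT_PARAM_AMR", "E. coli"))
print(look_up_dic_term("ZOO_CAT_PARAM_AMR", "Unlisted term"))
```

The second call returns NOT MAPPED because the term has no entry in the pick list, which corresponds to a cell that must be fixed before submission.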
Number four in figure 2 highlights the fact that ZOO_FACT_(MODEL_NAME)
elements (such as repCountry, lang, zoonosis, etc.) that require an EFSA standard term (pick
list) are hyperlinked. Simply click on the link and you will be moved to the
correct pick list/catalogue in the workbook, where you can easily create your mappings.
By default the CODED worksheet only has 200 rows pre-filled with formulas. If you
have more than 200 data entries (rows) in the red worksheet
ZOO_FACT_(MODEL_NAME), you will need to extend the number of rows with
formulas to match. For example, if you have 560 rows in ZOO_FACT_(MODEL_NAME),
only the first 200 will be visible in the CODED worksheet. Select row
number 200 in the CODED worksheet and drag it down to the row number of the last row in
ZOO_FACT_(MODEL_NAME), here 560, so that all values are copied across.
Figure 2
Each ZOO_FACT_(MODEL_NAME) data model contains an up to date list of BIOMO
standard terms; these are the pick lists that are also available on the online Zoonoses
Web reporting tool. Each model contains only the pick lists that it needs to reference.
1.1
Figure 3
If you do not already have a list of all the terms that need to be mapped to EFSA terms
for each data model, one simple way to profile your data is to use an Excel pivot table. To
profile your data quickly, follow these steps:
1. Before inserting a pivot table into a new worksheet, first select all data in the red
ZOO_FACT_(MODEL_NAME) worksheet by clicking the select all table feature
indicated in the figure below.
Figure 4
2. Then select the Insert Tab from the control Ribbon, and choose Pivot table.
Figure 5
3. The table range is already set by virtue of the select all table feature we
used in step 1. Now choose to create the pivot table in a new sheet.
Figure 6
4. We now have a pivot table that holds all the data values in the
ZOO_FACT_(MODEL_NAME) red worksheet, so we can start profiling the
elements, such as zoonosis and matrix, whose terms need to be matched with
the BIOMO standard term pick lists ZOO_CAT_PARAM_ZOO and
ZOO_CAT_MATRIX. The figure below shows the pivot table that we created,
with Zoonosis selected. As you can see on the left of the figure, all the
unique terms in the Zoonosis column are displayed. These are the terms that we
need to insert into the ZOO_CAT_PARAM_ZOO pick list worksheet. We now
copy the unique terms displayed on the left under the zoonosis column and
paste them somewhere in the profiling worksheet to the right of the pivot table
so that we can reference them later.
Figure 7
We repeat the above step for all data model elements that need to be mapped to
the EFSA standard term pick lists, copying and pasting the resulting unique terms
into columns to the right of the pivot table.
Figure 8
5. Eventually the list to the right of the pivot table will contain all the unique terms that
need to be mapped, as in the figure below.
Figure 9
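For readers who prefer to profile their data outside Excel, the pivot table steps above can be approximated with a short script. This is a minimal sketch under stated assumptions: it presumes the red worksheet has been exported to CSV with a header row, and the column names and sample values are invented for illustration. The function `profile_columns` is hypothetical and not part of the mapping tool.

```python
import csv
import io


def profile_columns(rows, columns):
    """Return {column: sorted unique non-empty values} for the chosen columns."""
    unique = {column: set() for column in columns}
    for row in rows:
        for column in columns:
            value = (row.get(column) or "").strip()
            if value:
                unique[column].add(value)
    return {column: sorted(values) for column, values in unique.items()}


# Invented sample data standing in for a CSV export of the red worksheet.
SAMPLE = """zoonosis,matrix
Salmonella,Broilers
Campylobacter,Broilers
Salmonella,Pigs
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
profile = profile_columns(rows, ["zoonosis", "matrix"])
for column, terms in profile.items():
    print(column, terms)
```

Each list plays the role of one column of unique terms pasted to the right of the pivot table.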
1.2
At this point you have profiled and saved the lists of unique terms that need to be
mapped from your data set to those of EFSA. Step five in profiling your data lists the
profiled terms that we will use in our example mapping. The following sequence of steps
maps these values to their corresponding EFSA standard terms.
The figure below highlights the sections where mapping is performed in the
ZOO_CAT_PARAM_AMR pick list; all pick lists have these mapping sections. A mapping
section is linked to a specific standard term pick list. Area one is where your Member
State specific terms are inserted, and area two is where the BIOMO standard term that
matches the Member State term is selected from the dropdown box, as highlighted in
figure 10 below.
Figure 10
The following steps describe the mapping process.
1. Open the CODED worksheet in the workbook. As we know from figure 2,
the Zoonosis element name is linked to the ZOO_CAT_PARAM standard term
pick list. By using the hyperlinked index in the element zoonosis we can move to
the ZOO_CAT_PARAM_AMR pick list.
Once we have selected the ZOO_CAT_PARAM_AMR standard terms pick list, we
paste in our profiled terms under column A (MemberState_TERMS).
Figure 11
2. Once you have pasted your profiled terms into the A (MemberState_TERMS)
column, it is time to start looking for their matching terms in the Zoonoses data
model's standard terms pick list. In column B, EFSA_TERM, there is a dropdown
list of all the standard terms in the ZOO_CAT_PARAM_AMR pick list for the
element Zoonosis. You now have to search through this list until you find a
Supporting publications 2012:EN-XXX
Figure 12
3. Repeat the above step until all your terms are matched for all your profiled
elements.
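Before moving on, it can help to confirm that every profiled term has been given an EFSA term. The sketch below is a hypothetical check, not a feature of the mapping tool: it assumes the two mapping columns, A (MemberState_TERMS) and B (EFSA_TERM), have been read into a list of pairs, and the sample terms are invented.

```python
def missing_mappings(profiled_terms, mapping_rows):
    """Return profiled terms with no EFSA term selected, in input order.

    mapping_rows is a list of (member_state_term, efsa_term) pairs, mirroring
    columns A (MemberState_TERMS) and B (EFSA_TERM) of a pick list worksheet.
    """
    mapped = {
        member_term.strip()
        for member_term, efsa_term in mapping_rows
        if member_term and efsa_term and efsa_term.strip()
    }
    return [term for term in profiled_terms if term.strip() not in mapped]


# Invented example: one term was pasted in but not matched, one was never pasted.
mapping = [("Salmonella", "EFSA_TERM_1"), ("Campylobacter", "")]
print(missing_mappings(["Salmonella", "Campylobacter", "Listeria"], mapping))
```

Any term reported here would show up as NOT MAPPED in the CODED worksheet.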
2.1.1
Figure 13
3.1
Figure 14
1. In step two we paste all the values we copied from the CODED worksheet
in step one into the Excel worksheet DCF_FORMATTED_DATA. The worksheet
CODED uses cell formulas to generate its cell contents by referencing data in the
ZOO_FACT_AMR_ISOLATE_AST red worksheet. In the DCF_FORMATTED_DATA
worksheet we only want to paste the actual cell values and not the formulas
contained in the CODED worksheet. To do this, first select all the cells in the
active DCF_FORMATTED_DATA worksheet, and then under Paste choose
Paste Values. You can also do the same under Paste Special by selecting the
Values option in the dialog box.
Figure 15
2. This next step replaces all occurrences of the string NULL in cells with an
actual true null value, to prevent errors when generating and validating XML
via the XSD. Empty string values in Excel cells cause problems for the XML parser
used by Excel, which is why we replace empty strings first with the string NULL and
later with a real null value. To replace the string NULL, choose the Find and
Select tab, then select Replace. When the Find and Replace dialog appears, type
NULL into the Find what: search box and leave the Replace with: text box empty
(this indicates null to Excel). Finally, select the Replace All button, which
updates all instances of NULL with a true null value.
Figure 16
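The effect of the replacement step can be illustrated with a small sketch. It assumes the worksheet values are held as a list of rows in Python; the function name `clear_null_markers` and the sample values are hypothetical, and `None` stands in for a truly empty Excel cell.

```python
def clear_null_markers(table, marker="NULL"):
    """Replace cells equal to the marker string with None (a true empty cell)."""
    return [
        [None if cell == marker else cell for cell in row]
        for row in table
    ]


# Invented sample rows: the second cell of each row was empty in the source data.
table = [["R001", "NULL", "XX"], ["R002", "NULL", "YY"]]
cleaned = clear_null_markers(table)
print(cleaned)
```

The sketch matches whole cells only, so a value that merely contains the letters NULL is left unchanged.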
If you do not want to submit XML validated data, you can stop at this point. You
must only ensure that the DCF_FORMATTED_DATA worksheet is the first worksheet in
the workbook. It is very important that the Excel workbook is saved in Excel 97-2003
Workbook (*.xls) format, as the DCF only supports Excel files of this type. Also, when
a workbook submitted to the DCF contains more than one worksheet, it is only the
first worksheet in the workbook that is read; this is the reason why
DCF_FORMATTED_DATA is the first sheet in the series.
3.2
Figure 17
3.3
Figure 18
Once you have selected the XSD, the Multiple Roots dialog will appear. Select
dataset as indicated in the figure below and then select OK.
Figure 19
3.4
1. After we have imported our data model's XSD file as an XML Map, we can
see the structure of the XSD's elements in the XML Source window. The section of
the XML Map that we are interested in is under the dataset branch. To map the
XML Map to your data, press and hold the left mouse button on the
Result branch and, with the button still held down, drag the
Result branch across to the top left cell (in this case resultCode), which contains
the first element column name in the Coded_XML worksheet. Release the
mouse button when the top left cell becomes active. Excel will now create a
table with our XML Map as its basis. We have now successfully mapped our XML
Map to the elements in our data model.
Figure 20
2. Setting the Validate against Schema for Export feature. To activate this feature,
first select the Developer tab and then the Map Properties
button. The XML Map Properties dialog box will be displayed. Under XML
Schema validation, select the Validate data against Schema option and then
the OK button. Now when we export the data, the XML file created will
automatically be validated against the data model XSD that we imported.
Figure 21
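To make the result of the XML Map more concrete, the sketch below builds a small XML document from tabular rows. It is an illustration under assumptions, not the BIOMO schema: the element names `dataset` and `result` are taken loosely from the branch names mentioned above, the column names and values are invented, and the real structure is defined by the published XSD. The Python standard library does not include an XSD validator, so the sketch stops at generating the XML.

```python
import xml.etree.ElementTree as ET


def rows_to_xml(header, rows):
    """Build <dataset><result>...</result></dataset>, skipping empty cells."""
    dataset = ET.Element("dataset")
    for row in rows:
        result = ET.SubElement(dataset, "result")
        for name, value in zip(header, row):
            if value is None or value == "":
                continue  # an empty cell produces no element
            ET.SubElement(result, name).text = str(value)
    return ET.tostring(dataset, encoding="unicode")


# Invented header and row; the third cell is a true null and is omitted.
header = ["resultCode", "repCountry", "lang"]
rows = [["R001", "XX", None]]
xml_text = rows_to_xml(header, rows)
print(xml_text)
```

Omitting empty cells mirrors the reason for the earlier NULL replacement step: a true null produces no element, whereas a placeholder string would be exported as data.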
3.5
Figure 22