doi:10.2903/sp.efsa.2017.EN-1181
Abstract
The European Food Safety Authority (EFSA) is tasked with coordinating the reporting of zoonoses,
zoonotic agents, animal populations, antimicrobial resistance and food-borne outbreaks in the
European Union (EU) under Directive 2003/99/EC, as well as analysing and summarising the data
collected. For data collection purposes, EFSA created a simple Microsoft Office Excel-based mapping
tool to allow Member States (MSs) to map their standard terminology to EFSA's standard terminology.
This technical report is a user manual for data providers (reporting officers and reporters) that
describes how to map terms and generate extensible markup language (XML) for submission of 2016
data to EFSA, according to Zoonoses data models: prevalence, antimicrobial resistance, food-borne
outbreak, animal population, disease status and text forms.
European Food Safety Authority, 2017
Acknowledgements: EFSA wishes to thank the members of the Scientific Network for Zoonoses
Monitoring Data that reviewed this report, and EFSA staff members: Kenneth Mulligan, Doreen
Dolores Russell and Anca-Violeta Stoicescu for the preparatory work on this technical output.
Suggested citation: EFSA (European Food Safety Authority), 2017. User manual for data providers
for mapping Member State standard terminology to EFSA standard terminology. EFSA supporting
publication 2017:EN-1181. 26 pp. doi:10.2903/sp.efsa.2017.EN-1181
ISSN: 2397-8325
European Food Safety Authority, 2017
Reproduction is authorised provided the source is acknowledged.
Summary
This user manual provides guidance to data providers on the use of the Microsoft Office Excel-based
mapping tool for the submission of 2016 data by Member States (MSs) and other reporting countries.
This tool was developed by the Evidence Management Unit (DATA) in order to help MSs to report
electronic data for zoonoses, zoonotic agents, animal populations, antimicrobial resistance, food-borne
outbreaks and text forms according to the zoonoses data models.
The manual provides a general introduction to the mapping tool and covers all aspects related to the
management and use of the mapping tool. In particular, detailed guidance is provided for the
mapping of standard terminology used at national level to EFSA standard terminology, and the
creation of a locally validated extensible markup language (XML) file for submission of data to EFSA
via the Data Collection Framework (DCF).
This manual is specifically aimed at guiding the mapping of MS-specific standard terminology to EFSA
standard terminology, thus creating a much-needed centralised document on the mapping of controlled
terminology.
Table of contents
Abstract
Summary
1. Introduction
2. Introducing the mapping tool
2.1. How the coded sheet works
2.2. General points on Excel worksheets
3. Getting started by profiling your data
3.1. Mapping your profiled terms
3.2. Steps in the mapping process
3.3. The catalogue/hierarchy search functionality
4. Refreshing and changing views on data
4.1. The adding and removing of formula rows
4.2. The refreshing of data
4.3. Different views on the codified data
5. Generating XML file
6. Rebuilding a broken coded file
7. Particularities of the manual mapping tool
7.1. Adding rows with dropdowns in the ZOO_MAN worksheet
7.2. Rebuilding rows with dropdowns in the ZOO_MAN worksheet
7.3. Reporting repeatable data element
8. Text forms Manual Mapping Tool
References
Abbreviations
1. Introduction
1 Available online: http://registerofquestions.efsa.europa.eu/roqFrontend/questionsListLoader?mandate=M-2015-0231
Manual for reporting on antimicrobial resistance within the framework of Directive 2003/99/EC
and Decision 2013/652/EU for information derived from the year 2016
Data dictionaries - guidelines for reporting data on zoonoses, antimicrobial resistance and
food-borne outbreaks using the EFSA data models for the Data Collection Framework (DCF) to
be used in 2017, for 2016 data
User manual for data providers for mapping Member State standard terminology to EFSA
standard terminology
The present technical report is the manual for the Excel mapping tools pertaining to the 2016 data
collection.
The mapping tool is an Excel workbook which has three default data processing worksheets. In addition
to these three worksheets, each model's workbook contains all the catalogues/hierarchies needed for
that data model. All the catalogue/hierarchy worksheets have 'CAT_' as the prefix to the
catalogue/hierarchy name. In these catalogue/hierarchy worksheets data providers need to map
their own specific terms to those required by EFSA.
In Figure 1 the default worksheets are presented:
Mapping_Options;
ZOO_FACT_(DATA MODEL_NAME);
CODED.
The Mapping_Options worksheet contains all the functionality which data providers need to manage
their mapping file. The features provided there are presented in each of the sections below.
ZOO_FACT_(DATA MODEL_NAME) or ZOO_MAN_(DATA MODEL_NAME) is where data providers insert
their own data. This is the first step whereby data providers must fit their current data into
the structure of the data model in question. This worksheet is the format into which
reporting countries must organise their data for successful mapping and submission of
data to EFSA.
Figure 2: ZOO_FACT worksheet displays mandatory, compound, facet and optional data elements
The repeatable data elements are designed to allow the user to enter multiple values for the
same data element; to do this, the mapping tool requires that the values be separated with an
asterisk '*' (without spaces).
Figure 4 gives an example of how a repeatable value can be reported using the dynamic mapping tool.
The values entered for the esbl element in the ZOO_FACT_AMR_ISOLATE_AST worksheet should be delimited
with '*'. When the values inserted in the esbl data element are mapped (in the
hierarchy CAT_PARAM_esbl), each value should be mapped to its corresponding EFSA controlled
terminology.
Figure 5 displays in the CODED worksheet the code produced for the repeatable element esbl.
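The handling of a repeatable element described above can be sketched in a few lines of Python. This is only an illustration of the split-then-map idea, not the tool's own implementation: the function name code_repeatable, the national terms and the codes in ESBL_MAPPING are hypothetical placeholders, and the way the CODED worksheet joins the resulting codes is not reproduced here.

```python
# Hypothetical stand-in for the CAT_PARAM_esbl hierarchy: national term -> EFSA code.
# The terms and codes below are placeholders, not real EFSA catalogue entries.
ESBL_MAPPING = {
    "national term A": "CODE-A",
    "national term B": "CODE-B",
}


def code_repeatable(cell_value, mapping, delimiter="*"):
    """Split a repeatable cell on the delimiter and map each value on its own."""
    values = cell_value.split(delimiter)
    # Each value needs its own mapping; unmapped values are flagged individually.
    return [mapping.get(value, "NOT MAPPED") for value in values]


print(code_repeatable("national term A*national term B", ESBL_MAPPING))
```

Because each value is looked up separately, a single unmapped value inside a repeatable cell is enough to leave the record unmapped.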
The standard data elements are single elements which are linked, one to one, between the
value entered for the element and the corresponding mapped element code (only if the standard
element is linked to an EFSA controlled terminology catalogue/hierarchy).
To further improve the tool's usability, definitions for each data element have been included in the
ZOO_FACT_(DATA MODEL_NAME) worksheet (Figure 6). To see a definition, data providers can
simply place the mouse over the element name displayed in row one of the worksheet and that
element's definition will be displayed (Figure 6). By clicking on the element name, data providers will
be transferred to that element's catalogue/hierarchy and can see the terms that are permissible and
the mappings that have been made between the data provider's data and EFSA's standard
terminologies.
Figure 6: ZOO_FACT worksheet displays the definitions for each data element
The CODED worksheet (Figure 7) reads the data inserted into the ZOO_FACT_(DATA MODEL_NAME)
or ZOO_MAN_(DATA MODEL_NAME) worksheet and then searches for the correct mapped term in the
various catalogue worksheets (this is presented in more detail in section 2.1).
Figure 7 shows that any terms that are not mapped are highlighted in red and display the text 'Not
Mapped' followed by the value of the term in the ZOO_FACT_(DATA MODEL_NAME) worksheet. All terms
have to be mapped correctly before data providers can produce a valid file to be submitted in the DCF.
The worksheet CODED, Figure 8, contains the LookUpDicTerm function that will query the mapping
that data providers will create in each of the catalogue worksheets. The function will return a mapped
EFSA code. It is therefore important to make sure that the mappings created by data providers are
correct.
In Figure 8, highlighted by number (2), the value RF-00002560-MCG can be seen. This is the code
returned for a valid mapped term, found by the LookUpDicTerm function highlighted by the number
(3) in Figure 8.
The LookUpDicTerm function expects as its first argument the name of a catalogue to search in, in this
case CAT_PARAM_AMR, and as its second argument a value. The value is taken from the
ZOO_FACT_AMR_ISOLATE_AST worksheet, column E, row 2 (the zoonosis data model element). The value
contained in cell E2 should be mapped to a catalogue term in the CAT_PARAM_AMR catalogue. If the
term is not mapped, the function returns NOT MAPPED. All NOT MAPPED values must be fixed before a
valid submission file can be made.
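As a rough illustration of the lookup behaviour just described, the following Python sketch mimics a function that takes a catalogue name and a value and returns either the mapped EFSA code or NOT MAPPED. It is not the workbook's actual LookUpDicTerm code: the name look_up_dic_term, the CATALOGUES dictionary and the national term 'example national term' are assumptions made for the example; only the code RF-00002560-MCG and the catalogue name are taken from the text.

```python
# Hypothetical in-memory picture of the catalogue worksheets:
# catalogue name -> {data provider term: mapped EFSA code}.
CATALOGUES = {
    "CAT_PARAM_AMR": {
        # Placeholder national term; the code is the one quoted for Figure 8.
        "example national term": "RF-00002560-MCG",
    },
}


def look_up_dic_term(catalogue_name, value, catalogues=CATALOGUES):
    """Return the mapped EFSA code for value, or 'NOT MAPPED' if there is none."""
    catalogue = catalogues.get(catalogue_name, {})
    return catalogue.get(value, "NOT MAPPED")


print(look_up_dic_term("CAT_PARAM_AMR", "example national term"))
print(look_up_dic_term("CAT_PARAM_AMR", "term without a mapping"))
```

The second call shows why every term has to be mapped first: the function cannot guess a code, it can only report that none was found.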
The number (4) in Figure 8 highlights the fact that ZOO_FACT_(DATA MODEL_NAME) elements that
require an EFSA standard term (catalogue) are linked to their respective catalogue in the workbook.
To view the standard terms catalogue, the data providers can simply click the link and the catalogue
sheet will open.
By default, the CODED worksheet has only one row pre-filled with formulas. If data providers want to
extend the number of rows in the CODED worksheet to match the number of data entries (rows) in
the worksheet ZOO_FACT_(DATA MODEL_NAME), they need to use the 'Resize Coded Sheet Formulas'
button in the Mapping_Options worksheet.
For example, in a ZOO_FACT_(DATA MODEL_NAME) worksheet with 560 rows, only the first one (the
default is one) will be visible in the CODED worksheet. Please note that 560 counts only the rows
that have actual data; the first row, which contains the element names, should not be counted. Data
providers will therefore need to enter 560 in the text box beside the 'Resize Coded Sheet Formulas'
button in the Mapping_Options worksheet and then press the button (see Figure 9).
After data providers have resized the sheet formulas, the number of rows with formulas in the
CODED worksheet will match the 560 data rows in the ZOO_FACT_(MODEL_NAME)
worksheet. All 560 row values in the ZOO_FACT_(MODEL_NAME) worksheet will now be copied across
to the CODED worksheet and coded.
Each ZOO_FACT_(DATA MODEL_NAME) data model contains an up-to-date list of standard terms
(catalogues) and each data model contains only the catalogues that it needs to reference.
If data providers do not already have a list of all the terms that need to be mapped per data model to
EFSA terms, one simple way to profile their data is to use the Profile Your Data button in the
Mapping_Options worksheet (see below).
Figure 12: Example of collecting and inserting of all unique terms into a specific catalogue
The profile option selects and adds only unique and new terms to the catalogues; thus, if data
providers add more new data points to the ZOO_FACT_(DATA MODEL_NAME) worksheet, data
providers can continue to use the Profile Your Data button to check for new terms to be added to
catalogues.
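The profiling behaviour described above (only unique and new terms are added) can be pictured with a short Python sketch. It is a simplified model under stated assumptions: the function name profile_terms is hypothetical, a catalogue is represented as a plain dictionary, and None is used here to mark a term that has been collected but not yet mapped.

```python
def profile_terms(column_values, catalogue):
    """Add each term not yet in the catalogue (unmapped) and return the new ones."""
    new_terms = []
    for term in column_values:
        # Skip empty cells and terms the catalogue already knows about.
        if term and term not in catalogue:
            catalogue[term] = None  # None marks a term still awaiting a mapping
            new_terms.append(term)
    return new_terms


# Hypothetical catalogue with one term already mapped.
catalogue = {"Montevideo": "RF-00000796-MCG"}
print(profile_terms(["Montevideo", "Term X", "Term X", ""], catalogue))
print(profile_terms(["Montevideo", "Term X"], catalogue))
```

Running the profile a second time on the same data adds nothing, which is why it can safely be repeated whenever new data points are inserted.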
Figure 14: Example of mapping a national standard term with EFSA's term
Figure 17: Resize Coded Sheet Formulas functionalities: the default size of the coded worksheet
Data need to be coded only when data providers are producing a dataset which they wish to submit
to EFSA's DCF. To do this, data providers add as many rows to the CODED sheet as they have in
their ZOO_FACT_(DATA MODEL_NAME) worksheet: they enter the number of rows in the text box and
select the Resize Coded Sheet Formulas button. All the row values in the
ZOO_FACT_(DATA MODEL_NAME) worksheet will then be transferred and codified into the
CODED worksheet.
Because the first row in the mapping tool contains the names of the elements, the number of rows in
the coded worksheet after a resize is always the first row plus the number supplied to the Resize
Coded Sheet Formulas button. The series of images in Figure 18 demonstrates how the mapping tool
handles the increase of formula rows in the coded worksheet.
The default size of the coded worksheet is one row. Figure 18 shows the sequence for increasing the
size of the CODED worksheet to two. In step 1 of the sequence, the coded worksheet is at its default
size of one, so only one row of data from the ZOO_FACT_(DATA MODEL_NAME) worksheet is displayed in
it. In step 2, the coded worksheet is resized to two rows. In the final step, 3, the coded worksheet
displays two rows of data from the ZOO_FACT_(DATA MODEL_NAME) worksheet.
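The row arithmetic behind the resize can be written down explicitly. The Python sketch below is illustrative only; the name coded_row_span is hypothetical, and the single header row is the one holding the element names.

```python
HEADER_ROWS = 1  # row 1 of the worksheet holds the element names


def coded_row_span(n_data_rows):
    """Return the first and last sheet rows that hold data after a resize."""
    if n_data_rows < 1:
        raise ValueError("the coded worksheet always has at least one data row")
    first_row = HEADER_ROWS + 1
    last_row = HEADER_ROWS + n_data_rows
    return first_row, last_row


print(coded_row_span(1))    # default size
print(coded_row_span(560))  # the 560-row example
```

For the 560-row example, the data therefore occupies sheet rows 2 to 561, even though the number entered in the text box is 560.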
Among the options available via the Mapping Mode section in the Mapping_Options worksheet seen
in Figure 13 above, data providers can see that, for normal mapping there is:
1) Normal Mapping: Code, only codes are displayed;
2) Normal Mapping: Code+Term, codes and terms are displayed together
And under the Display mapped data there are:
1) EFSA Terms
2) Memberstate mapped terms
These two features combined will provide us with the different views we mentioned above. For
example, if we select
1) Normal Mapping: Code only displayed, and
1) EFSA Terms
the coded term is all that we will see in the CODED worksheet:
zoonosis
RF-00000796-MCG
If we choose
2) Normal Mapping: Code+Term and
1) EFSA Terms
we get both the EFSA code and the term text:
zoonosis
RF-00000796-MCG | Salmonella - S. Montevideo
If we really want to be sure the mapping we made was correct between the reporting country term
and the EFSA term, we can also choose
2) Normal Mapping: Code+Term, and
2) Memberstate mapped terms
This will provide us the Member State term (in this example Montevideo), its mapped EFSA code and
the term text:
zoonosis
Montevideo | RF-00000796-MCG | Salmonella - S. Montevideo
This level of display flexibility should make the tool easier to use. When the user is ready
to create and submit their data, the first example setting should be selected, so that only the coded
term is displayed and not the text associated with it, namely:
1) Normal Mapping: Code, and
1) EFSA Terms
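The three views shown in the examples above can be summarised as a small formatting function. This Python sketch is an illustration rather than the tool's logic: the function name render_coded_cell and the option values 'code', 'code+term', 'efsa' and 'memberstate' are hypothetical labels for the two settings, and the output for the fourth combination (code only with Member State terms) is an assumption, since the text does not show it.

```python
def render_coded_cell(ms_term, efsa_code, efsa_text, mode="code", display="efsa"):
    """Build the text of a coded cell for the chosen mapping mode and display."""
    parts = []
    if display == "memberstate":
        parts.append(ms_term)    # show the reporting country's own term first
    parts.append(efsa_code)      # the EFSA code is always present
    if mode == "code+term":
        parts.append(efsa_text)  # add the EFSA term text after the code
    return " | ".join(parts)


args = ("Montevideo", "RF-00000796-MCG", "Salmonella - S. Montevideo")
print(render_coded_cell(*args, mode="code", display="efsa"))
print(render_coded_cell(*args, mode="code+term", display="efsa"))
print(render_coded_cell(*args, mode="code+term", display="memberstate"))
```

Only the first view yields a bare code, which is why it is the setting to select before creating the submission file.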
The Create DCF XML functionality should be used for the dataset operations below:
1. Insert: The data provider sends a new dataset to be inserted into EFSA's database. A unique
identifier for the dataset (datasetId) is generated by the receiver system and assigned to the dataset
(EFSA, 2014).
2. Replace: A new dataset entirely replaces a dataset previously stored in EFSA's database. The
system uses, for the new dataset, the same datasetId previously assigned to the replaced one.
3. Partial replace: A set of records is uploaded to partially replace a target dataset present in EFSA's
database. The dataset maintains the same datasetId. This operation is recommended only for large
datasets containing just a few records to be updated. The system updates in the target dataset all the
records that have a corresponding record unique identifier. If the newly uploaded dataset also
contains records for which the record unique identifier does not exist in the target dataset, the records
are appended to the target dataset.
4. Partial delete: A set of records is uploaded to partially delete a target dataset present in EFSA's
database. The dataset maintains the same datasetId. The system deletes in the target dataset all the
records that have a corresponding record unique identifier. If the newly uploaded set also
contains records for which the record unique identifier does not exist in the target dataset, the
operation fails.
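The difference between partial replace and partial delete, in particular how each treats record identifiers that are unknown to the target dataset, can be sketched in Python. This models only the semantics described above and is not DCF code: a dataset is represented as a dictionary keyed by record unique identifier, and the function names partial_replace and partial_delete are hypothetical.

```python
def partial_replace(target, uploaded):
    """Update matching records and append records whose identifier is new."""
    result = dict(target)    # leave the original dataset untouched
    result.update(uploaded)  # existing ids are overwritten, new ids are added
    return result


def partial_delete(target, uploaded_ids):
    """Delete matching records; fail if any identifier is not in the target."""
    missing = [record_id for record_id in uploaded_ids if record_id not in target]
    if missing:
        raise ValueError("unknown record identifiers: %s" % ", ".join(missing))
    to_delete = set(uploaded_ids)
    return {rid: rec for rid, rec in target.items() if rid not in to_delete}


dataset = {"r1": "first record", "r2": "second record"}
print(partial_replace(dataset, {"r2": "second record, updated", "r3": "third record"}))
print(partial_delete(dataset, ["r1"]))
```

An unknown identifier is tolerated by partial replace (the record is appended) but makes a partial delete fail as a whole.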
Create [Update] DCF XML and Create [Delete] DCF XML should be used ONLY to create XML
files to facilitate the Amendment operations. The amendments allow MSs to modify records in
datasets which are ACCEPTED DWH (EFSA, 2014). If an amendment to the dataset which was
ACCEPTED DWH is needed, the data provider shall upload a new dataset with the insert operation
to perform an amendment operation on the records already loaded in the DWH.
Amendment operations are transmitted as datasets. The field, amType (amendment type), has to be
used in order to specify which type of amendment is requested by the data provider for each of the
records to be amended included in the dataset and identified by the record unique identifier.
The amendment operations supported by the tool are:
amType = U (update), then Create [Update] DCF XML should be used
This operation is used to update records (status ACCEPTED DWH) in the EFSA database. This
operation will result in a new version of the record in the database.
amType = D (delete), then Create [Delete] DCF XML should be used
This operation is used to perform a deletion (status ACCEPTED DWH) of accepted records in
the database. The records flagged as deleted will reach their final status in the receiver DWH
and can no longer be modified.
Please note that only the records in the ZOO_MAN/ZOO_FACT_(DATA MODEL_NAME) worksheet that
have their amType set to U (update) or D (delete) will be exported to a DCF formatted XML file.
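The selection of records by amendment type can be pictured as a simple filter. The Python sketch below assumes that each button exports only the records carrying its own amType value; the function name select_amendments, the row dictionaries and the recordId key are hypothetical.

```python
def select_amendments(rows, am_type):
    """Keep only the rows whose amType matches the requested amendment type."""
    if am_type not in ("U", "D"):
        raise ValueError("amType must be 'U' (update) or 'D' (delete)")
    return [row for row in rows if row.get("amType") == am_type]


# Hypothetical worksheet rows; rows without an amType are never exported.
rows = [
    {"recordId": "r1", "amType": "U"},
    {"recordId": "r2", "amType": "D"},
    {"recordId": "r3", "amType": ""},
]
print(select_amendments(rows, "U"))
print(select_amendments(rows, "D"))
```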
When the data provider is satisfied that all data has been mapped, the Create DCF XML button
should be used. This action will start the full validation of the data that is in the CODED worksheet,
to produce a valid XML file containing all inserted and mapped data.
If, while processing and validating the data in the CODED worksheet, an error is found,
the creation of the XML file will be stopped and an error box will be displayed beside the cell
where the error occurred, explaining in detail what the problem is. Figure 22 below
shows the error message box for a case where the mandatory data element repYear (cell
highlighted in red) was not filled in for the cell B2. This error message box can be moved or
deleted.
When all errors have been fixed, the data provider will be prompted to save the created
XML file, with the option to add the current time. Once the XML export is finished, the file
can be uploaded to the DCF.
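The validate-then-export flow described above can be sketched in Python with the standard library. This is a minimal model, not the tool's validation or the DCF XML schema: the function names validate and build_xml and the XML tag names dataset and result are hypothetical, and only repYear is taken from the text as an example of a mandatory element.

```python
import xml.etree.ElementTree as ET

MANDATORY = ["repYear"]  # named in the text as mandatory; a real model has more


def validate(rows, mandatory=MANDATORY):
    """Return (sheet_row, element) for the first problem found, or None."""
    for index, row in enumerate(rows):
        sheet_row = index + 2  # row 1 of the worksheet holds the element names
        for element in mandatory:
            if not row.get(element):
                return (sheet_row, element)
        for element, value in row.items():
            if str(value).upper().startswith("NOT MAPPED"):
                return (sheet_row, element)
    return None


def build_xml(rows):
    """Stop at the first error; otherwise serialise every row to XML."""
    problem = validate(rows)
    if problem is not None:
        raise ValueError("row %d: element %s is missing or not mapped" % problem)
    root = ET.Element("dataset")  # hypothetical tag names, not the DCF schema
    for row in rows:
        record = ET.SubElement(root, "result")
        for element, value in row.items():
            ET.SubElement(record, element).text = str(value)
    return ET.tostring(root, encoding="unicode")


print(validate([{"repYear": "", "zoonosis": "RF-00000796-MCG"}]))
print(build_xml([{"repYear": "2016", "zoonosis": "RF-00000796-MCG"}]))
```

As in the tool, no file is produced until the first reported problem has been fixed and the export is run again.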
The manual tool does not support all the functionality (Figure 24) of the dynamic mapping tool; this is
because the LookUpDicTerm function is not used in the CODED worksheet. The CODED worksheet in the
manual model simply removes the text part of an EFSA standard coded term, leaving only the code;
no dictionary lookup is needed because the values in the ZOO_MAN (MODEL_NAME) worksheet are
already coded terms selected from a pick list. Figure 25 below shows how each term is picked from a
dropdown list in the data entry worksheet.
Figure 25: Example of selection of terms from a dropdown list in the data entry worksheet
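The step in which the manual tool keeps the code and drops the term text can be sketched as follows. The exact layout of a pick list entry is not given in the text, so this Python example assumes, purely for illustration, that an entry has the form 'code | term text'; the function name strip_term_text is hypothetical.

```python
def strip_term_text(pick_list_value, separator="|"):
    """Keep the code part of a 'code | term text' entry (assumed layout)."""
    code, _, _ = pick_list_value.partition(separator)
    return code.strip()


print(strip_term_text("RF-00000796-MCG | Salmonella - S. Montevideo"))
```

No catalogue lookup takes place here: the code is already contained in the selected value and only needs to be separated from its description.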
The manual tool's data entry worksheet, ZOO_MAN (MODEL_NAME), supports the pick list search
functionality outlined in section 3.3. Instead of selecting directly from the pick list dropdowns as
shown in Figure 25, data providers can type in their pick list search terms.
Figure 26: Example of rebuilding of CODED and/or any of the dropdown pick lists
Figure 27: Repeatable data elements inserted in the FBO manual mapping tool
Figure 28: Example showing a selection of terms from a drop down list from the catalogue
CAT_PARAGRAPH_TYPE
An added feature of the ZOO_MAN_TEXTORMS worksheet is the possibility to select, from a
pre-defined list, the description of the sub-title (subTitleOrder) corresponding to the
reported paragraph, which is designated by a number. Figure 28 illustrates this: in the
column headed subTitleOrder the numbers 1 (Description of sampling designs) and 8 (Sampling
strategy used in monitoring - Methods used for collecting data) have been chosen.
The free text information for each paragraph and subtitle is inserted by the data provider manually in
column N (value) in the text box containing the number and Text Form Subtitle (see Figure 29).
Figure 29: Text box for reporting information relevant to the selected Text Form Subtitle
References
EFSA (European Food Safety Authority), 2014. Guidance on Data Exchange version 2.0. EFSA Journal
2014;12(12):3945, 173 pp., doi:10.2903/j.efsa.2014.3945
EFSA (European Food Safety Authority), 2017a. Manual for reporting on zoonoses and zoonotic
agents, within the framework of Directive 2003/99/EC, and on some other pathogenic
microbiological agents for information deriving from the year 2016. EFSA supporting publication
2017:EN-1175. 98 pp. doi:10.2903/sp.efsa.2017.EN-1175
EFSA (European Food Safety Authority), 2017b. Manual for reporting on antimicrobial resistance
within the framework of Directive 2003/99/EC and Decision 2013/652/EU for information deriving
from the year 2016. EFSA supporting publication 2017:EN-1176. 35 pp.
doi:10.2903/sp.efsa.2017.EN-1176
EFSA (European Food Safety Authority), 2017c. Manual for reporting on food-borne outbreaks in
accordance with Directive 2003/99/EC for information deriving from the year 2016. EFSA
supporting publication 2017:EN-1174. 44 pp. doi:10.2903/sp.efsa.2017.EN-1174
EFSA (European Food Safety Authority), 2017d. Data dictionaries - guidelines for reporting data on
zoonoses, antimicrobial resistance and food-borne outbreaks using the EFSA data models for the Data
Collection Framework (DCF) to be used in 2017, for 2016 data. EFSA supporting publication
2017:EN-1178. 26 pp. doi:10.2903/sp.efsa.2017.EN-1178
Abbreviations
AMR antimicrobial resistance
DCF Data Collection Framework
EC European Commission
ECDC European Centre for Disease Prevention and Control