
Ab Initio: EME Technical Repository

1. Enterprise Meta>Environment (EME)


Ab Initio Enterprise Meta>Environment (EME) is an object-oriented data storage
system that provides a platform for controlling object versions and managing the
various kinds of information associated with Ab Initio applications. In simple
terms, the EME is a repository that contains data about data, i.e. metadata.
An EME datastore, or technical repository, is a specific instance of the EME in the
environment. It is a repository in which different versions of code elements and
their related information, such as record formats and transformations, are
maintained. At any point in time, a user can connect to only one such EME
repository instance.

2. EME Project
A Project is a collection of graphs and their associated elements, such as dml and
xfr files, in the EME Datastore.
2.1 Project structure

Primarily, a Project is a group of graphs and related objects stored under a single
directory tree. /Projects is the default root under which all Projects are
maintained inside the Datastore. Each Project directory has the basic structure
given below.

BIN: Sub-directory for tools and utilities
DB: Sub-directory for database interface files
DML: Sub-directory for record format files
MP: Sub-directory for Ab Initio graphs
RUN: Sub-directory for deployed shell scripts
SQL: Sub-directory for SQL queries
XFR: Sub-directory for transform files
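As a rough illustration of the layout above, the following Python sketch creates an
empty directory tree with the same sub-directories. The function name
make_project_skeleton and the project name my_project are hypothetical, and the
sketch only mimics the layout on a local file system; it does not create anything
in an EME Datastore.

```python
import tempfile
from pathlib import Path

# Sub-directory names and purposes as listed in the structure above.
PROJECT_SUBDIRS = {
    "BIN": "tools and utilities",
    "DB": "database interface files",
    "DML": "record format files",
    "MP": "graphs",
    "RUN": "deployed shell scripts",
    "SQL": "SQL queries",
    "XFR": "transform files",
}


def make_project_skeleton(root: Path, project_name: str) -> Path:
    """Create an empty directory tree shaped like the structure above."""
    project_dir = root / project_name
    for subdir in PROJECT_SUBDIRS:
        (project_dir / subdir).mkdir(parents=True, exist_ok=True)
    return project_dir


if __name__ == "__main__":
    # A temporary directory stands in for the real root (hypothetical).
    with tempfile.TemporaryDirectory() as tmp:
        project = make_project_skeleton(Path(tmp), "my_project")
        for child in sorted(project.iterdir()):
            print(f"{child.name}: {PROJECT_SUBDIRS[child.name]}")
```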


2.2 Public and Private Projects
Some information may be common to multiple Projects. For instance, several
Projects may share record format files (dml) or transform files (xfr). Elements
used across Projects can be made reusable by placing them in one Project and
including that Project in the other Projects that need them. A Project that is
included by other Projects is termed a Public Project. A Project that is not shared
with other Projects is known as a Private Project. Thus, a public Project can be
included in a private Project, but not vice versa.
A public Project is public in the sense that its data and metadata are expected to
be shared with other Projects; a private Project is private in the sense that its
data and metadata are not expected to be shared.
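The inclusion rule can be pictured with a small Python model. This is only a
sketch: the Project class, its fields and the file names are hypothetical and do
not correspond to any Ab Initio API. It assumes that any Project may include a
public Project and that no Project may include a private one.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    name: str
    public: bool = False                           # True for a public Project
    elements: set = field(default_factory=set)     # e.g. dml and xfr files
    includes: list = field(default_factory=list)   # included public Projects

    def include(self, other):
        """Include another Project; only public Projects can be included."""
        if not other.public:
            raise ValueError(f"{other.name} is private and cannot be included")
        self.includes.append(other)

    def available_elements(self):
        """Own elements plus those of every directly included Project."""
        found = set(self.elements)
        for project in self.includes:
            found |= project.elements
        return found


if __name__ == "__main__":
    common = Project("common", public=True, elements={"dml/customer.dml"})
    billing = Project("billing", elements={"mp/load_billing.mp"})

    billing.include(common)                 # private includes public: allowed
    print(sorted(billing.available_elements()))

    try:
        common.include(billing)             # including a private Project fails
    except ValueError as err:
        print(err)
```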
2.3 The Environment Project (stdenv)

Every instance of an Ab Initio environment has a special public Project known as
the Environment Project, or stdenv. Its structure is no different from that of a
regular Project. It contains machine-specific and application-specific settings,
such as data directory mount points and max-core settings, as well as
application-wide parameters, such as the current date, that are used across all
Projects. When any Project is created, stdenv is included in it by default. A single
stdenv is required for the entire set of applications that run on a single machine
and share a single EME Datastore.

3. Sandbox
Projects held in the EME Datastore cannot be manipulated directly. To work on a
Project, it must be checked out to a working area on the file system where we can
develop and modify code. This working area is known as a Sandbox. It has exactly
the same directory structure as the Project in the Datastore.
Each object that needs to be worked on is checked out to a sandbox, where
modifications or enhancements are carried out. After the changes are complete, the
code is checked in from the sandbox to the EME Datastore. This action creates a new
version of the code in the EME Datastore.
3.1 Sandbox Projects vs. EME Datastore Project

Sandboxes are work areas used to develop, test or run code associated with a given
project. Only one version of the code can be held within the sandbox at any time.
The EME Datastore contains all versions of the code that have been checked into it.

A particular sandbox is associated with only one Project, whereas a Project can be
checked out to any number of sandboxes, which is a common scenario.
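The difference between the two can be sketched with a toy in-memory model. The
class names Datastore and Sandbox and the object path mp/load.mp are hypothetical;
the model only illustrates that a sandbox holds one copy of an object while the
datastore retains every checked-in version.

```python
class Datastore:
    """Toy stand-in for an EME Datastore: keeps every checked-in version."""

    def __init__(self):
        self._history = {}  # object path -> list of contents, oldest first

    def check_in(self, path, content):
        self._history.setdefault(path, []).append(content)

    def latest(self, path):
        return self._history[path][-1]

    def history(self, path):
        return list(self._history[path])


class Sandbox:
    """Working area tied to one Project; holds one copy of each object."""

    def __init__(self, project, datastore):
        self.project = project
        self.datastore = datastore
        self.files = {}  # object path -> the single copy held here

    def check_out(self, path):
        self.files[path] = self.datastore.latest(path)

    def check_in(self, path):
        self.datastore.check_in(path, self.files[path])


if __name__ == "__main__":
    store = Datastore()
    store.check_in("mp/load.mp", "graph v1")

    # One Project checked out to two sandboxes (for example, two developers).
    dev_a = Sandbox("my_project", store)
    dev_b = Sandbox("my_project", store)
    dev_a.check_out("mp/load.mp")
    dev_b.check_out("mp/load.mp")

    dev_a.files["mp/load.mp"] = "graph v2"   # edit in the sandbox
    dev_a.check_in("mp/load.mp")             # creates a new version

    print(store.history("mp/load.mp"))       # every version is retained
    print(dev_b.files["mp/load.mp"])         # dev_b still holds its one copy
```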

4. Parameters
A parameter is a name-value pair with some additional attributes that determine
when and how its value is interpreted or resolved. Parameters give logical names to
physical locations and should always be used instead of hard-coded paths in graphs.
This makes the graphs more generic. There are two types of parameters: graph
parameters and Project parameters.
4.1 Graph parameters

Graph parameters, as the name suggests, are specific to individual graphs and are
private to them. They affect execution of the graph for which they have been
defined. Graph parameters can be defined by navigating to Edit>Parameters in the
GDE, which opens the graph parameters editor.
4.2 Project parameters
Project parameters are inherited by all the graphs in the Project and are accessed
from the GDE by the sandbox parameter editor in Project>Edit Sandbox>Parameters.

This opens a dialog box prompting for the sandbox path. Choose the correct host
and sandbox path and press OK to open the sandbox parameter editor, which is
exactly like the graph parameter editor.
4.3 Editing parameters
To add a new Project parameter or to modify the value of an existing one, we must
first lock the parameters in the sandbox parameter editor by clicking the lock button
on the menu. If nobody else has locked them in their sandbox, the lock symbol turns
green, indicating a successful lock, and we can now add or modify parameters. If a
lock already exists, opening the parameter editor shows a warning that the
parameters are already locked, and the lock symbol is red. Once we hold the lock,
other users cannot edit the parameters.
4.4 Parameter Attributes
Scope: The scope of a parameter can be formal or local. A local parameter is
internal to the sandbox, and most parameters have local scope. Its value is taken
from the value column in the parameter editor. A formal parameter is one whose
value can be set from outside, i.e. from the environment where the graph is run. Its
value is supplied from the command line. Formal parameters are identified by a
green diamond with an arrow mark.
Kind: If the scope is local, the kind is left unspecified; if it is formal, the kind is
automatically set to keyword.
Type: This determines the nature of the parameter. Project parameters have four
types: string, common Project, switch and dependent. Graph parameters have a
different set of types.
Value: This column specifies the value of the parameter.
Interpretation: This determines how the parameter is going to be evaluated.
Constant: The value is taken literally.
$ Substitution: Variables with a $ prefix are replaced with their values.
${} Substitution: Variables written with a $ prefix and enclosed in {} are replaced
by their values, but other occurrences of $ are ignored.
Shell: Korn shell syntax is used to evaluate the value of the parameter.
PDL: Parameter Definition Language allows the parameter interpretation to be
defined using inline DML.
Required: This attribute can take two values, required (the default) or optional. If
the parameter is required, the value column cannot be left blank; if it is optional,
the value can be left blank.
Export: When this check box is checked, the corresponding parameter value is
exported as an environment variable; otherwise it is generated as a local shell
variable.
Private Value: If a parameter is specified as a private value, any subsequent
changes to it remain private to the local sandbox and are not checked in to the
EME. This is useful when different users want different values for the same
parameter.
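To make the Interpretation attribute more concrete, here is a minimal Python sketch
of the constant, $ substitution and ${} substitution modes described above. The
function interpret, the mode names and the variables BASE and NAME are
hypothetical. The sketch assumes that $ substitution accepts both the $NAME and
${NAME} forms; the shell and PDL modes are not modelled.

```python
import re

# $NAME or ${NAME}: the name is captured in group 1 (braced) or group 2 (bare).
_ANY_DOLLAR = re.compile(r"\$(?:\{(\w+)\}|(\w+))")
# ${NAME} only: a bare $NAME is deliberately not matched.
_BRACED_ONLY = re.compile(r"\$\{(\w+)\}")


def interpret(value, interpretation, variables):
    """Resolve a parameter value under one of three interpretation modes."""
    if interpretation == "constant":
        return value  # taken literally, no replacement at all
    if interpretation == "dollar":
        return _ANY_DOLLAR.sub(
            lambda m: variables[m.group(1) or m.group(2)], value
        )
    if interpretation == "braces":
        return _BRACED_ONLY.sub(lambda m: variables[m.group(1)], value)
    raise ValueError(f"unsupported interpretation: {interpretation}")


if __name__ == "__main__":
    variables = {"BASE": "/data", "NAME": "cust"}
    raw = "${BASE}/$NAME.dat"
    for mode in ("constant", "dollar", "braces"):
        print(f"{mode}: {interpret(raw, mode, variables)}")
```

With these inputs the constant mode returns the text unchanged, the dollar mode
yields /data/cust.dat, and the braces mode yields /data/$NAME.dat because the bare
$NAME is ignored.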
5. Version Control and Tagging
Each object under EME source control, which may be a file, a directory or a Project,
exists as a series of versions, each of which represents what was checked in by some
user. A version can optionally have a referential label attached to it, called a tag,
and a textual description, called a comment. Each version is separately numbered
and can be accessed by either its version number or the tag attached to it. Version
numbers (which are integers) and tags are global to the whole EME datastore. Tags
are the basic units used when migrating code across EME datastore instances.
The Ab Initio GDE provides wizards to check in code to the EME and to check out
code from the EME to a sandbox.
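The relationship between versions, tags and comments can be sketched as follows.
The VersionStore class, the tag name REL_1 and the object paths are hypothetical.
The sketch reads "global" as meaning a single integer counter shared by every
object in the datastore, which is an assumption rather than a documented detail.

```python
class VersionStore:
    """Toy version store with datastore-wide version numbers and tags."""

    def __init__(self):
        self._next_version = 1   # one counter for the whole store (assumed)
        self._versions = {}      # version number -> (path, content, comment)
        self._tags = {}          # tag name -> list of version numbers

    def check_in(self, path, content, comment=""):
        number = self._next_version
        self._next_version += 1
        self._versions[number] = (path, content, comment)
        return number

    def tag(self, name, version_numbers):
        self._tags[name] = list(version_numbers)

    def get_by_version(self, number):
        return self._versions[number][1]

    def get_by_tag(self, name):
        """Return {path: content} for every version carrying the tag."""
        return {
            self._versions[n][0]: self._versions[n][1]
            for n in self._tags[name]
        }


if __name__ == "__main__":
    store = VersionStore()
    v1 = store.check_in("mp/load.mp", "graph A1", comment="initial")
    v2 = store.check_in("dml/customer.dml", "record R1")
    v3 = store.check_in("mp/load.mp", "graph A2", comment="fix join")

    store.tag("REL_1", [v1, v2])        # a tag groups versions for migration
    print(store.get_by_tag("REL_1"))
    print(store.get_by_version(v3))
```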
5.1 Check out of object
Check out updates the sandbox with the particular version of code that is being
checked out from the EME. By default the latest version of an object is checked
out, but any version can be checked out. Any object that is version controlled in
the EME datastore can be checked out to a sandbox, which may already exist or may
be created during the check out process itself. When a Project, or any object
belonging to it, is checked out to a sandbox, stdenv and any common Projects
associated with it also need to be referenced in the sandbox. An existing sandbox
already holds the information about where to find the common Projects. For a new
sandbox, we have to point to the stdenv and public sandbox (if any) paths during
check out.
5.2 Locking
After a successful check out, a lock must be acquired on the object we wish to
modify in the sandbox. To modify a graph that has been checked out, first open the
graph in the GDE and then click the lock icon on the menu bar. Clicking it checks
whether the version in the sandbox is the latest version of the object in the
datastore; if it is, the lock symbol turns green, showing that the graph is now
locked and editable. If the graph has already been locked in some other sandbox,
the lock is red when the graph is opened in the GDE, denoting that there is already
a lock on the object.
A lock can be acquired on an object only if the sandbox version and the current
version of the object in the EME are the same.
Once a lock is acquired and the changes are complete, the object must be checked
in to create a new version in the datastore.
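The two locking conditions can be expressed as a short Python sketch. LockTable,
LockError and the sandbox names are hypothetical and only model the rules stated
above: the object must not be locked in another sandbox, and the sandbox version
must equal the current version in the EME.

```python
class LockError(Exception):
    """Raised when a lock cannot be acquired."""


class LockTable:
    """Toy record of which sandbox holds the lock on each object."""

    def __init__(self):
        self._owners = {}  # object path -> name of the sandbox holding the lock

    def owner(self, path):
        return self._owners.get(path)

    def acquire(self, path, sandbox, sandbox_version, current_version):
        holder = self._owners.get(path)
        if holder is not None and holder != sandbox:
            # Corresponds to the red lock symbol.
            raise LockError(f"{path} is already locked in {holder}")
        if sandbox_version != current_version:
            raise LockError(
                f"sandbox has version {sandbox_version}, "
                f"current version is {current_version}"
            )
        # Corresponds to the green lock symbol.
        self._owners[path] = sandbox


if __name__ == "__main__":
    locks = LockTable()
    locks.acquire("mp/load.mp", "sandbox_a", sandbox_version=3, current_version=3)
    print(locks.owner("mp/load.mp"))

    try:
        locks.acquire("mp/load.mp", "sandbox_b", 3, 3)
    except LockError as err:
        print(err)

    try:
        locks.acquire("dml/customer.dml", "sandbox_b", 1, 2)
    except LockError as err:
        print(err)
```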
5.3 Conflicts during Check out

A conflict occurs when the sandbox version of an object and the latest datastore
version differ. In such a case the check out wizard asks how to resolve the
conflict between the sandbox and the datastore.
Example of a conflict situation:
User 1 checks out a file to a sandbox, locks it and updates it. Meanwhile, user 2
also checks out the same file to his sandbox. When user 2 tries to edit the file in
the GDE, he sees that user 1 has already locked it. He bypasses the GDE and updates
the file outside it by some other means. Now user 1 checks in the file. User 2 also
proceeds to check in his changes to the file, but the check in fails due to the
conflict. User 2 then needs to check out the current version of the file from the
EME to his sandbox; during this process he is asked whether to keep his sandbox
version or overwrite it with the current version available in the EME datastore.
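The scenario above can be replayed with a toy model. FileHistory and
ConflictError are hypothetical names, version numbers are kept per file for
simplicity, and the rule that a check in is rejected whenever the sandbox copy is
not based on the latest version is an assumption made for illustration.

```python
class ConflictError(Exception):
    """Raised when a check in is based on an outdated version."""


class FileHistory:
    """Toy version history for a single file in the datastore."""

    def __init__(self, initial_content):
        self.versions = [initial_content]

    @property
    def latest_version(self):
        return len(self.versions)  # versions are numbered from 1

    def check_in(self, base_version, content):
        # Assumption: reject the check in unless the sandbox copy was
        # taken from the latest version.
        if base_version != self.latest_version:
            raise ConflictError(
                f"sandbox is at version {base_version}, "
                f"datastore is at version {self.latest_version}"
            )
        self.versions.append(content)
        return self.latest_version

    def check_out_resolving(self, sandbox_content, keep_sandbox):
        """Check out the latest version, keeping or overwriting the sandbox copy."""
        content = sandbox_content if keep_sandbox else self.versions[-1]
        return self.latest_version, content


if __name__ == "__main__":
    history = FileHistory("original")
    user1_base = user2_base = history.latest_version  # both check out version 1

    history.check_in(user1_base, "user 1 edit")       # user 1 checks in first
    try:
        history.check_in(user2_base, "user 2 edit")   # user 2 is now out of date
    except ConflictError as err:
        print("check in failed:", err)

    # User 2 checks out again and chooses to keep the sandbox version.
    base, content = history.check_out_resolving("user 2 edit", keep_sandbox=True)
    print("new version:", history.check_in(base, content))
```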
5.4 Check in of object
Once the project files have been edited and updated, they need to be checked in to
create a new version in the EME datastore, which will then be available to other
users. The check in wizard is invoked by navigating to Project>Check in. Before the
wizard starts, any unsaved files in the sandbox are detected and a prompt asks
whether to save them.
5.5 Dependency Analysis
Dependency analysis examines the Project for dependencies within and between
graphs. The EME examines the Project and builds a survey tracing how data is
transformed and transferred, field by field, from component to component.
Dependency analysis has two basic steps: translation and analysis.
In the translation step, the sandbox paths of the objects and the paths to the
objects in the data area are translated to the corresponding Project-relative paths
inside the EME. The actual analysis is performed in the analysis step.
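The idea behind the translation step can be illustrated for sandbox paths with a
few lines of Python. The function to_project_path and the example paths are
hypothetical, and the sketch does not reflect how the EME performs the translation
internally; data area paths are not covered.

```python
from pathlib import PurePosixPath


def to_project_path(sandbox_file, sandbox_root, project_root):
    """Map a file inside a sandbox to the matching path under the Project."""
    # relative_to raises ValueError if the file is not inside the sandbox.
    relative = PurePosixPath(sandbox_file).relative_to(sandbox_root)
    return str(PurePosixPath(project_root) / relative)


if __name__ == "__main__":
    print(
        to_project_path(
            "/home/dev/sand/my_project/dml/customer.dml",   # hypothetical file
            "/home/dev/sand/my_project",                    # hypothetical sandbox
            "/Projects/my_project",                         # Project in the EME
        )
    )
```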
6. Working with previous versions of graphs/objects
It is often necessary to check out and update a previous version of a graph rather
than the latest version available in the EME datastore. Using the check out wizard
in the GDE, we may check out a tagged version of a graph that is not the latest
version available, but the GDE does not allow such versions to be locked. In that
case, the following procedure may be used:
Check out the required previous tagged version of the graph to your sandbox.
Check it back in with Force Overwrite, found under the advanced options in the
check in wizard. This will make it the current version in the datastore.
Lock the graph to make the changes.
Check the graph back in to the EME datastore. This updated version becomes the
latest version in the EME datastore.
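The effect of this procedure on the version history can be traced with a toy
list-based model. The function rework_previous_version, the tag REL_1 and the
graph contents are hypothetical, and Force Overwrite is modelled simply as
appending the old content as a new current version, which is an assumption.

```python
def rework_previous_version(versions, tags, tag, edit):
    """Apply the four steps above to a toy version list.

    versions: list of contents, oldest first (index 0 is version 1)
    tags:     mapping of tag name -> 1-based version number
    edit:     function that turns the old content into the new content
    """
    # Step 1: check out the tagged (older) version to the sandbox.
    sandbox_copy = versions[tags[tag] - 1]
    # Step 2: check it back in with Force Overwrite, modelled here as
    # appending the old content so that it becomes the current version.
    versions.append(sandbox_copy)
    # Step 3: the sandbox copy now matches the current version, so the
    # graph can be locked and edited.
    sandbox_copy = edit(sandbox_copy)
    # Step 4: check in the edited graph, which becomes the latest version.
    versions.append(sandbox_copy)
    return len(versions)


if __name__ == "__main__":
    history = ["graph v1", "graph v2", "graph v3"]
    tags = {"REL_1": 1}
    latest = rework_previous_version(
        history, tags, "REL_1", lambda content: content + " patched"
    )
    print(latest, history)
```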
