
FRONT ROOM ARCHITECTURE:

The front room is the public face of the warehouse. It's
what the business users see and work with day-to-day.
In fact, for most folks, the user interface is the data
warehouse. They don't know (or care) about all the time,
energy, and resources behind it; they just want answers.
Unfortunately, the data they want to access is complex.
The dimensional model helps reduce the complexity.
The primary goal of the warehouse should be to make
information as accessible as possible: to help people
get the information they need. To accomplish this, we
need to build a layer between the users and the
information that will hide some of the complexities and
help them find what they are looking for. That is the
primary purpose of the data access services layer.
Front Room Data Stores:
-Access Tool Data Stores
As data moves into the front room and closer to the user,
it becomes more diffused. Users can generate hundreds
of ad hoc queries and reports in a day. These are
typically centered on a specific question, investigation of
an anomaly, or tracking the impact of a program or event.
Most individual queries yield result sets with fewer than
10,000 rows; a large percentage have fewer than 1,000
rows. These result sets are stored in the data access
tool, at least temporarily. Much of the time, the results
are actually transferred into a spreadsheet and analyzed
further.
Some data access tools work with their own intermediate
application server, which provides an additional data store
to cache the results of user queries and standard reports.
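To make this concrete, here is a minimal sketch in Python of the kind of result cache such an application server might keep; the class, its methods, and the TTL policy are illustrative assumptions, not a description of any particular tool.

```python
import time

class QueryResultCache:
    """Illustrative result cache sitting between the access tool and the warehouse."""

    def __init__(self, ttl_seconds=3600):
        self.ttl = ttl_seconds
        self._cache = {}  # normalized query text -> (timestamp, rows)

    def _key(self, sql):
        # Normalize whitespace and case so trivially different
        # formulations of the same query hit the same entry.
        return " ".join(sql.lower().split())

    def get(self, sql):
        entry = self._cache.get(self._key(sql))
        if entry is None:
            return None
        stored_at, rows = entry
        if time.time() - stored_at > self.ttl:
            return None  # stale entry; force a re-execution against the warehouse
        return rows

    def put(self, sql, rows):
        self._cache[self._key(sql)] = (time.time(), rows)
```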
-Standard Reporting Data Stores
Client/server-based standard reporting environments are
beginning to pop up in the marketplace. These
applications usually take advantage of the data
warehouse as a primary data source. They may use
multiple data stores, including a separate reporting
database that draws from the warehouse and the
operational systems. They may also have a report library
or cache of some sort that holds a pre-executed set of
reports to provide lightning-fast response time.
Personal Data Marts: The idea of a personal data mart
seems like a whole new market if you listen to vendors
who have recently released tools positioned specifically
for this purpose. The merchant database vendors all
have desktop versions that are essentially full-strength,
no-compromise relational databases. There are also
new products on the market that take advantage of data
compression and indexing techniques to give amazing
capacity and performance on a desktop computer.
Personal data marts may require a replication framework
to ensure they are always in synch with the data
warehouse. They will continue to play an important role
in this personal segment of the marketplace.
Disposable Data Marts: The disposable data mart is a
set of data created to support a specific short-lived
business situation. It is similar to the personal data mart,
but it is intended to have a limited life span. The
disposable data mart also allows the data to be designed
specifically for the event, applying business rules and
filters to create a simple sandbox for the analysts to play
in.
Application Models: Data mining is the primary
example of an application model. It's a collection of
powerful analysis techniques for making sense out of
very large data sets. From a data store point of view,
each of these analytical processes usually sits on a
separate machine (or at least a separate process) and
works with its own data drawn from the data warehouse.
Credit rating and churn scores are good examples of
data mining output that would be valuable in the context
of the rest of the data in the warehouse.
Front Room Services for Data Access:
Data access services cover five major types of activities
in the data warehouse: warehouse or metadata browsing;
access and security; activity monitoring; query
management; and standard reporting.
-Warehouse Browsing
Warehouse browsing takes advantage of the metadata
catalog to support the users in their efforts to find and
access the information they need. Ideally, a user who
needs business information should be able to start with
some type of browsing tool and peruse the data
warehouse to look for the appropriate subject area.
The warehouse browser should be dynamically
linked to the metadata catalog to display currently
available subject areas and the data elements within
those subjects. It should be able to pull in the definitions
and derivations of the various data elements and show a
set of standard reports that include those elements.
Once the user finds the item of interest, the browser
should provide a link to the appropriate resource: a
canned report, a tool, or a report scheduler. Front ends
have grown more sophisticated and now use metadata
to define subsets of the database to simplify the user's
view. They also provide ways to hook into the descriptive
metadata to provide column names and comments.
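As a rough illustration of the browsing idea, the sketch below uses a hypothetical metadata catalog held in simple Python dictionaries; a real catalog would live in the warehouse's metadata repository, and the subject areas, elements, and report names here are assumptions for the example.

```python
# Hypothetical, simplified metadata catalog to illustrate warehouse browsing.
CATALOG = {
    "Sales": {
        "elements": {
            "net_revenue": "Invoice amount less returns and allowances",
            "units_sold": "Quantity shipped to the customer",
        },
        "standard_reports": ["Monthly Revenue by Region", "Top 25 Customers"],
    },
    "Inventory": {
        "elements": {
            "quantity_on_hand": "Units in the warehouse at the daily snapshot",
        },
        "standard_reports": ["Weekly Stock Status"],
    },
}

def browse(search_term):
    """Return subject areas, definitions, and related standard reports
    for any data element whose name matches the search term."""
    term = search_term.lower()
    hits = []
    for subject, content in CATALOG.items():
        for element, definition in content["elements"].items():
            if term in element.lower():
                hits.append({
                    "subject_area": subject,
                    "element": element,
                    "definition": definition,
                    "standard_reports": content["standard_reports"],
                })
    return hits

print(browse("revenue"))  # finds net_revenue in the Sales subject area
```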
-Access and Security Services: Access and security
services facilitate a user's connection to the database.
This can be a major design and management challenge.
Access and security rely on authorization and
authentication services where the user is identified and
access rights are determined or access is refused. For
our purposes, authentication means some method of
verifying that you are who you say you are. There are
several levels of authentication: a constant password is the
first level, followed by a system-enforced password
pattern and periodically required changes. Beyond the
password, it is also possible to require some physical
evidence of identity, like a magnetic card.
On the database side, we strongly encourage
assignment of a unique ID to each user. Although it
means more work maintaining IDs, it helps in tracking
warehouse usage and in identifying individuals who
need help. Once a user is authenticated, we need to
determine what they are authorized to see. The value of
a data warehouse is
correlated with the richness and breadth of the data
sources provided. Therefore, we encourage our clients
to make the warehouse as broadly available as possible.


Authorization is a much more complex problem in the
warehouse than authentication, because limiting access
can have significant maintenance and computational
overhead, especially in a relational environment.
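As a small sketch of authorization driven by individual user IDs, the example below checks a query's tables against a per-user grant list; the grant table, user IDs, and table names are assumptions for illustration, not a prescription for how to store them.

```python
# Illustrative authorization check: each user has a unique ID and a set
# of tables (or subject areas) they are entitled to query.
GRANTS = {
    "jsmith": {"sales_fact", "customer_dim", "period_dim"},
    "averma": {"sales_fact", "inventory_fact", "product_dim", "period_dim"},
}

def authorize(user_id, tables_in_query):
    """Refuse the query if it touches any table the user is not granted."""
    allowed = GRANTS.get(user_id, set())
    denied = set(tables_in_query) - allowed
    if denied:
        raise PermissionError(f"{user_id} is not authorized for: {sorted(denied)}")
    return True

authorize("jsmith", ["sales_fact", "period_dim"])   # passes
# authorize("jsmith", ["inventory_fact"])           # would raise PermissionError
```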
-Activity Monitoring Services
Activity monitoring involves capturing information about
the use of the data warehouse. There are several
excellent reasons to include resources in your project
plan to create an activity monitoring capability centered
around four areas: performance, user support, marketing,
and planning.
Performance. Gather information about usage, and
apply that information to tune the warehouse more
effectively.
User support. The data warehouse team should
monitor newly trained users to ensure they have
successful experiences with the data warehouse in the
weeks following training. Also, the team should be in the
habit of monitoring query text occasionally throughout
the day. This will help the team understand what users
are doing, and it can also help them intervene to assist
users in constructing more efficient queries.
Marketing. Publish simple usage statistics to inform
management of how their investment is being used. A
nice growth curve is a wonderful marketing tool, and a
flat or decreasing curve might be motivating for the
warehouse team.
Planning. Monitor usage growth, average query time,
concurrent user counts, database sizes, and load times
to quantify the need and timing for capacity increases.
This information also could support a mainframe-style
charge-back system, if necessary.
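The sketch below shows one simple way to capture the kind of activity record that supports all four areas: who ran what, when, how long it took, and how many rows came back. The log file layout and the wrapper approach are assumptions for illustration.

```python
import csv
import time
from datetime import datetime

def monitored(run_query):
    """Wrap a query-execution function so every call is logged
    with user, query text, elapsed time, and row count."""
    def wrapper(user_id, sql):
        start = time.time()
        rows = run_query(user_id, sql)
        elapsed = time.time() - start
        with open("warehouse_activity.csv", "a", newline="") as log:
            csv.writer(log).writerow(
                [datetime.now().isoformat(), user_id, sql, f"{elapsed:.2f}", len(rows)]
            )
        return rows
    return wrapper

@monitored
def run_query(user_id, sql):
    # Placeholder for the real database call.
    return [("example row",)]

run_query("jsmith", "SELECT brand, SUM(net_revenue) FROM sales_fact GROUP BY brand")
```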
-Query Management Services
Query management services are the set of capabilities
that manage the exchange between the query
formulation, the execution of the query on the database,
and the return of the result set to the desktop. These
services arguably have the broadest impact on user
interactions with the database.
Content simplification. These techniques attempt to
shield the user from the complexities of the data and the
query language before any specific queries are
formulated. This includes limiting the user's view to
subsets of the tables and columns, predefined join rules
(including columns, types, and path preferences), and
standard filters. Content simplification metadata is
usually specific to the front-end tool, and its
simplification rules are usually hidden.
Query reformulation. Query formulation can be
extremely complex if you want to solve real-world
business problems. Tool developers have been
struggling with this problem for decades, and have come
up with a range of solutions, with varying degrees of
success. The basic problem is that most interesting
business questions require a lot of data manipulation.
Even simple-sounding questions like "How much did we
grow last year?" or "Which accounts grew by more than
100 percent?" can be a challenge to the tool. The query
reformulation service needs to parse an incoming query
and figure out how it can best be resolved. A query
reformulation service should be able to generate
complex SQL, including subqueries and unions.
Many of these queries require multipass SQL, where the
results of the first query are part of the formulation of the
second query. Since data access tools provide most of
the original query formulation capabilities, these
reformulation services today typically live in the tools
themselves.
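To see why even a simple-sounding question needs reformulation, here is a minimal sketch of the multipass approach for "Which accounts grew by more than 100 percent?": one pass per year, then a comparison step performed outside the database. The table and column names are assumptions for the example.

```python
# Pass 1 and pass 2 each run a simple aggregate query; the comparison is
# done after both result sets are returned to the service or tool.

PASS_1 = """
SELECT account_id, SUM(net_revenue) AS revenue_prior
FROM sales_fact JOIN period_dim USING (period_key)
WHERE period_dim.year = 2023
GROUP BY account_id
"""

PASS_2 = """
SELECT account_id, SUM(net_revenue) AS revenue_current
FROM sales_fact JOIN period_dim USING (period_key)
WHERE period_dim.year = 2024
GROUP BY account_id
"""

def accounts_grown_over_100pct(prior_rows, current_rows):
    """Combine the two passes: keep accounts whose current revenue is
    more than double the prior year's revenue."""
    prior = {account: revenue for account, revenue in prior_rows}
    return [
        account
        for account, revenue in current_rows
        if account in prior and prior[account] > 0 and revenue > 2 * prior[account]
    ]

# Example with stand-in result sets instead of real database calls:
print(accounts_grown_over_100pct(
    prior_rows=[("A001", 100.0), ("A002", 400.0)],
    current_rows=[("A001", 250.0), ("A002", 500.0)],
))  # -> ['A001']
```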
Query retargeting and multipass SQL. The query
retargeting service parses the incoming query, looks up
the elements in the metadata to see where they actually
exist, and then redirects the query or its components as
appropriate. It allows us to query data from two fact
tables, like manufacturing costs and customer sales, on
two different servers, and seamlessly integrate the
results into a customer contribution report.
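A rough sketch of that retargeting idea follows, assuming two hypothetical servers and a drill-across merge on the customer key; the server names, stand-in rows, and connection handling are illustrative only.

```python
# Illustrative query retargeting: route each fact query to the server
# that holds that fact table, then merge the results on customer.
FACT_LOCATIONS = {
    "customer_sales": "sales_server",
    "manufacturing_costs": "costs_server",
}

def run_on(server, sql):
    # Placeholder for a real connection to `server`; returns stand-in rows.
    print(f"[{server}] {sql}")
    return {"C100": 500.0, "C200": 300.0} if server == "sales_server" else {"C100": 350.0}

def customer_contribution():
    sales = run_on(FACT_LOCATIONS["customer_sales"],
                   "SELECT customer_key, SUM(revenue) FROM customer_sales GROUP BY customer_key")
    costs = run_on(FACT_LOCATIONS["manufacturing_costs"],
                   "SELECT customer_key, SUM(cost) FROM manufacturing_costs GROUP BY customer_key")
    # Drill-across merge: contribution = revenue minus manufacturing cost per customer.
    return {k: sales.get(k, 0.0) - costs.get(k, 0.0) for k in set(sales) | set(costs)}

print(customer_contribution())  # contribution per customer: C100 -> 150.0, C200 -> 300.0
```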
Aggregate awareness. Aggregate awareness is a
special case of query retargeting where the service
recognizes that a query can be satisfied by an available
aggregate table rather than summing up detail records
on the fly. For example, if someone asks for sales by
month from the daily table, the service would reformulate
the query to run against the monthly fact table. The user
gets better performance and doesn't need to know there
are additional fact tables out there. The aggregate
navigator is the component that provides this aggregate
awareness. In the same way that indexes are
automatically chosen by the database software, the
aggregate navigator facility automatically chooses
aggregates. The aggregate navigator sits above the
DBMS and intercepts the SQL sent by the requesting
client.

A good aggregate navigator maintains statistics on all
incoming SQL and not only reports on the usage levels
of existing aggregates but suggests additional
aggregates that should be built by the DBA.
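A minimal sketch of the substitution idea follows, under the assumption of a simple text-level rewrite; a production aggregate navigator works from metadata and parsed SQL, but the principle is the same. The table names are hypothetical.

```python
# Illustrative aggregate navigator: if a query asks for monthly (or coarser)
# grain, rewrite it to use the corresponding aggregate instead of the daily fact table.
AGGREGATES = {
    # (base table, grain requested) -> aggregate table to use instead
    ("sales_fact_daily", "month"): "sales_fact_monthly",
    ("sales_fact_daily", "year"): "sales_fact_yearly",
}

def navigate(sql, requested_grain):
    """Return SQL rewritten against the best available aggregate table."""
    for (base_table, grain), agg_table in AGGREGATES.items():
        if grain == requested_grain and base_table in sql:
            return sql.replace(base_table, agg_table)
    return sql  # no suitable aggregate; run against the detail table

query = "SELECT month, SUM(net_revenue) FROM sales_fact_daily GROUP BY month"
print(navigate(query, "month"))
# -> SELECT month, SUM(net_revenue) FROM sales_fact_monthly GROUP BY month
```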
Date awareness. The date awareness service allows
the user to ask for items like current year-to-date and
prior year-to-date sales without having to figure out the
specific date ranges. This usually involves maintaining
attributes in the Periods dimension table to identify the
appropriate dates.
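One common way to implement this is with flag columns on the Periods dimension that the nightly load recomputes, as sketched below; the column names and the flag-maintenance routine are assumptions for illustration.

```python
from datetime import date

# User queries simply filter on flags maintained in the Periods dimension,
# so nobody has to compute the date ranges by hand.
CURRENT_YTD_SALES = """
SELECT SUM(f.net_revenue)
FROM sales_fact f JOIN period_dim p ON f.period_key = p.period_key
WHERE p.current_ytd_flag = 'Y'
"""

def set_ytd_flags(period_dates, today=None):
    """Recompute the year-to-date flags stored on each Periods row
    (run by the nightly load, not by the user)."""
    today = today or date.today()
    flags = {}
    for d in period_dates:
        in_window = (d.month, d.day) <= (today.month, today.day)
        flags[d] = {
            "current_ytd_flag": "Y" if d.year == today.year and in_window else "N",
            "prior_ytd_flag": "Y" if d.year == today.year - 1 and in_window else "N",
        }
    return flags
```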
Query governing. Unfortunately, it's relatively easy to
create a query that can bring the data warehouse to its
knees, especially a large database. Almost every
warehouse has a list of "queries from hell." These are
usually poorly formed and often incorrect queries that
lead to a nested loop of full table scans on the largest
table in the database. Obviously, you'd like to stop these
before they happen. After good design and good training,
the next line of defense against these runaway queries is
a query governing service.
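A minimal sketch of a query governor follows, assuming the database can return an estimated cost or row count before execution (for example, from an EXPLAIN-style plan); the threshold and the estimator are placeholders.

```python
class QueryGovernor:
    """Illustrative governor: refuse queries whose estimated
    cost exceeds a threshold, before they ever reach the database."""

    def __init__(self, estimate_rows_scanned, max_rows=50_000_000):
        # estimate_rows_scanned: callable that asks the DBMS for a plan
        # estimate and returns the number of rows it expects to scan.
        self.estimate = estimate_rows_scanned
        self.max_rows = max_rows

    def submit(self, sql):
        estimated = self.estimate(sql)
        if estimated > self.max_rows:
            raise RuntimeError(
                f"Query rejected: estimated {estimated:,} rows scanned "
                f"exceeds the {self.max_rows:,} row limit."
            )
        return f"running: {sql}"

# Example with a stand-in estimator instead of a real plan lookup:
governor = QueryGovernor(estimate_rows_scanned=lambda sql: 2_000_000)
print(governor.submit("SELECT * FROM sales_fact WHERE region = 'WEST'"))
```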
Report development environment. This should include
most of the ad hoc tool functionality and usability.
Report execution server. The report execution server
offloads running the reports and stages them for delivery,
either as finished reports in a file system or in a custom
report cache.
Parameter- or variable-driven capabilities. For
example, you can change the Region name in one
parameter and have an entire set of reports run based
on that new parameter value.
Time- and event-based scheduling of report
execution. A report can be scheduled to run at a
particular time of day or after a value in some database
table has been updated.
Iterative execution. For example, provide a list of
regions and create the same report for each region.
Each report could then be a separate file e-mailed to
each regional manager. This is similar to the concept of
a report section or page break, where every time a new
value of a given column is encountered, the report starts
over on a new page with new subtotals, except it
generates separate files.
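A sketch of iterative execution follows, assuming a simple region list and a stand-in report renderer; the file naming and the render function are illustrative.

```python
from pathlib import Path

REGIONS = ["North", "South", "East", "West"]

def render_report(region):
    # Placeholder for the real report engine; returns finished report content.
    return f"Quarterly results for the {region} region\n"

def run_iteratively(output_dir="reports"):
    """Run the same report once per region and write each result to its
    own file, ready to be e-mailed to the regional manager."""
    Path(output_dir).mkdir(exist_ok=True)
    files = []
    for region in REGIONS:
        path = Path(output_dir) / f"quarterly_{region.lower()}.txt"
        path.write_text(render_report(region))
        files.append(path)
    return files

print(run_iteratively())
```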
Flexible report definitions. These should include
compound document layout (graphs and tables on the
same page) and full pivot capabilities for tables.
Flexible report delivery:
Via multiple delivery methods (e-mail, Web, network
directory, desktop directory and automatic fax).
In the form of multiple result types (data access tool
file, database table, spreadsheet).
User accessible publish and subscribe. Users
should be able to make reports they've created available
to their departments or to the whole company. Likewise,
they should be able to subscribe to reports others have
made and receive copies or notification whenever the
report is refreshed or improved.
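A small sketch of user-accessible publish and subscribe follows, with an in-memory registry standing in for whatever the reporting environment actually stores; the class and names are illustrative.

```python
from collections import defaultdict

class ReportRegistry:
    """Illustrative publish/subscribe registry for user-created reports."""

    def __init__(self):
        self.published = {}                  # report name -> owner
        self.subscribers = defaultdict(set)  # report name -> subscriber ids

    def publish(self, owner, report_name):
        self.published[report_name] = owner

    def subscribe(self, user_id, report_name):
        if report_name not in self.published:
            raise KeyError(f"No published report named {report_name!r}")
        self.subscribers[report_name].add(user_id)

    def notify_refresh(self, report_name):
        """Return the users to notify when the report is refreshed or improved."""
        return sorted(self.subscribers[report_name])

registry = ReportRegistry()
registry.publish("jsmith", "Top 25 Customers")
registry.subscribe("averma", "Top 25 Customers")
print(registry.notify_refresh("Top 25 Customers"))  # -> ['averma']
```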
Report linking. This is a simple method for providing
drill-down. If you have pre-run reports for all the
departments in a division, you should be able to click on
a department name in the division summary report and
have the department detail report show up.
Report library with browsing capability. This is a
kind of metadata reference that describes each report in
the library, when it was run, and what its content is.
Mass distribution. Simple, cheap access tools for
mass distribution (Web-based).
Report environment administration tools. The
administrator should be able to schedule, monitor, and
troubleshoot report problems from the administrator's
module. This also includes the ability to monitor usage
and weed out unused reports.
Future Access Services
It's worth taking a few moments to speculate on the
direction of access services so we can anticipate where
future services might fit into our architecture.
Authentication and authorization. Logging on to the
network once will be enough to identify you to any
system you want to work with. If you need to go into the
financial system to check on an order status or go to the
data warehouse to see a customer's entire history, one
logon should give you access to both.
Push toward centralized services. Data access
services soon will migrate either to the application server
or back to the database. Three forces are driving this
change. The first is the leverage the warehouse team
gets by implementing one set of access services (and
associated metadata) and making it available to a range
of front-end tools. The second is the push that tools are
getting from the Web.
Vendor consolidation. There are too many front-end
tool vendors for the market to support in the long run.
The Web push will cause some of them to slip. Once a
few clear leaders emerge, the rest will begin falling
quickly.
Web-based customer access. Another implication of
Web access to the warehouse is that businesses might
view the Web as a means of providing customers with
direct access to their information, similar to the lookup
services provided by express package delivery
companies today. For example, a credit card company
might provide significant value to its corporate customers
by allowing them to analyze their employees' spending
patterns directly, without having to stage the data in-
house.
Desktop Services
Only a few services actually live on the desktop, but they
are arguably the most important services in the
warehouse. These services are found in the front-end
tools that provide users with access to the data in the
warehouse. Much of the quality of the user's overall
experience with the warehouse will be determined by
how well these tools meet their needs. To them, the rest
of the warehouse is plumbing; they just want it to work.
Four main data access categories are push-button,
standard reports, ad hoc, and data mining. Push-button
applications generally provide a pushbutton interface to
a limited set of key reports, targeted at a specific user
community. Standard reports are the approved, official
view of information. They are typically fixed format,
regularly scheduled reports that are delivered to a broad
set of users. Ad hoc tools provide users with the ability to
create their own reports from scratch. Data mining tools
provide complex statistical analysis routines that can be
applied to data from the warehouse. Each of these
categories provides certain capabilities that meet
specific business requirements.
