RAGui

Java based GUI for Relational Algebra

Tom Mack & Nick Spilman


University of Minnesota
MSSE 2004 Capstone Project

Abstract
Relational Algebra, while easy to use, has not gained much acceptance in the database query
world. This is partly due to the ubiquity of SQL and also the lack of tools and methodologies for
applying relational algebra concepts to the construction of complex queries. We attempt to
bridge the gap by providing a useful, robust tool for constructing complex queries. Our
application, RAGui, is an extensible Java application built upon the research and teachings of
Dr. John Carlis, University of Minnesota. It provides an elegant and intuitive interface for the
design and execution of relational algebra precedence charts and an open framework to support
the development of new operators and compilation techniques.
RAGui MSSE 2004 – Capstone

Table of Contents
1 Introduction
2 Relational Algebra
3 Architectural Goals
4 Architecture
  4.1 Graphical User Interface
  4.2 Engine
5 Design
  5.1 Graphical User Interface Design
  5.2 Engine & Data Design
  5.3 Query Execution
  5.4 Saving & Loading Precedence Charts
6 Example Extensions
7 Software Development Process
  7.1 Tools Used
  7.2 Design Process
  7.3 Testing Strategy / QA
8 Lessons Learned
9 Conclusion

1 Introduction
The goal of this project was to design an extensible framework for executing Relational Algebra
queries. It was important that this framework support extensions that add new functionality,
specifically support for additional databases and operators. Supporting new and
different databases is important because each database system may require slightly different
SQL statements for correct execution. Supporting new operators is essential because the
development of new Relational Algebra (RA) operators is an area of ongoing research. On top of
this framework, a graphical user interface was built that allows the manipulation of
queries in their native chart format.

2 Relational Algebra
Because this project is built on top of the concepts of relational algebra, a brief introduction is
necessary. Relational Algebra enforces strict rules about how queries are created. For example,
both the inputs and outputs of an operator must be valid relations. Unlike the standard
concept of a database table, an RA relation is required to have a primary key and is inherently
unordered. This implies that any relation can be stored in a table, but not every table can be
represented as a relation. Throughout this paper the term table is used only to refer to the
actual entity within the database system.

Relational Algebra precedence charts provide an elegant means for designing queries for
relational databases. Even though SQL is the de facto standard for querying databases, it has
some drawbacks. Simple queries are easily constructed, but creating complex queries tends to
yield unreadable SQL statements. RA precedence charts solve this by constructing the query as
a sequence of small transformations performed by RA operators. The result of each
transformation is a relation that is assigned a name. By breaking the query into smaller steps
and naming the output of each step, complex queries become easier to write and easier to
understand later.

Precedence charts are traditionally constructed on paper. They are then compiled to SQL
statements either by hand or with the Oracle RA Package. The Oracle RA Package is a set of
stored procedures designed to allow the representation of queries in a functional syntax.

A comparison of SQL, Oracle RA Package, and a precedence chart for a simple filter is shown to
demonstrate the advantage of using precedence charts. The following query finds all rows in
the creature relation that have a c_type equal to “person”.

TYPE                IMPLEMENTATION
SQL                 CREATE TABLE person AS SELECT * FROM creature WHERE c_type='person'
                    ALTER TABLE person ADD PRIMARY KEY (c_id)
Oracle RA Package   ops.go(filter_ra('creature','c_type=''person''','person'))
Precedence Chart    (chart image not reproduced)

To use the analogy of a compiler, the SQL represents the machine code, the Oracle RA package
represents a higher-level language and the Precedence chart represents a UML diagram. The
Oracle RA Package abstracts away the details of SQL and gives the user access directly to
relational algebra operators. The precedence chart hides the details of the Oracle syntax and
allows the query designer to focus on the design of the query. This is a trivial example, but for
queries containing more than a few operators and relations, managing them at a higher level is
advantageous to the user and allows for a more iterative and experimental experience.

Extensive information about relational algebra and precedence charts can be found in Prof. John
Carlis’ soon to be published book titled Mastering Database Analysis.

3 Architectural Goals
The primary goal of the architecture was extensibility. Adding support for new operators
should not require recompiling the entire application, just the extension of a few specific classes
and interfaces. Along with extensibility, RAGui needed to be portable to different RDBMSs.

Along the lines of extensibility, a secondary goal was to support optimization of Query
execution. Some designs of Queries inherently lend themselves to optimization. A Filter
operator followed by a Project has the potential to be consolidated into one operation. By doing
so, the Query can be optimized because intermediary tables will not have to be created.

Designing queries on paper is easy enough and one of the secondary goals was to make the
electronic experience as nice as the paper one. A goal was to hide some of the complexities of
using a database from the user, so they could focus on constructing and executing relational
algebra queries.

RAGui is an initial project and is not meant to be a fully functional, production ready
application. The primary focus of the initial work was on developing an extensible framework
with just enough features to have a working prototype. By meeting these goals, RAGui will
support future development into a production ready application.

4 Architecture
At the highest level, the architecture consists of two components: the front-end graphical user
interface for query design and the back-end engine for processing. The graphical user interface
is used to design and lay out queries. Once constructed, the query is passed to the Engine for
execution and interaction with a relational database management system.

Figure 1 - Architecture

4.1 Graphical User Interface


The user interface consists of three major components. The left side of the window contains a
list of operators and a list of relations. The remainder of the window is used as the drawing
window for constructing the queries. Queries are constructed by adding operators and relations
from the two lists to the main drawing area. They are then arranged on the screen and
connected to form queries. In addition to those components, the GUI has components for
editing the parameters of Operators and Relations and for viewing a Relation’s data.

Figure 2 - Screenshot of Interface

4.2 Engine
The engine contains three key components: Operators, Matchers and Agents. Operators are
defined by a name, a set of parameters, and definitions of possible input and output relations. It
was decided that operators should not be responsible for compiling themselves and should
merely be data objects. There were two reasons for this. The first reason is to provide platform
independence by ensuring that an operator is not tied to a particular database. This helps meet
our primary goal of extensibility. The second reason is to allow sequences of operators to be
optimized to avoid creating unnecessary intermediary tables. This helps meet one of our
secondary goals, namely optimization.

After laying out the design with Operators and Relations, the Matchers and Agents are used to
perform the actual compilation and execution. Agents are specified in an ordered list for each
supported database. Each Agent contains a Matcher, which is used to locate the specific set of
operators that the Agent can process.

5 Design
5.1 Graphical User Interface Design
The graphical user interface of RAGui is a working prototype. It is based on Java Swing and
applies the common Model-View-Controller design. Operators and Relations are added to the
primary Query pane and then connected to form a complete query.

Both Relations and Operators are drawn on the Query pane using the QueryNode interface. The
QueryNode interface defines methods to obtain the renderer, get and set the node's location,
locate connection points, and return the list of parameters. The QueryNode interface is
implemented by the RelationModel and OperatorModel classes, which also extend the Engine-defined
Relation and Operator classes respectively.

The parameter lists allow the OperatorModel and RelationModel objects to expose property-
value pairs to the user which can be edited in a popup dialog. These parameters represent
slightly different things for Operators and Relations. Parameters for relations allow the user to
specify a long name and a database alias. Parameters for Operators represent the customizable
parameters that are stored in the Operator and used during execution. Examples include the
condition for a filter operator or the column list for a project operator.

The Renderer which is returned by each QueryNode implementer describes the on screen look
of the Operator or Relation. All Relations are drawn using a rectangular shape whereas
Operators use the renderer specified in the operator configuration file. In addition to defining
the shape, Renderers also define a set of connection points (represented as green boxes). The
connection points are what the user would click on to draw a connection line between two
QueryNodes. Each connection point is given a name (e.g. topleft or bottom). For Operators, these
names are referenced in the operator configuration file where they are mapped to operator
specific names (e.g. leftinput and rightinput).

Figure 3 - Example of Renderers

5.2 Engine & Data Design


5.2.1 Class Diagram
The full class diagram is too complex to show here, so only the more interesting methods and
relationships are shown. The three key components mentioned earlier map to our class diagram
as shown. The Operator, OperatorDefinition and QueryParameterList classes represent
Operators. Agents are represented by the abstract EngineAgent class and Matchers are defined
by the EngineAgentMatcher interface.

Figure 4 - RAGui Class Diagram


The main interface to the RAGui engine is the Engine class. The Engine class is a coordinator
and contains three primary helper classes: QueryExecutionManager, DatabaseManager, and
OperatorLoader.

5.2.2 Class Details


QueryExecutionManager
Upon connecting to a database, a QueryExecutionManager is configured with an ordered list of
EngineAgent objects that were defined for the type of database to which the connection was
made. The QueryExecutionManager is then responsible for coordinating the execution of the
query. The execution of a Query is described in more detail in Section 5.3.

CompiledQuery
After a Query is compiled by the QueryExecutionManager, a CompiledQuery object is created.
It contains a collection of CompiledSubQuery objects. A CompiledQuery also maintains the set of
operators that have already been assigned to a SubQuery. This ensures that an operator belongs
to only one SubQuery. It is possible to have multiple EngineAgents that match the same pattern.
If the CompiledQuery did not check this, the same operator could be compiled and executed more
than once.

CompiledSubQuery
A CompiledSubQuery contains a SubQuery and an EngineAgent. In other words, it contains the
data (SubQuery) and the operations (EngineAgent) to be performed on the data. EngineAgents
have such methods as getCreationSql which returns an SQL string generated by using the
SubQuery. A CompiledSubQuery object can also determine if it can be executed by determining
if its input relations have been created.

SubQuery
The SubQuery class is a wrapper for a List of Operators and provides an additional method for
retrieving all input relations. Input relations are any of the relations that are not the result of one
of the operators. Retrieving the input relations is useful for determining if a SubQuery can be
executed. All input relations have to be valid for a SubQuery to be executable.
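
The input-relation rule described above amounts to a set difference: relations read by some operator minus relations produced by any operator in the SubQuery. The following is a minimal sketch of that rule, not RAGui's actual code. The Op and SubQuerySketch classes are hypothetical, and relations are reduced to plain strings.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for an operator: some input relations and one output relation. */
final class Op {
    final List<String> inputs;
    final String output;

    Op(String output, String... inputs) {
        this.output = output;
        this.inputs = Arrays.asList(inputs);
    }
}

/** Hypothetical helper mirroring the rule described for SubQuery. */
final class SubQuerySketch {
    /** Returns every relation read by some operator but produced by none of them. */
    static Set<String> inputRelations(List<Op> operators) {
        // First collect everything that is the result of an operator.
        Set<String> produced = new LinkedHashSet<String>();
        for (Op op : operators) {
            produced.add(op.output);
        }
        // Whatever is read but not produced must already exist in the database.
        Set<String> inputs = new LinkedHashSet<String>();
        for (Op op : operators) {
            for (String in : op.inputs) {
                if (!produced.contains(in)) {
                    inputs.add(in);
                }
            }
        }
        return inputs;
    }
}
```

For a filter that reads creature and writes person, followed by a project that reads person, only creature is reported as an input relation.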

EngineAgent
An EngineAgent is a class that generates the commands for executing relational algebra as well
as finding the patterns in the Query. EngineAgent is an abstract class that has three key
methods that need to be implemented by all EngineAgent implementations. The getCreationSql
method returns a SQL String that is used to create the result relation. The getPrimaryKeys
method returns a String of columns that is used to set the primary keys of a relation. Lastly, the
getMatcher method returns the EngineAgentMatcher for the EngineAgent.
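
To make the three-method contract concrete, here is a minimal sketch of an abstract agent and a filter implementation. It is an illustration under stated assumptions rather than the RAGui source: all class names ending in Sketch are hypothetical, the method signatures are guessed from the description above, the SubQuery is flattened to four strings, and the matcher is reduced to a simple type check instead of the find() method described below.

```java
/** Hypothetical, flattened view of a SubQuery that holds one filter operator. */
final class FilterSubQuerySketch {
    final String inputTable;
    final String outputTable;
    final String condition;
    final String inputKeyColumns;

    FilterSubQuerySketch(String inputTable, String outputTable,
                         String condition, String inputKeyColumns) {
        this.inputTable = inputTable;
        this.outputTable = outputTable;
        this.condition = condition;
        this.inputKeyColumns = inputKeyColumns;
    }
}

/** Simplified matcher (assumption): accepts or rejects an operator type. */
interface MatcherSketch {
    boolean accepts(String operatorType);
}

/** Sketch of the three-method agent contract; signatures are assumptions. */
abstract class EngineAgentSketch {
    abstract String getCreationSql(FilterSubQuerySketch subQuery);
    abstract String getPrimaryKeys(FilterSubQuerySketch subQuery);
    abstract MatcherSketch getMatcher();
}

/** Hypothetical agent that compiles a single filter operator. */
final class FilterAgentSketch extends EngineAgentSketch {
    String getCreationSql(FilterSubQuerySketch sq) {
        return "CREATE TABLE " + sq.outputTable
                + " AS SELECT * FROM " + sq.inputTable
                + " WHERE " + sq.condition;
    }

    String getPrimaryKeys(FilterSubQuerySketch sq) {
        // A filter only removes rows, so the input's key still identifies each row.
        return sq.inputKeyColumns;
    }

    MatcherSketch getMatcher() {
        return new MatcherSketch() {
            public boolean accepts(String operatorType) {
                return "FILTER".equals(operatorType);
            }
        };
    }
}
```

With the creature example from Section 2, the sketch yields the same CREATE TABLE statement shown in the SQL row of the comparison.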

EngineAgentMatcher
EngineAgentMatcher objects are used to find patterns in a Query. Each EngineAgent is
configured with an EngineAgentMatcher that can find operators or sequences of operators that
are compatible with the EngineAgent. EngineAgentMatcher is an interface and contains only
one method, find(), which returns a SubQuery. Successive calls to find() yield different
SubQuery objects until all possible matches have been found.

DatabaseManager
The DatabaseManager is the central point of contact when communicating with the database.
Encapsulated in the DatabaseManager are methods to execute SQL statements and to connect
and disconnect from the database. DatabaseManager contains an instance of the Schema class.
Upon invoking the connect() method, a new Schema object is initialized with the tables that
reside in the database.

Schema
The Schema class is essentially the browser of a database schema. It is used to obtain a list of
tables in the database. It contains methods to allow the creation of Relations as well as the
underlying tables in the database. Many of the methods delegate work to the DatabaseManager.

OperatorLoader
The OperatorLoader contains a collection of all the OperatorDefinition objects. This class parses
the operators.xml file and stores the definitions. Then, when either the GUI or the application needs to
instantiate a new Operator, it calls the OperatorLoader to get the appropriate OperatorDefinition.

OperatorDefinition
OperatorDefinition objects define the structure of an operator, including which parameters,
input relations, and output relations are valid and the name for the Operator.
OperatorDefinition objects are created by the OperatorLoader when the application loads.

QueryParameterDefinition
Contained within an OperatorDefinition is a list of valid parameters. QueryParameterDefinition
defines the structure of a QueryParameter including a key, description, default value and
whether or not it is required. When an Operator is instantiated from an OperatorDefinition,
each QueryParameterDefinition is used to create a QueryParameter object in the Operator.

Query
A Query is the main class used when constructing a query in memory. At its fundamental level,
a Query contains a set of Operator objects and Relation objects. Query provides some useful
helper methods, such as addEdge() which sets the appropriate input/output relation of an
Operator. It also guarantees that a Query can validate itself, instead of forcing validation onto
the Operator. By storing the Relation objects within the Query class, all of the head relations
(relations that are not the output of any Operator) can be retrieved. The head relations are used in the query
execution process to determine which Operators should be executed first.

Operator
Contained within the Query, the Operator class is the most significant of the classes. An
operator contains references to its input and output relations as well as a list of parameters that
can be set. Parameters are such items as a condition or a column_list. The addEdge() method in
the Query class is equivalent to calling setInputRelation/setOutputRelation on an individual
Operator. Each Operator stores a reference to its OperatorDefinition object. Operators are
constructed by passing an OperatorDefinition to the Operator constructor.

QueryParameterList
The QueryParameterList contains a list of QueryParameter objects. It provides methods for
accessing specific QueryParameter objects directly.

QueryParameter
A QueryParameter object references a QueryParameterDefinition and stores its current value.
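
The relationship between a parameter definition and a parameter instance can be sketched as follows. This is a hypothetical illustration: the class names, the instantiate() method, and the isValid() rule (a required parameter needs a non-empty value) are assumptions and are not taken from the RAGui source.

```java
/** Hypothetical sketch of a parameter definition: key, description, default, required flag. */
final class ParamDefinitionSketch {
    final String key;
    final String description;
    final String defaultValue;
    final boolean required;

    ParamDefinitionSketch(String key, String description,
                          String defaultValue, boolean required) {
        this.key = key;
        this.description = description;
        this.defaultValue = defaultValue;
        this.required = required;
    }

    /** Creates a parameter that starts out holding the default value. */
    ParamSketch instantiate() {
        return new ParamSketch(this);
    }
}

/** Hypothetical sketch of a parameter: a reference to its definition plus a current value. */
final class ParamSketch {
    final ParamDefinitionSketch definition;
    private String value;

    ParamSketch(ParamDefinitionSketch definition) {
        this.definition = definition;
        this.value = definition.defaultValue;
    }

    String getValue() {
        return value;
    }

    void setValue(String newValue) {
        this.value = newValue;
    }

    /** Assumed rule: a required parameter must have a non-empty value. */
    boolean isValid() {
        return !definition.required || (value != null && value.trim().length() > 0);
    }
}
```

Keeping the definition separate from the value lets many Operator instances share one definition while each stores its own setting.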

Relation
The Relation class is used to store information about relations used within a query. It references
a JdbcTable object that contains all of the database information about a particular table. It
includes methods to check if a Relation is valid for use in a query.

JdbcTable
The JdbcTable class contains information about a table in the database. The Relation objects use
JdbcTable to access the actual table in the database. It stores the information that the JDBC
package needs to uniquely refer to a table.

JdbcColumn
The JdbcColumn is used to access individual columns in a JdbcTable. It contains a name and a
Boolean stating whether or not it is a primary key.

5.3 Query Execution


A Query is constructed using the GUI and then passed to the QueryExecutionManager, which
controls the execution of Query objects. Execution consists of two major steps: the compilation of the
query and the execution of the compiled query.

5.3.1 Compilation
Compilation of a Query involves breaking down the Query into SubQuery objects and
populating the CompiledQuery object with a collection of CompiledSubQuery objects.

    public void compile( Query query ) throws RaGuiException {
        // Clear any CompiledSubQuery objects left over from a previous compilation
        compiledQuery.reset();

        // Iterate through the engine agents, in order, and find all subqueries
        for( Iterator i = engineAgents.iterator(); i.hasNext() ; ) {
            EngineAgent engineAgent = (EngineAgent) i.next();
            Set s = engineAgent.find(query, compiledQuery);
            compiledQuery.makeCompiledSubQueries(engineAgent,s);
        }
    }

The compiledQuery is a member of the QueryExecutionManager. The reset() method clears out
any previously created CompiledSubQuery objects. Upon initialization of the QueryExecutionManager,
an ordered list of available EngineAgents is stored in engineAgents. Order is important because
more powerful EngineAgents should be listed first so that they see the entire query before
other agents claim its operators.
The for loop then iterates through the engineAgents to find sets of SubQueries in the query.

    Set s = engineAgent.find(query, compiledQuery);

The find method on an EngineAgent returns a set of all SubQuery objects found. For example, if
a FilterAgent was searching the Query and three filter operators were present, find would
return a set of three SubQuery objects. The compiledQuery object is passed to the method to
ensure that no operator is included in more than one SubQuery.

Once the find method returns a set of SubQueries, they are added to the compiledQuery object.

    compiledQuery.makeCompiledSubQueries(engineAgent,s);

This method creates a CompiledSubQuery object in the compiledQuery object for each
SubQuery in the Set s. Compilation continues until there are no more EngineAgents.
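
The effect of agent ordering combined with the one-SubQuery-per-operator rule can be illustrated with a small, self-contained sketch. It is a simplification under assumptions: agents are reduced to a name plus the single operator type they accept, operators to an id plus a type, and the names AgentSpec, OpNode, and CompileSketch are hypothetical.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical agent description: a name and the one operator type it can compile. */
final class AgentSpec {
    final String name;
    final String operatorType;

    AgentSpec(String name, String operatorType) {
        this.name = name;
        this.operatorType = operatorType;
    }
}

/** Hypothetical operator node in a query: an id and a type such as FILTER. */
final class OpNode {
    final String id;
    final String type;

    OpNode(String id, String type) {
        this.id = id;
        this.type = type;
    }
}

final class CompileSketch {
    /** Maps each operator id to the name of the first agent, in list order, that accepts it. */
    static Map<String, String> compile(List<AgentSpec> agents, List<OpNode> operators) {
        Set<String> claimed = new HashSet<String>();
        Map<String, String> assignment = new LinkedHashMap<String, String>();
        for (AgentSpec agent : agents) {
            for (OpNode op : operators) {
                // claimed.add returns false if an earlier agent already took this operator.
                if (op.type.equals(agent.operatorType) && claimed.add(op.id)) {
                    assignment.put(op.id, agent.name);
                }
            }
        }
        return assignment;
    }
}
```

If two agents both accept FILTER, the one listed first receives every filter operator and the second receives none, which is why the more capable agents belong at the front of the list.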

5.3.2 Execution
Upon completion of the compilation, execution is simply a matter of iterating through all of the
CompiledSubQuery objects created in the compiledQuery object and executing them.

    public void execute( Query query ) throws RaGuiException {
        boolean next;

        compile(query);

        // Execute ready CompiledSubQuery objects one at a time until none can run
        do {
            next = executeNext();
        } while (next);
    }

The execute method calls the compile method before the do/while loop is entered. This ensures
that the query has been compiled and is ready to be executed. The do/while loop then
calls executeNext() until there are no more CompiledSubQuery objects that can be executed.

The executeNext() method was added to allow step-by-step execution of a Query.


    public boolean executeNext() throws RaGuiException {
        CompiledSubQuery agent = null;
        boolean found = false;

        // Find the first CompiledSubQuery whose input relations all exist
        for( Iterator i = compiledQuery.iterator() ; i.hasNext() ; ) {
            agent = (CompiledSubQuery) i.next();
            if(agent.canExecute()) {
                i.remove(); // We are done with this CompiledSubQuery
                agent.execute();
                found = true;
                break;
            }
        }

        return found;
    }

The iterator supplied by the compiledQuery object walks through the list of
CompiledSubQuery objects and determines whether each can be executed. The first CompiledSubQuery
object that can be executed is removed from the list and executed, and true is returned. Removing
it from the list ensures that it will not be executed more than once.

There is no elaborate logic employed currently to determine the best iteration process for the
CompiledSubQuery objects. It simply looks at each object in the collection and determines if its
inputs have been supplied yet. If so, it is removed from the set and executed. The process
continues until either all have been executed, or there are no more possible ones to execute.
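
The readiness-driven loop described above can be reproduced in a self-contained form. The sketch below is an illustration, not the RAGui implementation: the Step and ExecutionSketch classes are hypothetical, "executing" a step only records its name and marks its output relation as available, and no database is involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a CompiledSubQuery: a name, its inputs, and its output. */
final class Step {
    final String name;
    final List<String> inputs;
    final String output;

    Step(String name, String output, String... inputs) {
        this.name = name;
        this.output = output;
        this.inputs = Arrays.asList(inputs);
    }
}

final class ExecutionSketch {
    private final List<Step> pending;
    private final Set<String> available;
    final List<String> executed = new ArrayList<String>();

    ExecutionSketch(List<Step> steps, Set<String> existingRelations) {
        this.pending = new ArrayList<Step>(steps);
        this.available = new HashSet<String>(existingRelations);
    }

    /** Runs the first pending step whose inputs all exist; returns false if none can run. */
    boolean executeNext() {
        for (Iterator<Step> i = pending.iterator(); i.hasNext();) {
            Step step = i.next();
            if (available.containsAll(step.inputs)) {
                i.remove();                  // never run the same step twice
                available.add(step.output);  // its result can now feed later steps
                executed.add(step.name);
                return true;
            }
        }
        return false;
    }

    /** Repeats executeNext() until nothing more can run. */
    void executeAll() {
        while (executeNext()) {
            // keep going; each pass runs exactly one step
        }
    }

    int remaining() {
        return pending.size();
    }
}
```

Steps listed out of order still run in dependency order, and a step whose input never appears simply stays pending, which matches the stopping condition described above.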

5.4 Saving & Loading Precedence Charts


Once created, precedence charts are easily saved and loaded. Precedence charts are a collection
of relations and operators and are saved in an XML format. The general structure contains a list
of relations followed by a list of operators. The simple query from earlier is represented by the
following in XML.
<query>
    <name>Query Name</name>
    <relation id="1" coords="300,50">
        <jdbctable>
            <catalog />
            <schema />
            <table>creature</table>
        </jdbctable>
        <primary>true</primary>
    </relation>
    <relation id="2" coords="300,200">
        <jdbctable>
            <catalog />
            <schema />
            <table>person</table>
        </jdbctable>
        <primary>true</primary>
    </relation>
    <operator id="3" coords="300,350">
        <type>FILTER</type>
        <inputs>
            <input name="input_relation" id="1" />
        </inputs>
        <parameters>
            <parameter name="condition" value="c_type='person'"/>
        </parameters>
        <outputs>
            <output name="output_relation" id="2" />
        </outputs>
    </operator>
</query>

Each relation block contains the JDBC table information and a statement determining if a
relation is primary. Relations marked as primary are persisted between executions of queries.
This allows intermediary relations to be removed when constructing complex queries. The
operator structure mimics that of the operator definition. Each input relation, parameter and
output relation is stored. Inputs and outputs refer by id to previously defined relations.
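
A saved chart in this format can be read with the DOM parser that ships with Java. The sketch below only extracts the id and table name of each relation; it is an illustration of the file structure, and RAGui's own loader is not shown in this report. The ChartReaderSketch class and its relationTables method are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class ChartReaderSketch {
    /** Returns one "id=table" entry for each relation element in a saved chart. */
    static List<String> relationTables(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            List<String> result = new ArrayList<String>();
            NodeList relations = doc.getElementsByTagName("relation");
            for (int i = 0; i < relations.getLength(); i++) {
                Element relation = (Element) relations.item(i);
                // Each relation holds exactly one table element inside jdbctable.
                String table = relation.getElementsByTagName("table")
                        .item(0).getTextContent();
                result.add(relation.getAttribute("id") + "=" + table);
            }
            return result;
        } catch (Exception e) {
            // Malformed XML or parser configuration problems end up here.
            throw new IllegalArgumentException("Could not read chart XML", e);
        }
    }
}
```

Because operators refer to relations by id, a loader would typically read all relation elements first and then resolve each input and output against that list.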

6 Example Extensions
The architecture was designed to be very open to new projects and extensions. The core
application would not have to be modified extensively to carry out the extensions below.

6.1.1 Addition of New Operators


Inherent to the architecture is the ability to add new operators to RAGui without modifying the
core application. Adding operators is a two-step process: define the operator and design an
EngineAgent to compile it. Defining the operator involves editing the operators.xml file to add
a description of the new operator. This description must include the name of the operator, the
name of the Renderer class used to draw the icon, a list of inputs, a list of outputs, and a list of
parameters. The following is an XML fragment for the Filter operator.
<operator name="FILTER">
    <property name="renderer" value="edu.umn.msse.ragui.UnaryOperatorRenderer" />
    <inputs>
        <input name="input_relation" connector="top" />
    </inputs>
    <parameters>
        <parameter name="condition" description="Condition" required="true" />
    </parameters>
    <outputs>
        <output name="result_relation" connector="bottom" />
    </outputs>
</operator>

The name of the renderer class is specified as a property with the name renderer. If the new
operator uses the same icon with the same connection points as an existing operator, the
existing Renderer may be used for the new operator. If the icon is different, a new Renderer must
be created by extending NodeRenderer and overriding the necessary methods (most notably,
the draw() method). This new class would then be placed on the application’s classpath and
named in the operator description in the operators.xml file.

The name of the renderer class is the only required property, but other name-value pairs could
be included. These properties would then be available at compile time and could be used to
customize the behavior of a Matcher or EngineAgent (for instance, a Matcher could find all
operators with a specified property set to true).

The lists of inputs and outputs are simply pairs of strings that name the input or output and
relate it to a connection point defined by the Renderer. For instance, when creating a Join
operator one would use the HalfHouseRenderer, which names its connectors peek and side. The
inputs for this operator might then be described as <input name="left_input"
connector="peek" /> and <input name="right_input" connector="side" />. The name left_input
would then be used when compiling the operator to refer to the relation connected to the peek
of the half-house icon.

The list of parameters is composed of name-value pairs that can be edited by the user. For
instance, a Filter operator must define a condition statement. For each parameter, a name,
description, and default value would be configured. The name is used to refer to the parameter
when compiling the operator. The description is the human-readable name of the parameter that
is displayed to the user in the parameter edit dialog. The default value is the value that the
parameter will have if the user does not change it.

The second step of creating a new operator is creating an EngineAgent that has responsibility
for compiling it. To ensure that queries can always be executed, for each operator there should
always exist at least one EngineAgent that is capable of compiling the operator.

First, a Matcher must be selected. If the goal is to create an EngineAgent that just compiles one
operator, the SingleOperatorMatcher may be used. This Matcher is configured with the name of
an operator and will then return SubQueries composed solely of that operator. A more complex
Matcher could be employed if necessary; this is discussed in section 6.1.2.

Next the actual EngineAgent must be created. This involves creating a class that extends the
abstract EngineAgent class. This class specifies a getCreationSql method, which returns an SQL
string when given a SubQuery, and a getPrimaryKeys method, which returns the primary keys
of the result relation. The SQL statement returned would depend on the Operators within the
SubQuery, including their input and output relations and their parameters. The existing
EngineAgents can be used as a starting point.

Finally, the EngineAgent must be made available by listing it in the engine.xml file under each
database system with which it is compatible.

6.1.2 More Complex EngineAgentMatcher and EngineAgent Classes


The current EngineAgent classes and accompanying SimpleEngineAgentMatcher class are very
basic in that the EngineAgent classes map directly to operators. For example, the FilterAgent
class contains the necessary information to execute a filter operator. Future EngineAgent classes
could take multiple operators as input and produce the output in one step. For example, a times
followed by a filter is easily represented as one SQL statement. By skipping the intermediary
step, the database is not forced to create a potentially huge temporary table. More sophisticated
EngineAgents would require more sophisticated EngineAgentMatcher classes to find
additional patterns, such as a times followed by a project followed by a filter. These two classes
have been designed to be as flexible as possible, allowing a wide range of choices for potential
implementations.
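
The benefit of a combined agent can be seen by comparing the SQL each approach would emit for a times followed by a filter. This is a minimal sketch under stated assumptions: the TimesFilterSketch class is hypothetical, the SQL follows the CREATE TABLE ... AS SELECT style used in Section 2, and the two input relations are assumed to have no column names in common.

```java
import java.util.Arrays;
import java.util.List;

final class TimesFilterSketch {
    /** Separate agents: the full cross product is stored in a temporary table first. */
    static List<String> twoStepSql(String left, String right, String condition,
                                   String tempTable, String output) {
        return Arrays.asList(
                "CREATE TABLE " + tempTable + " AS SELECT * FROM " + left + ", " + right,
                "CREATE TABLE " + output + " AS SELECT * FROM " + tempTable
                        + " WHERE " + condition);
    }

    /** Combined agent: one statement, so no intermediary table is created. */
    static String oneStepSql(String left, String right, String condition, String output) {
        return "CREATE TABLE " + output + " AS SELECT * FROM " + left + ", " + right
                + " WHERE " + condition;
    }
}
```

The one-step form leaves the database free to apply the condition while producing the product, instead of materializing every row pair before filtering.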

6.1.3 Support for Additional Databases


Because JDBC is used to abstract the actual communication with the databases, adding support
for new databases that have JDBC drivers is very straightforward. In most cases, the only
change required is a modification to the configuration file to specify the database driver’s class
name, DSN, and supported EngineAgents.

The database driver’s class and DSN are JDBC specific items and are only applicable to JDBC
supported databases. The class is the name of the standard JDBC driver for the database. The
DSN specifies the format of the DSN used to connect to the database. It can include
placeholders for the host name, database name, user name, and password, which are specified
at runtime from the user interface.
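
Filling in a DSN format at runtime is a simple string substitution. The sketch below assumes a ${name} placeholder syntax purely for illustration, since the report does not show the syntax RAGui actually uses. The DsnTemplateSketch class is likewise hypothetical.

```java
import java.util.Map;

final class DsnTemplateSketch {
    /** Replaces each ${key} in the template with the matching value from the map. */
    static String expand(String template, Map<String, String> values) {
        String result = template;
        for (Map.Entry<String, String> entry : values.entrySet()) {
            // String.replace substitutes literal text, so no regex escaping is needed.
            result = result.replace("${" + entry.getKey() + "}", entry.getValue());
        }
        return result;
    }
}
```

For example, expanding jdbc:postgresql://${host}/${database} with host localhost and database zoo gives jdbc:postgresql://localhost/zoo. The database name zoo is invented for this example.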

Deciding which EngineAgents are supported involves determining which of the existing
EngineAgents produce compatible SQL. If a basic operator’s EngineAgent does not produce
compatible SQL, a new EngineAgent will be required to compile that operator.

If additional functionality is required, a new DatabaseManager can be specified. Because all
communication with the database is routed through the DatabaseManager, a new class
implementing its interface could customize the access to the database and bypass JDBC entirely.
This usually would not be required, but one obvious use would be to stub out the actual
database for testing.
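
A stub of this kind can be very small. The following sketch records SQL statements instead of sending them to a database. The DatabaseAccessSketch interface is a hypothetical reduction of the DatabaseManager's responsibilities (connect, disconnect, execute) and does not reproduce the real method signatures.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical reduction of the DatabaseManager's responsibilities. */
interface DatabaseAccessSketch {
    void connect();
    void disconnect();
    void execute(String sql);
}

/** Test stub: remembers every statement instead of talking to a database. */
final class RecordingDatabase implements DatabaseAccessSketch {
    final List<String> statements = new ArrayList<String>();
    private boolean connected;

    public void connect() {
        connected = true;
    }

    public void disconnect() {
        connected = false;
    }

    public void execute(String sql) {
        if (!connected) {
            throw new IllegalStateException("not connected");
        }
        statements.add(sql);
    }
}
```

A unit test can then run a query through the engine and compare the recorded statements with the expected SQL, without needing a live database.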

7 Software Development Process


Although we did not use a formally documented software development process, we did have a
good working process in place to develop our application. Each week we would meet and
decide what the major pieces of functionality needed to be completed that particular week. We
would then decide who would implement the changes and go from there. Because the software
was new to both of us, we did not have formal requirements beyond our initial goals for the
system. We felt that we would learn so much along the way that it was better to spend our
energy on design and coding than on writing a formal requirements document.

7.1 Tools Used


We decided very early which tools we would use. Eclipse was chosen as the IDE because it is an
open source project and has a lot of nice built-in features for team development, including CVS
support. We used a CVS repository to manage our source code as well as other project files
including this report. Eclipse also included support for JUnit. Lastly, Eclipse gave us coding
standards enforcement for free. We began by setting up the rules for source code generation
and then imported the configuration into each of our Eclipse workspaces.

7.2 Design Process


Our design process was simple. Because we did not have a clear understanding of how the
system would evolve, our design was fluid and changed rapidly. We would design the
application one way and quickly realize that there might be a better way. We did a lot of
refactoring to accommodate our quickly evolving design and found that the tools supported us
well. A good example of this evolution is the EngineAgent / EngineAgentMatcher relationship.
Originally, each EngineAgentMatcher contained a list of available EngineAgent classes to find
in a Query. After a period of development, we decided that each EngineAgent should instead
contain an EngineAgentMatcher. That way, each EngineAgent can tell others what pattern it
wants to look for. This change simplified our code and provides more extensibility for future
projects.
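
The final arrangement, in which each EngineAgent owns its EngineAgentMatcher, can be illustrated with the minimal sketch below. Only the names EngineAgent and EngineAgentMatcher come from the project; the method signatures, the representation of a query as a list of operator names, and the helper classes `SequenceMatcher`, `SimpleAgent`, and `AgentRegistry` are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical matcher: reports whether a sequence of operator names fits a pattern. */
interface EngineAgentMatcher {
    boolean matches(List<String> operators);
}

/** Hypothetical agent: owns its matcher, so it can tell callers which pattern it compiles. */
interface EngineAgent {
    String name();
    EngineAgentMatcher getMatcher();
}

/** Matches one exact sequence of operators, e.g. times, project, filter. */
final class SequenceMatcher implements EngineAgentMatcher {
    private final List<String> expected;

    SequenceMatcher(String... expected) {
        this.expected = Arrays.asList(expected);
    }

    @Override
    public boolean matches(List<String> operators) {
        return expected.equals(operators);
    }
}

/** Minimal agent that simply carries a name and the matcher it was built with. */
final class SimpleAgent implements EngineAgent {
    private final String name;
    private final EngineAgentMatcher matcher;

    SimpleAgent(String name, EngineAgentMatcher matcher) {
        this.name = name;
        this.matcher = matcher;
    }

    @Override public String name() { return name; }
    @Override public EngineAgentMatcher getMatcher() { return matcher; }
}

/** Asks each agent's own matcher whether it recognizes the given operators. */
final class AgentRegistry {
    private AgentRegistry() {}

    /** Returns the first agent whose matcher accepts the operators, or null if none does. */
    static EngineAgent firstMatch(List<EngineAgent> agents, List<String> operators) {
        for (EngineAgent agent : agents) {
            if (agent.getMatcher().matches(operators)) {
                return agent;
            }
        }
        return null;
    }
}
```

Because the lookup code only talks to `getMatcher()`, a new agent with a more elaborate pattern can be added to the list without changing the code that searches for matches.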

7.3 Testing Strategy / QA


It is widely held that good unit testing catches a large share of the defects in software.
Because our application is written in Java, we were able to use JUnit for all of our unit
testing. As mentioned earlier, Eclipse’s built-in support for JUnit gave us the unit testing
infrastructure we needed. As we wrote our main classes, we also wrote the accompanying unit
tests to ensure that our classes were correct. In the end, we had a good set of unit tests to
run against our code base. Re-running the full suite of unit tests helps us validate any new
source code introduced.

8 Lessons Learned
We designed for some features knowing that, given enough time, they could be
implemented. Along the way we learned good ways to apply the ideas learned in the M.S.S.E.
program to solve a real world problem. Using Relational Algebra for constructing queries is a
good alternative to SQL, but there is very little tool support for it. Creating a graphical user
interface for relational algebra helps fulfill a real need.

We found refactoring to be a major factor in the success of RAGui’s extensible framework. The
tool that we used supported refactoring quite nicely, so it was easy for us to change our minds
about a particular approach. We could try small experiments with an aspect of the design and
then decide that we did not like the approach, and try something different. We were always
looking for better ways to solve the problem.

Lastly, we took out of this project a better understanding of how to design an extensible
framework that needs to support future enhancements. During the entire design process, we
knew that this was the primary goal of the system. This helped us with our design decisions
because we were continuously looking at the evolution of the design and making sure that our
decisions supported future extensions.

9 Conclusion
This project has been a good learning experience, both from a pure programming standpoint
and from a software engineering standpoint. We utilized the theories and ideas presented in the
M.S.S.E. program to produce a quality application that can be used to both learn relational
algebra and create complex queries for use in any Java based application. We hope that future
projects can expand upon our solid architecture and make RAGui a very usable and high
quality application.

Third Party Software Used


Java http://java.sun.com/

Eclipse http://www.eclipse.org/

Log4j http://logging.apache.org

JDOM http://www.jdom.org

MySQL http://www.mysql.com

PostgreSQL http://www.postgresql.org

Doxygen http://www.doxygen.org

JGoodies http://www.jgoodies.com
