
Tom Keschl COS 571 Term Paper

Object-Relational and Object-Oriented Systems


Overview
Though both object-oriented and relational database management systems have been available for decades, there is still a fair amount of confusion as to what each can provide. While it is true that the vendors of relational products have found it necessary to expand their systems to include some object-oriented features, these systems are, at the core, still based on the relational model. What is the difference between the object-oriented model and these object-oriented extensions of the relational model? When is it appropriate to use each? Why is the object-relational model still so dominant in the marketplace? These questions (and more) consistently plague the managers and IT professionals who are launching a new project and trying to determine whether there is any merit in switching from a relational to an object-oriented approach for persistence.

In brief, the following table describes the major features of the two systems. These features will be further discussed and dissected in subsequent sections of this paper.

Object-Relational
Advantages: Stability; Widely used and supported; Major vendors; Well-studied mathematical model; Single type in storage (tables); SQL or SQL-style query syntax; Schema definition and manipulation
Disadvantages: Impedance mismatch; Queries require joins; Limited set of types

Object-Oriented
Advantages: No impedance mismatch; Speed; Developers familiar with style; Optimization left to developers; User-defined types; Support for extended relations
Disadvantages: No universal type model; Immature mathematical model; SQL-style syntax harder to achieve; Optimization left to developers; Less support for changing schema

Table 1 - Differences between Object-Oriented and Object-Relational Systems

This is an overly simple (but intentionally so) representation of the two types of systems. While some of these items may seem obvious in terms of why they are in their respective categories, other topics are more subtle. Why, for instance, is it both an advantage and a disadvantage for object-oriented systems that the optimization is left to the developers of a system? What is impedance mismatch, and why is it important to note that this particular issue is a disadvantage for object-relational systems and an advantage for object-oriented ones? The answers are not simple and will be covered in more detail later, but for now it is sufficient to say that these features combine to influence the results obtained when comparing both systems.

Like the discussion of features, these comparisons will be tackled later with a major focus on two categories: schema and speed. Each technology performs differently in these areas, and knowledge of these differences is valuable when determining which system would be best for a particular project.

Finally, these results will culminate in a distinct list of when each system is best used. While this section is the most concrete derivative of the discussions in this paper, it is also perhaps the most subjective. The fact remains that object-oriented systems occupy a very small percentage of the market, and this final section may illuminate some reasons why this is the case, as well as (hopefully) encourage wider use of these types of systems by clarifying some of the lingering misconceptions about object-oriented database systems.

Relational/Object-Relational Systems
Object-relational systems are a natural extension of the relational model, meant to make up for some of its known deficiencies. For complex applications, the simple types available to the relational model were not adequate. Object-relational systems extend the type system and allow users to create nested tables in an attempt to better represent complex object data. This approach allowed some of the major database vendors to keep their market share, but because it is an extension of the relational model, it inherits many of the same kinds of problems, which is why discussion of the relational model remains relevant.
Advantages

Stability
Because relational and object-relational systems have been widely used, adopted, and tested, and are well supported, many of the issues regarding the implementation of such systems have been resolved. There is an entire profession dedicated to building and administering these types of databases. Many schools offer a curriculum dedicated to producing competent individuals in that field. All of these factors mean that relational and object-relational systems are incredibly stable.

Widely supported
Because relational systems were adopted early as the persistence system of choice, they became the dominant technology on the market. This popularity continues today and has led to some of the factors that allow these systems to be stable. Almost everyone is familiar with relational or object-relational systems, while relatively few people are familiar with systems based on the object-oriented model (at least, in terms of persistence).

Major product on the market
Being the most well-known and widely supported system on the market has its advantages. Vendors, such as Microsoft, have built database integration into their platform, adding SQL-style querying that relational database people feel more comfortable with. By comparison, object-oriented system vendors work hard to ensure their software works with the underlying virtual machines of the languages with which they are meant to be used, and they don't have enough clout to be able to do anything else.

Mature mathematical model
Relational algebra (the mathematics behind the relational and object-relational models) has been studied for years by industry leaders and independent researchers alike. This level of research has led to a thorough understanding of how to organize tables and queries in order to speed up the query process. This has allowed object-relational systems to become highly competitive in recent years in terms of speed.

Single type in storage (tables)
Having a single type in storage is an advantage in some respects, but a disadvantage in others. In terms of consistency, having a single type of data in persistence (the table) allows the applied relational algebra to shine. While this does restrict the form of the data that can be persisted, the benefits this kind of data provides for performance can, for some applications, outweigh the inability to have an orthogonal type of persistence. This also simplifies the schema and the process of translating from the design to the implementation of a particular database.

SQL-style query syntax
SQL, having been developed originally for use with relational systems, is perfectly capable of defining queries on object-relational systems. This is an advantage because most database people are familiar with the way these queries work, and indeed, there are ongoing attempts to define an SQL-style query language for object-oriented systems in order to meet the demands of these users. This is one of the most visible differences between object-relational and object-oriented systems.
Schema definition and manipulation
Because there is a single type in persistence, it is easy to understand schema diagrams and see their impact on performance. It also means that the schema can be changed and a process can be run to upgrade the data that is already persisted so that it conforms to the new schema.

Disadvantages

Impedance mismatch
The relational model (even when enhanced, as in the object-relational model) does not perfectly match up with the object-oriented technology used to build most professional systems. As a result, certain applications of object-oriented technology cannot use an object-relational system to persist their data. This is the main problem which the object-relational model is meant to solve, and while it has certainly made relational systems easier to use with object-oriented programming languages, it has not yet mitigated all of the mismatch problems inherent between the two types of systems.

Queries require joins
Each join required by a query, when not optimized, can be an O(N × M) operation, where N is the number of rows in one table and M is the number of rows in the other. For large databases, this can be an incredibly slow operation. However, relational databases have reduced OO databases' performance advantage with improved optimizers (Leavitt, 2000). These optimizations are alleviating this problem, but it still exists.

Limited set of types
Relational systems have a limited set of types that can fill each column in a table. Object-relational systems have extended this set of types, including support for Binary Large Objects (BLOBs). However:

The use of BLOBS is not an elegant solution, some of the drawbacks are: A BLOB cannot contain other BLOBs. This makes representation of composite objects impossible. BLOBs do not have the behavioral aspects of objects. It means that a BLOB cannot contain methods that manipulate its internal data structure. (Olsson & Nordkvist)
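The BLOB drawbacks quoted above can be made concrete with a small sketch. It is an illustration only, assuming Python's standard sqlite3 and json modules stand in for a relational store and for object serialization; the Part class, the parts table, and the helper functions are hypothetical names, not taken from any of the cited systems. The database sees only opaque bytes, so a filter on a field inside the object has to run in application code after every row has been fetched, and the is_heavy behavior lives only in the program, not in the stored data.

```python
import json
import sqlite3
from dataclasses import asdict, dataclass


@dataclass
class Part:
    """Hypothetical application object with state and behavior."""
    name: str
    weight_kg: float

    def is_heavy(self) -> bool:
        return self.weight_kg > 10.0


def store_parts(conn, parts):
    """Persist each object as an opaque BLOB column (serialized state only)."""
    conn.execute("CREATE TABLE parts (id INTEGER PRIMARY KEY, body BLOB)")
    conn.executemany(
        "INSERT INTO parts (body) VALUES (?)",
        [(json.dumps(asdict(p)).encode("utf-8"),) for p in parts],
    )


def heavy_part_names(conn):
    """SQL cannot see inside the BLOB, so every row is fetched and
    deserialized before the filter can run in application code."""
    rows = conn.execute("SELECT body FROM parts ORDER BY id").fetchall()
    parts = [Part(**json.loads(body)) for (body,) in rows]
    return [p.name for p in parts if p.is_heavy()]


conn = sqlite3.connect(":memory:")
store_parts(conn, [Part("bolt", 0.1), Part("engine", 150.0)])
stored_type = conn.execute("SELECT typeof(body) FROM parts").fetchone()[0]
heavy = heavy_part_names(conn)
print(stored_type, heavy)
```

The method is_heavy is reattached only because the application rebuilds a Part from the bytes; nothing in the table knows it exists.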

Such attempts at extending the types which can be stored in relational databases tend to result in similar kinds of problems. This is a further consequence of the impedance mismatch problem.

Object-Oriented Systems
While object-oriented systems have been around for roughly the same amount of time as relational systems, they have not garnered nearly as much of the market share as one would expect. They remain a niche product, suited for use in areas where the challenges posed by the relational model are hard to overcome. These areas include CAD, aerospace industries, and complicated scientific applications. Object-oriented persistence is benefiting from a fairly recent surge in research from the academic world, as well as exposure for its applications in recent European space programs and in the Large Hadron Collider. The model itself is very familiar to professionals coming from object-oriented programming, as it has the same features and solves, by design, the impedance mismatch problem which plagues relational-based systems. However, because it has always remained the underdog in the market, professionals in industry are much less familiar with how these systems work. There are currently major attempts to make these systems more familiar, but so far, object-oriented systems remain a niche product.

Advantages

No impedance mismatch
This is the main advantage of object-oriented persistence systems. Because they are based on the same model as the major programming technologies, there is no need to spend any time or resources on translating data between the two:

With OO databases, the application and the database use exactly the same object model. This isn't the case with relational databases, with which users must utilize an object model for the application and a relational-data model for the database. Users thus must develop mapping procedures between the object and relational models. (Leavitt, 2000)
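The mapping procedures Leavitt mentions can be sketched as follows. This is a minimal illustration, assuming Python's standard sqlite3 module as the relational store; the Order and LineItem classes, the table layout, and the save_order and load_order helpers are all hypothetical. A nested object has to be taken apart into rows of two tables on the way in and reassembled by hand on the way out.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class LineItem:
    product: str
    quantity: int


@dataclass
class Order:
    customer: str
    items: list = field(default_factory=list)


def create_schema(conn):
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
    conn.execute(
        "CREATE TABLE line_items ("
        "order_id INTEGER REFERENCES orders(id), product TEXT, quantity INTEGER)"
    )


def save_order(conn, order):
    """Object -> rows: one row for the order, one row per nested item."""
    cur = conn.execute(
        "INSERT INTO orders (customer) VALUES (?)", (order.customer,)
    )
    order_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO line_items VALUES (?, ?, ?)",
        [(order_id, item.product, item.quantity) for item in order.items],
    )
    return order_id


def load_order(conn, order_id):
    """Rows -> object: rebuild the nested structure by hand."""
    (customer,) = conn.execute(
        "SELECT customer FROM orders WHERE id = ?", (order_id,)
    ).fetchone()
    rows = conn.execute(
        "SELECT product, quantity FROM line_items "
        "WHERE order_id = ? ORDER BY rowid",
        (order_id,),
    ).fetchall()
    return Order(customer, [LineItem(p, q) for p, q in rows])


conn = sqlite3.connect(":memory:")
create_schema(conn)
original = Order("Acme", [LineItem("widget", 3), LineItem("gear", 1)])
order_id = save_order(conn, original)
restored = load_order(conn, order_id)
print(restored == original)
```

Every new class or relationship in the application needs another pair of such routines (or an ORM tool to generate them), which is the cost an object-oriented store is meant to avoid.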

This fact results in reduced time-to-market for projects that utilize these types of systems when compared against those that utilize the market-standard relational systems.

Speed
Because object-oriented systems can use objects rather than joins to represent M:N relations, these databases can be markedly faster. While object-relational systems have optimized their queries to maintain competitive advantage, object-oriented systems are well positioned to be the dominant performer in years to come. This will be the subject of further discussion in a later section, but for now it is sufficient to say that object-oriented systems, as a rule, are faster than object-relational ones, even if the difference is marginal.

Developers familiar with style
This is a kind of style advantage. Organizations which utilize an object-oriented persistence system no longer have to worry about hiring for the DBA and data entry positions traditionally required for object-relational systems. Anyone who is familiar with object-oriented design can simply design the object model and the database will just work.

Optimization left to developers
This is both an advantage and a disadvantage. Because the demands of different applications are many and varied, most object-oriented database vendors take a hands-off approach to managing optimization. Certain default values are picked (such as the number of links to traverse when updating stored data, or the number of links to traverse when deleting objects), but it is left to the developers to tweak the settings as necessary. Clearly, however, having no good way to universally optimize complex transactions adds to the list of things that are required when developing these applications.

User-defined types
If having a limited set of types was a challenge for relational systems, the ability to have complex user-defined types is a definite advantage for object-oriented systems.
Orthogonal systems in particular, because they allow persistence of data regardless of type, suffer none of the drawbacks of having to fit square-peg data into a round-hole model of persistence. Everything can be stored, no matter what.

Support for extended relations
The reason why object-oriented databases are fast is also an advantage. Rather than tables which have references to rows of data, object-oriented systems can store references in the same way that this is done in object-oriented programs. This means that finding a particular object is simply a matter of traversing references rather than performing multiple joins.

Disadvantages

Many types
Having no single type or model makes universal knowledge of object-oriented systems a hard, if not impossible, task. Every object-oriented system, by virtue of allowing any type to be persisted, is a new challenge in terms of finding and tracking down performance problems. There is some level of acclimation that an optimization professional needs when first brought on board to solve an organization's performance problems.

Immature mathematical model
Like relational systems, object-oriented systems have their own mathematical model. Unlike relational systems, however, research in object algebras is just commencing. While this promises that better optimization and knowledge of object-oriented systems is forthcoming, the present ability to optimize is somewhat limited (Bagui, 2003).

SQL-style query syntax harder to achieve
Because most object-oriented systems are built to work with the underlying languages without modifying the interpreters, achieving a SQL-style syntax within the code is difficult, if not impossible. LINQ, an extension of the C# language, makes SQL-style syntax possible, but this initiative was meant more to make C# easier to use with Microsoft's database product than to promote object-oriented persistence.

Schema Comparison
Having completed an initial analysis of the advantages that both the object-relational and object-oriented technologies have, as well as the challenges they face, it is clear that one area worth closer examination is the difference between the schema definitions of each technology. By sheer virtue of the models that each system is based on, there is a large degree of variance between the two. But there is one schema issue which deserves further exploration: changes in schema.

Object-Relational
As has been shown, there are both advantages and disadvantages to the relational model having table-based persistence. The object-relational model extends this model to allow more complicated objects. However, the limitations remain: no object behavior can yet be stored using the object-relational model. There is an advantage to table storage, however. As the needs of the business, or the data itself, change, the data can be migrated to satisfy the requirements of the newest schema, making these databases more flexible.

Impedance mismatch
Currently, there are two accepted standards for storing complicated objects: SQL:1999 (SQL3) and Oracle 8i. In SQL3, support for complex objects is provided by creating references to custom tables. Under the hood, this is really an automatically generated id (called an OID or Object Identifier) which is referenced by the parent table. This OID is then used to locate the values in the fields of the complex object.

Figure 1 - Employ object as a column type (a) and table type (b) (Marcos, Vela, Cavero, & Cáceres, 2000).

The above example shows the two ways in which SQL3 can represent a complex type. Figure 1a shows how an object is included as a column type. Figure 1b shows how the table for that type is constructed. The Oracle 8i specification behaves the same for complex objects, but it also allows nested tables in order to represent collections. Nested tables are exactly that: a column which contains a whole table as its type. "This extension not supported by SQL:1999, allows [representation of] an object collection embedded in another object, that could be a natural way to implement UML aggregation" (Marcos, Vela, Cavero, & Cáceres, 2000).

Figure 2 - Nested tables allowed by Oracle 8i (Marcos, Vela, Cavero, & Cáceres, 2000).

As is evident, however, none of these approaches overcomes the problems of impedance mismatch. The biggest gap, which would be hard to close, is that of storing behavior. These tricks of storing nested tables and referring to user-defined tables as types for columns only solve the issues of storing the data. There is no way, using any current object-relational technology, to store behaviors associated with objects in programs.
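The two storage ideas described in this section can be approximated in a short sketch. It is an emulation under stated assumptions, not SQL:1999 or Oracle 8i syntax: SQLite (used here through Python's standard sqlite3 module) has neither reference types nor nested tables, so an ordinary integer key plays the role of the OID and a child table keyed by its owner plays the role of the nested table. All table, column, and function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Table standing in for a user-defined type; its key plays the OID role.
    CREATE TABLE address (oid INTEGER PRIMARY KEY, street TEXT, city TEXT);
    -- Parent table whose column holds a reference to the complex value.
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY,
        name TEXT,
        address_oid INTEGER REFERENCES address(oid)
    );
    -- Emulation of a nested table: a collection owned by one employee row.
    CREATE TABLE employee_phone (
        employee_id INTEGER REFERENCES employee(id),
        number TEXT
    );
    INSERT INTO address VALUES (100, '1 Main St', 'Springfield');
    INSERT INTO employee VALUES (1, 'Ada', 100);
    INSERT INTO employee_phone VALUES (1, '555-0100');
    INSERT INTO employee_phone VALUES (1, '555-0101');
    """
)


def load_employee(conn, employee_id):
    """Follow the OID to the complex value and gather the nested collection."""
    name, street, city = conn.execute(
        "SELECT e.name, a.street, a.city FROM employee e "
        "JOIN address a ON a.oid = e.address_oid WHERE e.id = ?",
        (employee_id,),
    ).fetchone()
    phones = [
        number
        for (number,) in conn.execute(
            "SELECT number FROM employee_phone "
            "WHERE employee_id = ? ORDER BY number",
            (employee_id,),
        )
    ]
    return {
        "name": name,
        "address": {"street": street, "city": city},
        "phones": phones,
    }


employee = load_employee(conn, 1)
print(employee)
```

What comes back is structured data only; any methods the application's employee objects carry still have to be supplied by the program that loads them.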

Schema conversion
Because everything stored within an object-relational database is in tables, it is possible to define processes that will allow the data to be converted when the schema is updated. If a business or institution, during the operation of some administrative program, suddenly desires to add a field to a particular set of data, the data previously stored in the database can be easily made compatible with the changed schema. Though this requires writing and running a process which will do the conversion, it is a feat which can be accomplished without much difficulty. Even upgrading a relational system to use features of the extended object-relational system is possible, while keeping legacy data in a state where it can be used with the new types. This is important for many businesses for which data directly correlates to profit. In such cases, a relational system might be preferable so that it is possible to maintain the same database while constantly receiving the benefits of advances made in object-relational technology. Such database models tend to be more flexible and can change to meet the ever-fluctuating needs of a business over the course of its lifetime.

Object-Oriented
On the other hand, object-oriented databases do not suffer from the problems and pitfalls of impedance mismatch. The programming language and the database both use the same object model and can work well together as a result. This also means that object-oriented databases can be truly orthogonal, holding any user-defined types without having to resort to tricks or emulation. However, these databases, as we will see, tend to be harder to change, and no universal type model has yet been identified which would make this problem easier.

No impedance mismatch
Object-oriented systems are object-oriented by definition. They support inheritance and have no fundamental mismatch, nor do they need any sort of conversion routines: they use the exact same model. Because objects themselves are stored, the behavior can also be stored. This may never be possible for object-relational systems.

Schema conversion
While object-oriented systems solve the impedance mismatch problem, they introduce another unique problem. Anyone familiar with refactoring a class to add a field or method in an object-oriented program will be familiar with the challenges faced when doing the same thing to a database.

In addition to the fact that no single standard data model has yet been developed for OODBs, most OODBs do not allow dynamic changes to the database schema, such as adding a new attribute or method to a class, adding a new superclass to a class, dropping a superclass from a class, adding a new class, and dropping a class. RDBs allow the user to dynamically change the database schema using the ALTER command; a new column may be added to a relation, a relation may be dropped, a column can sometimes be dropped from a relation. (Bagui, 2003)

Updates to the schema must be made and managed carefully by the users. This problem makes such databases unsuitable for use in an environment where the structure of the data is rapidly changing. Extension by inheritance, however, is fully supported. If the original data model is well designed, old data can be used with the updated schema, and new data can use the more specific objects that have the added functionality desired. This kind of extension only makes sense in certain applications and does not suit the needs of many industries.
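The two styles of schema change discussed in this section can be put side by side in a small sketch. It is illustrative only, assuming Python's standard sqlite3 module for the relational side and plain Python classes for the object side (no object database is involved); the client table and the Client and RegionalClient classes are hypothetical. On the relational side an ALTER statement adds a column and existing rows pick up a default; on the object side the original class is left untouched and new data uses a subclass.

```python
import sqlite3
from dataclasses import dataclass

# Relational side: add a column; existing rows pick up the default value.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE client (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO client (name) VALUES ('Acme')")
conn.execute("ALTER TABLE client ADD COLUMN region TEXT DEFAULT 'unknown'")
conn.execute("INSERT INTO client (name, region) VALUES ('Globex', 'EU')")
rows = conn.execute("SELECT name, region FROM client ORDER BY id").fetchall()
print(rows)


# Object side: extend by inheritance instead of changing the stored class.
@dataclass
class Client:
    name: str

    def label(self) -> str:
        return self.name


@dataclass
class RegionalClient(Client):
    region: str = "unknown"

    def label(self) -> str:
        return f"{self.name} ({self.region})"


# Objects of the original class remain valid next to the new subclass.
stored = [Client("Acme"), RegionalClient("Globex", "EU")]
labels = [c.label() for c in stored]
print(labels)
```

The old Client objects never gain a region, which is the trade-off the text describes: extension is cheap, but changing what is already stored is not.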

Speed Comparison
Speed, or more generally performance, is a major factor in most business decisions. For most IT projects, performance ranks as a high priority among all the potential non-functional requirements. This comparison also produces some of the more astonishing results of the entire comparison between object-oriented and object-relational databases. The following figures compare a popular object-relational mapping tool (Hibernate, which persists objects to a relational database) with Versant's object-oriented database:

Figure 3 - Hibernate vs Versant while performing complex queries (Kopteff, 2007).

Figure 4 - Hibernate vs Versant while performing structural modifications (Kopteff, 2007).

In Figure 3, Versant beats Hibernate in almost every complicated query. Those in which it loses could be due to the fact that Hibernate was optimized for the benchmark query, while the Versant database had no optimizations at all. These queries do a number of things: joins, projections, filters, etc. It is clear from this chart that Versant outperforms Hibernate on most queries. Figure 4 demonstrates the capabilities of each technology while inserting and deleting items. Versant beats the competition in these areas as well, particularly in the case of delete.

While this study seems to have a damning impact on the merits of relational or object-relational systems, there is a caveat:

If all database applications required only OID lookups with database objects or memory pointers chasing other objects in memory, two to three orders of magnitude performance advantage for OODBs over RDBs would be valid. However, most applications that require OID lookups also have database access and update requirements that RDBs have been designed to meet. These requirements include bulk database loading; creation, update, and delete of individual objects (one at a time); retrieval from a class of one or more objects that satisfy certain search conditions; joins of more than one class; transaction commit, and so forth. For such applications, OODBs do not have any performance advantages over RDBs. (Bagui, 2003)

The benchmark also seems weighted in favor of things that object-oriented databases tend to do well. Another chart (not shown above) from the Kopteff paper shows that object-oriented databases are typically good at traversing references. This should be no surprise; it is far easier, even when swizzling is involved, to traverse a reference than to do a table lookup. Although these benchmarks attempt to make a case that they are comparing apples with apples, it is probably more appropriate to say that they are comparing significantly different kinds of apples.

That said, there is another factor to consider. Object-oriented persistence systems are just now making inroads into academic and scientific institutions. The object algebra underlying such systems is still very immature. That immaturity provides hope that many performance-limiting factors will be alleviated in the years to come and that object-oriented systems will become the top performer before too long.
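The contrast between traversing a reference and looking a row up through a join, discussed in this section, can be sketched with a toy in-memory model. This is not a benchmark and makes no claim about Hibernate or Versant: the row lists, the Department and Employee classes, and both functions are hypothetical, the join is deliberately the unoptimized nested-loop form, and a real optimizer would normally use an index or hash join while a real object database would still pay to load objects from disk.

```python
from dataclasses import dataclass


@dataclass
class Department:
    name: str


@dataclass
class Employee:
    name: str
    department: Department  # direct object reference


def department_by_join(employee_rows, department_rows, employee_name):
    """Unoptimized nested-loop join over two row lists.

    Returns (department name, number of row pairs compared)."""
    comparisons = 0
    result = None
    for emp_name, dept_id in employee_rows:
        for row_id, dept_name in department_rows:
            comparisons += 1
            if emp_name == employee_name and dept_id == row_id:
                result = dept_name
    return result, comparisons


def department_by_reference(employee):
    """Follow the stored reference; no search over other rows is needed."""
    return employee.department.name


# Toy data: 100 employees spread over 10 departments.
department_rows = [(i, f"dept{i}") for i in range(10)]
employee_rows = [(f"emp{i}", i % 10) for i in range(100)]

# The same data as an object graph with direct references.
departments = [Department(name) for _, name in department_rows]
employees = [Employee(name, departments[d]) for name, d in employee_rows]

joined_name, comparisons = department_by_join(
    employee_rows, department_rows, "emp42"
)
referenced_name = department_by_reference(employees[42])
print(joined_name, comparisons, referenced_name)
```

The naive join examines every pair of rows (N × M of them), while the reference version touches one object. The gap narrows sharply once the relational side is indexed, which is consistent with the caveat quoted from Bagui.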

Summary
Given all that has been discussed, when is it best to use each technology? The following table summarizes situations in the data and organization that would lend themselves to a choice of one type of system over another. The reasoning for each of these recommendations has already been discussed and is heavily influenced by what the different types of system can do well. Data, non-functional requirements, or business assets may entirely dictate what technology is best to use for a given situation.

When to Use Object-Relational
Data is naturally separated into tuples (inventory, client records)
Stability and support are critical
Organization or structure of data changes frequently
Staff has a great deal of relational knowledge
Automatic optimization is desired

When to Use Object-Oriented
Speed is critical
Object behavior must be persisted
Complex model
Model changes are infrequent or can be handled with object-oriented techniques
Persisted data does not need to be kept through a schema change
Developer optimization is desired

Table 2 - Recommended uses for each type of system.

Conclusion
With the complexity inherent in any IT project, steps must be taken at the outset to determine what type of database is best for the persistent data of the application. While each system has its advantages, the fact that relational and object-relational systems have been dominant in the industry means that many businesses have resources and knowledge in this area, which can make it inadvisable to switch to an object-oriented model. The proven longevity of these systems also means that the vendors can provide more support and that the systems are more stable. A single stored type makes it easy to keep persisted data through changes to the schema. Finally, the effort developers must spend optimizing queries and tables is minimal, and a mature algebra exists for defining and manipulating relations.

However, a more complex model and the desire to persist object behavior might be enough to force developers to use object-oriented systems. While schema changes are trickier, the syntax and ideas of object-oriented systems are more familiar to the modern developer, meaning that the overhead it takes to learn such a system is minimal. The lack of impedance mismatch should mean that time to market is slightly faster than with an object-relational system. However, staff who understand how to administer a relational database would be at a loss to do the same with an object-oriented one. Finally, the lack of joins means that most queries and operations are faster than the same queries done on a relational database, and maturation of the object algebra which describes these types of database means that optimization improvements are bound to be made.

The choice between the two types of system must take into account all these factors. Neither system is strictly better than the other, but when compared in certain areas like schema and speed, each has its advantages and challenges. Before making a recommendation, each technology should be carefully weighed against the tasks it needs to perform, and critical weaknesses of each type of system should be identified. By taking such steps, the system that best meets the needs of the project can be identified and put into place.

It is clear that more and more businesses are requiring some of the functionality that object-oriented models provide; otherwise extension of the relational model would never have been necessary. There will probably come a time in the future when object-oriented systems come to dominate the marketplace in the same way that object-oriented programming languages have. But this prediction has been made before and has not yet come true. For the moment, the two types of system serve different purposes, and neither is going anywhere anytime soon.

Works Cited
Bagui, S. (2003, July-August). Achievements and Weaknesses of Object-Oriented Databases. Retrieved from ETH Zurich Journal of Technology: http://www.jot.fm/issues/issue_2003_07/column2.pdf

Kopteff, M. (2007). The Usage and Performance of Object Databases compared with ORM tools in a Java environment. Retrieved from Object Database Management Systems: http://www.odbms.org/download/045.01%20Kopteff%20The%20Usage%20and%20Performance%20of%20Object%20Databases%20Compared%20with%20ORM%20Tools%20in%20a%20Java%20Environment%20March%202008.PDF

Leavitt, N. (2000, August). Whatever Happened to Object-Oriented Databases? Retrieved from Leavcom: http://www.leavcom.com/pdf/DBpdf.pdf

Marcos, E., Vela, B., Cavero, J., & Cáceres, P. (2000). Aggregation and Composition in Object-Relational Database Design. Retrieved from VU Matematikos ir informatikos institutas: http://www.mii.lt/adbis/local1/marcos.pdf

Olsson, E., & Nordkvist, . (n.d.). Object Oriented Databases. Retrieved from Mälardalens Högskola: http://www.idt.mdh.se/kurser/cd5130/msg/HT2000/download/OODBMS.pdf
