
A TWO-WAY ATTRIBUTE TREE IN P2P OODB SUPPORTING MULTIATTRIBUTE AND RANGE QUERY WITH IMPROVED QUERY RESPONSE TIME

Goh Chiao Wei, Lim Tong Ming
School of Computer Technology, Sunway University College, Bandar Sunway, Malaysia
jaygcw2002@yahoo.com, tongminglim@gmail.com

Abstract

Peer-to-peer (P2P) computing is one of the fastest-growing technologies in computing. It was introduced to overcome the bottleneck of the typical client-server paradigm, which is limited in its ability to handle the rapidly growing demand for data. A peer-to-peer object-oriented database is robust, fault-tolerant and scalable, and it is therefore proposed as a way to overcome the drawbacks of the typical client-server data management architecture. A distributed hash table (DHT), a structured P2P system, provides an indexing facility that enables users to locate a piece of data from a given key through an efficient routing algorithm. The proposed object-oriented database is built on openChord, an implementation of Chord. In this paper, we present a P2P object-oriented database that supports multi-attribute and range queries. Objects are managed locally in each peer to reduce network traffic. The object set returned by a query is cached at the nodes along the routing path so that subsequent queries satisfying the same criteria can access the desired objects efficiently. Nodes holding the object pointers are distributed around the network to reduce lookup time. Nodes are organized into logical class hubs, where each hub handles pointers to the nodes storing objects of a particular class. Queries involving multiple attributes and ranges are routed to the related class hubs to retrieve the desired objects.

Keywords: Object-Oriented Database System; P2P System; DHT; Chord

1. Introduction

Peer-to-peer (P2P) networks are among the fastest-growing computing technologies today (Harren et al.). P2P networks overcome the limitations found in the conventional client-server network, which has a rigid and highly centralized architecture.
The conventional client-server architecture requires a centralized server, or a cluster of servers, to store and process the data requested by clients in the network. Bottlenecks may therefore occur at the servers, which leads to scalability problems. It also creates a single point of failure: once the server goes down or is disconnected, transactions in the network fail. Furthermore, setting up a client-server environment is costly. In addition, the demand for data storage and processing has been growing extensively, and processor performance has not grown in proportion to the growth of data (Vilaca and Oliveira, 2009). Typical client-server systems cannot support this rapidly growing demand for data, and P2P systems have emerged to address it. A P2P system comprises clusters of computers working together in the network to build a cheap and scalable server solution (Bratsberg). P2P systems also offer greater scalability, direct access to the nodes that hold the desired resources or data without passing through a central server, and robustness and resilience to churn.

Proceedings of Regional Conference on Knowledge Integration in ICT 2010


The first generation of P2P systems was unstructured. Typical examples of unstructured P2P systems are Gnutella and Napster. However, unstructured P2P systems scale poorly (Harren et al.) and do not guarantee that the desired data can be reached, because data placement and network construction are essentially random (Harren et al.). To overcome these limitations, much recent work on P2P systems has concentrated on structured P2P (Stoica et al., 2001; Rowstron et al., 2001; Zhao et al., 2004; Ratnasamy et al., 2001). These structured P2P networks introduced the Distributed Hash Table (DHT), in which nodes are logically organized according to a predefined pattern, such as a ring for Chord (Stoica et al., 2001) or a Cartesian coordinate space for CAN (Ratnasamy et al., 2001). Because of their excellent properties for managing dynamic systems (Prada et al.), DHTs have an edge over unstructured P2P systems. For instance, a DHT reaches the destination node in a logarithmic number of routing hops and provides explicit load balancing (Bharambe et al., 2004). DHTs provide an indexing facility that maps data to nodes in the network and a routing algorithm that routes requests to the node holding the data. In addition, DHTs scale well, since nodes can join and leave the network without restriction (Prada et al.).

The P2P paradigm has been introduced into the database community (Sartiani et al.) to solve the client-server problems described above. It avoids a single point of failure, which in turn increases the reliability of the system. This paper presents a preliminary design for building an object-oriented database system on top of openChord, an implementation of the Chord DHT, to provide a P2P object-oriented database system.
Combining a database system with a structured P2P network makes the data stored in each node accessible to all other nodes. Blurring the client and server roles improves processing time, because nodes can access other nodes directly without going through a server. The objective of this paper is to improve query response time by using attribute search trees with the Chord DHT to support multi-attribute and range queries for the P2P database system. A structured network was chosen over an unstructured one as the overlay because a structured network guarantees that the desired object sets are found as long as the objects are available in the network. Moreover, as mentioned above, a structured network guarantees that requests reach the destination node in a logarithmic number of hops, which greatly reduces query response time.

2. Related Works

Several existing works build database management systems on structured P2P networks. Yu et al. (2008) proposed a peer-to-peer database model based on Chord. It uses the Local Relational Model (LRM) as the database model for the P2P database system and places a node catalog on the application layer, on top of the overlay network. Chord is used to organize the nodes in the network and to locate data. The keyword information of a data item is placed on an appropriate node, forming a node catalog. These local node catalogs are grouped together into a global node catalog. Each catalog, which may represent a region, has a manager peer, and data transmission between regions goes through the manager peers. The manager peer is similar to the leader in our proposed design, where the leader is the peer in charge of a particular class in the database.

XPeer (Sartiani et al.) is an XML P2P database system. It manages data spread over an open-ended network, and no global schema is defined in the database. It is built on a hybrid P2P architecture in which peers may perform some administrative tasks. Peers share data in a tree-shaped form called a tree-guide. Peers are logically organized into clusters of nodes, and a super peer is in charge of managing each cluster. XPeer uses the FLWR subset of XQuery as its query language. Similarly, our design has a leader, corresponding to the super peer in XPeer, that manages a group of logical nodes.

Besides databases, various protocols have been designed for implementing multi-attribute and range queries on DHTs. Mercury (Bharambe et al., 2004) supports multi-attribute and range queries by creating a logical attribute hub for each attribute in the schema. A physical node may take part in multiple attribute hubs. Each attribute hub is responsible for one attribute in the schema and forms a ring topology in which each node handles a certain range of numeric values. A multi-attribute query is executed by routing the query to the attribute hubs of the attributes it involves. A range query is executed by routing the query within the attribute hub to the node responsible for the numeric range. This is similar to our proposed design, where different attribute trees handle different attributes.

Gao and Steenkiste proposed an adaptive solution to support range queries in DHT-based systems. It supports range search based on the range search tree (RST). Attribute values are stored at the leaf nodes of the RST, and each non-leaf node corresponds to the union of the ranges of its two children. Hence, the RST is a complete and balanced binary tree with [log n] + 1 levels. The RST periodically runs a path maintenance protocol to maintain the band information of the tree for query purposes.
A query is resolved by determining the minimum cover (MC) of its range. A simple top-down recursive algorithm finds the first node with the largest range that lies within the query range and then repeats the process for the segments of the range that have not yet been decomposed. Once the MC is computed, the query is sent to each corresponding MC node in the overlay network. Our design is similar in that attribute values are also stored in the leaf nodes, and it likewise adapts to changes in the number of objects in the system.

Wang and Li (2009) proposed a model that supports range queries over a DHT using B+ trees. The model also supports multi-attribute queries and uses Chord as the overlay network. To support multiple attributes, there are several B+ trees, each responsible for one attribute. The hash function provided by Chord ensures that the nodes of a B+ tree are mapped evenly to the nodes in Chord. The resources of an attribute are registered in the leaf nodes of its B+ tree. A range query starts at the root node of the tree and is decomposed into several sub-queries, each of which must be contained in one child node. Each sub-query continues to be decomposed until it reaches a leaf node. To support multi-attribute queries, Wang and Li (2009) adopted a store-and-forward method in which the query is executed on the first attribute until it reaches a leaf node and is then forwarded to the second attribute, and so on.

3. The Proposed System Architecture

The proposed solution improves multi-attribute and range query response time for a P2P persistent object management system on the OpenChord P2P platform, an implementation of Chord.


3.1 Node Structure

Nodes connect to the P2P network through the Chord overlay, in which nodes are logically organized in a ring and every node is connected to a predecessor node and a successor node (Figure 1). In the proposed design, the network module connects to the Chord network. The connections between nodes in the overlay network are shown as dotted lines. Every node contains a finger table, a predecessor list and a successor list that maintain the logical overlay of the P2P network.

Users connect to the database system through the user interface, which resides in the presentation module. It is the bridge between users and the system: it receives instructions from users and returns results to them graphically. The Query and Update Manager is responsible for data queries, data updates and concurrency control. Caching caches previous object sets and node information of the Attribute Tree (Section 3.3) to accelerate queries. Local storage is the local database, which stores all the objects on the local machine. Export Schema allows the objects in the schema to be shared and modified by other nodes. The network module acts as a communication bridge between the local node and the other nodes in the network.

Figure 1: Node structure.

3.2 Class Hub

A query search starts from a root class and navigates to the user-defined class that contains the desired objects. The search then drills into the class's attributes and finally into the attribute values to retrieve the objects that satisfy the constraints of the query. Hence, in our proposed initial design, nodes are responsible for managing the object pointers of a class registered with a leader. The object pointers are stored in the nodes in the form <Node Identifier; OID; Value>. Each object has a unique OID, which is similar to the primary key in a relational database. Value is the value of the particular attribute of the object, corresponding to the attribute tree in which the pointer resides. Object pointers are stored only in the leaf nodes of an attribute tree (Section 3.3).

A leader (Figure 3) in the class hub acts as a gateway that passes a query for a particular class to the right attribute trees. An attribute tree is a logical tree that holds the pointers to the actual objects for one attribute (Section 3.3). A leader of a class hub holds pointers to the root nodes of all the attribute trees whose attributes belong to the class.

The leader of a class hub is selected by hashing the class name. The class name is hashed to a key, H(Classname). The overlay network looks up the successor node responsible for the key, successor(H(Classname)), and elects it as the leader. The leader also replicates the pointers it holds to n of its successor nodes for backup purposes. Replication also reduces the load on the leader and avoids a bottleneck, because queries can be directed to the replicas when the leader is heavily loaded.

Thus, an incoming query first contacts the leader of the class hub, and the leader points the query to the related attribute trees to execute the search procedure (Section 3.5). If the query load of the leader reaches a limit, which can be defined by users, the leader redirects the query to its next replica node, which is its immediate successor. If that successor has also reached the load limit, the query is redirected to the next replica node. The process continues until the query reaches a lightly loaded replica.

Figure 2: Pseudo code for forwarding a query to the related attribute tree through the leader of the class hub.

Figure 2 shows the proposed algorithm for forwarding a query to the leader of a class hub. Q is the incoming query. Ni is the node the query is currently passing through. L is the leader of the class hub and RL denotes the replicas of the leader. LNi is the load, that is, the number of queries passing into the current node. AN is the root node of the corresponding attribute tree.
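The forwarding rule described in Section 3.2 can be sketched in a few lines of Python. This is a minimal single-process illustration, not the pseudo code of Figure 2: the names `HubNode`, `chord_key` and `forward_query`, the 16-bit identifier space, the use of SHA-1 for H, and the error raised when every replica is saturated are all assumptions made for the example.

```python
import hashlib
from bisect import bisect_left
from dataclasses import dataclass, field

M = 16  # identifier bits; an illustrative assumption, kept small for readability


def chord_key(text):
    """H(text): hash a string into the identifier space (SHA-1 is assumed)."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** M)


@dataclass
class HubNode:
    node_id: int
    load: int = 0  # number of queries currently passing into this node
    attribute_roots: dict = field(default_factory=dict)  # attribute -> root node


def forward_query(nodes, class_name, attribute, load_limit, n_replicas):
    """Return (node that accepted the query, root of the attribute tree)."""
    ring = sorted(nodes, key=lambda n: n.node_id)
    ids = [n.node_id for n in ring]
    # Leader = successor(H(Classname)): first node id >= key, wrapping around.
    start = bisect_left(ids, chord_key(class_name)) % len(ring)
    # Try the leader first, then its n successors (the replicas) in ring order.
    for k in range(min(n_replicas + 1, len(ring))):
        node = ring[(start + k) % len(ring)]
        if node.load < load_limit:
            node.load += 1
            return node, node.attribute_roots[attribute]
    # The paper does not say what happens here; raising is an assumption.
    raise RuntimeError("leader and all replicas have reached the load limit")
```

With a load limit of 1, for example, the first query for class CAR is accepted by the leader, the second by the leader's immediate successor, and so on through the replicas.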


Figure 3: Depiction of the class hub.

3.3 Attribute Tree

An attribute tree corresponds to one attribute of a class. It is a logical tree that holds the pointers to the actual object locations (object pointers) for the values of the attribute. The bounds of the values in a tree adapt dynamically to the values of the attribute through the merge and split operations, which are explained later in this section. The attribute tree consists of three types of nodes: the root, intermediate nodes and leaf nodes. Every node except a leaf node has at most two children, so the attribute tree is a binary (2-way) tree by default.

The mapping of Chord nodes to tree nodes is not one-to-one: a physical node in Chord may map to several logical nodes in the same tree or in different trees. A Chord node is mapped to a tree node by hashing the range of values for which the tree node is responsible, as explained later in this section (Figure 4). Because Chord uses consistent hashing, which distributes nodes evenly, the mapping of physical nodes to logical tree nodes is also evenly distributed with high probability.

The root node is the node at the top of a tree. It covers the entire value range of the attribute. It holds information about the entire tree and knows the total number of object pointers in the tree.

Intermediate nodes are the nodes between the root node and the leaf nodes. An intermediate node is in charge of a sub-range of the values, which is the union of the ranges of its children: $\mathrm{Range}(P_i) = \mathrm{Range}(C_{i1}) \cup \mathrm{Range}(C_{i2})$, where $P_i$ is the intermediate node and $C_{i1}$ and $C_{i2}$ are its children. Intermediate nodes store pointers to their children, information about the sub-trees below them, and a pointer to their parent node (the node one level above them). They also store the total number of object pointers in their sub-trees.
Leaf nodes are the nodes at the bottom level of the tree. They handle the smallest sub-ranges and store the object pointers whose attribute values fall into their range. They replicate the object pointers to their n immediate successor nodes in the overlay network. Each replica has a link to the next replica, so the replicas form a linked list. Leaf nodes also store a pointer to their parent node and pointers to their sibling nodes (the leaf nodes immediately to their left and right), which improves range query response time.


Tree nodes are mapped to Chord nodes by hashing their value ranges. The key of a range is generated from its midpoint, $(L + U)/2$, where $L$ is the lower bound of the range and $U$ is the upper bound. The midpoint is hashed to obtain the key, $H((L + U)/2)$, and the tree node is mapped to the successor node of that key in the Chord identifier space.

A node carries out two basic operations: split and merge. A split happens when the density of a leaf node (the number of object pointers it holds) exceeds a threshold. The split is performed when $(D_i / L) \times 100 > 100 / L$, where $D_i$ is the density of node $i$ and $L$ is the total density of the tree. A merge is performed when the densities of two adjacent leaf nodes under the same parent fall below a certain ratio, that is, when $(D_i / L) \times 100 < 100 / (2L)$.

However, if the leaf nodes are directly under the root node of the tree, without any intermediate nodes, a different split criterion applies: a leaf node is split only when its density $D_i$ exceeds a constant $C$. The value of $C$ is set at the root node by the administrator when the tree is initialized.
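The mapping of a tree node to a Chord node through the midpoint key can be illustrated with a short Python sketch. It assumes SHA-1 as the hash function H and a 16-bit identifier space, and the function names `range_key`, `successor` and `chord_node_for_range` are hypothetical, chosen only for this example.

```python
import hashlib
from bisect import bisect_left

M = 16  # identifier bits; an illustrative assumption


def range_key(lower, upper):
    """H((L + U) / 2): hash the midpoint of a value range into the id space."""
    midpoint = (lower + upper) / 2
    digest = hashlib.sha1(repr(midpoint).encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** M)


def successor(node_ids, key):
    """First Chord node id that is >= key, wrapping around the ring."""
    ids = sorted(node_ids)
    return ids[bisect_left(ids, key) % len(ids)]


def chord_node_for_range(node_ids, lower, upper):
    """Chord node that hosts the tree node responsible for [lower, upper]."""
    return successor(node_ids, range_key(lower, upper))
```

One consequence visible in the sketch is that two ranges with the same midpoint, such as [1, 39] and [0, 40], produce the same key and are therefore hosted by the same Chord node.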

Figure 4: Example of Attribute Tree

3.4 Nodes Leaving

Nodes in a tree may leave or disconnect from the network from time to time. The recovery procedure depends on the type of node.

If the root node disconnects from the tree, the leader of the class hub contacts the immediate successor of the previous root node in the Chord network. The new root node establishes its connection with the tree by linking to the children nodes below it, and the leader updates its pointer to the root node.

If an intermediate node disconnects from the tree, the parent node contacts the immediate successor of the previous intermediate node, which links to the children nodes. The parent node forwards its sub-tree information to the newly joined intermediate node.

If a leaf node disconnects from the tree, the parent node links to the immediate successor of the previous leaf node, which is also the first replica of the leaf node. The object pointers are then re-replicated across the n successor nodes.

Figure 5: Pseudo code for nodes leaving the attribute tree.

Figure 5 shows the pseudo code for the different kinds of nodes leaving the tree. ANi+1 is the immediate successor of the previous root node. IDENTIFIER(CAN1) and IDENTIFIER(CAN2) are the node identifiers of the two children of the previous root node. Ci is a child of node Pi, and Ci+1 is the immediate successor of the disconnected child. LNi is a leaf node of node Pi, and RLNi is the replica of that leaf node.
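The three recovery cases of Section 3.4 can be sketched as follows. This is an illustrative in-memory model, not the pseudo code of Figure 5: the `TreeNode` class, the `recover` function and the representation of the leader's root pointers as a dictionary are assumptions, and the re-replication of object pointers after a leaf is replaced is left out.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)  # compare tree nodes by identity, not by field values
class TreeNode:
    chord_id: int  # identifier of the Chord node hosting this tree node
    kind: str      # "root", "intermediate" or "leaf"
    parent: Optional["TreeNode"] = None
    children: list = field(default_factory=list)


def immediate_successor(live_ids, failed_id):
    """First live Chord identifier clockwise after the failed one."""
    ids = sorted(live_ids)
    return ids[bisect_right(ids, failed_id) % len(ids)]


def recover(failed, live_ids, leader_roots, attribute):
    """Replace a departed tree node by its immediate successor in Chord."""
    new_id = immediate_successor(live_ids, failed.chord_id)
    replacement = TreeNode(new_id, failed.kind, failed.parent, failed.children)
    # The replacement links to the children of the node it replaces.
    for child in failed.children:
        child.parent = replacement
    if failed.kind == "root":
        # The leader of the class hub updates its pointer to the root.
        leader_roots[attribute] = replacement
    else:
        # The parent links to the replacement in place of the departed node.
        siblings = failed.parent.children
        siblings[siblings.index(failed)] = replacement
    return replacement
```

For a leaf node, the immediate successor is also its first replica, so the replacement already holds a copy of the object pointers.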

3.5 Query

When a node initiates a query, it first checks whether any object set cached in its caching manager (Section 3.7) is a subset of the query. If so, the node retrieves the result directly from the caching manager. Otherwise, the node sends the query over the network to obtain the desired result.

Before the query is forwarded to the overlay network, the node determines whether it has cached the location of the node in the attribute tree that has the minimum cover of the range in the query. For instance, in Figure 4, suppose a user searches for attribute a of class c with the range [2, 30]. The node with the minimum cover of this range is the node with sub-range [1, 39]. If the initiating node has cached the location of that node, it forwards the query directly to it and the query is performed there. This avoids a bottleneck at the root node when the query load is heavy and gives faster access to the responsible node, improving query response time.

Otherwise, if the node has not cached the identifier of the responsible node, it sends a probe request to the leader of the class hub. From the leader, the probe request is sent to the root node of the attribute tree, where it is decomposed into sub-requests for the children nodes. The process continues until the request reaches the node responsible for the range. The probe reply is then sent back to the query initiator, which caches the identifier of that node in its caching manager. The actual query is then forwarded to the corresponding node in the tree, the search traverses down the tree, and on success the result is sent back to the query originator. The result is compiled into a list, and requests are sent to the nodes that hold the actual objects. The object set is cached at the query originator and also at the nodes along the path the query traversed.

While the query traverses down the tree, it also checks the local cache manager of each tree node for a cached object set that is a subset of the query. If one is found, the query terminates and the object set is sent back to the query originator. The object set is again cached at the nodes along the path the query traversed.
Two lists are attached to the query as it traverses the attribute tree: the routing path list and the object pointers list. The routing path list contains the identifiers of the nodes on the traversed path, stored in the form <Node Identifier, Attribute, Value Range>. When the query reaches a node, the node checks whether its own identifier already exists in the routing path list; if not, it adds its identifier to the list. The object pointers list contains the identifiers of the objects that match the query constraints, stored in the form <OID, node identifier>. An OID is inserted into the list if the list is empty or does not already contain it.

3.5.1 Range Query

A query can search for a range of values within an attribute tree. We propose a lower-upper bound approach to execute range queries. When the query reaches the node with the minimum cover of the range, it is split into two sub-queries, corresponding to the lower bound and the upper bound of the query. The lower bound query is forwarded to the left child node, which is responsible for the smaller sub-range, and the upper bound query is forwarded to the right child node, which is responsible for the larger sub-range. The process continues until both sub-queries reach leaf nodes. From the lower bound and upper bound leaf nodes, the sub-queries are sent towards each other through the sibling nodes until they meet. The desired result is the union of the results of the lower bound query and the upper bound query. Figure 6 depicts the search for the range [5, 50].
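The lower-upper bound approach can be illustrated with a small in-memory attribute tree. The sketch below is a single-process simulation under stated assumptions: the `Node` class and the helper names are hypothetical, the example tree with ranges [1, 19], [20, 39], [40, 59] and [60, 80] is made up for illustration (only the sub-range [1, 39] comes from the text), every non-leaf node is assumed to have exactly two children, and the two sub-queries are simulated by two cursors that step towards each other over the sibling links.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    lo: int
    hi: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    prev: Optional["Node"] = None   # left sibling (leaves only)
    next: Optional["Node"] = None   # right sibling (leaves only)
    pointers: list = field(default_factory=list)  # (node id, OID, value)


def build_example_tree():
    leaves = [Node(1, 19), Node(20, 39), Node(40, 59), Node(60, 80)]
    for a, b in zip(leaves, leaves[1:]):
        a.next, b.prev = b, a
    low = Node(1, 39, leaves[0], leaves[1])
    high = Node(40, 80, leaves[2], leaves[3])
    return Node(1, 80, low, high)


def min_cover(node, lo, hi):
    """Deepest node whose range still contains the whole query range."""
    while node.left is not None:
        if node.left.lo <= lo and hi <= node.left.hi:
            node = node.left
        elif node.right.lo <= lo and hi <= node.right.hi:
            node = node.right
        else:
            break
    return node


def descend(node, value):
    """Follow the tree down to the leaf responsible for value."""
    while node.left is not None:
        node = node.left if value <= node.left.hi else node.right
    return node


def insert(root, pointer):
    descend(root, pointer[2]).pointers.append(pointer)


def range_query(root, lo, hi):
    cover = min_cover(root, lo, hi)
    lower, upper = descend(cover, lo), descend(cover, hi)
    found = []

    def collect(leaf):
        found.extend(p for p in leaf.pointers if lo <= p[2] <= hi)

    # The two sub-queries walk towards each other through the sibling links.
    while True:
        collect(lower)
        if lower is upper:
            break
        collect(upper)
        if lower.next is upper:
            break
        lower, upper = lower.next, upper.prev
    return sorted(found, key=lambda p: p[2])
```

In this example tree, the query [2, 30] has the node [1, 39] as its minimum cover, while the query [5, 50] is only covered by the root, so its two sub-queries start from the leaves [1, 19] and [40, 59] and meet at [20, 39].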


Figure 6: The lower-upper bound approach.

3.5.2 Multi-attribute Query

Our proposed design also supports multi-attribute queries. A query is a conjunction of sub-queries, each of which queries one attribute of the class. For instance, the SQL-like query SELECT * FROM CAR WHERE door_number = 4 AND car_color = white is a conjunction of two sub-queries and can be split accordingly. This query selects all instances of the CAR class whose door_number is four and whose car_color is white. It is the intersection of the sub-queries SELECT * FROM CAR WHERE door_number = 4 and SELECT * FROM CAR WHERE car_color = white: intersecting the object sets of the first and second sub-queries produces the final object set that answers the query.

To execute multi-attribute queries more effectively, the query can be decomposed into n sub-queries, each corresponding to one criterion. More precisely, each sub-query is responsible for one attribute that appears as a constraint in the query. In POMS, we propose a store-and-forward method. The query starts with the first attribute tree. Once the sub-query for the first attribute tree has completed its search, the entire query, together with the routing path list and the object pointers list, is forwarded to the second attribute tree. When moving from one tree to the next, the last leaf node checks whether it has cached the location of the node in the next tree that covers the range of the next constraint. If it has, the query is forwarded directly to that node without starting from the root node. The process continues until the query completes in the nth attribute tree. At the end of the search, the complete routing path list and object pointers list have been compiled.
Both lists and the query are sent back to the query originator so that it can retrieve the complete object set. Although this method takes longer to complete, it reduces the workload of the query originator and the network traffic required. The results can be compiled at the last leaf node of each tree before the query is forwarded to the next tree. The query originator therefore does not need to compile the results itself, because it receives the already compiled version.
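The store-and-forward method can be sketched as a loop that carries the object pointers list from one attribute tree to the next and intersects it at each step. This is a simplified illustration: each attribute tree is modelled as a flat list of object pointers <Node Identifier; OID; Value> rather than a distributed tree, and the function name `store_and_forward`, the peer names and the sample data are hypothetical.

```python
def store_and_forward(trees, constraints):
    """Evaluate a conjunction of per-attribute constraints tree by tree.

    trees:       attribute name -> list of (node id, OID, value) pointers
    constraints: list of (attribute name, predicate on the value)
    Returns the object pointers list as sorted (OID, node id) pairs.
    """
    carried = None  # object pointers list travelling with the query
    for attribute, predicate in constraints:
        matches = {
            (oid, node_id)
            for node_id, oid, value in trees[attribute]
            if predicate(value)
        }
        # The result is compiled before forwarding to the next tree, so the
        # query originator receives an already intersected list.
        carried = matches if carried is None else carried & matches
    return sorted(carried or [])


# SELECT * FROM CAR WHERE door_number = 4 AND car_color = white
car_trees = {
    "door_number": [("peer-a", 1, 4), ("peer-b", 2, 2), ("peer-a", 3, 4)],
    "car_color": [("peer-a", 1, "white"), ("peer-b", 2, "white"),
                  ("peer-a", 3, "red")],
}
car_query = [
    ("door_number", lambda v: v == 4),
    ("car_color", lambda v: v == "white"),
]
result = store_and_forward(car_trees, car_query)
```

Here only the object with OID 1 satisfies both constraints, so the originator would need to contact a single peer to fetch the actual object.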


3.6 Maintenance Protocol

Two types of messages are involved in the maintenance process: the tree update request and the tree update reply. Each node periodically sends an update request message to its parent node to update the parent's sub-tree and density information. A parent node that receives the request updates its contents and forwards the message, together with its own sub-tree and density information, to its parent. The message continues upward until it reaches the root node. As the message is forwarded upward, the maintenance protocol also detects whether any node in the tree has disconnected without notice. If a node is missing, the tree structure is repaired using the procedure in Section 3.4.

Once the root node receives the update message, it updates its sub-tree information and the total density of the tree. It then sends the tree update reply message down to its children nodes. Each child node updates its own information and sends the message further down to its children. The process continues until the message reaches the leaf nodes. A split or merge may be triggered at this stage if a node's density is above or below the threshold.

To prevent pointers in the tree from being lost without notice, peers in the overlay network that hold the actual objects periodically choose a few objects at random and send their object identifiers to the nodes responsible for these objects. If the nodes do not have the corresponding pointers, the object pointers are added to them.

3.7 Caching Protocol

Each node has a caching manager, which is responsible for caching the object sets of previous queries and the locations of nodes in the attribute trees. Cached items are stored in a list, each associated with a counter and a time. The counter records the number of times the cached item has been accessed.
Cached items are managed with a Least Recently Used (LRU) approach. When the cache list is full, or periodically, the cached items with the lowest counter and the oldest time are removed to make room for new items.

4. Conclusion

In this paper, we presented a preliminary design for a P2P object-oriented database on a DHT. We use class hubs and attribute trees to support multi-attribute and range queries. The contributions of our proposed design are summarized as follows:

i. The split and merge operations make the attribute tree adaptive to changes in the attribute values.
ii. A lower-upper bound approach is adopted to make range queries more effective.
iii. A store-and-forward method is adopted to support multi-attribute queries, reducing both the workload of the query originator and the network traffic.
iv. Previous object sets are cached in the nodes along the routing path for faster query response time.
v. The design also caches node identifier information for faster access to the attribute tree, avoiding a bottleneck at a single point.


5. References

[1] I. Stoica, R. Morris, D. Karger, M. F. Kaashoek and H. Balakrishnan. Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications. ACM, August 2001.
[2] A. Rowstron and P. Druschel. Pastry: Scalable, decentralized object location and routing for large-scale peer-to-peer systems. ACM International Conference on Distributed Systems Platforms, November 2001.
[3] B. Y. Zhao, L. Huang, J. Stribling, S. C. Rhea, A. D. Joseph and J. D. Kubiatowicz. Tapestry: A Resilient Global-Scale Overlay for Service Deployment. IEEE, 2004.
[4] S. Ratnasamy, P. Francis, M. Handley and R. Karp. A Scalable Content-Addressable Network. ACM, August 2001.
[5] I. Clarke, O. Sandberg, B. Wiley and T. W. Hong. Freenet: A Distributed Anonymous Information Storage and Retrieval System.
[6] R. Vilaca and R. Oliveira. Clouder: A Flexible Large Scale Decentralized Object Store. WDDDM, March 2009.
[7] A. R. Bharambe, M. Agrawal and S. Seshan. Mercury: Supporting Scalable Multi-Attribute Range Queries. ACM, August 2004.
[8] C. Sartiani, P. Manghi, G. Ghelli and G. Conforti. XPeer: A Self-organizing XML P2P Database System.
[9] J. Yu, M. Yu, B. Wang, Y. Gu and J. Dai. A Peer to Peer Database Model Based on Chord. IEEE, 2008.
[10] J. Gao and P. Steenkiste. An Adaptive Protocol for Efficient Support of Range Queries in DHT-based Systems.
[11] D. Wang and M. Li. A Range Query Model based on DHT in P2P System. IEEE, 2009.
[12] M. Harren, J. M. Hellerstein, R. Huebsch, B. T. Loo, S. Shenker and I. Stoica. Complex Queries in DHT-based Peer-to-Peer Networks.
[13] S. E. Bratsberg. Scaling a Highly-Available DBMS beyond a Dozen Nodes.
[14] C. Prada, M. P. Villamil and C. Roncancio. Join Queries in P2P DHT Systems.
