SOFTWARE REQUIREMENTS
SPECIFICATION

FOR

NEUTRINO

PREPARED BY

Balakrishnan L M

Karthik C

Kumaran V

Senthil Kumar V

Batch - II

Software Requirements Specification for Neutrino Page 1

1. Introduction 
1.1 Purpose  
• Enabling mobile database access with good performance over low-bandwidth networks. 
• Maintaining consistency among mobile replicas over a stale network. 
• Employing partial, template-based replication in the mobile client. 
• Deploying a dynamically replicated database. 
• Utilizing the processing power of low-power clients. 

1.2 Document Conventions 
• Italicized words denote the terminology used in this project. 
• Bold words denote the novel aspects of this project. 

1.3 Intended Audience and Reading Suggestions 
This document is intended for the following types of readers: 
 
• Developers 
• Project Managers 
• Testers 
• Documentation Writers 
• Users 
 
Readers are strongly advised to start with the Introduction to get a better idea of 
Neutrino. 

1.4 Project Scope 
A distributed system in a wireless network suffers primarily from inconsistency and 
poor performance. A trade-off always exists between the consistency enforced and the 
performance of the distributed system. This project focuses on developing a distributed 
system that operates over a stale network without degrading consistency, and that achieves 
better performance and scalability through partial replication. 

1.5 References 
• GlobeTP: A template based replication service. Dr. Swaminathan Sivasubramanian, Tobias
Groothuyse, Guillaume Pierre. Vrije Universiteit, The Netherlands.
• GlobeDB: Autonomic Data Replication for Web Applications. Dr. Swaminathan
Sivasubramanian et al.
• Fingerprinting by Random Polynomials. Michael O. Rabin, Dept. of Mathematics,
The Hebrew University of Jerusalem.
• Opportunistic Use of Content Addressable Storage for Distributed File Systems. Niraj Tolia
et al. Intel Research Pittsburgh.
• Replication for web hosting systems. ACM Computing Surveys. S. Sivasubramanian, M.
Szymaniak, G. Pierre, and M. van Steen.
• A case for dynamic selection of replication and caching strategies. In Proceedings of the
Eighth International Workshop on Web Content Caching and Distribution. S. Sivasubramanian,
G. Pierre, and M. van Steen.
• Akamai EdgeSuite. http://www.akamai.com/en/html/services/edgesuite.html
• DBProxy: A dynamic data cache for Web applications. In Proc. Intl. Conf. on Data
Engineering. K. Amiri, S. Park, R. Tewari, and S. Padmanabhan.
• Characterizing the scalability of a large web-based shopping system. ACM Transactions on
Internet Technology. M. Arlitt, D. Krishnamurthy, and J. Rolia.
• Adaptive database caching with DBCache. Data Engineering. C. Bornhövd, M. Altinel, C.
Mohan, H. Pirahesh, and B. Reinwald.
• Towards robust distributed systems. Proc. ACM Symp. on Principles of Distributed
Computing. E. A. Brewer.

2. Overall Description 
2.1 Product Perspective 
Wireless wide area networks have become common in large enterprises that 
work with relational databases. Various consistency models have been developed to ensure 
consistency among the available replicas. This system is developed to provide good consistency to 
enterprises that use a wireless medium as their primary communication network. It also 
aims to improve the scalability of these systems through a partial replication scheme. 

2.2 Product Features 
This product provides good consistency even over a stale network. With this scheme, laptops and
PDAs can also be used as replicas. The use of small-scale database management systems such as
MySQL, Apache Derby and PostgreSQL keeps the cost of deployment minimal. Scalability is
achieved through partial replication. The replication system is dynamic, so it adapts to
changing access patterns: data that are frequently accessed are replicated dynamically, which
provides high scalability and availability.

2.3 Operating Environment 
This system is designed to operate in any environment. It is to be implemented in Java,
which provides platform independence and portability. The native libraries that are needed are
written in Visual C++ for Windows and as a BSD C implementation for all other systems. The
master copy of the database resides in a high-end database management system such as Oracle or
SQL Server. Replication is done with a lightweight database management system that can
easily fit into mobile phones, PDAs or laptops. The operating system is constrained only by the
database being used.

2.4 Design and Implementation Constraints 
The system is designed so that consistency is not relaxed, whatever the degree of
irregularity in the network. Heterogeneity in areas such as the database or the OS is always
handled by the proxy. Synchronization among the replicas and the other
parts of the system is maintained carefully.

2.5 Assumptions and Dependencies 
The system is developed on the assumption that at least one replica is always available. The
query router is aware of all the database management systems used in the system and is
capable of converting a query into a form understood by the target database. Each database
management system used has its own JDBC driver.

3. System Features 
Neutrino, the distributed system for efficient replication, has the following features:

3.1 Partial Replication 
3.1.1 Description and Priority

This feature enables the system to replicate the data that are frequently used. The 
replication strategy is chosen dynamically by the system. The system monitors the 
data being accessed. Data that are frequently fetched are replicated, replacing 
some of the already replicated data according to a suitable replacement algorithm. This module 
is of primary concern since it improves scalability. 

3.1.2 Stimulus/Response Sequences

The user requests some data from the server. The query router routes the request to 
an appropriate replica. If the number of requests for that particular data is high, the 
least frequently used data in the replica are replaced by the new data. The change log in 
the query router is also updated. 
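
The sequence above can be sketched as a least-frequently-used (LFU) replacement policy. This is only a minimal illustration in Java, not the project's actual algorithm: the class name `LfuReplicaStore`, the fixed `capacity`, the `hotThreshold` parameter and the in-memory change log are all assumptions introduced here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: a replica that keeps a bounded set of items with LFU replacement. */
class LfuReplicaStore {
    private final int capacity;      // assumed: maximum number of items the replica can hold
    private final int hotThreshold;  // assumed: request count at which an item is "frequently used"
    private final Map<String, Integer> requestCounts = new HashMap<>();
    private final Map<String, String> replicated = new HashMap<>();
    private final List<String> changeLog = new ArrayList<>(); // stand-in for the query router's change log

    LfuReplicaStore(int capacity, int hotThreshold) {
        if (capacity < 1) throw new IllegalArgumentException("capacity must be at least 1");
        this.capacity = capacity;
        this.hotThreshold = hotThreshold;
    }

    /** Counts one request for key and replicates the item once it is requested often enough. */
    void recordRequest(String key, String valueFromMaster) {
        int count = requestCounts.merge(key, 1, Integer::sum);
        if (replicated.containsKey(key) || count < hotThreshold) {
            return; // already replicated, or not yet frequently used
        }
        if (replicated.size() >= capacity) {
            String victim = leastFrequentlyUsed();
            replicated.remove(victim);
            changeLog.add("EVICT " + victim);
        }
        replicated.put(key, valueFromMaster);
        changeLog.add("REPLICATE " + key);
    }

    /** Returns the replicated key with the lowest request count (ties broken arbitrarily). */
    private String leastFrequentlyUsed() {
        String victim = null;
        for (String key : replicated.keySet()) {
            if (victim == null || requestCounts.get(key) < requestCounts.get(victim)) {
                victim = key;
            }
        }
        return victim;
    }

    boolean isReplicated(String key) { return replicated.containsKey(key); }

    List<String> changeLog() { return new ArrayList<>(changeLog); }
}
```

A real replica would evict rows or fragments from its local database rather than map entries, and the threshold would likely be tuned from observed access patterns.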

3.1.3 Functional Requirements

To achieve this, the system must have an intelligent algorithm that chooses the type 
of fragmentation used to replicate data partially. The system must also choose a 
replacement algorithm that matches the current scenario.   

3.2 Consistency Enforcement 
3.2.1 Description and Priority

Maintaining consistency and synchronization among the various replicas is a very difficult task. 
A replica that enters the network must be synchronized with the server immediately. Until then, 
the replica is not allowed to serve any client requests. A proper versioning system 
must also be implemented so that the client knows how old its data is. 
When the fingerprints of the replica and the master differ only slightly, only the changes 
are applied. When there is a large deviation, the whole data in the replica are replaced. 
This is also a primary task. 

3.2.2 Stimulus/Response Sequences

There is an update in the master copy of the database. The clients that are currently 
connected are updated immediately. Consider a new client entering the network after the 
update. The version of the client's data is first checked against the version of the data in 
the master copy. If there is a mismatch, the fingerprint of the data present in the client is 
obtained and matched against the fingerprint of the master. If the variation is small, only the 
changes are applied. If the variation is large, the whole database is replaced.  
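
The decision described above can be sketched as follows. This is a hedged illustration only: the class `ReplicaSyncPlanner`, the block-by-block positional comparison and the `maxChangedFraction` threshold are assumptions made for this example, and SHA-256 digests stand in for whatever fingerprinting scheme the project finally adopts (the references point to Rabin fingerprints).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: decide how a joining replica is brought in line with the master. */
class ReplicaSyncPlanner {
    enum Action { UP_TO_DATE, APPLY_CHANGES, REPLACE_ALL }

    private final double maxChangedFraction; // assumed cut-off between "small" and "large" variation

    ReplicaSyncPlanner(double maxChangedFraction) {
        this.maxChangedFraction = maxChangedFraction;
    }

    /** Fingerprints one block of data; SHA-256 is a stand-in for the real scheme. */
    static byte[] fingerprint(String block) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(block.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is required on every Java platform
        }
    }

    /** Compares versions first, then fingerprints, and returns the synchronization action. */
    Action plan(long replicaVersion, long masterVersion,
                List<String> replicaBlocks, List<String> masterBlocks) {
        if (replicaVersion == masterVersion) {
            return Action.UP_TO_DATE;
        }
        int total = Math.max(replicaBlocks.size(), masterBlocks.size());
        int changed = 0;
        for (int i = 0; i < total; i++) {
            boolean missing = i >= replicaBlocks.size() || i >= masterBlocks.size();
            if (missing || !Arrays.equals(fingerprint(replicaBlocks.get(i)),
                                          fingerprint(masterBlocks.get(i)))) {
                changed++;
            }
        }
        double fraction = total == 0 ? 0.0 : (double) changed / total;
        return fraction <= maxChangedFraction ? Action.APPLY_CHANGES : Action.REPLACE_ALL;
    }
}
```

In practice the master would keep precomputed fingerprints so that only the replica's fingerprints, not its data, travel over the low-bandwidth link.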

3.2.3 Functional Requirements

To complete this task, a suitable fingerprinting scheme must be chosen. A 
suitable versioning scheme and change log are also mandatory. A timestamp-based 
versioning system is a good choice since timestamp values are not expected to 
repeat. 
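
A timestamp-based version can be sketched as below. One caveat worth stating: a raw wall-clock reading can repeat within the same millisecond or step backwards after a clock adjustment, so this sketch bumps the value whenever the clock does not advance. The class `TimestampVersioner`, its injectable clock and the assumption that a single node (the master) issues all versions are introduced here for illustration only.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical sketch: strictly increasing, timestamp-based version numbers. */
class TimestampVersioner {
    private final LongSupplier clock;                              // source of millisecond timestamps
    private final AtomicLong last = new AtomicLong(Long.MIN_VALUE); // last version handed out

    TimestampVersioner(LongSupplier clock) {
        this.clock = clock;
    }

    /** Convenience factory that uses the system wall clock. */
    static TimestampVersioner systemClock() {
        return new TimestampVersioner(System::currentTimeMillis);
    }

    /** Returns the current timestamp, or the previous version plus one if the clock has not advanced. */
    long nextVersion() {
        long now = clock.getAsLong();
        return last.updateAndGet(prev -> Math.max(now, prev + 1));
    }
}
```

Comparing two such versions tells a client whether its data is older than the master's, which is the check the stimulus/response sequence relies on.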

3.3 Adaptation Triggering 
3.3.1 Description and Priority

The system must be capable of serving any number of clients. The system should perform 
well even over a stale, low-bandwidth network. The dynamic replication strategy itself addresses 
this problem, but special care should be taken over the number of clients connected, 
bandwidth utilization and CPU load. Adaptation triggering also deals with 
maintaining the performance of the system when some of the nodes fail. Whenever a 
node fails, whether it is a query router, the master copy or one of the replicas, the 
system has to be fault tolerant. The failure symptoms should not cross the end point of the 
system, which is the edge server. This is an additional feature. 

3.3.2 Stimulus/Response Sequences

Whenever there is a failure in the system, the controller server comes to know about the 
failure. The controller then informs the query router that the particular node has failed. The 
query router redirects the clients to some other replica and does not allow any 
new client to be served by the failed replica until it recovers. A separate log of 
connected and disconnected nodes is maintained, which helps to prevent 
inconsistency. When the failure is in the master copy, a replica receives the updates and 
propagates them to all other replicas; when the master server is alive again, it is updated as well.  
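
The replica-failure part of this sequence can be sketched as below. The class `FailoverQueryRouter` and its method names are hypothetical, the controller is assumed to call `onNodeFailed` and `onNodeRecovered`, and the master-failure case is deliberately left out of this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch: a query router that stops routing to failed replicas. */
class FailoverQueryRouter {
    private final Map<String, Boolean> replicaAlive = new LinkedHashMap<>(); // keeps registration order
    private final List<String> nodeLog = new ArrayList<>();                  // connected/disconnected log

    void register(String replicaId) {
        replicaAlive.put(replicaId, true);
        nodeLog.add("CONNECTED " + replicaId);
    }

    /** Called by the (assumed) controller when it detects that a replica has failed. */
    void onNodeFailed(String replicaId) {
        if (Boolean.TRUE.equals(replicaAlive.get(replicaId))) {
            replicaAlive.put(replicaId, false);
            nodeLog.add("DISCONNECTED " + replicaId);
        }
    }

    /** Called by the (assumed) controller once a failed replica has recovered. */
    void onNodeRecovered(String replicaId) {
        if (Boolean.FALSE.equals(replicaAlive.get(replicaId))) {
            replicaAlive.put(replicaId, true);
            nodeLog.add("CONNECTED " + replicaId);
        }
    }

    /** Returns the preferred replica if it is alive, otherwise the first live replica, if any. */
    Optional<String> route(String preferredReplicaId) {
        if (Boolean.TRUE.equals(replicaAlive.get(preferredReplicaId))) {
            return Optional.of(preferredReplicaId);
        }
        for (Map.Entry<String, Boolean> entry : replicaAlive.entrySet()) {
            if (entry.getValue()) {
                return Optional.of(entry.getKey());
            }
        }
        return Optional.empty();
    }

    List<String> nodeLog() { return new ArrayList<>(nodeLog); }
}
```

A recovered replica would still have to be resynchronized with the master before `onNodeRecovered` is called, in line with the consistency enforcement feature.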

3.3.3 Functional Requirements

This module requires an algorithm that takes intelligent decisions at the time of failures. 
A messaging system among the various nodes has to be developed. Isolation of 
system failures is also necessary. Consistency among the logs must also be maintained. 

3.4 Request Routing 
3.4.1 Description and Priority

A number of query routers are available in the network. A query router directs a 
client request to an appropriate replica. An initial request routing mechanism has to be 
employed to choose a proper query router. The client is routed after taking several 
parameters into account. These include network estimation metrics, traffic, load in the 
network and the CPU load of the destination node.  

3.4.2 Stimulus/Response Sequences

When a client needs to be served by Neutrino, the route to its server is decided as follows. 
From the network estimation metrics, the RTT and the geographical distance 
between the client and the server are used to decide the route. The traffic on the network 
must be measured and a path with less traffic chosen. The bandwidth utilization in the 
network must also be calculated; the path with the least bandwidth utilization is 
preferred. Finally, the number of clients served by the candidate server is found. If 
the replica or query router is handling a large number of client requests, or is busy 
for some other reason, another replica or query router has to be found. The 
node with the least-cost path and the lowest CPU load is finally selected, and the client is 
served by that node.  
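
The selection step can be sketched as a cost function over candidate nodes. Everything specific here is an assumption made for illustration: the class `NodeSelector`, the `maxClients` limit, and in particular the equal weighting of RTT (scaled by 100 ms), bandwidth utilization and CPU load. Geographical distance and traffic are omitted for brevity.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch: pick the least-cost node that is not overloaded. */
class NodeSelector {
    /** Measurements for one candidate replica or query router. */
    static final class Candidate {
        final String id;
        final double rttMillis;            // measured round-trip time
        final double bandwidthUtilization; // 0.0 (idle) to 1.0 (saturated)
        final int activeClients;           // clients currently served
        final double cpuLoad;              // 0.0 (idle) to 1.0 (fully loaded)

        Candidate(String id, double rttMillis, double bandwidthUtilization,
                  int activeClients, double cpuLoad) {
            this.id = id;
            this.rttMillis = rttMillis;
            this.bandwidthUtilization = bandwidthUtilization;
            this.activeClients = activeClients;
            this.cpuLoad = cpuLoad;
        }
    }

    private final int maxClients; // assumed: nodes at or above this count are treated as busy

    NodeSelector(int maxClients) {
        this.maxClients = maxClients;
    }

    /** Assumed cost: RTT scaled by 100 ms plus bandwidth utilization plus CPU load. */
    static double cost(Candidate c) {
        return c.rttMillis / 100.0 + c.bandwidthUtilization + c.cpuLoad;
    }

    /** Skips busy nodes and returns the remaining candidate with the lowest cost, if any. */
    Optional<Candidate> select(List<Candidate> candidates) {
        return candidates.stream()
                .filter(c -> c.activeClients < maxClients)
                .min(Comparator.comparingDouble(NodeSelector::cost));
    }
}
```

The same selector could be applied at both routing levels, first over query routers and then over the replicas behind the chosen router.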

3.4.3 Functional Requirements

This module also needs an intelligent algorithm to take proper routing decisions. The 
routing algorithm must be implemented at two levels: first the client has to be routed to a 
proper query router, and then the query router must route the request to a proper replica. 

4. External Interface Requirements 
4.1 User Interfaces 
A rich UI is not needed since this is purely middleware. However, to monitor the functioning of
the various modules in the system, there is a small console. The console shows necessary
information such as the bandwidth utilization in the various networks, the CPU load on each node,
connected and disconnected nodes, the data present in the various replica nodes, and the clients
being served by each replica and query router in the system.

4.2 Hardware Interfaces 
This system does not use any specific hardware interfaces.

4.3 Software Interfaces 
Each module in the network, such as the query router, the client proxy or the JDBC driver, is
developed and deployed as a component. Components are connected using regular interfacing
techniques wherever needed.

4.4 Communications Interfaces 
This system uses Wi-Fi as the physical medium for communication and TCP/IP for data
transfer. Control information is sent using UDP. Many communications are done using RPC and
RMI technologies.

5. Other Nonfunctional Requirements 
5.1 Performance Requirements 
The performance of the system is directly proportional to the number of replicas in the system and
also depends on the configuration of the nodes used in the network. The theoretical study made
for this project shows that the system performs well for workloads that involve many reads and
infrequent updates.

5.2 Safety Requirements 
For the safety of the system, multiple replicas of the same data are
mandatory. If any replica is away from the network, the system can still perform well because
another replica of the data is available.

5.3 Security Requirements 
For security, the client cannot access any module of the system other than the edge servers,
which should be the only modules known to it. The client should not be aware of the internal
mechanism, because that knowledge could be used to breach the security of the system. With regard
to the internal security policies, the developer must be allowed to create UDP and TCP sockets.
The clients must also be allowed to make an RMI or RPC request to any other node in the network.

5.4 Software Quality Attributes 
For the customer, the software requires very little configuration. The client discovers the
server by itself, and the proxy by itself keeps track of connected and disconnected replicas.
Request routing and consistency enforcement are done dynamically. Regarding security, the
client cannot directly access the system: client requests are stopped at the edge server, and
the queries and other requests are rewritten there into a purely Neutrino-based form. These
features are intended to prevent security breaches inside the system. The system is easily
portable. The system is purely component oriented, so any component can be revised in future
without disturbing the other components in the system.
