
Intelligent Requirements Mining, Classification, Dependency Analysis










Group Details:

Group 5 (responsible for the Indexing Engine and the Publish Requirement web
services)

Group 21 (responsible for clustering)

Group 25 (responsible for Keyword Search)




PURPOSE:
The purpose of this document is to provide the detailed design and requirements of
the system Intelligent Requirements Mining, Classification, Dependency
Analysis.
SCOPE:
This document covers an overview of the system and its architecture. It also
describes the individual services and the IOPE (Input, Output, Precondition,
Effect) of each service. It closes with some functional and non-functional
requirements of the system.

SYSTEM OVERVIEW

Data mining techniques can be applied extensively in EasySaaS. Two types of data
are leveraged: service metadata in the service registry, and service access logs
with pre-gathered tenant profiles. Application requirement contents usually
contain a description of the SaaS application. Predicting the SaaS application
category and recommending reusable components, given the requirements, can be
treated as a classification problem. When an application requirement is published,
it is parsed and indexed. The indices can be created using Apache Lucene. These
indices are then used to classify the requirements using a Vector Space Model
(VSM). In addition, every shared application component is associated with a text
description. These descriptions are also indexed and used to classify the
components. When a tenant publishes a requirement, it is classified and assigned
to the domains it belongs to. Keyword search for application components is also
supported using this information. When a keyword query is issued, the components
whose descriptions are most similar to the query are returned to the tenant
developers in ranked order.
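The VSM classification described above can be illustrated with a minimal sketch.
It assumes raw term-frequency vectors and cosine similarity, and it assigns a
requirement to the domain whose description is most similar. The class name, the
tokenizer, and the sample domains are hypothetical; Lucene's own analysis and
scoring are more elaborate and are not reproduced here.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative vector space model: texts become term-frequency vectors
// and are compared by cosine similarity. All names here are assumptions.
class VsmSketch {

    // Build a term-frequency vector from lowercased alphanumeric tokens.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1, Integer::sum);
            }
        }
        return tf;
    }

    // Euclidean length of a term-frequency vector.
    static double norm(Map<String, Integer> v) {
        double sum = 0.0;
        for (int count : v.values()) {
            sum += (double) count * count;
        }
        return Math.sqrt(sum);
    }

    // Cosine similarity: dot(a, b) / (|a| * |b|); 0.0 if either vector is empty.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
        }
        double normA = norm(a);
        double normB = norm(b);
        return (normA == 0.0 || normB == 0.0) ? 0.0 : dot / (normA * normB);
    }

    // Return the domain whose description is most similar to the requirement.
    static String classify(String requirement, Map<String, String> domainDescriptions) {
        Map<String, Integer> query = termFrequencies(requirement);
        String best = null;
        double bestScore = -1.0;
        for (Map.Entry<String, String> e : domainDescriptions.entrySet()) {
            double score = cosine(query, termFrequencies(e.getValue()));
            if (score > bestScore) {
                bestScore = score;
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, String> domains = new HashMap<>();
        domains.put("billing", "invoice payment billing tax receipt");
        domains.put("crm", "customer contact lead sales pipeline");
        // Prints "billing": only the billing description shares terms with the text.
        System.out.println(classify("generate an invoice and record the payment", domains));
    }
}
```

A production classifier would typically weight terms (for example with TF-IDF)
and train on labeled requirements; the sketch only shows the similarity step.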



TERMS USED

Data Mining: Data mining is the process of discovering patterns in large datasets
using methods that combine artificial intelligence, machine learning, statistics,
and database systems.

Classification: Classification is a data mining function that assigns the items in
a collection to target categories or classes. The goal of classification is to
accurately predict the target class for each case in the data.

Clustering: Clustering is the task of grouping a set of objects in such a way that
objects in the same group are more similar to each other than to those in other
groups.

Indexing: Indexing is the technique of collecting, parsing, and storing data to
facilitate fast and accurate information retrieval.

Searching: Searching is the lookup of a particular data item in the database. The
database returns the results that relate to the requested item.







Architecture Diagram:



The architecture consists of the following components:

Application requirement content database
Indexing engine
Classifier and clustering engine
Web server
Tenants




Workflows:

Initialization Workflow:








1. The indexing engine reads data from the application requirement content
database and indexes it.

2. The indexed data is classified into domains by the classifier and
clustering engine.
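Step 1 of the initialization workflow can be sketched as an inverted index that
maps each term to the documents containing it. This is a simplified stand-in for
what Apache Lucene provides, not Lucene's API; the class name, the integer
document IDs, and the tokenizer are assumptions made for illustration.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative inverted index: term -> IDs of the documents containing it.
class InvertedIndexSketch {
    private final Map<String, Set<Integer>> postings = new TreeMap<>();

    // Parse the text into lowercase terms and record the document under each term.
    void add(int docId, String text) {
        for (String token : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Return the sorted IDs of the documents containing the term (empty if none).
    Set<Integer> lookup(String term) {
        return postings.getOrDefault(term.toLowerCase(), new TreeSet<>());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "Online payment and invoice management");
        index.add(2, "Customer contact management");
        System.out.println(index.lookup("management")); // prints [1, 2]
        System.out.println(index.lookup("invoice"));    // prints [1]
    }
}
```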

Publish Requirement Workflow:

1. The tenant publishes a requirement using the Publish Requirement API.
2. The web server requests indices from the indexing engine.
3. The indexing engine returns the indices.
4. The indices are sent to the classification engine.
5. The classification engine returns the classified domain.
6. The web server returns the domain of the requirement to the tenant.
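The six steps above can be sketched as a web-server component that delegates to
the two engines. Both engines are modeled as plain functions so the example stays
self-contained; the class name, the method name, and the toy engines in main are
hypothetical and only show the order of the calls.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Illustrative orchestration of the Publish Requirement workflow.
class PublishWorkflowSketch {
    // Requirement text -> index terms (stand-in for the indexing engine).
    private final Function<String, List<String>> indexingEngine;
    // Index terms -> domain name (stand-in for the classification engine).
    private final Function<List<String>, String> classificationEngine;

    PublishWorkflowSketch(Function<String, List<String>> indexingEngine,
                          Function<List<String>, String> classificationEngine) {
        this.indexingEngine = indexingEngine;
        this.classificationEngine = classificationEngine;
    }

    // Step 1 is the call itself; steps 2 to 6 follow inside.
    String publishRequirement(String requirementText) {
        List<String> indices = indexingEngine.apply(requirementText); // steps 2 and 3
        String domain = classificationEngine.apply(indices);          // steps 4 and 5
        return domain;                                                // step 6
    }

    public static void main(String[] args) {
        PublishWorkflowSketch server = new PublishWorkflowSketch(
            text -> Arrays.asList(text.toLowerCase().split("\\s+")),
            terms -> terms.contains("invoice") ? "billing" : "general");
        System.out.println(server.publishRequirement("Need invoice generation")); // prints billing
    }
}
```

Passing the engines in through the constructor keeps the web server independent
of how indexing and classification are implemented.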




Keyword Query Workflow:

1. The tenant submits keywords using the Keyword Query API.
2. The web server sends the query to the classification engine.
3. The classification engine returns the top 5 components.
4. The web server returns the top 5 components to the tenant.
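Step 3 of the keyword query workflow can be sketched as a ranking that keeps at
most five results. The score used here, the number of distinct query terms found
in a component description, is a deliberately simple assumption and not the
similarity measure of the actual engine. The class and method names are
hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative top-5 selection over component descriptions.
class TopComponentsSketch {
    static final int LIMIT = 5;

    // Distinct lowercase alphanumeric terms of a text.
    static Set<String> terms(String text) {
        Set<String> result = new HashSet<>(Arrays.asList(text.toLowerCase().split("[^a-z0-9]+")));
        result.remove("");
        return result;
    }

    // Assumed score: how many distinct query terms occur in the description.
    static int score(Set<String> queryTerms, String description) {
        Set<String> shared = terms(description);
        shared.retainAll(queryTerms);
        return shared.size();
    }

    // Component names with a positive score, best first, ties broken by name.
    static List<String> topComponents(String query, Map<String, String> descriptions) {
        Set<String> queryTerms = terms(query);
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, String> e : descriptions.entrySet()) {
            if (score(queryTerms, e.getValue()) > 0) {
                matches.add(e.getKey());
            }
        }
        matches.sort((a, b) -> {
            int byScore = Integer.compare(score(queryTerms, descriptions.get(b)),
                                          score(queryTerms, descriptions.get(a)));
            return byScore != 0 ? byScore : a.compareTo(b);
        });
        return new ArrayList<>(matches.subList(0, Math.min(LIMIT, matches.size())));
    }

    public static void main(String[] args) {
        Map<String, String> components = new HashMap<>();
        components.put("InvoiceGenerator", "creates invoice documents for payment");
        components.put("PaymentGateway", "processes online payment");
        components.put("ContactBook", "stores customer contacts");
        // Prints [InvoiceGenerator, PaymentGateway]
        System.out.println(topComponents("invoice payment", components));
    }
}
```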

















SERVICES:

Service for Publishing a Service:

The system allows tenants to publish services. Each tenant provides a
description of the service using WSDL.

Input: The service description of the service to be published, provided by
the tenant.

Output: The service is published in the repository.

Precondition: The service should be runnable and valid.

Effect: Publishing the service stores the service description in the
database.


Publish Requirement Service:

The system contains a repository of all the services published by the tenants.
The Publish Requirement Service allows tenants to publish their requirements. The
system processes the requirements and returns a list of matching services to the
tenant.

Input: The requirements for the desired service.

Output: The services matched to the given requirements.

Precondition: Requirements should be valid.

Effect: For the given requirements, the classifier and clustering engine checks
the application requirement content database and matches the requirements against
the service descriptions stored in the database.

Service for Checking Dependencies:
The system checks whether the service has any dependent services at the time the
service details are provided to the tenant.
Input: The ID of the service that is to be displayed to the tenant.
Output: A list of dependency details.
Precondition: The service ID should be present and valid.
Effect: When the service details are about to be provided to the tenant, the
system checks whether any dependency exists for the given service.
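The dependency check can be sketched as a lookup over a map from each service ID
to the services it depends on directly. The document does not say whether
indirect dependencies are included; this sketch assumes they are and collects
them recursively. The class name, the method names, and the sample service IDs
are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative dependency lookup for a published service.
class DependencyCheckSketch {
    // Service ID -> IDs of the services it depends on directly.
    private final Map<String, List<String>> directDependencies = new HashMap<>();

    void register(String serviceId, String... dependsOn) {
        directDependencies.put(serviceId, Arrays.asList(dependsOn));
    }

    // Precondition from the IOPE: the service ID must be present.
    List<String> dependenciesOf(String serviceId) {
        if (!directDependencies.containsKey(serviceId)) {
            throw new IllegalArgumentException("Unknown service ID: " + serviceId);
        }
        Set<String> found = new LinkedHashSet<>();
        collect(serviceId, found);
        found.remove(serviceId); // a cycle must not list the service as its own dependency
        return new ArrayList<>(found);
    }

    // Depth-first walk; the set doubles as the visited marker, so cycles terminate.
    private void collect(String serviceId, Set<String> found) {
        List<String> direct = directDependencies.getOrDefault(serviceId, Collections.emptyList());
        for (String dependency : direct) {
            if (found.add(dependency)) {
                collect(dependency, found);
            }
        }
    }

    public static void main(String[] args) {
        DependencyCheckSketch registry = new DependencyCheckSketch();
        registry.register("checkout", "payment", "cart");
        registry.register("payment", "ledger");
        registry.register("cart");
        registry.register("ledger");
        System.out.println(registry.dependenciesOf("checkout")); // prints [payment, ledger, cart]
    }
}
```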

Keyword Search:
The system provides a keyword search service that returns the components whose
descriptions are most similar to the keyword query.
Input: The keyword for the component to be searched.
Output: A list of components whose descriptions best match the query.
Precondition: None.
Effect: When a search is submitted for the given keyword, the indexing engine
computes the similarity between the keyword and the component descriptions stored
in the database and returns the matching results sorted from most similar to least
similar.

Get Component Details:
The Get Component Details service provides a detailed description of a component
returned by the keyword search service.
Input: A request for the details of a component from the component list returned
for the given keyword query.
Output: A detailed description of the component.
Precondition: None.
Effect: Once the request is made, the system fetches the details of the component
from the database using the unique identifier of the component.










REQUIREMENTS:

Open Source Software:

Apache Tomcat
Apache Lucene
Weka
Java
Spring Framework 3.0



References:

1. Wei-Tek Tsai, Xiaoying Bai, and Yu Huang. EasySaaS: A SaaS Development
Framework.
