
Pir Mehr Ali Shah

ARID AGRICULTURE UNIVERSITY RAWALPINDI


Synopsis for MS Degree in Computer Science
Title: DICOM VIEWER AND ANNOTATOR FEATURE EXTRACTION

Name of Student: Ahsan Iqbal
Registration Number: 12-arid-2791
Date of Admission: Spring 2015
Date of Initiation: Spring 2016
Probable Duration: One year

SUPERVISORY COMMITTEE

i) Supervisor: Mr. M. Zeeshan Muzaffar
ii) Member: Mr. Farhan Sabir Ujagar
iii) Member: Mr. Amir Rasheed

Director,
Barani Institute of Information Technology

Director,
Advanced Studies

ABSTRACT

INTRODUCTION

The Quran contains meaningful information and gives answers and
solutions to many problems that we face in our daily life. It
contains 114 Surahs, and each Surah contains verses (Ayat). The Quran
holds a large volume of unstructured subjects that are conceptually
related across verses.
The last decade is referred to as the age of information, and this
is surely the era of knowledge. Everyone, from novice user to expert,
desires authenticated, reliable and relevant results for queries. There
is a need to develop an application that can provide dynamic
conceptual search. With the emergence of the semantic web as a modern
field in which structure can be formulated so that it is understandable
not only to humans but also to machines, domain-specific ontologies
are created, and inference from these ontologies leads to semantic
search.
A large number of software packages for searching the Holy Quran as
a document are available on the web, as discussed in our earlier
research work. These packages provide electronic audio and text
representation, and some of them provide keyword search but lack
semantic search. A few show remarkable topic search against a query
word, but these are only static search tools.
Information retrieval (IR) deals with the retrieval of unstructured
data, especially textual documents, in response to a query, which may
itself be unstructured, like a sentence, or structured, like a Boolean
expression. The need for effective methods of automated IR has grown in
recent years because of the tremendous explosion in the amount of
unstructured data.
Text mining belongs to a class of what are called nontraditional IR
strategies. The goal of these strategies is to reduce the effort
required from users to obtain useful information from large
computerized text data sources. Text classification (TC) is a subfield
of data mining, which refers generally to the process of deriving
high-quality information from text, typically by devising patterns and
trends through methods such as statistical pattern learning.
Text classification is one of the most important topics in the
field of natural language processing (NLP); the purpose of a text
classification algorithm is to assign each document of a text dataset
to one or more pre-specified classes.
Text classification techniques are used in many applications,
including e-mail filtering, mail routing, spam filtering, news monitoring,
sorting through digitized paper archives, automated indexing of
scientific articles, classification of news stories and searching for
interesting information on the web.
An important research topic in this field is automatic text
classification (ATC), which arose with the advent of digital
documents. Today, ATC is a necessity due to the large number of text
documents that users have to deal with.
With the growth of text documents and Arabic document sources on
the web, information retrieval has become an important task for
satisfying the needs of different end users, while automatic text (or
document) categorization has become an important means of saving the
human effort required for manual categorization.
In this paper, a knowledge discovery algorithm for the Holy Quran
is proposed in order to classify its verses into one of a set of
predefined classes. The algorithm consists of two major phases: the
training phase and the classification phase.

REVIEW OF LITERATURE

Zhang (2004) builds a Naïve Bayes (NB) classifier, which calculates
the posterior probability of each class, with the estimation based on
a training set of pre-classified documents. In the testing phase of his
system, the posterior probability for each class is computed, and the
document is then assigned to the class that has the maximum posterior
probability.
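To make the maximum-posterior rule concrete, the following is a minimal sketch of a multinomial Naïve Bayes classifier. It is not the cited system: the add-one (Laplace) smoothing, the function names, and the English placeholder tokens and class labels are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_nb(labeled_docs):
    """Collect class counts and per-class word counts from (tokens, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in labeled_docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def log_posteriors(tokens, model):
    """Return log P(class) + sum of log P(token | class) for every class."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for tok in tokens:
            if tok not in vocab:
                continue  # tokens never seen in training carry no evidence
            # Add-one smoothing (an assumption) avoids log(0) for unseen pairs.
            score += math.log((word_counts[label][tok] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return scores


def classify_nb(tokens, model):
    """Assign the class with the maximum posterior score."""
    scores = log_posteriors(tokens, model)
    return max(scores, key=scores.get)


# Hypothetical toy training set; tokens and labels are placeholders only.
training_docs = [
    (["prayer", "fasting", "charity"], "worship"),
    (["prayer", "pilgrimage"], "worship"),
    (["trade", "debt", "contract"], "transactions"),
    (["debt", "witness"], "transactions"),
]
model = train_nb(training_docs)
predicted = classify_nb(["prayer", "charity"], model)
```

Working in log space avoids numerical underflow when many small probabilities are multiplied.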
Isa (2008) explores the benefits of an enhanced hybrid
classification method that utilizes the NB classifier and the
Support Vector Machine (SVM), while Lam, et al. (1999) build a neural
network classifier, addressing the classifier's drawbacks and how to
improve its performance.
Bellot (2003) proposes an approach that combines a named entity
recognition system and an answer retrieval system based on the Vector
Space model and uses some knowledge bases, while Liu, et al. (2004)
focus on solving the problem of using a training data set to find
representative words for each class. Lukui, et al. (2007) explore
how to improve the execution efficiency of classification methods.
Yu-ping (2007) proposes a multi-subject text classification
algorithm based on fuzzy support vector machines (MFSVM).
AL-Kabi, et al. (2007) present a comparative study of the
efficiency of different measures for classifying Arabic documents.
Their experiments show that the NB method slightly outperforms the
other methods.
AL-Mesleh (2007) proposes a classification system based on
Support Vector Machines (SVMs), in which the classifier uses the
chi-square (CHI) statistic as a feature selection method in the
pre-processing step of the text classification procedure.
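As an illustration of chi-square feature selection, the sketch below scores each term against one class from a 2x2 document contingency table and keeps the highest-scoring terms. The formula is the standard chi-square statistic for term/class independence; the function names and toy data are assumptions, and the cited system may differ in its details.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic for a term/class contingency table.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class without the term
    d: documents outside the class without the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def chi_square_scores(labeled_docs, target_class):
    """Score every vocabulary term against target_class."""
    in_class = [set(tokens) for tokens, label in labeled_docs if label == target_class]
    out_class = [set(tokens) for tokens, label in labeled_docs if label != target_class]
    vocab = set().union(*in_class, *out_class)
    scores = {}
    for term in vocab:
        a = sum(term in doc for doc in in_class)
        b = sum(term in doc for doc in out_class)
        scores[term] = chi_square(a, b, len(in_class) - a, len(out_class) - b)
    return scores


def select_features(labeled_docs, target_class, k):
    """Return the k terms with the highest chi-square score (ties broken alphabetically)."""
    scores = chi_square_scores(labeled_docs, target_class)
    return sorted(scores, key=lambda term: (-scores[term], term))[:k]


# Hypothetical toy corpus; tokens and labels are placeholders only.
docs = [
    (["prayer", "fasting", "charity"], "worship"),
    (["prayer", "pilgrimage"], "worship"),
    (["trade", "debt", "contract"], "transactions"),
    (["debt", "witness"], "transactions"),
]
top_terms = select_features(docs, "worship", 2)
```

Note that the statistic is symmetric: a term that strongly indicates absence from the class scores as highly as one that indicates membership.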
El-Halees (2006) introduces a system called ArabCat, based on the
maximum entropy model, to classify Arabic documents.
Saleem (2004) presents an approach that combines shallow
parsing and information extraction techniques with conventional
information retrieval.
Khreisat (2006) conducts a comprehensive study of the behavior of
the N-gram frequency statistics technique for classifying Arabic text
documents.
QARAB. EL-Kourdi (2004) builds an Arabic document classification
system to classify non-vocalized Arabic web documents based on the
Naïve Bayes algorithm.
AL-Kabi, et al. (2005) present an automatic classifier that assigns
the verses of the Fatiha and Yaseen chapters to predefined themes; the
system is based on a linear classification function (score function).
Hammo (2008) discusses the enhancement of Arabic passage retrieval
for both diacritized and non-diacritized text, and proposes a passage
retrieval approach that searches diacritized and diacritic-less text
through query expansion to match the user's query.
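One simple way to match diacritized and diacritic-less Arabic text is to strip the diacritic marks from both the query and the passage before comparing them. The sketch below does this with a regular expression. It is a normalization shortcut assumed for illustration, not a reproduction of the cited query-expansion approach, and the function names are hypothetical.

```python
import re

# Arabic diacritic marks (tashkeel): U+064B through U+0652, plus the
# superscript alef U+0670. Base letters are left untouched.
_DIACRITICS = re.compile("[\u064B-\u0652\u0670]")


def strip_diacritics(text):
    """Remove Arabic diacritic marks from text."""
    return _DIACRITICS.sub("", text)


def expand_query(query):
    """Return the query as typed plus its diacritic-less variant, if different."""
    variants = [query]
    bare = strip_diacritics(query)
    if bare != query:
        variants.append(bare)
    return variants


def matches(query, passage):
    """Diacritic-insensitive substring match: both sides are compared stripped."""
    return strip_diacritics(query) in strip_diacritics(passage)


# The same three-letter word written with and without diacritics.
diacritized = "\u0628\u0650\u0633\u0652\u0645\u0650"
bare = "\u0628\u0633\u0645"
found = matches(bare, diacritized)
```

Restricting the pattern to this code-point range leaves hamza and madda forms of alef intact, which a blanket removal of combining marks after Unicode decomposition would not.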

PROPOSED APPROACH
We have extensively discussed the issues that are obstacles to
achieving specific search over the Holy Quran. To address them,
exploratory research has been carried out and a sample prototype
has been developed. Defining such Islamic beliefs may well require
very extensive care, and a team of scholars should be given the
task of defining the concepts in such a way that they are accepted
universally.

MATERIAL AND METHODS

The proposed system consists of four phases. The first is the
preprocessing phase. The second is the training phase, in which the
learning database is constructed; it contains the features
representing each class. The input for this phase is a set of
pre-classified documents. The third is the classification phase, in
which the training database produced by the previous phase is used
with the classification method to classify the targeted Holy Quran
verse; query expansion also occurs in this phase, and its output is
the class of the targeted verse. The fourth and final phase is data
analysis and evaluation. These phases are shown in figure 1.
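The four phases can be sketched end to end as follows. The synopsis does not specify the preprocessing steps, the features, or the classification method, so everything here is an assumption: whitespace tokenization, term frequencies as the learning database, a relative-frequency score as the classifier, and accuracy as the evaluation measure. Query expansion is omitted, and the English texts and labels are placeholders.

```python
from collections import Counter, defaultdict

_PUNCTUATION = ".,;:!?\"'()"


def preprocess(text):
    """Phase 1: lowercase, split on whitespace, trim surrounding punctuation."""
    tokens = (tok.strip(_PUNCTUATION).lower() for tok in text.split())
    return [tok for tok in tokens if tok]


def train(labeled_texts):
    """Phase 2: build the learning database, mapping class -> term frequencies."""
    database = defaultdict(Counter)
    for text, label in labeled_texts:
        database[label].update(preprocess(text))
    return database


def classify(text, database):
    """Phase 3: pick the class whose terms cover the input best."""
    tokens = preprocess(text)
    scores = {}
    for label, counts in database.items():
        total = sum(counts.values())
        # Share of the class's term mass accounted for by the input tokens.
        scores[label] = sum(counts[tok] for tok in tokens) / total if total else 0.0
    return max(scores, key=scores.get)


def evaluate(labeled_texts, database):
    """Phase 4: fraction of held-out items classified correctly."""
    correct = sum(classify(text, database) == label for text, label in labeled_texts)
    return correct / len(labeled_texts)


# Hypothetical placeholder data, not drawn from the Quran.
database = train([
    ("Prayer and fasting.", "worship"),
    ("Debt and trade contract", "transactions"),
])
accuracy = evaluate([("trade debt", "transactions"), ("prayer", "worship")], database)
```

Keeping each phase as a separate function means the scoring rule in `classify` can be swapped for another method without touching the other phases.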

LITERATURE CITED
AL-Kabi (2007). "A Comparative Study of the Efficiency of Different
Measures to Classify Arabic Text", University of Sharjah Journal of
Pure and Applied Sciences, 4(2), pp. 13-26.
AL-Shalabi (2005). "AL-Hadith Text Classifier", Journal of Applied
Sciences, 5(3), pp. 548-587.
Al-Serhan (2003). "New Approach for Extracting Arabic Roots",
Proceedings of the 2003 Arab Conference on Information
Technology (ACIT2003), pp. 42-59.
Bellot (2003). "Coupling Named Entity Recognition, Vector-Space Model
and Knowledge Bases for TREC-11 Question Answering Track", In
Proceedings of the Eleventh Text Retrieval Conference (TREC
2002), NIST Special Publication.
Duwairi (2006). "Machine Learning for Arabic Text Categorization",
Journal of the American Society for Information Science and
Technology, 57(8), pp. 1005-1010.
El-Halees (2006). "Mining Arabic Association Rules for Text
Classification", Proceedings of the First International Conference
on Mathematical Sciences, AL-Azhar University of Gaza,
Palestine, 15-17 July.
Rachidi (2004). "Automatic Arabic Document Categorization Based on
the Naïve Bayes Algorithm", Workshop on Computational
Approaches to Arabic Script-Based Languages (COLING-2004),
University of Geneva, Geneva, Switzerland, pp. 51-58.
Atta-ur-Rahman (2013). "Teacher Assessment and Profiling using Fuzzy
Rule based System and Apriori Algorithm", International Journal of
Computer Applications (IJCA), vol. 65, no. 5, pp. 22-28.
Hüllermeier, E. (2005). "Fuzzy methods in machine learning and data
mining: Status and prospects", Fuzzy Sets and Systems (Elsevier),
vol. 156, no. 3, 16, pp. 387-406.
Joachims, T. (1998). "Text Categorization with Support Vector Machines:
Learning with Many Relevant Features", In European Conference on
Machine Learning (ECML).
Apte, F. Damerau, and S. Weiss (1994). "Automated learning of decision
rules for text categorization", ACM Transactions on Information
Systems, 12(3), pp. 233-251.
