Abstract—This paper presents an open image-mining framework that provides access to tools and methods for the characterization of medical images. Several image processing and feature extraction operators have been implemented and exposed through Web Services. RapidMiner, an open source data mining system, has been utilized for applying classification operators and creating the essential processing workflows. The proposed framework has been applied for the detection of salient objects in Obstructive Nephropathy microscopy images. Initial classification results are quite promising, demonstrating the feasibility of automated characterization of kidney biopsy images.

I. INTRODUCTION

data mining of biomedical image data. The tools implemented as Web Services can be directly integrated into workflow management platforms (e.g., TAVERNA [3]), allowing their integration in several workflows corresponding to different image processing pipelines. Proper authentication and encryption mechanisms have been utilized in order to guarantee the appropriate security. The rest of the paper is organized as follows: Section II discusses related work in image mining through Web Services. Section III presents the tools and methods that enable the functionality of the proposed platform, whereas the architecture scheme is described in Section IV. Section V describes the Obstructive Nephropathy images
features can be extracted and transformed into a tabular format which can be further used by RapidMiner for data mining and processing.

The groups of image transformation and feature extraction operators constitute the most relevant part. Due to self-description methods of the Web Service, RapidMiner is able to detect the set of provided algorithms automatically. Hence, new image mining algorithms can be deployed automatically without a need to update the RapidMiner components. Whereas RapidMiner is not aware of image files, formats, etc., the image mining operators can be combined flexibly to transform images into a tabular format well-suited for data mining and further processing.

component is the Image Processing Web Services Core that hosts all the functionality exposed to the client communication through SOAP messages and the HTTP/HTTPS protocol. Appropriate classes and functions implement the aforementioned functionality utilizing any essential application programming interfaces (APIs) that provide access to advanced functionality (e.g., data management, image processing, etc.) or to data repositories

The GLCM is a tabulation of how often different combinations of pixel brightness values (grey levels) occur in an image.

i = the row number
j = the column number
P_{i,j} = the element i, j of the normalized symmetrical GLCM

\[ \mathrm{ASM} = \sum_{i,j=0}^{N-1} P_{i,j}^{2} \tag{1} \]

The Contrast, which gives a measure of how sharp the structural variations in the image are.

\[ \mathrm{Contrast} = \sum_{i,j=0}^{N-1} P_{i,j}\,(i-j)^{2} \tag{2} \]

The Inverse Difference Moment, which gives a measure of the local homogeneity of the segment of the image.

\[ \mathrm{IDM} = \sum_{i,j=0}^{N-1} \frac{1}{1+(i-j)^{2}}\,P_{i,j} \tag{4} \]

The Entropy, which is a measurement of randomness

\[ \mathrm{Entropy} = -\sum_{i,j=0}^{N-1} P_{i,j}\,\log(P_{i,j}) \tag{5} \]
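The texture features above can be illustrated with a minimal NumPy sketch. It is not the implementation behind the Web Services described in this paper. It assumes a single pixel offset (one pixel to the right by default), integer grey levels in the range 0 to N-1, and the natural logarithm for the entropy. The names `glcm`, `texture_features`, `levels`, `dx` and `dy` are hypothetical and chosen only for this example. Only the features numbered (1), (2), (4) and (5) are covered.

```python
import numpy as np


def glcm(image, levels, dx=1, dy=0):
    """Build a normalized symmetrical GLCM for one pixel offset (dx, dy).

    `image` must hold integer grey levels in [0, levels - 1] and must be
    large enough for at least one pixel pair to exist at the given offset.
    """
    img = np.asarray(image, dtype=np.intp)
    counts = np.zeros((levels, levels), dtype=np.float64)
    rows, cols = img.shape
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                # Row index: reference pixel; column index: neighbour pixel.
                counts[img[r, c], img[r2, c2]] += 1
    # Make the matrix symmetrical by counting every pair in both directions.
    counts = counts + counts.T
    # Normalize so that the entries sum to 1.
    return counts / counts.sum()


def texture_features(p):
    """Compute ASM, Contrast, IDM and Entropy from a normalized GLCM `p`."""
    i, j = np.indices(p.shape)
    nonzero = p[p > 0]  # skip empty cells so that log(0) is never evaluated
    return {
        "asm": float(np.sum(p ** 2)),
        "contrast": float(np.sum(p * (i - j) ** 2)),
        "idm": float(np.sum(p / (1.0 + (i - j) ** 2))),
        "entropy": float(-np.sum(nonzero * np.log(nonzero))),
    }


if __name__ == "__main__":
    # Toy 2x2 image with two grey levels, used only as an illustration.
    toy = [[0, 1],
           [0, 1]]
    print(texture_features(glcm(toy, levels=2)))
```

Each call returns one row of feature values, which matches the tabular format expected by the data mining step. Applying the functions to every image tile would give one row per tile.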
Fig. 3. Screenshot of the RapidMiner [13] interface illustrating a workflow for processing, feature extraction and classification of Obstructive Nephropathy images.

VI. INITIAL EVALUATION RESULTS

In order to evaluate the accuracy of the image mining framework for the characterization of obstructive nephropathy images, an initial dataset of 6 kidney biopsy images has been utilized. The images have been provided by INSERM (France). They have been obtained from healthy and pathogenic kidney biopsies of mice, and have been treated following Masson's trichrome staining technique in order to disclose the most important structures (see Fig. 4). A magnification of 200, an aperture of 0.5, an exposure of 10 ms and a gain of 1.0 have been used as shooting settings. In order to overcome the issue of the small dataset, the images have been gridified (see Section V), resulting in a considerably larger dataset.

After applying the aforementioned image processing and the k-Nearest Neighbor classifier, 260 of 283 non-pathogenic glomeruli (see Fig. 4) have been successfully recognized (i.e., 91.87% accuracy), whereas all pathogenic glomeruli were successfully classified.

Fig. 4. Annotation of important structures that determine pathogenesis in a Kidney biopsy image. Strong line: Glomerulus, Dashed line: Tubulus

The Bayesian classifier achieved 67.82% accuracy in predicting pathogenic areas. The SVM reached an accuracy of 76.87%, and all non-pathogenic glomeruli were successfully predicted.

VII. CONCLUSION

This paper has presented an open image mining framework for the characterization of Obstructive Nephropathy images. It is based on image processing and feature extraction operators available as Web Services and their integration in RapidMiner, an open data mining workflow tool. The framework enables experts to utilize image mining techniques without any requirements for specific image processing or data mining knowledge. Initial evaluation results are quite promising. Future work includes more extensive evaluation of the platform using new datasets.

ACKNOWLEDGMENT

This work is funded by the Information Society Technology program of the European Commission, “e-Laboratory for Interdisciplinary Collaborative Research in Data Mining and Data-Intensive Sciences (e-LICO)” (IST-2007.4.4-231519). The authors would also like to thank Joost Schanstra and Julie Klein from INSERM for the provision and annotation of the biopsy images.

REFERENCES

[1] Klahr S., “Obstructive nephropathy”, Internal Medicine, vol. 39, no. 5, 2000, pp. 355-361.
[2] Biological Web Services, June 2009, available online at: http://maurobio.infobio.net/bws/biows.htm.
[3] Tom Oinn, Matthew Addis, Justin Ferris, Darren Marvin, Martin Senger, Mark Greenwood, Tim Carver, Kevin Glover, Matthew R. Pocock, Anil Wipat and Peter Li, “Taverna: a tool for the composition and enactment of bioinformatics workflows”, Bioinformatics, vol. 20, no. 17, pp. 3045-3054, June 2004.
[4] Shadbolt N., Lewis P., Dasmahapatra S., Dupplaw D., Hu B. and Lewis H., “MIAKT: Combining Grid and Web Services for Collaborative Medical Decision Making”, in Proc. of AHM2004 UK eScience All Hands Meeting, September 2004, Nottingham, UK.
[5] Todica V., Vaida M.F., “SOA-based medical image processing platform”, in Proc. of IEEE International Conference on Automation, Quality and Testing, Robotics, 2008, vol. 1, pp. 398-403, May 2008.
[6] D. Perez, J. Crespo, A. Anguita, J. Ordonez, J. Dorado, G. Bueno, V. Feliu, A. Estruch, J. Heredia, “Biomedical Image Processing integration through INBIOMED: A Web Services-based Platform”, presented at the 6th International Symposium on Biological and Medical Data Analysis (ISBMDA 2005).
[7] Newcomer, Eric; Lomow, Greg (2005). Understanding SOA with Web Services. Addison Wesley. ISBN 0-321-18086-0.
[8] Kyle Gabhart, “Secure, Reliable Web Services with Apache”, available online at: http://www.xml.com/pub/a/2007/05/02/sure-reliable-web-services-with-apache.html.
[9] The OpenSSL Project, information available online at: http://www.openssl.org/.
[10] Ilias Maglogiannis, Charalampos Doukas, “Overview of Advanced Computer Vision Systems for Skin Lesions Characterization”, IEEE Transactions on Information Technology in Biomedicine, vol. 13, no. 5, pp. 721-733, Sept. 2009, DOI: 10.1109/TITB.2009.2017529.
[11] I. Maglogiannis, E. Zafiropoulos, “Utilizing Support Vector Machines for the Characterization of Digital Medical Images”, BMC Medical Informatics and Decision Making, 2004.
[12] R. M. Haralick et al., “Textural Features for Image Classification”, IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-3, pp. 610-621, 1973.
[13] Mierswa Ingo, Wurst Michael, Klinkenberg Ralf, Scholz Martin and Euler Timm, “YALE: Rapid Prototyping for Complex Data Mining Tasks”, in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-06), 2006.