
Volume 1, No. 10, December 2012, ISSN 2278-1080
The International Journal of Computer Science & Applications (TIJCSA)
RESEARCH PAPER
Available Online at http://www.journalofcomputerscience.com/

A Survey on Test Case Generation and Extraction of Reliable Test Cases


Anbarasu I
PG Scholar, Department of Computer Science and Engineering, Sona College of Technology, Salem, Tamilnadu. emailtoanbarasu@gmail.com

Anitha Elavarasi S
Assistant Professor, Department of Computer Science and Engineering, Sona College of Technology, Salem, Tamilnadu anisha_er@yahoo.co.in

Abstract:
Software testing is the process of verifying the functionality of software. It is one of the crucial areas of software development, and it consumes considerable time and cost. Much of the time spent on testing goes into executing a large number of test cases, many of which are unreliable. Our goal is to reduce the number of test cases and to retain only reliable ones. To extract reliable test cases from a large test suite, a clustering algorithm, which is a data mining technique, is used.

Keywords: Software Testing, Clustering, Test case, Test case extraction.

1. Introduction:
Software engineering [1] is the study of requirement gathering, analysis, design, development, testing and maintenance of software. Among these, software testing is one of the most important parts. Testing [2] is the process of verifying and validating software to confirm that it meets customer requirements, to find bugs and to check that results are accurate. It aims to assure customers that the software works correctly under all circumstances. This requires building test cases that exercise every possible path of the software. Test cases are sets of conditions or inputs given to an application to validate its functionality. Manual test case generation is tedious and time-consuming. Various tools (e.g. CodeProJunit) now generate test cases automatically from inputs supplied by the user. However, these tools may generate a large number of test cases that are unreliable and redundant. Testing such test cases takes more time and increases cost. To avoid this problem, reliable test cases are extracted using a clustering algorithm.

Data mining [3] is the process of extracting information from large data sets using techniques such as preprocessing, classification, clustering, association, prediction and sequential pattern mining. Clustering [3] is the process of grouping objects with similar behaviour drawn from a large database. Clustering is broadly divided into two types: the hierarchical approach and the partitional approach. In the hierarchical approach, clustering is performed by building a hierarchy of clusters, whereas the partitional approach does not allow one cluster to be a subset of another.

This paper surveys test case generation and the extraction of reliable test cases. Section 2 reviews the surveyed papers related to test cases. Section 3 presents a comparison table that lists the advantages and disadvantages of the surveyed papers. Finally, Section 4 gives the conclusion.

2012, http://www.journalofcomputerscience.com - TIJCSA All Rights Reserved

2. Literature Survey:
2.1. Lilly Raamesh and G. V. Uma present Reliable Mining of Automatically Generated Test Cases from Software Requirements Specification [4].
They describe how test cases are generated automatically from state charts, and a method to reduce the test suite using a data mining technique. Their approach has three steps:
1. Generation of classification rules.
2. Generation of test cases from UML state machines.
3. Application of data mining methods to reduce the generated test cases.

2.1.1. Generation of classification rules:


In this module, the Software Requirements Specification (SRS) is taken as input and classified into Functional Requirements (FR) and Non-functional Requirements (NFR). It is converted into state charts systematically using the relevant information. The Weka tool is used for classification. The Weka classifier is first trained with a training set. Classification rules are then applied to the SRS, which classify it into FR and NFR. The NFR are transformed into state machines, which specify the behaviour of a system.

2.1.2. Generation of test cases from state machines:


This section describes the generation of test cases from state machines, following the method in [5]. The approach generates a valid set of test sequences in which the preconditions of all transitions are established either by previous actions or by properties of the test data. It has three steps:
1. Predicate selection
2. Predicate transformation
3. Test data generation

2.1.2.1. Predicate Selection:


Predicate selection can be performed using either Breadth First Search (BFS) or Depth First Search (DFS). A predicate is selected on a transition of the UML state machine diagram. Here, DFS traversal is used to select predicates from the UML state machine diagram. During traversal, the conditional predicate on each transition is examined and test data is generated. The test data is generated for the true and false values of the conditional predicate, subject to satisfying the path condition accumulated so far.
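
As an illustration, the DFS-based selection of predicates can be sketched in Python. The dictionary encoding of the state machine, the guard strings and the function name dfs_predicate_paths are assumptions made for this sketch; they are not taken from [4] or [5].

```python
from typing import Dict, List, Set, Tuple

# Hypothetical encoding: state -> list of (guard predicate, next state).
Transitions = Dict[str, List[Tuple[str, str]]]


def dfs_predicate_paths(machine: Transitions, start: str) -> List[List[str]]:
    """Collect, for each DFS path from `start`, the guard predicates met on the way."""
    paths: List[List[str]] = []

    def visit(state: str, guards: List[str], seen: Set[str]) -> None:
        # Only follow transitions to states not yet on the current path (avoids cycles).
        successors = [(g, s) for g, s in machine.get(state, []) if s not in seen]
        if not successors:
            paths.append(guards)
            return
        for guard, next_state in successors:
            visit(next_state, guards + [guard], seen | {next_state})

    visit(start, [], {start})
    return paths


# Hypothetical vending-machine state chart used only as example input.
machine = {
    "Idle": [("coin >= price", "Ready"), ("coin < price", "Rejected")],
    "Ready": [("stock > 0", "Dispensing"), ("stock == 0", "Refund")],
}
paths = dfs_predicate_paths(machine, "Idle")
```

Each returned list is the path condition that test data for that path would have to satisfy; generating the data itself is the subject of the next two steps.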

2.1.2.2. Predicate Transformation:


Predicate transformation for a UML state machine is performed according to the boundary testing criterion (for example, two points named ON and OFF for a given border that satisfy the boundary testing criterion). The relational expression of the predicate is transformed into a function F called the predicate function. For example, if the predicate P is of the form (A1 op A2), where A1 and A2 are arithmetic expressions and op is a relational operator, then F = (A1 - A2) or (A2 - A1), whichever is positive for the input data. Next, the input data is modified so that F decreases and finally becomes negative. A negative value of F corresponds to a change in the outcome of the predicate. The point at which the outcome of P changes is therefore found by minimizing the corresponding function F. This minimization is achieved through repeated modification of the input data values, which is known as the alternating variable method.
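
A minimal sketch of this minimization, assuming integer inputs and a fixed step size, is given below. It performs only the simple exploratory moves of the alternating variable method (one variable changed at a time) and omits any acceleration. The predicate x > y + 10, the starting input and all function names are hypothetical examples, not taken from [5].

```python
from typing import Callable, List, Optional, Sequence


def alternating_variable_search(
    f: Callable[[Sequence[int]], int],
    start: Sequence[int],
    step: int = 1,
    max_rounds: int = 1000,
) -> Optional[List[int]]:
    """Change one input variable at a time until f becomes negative."""
    point = list(start)
    for _ in range(max_rounds):
        if f(point) < 0:
            return point
        improved = False
        for i in range(len(point)):  # take the variables in turn
            for delta in (step, -step):
                candidate = list(point)
                candidate[i] += delta
                if f(candidate) < f(point):  # keep a move only if F decreases
                    point = candidate
                    improved = True
                    break
            if f(point) < 0:
                return point
        if not improved:  # stuck: no single-variable move lowers F
            return None
    return None


def predicate_function(values: Sequence[int]) -> int:
    """F = A2 - A1 for the example predicate x > y + 10 (positive while it is false)."""
    x, y = values
    return (y + 10) - x


flipping_input = alternating_variable_search(predicate_function, [0, 0])
```

With the starting input (0, 0) the predicate is false and F is positive; the search stops at the first input for which F is negative, which is an input on the other side of the predicate's boundary.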

2.1.2.3. Test Data Generation:


Test cases are generated based on the predicate function F, which is derived from the state machine.

2.1.3. Data mining technique used:


Data mining is the process of extracting similar patterns from the large amount of data available in a database. A clustering technique is used to minimize the test suite because it offers advantages such as scalability, support for high dimensionality and the ability to cluster noisy data.
Advantage: Test cases are generated automatically from the SRS. Manual test case generation is avoided, which reduces faults during generation. Generating test cases takes little time.
Limitation: There are no agents to perform test case generation and reduction. A large number of test cases is generated, which takes more time to test.

2.2. Sarita Sharma and Anamika Sharma present Amalgamation of Automated Testing and Data Mining: A Novel Approach in Software Testing [6].
They advocate automated software testing in place of manual testing, since manual testing causes huge losses to the software industry. They also propose the use of data mining algorithms that automatically induce functional requirements from execution data. The induced data mining model of the tested software can be used to design a minimal set of regression tests, to recover incomplete and missing specifications, and to evaluate the correctness of the output produced by the software. Automated software testing is performed based on mining and knowledge extraction from test cases. It has the following automated testing lifecycle components:
1. Automated testing design (decision to automate testing).
2. Test tool acquisition.
3. Introduction of the automated testing process.
4. Test planning, design and development.
5. Execution and management of tests.
6. Test program review and assessment.

2.2.1. Challenges of automated software testing [7]:


The challenges are test tool selection, test tool customization, selection of the automation level, development and verification of test scripts, and implementation of a test management system.
Advantage: Software is tested automatically, which saves human effort and time.

2.3. Kartheek Muthyala and Rajshekhar Naidu P present A Novel Approach to Test Suite Reduction Using Data Mining [8].
They propose a new approach to reduce the number of test cases using clustering. Two clustering algorithms are used: Simple K-means and the pickupCluster algorithm. The Weka tool, which is open source and freely available on the internet, is used to apply the data mining algorithms. The test cases are stored in a text file, which is then converted to ARFF (Attribute-Relation File Format). The conversion is performed because ARFF is the input format that the Weka [9] tool expects. The ARFF file is loaded into Weka and the clustering algorithms are applied. Weka also supports other data mining techniques such as classification, prediction, preprocessing and association.
Advantage: A large number of test cases is reduced to a smaller set. Redundancy is avoided and time consumption is low.
Future Work: Dependency among test cases should be taken into account in the proposed work.
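
The pipeline can be illustrated with a short Python sketch that does not use Weka itself. It writes numeric test inputs in ARFF layout, groups them with a plain K-means, and keeps the test case nearest to each centroid. Picking the nearest test case as the representative is an assumption of this sketch; the pickupCluster algorithm of [8] is not reproduced here. The attribute names, the sample data and the choice of the first k points as initial centroids are likewise hypothetical.

```python
from math import dist  # available from Python 3.8
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def to_arff(relation: str, attributes: Sequence[str], rows: Sequence[Point]) -> str:
    """Render numeric rows as ARFF text: header declarations, then @data rows."""
    lines = [f"@relation {relation}", ""]
    lines += [f"@attribute {name} numeric" for name in attributes]
    lines += ["", "@data"]
    lines += [",".join(str(value) for value in row) for row in rows]
    return "\n".join(lines)


def kmeans(points: Sequence[Point], k: int, max_iters: int = 100):
    """Plain K-means; the first k points serve as initial centroids (assumption)."""
    centroids = list(points[:k])
    labels: List[int] = []
    for _ in range(max_iters):
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep the centroid of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def representatives(points, labels, centroids) -> List[int]:
    """Index of the test case closest to each non-empty cluster's centroid."""
    chosen = []
    for c, centre in enumerate(centroids):
        members = [i for i, label in enumerate(labels) if label == c]
        if members:
            chosen.append(min(members, key=lambda i: dist(points[i], centre)))
    return chosen


test_inputs = [(1, 1), (1, 2), (2, 1), (10, 10), (10, 11), (11, 10)]
arff_text = to_arff("test_cases", ["input_a", "input_b"], test_inputs)
labels, centroids = kmeans(test_inputs, k=2)
reduced_suite = representatives(test_inputs, labels, centroids)
```

Here six test cases fall into two groups, and one test case per group is kept, which is the sense in which clustering reduces the suite.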

2.4. Vinaya Sawant and Ketan Shah present Automatic Generation of Test Cases from UML Models [10].
This is an approach to generate test cases from UML models. UML diagrams of an application, namely the sequence diagram, class diagram and use case diagram, are used to generate test cases. The aim is to avoid manual creation of test cases, which takes more time, so the authors built a new tool that generates test cases automatically. The three UML diagrams are transformed into a representation called the Sequence Diagram Graph (SDG) [11]. To generate a test case, the input, expected output, pre-condition and post-condition are taken from the use case diagram, class diagram and data dictionary in the form of OCL [12] expressions, and are considered together with the sequence diagram. The SDG is then traversed to generate test cases based on coverage criteria.
Advantage: Test cases are generated automatically, which saves time and avoids manual creation of test cases. Test cases are written before implementation.
Limitation: A large number of test cases is generated, which takes more time to test.

2.5. Jim Heumann presents Generating Test Cases from Use Cases [13].
This approach generates test cases from use cases. Test cases are derived from user requirements before the development phase. Use cases are presented visually in a use case diagram. A use case contains information such as name, brief description, flow of events, precondition and post-condition. The flow of events is the most important part for deriving test cases; it consists of the basic flow of events and the alternate flows of events. Use case scenarios, each of which gives a complete path through a use case, are the basis for test case generation. The three steps to generate test cases from a use case are:
1. Generate use case scenarios for each use case.
2. Identify at least one test case for each use case scenario.
3. Identify data values for each test case.

2.5.1. Generating Use Case Scenarios:


To generate use case scenarios, read the textual description of the use case and identify the basic flow and the alternate flows; combining them gives the scenarios.
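
The following Python sketch shows one way to derive scenarios from a basic flow and its alternate flows. It is a simplification that combines the basic flow with one alternate flow at a time; the data structure, the field names and the cash-withdrawal example are assumptions of this sketch and do not come from [13].

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class AlternateFlow:
    name: str
    branch_after: int          # index of the basic-flow step after which this flow starts
    steps: List[str]           # steps that belong to the alternate flow itself
    rejoin_at: Optional[int]   # basic-flow index to resume at, or None if the use case ends


def build_scenarios(basic: List[str], alternates: List[AlternateFlow]) -> Dict[str, List[str]]:
    """Return the basic flow plus one scenario per alternate flow."""
    scenarios = {"Basic flow": list(basic)}
    for alt in alternates:
        path = basic[: alt.branch_after + 1] + alt.steps
        if alt.rejoin_at is not None:
            path += basic[alt.rejoin_at:]  # continue with the rest of the basic flow
        scenarios[f"Basic flow + {alt.name}"] = path
    return scenarios


# Hypothetical use case: withdraw cash.
basic_flow = ["insert card", "enter PIN", "choose amount", "dispense cash"]
alternate_flows = [
    AlternateFlow("invalid PIN", branch_after=1, steps=["show PIN error"], rejoin_at=1),
    AlternateFlow("insufficient funds", branch_after=2, steps=["show balance error"], rejoin_at=None),
]
scenarios = build_scenarios(basic_flow, alternate_flows)
```

Each scenario then needs at least one test case, and each test case needs concrete data values, as in steps 2 and 3 of the approach.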

2.5.2. Test case Identification:


Test cases are identified by analyzing the scenarios and reviewing the textual description of the use case.
Advantage: Test cases can be generated from user requirements before the development phase.
Limitation: A large number of test cases is generated.

2.6. Dharmendra K Roy and Lokesh K Sharma present Genetic k-Means Clustering Algorithm for Mixed Numerical and Categorical Data Sets [14].
A genetic K-means clustering algorithm is proposed to cluster large data sets that contain both numerical and categorical data [15]. A genetic algorithm has three operators: the selection operator, the crossover operator and the mutation operator. Here, a K-means operator is used in place of the crossover operator, and it avoids reassigning patterns to empty clusters. As a result, an illegal string remains illegal when the K-means operator is applied.
Advantage: Clustering can be applied to both numerical and categorical data sets.
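
One step of such a K-means operator can be sketched as follows. A chromosome is taken to be a list holding one cluster label per pattern. The cluster centre (mean of the numerical part, mode of the categorical part) and the distance (squared difference plus number of mismatching categories) are simple stand-ins chosen for this sketch; they are assumptions and not the exact measures of [14] or [15].

```python
from collections import Counter
from typing import Dict, List, Tuple

# A pattern is (numerical values, categorical values).
Pattern = Tuple[Tuple[float, ...], Tuple[str, ...]]


def centre_of(members: List[Pattern]) -> Pattern:
    """Mean of the numerical part and mode of the categorical part."""
    numeric = tuple(sum(col) / len(members) for col in zip(*(m[0] for m in members)))
    categorical = tuple(Counter(col).most_common(1)[0][0] for col in zip(*(m[1] for m in members)))
    return numeric, categorical


def mixed_distance(a: Pattern, b: Pattern) -> float:
    """Squared numerical difference plus the count of mismatching categories."""
    numeric = sum((x - y) ** 2 for x, y in zip(a[0], b[0]))
    mismatches = sum(x != y for x, y in zip(a[1], b[1]))
    return numeric + mismatches


def kmeans_operator(data: List[Pattern], chromosome: List[int], k: int) -> List[int]:
    """Reassign every pattern to the nearest centre of a non-empty cluster."""
    centres: Dict[int, Pattern] = {}
    for cluster in range(k):
        members = [d for d, label in zip(data, chromosome) if label == cluster]
        if members:  # empty clusters get no centre, so nothing is moved into them
            centres[cluster] = centre_of(members)
    return [min(centres, key=lambda c: mixed_distance(d, centres[c])) for d in data]


data = [((1.0,), ("a",)), ((1.2,), ("a",)), ((5.0,), ("b",)), ((5.3,), ("b",))]
improved = kmeans_operator(data, [0, 1, 1, 1], k=2)
still_illegal = kmeans_operator(data, [0, 0, 0, 0], k=2)
```

In the second call cluster 1 is empty, so no pattern is assigned to it and the string stays illegal, which matches the behaviour described above.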

2.7. Ali Ilkhani and Golnoosh Abaee present Extracting Test Cases by Using Data Mining; Reducing the Cost of Testing [16].
Case-Based Reasoning (CBR) and data mining are used as an effective method for automated testing and effort estimation. Testing produces different test results, which are classified into categories based on the type of fault and the type of software process model. CBR and data mining are used to perform this classification. The classified results are then reused when testing the same application in future, which reduces the cost of testing.

2.7.1. Case-Based Reasoning:


Case-Based Reasoning solves a new problem by remembering a previous situation and reusing the data and knowledge of that situation. According to Aamodt and Plaza [17], CBR is able to utilize the specific knowledge of previously experienced, concrete problem situations (cases). A new problem is solved by finding a similar past case and reusing it in the new problem situation. Case-Based Reasoning has four steps:
1. Retrieve: Retrieve from memory the test cases that are relevant to the target problem.
2. Reuse: Reuse the solution of the previous problem for the target problem.
3. Revise: Revise the solution for the target problem.
4. Retain: Store the result for the target problem in memory.
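
The four steps can be sketched in Python as a small loop over a case memory. The attribute-matching similarity measure, the class and function names, and the sample cases are assumptions made for illustration; [16] and [17] do not prescribe them.

```python
from typing import Callable, Dict, List, Optional, Tuple

Features = Dict[str, str]
Case = Tuple[Features, List[str]]  # (problem description, test cases that solved it)


def similarity(a: Features, b: Features) -> float:
    """Fraction of attributes on which two problem descriptions agree."""
    keys = set(a) | set(b)
    if not keys:
        return 0.0
    return sum(a.get(key) == b.get(key) for key in keys) / len(keys)


class CaseBase:
    def __init__(self) -> None:
        self.cases: List[Case] = []

    def retrieve(self, problem: Features) -> Optional[Case]:
        """Return the most similar stored case, or None if the memory is empty."""
        if not self.cases:
            return None
        return max(self.cases, key=lambda case: similarity(case[0], problem))

    def retain(self, problem: Features, solution: List[str]) -> None:
        self.cases.append((problem, solution))


def solve(memory: CaseBase, problem: Features,
          revise: Callable[[Features, List[str]], List[str]]) -> List[str]:
    match = memory.retrieve(problem)             # 1. Retrieve
    proposed = list(match[1]) if match else []   # 2. Reuse
    solution = revise(problem, proposed)         # 3. Revise
    memory.retain(problem, solution)             # 4. Retain
    return solution


memory = CaseBase()
memory.retain({"model": "waterfall", "fault": "boundary", "platform": "desktop"}, ["tc_min", "tc_max"])
memory.retain({"model": "agile", "fault": "null input", "platform": "web"}, ["tc_null"])

new_problem = {"model": "waterfall", "fault": "boundary", "platform": "web"}
selected = solve(memory, new_problem, lambda problem, tests: tests + ["tc_web_smoke"])
```

With an empty memory the Retrieve step returns nothing and the Revise step has to build the solution from scratch, which reflects the dependence of CBR on previous history.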

2.7.2. Data mining technique used:


A clustering technique is used to group similar patterns. After the similar data is grouped, the proposed system involves the following processes:
1. Converting the software to a case.
2. Measuring the pattern and its similarity.
3. Defining case attributes and domains.
4. Finding appropriate test cases using CBR.

Advantage: Software is tested automatically, which reduces human time.
Limitation: CBR is not possible without a history of previous cases.
Future Work: Reusing previous test cases should give good performance when testing upcoming software. For this, it appears that the larger the number of attributes, the closer the range of the domain is to reality.

2.8. Ahamed Shafeeq B M and Hareesha K S present Dynamic Clustering of Data with Modified K-Means Algorithm [18].
A modified K-means algorithm [19][20] is proposed to cluster data sets; it aims to find the optimal number of clusters and to improve cluster quality. Standard K-means clustering requires the number of clusters to be given in advance, so it handles only a known number of clusters (K1) and not an unknown number (K2). The modified K-means algorithm handles both cases. If the user fixes the number of clusters in advance, it acts as K-means. Otherwise, the algorithm computes new clusters by incrementing the cluster counter by one in each iteration.
Advantage: The algorithm works whether or not the number of clusters is fixed in advance.
Limitation: Computational time is high compared to K-means clustering.
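
The two modes can be illustrated with the Python sketch below. When k is given it runs ordinary K-means; otherwise it increases the cluster counter by one per round. The stopping rule (stop when the within-cluster squared error improves by less than a fraction min_gain), the farthest-point initialisation and all names are assumptions of this sketch, not the cluster validity criterion of [18].

```python
from math import dist  # available from Python 3.8
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


def init_centroids(points: Sequence[Point], k: int) -> List[Point]:
    """Start from the first point, then repeatedly add the point farthest from all centroids."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    return centroids


def kmeans(points: Sequence[Point], k: int, max_iters: int = 100):
    """Ordinary K-means; returns the labels and the within-cluster squared error."""
    centroids = init_centroids(points, k)
    for _ in range(max_iters):
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])
        if updated == centroids:
            break
        centroids = updated
    labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
    error = sum(dist(p, centroids[label]) ** 2 for p, label in zip(points, labels))
    return labels, error


def dynamic_kmeans(points: Sequence[Point], k: Optional[int] = None,
                   min_gain: float = 0.5, max_k: int = 10):
    """Fixed k: plain K-means. No k: add one cluster per round while the error drops enough."""
    if k is not None:
        return kmeans(points, k)
    current_k = 1
    labels, error = kmeans(points, current_k)
    while current_k < min(max_k, len(points)):
        next_labels, next_error = kmeans(points, current_k + 1)
        if error == 0 or (error - next_error) / error < min_gain:
            break  # assumed stopping rule: one more cluster no longer helps much
        current_k, labels, error = current_k + 1, next_labels, next_error
    return labels, error


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (20, 0), (20, 1), (21, 0)]
auto_labels, _ = dynamic_kmeans(points)        # number of clusters not given
fixed_labels, _ = dynamic_kmeans(points, k=2)  # number of clusters fixed in advance
```

The repeated K-means runs in the second mode also show why the approach costs more computation than a single K-means run.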

3. Comparison table:
S.no | Title | Advantage | Disadvantage | Conclusion
1 | Reliable Mining of Automatically Generated Test Cases from Software Requirements Specification | Automatically generates test cases from SRS. Avoids manual creation of test cases. | Provides a large number of test cases. | Test cases can be generated from SRS.
2 | Amalgamation of Automated Testing and Data Mining: A Novel Approach in Software Testing | Software is tested automatically, which reduces human time. | No agents to carry out the proposed system. | Software is tested automatically by an automated system.
3 | A Novel Approach to Test Suite Reduction Using Data Mining | A large number of test cases is reduced to a smaller set. | No dependency among test cases. | Test cases can be reduced.
4 | Automatic Generation of Test Cases from UML Models | Test cases are generated from UML models automatically. | Generates a large number of test cases. | Test cases can be generated using UML models.
5 | Generating Test Cases from Use Cases | Test cases are generated before development. | A large number of test cases is generated. | Test cases are generated from use cases.
6 | Genetic k-Means Clustering Algorithm for Mixed Numerical and Categorical Data Sets | Both numerical and categorical data can be clustered. | Applies only to mixed data sets. | All types of data can be clustered using this algorithm.
7 | Extracting Test Cases by Using Data Mining; Reducing the Cost of Testing | Automates software testing without human intervention. | No previous history of test cases for first-time extraction. | Test cases are extracted using mining.
8 | Dynamic Clustering of Data with Modified K-Means Algorithm | The number of clusters can be fixed in advance. | Takes more computational time than K-means clustering. | Test cases can be extracted easily when the number of clusters is fixed in advance.

4. Conclusion:
This survey shows that the software industry spends a large share of its money on testing. This is due to the following reasons:
1. Manual generation of test cases.
2. Generation of a large number of test cases.
3. Testing with more test cases.
4. Testing with unreliable and redundant test cases.
To overcome manual generation, various test case generation tools have been developed. However, these tools generate a large number of unreliable and redundant test cases. Clustering algorithms can be used to reduce the number of test cases.

5. References:
[1]. Roger S. Pressman, Software Engineering: A Practitioner's Approach, 5th ed. (McGraw-Hill series in computer science), ISBN 0-07-365578-3, Avenue of the Americas, New York, NY, 10020, 2001.
[2]. Glenford J. Myers, The Art of Software Testing, Second Edition, John Wiley & Sons, Inc., Hoboken, New Jersey, 2004.
[3]. Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, Second Edition, Morgan Kaufmann Publishers, an imprint of Elsevier, 500 Sansome Street, Suite 400, San Francisco, CA 94111.
[4]. Lilly Raamesh and G. V. Uma, Reliable Mining of Automatically Generated Test Cases from Software Requirements Specification, IJCSI International Journal of Computer Science Issues, Vol. 7, Issue 1, No. 3, January 2010, ISSN (Online): 1694-0784, ISSN (Print): 1694-0814.
[5]. P. Samuel, R. Mall and A. K. Bothra, Automatic test case generation using Unified Modeling Language (UML) state diagrams, Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur 721302, West Bengal, India.
[6]. Sarita Sharma and Anamika Sharma, Amalgamation of Automated Testing and Data Mining: A Novel Approach in Software Testing, IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 5, No. 2, September 2011, ISSN (Online): 1694-0814.
[7]. Hughes Software Systems Ltd., Test Automation, http://www.hssworld.com/whitepapers/whitepaper_pdf/test_automation.pdf, December 2002.
[8]. Kartheek Muthyala and Rajshekhar Naidu P, A Novel Approach to Test Suite Reduction Using Data Mining, Indian Journal of Computer Science and Engineering (IJCSE), Vol. 2, No. 3, Jun-Jul 2011, ISSN: 0976-5166.
[9]. Remco R. Bouckaert, Weka Manual 3-6-1, Software manual, June 4, 2009, pp. 11-14.
[10]. Vinaya Sawant and Ketan Shah, Automatic Generation of Test Cases from UML Models, D. J. Sanghvi COE, Mumbai, International Conference on Technology Systems and Management (ICTSM) 2011, Proceedings published by International Journal of Computer Applications (IJCA).
[11]. Monalisa Sarma, Debasish Kundu and Rajib Mall, Automatic Test Case Generation from UML Sequence Diagrams, Department of Computer Science & Engineering, IIT Kharagpur, IEEE 2007.
[12]. Object Constraint Language 2.0, available from the Object Management Group's web site, http://www.omg.org/
[13]. Jim Heumann, Generating Test Cases from Use Cases, Rational Software, 2001, http://therationaledge.com/content/jun_01/m_cases_jh.html.
[14]. Dharmendra K Roy and Lokesh K Sharma, Genetic k-Means Clustering Algorithm for Mixed Numerical and Categorical Data Sets, International Journal of Artificial Intelligence & Applications (IJAIA), Vol. 1, No. 2, pp. 23-28, April 2010.
[15]. A. Ahmad and L. Dey, A k-mean clustering algorithm for mixed numeric and categorical data, Data and Knowledge Engineering, Elsevier, Vol. 63, pp. 503-527, 2007.
[16]. Ali Ilkhani and Golnoosh Abaee, Extracting Test Cases by Using Data Mining; Reducing the Cost of Testing, International Journal of Computer Information Systems and Industrial Management Applications, ISSN 2150-7988, Volume 3 (2011), pp. 730-737.
[17]. Agnar Aamodt and Enric Plaza, "Case-Based Reasoning: Foundational Issues, Methodological Variations, and System Approaches," Artificial Intelligence Communications 7 (1994): 1, 39-52.
[18]. Ahamed Shafeeq B M and Hareesha K S, Dynamic Clustering of Data with Modified K-Means Algorithm, 2012 International Conference on Information and Computer Networks (ICICN 2012), IPCSIT Vol. 27 (2012), IACSIT Press, Singapore, 221.
[19]. Wei Li, Modified K-means clustering algorithm, IEEE Computer Society Congress on Image and Signal Processing, 2008, pp. 618-621.
[20]. Ran Vijay Singh and M. P. S. Bhatia, Data Clustering with Modified K-means Algorithm, IEEE International Conference on Recent Trends in Information Technology, ICRTIT 2011, pp. 717-721.

Authors:
1. Anbarasu I completed a B.E. in Computer Science & Engineering at Sona College of Technology, Anna University Coimbatore, in 2011 and is pursuing an M.E. in Software Engineering at Sona College of Technology, Anna University Chennai.

2. Ms. S. Anitha Elavarasi is an Assistant Professor at Sona College of Technology, Salem. She completed her B.E. in Computer Science Engineering at Bharathiar University and her M.E. in Computer Science Engineering at Anna University, Chennai, and is currently pursuing her Ph.D. on cluster algorithms.
