
A TAXONOMY OF PHISHING

By

Name___________________________

Introduction and Literature Review


The age of information technology was first launched in the US in 1968, when the ARPANET (Advanced Research Projects Agency Network) was developed to exchange information amongst defence laboratories. It was the ARPANET community that spawned Usenet, an early bulletin-board system, in 1979 (Martin et al., 2007). Usenet was used to share and exchange knowledge amongst various colleges. The development of personal computers in the 1970s gave a big boost to electronic communication: email usage grew rapidly, and chat facilities were introduced in 1988 (Saunders 2003). The introduction of the World Wide Web in the 1990s by Tim Berners-Lee and Robert Cailliau, and the development of browsers and websites, spawned a huge boom in use of the web. Domain registration saw an increase in personalized websites on the internet, and the web registered a phenomenal growth of 341,634% by 1995 (Perkins 2010). Search engines were released and a large number of browsers developed. Today the internet has become an integral part of daily life, both personal and professional. Its primary applications include the World Wide Web, email, data and file transfers, and chat (Rolf 2007). These applications are offered to millions of computers globally, interconnected through a large network.

The key to its popularity lies in its ability to change the way people communicate, do business, work, socialize, and access services and information. The internet continues to grow in terms of size, processing power and functionality, making it the most rapidly expanding technological innovation in the history of mankind (Zhang 2007). A measure of this growth is the fact that the ARPANET in its early years connected just 1,000 systems (Tally 2004). By 1987 there were 30,000 users on the internet. In 1995 this figure rose to 16 million people, comprising 0.4% of the world's population. The turn of the century, i.e. 2000, saw an incredible 479 million people, almost 8% of the world's population, hooked to the internet, and that figure rose to a phenomenal 2,110 million in June 2011, comprising 30.4% of the world's population (Salton et al., 2005).

When the internet was first conceived, it was in a user-friendly, neutral environment (Davies 2003). This meant it was transparent and fostered the development of a variety of applications. It did not incorporate security measures, since the possibility of any form of attack was not considered at the time (Conta et al. 1998). Since its inception, however, the internet has been a conduit through which various attacks are perpetrated on users, particularly those who use financial services. From the AOL account robberies of the 1990s onwards, identity theft and large-scale fraud against institutions such as banks and their customers have become more common (Bradner 2006). The annual financial loss in the US alone on account of these attacks is estimated to be USD 3 billion (Davies 2003). The loss is not limited to money alone: classified information and personal identity data are also vulnerable to being leaked through unauthorized access. These attacks result in network breakdowns, fraud, identity theft and industrial sabotage. The attendant damages include loss of trust, erosion of brand value, costly lawsuits and reduced usage of online systems and service offerings. These attacks are called hacking, and the perpetrators are called hackers or crackers because of their ability to hack into systems by cracking their security codes. There are different variants of these attacks, with varied levels of technological sophistication; one common feature, however, is that they are perpetrated through electronic media, i.e. they are primarily technical in nature. Organizations have developed several mechanisms to counter these attacks, including anti-phishing and antivirus tools, intrusion detectors and firewalls. These measures have attained a fair degree of success in combating technical attacks. The increasing success of complex anti-hacking technological infrastructure has led hackers to employ less technical means to conduct their attacks.
This is done by attacking the human element in the target environment which, when combined with technology, has resulted in a new variety of attack called social engineering. Social engineering has been variously defined as: a non-technical kind of intrusion that relies heavily on human interaction and often involves tricking other people into breaking normal security procedures; and the practice of deceiving someone, either in person, over the phone, or using a computer, with the express intent of breaching some level of security, whether personal or professional.

Social engineering has also been defined as a collection of techniques used to manipulate people into performing actions or divulging confidential information. From the above definitions, social engineering can be summarized as the method of attacking computers using data gleaned from social interactions. This data includes passwords, PINs and access codes to an organization's infrastructure; the targets include telecommunication companies, banks and other financial institutions, military installations, government agencies, industries and hospitals. Hackers use this data to penetrate the security measures installed by the target institutions. While security features are functionally dependent on technology, their success is a measure of the trust they generate in the authenticity of the protection provided. Hackers therefore target the weakest link in an organization's armour: its people. Social engineers are technically proficient people who also possess very good social skills. They work on certain attributes of the process by which human beings make decisions, called cognitive biases. Also called bugs in the human hardware, these all rest on the fundamental human predispositions to trust, to be helpful and to be liked. This tendency to trust is then exploited in different ways to generate attack vectors. Hackers gain trust by manipulating people into connecting to various emotions, such as avarice, sympathy or fear, and then divulging sensitive information. A successfully perpetrated social engineering attack nullifies large investments in security infrastructure. There are many variants by which people are manipulated. However, confusion still exists amongst practitioners as to what exactly constitutes a socially engineered attack, leading to varied descriptions. This results in an inability to devise adequate security measures: a virus can be described as a worm, and vice versa, by two different organizations, yet the security measures for a virus differ from those for a worm.
Hence systems cannot be protected adequately unless attacks are identified and classified in a scientific manner; hence the need for a taxonomy of social engineering. Taxonomy is derived from the Greek words taxis (arrangement) and nomia (method). Taxonomy may thus be defined as the art and science of classification. The purpose of a taxonomy of social engineering is to provide a consistent, structured method to classify and identify attacks. It incorporates previous information to categorize attacks so that similarities can be made out and suitable anti-hacking measures for computers and networks devised. This enables an organization to successfully combat new variations of attacks. It is simultaneously all-encompassing and specific, taking into account all aspects of hacking to identify a specific attack vector.

A good taxonomy comprises some essential features. These include:

Acceptability: the taxonomy must be accepted by the community at large.
Comprehensibility: its language and concepts should be easily understood.
Completeness: it should exhaustively consider all aspects of all possible attacks and thus provide full categorization.
Clarity: terms must be clearly defined.
Determinism: the process of categorization must be described in an unambiguous manner.
Mutual exclusivity: there should be no overlap in categorization; any one attack must fall into exactly one category.
Repeatability: categorizations should be repeatable.
Terminology: the terms used should be contemporary and relevant, which prevents confusion.
Utility: the taxonomy should be usable by industry to recognize and prevent attacks.

While the goal of a good taxonomy is to incorporate all of the above, it may not be necessary to include all of them in any particular taxonomy.

Existing Taxonomies

The Protection Analysis (PA) taxonomy of Bisbey and Hollingworth and the Research in Secured Operating Systems (RISOS) taxonomy of Abbot are both based on vulnerabilities. They categorized faults in security systems and devised similar classification methods. Both suffer from ambiguity in their definitions, thus violating the mutual exclusivity principle that makes for a good taxonomy. The primary utility of these two taxonomies is the foundation they provided for more advanced variants.

Bishop's Vulnerability Taxonomy: Bishop evaluated the PA and RISOS taxonomies and used the same basis of susceptibilities to construct his own vulnerability taxonomy. His main contribution was to classify vulnerabilities along six axes: the nature, time, gains, effect, number and source of the attack. This method replaced the hitherto flat method of classification. He also postulated the requirements of a good taxonomy.

Howard's Taxonomy

This taxonomy is based on the processes used to perpetrate attacks. It classifies attacks according to five elements involved in an attack: the hackers, the implements used, access, effects and motivation. The primary contribution of this taxonomy is its all-encompassing approach, covering every aspect of an attack. However, it is not mutually exclusive; the hackers category, for example, can contain overlap, i.e. a terrorist can also be a spy.

Lough's VERDICT Taxonomy

This is the Validation Exposure Randomness Deallocation Improper Conditions Taxonomy (VERDICT). It focuses on four characteristics of a hacking attempt:

Improper Validation: a lack of authentication mechanisms results in unauthorized access to systems.
Improper Exposure: a system is exposed to attack and not adequately protected.
Improper Randomness: attacks can occur at random.
Improper Deallocation: confidential data is available for copying due to faulty deletion.

The VERDICT taxonomy, while all-encompassing in concept, does not allow for the identification of specific cases. In fact, while the above taxonomies fulfil most of the conditions that make for good taxonomies, they are too general in nature. Hence the need for a taxonomy that is practical and specific, and that adequately fulfils the utility criterion of a good taxonomy. This serves the end goal of this dissertation, which is to provide a pragmatic approach for organizations to detect and prevent a socially engineered attack during the daily course of events. There are three stages of a social engineering attack:

1) Preparatory Stage
2) Implementation or Attack Stage
3) Post-Attack Stage
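To make the multi-axis style of classification discussed above concrete (e.g. Bishop's six axes of nature, time, gains, effect, number and source), an attack record can be sketched as a simple data structure. This is an illustrative sketch in Python rather than the dissertation's own tooling, and every field value below is hypothetical.

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class AttackRecord:
    """One attack classified along six Bishop-style axes.

    The axis names follow the taxonomy summarized above; the example
    values are purely illustrative, not drawn from a real incident.
    """
    nature: str   # kind of flaw exploited, e.g. improper validation
    time: str     # when in the system's life cycle the attack occurs
    gains: str    # what the attacker obtains
    effect: str   # impact on the target system
    number: int   # number of components involved
    source: str   # origin of the attack

example = AttackRecord(
    nature="improper validation",
    time="operation",
    gains="user credentials",
    effect="unauthorized access",
    number=1,
    source="external",
)
print(asdict(example)["source"])  # external
```

Because each axis is an independent field, two attacks can share a value on one axis while differing on another, which is precisely what a flat classification cannot express.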

The taxonomy proposed in this report will focus on the Implementation or Attack Stage, as this is specific to social engineering; any attack, even a non-socially-engineered one, requires a preparatory and a post-attack stage. For an attack to be classified as socially engineered, it must follow the Social Engineering Cycle, which consists of four distinct stages:

1) Research: during this stage, the hacker conducts a background search on the organization and its people to find those who potentially hold classified data and who are vulnerable to attack.

2) Development of Trust: all humans are predisposed to trust. The hacker exploits this tendency through a series of social interactions and communications that gain the victim's trust. This is done with persuasive techniques that appeal either to the victim's ability to reason or to human emotions such as avarice, sympathy and fear. Persuasion through reasoning is called the central route, while persuasion through emotion is called the peripheral route. The aim of gaining the victim's trust is to get them to respond.

3) Exploitation of Trust: the victim is then coaxed into divulging sensitive, classified information to the supposedly trustworthy entity, i.e. the hacker. This information typically includes passwords, user IDs, PINs and access codes.

4) Utilization of Trust: the hacker then utilizes this data to access services available to the genuine user. These can be both financial and non-financial in nature; the former includes the actual siphoning of funds and the purchase of expensive goods, while the latter includes unauthorized access to an organization's infrastructure, such as a data bank.

All social engineering attacks can be divided into two broad categories:

1) Human-based social engineering (man to man)
2) Technology-based social engineering (man to man via media)

Two other methods of obtaining classified information are open-source research (shoulder surfing) and covert searches (dumpster diving). However, these two methods do not rely on any form of social interaction, which is a primary requirement for an attack to be classified as socially engineered. While the human-based approach involves face-to-face interactions that exploit a person's possible ignorance and natural inclination to trust, the technology-based approach exploits users through communications via contemporary media.
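The two broad categories above can be captured in a minimal lookup structure. This is an illustrative Python sketch; the technique names listed under each category are hypothetical examples, not an exhaustive enumeration.

```python
from typing import Optional

# Minimal sketch of the two broad categories of social engineering
# described above; the leaf techniques are illustrative only.
SOCIAL_ENGINEERING_CATEGORIES = {
    "human_based": ["in-person impersonation", "pretext phone call"],
    "technology_based": ["phishing email", "spoofed login page"],
}

def category_of(technique: str) -> Optional[str]:
    """Return the broad category a technique falls under, if any."""
    for category, techniques in SOCIAL_ENGINEERING_CATEGORIES.items():
        if technique in techniques:
            return category
    # Methods with no social interaction (e.g. dumpster diving)
    # fall outside both categories.
    return None

print(category_of("phishing email"))  # technology_based
```

Returning None for techniques such as dumpster diving mirrors the point made above: without social interaction, an attack does not qualify as socially engineered at all.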

This brings us to the fundamental research question of this dissertation: how does a taxonomy of phishing aid in identifying phishing attacks and hence in the development of effective anti-phishing mechanisms?

Aim of the Dissertation

This dissertation aims to construct a taxonomy of phishing attacks based on the social engineering techniques used for their perpetration. The study will provide insight into the nature of these attacks, thereby aiding the development of detective and preventive mechanisms.

Objectives

To construct a taxonomy of phishing based on the use of social engineering techniques
To identify the current taxonomies and discuss their advantages and disadvantages
To experimentally demonstrate the effectiveness of social engineering techniques
To educate, alert and sensitize the public to the dangers of phishing and the insidious methods used for its perpetration
To use the insights gained from the above study to devise suitable anti-phishing mechanisms

Research Flow

Chapter 1 Introduction: this chapter introduces the reader to the phenomenon of phishing. The research question, aims and objectives of the dissertation are discussed here.

Chapter 2 Literature Review: this chapter is a compendium of extant literature on various taxonomies of phishing. Their advantages and disadvantages are highlighted, after which phishing attacks are classified based on their method of perpetration. The different techniques are elaborated upon.

Chapter 3 Methodology: this chapter is an exposé of the quantitative and qualitative techniques used to conduct the experiment demonstrating the effectiveness of the social engineering techniques employed in phishing attacks.

Chapter 4 Findings: this chapter elaborates on the experiment and its findings.

Chapter 5 Analysis and Discussion: the findings are analyzed using descriptive statistics. The deductions arrived at are discussed to show how successful social engineering techniques are in deceiving the larger public.

Chapter 6 Conclusion, Limitations, Future Scope: this chapter summarizes the entire dissertation. Aspects of phishing not yet discussed are highlighted and left open for future study.

Methodology

This chapter critically evaluates the methodologies that will achieve our research objectives.

Research Approaches

Research methodologies can be classified as inductive or deductive, and quantitative or qualitative (Florencio & Herley 2007). Deductive methodologies apply a general postulate or hypothesis to specific individual cases, while inductive methodologies start with individual cases and arrive at a hypothesis or postulate; that is, deductive methodologies move from the general to the specific, while inductive methodologies move from the specific to the general (Gutmann 2007). An inductive approach starts with experiments and observation under scientific conditions, resulting in the collection of primary data. This data is then subjected to empirical analysis and general theories or hypotheses are arrived at. This method is numbers-based and hence quantitative in nature (Litan 2009). The deductive approach employs descriptive, theoretical data for its analysis and is hence qualitative in nature.

Research Strategies

The approach taken in this dissertation is both inductive and deductive in nature. The deductive approach entails a detailed study of the available literature, including books, research articles and reputed journals. The purpose of this study is to identify the various taxonomies used to classify phishing attacks and to highlight their advantages and disadvantages. The author will then propose a method of classification based on the social engineering techniques used to perpetrate these attacks, with a detailed explanation of the methodology of each attack. Insofar as this method is theoretical and descriptive, it is qualitative in nature.

The inductive approach involves the perpetration of a simulated attack on the Facebook accounts of a select audience. The findings of each attack, in terms of ease of deception, the effectiveness of the social engineering techniques used, etc., will be analyzed and discussed. Insofar as this method is experimental in nature and uses numbers and analysis to arrive at conclusions, it is quantitative in nature. The research tools used will include the following:

Creation of a simulated phishing environment using Perl and web scripting on a Facebook page. Spoofed email addresses will be used to send typical phishing messages to an identified target audience. The message will be socially engineered and will contain a link directing users to a webpage with the same look and feel as a Facebook login page. The page will prompt them to enter their login IDs and passwords, and the success rate of these attempts will then be measured. The data from the experiment will be collected and analyzed using descriptive univariate statistics in Excel.

PERL Scripting and Facebook

Perl is a multipurpose, feature-rich programming language capable of running on over 100 platforms and of interfacing with several third-party modules (Jackson & Simon 2007). Its web development capabilities will be used to create a phisher site with the same look and feel as a Facebook login page. It was chosen because it is freely available on the net, can be configured on most platforms, and can be used by anybody with basic web scripting knowledge. Facebook accounts were chosen because real phishers frequently attack social networking sites such as LinkedIn, MySpace and Orkut, as well as Facebook, to steal personal information which can then be used maliciously (Fette & Sadeh 2006). Facebook, being one of the most popular social networking sites, is frequently subject to phishing attacks; hence it was chosen for this experiment.

Population Sample

Sampling involves choosing, from the general population, a select representative group relevant to the research at hand (Allan & Heiser, 2006) and administering the experiment to them. According to Bellovin (1995), budget and time constraints have to be considered when selecting a sampling method. Sampling methods fall into two broad categories: in probability sampling, every member of the population has a known chance of selection, while in non-probability sampling, selection is based on the researcher's judgment or on convenience (Bignell 2006).

To identify the sample for this experiment, a non-probability (purposive) sampling method will be used. This consists of identifying, from amongst the larger population, those individuals whose responses would most suit our research purposes (Dasgupta & Chatha 2006). This approach also allows for quick collection of data, since the sample audience is limited (Dierks & Rescorla 2006). The sample will be drawn from amongst friends, associates and their references, since it is obviously not possible to contact everyone in the research population. Only a select 30 people, identified on account of their willingness to participate in the experiment, will be chosen. The strategy to be used will not be disclosed to the participants, to prevent any pre-emption or increased alertness that would result in biased responses.
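Drawing a fixed group of 30 participants from a larger pool of willing contacts can be sketched as follows. This is an illustrative Python sketch: the pool, its size and the seed are hypothetical stand-ins, since the real pool will consist of friends and associates who have agreed to take part.

```python
import random

def draw_sample(pool, k=30, seed=2011):
    """Draw k distinct participants from a pool of willing contacts.

    `pool` and `seed` are illustrative; fixing the seed simply makes
    the draw repeatable for the write-up.
    """
    if k > len(pool):
        raise ValueError("sample size exceeds the available pool")
    return random.Random(seed).sample(pool, k)

# Hypothetical pool of 85 contacts who agreed to participate.
pool = [f"contact_{i:03d}" for i in range(85)]
participants = draw_sample(pool)
print(len(participants))  # 30
```

Note that `random.sample` returns distinct elements, matching the requirement that the 30 participants be 30 different people.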

Permission to participate in the experiment will be sought through an email sent from the author's own ID; only after receiving confirmation will the spoofed email be sent.

Descriptive Statistical Analysis Tool

Univariate analysis, or frequency count, is the simplest form of quantitative statistical analysis; it summarizes individual variables in a given data set (Franklin 2007). The responses from the target audience will be entered into a summary Excel sheet and then subjected to this analysis. Given that the responses are limited to whether or not a respondent actually logged in, the univariate method of analysis was thought fit.
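The univariate frequency count described above amounts to tallying each outcome and expressing it as a share of all responses. A minimal Python sketch, using made-up response data rather than real experimental results:

```python
from collections import Counter

# Hypothetical outcomes: did each contacted participant log in
# on the spoofed page or not? (Illustrative data only.)
responses = ["logged_in", "did_not_log_in", "logged_in",
             "did_not_log_in", "logged_in", "did_not_log_in",
             "did_not_log_in", "logged_in"]

counts = Counter(responses)                 # absolute frequencies
total = sum(counts.values())
percentages = {outcome: 100.0 * n / total   # relative frequencies
               for outcome, n in counts.items()}

print(counts["logged_in"], percentages["logged_in"])  # 4 50.0
```

The same tally-and-percentage computation is what the Excel summary sheet will produce for the real responses; because there is a single variable with two outcomes, nothing beyond this univariate summary is required.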

System Requirements

Hardware Requirements:

PROCESSOR : Pentium IV, 2.6 GHz
RAM : 512 MB
MONITOR : 15" colour
HARD DISK : 20 GB
KEYBOARD : standard 102 keys
NETWORK : internet access

Software Requirements:

FRONT END : Facebook connectivity
BACK END : MS Excel, Perl scripting, HTML scripting
OPERATING SYSTEM : Windows XP

Research Ethics

Ethical considerations must inform every stage of the research process to ensure transparency, reliability and confidentiality, and to present original rather than plagiarized material. All the ethical guidelines issued by Sheffield Hallam University will be followed throughout the research, and compliance with the university's IT policy will be maintained at all times. Only those who confirm their participation in the experiment will form the target audience of 30.

Risk Assessment and Limitations

The main limitation of this research is that the methods used involve a degree of subjectivity (Fielding et al., 1999). The risks incurred, and the mitigation mechanisms put in place to counter them, will be discussed.

Approximate Timeline (August to December 2011)

Phase 1: Collection of books, journals and other literature
Phase 2: Writing of the introduction, research question, aims and objectives, and questionnaire; first submission
Phase 3: Obtain feedback, incorporate it and submit the revised proposal
Phase 4: Identify the target audience, obtain their permission and send them the spoofed mails
Phase 5: Collect and analyze the data
Phase 6: Finish the dissertation and submit

References

ABAD, C. (2005). The Economy of Phishing: A Survey of the Operations of the Phishing Market. New York: Cloudmark.
ADIDA, B., HOHENBERGER and RIVEST, R.L. (2005). Fighting Phishing Attacks: A Lightweight Trust Architecture for Detecting Spoofed Emails. New York: ACM Press.
ALLAN, A. and HEISER, J. (2006). State of the Art for Online Consumer Authentication. New York: Gartner.
ALSAID, A. and MITCHELL, C.J. (2006). Preventing Phishing Attacks Using Trusted Computing Technology. Plymouth: ACM Press.
ANTI-PHISHING WORKING GROUP (2003). Proposed Solutions to Address the Threat of Email Spoofing Scams. Available: http://www.antiphishing.org. Last accessed: 19 September 2011.
BARROSO, D. (2007). Botnets: The Silent Threat. European Network and Information Security Agency (ENISA).
BELLOVIN, S.M. (1995). Using the Domain Name System for System Break-ins. Proceedings of the Fifth USENIX UNIX Security Symposium. New York.
BIGNELL, K.D. (2006). Authentication in an Internet Banking Environment: Towards Developing a Strategy for Fraud Detection. International Conference on Internet Surveillance and Protection. Plymouth.
CHRISTODORESCU, M. and JHA, S. (2004). Testing Malware Detectors. New York: ISSTA.
CLAYTON, R. (2005). Who'd Phish from the Summit of Kilimanjaro? ISBN 978-3-540-26656-3.
CLAYTON, R. (2005). A Chat at the Old Phishin' Hole. Lecture Notes in Computer Science. Stockholm: Springer-Verlag.
CLOSE, T. (2004). Trust Management for Humans. Waterken Technical Report.
CRANOR, L. and EGELMAN, S. (2006). Phinding Phish: An Evaluation of Anti-Phishing Toolbars. CyLab Technical Report. New York.
DAPENG, J. (2007). Personal Firewall Usability: A Survey. New York: McGraw-Hill.
DASGUPTA, P. and CHATHA, K. (2006). Personal Authenticators: Identity Assurance under the Viral Threat Model. New York: ACM Press.
DELPHA, L. and RASHID, M. (2004). Smartphone Security Issues. Black Hat Briefings, Europe.
DEPARTMENT OF DEFENSE (1985). Trusted Computer System Evaluation Criteria. DoD 5200.28-STD. In the glossary under the entry Trusted Computing Base (TCB). New York.
DHAMIJA, R., TYGAR, J.D. and HEARST, M. (2006). Why Phishing Works. New York: McGraw-Hill.
DIERKS, T. and RESCORLA, E. (2006). The Transport Layer Security Protocol. New York: RFC.
DYNES, S., BRECHBUHL, H. and JOHNSON, M.E. (2009). Information Security in the Extended Enterprise: Some Initial Results from a Field Study of an Industrial Firm. Harvard University. Available: http://infosecon.net/workshop/pdf/51.pdf. Last accessed: 18 June 2011.
EGELMAN, S. and CRANOR, F. (2008). You've Been Warned: An Empirical Study of the Effectiveness of Web Browser Phishing Warnings. New York: ACM Press.
EMIGH, A. (2011). Online Identity Theft: Phishing Technology, Chokepoints and Countermeasures. Identity Theft Technology Council. Available: http://www.antiphishing.org/phishing-dhs-report.pdf. Last accessed: 19 June 2011.
FETTE, I. and SADEH, N. (2006). Learning to Detect Phishing Emails. Carnegie Mellon Cyber Laboratory Technical Report. New York.
FLORENCIO, D. and HERLEY, C. (2007). A Large-Scale Study of Web Password Habits. New York: ACM Press.
FRANKLIN, J. (2007). An Inquiry into the Nature and Causes of the Wealth of Internet Miscreants. New York: ACM Press.
FIELDING, R., GETTYS, J. and MOGUL, J. (1999). Hypertext Transfer Protocol: Request for Comments. ACM Press.
GUHRING, P. (2007). Concepts Against Man-in-the-Browser Attacks. Financial Cryptography. New York: ACM Press.
GUTMANN, P. (2007). Phishing Tips and Techniques. Cambridge.
HALLAWELL, A. and LITAN, A. (2007). Brand Monitoring and Anti-Phishing Services Intersect Several Security Markets. Gartner Publications.
JACKSON, C. and SIMON, D.R. (2007). An Evaluation of Extended Validation and Picture-in-Picture Phishing Attacks. New York: USEC.
JAKOBSSON, M. (2007). Phishing and Countermeasures: Understanding the Increasing Problem of Electronic Identity Theft. Wiley. ISBN: 978-0-471-78245-2.
JAKOBSSON, M. (2005). Modeling and Preventing Phishing Attacks. Phishing Panel of Financial Cryptography. New York.
JAKOBSSON, M. and RATKIEWICZ, J. (2006). Designing Ethical Phishing Experiments: A Study of ROT13 Query Features. International Conference on World Wide Web.
JAMES, L. (2005). Phishing Exposed. Syngress. ISBN: 978-1-597-49030-6.
KAMINSKY, D. (2004). Black Ops of DNS. Black Hat Briefings.
LITAN, A. (2009). The War on Phishing Is Far from Over. New York: Gartner Group Report.
MCMILLAN, R. and GARTNER (2006). Consumers to Lose $2.8 Billion to Phishers in 2006. New York: Network World.
MOORE, T. and CLAYTON, R. (2009). Evil Searching: Compromise and Recompromise of Internet Hosts for Phishing. Barbados.
MORRIS, R.T. (1985). A Weakness in the 4.2BSD UNIX TCP/IP Software. Computing Science Technical Report. AT&T Bell Laboratories.
MUTTON, P. (2006). PayPal Security Flaw Allows Identity Theft. New York: Netcraft.
OLLMAN, G. (2004). The Phishing Guide: Understanding and Preventing Phishing Attacks. NGS Software Insight Security Research. New York.
PARNO, B. and KUO, C. (2006). Phoolproof Phishing Prevention. Available: http://sparrow.eco.cmu.edu/-adrian/projects/;phishign.pdf. Last accessed: 15 July 2011.
PHILIPPSOHN, S. (2001). Trends in Cybercrime: An Overview of Current Financial Crimes on the Internet. Computers & Security Publications. New York.
ROWE, R. and GALLAHER, M.P. (2009). Private Sector Cyber Security Investment: An Empirical Analysis. Cambridge. Available: http://weis2006.econinfosec.org/docs/18.pdf. Last accessed: 10 June 2009.
SALTON, G. and MCGILL, M.J. (1986). Introduction to Modern Information Retrieval. New York: McGraw-Hill.
TALLY, G., THOMAS, R. and VAN VLECK, T. (2004). Anti-Phishing: Best Practices for Institutions and Consumers. New York: McAfee.
ZHANG, Y. and HONG, J. (2007). A Content-Based Approach to Detecting Phishing Web Sites. New York: McGraw-Hill.
