
CAPTCHA TECHNIQUE FOR PHISHING ATTACK


NIKHIL V. AGARWAL
ABSTRACT
Phishing is a technique used by crackers (non-ethical hackers) to break into e-mail accounts, social-networking accounts, online banking accounts and so on. In a phishing attack, a duplicate web page is created to look like the original web page in order to fool account users. The cracker sends a link to the targeted user's e-mail account; if the user clicks on the link and enters account details such as the account ID and password, the cracker captures them. The targeted user then loses personal information such as project tenders and office documents. Phishing is attractive to crackers because they do not use their own systems to crack the accounts. Cybercrime branches find it difficult to track the cracker's position (i.e. the IP address of the system) immediately, and it takes a long time to trace the system used by the cracker. Cyber laws are not strong enough to penalize the crackers. It is therefore beneficial for us to use countermeasures against phishing attacks. Techniques such as strong website authentication, Captcha, mail-server authentication and many others can be used to avoid phishing.

Keywords: Captcha, Cyberlaws, Phishing, Cybercrime.

1. INTRODUCTION
Since the term phishing was first recorded in 1996, in connection with hunting for free AOL accounts, phishing has been increasing over the years. It quickly evolved into financial fraud, as criminals always aim for high yield. Fortunately, with the spread of online banking, the banking industry is motivated to play a leading role in fighting the phishing threat. However, the reported loss to Internet crime such as phishing has broken its record each year, which tells us that we are still looking for a better solution.

CAPTCHA is the use of a hard AI problem to tell humans and bots apart; it originally evolved from visual authentication and identification. The primary use of CAPTCHA is to fight automated bots in account registration and click fraud. It can also be used to authenticate a group of people sharing common knowledge or abilities. In fact, visual human-verifiable techniques are vulnerable to man-in-the-middle (MITM) attacks. Also, a careless CAPTCHA implementation can cause the application to fail in its mission. CAPTCHA alone does nothing to defend against a MITM attack; such visual security depends on the user's awareness and cannot actually authenticate the other end. Motivated by mitigating MITM attacks, we propose the Extended CAPTCHA Input System (E-CIS), which can withstand the described RT-MITM attack. By combining CAPTCHA and a One Time Password (OTP), E-CIS can authenticate a specific person, which can be used in a secure online banking login scenario.

2. PHISHING
In the field of computer security, Phishing is the criminally fraudulent process of attempting to acquire sensitive information such as usernames, passwords and credit card details, by masquerading
as a trustworthy entity in an electronic communication. A Phishing message is typically a fraudulent e-mail that attempts to get you to divulge personal data that can then be used for illegitimate purposes. There are many variations on this scheme. It is possible to phish for other information in addition to usernames and passwords, such as credit card numbers, bank account numbers, social security numbers and mothers' maiden names. Phishing presents direct risks through the use of stolen credentials, and an indirect risk to institutions that conduct business online through erosion of customer confidence. The damage caused by Phishing ranges from denial of access to e-mail to substantial financial loss.

The simplified flow of information in a Phishing attack:
1. A deceptive message is sent from the Phisher to the user.
2. The user provides confidential information to a Phishing server (normally after some interaction with the server).
3. The Phisher obtains the confidential information from the server.
4. The confidential information is used to impersonate the user.
5. The Phisher obtains illicit monetary gain.

2.1 PHISHING TECHNIQUES
Phishers use a wide variety of techniques, with one common thread.

LINK MANIPULATION
Most methods of Phishing use some form of technical deception designed to make a link in an e-mail appear to belong to the spoofed organization. Misspelled URLs or the use of subdomains are common tricks used by Phishers. In the example http://www.yourbank.example.com/, it appears as though the URL will take you to the example section of the yourbank website; actually this URL points to the "yourbank" (i.e. Phishing) section of the example website.

An old method of spoofing used links containing the '@' symbol, originally intended as a way to include a username and password. For example, http://www.google.com@members.tripod.com/ might deceive a casual observer into believing that it will open a page on www.google.com, whereas it actually directs the browser to a page on members.tripod.com, using a username of www.google.com: the page opens normally, regardless of the username supplied.
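As a hedged illustration of the two link tricks above, the following Python sketch uses the standard library's urllib.parse.urlsplit to show which host a browser would actually contact. The helper name describe_link is hypothetical, and taking the last two labels of the host as the "registered domain" is a simplifying assumption that is wrong for suffixes such as .co.uk; a real filter would consult the Public Suffix List.

```python
from urllib.parse import urlsplit


def describe_link(url):
    """Return the parts of a URL that decide where the browser really goes."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    labels = host.split(".")
    # Simplifying assumption: the registered domain is the last two labels.
    registered = ".".join(labels[-2:])
    return {
        "host": host,                  # the server that is actually contacted
        "userinfo": parts.username,    # text before '@', NOT the destination
        "registered_domain": registered,
    }


if __name__ == "__main__":
    for link in ("http://www.yourbank.example.com/",
                 "http://www.google.com@members.tripod.com/"):
        print(link, describe_link(link))
```

For the '@' example the reported host is members.tripod.com and www.google.com appears only as userinfo; for the subdomain example the registered domain is example.com, not yourbank.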

FILTER EVASION
Phishers have used images instead of text to make it harder for anti-Phishing filters to detect text commonly used in Phishing e-mails.

WEBSITE FORGERY
Once a victim visits the Phishing website, the deception is not over. Some Phishing scams use JavaScript commands to alter the address bar. This is done either by placing a picture of a legitimate URL over the address bar, or by closing the original address bar and opening a new one with the legitimate URL.

3 CAPTCHA
Overview: You're trying to sign up for a free email service offered by Gmail or Yahoo. Before you can submit your application, you first have to pass a test. It's not a hard test; in fact, that's the point. For you, the test should be simple and straightforward. But for a computer, the test should be almost impossible to solve. This sort of test is a CAPTCHA. CAPTCHAs are also known as a type of Human Interaction Proof (HIP). You've probably seen CAPTCHA tests on lots of Web sites. The most common form of CAPTCHA is an image of several distorted letters. It's your job to type the correct series of letters into a form. If your letters match the ones in the distorted image, you pass the test.

CAPTCHA is short for Completely Automated Public Turing test to tell Computers and Humans Apart. The term "CAPTCHA" was coined in 2000 by Luis von Ahn, Manuel Blum, Nicholas J. Hopper (all of Carnegie Mellon University) and John Langford (then of IBM). CAPTCHAs are challenge-response tests to ensure that the users are indeed human. The purpose of a CAPTCHA is to block form submissions from spam bots, that is, automated scripts that harvest email addresses from publicly available web forms. A common kind of CAPTCHA used on most websites requires the users to enter the string of characters that appear in a distorted form on the screen. CAPTCHAs are used because it is difficult for computers to extract the text from such a distorted image, whereas it is relatively easy for a human to understand the text hidden behind the distortions. Therefore, the correct response to a CAPTCHA challenge is assumed to come from a human, and the user is permitted into the website.
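The challenge-response flow described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a production CAPTCHA: the function names new_challenge and verify, the in-memory _pending store and the six-character length are all hypothetical, and the step that renders the text as a distorted image is left out entirely.

```python
import hmac
import secrets
import string

_ALPHABET = string.ascii_uppercase + string.digits
_pending = {}  # token -> expected answer (assumed in-memory store)


def new_challenge(length=6):
    """Create a random challenge; the text would be shown as a distorted image."""
    text = "".join(secrets.choice(_ALPHABET) for _ in range(length))
    token = secrets.token_urlsafe(16)
    _pending[token] = text
    return token, text


def verify(token, response):
    """Check a response once; the challenge is consumed whether or not it matches."""
    expected = _pending.pop(token, None)
    if expected is None:
        return False
    answer = response.strip().upper()
    # Constant-time comparison of the two byte strings.
    return hmac.compare_digest(expected.encode(), answer.encode())
```

Each challenge is single use: a second call to verify with the same token fails, so a bot cannot keep guessing against one image.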

3.1 Types of CAPTCHAs
CAPTCHAs are classified based on what is distorted and presented as a challenge to the user. They are:

Text CAPTCHAs: These are simple to implement. The simplest yet novel approach is to present the user with some questions which only a human user can solve. An example of such a question is: What is twenty minus three?

Such questions are very easy for a human user to solve, but it is very difficult to program a computer to solve them. These are also friendly to people with visual disabilities such as color blindness. Other text CAPTCHAs involve text distortions, and the user is asked to identify the hidden text. The various implementations are:

Gimpy: Gimpy is a very reliable text CAPTCHA built by CMU in collaboration with Yahoo for their Messenger service. Gimpy is based on the human ability to read extremely distorted text and the inability of computer programs to do the same. Gimpy works by choosing ten words randomly from a dictionary and displaying them in a distorted and overlapped manner. Gimpy then asks the users to enter a subset of the words in the image. The human user is capable of identifying the words correctly, whereas a computer program cannot do so.

Ez Gimpy: This is a simplified version of the Gimpy CAPTCHA, adopted by Yahoo in their signup page. Ez Gimpy randomly picks a single word from a dictionary and applies distortion to the text. The user is then asked to identify the text correctly.

Fig 3.2 Yahoo's Ez Gimpy CAPTCHA

BaffleText: This was developed by Henry Baird at University of California at Berkeley. This is a variation of Gimpy. It does not contain dictionary words; instead it picks random letters to create a nonsense but pronounceable text. Distortions are then added to this text and the user is challenged to guess the right word. This technique overcomes a drawback of the Gimpy CAPTCHA: because Gimpy uses dictionary words, clever bots could be designed to check the dictionary for the matching word by brute force.

Fig 3.3 BaffleText

MSN Captcha: Microsoft uses a different CAPTCHA for services provided under the MSN umbrella. These are popularly called MSN Passport CAPTCHAs. They use eight characters, made up of upper-case letters and digits. The foreground is dark blue and the background is grey. Warping is used to distort the characters and produce a ripple effect, which makes computer recognition very difficult.
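To make the BaffleText idea of a nonsense but pronounceable, non-dictionary word concrete, here is a small Python sketch. It is only an illustration under assumptions: alternating consonants and vowels is a stand-in chosen for simplicity and is not claimed to be BaffleText's actual generation method, and the name nonsense_word and the tiny example dictionary are hypothetical.

```python
import secrets

CONSONANTS = "bcdfghjklmnprstvwz"
VOWELS = "aeiou"


def nonsense_word(syllables=3, dictionary=frozenset()):
    """Build a consonant-vowel word and reject it if it is a real word."""
    while True:
        word = "".join(
            secrets.choice(CONSONANTS) + secrets.choice(VOWELS)
            for _ in range(syllables)
        )
        # Rejecting dictionary words is what defeats the brute-force
        # dictionary lookup that works against Gimpy-style challenges.
        if word not in dictionary:
            return word


if __name__ == "__main__":
    print(nonsense_word(3, dictionary={"banana", "tomato"}))
```

The distortion step is again omitted; the sketch only covers the choice of challenge text.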

Fig 3.4 MSN Passport CAPTCHA (sample strings: XTNM5YRE, L9D28229B)

Graphic CAPTCHAs: Graphic CAPTCHAs are challenges that involve pictures or objects that have some sort of similarity that the users have to guess. They are visual puzzles, similar to Mensa tests. The computer generates the puzzles and grades the answers, but is itself unable to solve them.

Audio CAPTCHAs: The final example we offer is based on sound. The program picks a word or a sequence of numbers at random, renders the word or the numbers into a sound clip and distorts the sound clip; it then presents the distorted sound clip to the user and asks the user to enter its contents. This CAPTCHA is based on the difference in ability between humans and computers in recognizing spoken language. Nancy Chan of the City University in Hong Kong was the first to implement a sound-based system of this type. The idea is that a human is able to efficiently disregard the distortion and interpret the characters being read out, while software would struggle with the distortion being applied and would need to be effective at speech-to-text translation in order to be successful. This is a crude way to filter humans, and it is not so popular because the user has to understand the language and the accent in which the sound clip is recorded.

3.2 APPLICATIONS
CAPTCHAs are used in various Web applications to identify human users and to restrict access to them. Some of them are:

Online Polls: Bots can wreak havoc on any unprotected online poll. They might create a large number of votes, which would then falsely put the poll winner in the spotlight. This also results in decreased faith in these polls. CAPTCHAs can be used in websites that have embedded polls to protect them from being accessed by bots, and hence improve the reliability of the polls.

Protecting Web Registration: Several companies offer free email and other services. Until recently, these service providers suffered from a serious problem: bots. These bots would take advantage of the service and sign up for a large number of accounts. This often created problems in account management and also increased the burden on their servers. CAPTCHAs can effectively be used to filter out the bots and ensure that only human users are allowed to create accounts.

Preventing comment spam: Most bloggers are familiar with programs that submit large numbers of automated posts with the intention of increasing the search engine rank of a site. CAPTCHAs can be used before a post is submitted to ensure that only human users can create posts. A CAPTCHA won't stop someone who is determined to post a rude message or harass an administrator, but it will help prevent bots from posting messages automatically.

Email spam: CAPTCHAs also present a plausible solution to the problem of spam emails. All we have to do is use a CAPTCHA challenge to verify that a human has indeed sent the email.

Search engine bots: It is sometimes desirable to keep web pages unindexed to prevent others from finding them easily. There is an HTML tag to prevent search engine bots from reading web pages. The tag, however, doesn't guarantee that bots won't read a web page; it only serves to say "no bots, please." Search engine bots, since they usually belong to large companies, respect web pages that don't want to allow them in. However, in order to truly guarantee that bots won't enter a web site, CAPTCHAs are needed.

E-Ticketing: Ticket brokers like Ticketmaster also use CAPTCHA applications. These applications help prevent ticket scalpers from bombarding the service with massive ticket purchases for big events. Without some sort of filter, it's possible for a scalper to use a bot to place hundreds or thousands of ticket orders in a matter of seconds. Legitimate customers become victims as events sell out minutes after tickets become available. Scalpers then try to sell the tickets above face value. While CAPTCHA applications don't prevent scalping, they do make it more difficult to scalp tickets on a large scale.

Improve Artificial Intelligence (AI) technology: Luis von Ahn of Carnegie Mellon University is one of the inventors of CAPTCHA. In a 2006 lecture, von Ahn talked about the relationship between things like CAPTCHA and the field of artificial intelligence (AI). Because CAPTCHA is a barrier between spammers or hackers and their goal, these people have dedicated time and energy toward breaking CAPTCHAs. Their successes mean that machines are getting more sophisticated. Every time someone figures out how to teach a machine to defeat a CAPTCHA, we move one step closer to artificial intelligence.

3.3 Breaking CAPTCHAs
The challenge in breaking a CAPTCHA isn't figuring out what a message says; after all, humans should have at least an 80 percent success rate. The really hard task is teaching
a computer how to process information in a way similar to how humans think. In many cases, people who break CAPTCHAs concentrate not on making computers smarter, but on reducing the complexity of the problem posed by the CAPTCHA.

Let's assume you've protected an online form using a CAPTCHA that displays English words. The application warps the font slightly, stretching and bending the letters in unpredictable ways. In addition, the CAPTCHA includes a randomly generated background behind the word. A programmer wishing to break this CAPTCHA could approach the problem in phases. He or she would need to write an algorithm, a set of instructions that directs a machine to follow a certain series of steps. In this scenario, one step might be to convert the image to grayscale. That means the application removes all the color from the image, taking away one of the levels of obfuscation the CAPTCHA employs. Next, the algorithm might tell the computer to detect patterns in the black and white image. The program compares each pattern to a normal letter, looking for matches. If the program can only match a few of the letters, it might cross-reference those letters with a database of English words. Then it would plug likely candidates into the submit field. This approach can be surprisingly effective. It might not work 100 percent of the time, but it can work often enough to be worthwhile to spammers.

More complex CAPTCHAs like Gimpy display 10 English words with warped fonts across an irregular background. The CAPTCHA arranges the words in pairs, and the words of each pair overlap one another. Users have to type in three correct words in order to move forward.

As it turns out, with the right CAPTCHA-cracking algorithm, this approach is not terribly reliable. Greg Mori and Jitendra Malik published a paper detailing their approach to cracking the Gimpy version of CAPTCHA. One thing that helped them was that the Gimpy approach uses actual words rather than random strings of letters and numbers. With this in mind, Mori and Malik designed an algorithm that tried to identify words by examining the beginning and end of the string of letters. They also used Gimpy's 500-word dictionary.

Mori and Malik ran a series of tests using their algorithm. They found that their algorithm could correctly identify the words in a Gimpy CAPTCHA 33 percent of the time. While that's far from perfect, it's also significant. Spammers can afford to have only one-third of their attempts succeed if they set bots to break CAPTCHAs several hundred times every minute.

4 MAN IN THE MIDDLE ATTACK

In a Man-In-The-Middle (MITM) attack, the attacker:
1) Eavesdrops on and intercepts all messages going between the victims;
2) Relays messages between them.
In short, a MITM attack makes the victims believe that they are talking to each other over a direct connection, with no indication that a middleman exists.

One famous MITM attack on a public-key cryptographic algorithm is the attack on the initial version of the Diffie-Hellman algorithm of 1976. It was fixed by the Authenticated Key Exchange (AKE) version in 1992, in which Diffie et al. combined the use of digital signatures and random numbers to authenticate both end parties. The lesson is that a secure protocol without actual authentication risks suffering from a MITM attack.

MITM can also take place visually, at the user interface layer. Schneier described an RT-MITM attack at the user interface layer in 2005, which can defeat a two-factor secure token.

The Control Relaying Man-In-The-Middle (CR-MITM) attack is a remote attack that can capture and relay user inputs without local Trojan assistance, and it can possibly defeat a CAPTCHA authentication system.

Fig 4.1: Control Relaying Man-In-The-Middle Attack (CR-MITM)

The visual interface of a CAPTCHA authentication system can be relayed. A hacker can employ a Remote Terminal Service which projects the hacker's browser content to a Remote Desktop Client running in the victim's browser, so that the victim's input to the CAPTCHA authentication system is processed directly on the hacker's browser in real time. The above is also true for a Trojan-compromised scenario, but our CR-MITM attack can capture and relay user inputs remotely without local Trojan assistance. After the bank server verifies the user credentials, the hacker gains access to online banking.

MITIGATION
We can start from the root of the problem. Generally, to avoid MITM we can use hardware or a trusted platform to perform destination validation by cryptographic means. However, this is always costly, and trusted platforms are still not widely deployed. As the hypothesis of the CR-MITM attack is based on the victim's lack of awareness and on visual interface relaying, a design of the application that suppresses those can possibly mitigate CR-MITM.

5 EXTENDING THE IDEA OF CAPTCHA FOR AUTHENTICATION
Given users' general ignorance of CA certificate validation warnings, it seems there is no way to guarantee the security of online banking without a costly full hardware solution. Indeed, securing online banking by
CAPTCHA is worth developing, as it is human verifiable and friendlier to users than cryptographic methods, and its ability to distinguish between human and bot can raise the cost of automated bot attacks.

Motivated by the analysis of the BEA CAPTCHA Input System, which was defeated by RT-MITM, we further design an Extended CAPTCHA Input System (E-CIS) for the login process. It aims to mitigate the flaws in BEA's design, and hence it can defend against the described RT-MITM attack.

Considering that CAPTCHA fails because of its relay-able property, in our design the E-CIS will not be easily relayed and exploited by a hacker. By combining OTP with a moving CAPTCHA, the E-CIS requires the owner of the OTP security token to input the OTP by solving the relevant CAPTCHA digits. We further propose several non-relay-able properties for the E-CIS application.

The One Time Password is based on time synchronization between the authentication server and the client. Calibration and customization may be needed for the E-CIS scenario.

The input method should be specially designed against key-loggers and mouse-loggers, so that the input credential is kept secret from the attacker. It should also resist visual relaying, so that it can further avoid Trojan screen capturing and human resolver attacks.

The trick is to ensure that the hacker cannot automate the login by relaying the CAPTCHA to be solved by the victim. Even if the hacker finally receives the OTP answer, he still has to input the OTP into E-CIS manually. This second manual input costs extra time, and by then the OTP is no longer valid, because it expires after the client's first manual input. In the end, the attacker with a timed-out OTP cannot gain access to the banking service.

5.1 Defending RT-MITM by Extended-CIS

HYPOTHESIS: By utilizing OTP and setting an input authentication factor that is valid only for a short time, allowing only one manual input by the legitimate user, the extra time induced by relaying the login parameters in the RT-MITM scenario means the attacker will not be able to gain access. We assume that CAPTCHA is not understandable by a computer, or at least takes significant processing time to be understood by a computer, and that a human resolver also takes time to recognize the CAPTCHA.

PROCEDURE: Client login procedure through E-CIS:

1) The client connects to the Bank Server by HTTP over SSL and requests a logon input page.
2) The Bank generates an E-CIS on the fly with a unique pre-shared secret and a CAPTCHA challenge (C), then sends it to the user. The E-CIS performs a Reverse Turing Test (RTT) utilizing visual CAPTCHA:
   RTT_CAPTCHA {C}
3) The E-CIS makes a new HTTPS connection to the bank server using the built-in CertBank and the destination IP address of the Bank server.
4) The client inputs his user ID and password, and inputs the OTP by clicking with the mouse on the floating CAPTCHA digits in the E-CIS frame.
5) The selected numbers are sent back to the Bank (B) in the form of time and coordinates (T, Cr), encrypted with the pre-shared secret key (SK):
   E-CIS -> B : Enc_SK {(T, Cr)_i}
6) The Bank verifies the OTP by decrypting the cipher with the pre-shared secret key (SK):
   OTP_i = Dec_SK {Enc_SK {(T, Cr)_i}}
7) If the check passes, the Bank signals the E-CIS to switch to Transaction mode.
8) Online banking transactions are carried out in the E-CIS application, just like in a virtual browser.

LIMITATION AND DRAWBACKS

INHERIT CAPTCHA PROPERTIES: The use of E-CIS inherits the user acceptance issues of CAPTCHA systems, e.g. a visual CAPTCHA is not feasible for blind users. We do not discuss these here.

PRACTICAL ISSUE: Our E-CIS demonstration is based on a time-synchronous OTP and on the human input time within its valid period. Human input time varies between users, e.g. older users input more slowly and younger users input faster, so it is difficult to set a single valid time for all users. To make it practical, there should be an initial calibration that customizes the E-CIS valid time for each user, by taking the average input time of the first few logins.

6 ACHIEVEMENTS
E-CIS is immune to key loggers, mouse loggers, information relaying and session hijacking by its properties. It links OTP input with the human OTP owner by combining CAPTCHA and time restriction. It can mitigate the described RT-MITM attack, which threatens CAPTCHA and two-factor authentication systems.

Confidentiality is achieved by combining CAPTCHA input time with the One Time Password time restriction, since the OTP is only valid up to the first manual input time induced by the human security token owner.

A unique, independent, stateful E-CIS application: an attacker cannot bypass the CAPTCHA challenge, hijack the session, or gain an advantage by decompiling and analysing the application.
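The time restriction that these properties rely on can be sketched in Python as follows. This is a minimal sketch under assumptions, not the E-CIS implementation: the names calibrate_window and EcisLogin, the margin factor and all the timing values are hypothetical, and the CAPTCHA rendering and the Enc_SK/Dec_SK steps are left out. It only shows two ideas from the text: a per-user validity window calibrated from the first few logins, and an OTP that is consumed by the first manual input.

```python
import hmac
from dataclasses import dataclass


def calibrate_window(input_times, margin=1.5):
    """Average the first few observed input times (seconds), with some slack.

    The margin factor is an assumption; the text only mentions averaging.
    """
    return margin * sum(input_times) / len(input_times)


@dataclass
class EcisLogin:
    issued_at: float     # time the E-CIS page was issued (seconds)
    valid_for: float     # per-user validity window (seconds)
    expected_otp: str    # OTP the bank expects for this login
    used: bool = False   # set once the first manual input arrives

    def submit(self, otp: str, now: float) -> bool:
        """Accept only the first input, and only inside the time window."""
        if self.used:
            return False
        self.used = True
        on_time = 0 <= now - self.issued_at <= self.valid_for
        matches = hmac.compare_digest(self.expected_otp.encode(), otp.encode())
        return on_time and matches
```

With a window calibrated from input times of 10, 12 and 14 seconds and a margin of 1.0, a legitimate input after 8 seconds is accepted, while a relayed input that arrives after 20 seconds, or any second input on the same login, is rejected.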

9 CONCLUSIONS
In this work, we reviewed Man-In-The-Middle (MITM) attacks which can even defeat CAPTCHA phishing protection.

To mitigate the above MITM attacks, we designed an Extended CAPTCHA Input System (E-CIS), which for the first time enables a CAPTCHA system to authenticate a specific human by combining the use of OTP and its time restriction; the design of E-CIS makes it highly resistant to information relaying attacks. The E-CIS is software based and needs no installation, so it is feasible to deploy widely compared to costly hardware. Our solution reuses the OTP tokens that have already been shipped at large scale, which can save a huge amount of money compared with redesigning and shipping a new hardware solution.

We hope this work will encourage other attempts to optimize the CAPTCHA Input System, or even to find a better candidate CAPTCHA type and its corresponding One Time Password.

Indeed, for practical issues in synchronization and calibration, we can also consider other forms of OTP delivery such as SMS, for which the timing factor may be more deterministic. Other forms of CAPTCHA challenge can also be considered, as the spirit of E-CIS is to combine the property of CAPTCHA that it can only be solved by a human with the time restriction of OTP, so that the E-CIS application can resist automated MITM attacks as well as human-assisted attacks.

10 REFERENCES
[1] C.-M. Leung, Visual security is feeble for Anti-Phishing, in ICASID'09: IEEE International Conference on Anti-counterfeiting, Security, and Identification in Communication. IEEE, Aug. 2009. [Online]. Available: http://sites.google.com/site/lcmkov/
[2] T.-L. Chang, Captcha based one-time password authentication system, Master's Thesis, Graduate Institute of Information Engineering, Feng Chia University, Taiwan, Jul 2006.
[3] S. Saklikar and S. Saha, Public key-embedded graphic captchas, Consumer Communications and Networking Conference, 2008. CCNC 2008. 5th IEEE, pp. 262-266, 10-12 Jan. 2008.
[4] A. Spalka, A. B. Cremers, and H. Langweg, Trojan horse attacks on