Information Security Technology Based on DNA Computing

Guangzhao Cui (1), Limin Qin* (1), Yanfeng Wang (1,2), Xuncai Zhang (2)

(1) College of Electrical Information Engineering, Zhengzhou University of Light Industry, Zhengzhou 450002, China. Email: cgzh2008@163.com
(2) Department of Control Science and Engineering, Huazhong University of Sci. and Tech., Wuhan 430074, China. Email: qinlimin20008@163.com

Abstract: DNA computing is a new method that simulates the biomolecular structure of DNA and computes by means of molecular biology techniques. It introduces a brand-new data structure and method of calculation, offering a new approach to NP-complete problems. By harnessing the enormous parallelism and high storage density of biomolecules, it brings both challenges and opportunities to traditional cryptography. DNA cryptography is a new field of cryptography that has arisen from DNA computing research in recent years. It can realize several security technologies, such as encryption, steganography, signature and authentication, by using DNA molecules as the information medium. We first introduce the basic idea of DNA computing and then discuss information security technology based on DNA computing.

Keywords: DNA, DNA computing, DNA-based cryptography, information security technology

1 INTRODUCTION

DNA computing is a new method that simulates the biomolecular structure of DNA and computes by means of molecular biological technology; it is a novel and promising interdisciplinary field. In a pioneering study, Adleman used DNA to solve a directed Hamiltonian path problem [1], which indicated the feasibility of a molecular approach to combinatorial problems. Lipton extended this approach to another NP-complete problem, the satisfiability problem [2]. These elegant studies demonstrated how problems corresponding to Boolean formulas can be solved by a massively parallel processing procedure that makes use of the ability of DNA sequences to hybridize specifically to their complementary sequences. DNA computing has thus been proposed for difficult combinatorial search problems such as the Hamiltonian path problem, using its vast parallelism to search among a large number of candidate solutions represented by DNA strands. For example, DNA computing methods have been proposed for breaking the Data Encryption Standard (DES) [3, 4]. Because DNA computing is highly parallel, it can attack different parts of a computing problem simultaneously. While these methods may succeed on problems of fixed size, they are ultimately limited by their volume requirements, which increase exponentially with input size.

However, DNA computing has many further exciting applications beyond pure combinatorial search. For example, DNA and RNA are appealing media for data storage because very large amounts of data can be stored in a compact volume [5, 6]. They vastly exceed the storage capacities of conventional electronic, magnetic and optical media. A gram of DNA contains about 10^21 DNA bases, or about 10^8 terabytes. Hence, a few grams of DNA may have the potential to store all the data stored in the world [7, 8].
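The storage figure quoted above can be checked with a short calculation. The sketch below is only a back-of-the-envelope estimate: it assumes that each base carries 2 bits (four possible bases) and that a terabyte is 10^12 bytes; neither assumption is stated in the text, and the function name is chosen here for illustration.

```python
# Back-of-the-envelope check of the DNA storage density quoted in the text.
BASES_PER_GRAM = 10**21       # figure quoted in the text
BITS_PER_BASE = 2             # assumption: 4 bases -> log2(4) = 2 bits per base
BYTES_PER_TERABYTE = 10**12   # assumption: decimal terabytes


def terabytes_per_gram(bases_per_gram=BASES_PER_GRAM, bits_per_base=BITS_PER_BASE):
    """Return the raw capacity of one gram of DNA in terabytes."""
    total_bits = bases_per_gram * bits_per_base
    total_bytes = total_bits / 8
    return total_bytes / BYTES_PER_TERABYTE


print(f"{terabytes_per_gram():.1e} TB per gram")
```

Under these assumptions the result is on the order of 10^8 terabytes per gram, which agrees with the figure given in the text.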
Recent research has considered DNA as a medium for ultra-scale computation and for ultra-compact information storage [9, 10]. DNA cryptography is a newborn cryptographic field that emerged from research on DNA computing [11-14]; in it, DNA is used as the information carrier and modern biological technology as the implementation tool. The vast parallelism and extraordinary information density inherent in DNA molecules are explored for cryptographic purposes such as encryption, authentication and signature. One potential key application is DNA-based, molecular cryptography systems. DNA computing provides parallel processing at the molecular level and introduces a brand-new data structure and method of calculation, which poses challenges to traditional information security technology. Research in this area may eventually lead to new computers, new data storage systems and new cryptography systems, and may trigger a new information revolution.

2 DNA and DNA computing

2.1. DNA

DNA (deoxyribonucleic acid) is the genetic material of all forms of life. It is a biological macromolecule made up of nucleotides. In a DNA molecule, deoxyribonucleotides are joined into a polymer by phosphodiester bonds between the 5' hydroxyl of one deoxyribose and the 3' hydroxyl of the next, which leaves a free 5' end and a free 3' end. By convention, DNA molecules are directed polymers written from the 5' end to the 3' end. Each nucleotide contains a single base. There are four kinds of bases: adenine (A), thymine (T), cytosine (C) and guanine (G). DNA most commonly occurs in nature as the well-known double helix, but both single-stranded and double-stranded DNA fragments can be synthesized outside the cell. In a double-helix DNA molecule, the two strands are complementary in sequence, that is, A pairs with T and C pairs with G according to the Watson-Crick rules. The discovery of this structure is one of the greatest scientific discoveries of the 20th century. It reduced genetics to chemistry and laid the foundations for the next half century of biology.

1-4244-1035-5/07/$25.00 ©2007 IEEE.
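The Watson-Crick pairing rule described above can be illustrated in a few lines. This is a minimal sketch that treats strands as strings written 5' to 3'; the function names are chosen here for illustration, and real hybridization also depends on temperature, strand length and mismatches, which the sketch ignores.

```python
# Watson-Crick base pairing on strands written 5' -> 3'.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Base-by-base complement (A<->T, C<->G), in the same reading order."""
    return "".join(COMPLEMENT[base] for base in strand)


def reverse_complement(strand):
    """The partner strand, also written 5' -> 3' (the two strands are antiparallel)."""
    return complement(strand)[::-1]


def can_hybridize(strand_a, strand_b):
    """True if the two strands are perfectly complementary (idealized model)."""
    return strand_b == reverse_complement(strand_a)


print(reverse_complement("ATGC"))
```

The reversal reflects the directionality mentioned above: the two strands of a double helix run in opposite 5' to 3' directions.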
2.2. DNA computing

DNA computing is a novel method for solving intractable computational problems. In 1994, Adleman first demonstrated that a directed Hamiltonian path problem (HPP) could be encoded in DNA and solved. Since then, many researchers around the world have been attracted to the new field. In 1995, Lipton extended Adleman's idea to solve the satisfiability problem. In 1997, Ouyang presented a molecular biology-based experimental solution to the maximal clique problem [15]. In 2000, Liu designed a DNA computing model called surface-based DNA computing and used it to solve the satisfiability problem [16]. In 2001, Wu analyzed and improved the surface-based method [17]. In the same year, Benenson designed a programmable and autonomous computing machine made of biomolecules, on which a finite automaton can run [18]. Independently of the future technological success of DNA computing, this area has already led to interesting new computing paradigms, which have certainly enriched our understanding of the nature of computation [19].

To introduce the principles of DNA computing, we briefly review the model that Prof. Adleman used to solve a directed Hamiltonian path problem. The Hamiltonian path problem is to find a path in a directed graph that begins at v_in, ends at v_out and enters every other vertex exactly once. For each vertex i in the graph, a random 20-mer oligonucleotide DNA sequence was generated. The procedure for solving the directed Hamiltonian path problem is as follows:
(1) Generate random paths through the graph.
(2) Keep only those paths which begin with v_in and end with v_out.
(3) If the graph has n vertices, then keep only those paths which enter exactly n vertices.
(4) Keep only those paths which enter all of the vertices of the graph at least once.
(5) If any paths remain, say yes; otherwise say no.
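The five steps above can be mimicked in software as a generate-and-filter procedure. The sketch below is an in-silico illustration, not Adleman's laboratory protocol: step (1) is replaced by an exhaustive enumeration of walks, standing in for the random ligation of strands, and the graph, vertex labels and function names are assumptions made for this example. Vertices that appear in no edge are ignored.

```python
def generate_paths(edges, max_vertices):
    """Stand-in for step (1): enumerate every walk with 1..max_vertices vertices."""
    successors = {}
    for u, v in edges:
        successors.setdefault(u, []).append(v)
    vertices = {x for edge in edges for x in edge}
    frontier = [(v,) for v in sorted(vertices)]
    paths = list(frontier)
    for _ in range(max_vertices - 1):
        # Extend every walk by one edge, as ligation would extend a strand.
        frontier = [p + (w,) for p in frontier for w in successors.get(p[-1], [])]
        paths.extend(frontier)
    return paths


def hamiltonian_paths(edges, v_in, v_out):
    """Apply filtering steps (2)-(4); a non-empty result means 'yes' in step (5)."""
    vertices = {x for edge in edges for x in edge}
    n = len(vertices)
    pool = generate_paths(edges, n)                                # step (1)
    pool = [p for p in pool if p[0] == v_in and p[-1] == v_out]    # step (2)
    pool = [p for p in pool if len(p) == n]                        # step (3)
    pool = [p for p in pool if set(p) == vertices]                 # step (4)
    return pool                                                    # step (5)


# Hypothetical 4-vertex directed graph used only for illustration.
example_edges = [(0, 1), (1, 2), (2, 3), (0, 2), (1, 3)]
print(hamiltonian_paths(example_edges, v_in=0, v_out=3))
```

The size of the candidate pool grows exponentially with the number of vertices. A conventional computer pays for this in running time, whereas the DNA approach pays for it in the quantity of strands.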
Generally speaking, Adleman encoded all possible answers to the problem as DNA sequences and then removed the candidates that did not meet the requirements through a series of restrictive conditions, which finally left the solution. The algorithm for the directed Hamiltonian path problem shows how DNA computing differs from traditional computing: it exploits the massive parallelism and high information density of biomolecules. Adleman, the pioneer of DNA computing, reviewed the field as follows: "For thousands of years, humans have tried to enhance their inherent computational abilities using manufactured devices. Mechanical devices such as the abacus, the adding machine, and the tabulating machine were important advances. But it was only with the advent of electronic devices and, in particular, the electronic computer some 60 years ago that a qualitative threshold seems to have been passed and problems of considerable difficulty could be solved. It appears that a molecular device has now been used to pass this qualitative threshold for a second time."

The development of DNA cryptography benefits from the progress of DNA computing. On the one hand, cryptography is always related, to a greater or lesser degree, to the underlying computing model. On the other hand, some of the biological technologies used in DNA computation are also used in DNA cryptography.

3 The application of DNA computing in security field

3.1 DNA encryption techniques

It is well known that the security of a one-time pad is determined entirely by the randomness of its key, and that such a scheme is absolutely secure in theory. The scheme requires, first, that the key be completely random and, second, that the key never be reused. In practice, however, two basic difficulties stand in the way of a perfectly secure one-time pad: it is very difficult to produce a large random key that is as long as the plaintext, and the key must then be stored and distributed securely. These problems make the one-time pad impractical. DNA as an information carrier, however, has a very high storage density, one hundred billion to one thousand billion times that of commonly used disk storage. It is therefore better suited to producing and storing huge keys, and it smooths the road for the one-time pad. Prof. Gehani used this idea to propose a DNA-based one-time-pad mechanism and designed two encryption methods that use DNA sequences as one-time pads. The first method, which we call mapping substitution, translates fixed-length units of the DNA plaintext sequence into DNA ciphertext sequences according to a defined mapping table. The second, called the exclusive-or method, uses molecular biology techniques to perform an exclusive-or operation on the DNA plaintext and the key sequence. In theory, one-time-pad encryption with either method is absolutely secure. In this setting, however, it is crucial how the two communicating parties establish the encryption mapping table or the key carrier (the DNA material) and ensure that this material cannot be stolen or copied. How to perform error correction on DNA keys and preserve them over long periods is also a problem that has to be resolved.

Gehani also introduced DNA computing into asymmetric encryption. The idea is to use the massively parallel computing ability and unmatched storage capacity of DNA to strengthen a cryptosystem with more complex algorithms, which is a bold speculation. Essentially, Gehani's approach uses DNA as the information carrier and is an encoding and decoding technique built on traditional encryption algorithms. Adding more precise encoding information would increase the amount of information carried and allow larger and more complicated data structures to be realized.
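The exclusive-or variant of the one-time pad can be illustrated in software. The sketch below is a simulation under stated assumptions, not the molecular procedure described by Gehani: it assumes a 2-bit encoding of bases (A=00, C=01, G=10, T=11), which the text does not specify, and it performs the exclusive-or on strings rather than with biological techniques. All function names are chosen here for illustration.

```python
import secrets

BASES = "ACGT"  # assumed encoding: A=00, C=01, G=10, T=11


def bytes_to_dna(data):
    """Encode each byte as four bases, most significant bit pair first."""
    return "".join(BASES[(byte >> shift) & 0b11] for byte in data for shift in (6, 4, 2, 0))


def dna_to_bytes(strand):
    """Inverse of bytes_to_dna; the strand length must be a multiple of 4."""
    if len(strand) % 4:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def new_pad(n_bases):
    """Draw a fresh random pad; a one-time pad must never be reused."""
    return "".join(secrets.choice(BASES) for _ in range(n_bases))


def xor_strands(strand, pad):
    """Base-wise exclusive-or of two equal-length strands."""
    if len(strand) != len(pad):
        raise ValueError("pad must be exactly as long as the message")
    return "".join(BASES[BASES.index(x) ^ BASES.index(y)] for x, y in zip(strand, pad))


plaintext = bytes_to_dna(b"DNA")
pad = new_pad(len(plaintext))
ciphertext = xor_strands(plaintext, pad)
print(dna_to_bytes(xor_strands(ciphertext, pad)))  # applying the same pad again decrypts
```

Because exclusive-or is its own inverse, applying the same pad twice restores the plaintext. The two requirements named above appear directly in the code: the pad is random and exactly as long as the message.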
3.2. DNA steganography

With the development of biotechnology, the ways of transmitting DNA have become more numerous and more convenient. Advanced transmission methods not only reduce cost but also increase information security [20, 21]. DNA steganography adds a further layer of protection compared with encryption alone, which offers a novel idea for information security and a new direction for research. The principle of DNA steganography is to conceal the information to be protected among a large number of irrelevant DNA strands. Finding it is like looking for a needle in a haystack, which makes it difficult for attackers to identify the correct DNA fragment. Only the intended receiver can find the correct fragment, based on information agreed in advance between the two parties, and then extract the information concealed in it. One can argue that steganography is not actually encryption, since the plaintext is not encrypted but only disguised within other media.

The first DNA steganography experiment was carried out by Bancroft and colleagues, who applied the technique to a famous message from the Second World War and recovered it successfully. Using an alphabet of short, publicly known nucleic acid sequences, they encoded the plaintext in a DNA strand and added special marker sections to the ends of the strand. This strand was then mixed with other DNA strands of the same length. The mixture was divided into many microdots, and a single microdot contains on the order of a hundred million DNA molecules. Thus, even if attackers can determine which of the numerous microdots contains the information, picking out the one strand among a hundred million is still like looking for a needle in a haystack.
In fact, the key to deciphering the information lies in locating the special marker sections at the ends of the strand, which can be searched for using DNA computing methods. Once the strand has been identified by its markers, the receiver uses PCR to amplify it and obtains the information by decoding. With these methods, Bancroft and his colleagues successfully encoded and decoded the message "June 6 Invasion: Normandy".
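The hiding and retrieval steps can be sketched in software as follows. This is a simplified model under stated assumptions: the message is encoded with a hypothetical 2 bits per base scheme rather than the alphabet used in the original experiment, the two marker sequences are invented for this example, and string matching on the markers stands in for PCR amplification with the matching primers.

```python
import random

BASES = "ACGT"
FORWARD_MARKER = "ACGTTGCAACGGTTAACCGT"   # hypothetical secret marker, shared in advance
REVERSE_MARKER = "TTGACCGGTAACGTTGCAAT"   # hypothetical secret marker, shared in advance


def encode(text):
    """Assumed encoding: each ASCII byte becomes four bases (A=00, C=01, G=10, T=11)."""
    return "".join(BASES[(b >> s) & 0b11] for b in text.encode("ascii") for s in (6, 4, 2, 0))


def decode(strand):
    """Inverse of encode."""
    data = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        data.append(byte)
    return data.decode("ascii")


def hide(text, n_decoys, rng):
    """Flank the message with the markers and mix it with same-length random decoys."""
    secret = FORWARD_MARKER + encode(text) + REVERSE_MARKER
    pool = ["".join(rng.choice(BASES) for _ in range(len(secret))) for _ in range(n_decoys)]
    pool.append(secret)
    rng.shuffle(pool)
    return pool


def recover(pool):
    """Stand-in for PCR: select the strand carrying both markers, then decode it."""
    for strand in pool:
        if strand.startswith(FORWARD_MARKER) and strand.endswith(REVERSE_MARKER):
            return decode(strand[len(FORWARD_MARKER):len(strand) - len(REVERSE_MARKER)])
    return None


rng = random.Random(0)  # seeded only so the illustration is reproducible
pool = hide("June 6 Invasion: Normandy", n_decoys=1000, rng=rng)
print(recover(pool))
```

Without the markers an attacker would have to examine every strand in the pool, whereas the receiver selects the right one directly. The sketch uses a thousand decoys for speed; the text describes on the order of a hundred million strands per microdot.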
3.3. DNA certification

Strictly speaking, DNA certification does not involve much DNA computing; it mainly exploits the biological characteristics of DNA. DNA certification is currently applied widely in fields such as justice and finance, where it identifies biological individuals accurately. Clelland et al. also applied their methods to certification and security. In 2000, the DNA Technology Company of Canada used DNA sequences for product certification at the Sydney Olympic Games of that year. Nearly 50,000,000 souvenirs, from Olympic T-shirts to coffee cups, were marked with a special ink. The DNA segment in the ink was taken from the genome of an athlete chosen at random from nearly a hundred, which made it very difficult to forge. A portable scanner reading the ink mark could tell whether a souvenir was authentic. This was much cheaper than holographic trademarks and added only about five cents to the cost of each item.

If the basic principle of DNA steganography is applied to DNA identification, much wider certification becomes possible. A great deal of biological and genetic engineering is now under way. These technologies will allow researchers to add DNA certification information to the tissue of organisms and, by reading this information, to validate customer identity and copyright information. Of the DNA techniques discussed here, DNA certification is currently the most mature and the most widely applied. Introducing DNA computing into DNA steganography and certification would increase the complexity of the algorithms and raise the level of security.

4 The problems and prospects that DNA techniques face

Although DNA computing is a brand-new computing model, its theoretical model of computation does not escape the Turing framework. DNA computing is still at a theoretical stage: most computing models simply use molecular techniques to solve one particular problem and verify it in one experiment, and because different problems lead to different computing schemes, there is as yet no uniform computing and coding model. Under existing DNA computing models, the time complexity, unlike the space complexity, does not grow markedly with the computational complexity of the problem. That is, DNA computing merely converts time complexity into space complexity. Once the complexity of a problem exceeds the physical limit on the amount of DNA that biochemical techniques can handle, the problem is out of reach for DNA computing. Boneh spent nearly 4 months constructing the DES-1(E) solution, whereas the key quantity of the AES algorithm used by the US federal government is 21 times that of the DES algorithm. Therefore, following Boneh's approach, constructing an AES-1(E) solution would take several years. We can thus say that Boneh's method can only break symmetric systems with keys under 64 bits. Mathematical cryptography can easily increase the key length and thereby protect itself from a powerful attack by a DNA computer. Therefore, under existing DNA computing models, although DNA computers greatly improve the ability to break ciphers, they cannot pose a genuine threat to the security of cryptography.

DNA ciphers are a beneficial supplement to existing mathematical ciphers, and they are a good choice especially for encryption systems with low real-time requirements. DNA also has bright prospects in steganography and DNA certification, which provide one more layer of protection than encryption alone.
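The conversion of time complexity into space complexity discussed above can be made concrete with a rough estimate of how much DNA a brute-force key search would require. The sketch below is an order-of-magnitude illustration only. It assumes one strand per candidate key, a hypothetical strand length of 1000 bases and an average nucleotide mass of about 330 g/mol for single-stranded DNA; the key lengths of 56 bits for DES and 128 bits for AES are standard figures that are not given in the text.

```python
AVOGADRO = 6.02214076e23    # molecules per mole
NUCLEOTIDE_MASS = 330.0     # g/mol, approximate average for single-stranded DNA (assumption)


def strands_needed(key_bits):
    """One strand per candidate key: the search space doubles with every key bit."""
    return 2 ** key_bits


def dna_mass_grams(key_bits, bases_per_strand=1000):
    """Mass of a pool holding a single copy of every candidate strand.

    bases_per_strand is a hypothetical value chosen for illustration.
    """
    total_bases = strands_needed(key_bits) * bases_per_strand
    return total_bases * NUCLEOTIDE_MASS / AVOGADRO


for bits in (56, 64, 128):
    print(f"{bits}-bit keys: {strands_needed(bits):.2e} strands, "
          f"about {dna_mass_grams(bits):.2e} g of DNA")
```

Under these assumptions a 56-bit key space needs only a few hundredths of a gram of DNA, while a 128-bit key space needs on the order of 10^20 grams. This illustrates why simply lengthening the key defeats this style of attack.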
With the rapid development of modern biotechnology, once-costly biological experiments have become routine. The further development of biotechnology and the appearance of better DNA cipher designs will provide a new direction for information security research. However, the security, feasibility and stability of DNA cryptography still need further research.

REFERENCES
[1] Adleman L. Molecular computation of solutions to combinatorial problems[J]. Science, 1994, 266: 1021-1024.
[2] Lipton R J. Using DNA to solve NP-complete problems. Science, 1995, 268: 542-545.
[3] Boneh D, Dunworth C, Lipton R. Breaking DES using a molecular computer[R]. Technical Report CS-TR-489-95, Princeton University, 1995.
[4] Adleman L, et al. On applying molecular computation to the data encryption standard[C]// DNA Based Computers, Proc. of the 2nd Annu. Meet., E B Baum et al. eds. Princeton, NJ, 1996: 28-48.
[5] Clelland C T, Risca V, Bancroft C. Hiding messages in DNA microdots[J]. Nature, 1999, 399: 533-534.
[6] Cox J P L. Long-term data storage in DNA. Trends Biotechnol., 2001, 19: 247-250.
[7] Guangzhao Cui, et al. New Direction of Data Storage: DNA Molecular Storage Technology[J]. Computer Engineering and Applications, 2006, 42(26): 29-32.
[8] Jie Chen. A DNA-based, biomolecular cryptography design. ISCAS (3), 2003: 822-825.
[9] Amos M, Paun G, Rozenberg G. Topics in the theory of DNA computing[J]. Theoretical Computer Science, 2002, 287: 3-38.
[10] Guozhen Xiao, et al. New field of cryptography: DNA cryptography[J]. Chinese Science Bulletin, 2006, 51(10): 1139-1144.
[11] Gehani A, LaBean T H, Reif J H. DNA-based cryptography. DIMACS Series in Discrete Mathematics and Theoretical Computer Science, 2000, 54: 233-249.
[12] Leier A, Richter C, Banzhaf W, et al. Cryptography with DNA binary strands. Biosystems, 2000, 57(1): 13-22.
[13] Kartalopoulos S V. DNA-inspired cryptographic method in optical communications, authentication and data mimicking. Military Communications Conference, 2005, 2: 774-779.
[14] Kazuo T, Akimitsu O, Isao S. Public-key system using DNA as a one-way function for key distribution[J]. Biosystems, 2005, 81: 25-29.
[15] Ouyang Q, et al. DNA solution of the maximal clique problem. Science, 1997, 278: 542.
[16] Liu Q, et al. DNA computing on surfaces. Nature, 2000, 403: 175.
[17] Wu H. An improved surface-based method for DNA computation. Biosystems, 2001, 59: 1.
[18] Benenson Y, et al. Programmable and autonomous computing machine made of biomolecules. Nature, 2001, 414: 430.
[19] Ouyang Q, Kaplan P D, Liu S, et al. DNA solution of the maximal clique problem. Science, 1997, 278: 446-449.
[20] Kari L. DNA Computing: Arrival of Biological Mathematics[J]. The Mathematical Intelligencer, 1997, 19: 9-22.
[21] Kamei T, et al. DNA-containing inks and personal identification system using them without forgery. Jpn. Kokai Tokkyo Koho, 2002, 8 pp.
Kawai J, Hayashizaki Y. DNA book. Genome Res., 2003, 13: 1488-1495.
