The need for cryptography is historically linked to the business of spying and warmaking. If an illegal agent has to communicate with his embassy, he must do it in such a way that even though the message might fall into the police hands, he and the embassy staff must not be compromised. During any war, if two regiments have to communicate in order to synchronize an attack on enemy lines, they must make sure their message is not intercepted by the enemy; and if it is, at least that it is not understood. Since the beginning of the industrial revolution and capitalistic society, there is also one other widespread application of cryptography: that of protecting corporate secrets like manufacturing process descriptions or private business transactions. Even more recently, with the advent of a technologically advanced information infrastructure, sending messages has become very easy, but so has interception. Cryptography is nowadays needed by most everyone who uses electronic and digital communications.
The literal meaning of ``Cryptography'' is ``unintelligible writing'' - this is quite different from ``hidden writing'' (``steganography''). Cryptography is not about preventing the enemy from intercepting a message, but rather about preventing the enemy from being able to read the message. Thus, devices like invisible ink, tattoing the message on the scalp or any other form of information smuggling are not in the realm of cryptography.
In the field of cryptography, any message that is clearly readable by both ally and enemy is called a ``clear text message''. After the message has undergone the process of encryption it is called ``encrypted message''. In what follows, when we talk about ``clear text message'' we will always consider a message written in the english language with the 26-letters English alphabet (A B C D E F G H I J K L M N O P Q R S T U V W X Y Z), the ten arabic digits (0 1 2 3 4 5 6 7 8 9) and one punctuation sign - the word separator, the blank (or space)2. For simplicity we will suppress all other punctuation signs.
Having said that, any message which does not use this alphabet or this language is unintelligible and hence encrypted, but this is not considered cryptography. Hiring two australian Aborigines as radiotelegraphists for two armies that need to communicate could probably be very effective in terms of hiding information to the enemy, but only if the enemy itself was not in the position of hiring australian Aborigines radiotelegraphists. Cryptographic processes tend to be ``general'' - that is, to hold in every similar situation.
Since there are 26 letters, ten digits and one punctuation sign in our alphabet - that is, 37 symbols, we can permute them in any of 37! -1 ways (that is read ``thirtyseven factorial minus one'' and it means
1 × 2 × 3 × ... × 36 × 37 - 1, that is 13763753091226345046315979581580902399999999 ways). Why this number? Let's take it one step at a time. if we had only one letter in our alphabet, there would only be one possible permutation: the only letter is mapped to itself. This permutation is called ``identity''. If we had two letters in our alphabet (say A and B), we would have either the identity (A and B remain the same) or one other possible permutation (A becomes B and B becomes A). If our clear text message is ``ABBA'', the encrypted message using the identity is still ``ABBA'', whereas using the other permutation gathers ``BAAB''. It should now become clear why in counting letter permutations used for encryption we always have to subtract one from the total number: we never want to use the identity permutation. If we had three letters in our alphabet, we have a total of six permutations:
|
By way of an example of this kind of encryption, we take our alphabet,
ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_
we ``shift'' it to the left by one position putting the leftmost character (A) at the end, and we write the permutation:
BCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_A
We thus obtain the mapping of letters: A--> B, B--> C and so on. If the clear text message is
CAESAR IS ATTACKING
the encrypted message is
DBFTBSAJTABUUBDLJOH
This particular type of permutation is called a ``cyclic'' permutation because it just shifts the whole alphabet but it doesn't change the relative order of the letters, save for the first and last. In fact cyclic permutations are extremely easy to implement and they were used by the Romans in their conquest wars.
The mapping between the alphabet and the permutation which permits the encryption of the message is called the ``encryption key''. More generally, whatever permits the encryption of a message is an encryption key, and whatever permits the decryption of a message (that is, the process that transforms an encrypted message back to the clear text message) is called the ``decryption key''. In the case of letter cryptography these keys are the same: one needs the permutation in order to encrypt a message (as we have already seen); and to decrypt it, all one does is reading the mapping backwards. So the same information used for encryption is also used for decryption.
Since the point of encrypting a message is making sure it is not intelligible by anybody save those people who possess the decryption key, it is crucial that there is no way of obtaining the key by unauthorised means (this is called ``breaking a key''); or else, that it takes the enemy a longer time to break the key than it takes the allies to change the old key with a new one. In the case of letter cryptography by substitution it is actually quite easy to break any key in a very short time indeed. Substitution keys are usually broken by frequency analysis. This is based on the fact that the relative frequencies of the mapped letters does not change. Mapping the word ``attacking'' with the key given above, we got the word ``buubdljoh'', and we can observe that the repetition of ``a'' and ``t'' in the clear text word has been transformed into a repetition of ``b'' and ``u'', but the frequency structure of the encrypted word has not changed. All the enemy must do in order to perform frequency analysis is to compile frequency tables in which to every letter in the alphabet there corresponds the average frequency of appearance of that letter in the English language (or the language used to write the clear text message). Thereafter, the enemy only has to compute the frequency with which each letter appears in the encrypted message to find out what the encryption key is. With modern calculators working on average length messages, this takes a fraction of a second. And the longer the message, the more accurate the frequency tables turn out to be (length of message is not an issue: even it the encrypted text is forty billion letters long, the enemy need only perform frequency analysis on the first 500 or 1000 letters to retrieve the key; and once the key is known, the whole message can be decrypted very quickly).
The way to defy frequency analysis is to change the permutation used to encrypt the message for each letter in the clear text message. This is called ``one time pad'' cryptography because when it was first introduced after World War I each of the allies had to have a copy of one-time pads booklet. Each of the pads contained a key which was totally random and long enough to encrypt simple messages. By way of an example, if we have three different permutations:
BCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_A
CIBEF05DJKAM_OTQRSGLVWUXZH1234G6789NY
ACEGIKMOQSUWY0123987456_BDFHJLNPRTVXZ
we can safely encrypt (that is, safely from frequency analysis) clear text messages of three letters: the message MUM would become N_Y. As we can see, the frequency of the letter M (two thirds of the message) is not kept constant in the encrypted message; hence no frequency analysis can be performed. The trouble with one-time pad cryptography is that the key to be transmitted to the allies is three times as long as with simple permutation methods, and that is just to encrypt/decrypt three letters words! In general, in order to encrypt an n-letter message written in an m-letter alphabet in this fashion one needs to use keys which are n×m letters long! This is excessive: broadcasting keys needs a secure communication channel (if the key is intercepted the all messages can be understood), but the lack of secure channels is exactly the reason why cryptography is used. One-time pad cryptography is absolutely secure against frequency analysis but it is, in practice, very difficult to implement.
The difficult part, then, is how to produce a pair of keys (encryption/decryption) where the decryption key cannot be calculated just by knowing the encryption key. The mathematical operation involved to build such a pair must be a reversible operation (we need both keys) but one direction must be reasonably easy whereas the opposite direction must be extremely difficult. Such an operation can be provided by integer multiplication and its reverse, prime factorization. A ``prime number'' is an integer number that cannot be divided precisely by any other number save 1 and itself. Prime numbers are 2, 3, 5, 7, 11, 13, 17, 19, 23, 29 and so on. 21 is not prime because we can write 21 = 3×7. Multiplying two prime numbers is very fast even when the primes are very large. The converse, that is, given a number, finding two primes such that their product is the given number, is extremely difficult. For primes composed of 1000 digits or so, even using the fastest computer on earth could not calculate two primes given their product before a century of continuous computing.
Now we need an encryption scheme based on a product of primes and a decryption scheme based on the two prime factors. Such schemes have been first proposed in the '70s by three researchers: Riverst, Shamir and Adleman. Taking after their surname initials, the encryption/decryption algorithms have been named ``RSA''.
To understand how the RSA encryption scheme works, one has to
understand the basics of number theory and modular arithmetic. Apart
from the notion of prime number, explained above, there is also a
weaker notion of two numbers being ``coprime''. This means that the
largest integer dividing exactly both numbers is 1.
Let us now see some modular arithmetic: given two integers
n, m the notation n mod m (read ``n modulo m'') is the
remainder of the integer division of n by m. For example, if m = 5
we have 0 mod 5, 1 mod 5 = 1, 2 mod 5 = 2, 3 mod 5 = 3 and
4 mod 5 = 4 because every number smaller than 5 gives itself as a
remainder on division by 5. We then have 5 mod 5 = 0 as
[5/5] = 1 with remainder of 0. The following table illustrates
the results of the operation mod 5 for a few more values.
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
0 | 1 | 2 | 3 | 4 | 0 | 1 | 2 | 3 | 4 | 0 | 1 |
Let p and q two large primes (as large as it is needed to make the encryption secure), let N = p×q be their product, let n = (p-1)(q-1) and let e be any number coprime to n. The two numbers (e, N) form the public key. This is how the encryption process works: first we transform the clear text message in a number M (this can be done in a number of ways: the most natural - but not necessarily the most effective - is to assign a number to each letter of the alphabet, for example: A=10, B=11, C=12 and so on. The message ``CAB'' is then transformed in the number 121011), making sure that M is coprime to N (if it is not, it is always possible to add a few spurious letters to the clear text message in order that its numeric encoding M is coprime to N). We then compute E = Me modN: E is the encrypted message. There are two more requirements: that Me has to be larger3 than N, and that M has to be smaller than N. If Me turns out to be smaller than N, it can be multiplied by a fixed constant (remembering that the decryption process needs to be modified to take this into account). If M turns out to be bigger than N, it can be split in chunks and each of the chunk can be encrypted separately.
Now let us see the decryption process. The decryption key is given by d = e-1 mod n (recall n = (p-1)(q-1)). Of course to calculate d one needs to know what n is, or equivalently, what the numbers p and q are. In this sense the private decryption key is based on the large primes p and q, and knowing the public key (e, N) does not help in calculating p and q unless one can factor N into p and q (but if p and q are large enough, this is a very difficult task). The ally which receives the message E (encrypted using the ally's own public key) can decrypt the message using its own private key d by calculating D = Ed mod N. It is quite easy to show that D = M, and we include a sketch of the proof in a footnote4, to be skipped by the uninitiated.
Let us see an example of RSA encryption and decryption process. We shall use very small primes for simplicity, although this means that the encryption is not secure at all. Let p = 11 and q = 7. We have N = 77 and n = (11-1)(7-1) = 60. The encryption key e must be coprime to n, and we choose e = 7 (because the largest integer dividing exactly both 5 and 24 is 1). The decryption key d is given by 7-1 mod 60, that is, 43 (because 7×43 mod 60 = 301 mod 60 = 1). Suppose the ally A has to send a ``yes'' answer to a question another ally B had sent to it previously. The ally A wants to send B a ``Y'' in encrypted form. According to the mapping A=10, B=11, etcetera, Y is mapped to the number M = 35. To encode M, the ally A looks up the public key of ally B (7,77) and calculates 357 mod 77 = 7, thus the encrypted message is 7. Ally B, on receiving 7, sets out to decript it with its private key: all it has to do is calculating 743 mod 77, which is again 35. The same numeric code is used to transform 35 into the letter ``Y''.
RSA has been considered the most important breakthrough in cryptography because it dispenses for the need of a secure channel to communicate keys. Public key cryptography can be generalized to provide secure electronic signatures in a straightforward way: suppose allies B wants to be sure that the message really comes from ally A. After all anybody could have looked up the public key of ally B and encrypted a forged message. This is how it goes: ally A first encrypts the message using B's public key, and then it encrypts it again using its own private key (notice that encryption and decryption are just modular operations so a private decryption key can be used to encrypt a message as well as decrypt it). When B receives the doubly-encrypted message, it first decrypts it using A's public key: thus B is sure that it was really A sending the message, for nobody else could have had the private key corresponding to A's public key; and then it decrypts the outcome a second time using its own private key to retrieve the clear text message.
It has been said that RSA would be the ultimate cryptographic challenge because it offers scalable security. Whenever computers get so powerful that integers with thousands of digits can be factorized in a few seconds, allies can just pick primes p and q with millions of digits, and so on - the important thing is that multiplying p by q takes an amount of time proportional to the length in digits of p and q, whereas factoring their product without knowing p or q takes an amount of time proportional to an exponential of the length in digits. This means that the two orders of magnitudes are not even comparable. As long as nobody finds a method for factoring an integer which takes an amount of time proportional to the length in digits of the integer, RSA is theoretically secure.
The reason why this feature of quantum mechanics (superposition) can
help build a quantum computer capable of factoring integers quickly is
that it is in theory possible to build a device that changes the
amplitude of a state without actually observing the observable
entity. In short, superpositions can be manipolated to a certain
extent without causing the collapse of the superposition of states on
a definite state. The algorithm for quantum factorization was invented
by Peter Shor in 1995. It is based on a theorem of number theory that
states that one can factor an integer of N by finding the period of
the function f(x) = ax mod N, where a is any integer coprime to
N. Any function which has the same set of values repeated in the
same order over and over again as its variable varies is called
a ``periodic function'', and the length of the sequence of values
before the first repetition is called the period. For example, if
N = 6 and a = 2 we see that the function g(x) = 2x mod 15 has the
values:
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | |
1 | 2 | 4 | 8 | 1 | 2 | 4 | 8 | 1 | 2 |
|
|
|
|
This allows us to calculate the period of f: the algorithm just needs to be run a sufficient number of times to ensure a sufficient success confidence.
The reason why quantum computers can't be built yet is that it is extremely difficult to keep a physical quantum register stable for a period of time long enough to perform a similar calculation.
1 l.liberti@ic.ac.uk
2 Usually denoted with the symbol ``_''.
3 If this were not the case, we would have a situation m mod n where m < n and we have seen with the example modulo 5 above that in this case m mod n is just m; that is, the encrypted message would be the same as the clear text message, and we clearly do not want that.
4 Sketch of proof that D = M.
|