本节内容包含两部分：TJ-AATA Chapt. 7与TC-CLRS Chapt. 31 (Con'd)。

## TJ-AATA 7.1 Private Key Cryptography

In *single* or *private key cryptosystems* the same key is used for both encrypting and decrypting messages.

*Cryptanalysis* is concerned with deciphering a received or intercepted message. Methods from probability and statistics are great aids in deciphering an intercepted message; for example, the frequency analysis of the characters appearing in the intercepted message often makes its dectyption possible.

Simple shift codes are examples of *monoalphabetic cryptosystems*. In these ciphers a character in the enciphered message represents exactly one character in the original message. Such cryptosystems are not very sophisticated and are quite easy to break.

Suppose that the encoding function is given by
$$f(p) = ap + b \mod 26,$$
the decoding function $f^{-1}$ exists when $a$ has an inverse or, equivalently, when $\gcd(a, 26) = 1$. In this case,
$$f^{-1}(p) = a^{-1}p - a^{-1}b \mod 26.$$
Such a cryptosystem is called an *affine cryptosystem*.

A cryptosystem would be more secure if a ciphertext letter could represent more than one plaintext letter. To give an example of this type of cryptosystem, called a *polyalphabetic cryptosystem*, we will generalize affine codes by using matrices. We can store a pair of letters $p_1$ and $p_2$ in a vector
$$\boldsymbol{p} = \begin{pmatrix}p_1 \\ p_2\end{pmatrix}.$$

Let $A$ be a $2 \times 2$ invertible matrix with entries in $\mathbb{Z}_{26}$. We can define an encoding function by
$$f(\boldsymbol{p}) = A\boldsymbol{p} + \boldsymbol{b},$$
where $\boldsymbol{b}$ is a fixed column vector and matrix operations are performed in $\mathbb{Z}_{26}$. The decoding function must be
$$f^{-1}(\boldsymbol{p}) = A^{-1}\boldsymbol{p} - A^{-1}\boldsymbol{b}.$$

## TJ-AATA 7.2 Public Key Cryptography

### RSA Cryptosystem

**The RSA Cryptosystem** is based on the difficulty of factoring large numbers. It works as follows.

Suppose that we choose two random 150-digit prime numbers $p$ and $q$. Next, we compute the product $n = pq$ and also compute $\phi(n) = m = (p-1)(q-1)$, where $\phi$ is the Euler $\phi$-function. Now we start choosing random integers $E$ until we find one that is relatively prime to $m$; that is, we choose $E$ such that $\gcd(E, m) = 1$. Using the Euclidean algorithm, we can find a numebr $D$ such that $DE \equiv 1 \mod m$. The numbers $n$ and $E$ are now made public.

Suppose now that person $B$ (Bob) wishes to send person $A$ (Alice) a message over a public line. Since $E$ and $n$ are known to everyone, anyone can encode messages. Bob first digitizes the message according to some scheme, say $A = 00, B = 02, \dots, Z = 25$. If necessary, he will break the message into pieces such that each piece is a positive integer less than $n$. Suppose $x$ is one of the pieces. Bob forms the number $y = x^E \mod n$ and sends $y$ to Alice. For Alice to recover $x$, she need only compute $x = y^D \mod n$. Only Alice knows $D$.

Now lets us examine why the RSA cryptosystem works. We know that $DE \equiv 1 \mod m$; hence there exists a $k$ such that
$$DE = km + 1 = k \phi(n) + 1.$$
There are two cases to consider. In the first case assume that $\gcd(x, n) = 1$. Then by **Theorem 6.18 (Euler's Theorem)**,
$$y^D = \left( x^E \right)^D = x^{DE} = x^{km+1} = \left(x^{\phi(n)}\right)^k x = 1^kx = x \mod n.$$
So we see that Alice recovers the original message $x$ when she computes $y^D \mod n$.

For the other case, assume $\gcd(x, n) \neq 1$. Since $n = pq$ and $x < n$, we know $x$ is a multiple of $p$ or a multiple of $q$, but not both. We will describe the first possibility only, since the second is entirely similar. There is then an integer $r$, with $r < q$ and $x = rp$. Note that we have $\gcd(x, q) = 1$ and that $m = \phi(n) = (p-1)(q-1) = \phi(p)\phi(q)$. Then, using Theorem 6.18, but now mod $q$,
$$x^{km} = x^{k\phi(p)\phi(q)} = 1^{k\phi(p)} = 1 \mod q.$$
So there is an integer $t$ such that $x^{km} = 1 + tq$. Thus, Alice also recovers the message in this case,
$$y^D = x^{km+1} = x^{km}x = (1+tq)x = x + tq(rp) = x + trn = x \mod n.$$

We can now ask how one would go about breaking the RSA cryptosystem. To find $D$ given $n$ and $E$, we simply need to factor $n$ and solve for $D$ by using the Euclidean algorithm.

### Message Verification

There is a problem of message verification in public key cryptosystems. Since the encoding key is public knowledge, anyone has the ability to send an encoded message. If Alice receives a message from Bob, she would like to be able to verify that it was Bob who actually sent the message.

Suppose that Bob's encrypting key is $(n', E')$ and his decrypting key is $(n', D')$. Also, suppose that Alice's encrypting key is $(n, E)$ and her decrypting key is $(n, D)$. Since encryption keys are public information, they can exchange coded messages at their convenience. Bob wishes to assure Alice that the message he is sending is authentic. Before Bob sends the message $x$ to Alice, he decrypts $x$ with his own key:
$$x' = x^{D'} \mod n'.$$
Anyone can change $x'$ back to $x$ just by encryption, but only Bob has the ability to form $x'$. Now Bob encrypts $x'$ with Alice's encryption key to form
$$y' = x'^{E} \mod n,$$
a message that only Alice can decode. Alice decodes the message and then encodes the result with Bob's key to read the original message, a message that could have only been sent by Bob.

## TC-CLRS 31.9 Integer Factorization

Factoring an integer is to decompose it into a product of primes.

### Pollard's Rho Heuristic

Trial division by all integers up to $R$ is guaranteed to factor completely any number up to $R^2$. For tha same amount of work, the following precedure, Pollard-Rho, factors any number up to $R^4$ (unless we are unlucky). Since the procedure is only a heuristic, neither its running time nor its success is guarenteed, although the procedure is highly effective in practice. Another advantage of the procedure is that it uses only a constant number of memory locations.

The pseudocode is written using subscripted variable $x_i$ for clarity, but the program works the same if all of subscripts are dropped, since only the most recent value of $x_i$ needs to be maintained. With this modification, the procedure uses only a constant number of memory locations.

We have good reason to expect Pollard-Rho to print a factor $p$ of $n$ after $\Theta(\sqrt{p})$ iterations of the **while** loop. Thus, if $n$ is composite, we can expect this procedure to discover enough divisors to factor $n$ completely after approximately $n^{1/4}$ updates, since every prime factor $p$ of $n$ except possibly the largest one is less than $\sqrt{n}$.

（正确性与性能分析见CLRS P976。)