Encryption

Background

Encryption has its roots in mathematics, and its history stretches back to the ancient Greeks who, when wanting to convey a secret message, would shave a messenger's head, write the message on the scalp and allow the hair to grow back. The word 'encryption' comes from the Greek word 'kryptos' meaning 'hidden'.

Encryption has been applied in many forms:

Transposition - rearranging text
Substitution - changing the text, code
Cipher - moving letters; an example is a mono-alphabetic cipher (Caesar cipher) where the decryption key is the number of alphabet places moved. A Crib is a piece of known or guessed plain text that is used to help break a cipher.
Polyalphabetic - having many cipher alphabets for each message
Digraph - cipher of pairs of letters that are common
Nomenclature - substituting letters and words
One-time pad - where a random key is used once per sheet of a pad containing hundreds of sheets, and both sender and receiver have the same pad.

Polybius Square

A classic cypher is the Polybius Square where arbitrary numbers are assigned to rows and columns of a 5 x 5 square. The English alphabet is then entered into the cells with the letters 'I' and 'J' occupying the same cell:

	4	7	3	1	9
6	A	B	C	D	E
4	F	G	H	I/J	K
8	L	M	N	O	P
2	Q	R	S	T	U
5	V	W	X	Y	Z

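A word such as 'HELLO' is then given by the code 34 96 48 48 18, reading the column number first and then the row number for each letter. The problem with this type of cipher is that the same code is used for the same character each time, so it is easily broken by counting how often each code occurs.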
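The lookup can be sketched in a few lines of Python. This is only an illustration of the square shown above; the function and variable names are made up for the example, and it assumes the code for each letter is the column number followed by the row number, which is what the 'HELLO' example implies.

```python
# Row and column labels copied from the square above; 'J' shares a cell with 'I'.
COLUMN_LABELS = "47319"
ROW_LABELS = "64825"
ROWS = ["ABCDE", "FGHIK", "LMNOP", "QRSTU", "VWXYZ"]

def polybius_encode(word):
    """Encode a word as column-label/row-label pairs, e.g. 'H' -> '34'."""
    codes = []
    for letter in word.upper().replace("J", "I"):
        for row_label, row in zip(ROW_LABELS, ROWS):
            column = row.find(letter)
            if column != -1:
                codes.append(COLUMN_LABELS[column] + row_label)
                break
    return " ".join(codes)

print(polybius_encode("HELLO"))  # 34 96 48 48 18
```

Both L's come out as 48, which is exactly the weakness described above.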

Trithemius Progressive Key

The Trithemius Progressive Key allows a change of key for every character. This is an extension of the poly-alphabetic cipher and, provided the key changes in an ordered fashion, the receiver is able to follow the same sequence to decrypt the message.

Each row of the table is the alphabet shifted one place further than the row above, and we progressively move down one row for each character of the message. After row 26 we start again at row 1. The letters can be arranged in any order of course, provided that the same order is maintained throughout the table. For the word 'HELLO' we obtain the code IGOPT (H moved 1 place, E moved 2, L moved 3, L moved 4 and O moved 5). Despite there being two L's, the code has a different letter for each.
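A minimal sketch of the progressive key, assuming the plain A to Z alphabet and a shift that starts at 1 for the first character (which is what the 'HELLO' example gives). The function name is hypothetical and the input is assumed to contain letters only.

```python
import string

ALPHABET = string.ascii_uppercase

def trithemius_encrypt(plaintext):
    """Shift the 1st letter by 1, the 2nd by 2, and so on, wrapping after 26."""
    out = []
    for position, letter in enumerate(plaintext.upper(), start=1):
        shift = position % 26
        out.append(ALPHABET[(ALPHABET.index(letter) + shift) % 26])
    return "".join(out)

print(trithemius_encrypt("HELLO"))  # IGOPT
```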

Vigenere Key Method

Vigenere Key Method uses a keyword. The keyword dictates the row choices for encryption and decryption for each successive character in the message. Using the keyword KEY means we successively use row 'K' (a shift of 10 places, counting 'A' as 0), row 'E' (a shift of 4) and row 'Y' (a shift of 24).

As an example, using KEYKEYKEY we can translate the word HELLO into RIJVS. The problem with this is that the key is repeated, which gives an attacker a pattern to look for.
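The keyword method can be sketched as follows, counting 'A' as a shift of 0 so that K, E and Y give shifts of 10, 4 and 24. The function name and the decrypt flag are illustrative choices, and the text is assumed to contain letters only.

```python
from itertools import cycle
import string

ALPHABET = string.ascii_uppercase

def vigenere(text, keyword, decrypt=False):
    """Shift each letter by the alphabet position of the matching keyword letter."""
    sign = -1 if decrypt else 1
    out = []
    # cycle() repeats the keyword for as long as the message lasts.
    for letter, key_letter in zip(text.upper(), cycle(keyword.upper())):
        shift = sign * ALPHABET.index(key_letter)
        out.append(ALPHABET[(ALPHABET.index(letter) + shift) % 26])
    return "".join(out)

print(vigenere("HELLO", "KEY"))                # RIJVS
print(vigenere("RIJVS", "KEY", decrypt=True))  # HELLO
```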

Variations include the Vigenere auto (plain) key method, which starts with a single letter or word used as a priming key. The priming key dictates the starting row or rows for encrypting the first or first few plain text characters. Next, the plain text characters of the message itself are used to determine the rows. So instead of repeating the keyword (as you do in the Vigenere cipher) you append the plain text to the keyword and use the resulting word as the key to encrypt the plain text. This means that there is no repetition of the key and the key effectively becomes as long as the message itself.

The problem with long keys is the number of combinations and the time taken to search these combinations. For a usable system the lookups must be quick.
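The auto key variation can be sketched like this. The priming key 'KEY' is reused purely for illustration, the function names are hypothetical, and the input is assumed to contain letters only.

```python
import string

ALPHABET = string.ascii_uppercase

def autokey_encrypt(plaintext, priming_key):
    """Key stream = priming key followed by the plain text itself."""
    plaintext = plaintext.upper()
    key_stream = (priming_key.upper() + plaintext)[:len(plaintext)]
    return "".join(
        ALPHABET[(ALPHABET.index(p) + ALPHABET.index(k)) % 26]
        for p, k in zip(plaintext, key_stream)
    )

def autokey_decrypt(ciphertext, priming_key):
    """Each recovered letter is appended to the key stream for later letters."""
    key_stream = list(priming_key.upper())
    plain = []
    for i, c in enumerate(ciphertext.upper()):
        letter = ALPHABET[(ALPHABET.index(c) - ALPHABET.index(key_stream[i])) % 26]
        plain.append(letter)
        key_stream.append(letter)
    return "".join(plain)

print(autokey_encrypt("HELLO", "KEY"))  # RIJSS (key stream is KEYHE)
print(autokey_decrypt("RIJSS", "KEY"))  # HELLO
```

The first three letters match the plain keyword method; the difference only shows once the priming key runs out.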

Many of these techniques are being used in an extended fashion when encrypting data in today's networks. Work by Dr Horst Feistel (1970, DES), Diffie and Hellman (1976), Rivest, Shamir and Adleman (1977, RSA) and Zimmermann (1991, PGP) has formed the bedrock for encryption techniques used today. The basic idea is to encrypt data using an encryption key, send the ciphertext across a public domain and decrypt the data using a decryption key.

You can have Symmetric encryption algorithms where the same key is used to encrypt and decrypt the data. These include DES, 3DES, RC2 and RC5 (block ciphers), RC4 (a stream cipher) and AES (Rijndael, with key lengths of up to 256 bits). Symmetric encryption can be fast, so it tends to be used for large volumes of data. The problems come when you want to share the key without compromising security.

You can also have Asymmetric encryption algorithms where the encryption keys and the decryption keys are different. Examples include RSA, Diffie-Hellman and Elliptic Curves. Key lengths range from 512 bits to 2048 bits. Diffie-Hellman is used specifically for key management. Asymmetric encryption algorithms are much slower and are therefore used for low volume encryption.

Key management is critical in the encryption environment, as the security is only as good as its weakest point. It is no good sharing a key over the telephone or via E-mail if the conversation is prone to eavesdropping. Key management looks after the generation, storage, exchange, verification and destruction of keys.

Random number generation is the basis of good key generation. Radioactive decay is one of the best sources of random numbers because it is unpredictable. The Keyspace (the number of possible keys) needs to be very large, since on average you would have to search through half an algorithm's keyspace before finding the correct key. With symmetric algorithms, key lifetimes are typically short, i.e. of the order of seconds or minutes, and 40/56 bit keys (e.g. DES) are now considered weak. 112/168 bit keys (e.g. 3DES) are currently OK, with 256 bit keys (e.g. AES) likely to be the norm in the future. With asymmetric algorithms, 768/1024 bit keys are considered to be fine, with a move towards 1536/2048 bit keys in the future. Asymmetric keys tend to last longer.
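As a rough illustration of the 'half the keyspace' remark, the sketch below computes the average number of guesses for the symmetric key lengths mentioned above. The function name is hypothetical, and the figures only count keys; they say nothing about how long each guess takes.

```python
def average_guesses(key_bits):
    """Average number of keys tried in an exhaustive search of the keyspace."""
    keyspace = 2 ** key_bits   # number of possible keys
    return keyspace // 2       # on average, half of them are tried

for bits in (40, 56, 112, 168, 256):
    print(f"{bits:3d}-bit key: about {average_guesses(bits):.3e} guesses on average")
```

Each extra bit doubles the work, which is why the jump from 56 to 112 bits matters far more than the doubling of the number suggests.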

Data Encryption Standard (DES)

In the early 1970s, development of the Product Cipher System or Block Cipher as used in IBM's Lucifer System led to a number of variations. The concept was to use a Substitution-Permutation Network (SPN), which is a series of linked mathematical operations. These SP-networks consist of Substitution Boxes (S-boxes) and Permutation Boxes (P-boxes) that transform blocks of input bits into output bits. These transformations are operations that are efficient to perform in hardware, e.g. OR, Exclusive-OR (XOR) etc.

The operation called Modulo 2 addition is mainly used and this is the same as an Exclusive-OR operation on binary numbers, one digit at a time, for example:

Operand 1	1	0	1	1	1	1
+ Operand 2	1	1	0	1	1	0
= Sum	0	1	1	0	0	1

Operand 1 could be the unencrypted text, Operand 2 the key and the result could be the encrypted text. Note that the same operation reverses the process: adding the key to the encrypted text recovers the unencrypted text, i.e. taking the example above:

Encrypted text	0	1	1	0	0	1
+ Key	1	1	0	1	1	0
= Unencrypted text	1	0	1	1	1	1
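The two tables above can be reproduced with a short sketch. The helper name is made up for the example and it works on strings of '0' and '1' purely for readability.

```python
def xor_bits(a, b):
    """Modulo 2 addition (XOR) of two equal-length bit strings, one digit at a time."""
    return "".join("1" if x != y else "0" for x, y in zip(a, b))

unencrypted = "101111"
key = "110110"

encrypted = xor_bits(unencrypted, key)
print(encrypted)                 # 011001
print(xor_bits(encrypted, key))  # 101111
```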

S-boxes substitute or change input bits into output bits. A well-designed S-box will ensure that changing one input bit will change about half of the output bits. It will also have the property that each output bit will depend on every input bit. P-boxes transpose bits across S-box inputs. At each round the key is combined using a group operation such as XOR.

One variant of the Lucifer system uses a 48-bit key and operates on 48-bit blocks. The cipher is an SP-network and uses two 4-bit S-boxes. The key selects which S-boxes are used. The cipher can operate on 24 bits at a time, and also sequentially on 8 bits at a time.

The Lucifer system variant that was based on the Feistel Network led, after some modifications, to the USA's Data Encryption Standard (DES).

The DES algorithm takes a key and performs a series of simple logical operations on it. These include permutations, substitutions etc. using SP-networks. As mentioned before, these operations are designed such that they can be implemented within hardware and therefore be fast.

A 64-bit block of plaintext is split into a left-hand 32-bit block and a right-hand 32-bit block. The key is also 64 bits long; of these, 56 bits are used for encryption and the remaining 8 bits (the least significant bit of each byte, set to give odd parity) are used for parity. You can get 40 bit encryption with DES where a 40 bit key is used along with 16 known bits.

A section of the key called the subkey is then input into a function along with the right-hand block. The function expands the 32-bit block into 48 bits and combines it with the subkey using XOR. The function then performs a substitution via the S-boxes and then a transposition via the P-box, resulting in a 32-bit block again. This result is XORed with the left-hand block, the right-hand and left-hand blocks are swapped around and the whole operation is repeated on the new right-hand block. This continues for 16 rounds in total until the ciphertext is produced.
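The round structure (split, apply a function to the right-hand half, XOR into the left-hand half, swap, repeat) can be sketched as below. This is not DES: the round function is a toy stand-in built from SHA-256 instead of the real expansion, S-boxes and P-box, and the subkeys are arbitrary bytes rather than a real key schedule. All names are illustrative.

```python
import hashlib

def round_function(right, subkey):
    """Toy stand-in for the DES expansion, S-box and P-box steps (NOT DES)."""
    digest = hashlib.sha256(subkey + right.to_bytes(4, "big")).digest()
    return int.from_bytes(digest[:4], "big")   # back to a 32-bit value

def feistel(block, subkeys):
    """Run one pass of a Feistel network over a 64-bit block."""
    left, right = block >> 32, block & 0xFFFFFFFF
    for subkey in subkeys:
        # New left is the old right; new right is old left XOR f(old right).
        left, right = right, left ^ round_function(right, subkey)
    # Undo the final swap so the same routine decrypts with reversed subkeys.
    return (right << 32) | left

subkeys = [bytes([i]) for i in range(16)]      # 16 rounds, toy subkeys
plaintext = 0x0123456789ABCDEF
ciphertext = feistel(plaintext, subkeys)
recovered = feistel(ciphertext, subkeys[::-1])
print(hex(ciphertext), hex(recovered))
```

The design point worth noting is that the round function never has to be inverted: decryption is the same routine with the subkeys applied in reverse order.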

DES can operate in Electronic Code Book (ECB) mode, where a message of, say, five 64-bit blocks is operated on block by block, each block independently of the others. Alternatively, DES can operate in Cipher Block Chaining (CBC) mode, where the output from DES operating on the first 64-bit block is combined (using XOR) with the second block before it is fed into the algorithm, and so on. The initial value used to start off the chain is a random value called the Initialisation Vector (IV).
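The difference between the two modes shows up clearly when the same block is repeated. The sketch below uses a toy 64-bit block transformation in place of DES (it is not DES and not secure); the key, IV and function names are illustrative assumptions, and the IV is fixed only to keep the example repeatable.

```python
MASK = (1 << 64) - 1  # keep every value within one 64-bit block

def toy_encrypt_block(block, key):
    """Toy 64-bit block transformation standing in for DES (NOT DES, NOT secure)."""
    return ((block ^ key) * 0x9E3779B97F4A7C15 + key) & MASK

def ecb_encrypt(blocks, key):
    """ECB: every block is encrypted independently of the others."""
    return [toy_encrypt_block(block, key) for block in blocks]

def cbc_encrypt(blocks, key, iv):
    """CBC: each block is XORed with the previous ciphertext block (or the IV) first."""
    ciphertext = []
    previous = iv
    for block in blocks:
        previous = toy_encrypt_block(block ^ previous, key)
        ciphertext.append(previous)
    return ciphertext

key = 0x0F1E2D3C4B5A6978
iv = 0x1122334455667788               # would be random in practice
message = [0x48454C4C4F212121] * 5    # five identical 64-bit blocks

ecb_blocks = ecb_encrypt(message, key)
cbc_blocks = cbc_encrypt(message, key, iv)
print("distinct ECB blocks:", len(set(ecb_blocks)))
print("distinct CBC blocks:", len(set(cbc_blocks)))
```

In ECB mode the five identical input blocks give five identical output blocks, so the repetition in the message is visible to an eavesdropper; chaining hides it.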

3DES

3DES is much stronger than DES because 2 or 3 keys are used to increase key strength. Two keys give 112 bits (2 x 56) encryption, and three keys give 168 bits (3 x 56) encryption. 3DES works by using the first key to encrypt the blocks just as in DES. Then the second key is used to decrypt the blocks, followed by the third key that encrypts the blocks again. If only two keys are being used for 112 bit encryption, then the first and third keys are the same.
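The encrypt, decrypt, encrypt sequence can be sketched with a toy invertible block operation in place of DES. The toy cipher, key values and function names are assumptions made for the illustration only.

```python
MASK = (1 << 64) - 1  # work on 64-bit blocks, as DES does

def toy_encrypt(block, key):
    """Toy invertible block operation standing in for DES encryption (NOT DES)."""
    return ((block ^ key) + key) & MASK

def toy_decrypt(block, key):
    """Inverse of toy_encrypt."""
    return ((block - key) & MASK) ^ key

def triple_encrypt(block, k1, k2, k3):
    """Encrypt with key 1, decrypt with key 2, encrypt with key 3."""
    return toy_encrypt(toy_decrypt(toy_encrypt(block, k1), k2), k3)

def triple_decrypt(block, k1, k2, k3):
    """Reverse the three steps in the opposite order."""
    return toy_decrypt(toy_encrypt(toy_decrypt(block, k3), k2), k1)

k1, k2 = 0x1111111111111111, 0x2222222222222222
block = 0x0123456789ABCDEF

two_key = triple_encrypt(block, k1, k2, k1)   # two-key form: key 3 = key 1
print(hex(two_key), hex(triple_decrypt(two_key, k1, k2, k1)))
```

One consequence of using decryption as the middle step is that setting all three keys equal collapses the sequence to a single encryption, which keeps a 3DES device compatible with plain DES.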

Confidentiality

Data is made private via the use of encryption such as Data Encryption Standard (DES) (RFC 2405), 3DES, International Data Encryption Algorithm (IDEA) and Blowfish (RFC 2406).

The concept of Feasibility is used to describe the difficulty of cracking a particular encryption algorithm. In the world of encryption nothing is considered to be impossible; however, an algorithm is considered infeasible to crack if it is complex enough not to be solved with current computer technology or techniques in a reasonable amount of time.

There are two methods of distributing the keys used in encrypting and decrypting the data: the Shared Key method and the Public Key method.

Shared Key

In a shared key environment e.g. for symmetric encryption or for HMACs, a single key is used for encryption and decryption of the data, therefore both peers need to know this shared key. The problem with this method is that there has to be a way of making sure both peers know this key without compromising the security of this key. Using the telephone or mailing the key is not really that secure, plus it is not scalable for large secure networks where there are many peer to peer sessions each with different keys. There needs to be a way of securing key exchange over an insecure path.

Diffie-Hellman Key Exchange addresses this problem, and Internet Key Exchange (IKE) (RFC 2409) uses Diffie-Hellman to ensure that a shared key can be generated across a public connection in a way that makes it infeasible for anyone else to work out the key. This shared key can then be used with an encryption algorithm such as DES, 3DES, IDEA etc. A summary of how Diffie-Hellman operates is as follows:

Peer P and Peer Q have been given the same publicly viewable numbers m and n.
Peer P picks a very large secret random number x and calculates m^x mod n to give P.
Peer Q picks a very large secret random number y and calculates m^y mod n to give Q.
Peer P and Peer Q exchange P and Q publicly, so anyone can see these numbers. The numbers x and y remain known only to the relevant peer and they are not transmitted.
Peer P then performs the calculation Q^x mod n to give the value K.
Peer Q then performs the calculation P^y mod n to give the value L.
Now as it happens, Q^x mod n gives the same value as m^(xy) mod n which is the same as P^y mod n, so K and L are equal. Peers P and Q have therefore negotiated a shared secret that has not been transmitted.
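The steps above can be followed with deliberately tiny numbers. The values m = 5, n = 23, x = 4 and y = 3 are illustrative assumptions only; real exchanges use very large numbers and randomly chosen secrets.

```python
def diffie_hellman(m, n, x, y):
    """Return (P, Q, K, L) for public numbers m, n and secret numbers x, y."""
    P = pow(m, x, n)   # Peer P sends this publicly
    Q = pow(m, y, n)   # Peer Q sends this publicly
    K = pow(Q, x, n)   # computed privately by Peer P
    L = pow(P, y, n)   # computed privately by Peer Q
    return P, Q, K, L

P, Q, K, L = diffie_hellman(m=5, n=23, x=4, y=3)
print(P, Q, K, L)                    # 4 10 18 18
print(K == L == pow(5, 4 * 3, 23))   # True
```

Only 4 and 10 cross the wire; the shared value 18 is never transmitted.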

The modulus arithmetic makes it very difficult for an attacker to find a value for either x or y, especially if the numbers involved are very large. Calculating the logarithm of P to base m, for instance, is not trivial when modulus arithmetic is involved with large numbers. Instead an attacker may attempt to intercept the exchange, become an intermediary between the peers, negotiate a separate shared secret with each of them and fool them into thinking that they are talking with each other.

Public Key Encryption

Each peer creates two keys, one private and one public. The public key is made available publicly and is used to encrypt data by anyone wishing to send you information. The private key is used by the receiver to decrypt the data. In a large secure environment, public keys are held in central locations for better management. It is considered infeasible to work out the private key from the public key.

Using a public/private key method to encrypt data is around 1000 times slower than using the shared key method, so the shared key method is good for large data volumes.

A popular public key encryption algorithm is Rivest, Shamir and Adleman (RSA), which is used in IKE. Key lengths can vary from 512 bits to 2048 bits.
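To make the public/private split concrete, here is a toy sketch of textbook RSA. The primes 61 and 53, the exponent 17 and the function names are illustrative assumptions that do not come from the text above; real RSA uses padding and keys of the lengths just mentioned. It needs Python 3.8 or later for the modular inverse.

```python
p, q = 61, 53                 # two tiny secret primes
n = p * q                     # public modulus
phi = (p - 1) * (q - 1)
e = 17                        # public exponent: (e, n) is the public key
d = pow(e, -1, phi)           # private exponent: (d, n) is the private key

def encrypt(message):
    """Anyone can do this with the public key."""
    return pow(message, e, n)

def decrypt(ciphertext):
    """Only the holder of the private key can do this."""
    return pow(ciphertext, d, n)

ciphertext = encrypt(65)
print(ciphertext, decrypt(ciphertext))
```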

The following is a list of Public Key Cryptography Standards (PKCS):

PKCS #1 RSA Encryption (defines RSA encryption and signatures)
PKCS #2 (withdrawn, merged into PKCS #1)
PKCS #3 Diffie-Hellman Key Agreement (defines implementation)
PKCS #4 (withdrawn, merged into PKCS #1)
PKCS #5 Password-Based Encryption (how to encrypt derived password)
PKCS #6 Extended Certificate Syntax (Extension Attributes, X.509v2)
PKCS #7 Cryptographic Message Syntax (Placement of Attributes)
PKCS #8 Private key Information Syntax (Placement of Keys)
PKCS #9 Selected Attribute Types (All Attribute types used in PKCS)
PKCS #10 Certificate Request Syntax (CMP)
PKCS #11 Cryptographic Token Interface (API, Cryptoki)
PKCS #12 Key Movement Exchange Syntax (Portable format of Keys)
PKCS #13 Elliptic Curve Cryptography
PKCS #14 Pseudo-Random Number Generation
PKCS #15 Cryptographic Token Information Syntax

Integrity

Integrity is a check that ensures data has not been changed during transfer. This uses a hash value that acts like a fingerprint. The hash is calculated by the sender each time any data is sent; it is calculated again when the data is received and the two values are compared. MD5 hashing is often used here.

The two peers agree on a hashing algorithm and feed into it the data and the shared key that they have agreed. This produces a hash value which is sent with the data. This hash value is one-way only; it is not like encryption where the resultant form can be decrypted! The receiving peer performs the same hashing operation using the data and the shared key, and if the hash value is the same the data is considered to be reliable. Using a shared key that is input into the hashing algorithm is called Keyed-Hashing for Message Authentication or Hash Message Authentication Code (HMAC), as detailed in RFC 2104. An HMAC combines hashing with authentication.
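The sender and receiver sides of this check can be sketched with Python's standard hmac module. MD5 is used here only because it is the algorithm named in the text; the key, the data and the function names are illustrative.

```python
import hashlib
import hmac

def make_tag(shared_key, data):
    """Sender: hash the data together with the shared key (HMAC-MD5)."""
    return hmac.new(shared_key, data, hashlib.md5).digest()

def verify_tag(shared_key, data, tag):
    """Receiver: repeat the calculation and compare the two values."""
    return hmac.compare_digest(make_tag(shared_key, data), tag)

shared_key = b"agreed-secret"
data = b"transfer 100 to account 42"
tag = make_tag(shared_key, data)

print(verify_tag(shared_key, data, tag))                           # True
print(verify_tag(shared_key, b"transfer 900 to account 42", tag))  # False
```

The comparison uses compare_digest rather than == so that the time taken does not reveal how many leading bytes matched.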

One of the most popular hashing algorithms used today, also within HMACs, is Message Digest 5 (MD5). Whatever the size of the input data, the output is always a 128-bit hash value, even if the data is merely one character. If two very similar pieces of data are sent, the hash values for each will be utterly different so that no pattern can be deduced. This is the avalanche effect: if the data changes a little, the digest changes a lot.

The National Institute of Standards and Technology (NIST) published another hashing algorithm called Secure Hash Algorithm (SHA-1), which is often used as well and produces a 160-bit hash.
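The fixed output sizes and the avalanche effect are easy to observe with Python's standard hashlib module. The inputs and the helper name below are arbitrary choices for the illustration.

```python
import hashlib

def differing_bits(a, b):
    """Count the bit positions in which two equal-length digests differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

one_char = hashlib.md5(b"a").digest()
long_text = hashlib.md5(b"a" * 10000).digest()
print(len(one_char) * 8, len(long_text) * 8)            # 128 128
print(len(hashlib.sha1(b"a").digest()) * 8)             # 160

# Two inputs that differ in a single character.
first = hashlib.md5(b"HELLO").digest()
second = hashlib.md5(b"HELLP").digest()
print(differing_bits(first, second), "of 128 bits differ")
```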

Origin Authentication and Certification Authorities (CA)

The infrastructure that encompasses public key management is called the Public Key Infrastructure (PKI). It is explained in RFC 2510 and is concerned with identifying the users rather than authenticating them. PKI servers called Certification Authority (CA) and Registration Authority (RA) servers form centres of trust for the users.

Certification Authority (CA)

We need to ensure that the public keys being used are genuinely from the sender i.e. that they are not being falsely produced by an intruder (Trust), and conversely the sender cannot get away with denying that the message came from them (Nonrepudiation). Origin authentication is about checking that the station with whom you are communicating is who they say they are. This stage uses Digital Signatures and the concept of a Certification Authority (CA) that assigns these digital signatures.

Typical well-known CAs used by e-commerce include the following:

VeriSign
Entrust
Netscape iPlanet
Windows 2000 Certificate Server
Thawte
Equifax
Genuity

When a device (the subject) creates the private and public keys, the public key is sent along with the device details to the CA server, and these are processed in a personal fashion so that they can be verified as genuine. Of course, this personal verification has to be very good for the CA to have credence. The CA is a trusted third party that holds the public keys. The CA also produces a Digital Certificate that contains the sender's name (the subject), details such as serial numbers etc. and their public key. The CA adheres to the X.509 standard for the certificate, which gives you confidence that if that CA issues you somebody's public key, the CA has verified that the certificate belongs to that somebody. X.509 is at version 3, i.e. X.509v3, and the certificate is produced in one of three file formats:

Distinguished Encoding Rules (DER-encoded X509 certificate) - Binary format encoding of the certificate file in ASN.1. For example, an imported certificate from a Microsoft Windows NT IIS 4.0 server.
Privacy Enhanced Mail (PEM) - a base64 encoding of the certificate file (PEM-encoded X509 certificate). For example, an imported certificate from an Apache/SSL UNIX server.
PKCS12 - Standard from RSA Data Security, Inc. for storing certificates and private keys. For example, an imported certificate from a Microsoft Windows 2000 IIS 5.0 server.

The certificate contains the following:

Serial number of subject device
Issuer (CA) X500 name
Valid period so that there is an expiration time
Subject X500 name
Subject public key
Key and certificate usage
Any extensions
CA Digital Signature, either via RSA or DSA.

The CA uses its own private key with the first five components to produce the CA digital signature.

Once the certificate has been created, this can be downloaded from the CA in whatever format is suitable for the subject device, and stored and distributed by the owner. As described above, the certificate contains details about the owner, details about the certificate issuer, the owner's public key, validity, expiration dates, and associated privileges.

The CA digitally signs this certificate with its own private key, so that the subject device can then send this certificate to those that want its public key. Digital Signature Standard (DSS) with its Digital Signature Algorithm (DSA) and RSA Signatures are examples of algorithms used with digital signatures on the CAs. DSS was established to provide an alternative to RSA when signing certificates; this was because RSA is already in use within the PKI for key exchange and it was felt a good idea to have a separate algorithm for digital signatures. As a consequence, DSS/DSA is used purely for signing certificates. Using the PKI public key cryptography, the CA keeps a private key and publishes a public key. The private key is used to encrypt (sign) the data. The receiver uses the CA's public key to decrypt the data and thereby verify it (note that this is the opposite to the earlier use of the public key to encrypt data!).
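The reversed roles of the two keys can be sketched with a toy RSA-style signature. Everything here is an illustrative assumption: the two primes are small known Mersenne primes, the data is hashed with SHA-256 and reduced to fit the modulus, and no padding is applied, so this shows the principle only and is not how a CA actually signs. It needs Python 3.8 or later.

```python
import hashlib

# Toy key pair built from two known Mersenne primes (far too small for real use).
p = 2**61 - 1
q = 2**31 - 1
n = p * q                              # public modulus
e = 65537                              # public exponent
d = pow(e, -1, (p - 1) * (q - 1))      # private exponent, kept by the CA

def digest(data):
    """Hash the data and reduce it so that it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def sign(data):
    """CA side: apply the private key to the hash of the data."""
    return pow(digest(data), d, n)

def verify(data, signature):
    """Receiver side: apply the public key and compare with a fresh hash."""
    return pow(signature, e, n) == digest(data)

certificate = b"subject=server1;public-key=..."
signature = sign(certificate)
print(verify(certificate, signature))                         # True
print(verify(b"subject=intruder;public-key=...", signature))  # False
```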

CA Servers are used to store these digital certificates and are publicly accessible. Those receiving data with the certificate need to obtain the CA's public key in order to verify the CA's signature and check that no alterations have occurred. Certification Authorities have Root CA Certificates. These root CA certificates are installed in releases of all the major web browsers such as Internet Explorer, Netscape, Opera, etc. When you use your browser you are automatically relying on a collection of root CA certificates that the browser vendor has deemed you can trust. If an SSL certificate is issued by one of the trusted root CAs to a subject device/server, then you will inherently trust that SSL certificate and the gold padlock will appear transparently during secure sessions. This eliminates the need for clients to have to go on to the Internet to obtain the CA's public key (particularly useful if the clients sit behind a firewall that denies them access to the Internet).

If the root certificate is not installed on the client's browser then the client will see a warning indicating the requirement to download a certificate because the browser does not have a certificate from the CA used to issue the device's certificate.

The clients and servers communicating with each other must have certificates digitally signed from the same CA or from the hierarchy of CAs that trust each other. A certificate is verified when the client checks it with the CA using the CA's public key within the PKI. Once verified, the client can trust the public key within the certificate for the server/subject device to which it wants to connect. The users trust the CA to distribute, revoke, and manage keys and certificates in such a way as to prevent any security breaches. Users need to be able to determine the degree of trust that can be placed in the authenticity and integrity of the public keys contained in the certificates. The information upon which such determinations can be made is documented in the Certificate Policy and the Certification Practice Statement of the CA.

In summary, the CA performs the following tasks:

Issues users with keys/Personal Security Environments (PSEs) (though sometimes users may generate their own key pair, private and public keys)
Certifies users' public keys
Publishes users' certificates
Issues Certificate Revocation Lists (CRLs)
Generates the certificate based on a public key. Typically a Trust Center (SSL server or other SSL termination device offering content) generates the pair of keys on a smart card or a USB token or within the server itself.
Guarantees the uniqueness of the pair of keys and links the certificate to a particular user
Manages published certificates
Is part of cross certification with other trusted CAs in a hierarchy

Sometimes, the tasks of Certification and Registration are separated out and are carried out by different servers (Note: Entrust and Verisign do not do this!). Some CA servers use Lightweight Directory Access Protocol (LDAP) for certificate retrieval.

Registration Authority (RA)

The Registration Authority (RA) is responsible for recording and verifying all information the CA needs. In particular, the RA must check the user's identity to initiate issuing the certificate at the CA. This functionality is neither a network entity nor is it acting online. The RAs will be where users must go to apply for a certificate. An RA has two main functions:

Verify the identity and the statements of the claimant
Issue and handle the certificate for the claimant

Certificate Revocation List (CRL)

A CRL is a digitally signed list of revoked certificates that is issued by the CA. This list is updated on a scheduled basis and distributed to relying parties. Certificates have expiry times assigned to them and also some certificates become out of date due to device or personnel changes within an organisation. The CRL provides a means of revoking previously issued certificates. The older a certificate is, the more likely it is to be cracked, so it is good practice to renew certificates regularly.

Due to the latency involved in publishing a CRL, a relying party may not receive notification of a revoked certificate in time. On-line mechanisms may be used to communicate the current status of a certificate. A widely used standard for determining the on-line status of a certificate is the IETF X.509 Internet Public Key Infrastructure Online Certificate Status Protocol (OCSP) (RFC 2560).

Some CAs operate a Certificate Revocation List (CRL), which is a list of certificates that have been revoked, e.g. because of device or personnel changes. A peer downloads this CRL and uses it as an added check to see if the other peer's certificate is still valid. CRLv2 has the following format:

CRL Issuer
This Update
Next Update
User Certificate serial number
Date of Revocation
User Certificate serial number
Date of Revocation
Reason for Revocation
Digital signature of the CA that issued the CRL

The CRL issuer's digital signature is generated from the first item down to the last listed user revoked plus the CRL issuer's private key.
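How a peer might use the list can be sketched with a small data structure that mirrors the fields above. The class, field and method names are hypothetical, the serial numbers and dates are made up, and the issuer's signature check is left out to keep the example short.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class RevocationList:
    issuer: str
    this_update: date
    next_update: date
    # Maps certificate serial number -> (date of revocation, reason or None).
    revoked: dict = field(default_factory=dict)

    def revoke(self, serial, when, reason=None):
        self.revoked[serial] = (when, reason)

    def is_revoked(self, serial):
        return serial in self.revoked

    def is_stale(self, today):
        """True once the next scheduled update is overdue."""
        return today > self.next_update

crl = RevocationList("Example CA", date(2024, 1, 1), date(2024, 1, 8))
crl.revoke(1001, date(2023, 12, 20))
crl.revoke(1002, date(2023, 12, 28), reason="device replaced")

print(crl.is_revoked(1002), crl.is_revoked(2000))   # True False
print(crl.is_stale(date(2024, 1, 10)))              # True
```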

Anti-Replay

Sequence numbers and authentication are used to ensure that duplicate packets and old packets are rejected. This prevents an intruder from copying a conversation and replaying it, or using it to work out encryption algorithms.
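The sequence number part of this check can be sketched as a sliding window. The class name, the window size and the use of a set are illustrative choices; the sketch assumes each packet has already passed its authentication check, which is not shown.

```python
class ReplayWindow:
    """Accept each sequence number once, and reject numbers that are too old."""

    def __init__(self, size=64):
        self.size = size
        self.highest = 0      # highest sequence number accepted so far
        self.seen = set()     # accepted numbers still inside the window

    def accept(self, seq):
        if seq + self.size <= self.highest:
            return False      # old packet: fell out of the window
        if seq in self.seen:
            return False      # duplicate packet
        self.seen.add(seq)
        if seq > self.highest:
            self.highest = seq
            # Forget numbers that have now fallen out of the window.
            self.seen = {s for s in self.seen if s + self.size > self.highest}
        return True

window = ReplayWindow(size=4)
results = [window.accept(seq) for seq in (1, 1, 3, 2, 10, 5, 7, 10)]
print(results)  # [True, False, True, True, True, False, True, False]
```

Packets may arrive slightly out of order (2 after 3, 7 after 10) and still be accepted, while the repeated 1 and 10 and the stale 5 are dropped.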
