Page not complete

Information theory

Measurement and encoding of information

Shannon's formula of entropy 1

By convention, when the base b is not specified, we use the base 2 (binary system): b = 2.

Information Theory (IT) inequality :

for all .

The average codewords length:

We have the following property:
Let be an encoding function for a D-ary code following a distribution (random variable) . If the code is uniquely decodable, then we have:

Shannon-Fano code
For any random variable and any integer , there exists a prefix-free D-ary codefor S, such that ,

Entrypy of a symbol:

Entropy per symbol:

Cryptography

Introduction

First there are the so-called symmetric cryptosystems. Here we use a single encryption key. Let's take two parties A and B. If A wants to send an encrypted message to B he will use an encryption function with the data of the encryption key. When B will receive the message from A, he will have to use a decryption function (the "inverse" function of the encryption function) with the same key as A.

Here, the encryption and decryption functions are very fast, but a problem arises: in order for A and B to use the same key, they will have to exchange it. And during this key exchange between A and B, a malicious person might intercept the key and send it back to B. This person will then be able to decrypt the information, thus breaking the confidentiality established between A and B.

This is the origin of the so-called "Man-in-the-middle-attack".

The second system is asymmetric encryption. In this one, if A wants to send an encrypted message to B, he will have to use two keys: a public key and a private key. The public key is accessible to everyone. It is by using A's public key that B will be able to decrypt A's message that he has encrypted with his private key.

With this system, there is no key exchange, but the encryption and decryption functions are slower than with symmetric encryption. Moreover, the security of this kind of system relies on the fact that with today's computers, it would take an extremely long time to find an individual's private key from his or her public key (accessible to everyone). The point is that it is technically possible to decrypt any conversation using asymmetric encryption, but all in an unreasonable amount of time.

Different applications of cryptography

NameType of encryption
Cards and banking systemsRSA
NSA's "secret defense" filesAES
Online shoppingSSL or TLS
Wifi networksWEP, WPA or WPA2
Client-server relationshipSSH (using RSA or DSA)
EmailPGP (RSA or DSA)

Classification of certain encryption systems

Symmetric encryptionAsymmetric encryption
AESRSA
DESMcEliece
IDEADSS/DSA
Menezes-VanstonneElGamal

Not discussed here, it may be interesting to learn about homomorphic encryption and its potential use for the cloud.

In cryptography, the first names Alice and Bob are generally used to name two interlocutors. The first name Eve is used to designate a third person who is secretly listening to the conversation.

List of other names used: Wikipedia page.

Symmetric encryption systems

Historical encryption systems:

  • Caesar's encryption
  • Vigenère's encryption

Definition (Perfect Secrecy)
A cryptosystem has a perfect secret if its plaintext T and the corresponding ciphertext C are statistically independent.

Important result:
Let be a symmetric encryption system of key K. For this system to have a perfect secret, it must verify for any clear text T :

Asymmetric encryption systems

To better understand the way such systems work, it is mandatory to have some knowledge of number theory and modular arithmetic.

RSA Encryption

RSA, for Rivest Shamir Adleman is an encryption that appeared in 1977. It was created by Ronald Rivest, Adi Shamir, and Leonard Adleman.

Let's describe how this system works when Alice wants to send a message to Bob. To do so, Bob will need to create his two keys: a private key and a public key.

Steps for Bob:
  • He chooses two distinct prime numbers p and q. In order to guarantee the security of the encryption, these two numbers must be large enough. One can imagine two 1024 bits prime numbers, which is the usual practice.

  • Let us note m = pq, which will be used as a modulus: all the following operations will be performed in .

  • Then, Bob chooses an integer k, multiple of both (p-1) and (q-1).

    p and q being distinct, it is then quite appropriate to choose the Euler indicator in pq:

  • He chooses an integer e prime with k, in other words: gcd(e, k) = 1.

  • According to the Bézout identity, there are integers d and l such that :

    Thus, we have: .

    We keep d, which is therefore the inverse of e modulo k.

Creation of keys:
  • (m, e) becomes Bob's public key
  • (m, d) becomes Bob's private key
Message transmission:

Alice will send to Bob c (encrypted), the message corresponding to this transformation performed on his plaintext message t:

Therefore, Alice used the elements of Bob's public key.

Bob receives and decrypts this message using the following relation:

Thus, Bob correctly retrieves the message from Alice.


  1. Claude Elwood Shannon (April 30, 1916 in Petoskey, Michigan - February 24, 2001 in Medford, Massachusetts) was an American electrical engineer and mathematician. He is one of the fathers, if not the founding father, of information theory.