2010-12-15

SSCC - Data scrambling algorithms

A little bit of terminology

As I wrote in my previous post, when it comes to cryptography people usually use the word encryption for any data manipulation algorithm.

Although this is not completely wrong, it might lead (at least for beginners) to some confusion, especially when we talk about data signing.
Since this is a beginners' course and there is no need for any formal mumbo jumbo, we can assume that we are talking about digital data, so we can use the term bits manipulation algorithm to refer to any of these algorithms.


Symmetric-key algorithms

I have talked about symmetric keys in my previous post. So, obviously, the bits manipulation algorithms that use symmetric keys are called symmetric-key algorithms or ciphers. They can additionally be divided into stream and block ciphers, but we will not go that deep in this course; for more info you can always check the appropriate Wikipedia page.
Here is a list of some popular ciphers:
Usually in communication with customers you will have to decide between DES and AES, and it is always better to go with AES as it is the newer and more secure algorithm.
AES stands for Advanced Encryption Standard and is often (wrongly) referred to as AES-128 since it is a block cipher with a block size of 128 bits.
AES-128 however is the name of one of the three actual algorithms covered by the standard. The 128 part of the name indicates the length (in bits) of the symmetric key used. The other two algorithms are AES-192 and AES-256, following the same naming convention.
The key used for the algorithms does not have any structure, i.e. any bit string of the proper length can be used. Security APIs usually provide a method for generating secure random numbers, which can be used to produce a key. Lack of randomness in the key might be a security issue.
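As a minimal sketch of the point about random keys, here is how a key of the proper length could be produced in Python with the standard `secrets` module. The names `AES_KEY_BITS` and `generate_key` are made up for this example; only the three key lengths come from the text above.

```python
import secrets

# Key lengths (in bits) of the three AES variants named above.
AES_KEY_BITS = (128, 192, 256)

def generate_key(bits):
    """Return a random key of `bits` bits from the OS's secure random source."""
    if bits not in AES_KEY_BITS:
        raise ValueError(f"AES keys are 128, 192 or 256 bits long, not {bits}")
    # The key has no structure: it is just bits // 8 random bytes.
    return secrets.token_bytes(bits // 8)

key = generate_key(128)
print(len(key) * 8, "bit key:", key.hex())
```

A general-purpose generator such as Python's `random` module is not meant for this; that is exactly the "lack of randomness" problem mentioned above.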


Asymmetric-key algorithms

The bits manipulation algorithms that use asymmetric keys are called asymmetric-key algorithms, and the appropriate cryptography branch is called public-key cryptography.
As always, the appropriate Wikipedia page is a great starting point :)

The asymmetric-key algorithms provide a more complex but also more flexible type of security. They cover scenarios, as I wrote in my previous post, that cannot be covered by the use of symmetric keys. One other important feature is that there is no need for a secure initial exchange of the keys. Of course this last statement might look a little bit strange now, but everything should become clear by the end of this post :)


Encryption
Yes, this is right, we finally got to the point of using the word encryption.
We can define encryption as a type of bits manipulation whose purpose is to make the data unreadable by anyone other than the intended audience (usually one person).
It is always best to describe things with examples, so let us use the Wikipedia example here.
The two persons involved in the communication are Alice and Bob. Let's say that Bob needs to send an encrypted message to Alice. In that case Alice needs to generate a key pair on her computer and distribute the public key to Bob.


Bob can then use the public key to encrypt the data and send the encrypted data to Alice. She can use her private key to decrypt the data and use it. 

You see, since no one else has Alice's private key, only she can read the data, and the only thing that the others can do is send her encrypted messages. That's why it is said that asymmetric-key algorithms don't need a secure exchange of the keys: the only key shared is the one used for encrypting the message, not the one for decrypting it.
One of the most widely used encryption algorithms today is RSA. The appropriate Wikipedia page should give you more info than you need, including the exact algorithms for key generation, encryption and decryption, and a whole bunch of security issues and analysis, the most important of which is the key length consideration.
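To make the Alice and Bob example concrete, here is a toy sketch of textbook RSA in Python. The primes are tiny and there is no padding, so this only illustrates the idea and must not be used for real data. The names `apply_key`, `public_key` and `private_key` are invented for this example.

```python
# Toy RSA with tiny primes. Real keys are thousands of bits long and use padding.
p, q = 61, 53                # Alice's two secret primes
n = p * q                    # 3233, the modulus, part of both keys
phi = (p - 1) * (q - 1)      # 3120
e = 17                       # public exponent, chosen coprime with phi
d = pow(e, -1, phi)          # private exponent (needs Python 3.8+)

public_key = (n, e)          # Alice distributes this to Bob (and anyone else)
private_key = (n, d)         # Alice keeps this to herself

def apply_key(value, key):
    """Raise `value` to the key's exponent modulo n (the core RSA operation)."""
    modulus, exponent = key
    return pow(value, exponent, modulus)

message = 65                                   # the data, as a number below n
ciphertext = apply_key(message, public_key)    # Bob encrypts: 2790
recovered = apply_key(ciphertext, private_key) # Alice decrypts: 65
print(ciphertext, recovered)
```

Note that encrypting and decrypting are the same mathematical operation; only the key differs.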


Signing
In the example above we have shown how Bob can send data to Alice in a way that no one other than Alice can read. But let's now say that Alice wants to send some message to Bob, and Bob needs to verify that the message was really sent by her.
In this case Alice will use her private key to manipulate the bits of the data, and Bob can use her public key to read the original message.


We call this type of data scrambling signing, since the data is not really protected from the general audience: anyone can have the public key, which is now used to read the data.
Signing on its own is not much different from encryption, as it is just a different type of data manipulation algorithm used for a different application (verifying the sender, not protecting the data).
Since all data scrambling algorithms are quite complex and time consuming, the signing is usually done over a smaller piece of data, derived from the message, that is sent together with the whole message.
Today the most common way of signing is to apply RSA to a hash of the data (e.g. SHA1).
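The hash-then-sign idea can be sketched as follows. This is a toy, under loud assumptions: the key is built from two small well-known Mersenne primes, there is no padding, and I use SHA-256 from Python's `hashlib` in place of the SHA1 mentioned above. The names `digest`, `sign` and `verify` are made up for this example.

```python
import hashlib

# Toy key pair: far too small for real use, chosen only so the example runs.
P = 2**61 - 1                       # a Mersenne prime
Q = 2**89 - 1                       # another Mersenne prime
N = P * Q                           # modulus shared by both keys
E = 65537                           # Alice's public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # Alice's private exponent (Python 3.8+)

def digest(message):
    """Hash the message and squeeze the hash into a number below N (toy step)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message):
    """Alice: apply the private key to the hash of the message."""
    return pow(digest(message), D, N)

def verify(message, signature):
    """Bob: undo the signature with the public key and compare the hashes."""
    return pow(signature, E, N) == digest(message)

order = b"Meet at noon"
signature = sign(order)
print(verify(order, signature))              # the untouched message checks out
print(verify(b"Meet at nine", signature))    # a changed message does not
```

The message itself travels in the clear; only the small signature is computed with the private key.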


Key exchange
One last thing that public keys can be used for is the secure exchange of a symmetric key over an insecure communication channel. Once the key is securely shared, it can be used to establish a secure connection with symmetric encryption of the data.
One such protocol is the Diffie–Hellman key exchange protocol shown in the image below.


Both Alice and Bob generate a key pair and exchange the public keys with each other. Each of them then combines their own private key with the other person's public key. Both arrive at the same secret value, which is never sent over the channel and can therefore be used as the symmetric key. Someone who only sees the two public keys cannot compute it. Note that the exchange on its own does not verify who the other party is.
This protocol is now part of the well known SSL protocol, which we will talk about in one of my upcoming posts.
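A Diffie–Hellman exchange can be sketched with tiny numbers like this. The prime 23 and generator 5 are toy values (real exchanges use very large, carefully chosen parameters), and the function names are invented for this example.

```python
import secrets

P = 23   # public prime modulus, known to everyone (toy size)
G = 5    # public generator, known to everyone

def new_private():
    """Pick a random private value in the range 1..P-2."""
    return secrets.randbelow(P - 2) + 1

def public_value(private):
    """The value that is safe to send over the insecure channel."""
    return pow(G, private, P)

def shared_secret(their_public, my_private):
    """Combine the other side's public value with my own private value."""
    return pow(their_public, my_private, P)

alice_private, bob_private = new_private(), new_private()
alice_public, bob_public = public_value(alice_private), public_value(bob_private)

# Only the public values cross the channel, yet both sides get the same secret.
alice_secret = shared_secret(bob_public, alice_private)
bob_secret = shared_secret(alice_public, bob_private)
print(alice_secret == bob_secret)
```

For instance, with private values 4 and 3 the public values are 4 and 10, and both sides compute the shared secret 18.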

SSCC - Keys

For more info on cryptography keys, this Wikipedia page is a good start:

Now my short summary

Q: What are keys?
A: The things that lock and unlock our doors so we can keep our home safe and yet let us in.

Q: So, what are keys regarding computer data?
A: Things that lock and unlock our data so we can keep our data safe and yet let us use it.

All the cryptography keys (or simply keys from now on) can be divided into two categories:
  • Symmetric keys
  • Asymmetric keys
If the same key can be used for locking and unlocking the data, then we are talking about a symmetric key.

In the image above, persons A and B both have the same symmetric key K and can share a message securely with each other, as the unauthorized person C does not have the key and can only see scrambled data.
Usually when we talk about keys one immediately thinks of encryption, but we won't use this word yet, as encryption might refer to several different types of data scrambling :). Since data scrambling does not sound very professional, let us use the term bits manipulation.
When it comes to bits manipulation algorithms that use symmetric keys, we are talking about ciphering.

Ciphering is one of the earliest ways of protecting data, and it dates from long before computers were invented. In some cases, where less security is required, cipher algorithms can be very simple. One such example is the XOR logic operator, where applying the same key a second time restores the original data:
100101101 XOR 110011001 = 010110100
010110100 XOR 110011001 = 100101101
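The two lines above can be reproduced in a few lines of Python. The helper `xor_bytes` is a made-up name that extends the same idea to whole messages by repeating the key; as said above, this is only suitable where little security is required.

```python
def xor_bytes(data, key):
    """XOR every byte of data with the key, repeating the key as needed."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

message = 0b100101101
key = 0b110011001

scrambled = message ^ key          # lock the data with the key
restored = scrambled ^ key         # the same key unlocks it again

print(format(scrambled, "09b"))    # 010110100
print(format(restored, "09b"))     # 100101101

# The same trick on a text message, with a one-byte key.
secret = xor_bytes(b"order", b"k")
print(xor_bytes(secret, b"k"))     # b'order'
```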

Now imagine that your secure zone consists of a group of more than two people. Let's say it is a group of police officers communicating over some computer network. One person in the group is the chief and gives orders to the others, while the others must only receive orders and not be able to send them. Obviously this restriction cannot be enforced by using a symmetric key: in order for the group to receive the orders they will need to share the same key, which will enable them to also send orders to the rest of the group. But if we had a key pair instead of only one key, and we used one key from the pair to lock the message and the other to unlock it, then the chief could have the first key and hence be able to send orders, while the group could share the second key and only be able to receive orders.

If we use separate keys for locking and unlocking the data, then we are talking about asymmetric keys.
The key used by the chief and not shared by the others in the group is called a private key, while the shared one is called a public key.

Service Security Crash Course - Preface

In the past several months I've been challenged to work on a few e-banking and e-government iPhone applications which required secure communication with the appropriate backends. This series of posts called "Service Security Crash Course" should cover what I have learned while working on these projects. The audience level is "advanced newbie" :), that is, the information provided should be enough to bring anyone from a level of "I have heard about certificates" to a level of "it would be much easier if we use certificates from a valid CA" :)

The series should cover:
  • Keys
  • Data scrambling (a.k.a. Encryption)
  • Certificates
  • SSL
  • XML Security
All the posts in the series will be prefixed with SSCC (short for Service Security Crash Course).