Encryption: introduction

As we touched on in our cryptography introduction, encryption is the technique of encoding a message (or series of bytes) so that it can only be read by a party that knows some "secret" about how it's been encoded. We assume for now that an attacker can't get the secret by directly observing the encoding/decoding process or by having access to the code in any way. For example, imagine a communication between a client and a server, where an attacker can freely observe any point in the network between the two machines, but not the machines themselves: they just see the bytes flowing to and fro.

How not to do it: security through obscurity

As a first naive thought, we might wonder about creating our own "secret encoding scheme". Anybody who was once a 10-year-old ZX Spectrum programmer may well have even done this. The problem is that it's impossible to prove that nobody can figure out how to break your code. If the only Bad Guys trying to crack your system are 10-year-old Commodore 64 programmers, and the data at stake is the mark you got in your last German vocab test, then just maybe you can invent something sufficiently secure. But now imagine the data you're trying to protect is this month's credit card transactions, a national database of known paedophilia victims, or the personal details of 25 million UK taxpayers. You need to encrypt that data with a scheme that's pretty damn secure. (Inland Revenue: I hope you're listening to this...) The attacker can't directly get access to your code to disassemble or observe it (remember, we said for now we'd assume that was the case). But even so, your once-ingenious scheme of multiplying every ASCII value by two and reversing every pair of bytes suddenly sounds as though it might not cut the mustard.
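To make that toy scheme concrete, here is a minimal sketch of it in Java. The class and method names (NaiveScheme, encode, decode, swapPairs) are made up for illustration, and we are assuming that "reversing every pair of bytes" means swapping each adjacent pair. The thing to notice is that decode() needs no secret beyond knowledge of the scheme itself:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of the home-made scheme described above.
public class NaiveScheme {

    // "Encrypt": double each ASCII value, then swap each adjacent pair of bytes.
    public static byte[] encode(String message) {
        byte[] in = message.getBytes(StandardCharsets.US_ASCII);
        byte[] out = new byte[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = (byte) (in[i] * 2); // ASCII is 0..127, so the result fits in 8 bits
        }
        swapPairs(out);
        return out;
    }

    // "Decrypt": the same steps undone. No key is involved anywhere.
    public static String decode(byte[] data) {
        byte[] out = data.clone();
        swapPairs(out);
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) ((out[i] & 0xFF) / 2); // mask so the byte is treated as unsigned
        }
        return new String(out, StandardCharsets.US_ASCII);
    }

    // Swap bytes 0<->1, 2<->3, ...; a trailing odd byte stays where it is.
    private static void swapPairs(byte[] b) {
        for (int i = 0; i + 1 < b.length; i += 2) {
            byte tmp = b[i];
            b[i] = b[i + 1];
            b[i + 1] = tmp;
        }
    }

    public static void main(String[] args) {
        byte[] secret = encode("Mark: 7/10");
        System.out.println(decode(secret)); // prints "Mark: 7/10"
    }
}
```

Every byte this produces is an even number, which is exactly the kind of pattern that gives an observer a foothold even when they have never seen the code.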

And now imagine that the Bad Guy is somebody who stands to make considerable financial gain, and/or cause you considerable loss, if they can decrypt your data. And imagine they have hefty computing resources to throw at the problem if necessary (bearing in mind that any old schmuck can rent pay-as-you-go processor power nowadays for about one dollar per 20GHz hour and no initial outlay...). Or imagine that it is worth their while to "persuade" a few or ten or a hundred of the country's top mathematicians to work on cracking your system. How confident are you that the scheme you invented will stand up to these threats? And how confident are you that you can invent not just one such scheme, but a different scheme, each with this level of security, for every distinct party that you need to communicate with?

This is the model that cryptographers generally refer to as security through obscurity. And generally, for reasons that are hopefully now obvious, it's a no-goer¹. Pretty much every "secret" encoding system in history has been leaked or cracked sooner or later: famous cases include DVD encryption, RC4, and the A5/1 scheme used for encrypting GSM traffic. (The moral of the story in the last case being that it's especially difficult to hide the fact that you're using a deliberately weak level of encryption...)

Next: key-based encryption

So how should we proceed? On the next page, we look at the notion of key-based encryption, where the algorithm is made public and the only secret is the key.

1. Unfortunately, there are cases where, although a no-goer, it's the most viable of the various non-viable options; for now, we're not concerned with those situations.
