Encryption: keys
So we've decided that security by obscurity is not a good option
for encoding our data. Essentially, it's not a good option because:
- we may need to encode independent pieces of data (be they files,
communications across a network, etc.), and we might want them to be encoded differently,
so that the fact that Person A can decode File A' with our system doesn't automatically
mean that they can also decode File B';
- we would like our system to be "provably secure", or at least, since that is impossible
in most practical cases, we'd like some reasonable confidence that it's not insecure;
in practice, it's difficult to have several experts scrutinise an algorithm for
security problems whilst at the same time keeping it secret1.
So instead, the vastly more practical and secure option is as follows:
- design a public algorithm that encrypts data using a specific key supplied
each time it is used;
- design the algorithm so that essentially any random key will work
(or at least, any random key chosen according to some well-defined criteria);
- now, the only secret is the key in a particular conversation
or use of the algorithm.
For now, we can imagine that a "key" is just some specified number of random
bytes. Roughly speaking, this is a sequence of bytes that we'll "mix in" when we
encrypt a particular conversation or file. And the key can change for, say, every file or
communication that we encrypt. But the actual encryption algorithm— which defines the
way in which we "mix in" the key— will always be the same. Our premise is
essentially that "with this algorithm and any permitted randomly-chosen key,
the encrypted data will always be totally undecipherable without a key that is 100% correct".
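To make this concrete, here is a minimal sketch (the class and variable names are invented
for this illustration) of generating a 16-byte key as nothing more than 16 cryptographically
random bytes, using java.security.SecureRandom:

import java.security.SecureRandom;

public class KeySketch {
    public static void main(String[] args) {
        // A "key" here is just 16 random bytes (128 bits), produced by a
        // cryptographically strong random number generator.
        byte[] key = new byte[16];
        new SecureRandom().nextBytes(key);

        // Print the key in hex so we can see it really is just a short run of bytes.
        StringBuilder hex = new StringBuilder();
        for (byte b : key) {
            hex.append(String.format("%02x", b));
        }
        System.out.println("Key: " + hex);
    }
}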
Then, we make our encryption
system public and encourage the world's cryptography experts to deliberately try and find flaws
in our algorithm. Without the specific key that encrypted a given piece of data, the algorithm by itself is not enough to decrypt that data, so it is of no security concern (quite the opposite,
in fact) that the algorithm is public.
The AES process showed that organising a competition for a new standard (where the winning team gets fame/money
beyond their hamster's wildest dreams) can help with this: participants then have an
incentive to look for flaws in each other's submissions, and even for those outside the
competition, there'd be enough kudos in writing a paper that identified a
flaw in an emerging NIST standard that some experts would surely start looking.
Previous examples of active incentives to gauge the strength of cryptographic algorithms
include the contest held by Dr Dobb's Journal to find flaws in Blowfish2,
and RSA's competition to factor various large numbers.
After some time, we round up the resulting attacks and analyses of
our algorithm, and assess how secure our system is. If, in several years, none of the world's leading cryptanalysts has found a flaw (and we're sure they've been looking),
then this gives a high level of confidence that
our algorithm is "secure". If after 6 months, somebody publishes a paper showing how to
break a 9-round version of our algorithm and our algorithm uses 10 rounds (see
the section on block ciphers for the notion of rounds),
then we should be much more worried.
Assuming that our algorithm is secure, then as a secure, key-based algorithm
it has some desirable properties:
- it is secure whatever the key, except possibly for
some well-published set of "weak keys" that we can deliberately avoid;
- the only secret is the key, of which we can easily3 generate as many as we need (it's
much easier for a computer program to generate a series of 16 bytes than to generate a
unique, secure encryption algorithm on each run!); that means that our server can encrypt
Conversation A with Person A' and, with the selfsame algorithm but a different key, encrypt
Conversation B with Person B', and A' and B' cannot illegitimately decrypt each other's
conversations (a sketch of per-conversation key generation follows this list);
- a more minor issue, but still useful in practice as we'll see, is that it reduces
and fixes the amount of secret information that we need to manage— there can be
just (say) 16 bytes of key information per entire file/conversation/hard disk etc.
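Picking up the point about per-conversation keys, the sketch below (class and method names
are invented for this example) uses javax.crypto.KeyGenerator to produce two independent
AES keys; the algorithm stays the same for both conversations, only the key changes:

import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

public class PerConversationKeys {
    public static void main(String[] args) throws Exception {
        // The same public algorithm (AES) serves every conversation;
        // only the 16-byte key differs from one conversation to the next.
        KeyGenerator gen = KeyGenerator.getInstance("AES");
        gen.init(128);

        SecretKey keyForConversationA = gen.generateKey();
        SecretKey keyForConversationB = gen.generateKey();

        // Each call produces an independent key, so knowing one key tells
        // an attacker nothing useful about the other.
        System.out.println("Key A: " + toHex(keyForConversationA.getEncoded()));
        System.out.println("Key B: " + toHex(keyForConversationB.getEncoded()));
    }

    private static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) sb.append(String.format("%02x", b));
        return sb.toString();
    }
}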
Now, our notion of "secure" actually implies a couple of extra things. In particular, we
want decryption to be an "all-or-nothing" operation. It would be no good,
for example, if the encrypted/decrypted output with key 01234567 was similar to the output with
key 01234568 in any detectable way. If it were, then: (a) an attacker could eventually guess
our key by just "trying lots of keys", using failures as a clue as to how "warm" they were,
and thus which key to try next; (b) Person A might be able to use his/her supposedly
unique encryption key to decrypt Person B's data without authorisation. The relationship between
the key and the resulting encrypted text has to appear "random".
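To get a feel for what this key sensitivity means, here's a rough sketch (AES is used purely
as an example algorithm, and ECB mode only to keep the code short; neither choice is a
recommendation) that encrypts the same plaintext under two keys differing by a single bit.
The two ciphertexts printed should look completely unrelated:

import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;

public class KeySensitivityDemo {
    public static void main(String[] args) throws Exception {
        byte[] key1 = "0123456789abcdef".getBytes(StandardCharsets.US_ASCII);
        byte[] key2 = key1.clone();
        key2[15] ^= 0x01;                      // flip a single bit of the last key byte

        byte[] plaintext = "The same message".getBytes(StandardCharsets.US_ASCII);

        System.out.println("Ciphertext 1: " + toHex(encrypt(plaintext, key1)));
        System.out.println("Ciphertext 2: " + toHex(encrypt(plaintext, key2)));
        // Despite the near-identical keys, the two outputs should share no
        // detectable pattern.
    }

    private static byte[] encrypt(byte[] data, byte[] keyBytes) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(keyBytes, "AES"));
        return cipher.doFinal(data);
    }

    private static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) sb.append(String.format("%02x", b));
        return sb.toString();
    }
}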
Guarantee of security?
Unfortunately, there's probably no practical encryption system whose
security can be guaranteed: at any given time, we deem a system "secure" in the sense
that the world's cryptanalysts have so far not found a flaw. But in general, no mathematical
proof will tell us that a flaw can't ever be found, or indeed that a flaw will even be
needed to break the system given some future technology4.
Next: symmetric key encryption
On the next page, we begin looking at symmetric key
encryption in Java. This is arguably the simplest case, where the key used to
encrypt is the same as that used to decrypt, and is thus a secret between the parties
involved.
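As a small taster of that idea (the class name and the choice of AES in ECB mode are just
for this sketch; the next page covers the real details), a round trip with a single shared
key might look like this:

import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import java.nio.charset.StandardCharsets;

public class SymmetricTaster {
    public static void main(String[] args) throws Exception {
        // One key is used for both encryption and decryption.
        KeyGenerator gen = KeyGenerator.getInstance("AES");
        gen.init(128);
        SecretKey key = gen.generateKey();

        Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");  // ECB only for brevity
        cipher.init(Cipher.ENCRYPT_MODE, key);
        byte[] ciphertext = cipher.doFinal("Hello, world".getBytes(StandardCharsets.UTF_8));

        cipher.init(Cipher.DECRYPT_MODE, key);
        byte[] decrypted = cipher.doFinal(ciphertext);
        System.out.println(new String(decrypted, StandardCharsets.UTF_8));  // prints "Hello, world"
    }
}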
1. Even if you think you can just "pay enough experts" to look at it under a
non-disclosure agreement, it's extremely difficult to know that it's been seen by
"enough" of the "right" experts. Think about how the Internet has made certain sites explode in
popularity virtually overnight. Your home-grown scheme, or the one that you paid one expert to
look at for one week, may appear to offer adequate security now. But if your program
suddenly gains in popularity, that situation could be reversed too rapidly for you to
do much about the insecurity before it's too late.
2. See Bruce Schneier's 1995 article: Blowfish: One Year Later.
3. Compared to most uses of random numbers, we actually need to be more careful about
how we generate them: see the discussion of
SecureRandom.
4. For example, many of our current estimates of the effort required to guess a key will
need revising if and when quantum computers become mainstream.