the cryptanalysis blues

neural5

The field of cryptanalysis (code breaking) has historically been the territory of academia and intelligence agencies. In stark contrast to most other fields of information security, self-taught hackers have had a very limited role in the development of a very rich corpus of cryptographic attacks and countermeasures.

The general perception within the hacking community is that cryptography is “hard”, the learning curve is steep, and it involves a lot of mathematics. But I dispute this.

The issue I have isn’t that academic cryptographer’s haven’t done a great job of the math (they have), but that the math is peripheral to the conceptual understanding and implementation of the attacks. In the production of ideas and the implementation of attacks, hackers are an untapped resource in cryptography.

Let me take you for a casual stroll through the mathematical content of the famous “differential cryptanalysis”, and hopefully you’ll see why. Differential cryptanalysis has inspired literally dozens of cryptanalyses against both symmetric ciphers and cryptographic hash functions, so it should be an excellent measure of what it takes to build a good cryptanalysis by any standard.

I’m going to dissect chapter 2 (“Introduction to differential cryptanalysis”) of “Differential Cryptanalysis of DES-like Cryptosystems”. This is where the core concept and mechanisms of the technique are espoused.

  • The first couple of pages (8 and 9) set about rigorously defining some notation for the various building blocks of a symmetric algorithm. Academics love to spend a lot of time defining notation (I’m fairly certain that it’s written in stone somewhere that any two academics are strictly forbidden to use a consistent notation), but there’s no deep mathematical content here. It’s only syntax.
  • On page 9 is our first definition, something called “independent keys”. It turns out that this definition isn’t hugely important – pretending that there’s no key schedule just makes things “simpler” – but of course we only learn this several pages later. “Abandon hope all ye who enter here”
  • Diagrams, diagrams. In this case, a generic overview of how DES works on pages 10 and 11. (N.B. Cryptographers must be very careful in their use of visualizations, otherwise their papers may become interesting and understandable. This can usually be handled by forbidding any sort of creativity in these visualizations, and instead resorting to Spartan blocks and lines. Color is, of course, strongly discouraged.)
  • Finally (on pages 11 and 12) we start to get to some properly meaty ideas. The trick to reading this paper is simple – start thinking of XOR as “the difference between”. The idea being put forth here is that most of the components used in DES are linear, i.e. they don’t stop us from working out “difference”. For example, if we know how a pair of plaintexts differ prior to encryption, we can still work out the exact difference in algorithm state after expansion permutation and key mixing. The same thing applies to the final permutation and round chaining steps. The only exception to this rule is the non-linear “S-boxes”.
  • S-boxes are actually just specially crafted lookup tables. They give a substitution primitive – one input is substituted for some specific output, and the same input will always be substituted with the same output. The crux of differential cryptanalysis is a simple observation about S-box behavior:
    “for any particular input XOR not all the output XORs are possible, the possible ones do not appear uniformly, and some XORed values appear much more frequently”
    This simply says that if we know the difference between two S-box input states, then we have a much better probability than expected when guessing what the output difference might be.
  • The next couple of pages just lay down some terminology and examples around how to express relations between input and output differences. We also get a first look at a “distribution table” for an S-box. This table is actually very simple; each cell just counts the total number of ways that a particular output difference can occur given a certain input difference. As you can see, this table is not as uniform as you would hope.
  • So a picture has emerged of how the attack will progress:
    1. we assume prior knowledge of plaintext differences (a model called known-plaintext attacks)
    2. we pair the known plaintext differences with the resulting ciphertext differences
    3. and then we exploit the statistical properties of the S-box to some up with some good guesses for what key values are most likely to satisfy the differences we have.
  • Pages 15 through 17 give some examples on how to actually recover key bits. It really is as simple as narrowing down the possible key values to a smaller set based on the likelihood of an input difference causing a particular output difference.
  • So far all discussion has been focused narrowly on analyzing the basic S-box primitive. The next couple of pages (18 and 19) extend the results on S-boxes to the overall round function. Its intuitive enough that the round function probability is the product of all of the S-box probabilities – the math is not dissimilar from calculating the probability that a coin flip will be heads eight times in a row.
  • About the only problem left is to extend the attack across multiple rounds. The DES cipher goes for 16 rounds, and with each round the probability that a set of differences will hold true decreases. To find out the best way to do this, the paper introduces the concept of an “n-round characteristic” (page 22). Good characteristics have a better chance of getting “right” plaintext pairs, and each “right” plaintext pair give us a better shot at guessing the actual key value. In essence, a characteristic just describes an abstract path through the cipher that’s more likely to occur than most other paths (given a certain difference in the plaintext).
  • From here, the rest of the chapter (up to page 29) is essentially all about finding the best kinds of characteristics. Characteristics that can be stretched to the full number of cipher rounds with a good probability are the aim. Ultimately, deciding whether a characteristic is good or not comes down to the signal-to-noise ratio on how often the correct key comes up versus nonsense.

In eleven simple bullet points you have the fundamental basis of a full-fledged attack. No abstract algebra, no first-order logic, no number theory. Just a good idea coupled with a few keen observations. Sprinkle in some basic probability theory on to show that it all really works, and you have created a masterpiece. It really is a masterpiece – it just simply isn’t the math that makes it so.

Now, the counter example du jour is public key cryptography and the presumed requirement of an advanced degree in number theory to get anywhere close to breaking the primitives there. I don’t disagree, but unfortunately the hardness of these primitives cannot save the day – in practice we have hybrid cryptosystems that rely on unbroken symmetric algorithms and hash functions just as much they do integer factorization or discrete logarithms.

And to be honest, our understanding of what makes a symmetric algorithm or hash function secure seems rather… superficial.

Oded Goldreich has argued that the mathematical rigours of provable security are the only way to keep cryptography within the realms of science. Perhaps then its the duty of hackers to also keep cryptography within the realms of conceptual art?

[email protected] (@benhawkesrss

P.S. I highly recommend Neal Koblitz’ “The Uneasy Relationship Between Mathematics and Cryptography”