“Crypto intro for assembler programmers”
Basic cryptography concepts explained in ten minutes
Virtually anybody who has done any computer programming at all could easily grasp the concept of a simple shift cipher, but trying to make the jump to understanding even the most primitive of ‘real’ crypto algorithms such as RSA or DES can seem very daunting.
However, if you've done any serious assembler programming, or even low-level systems programming in C, then you're already familiar with many of the key concepts used in modern digital cryptography systems.
Does this come as something of a surprise? Quite possibly.
Much of the reason for this confusion is simply due to the fact that the available literature often assumes that the reader has a background in mathematics, and as such, introduces the concepts in mathematical terms. Ironically, some of these concepts which are non-trivial to explain within the field of mathematics, are fairly simple fundamental principles of computer science that you probably already understand.
The upshot of all this is that if you've been put off trying to make sense of it all, or struggling with having it explained in a way that is unfamiliar to you, then relax. It's probably easier than you thought. This intentionally short primer is perfect as a ten-minute lunch break read to get you started!
Modular arithmetic
𝓍 MOD 256x % 256and $0xff, %eaxModular arithmetic
If you've got the value 0xFFFFFFFF stored in an unsigned 32-bit register, (or an unsigned int in a C program), and increment it by one, what happens? Of course, it wraps to 0x00000000. Whenever you mask off the high-order bits of a value and then do some operation on it, that's modular arithmetic.
Given that in assembler programming, all arithmetic is modular arithmetic, it probably seems like stating the obvious to say things like, the value 0x1FFFFFFF0 is the same as 0x2FFFFFFF0, if you're working in a 32 bits. That's to say, 0x1FFFFFFF0 + 0x000000001 and 0x2FFFFFFF0 + 0x000000001 will yield the same result.
It's probably also fairly obvious that there are an infinite number of values that evaluate to the same result when you only look at the lower 32 bits. If you're counting bytes of network traffic in an unsigned 32-bit value, then it will flip over to zero every 2^32 bytes, and this will continue happening forever.
For anything involving a bitstream, such as cryptography, the modular arithmetic is usually going to be a power of two. If you have a background in computing, and are accustomed to looking at numbers in hexadecimal and binary, then this is all trivial to understand.
Explaining and proving the general case for these properties in terms of mathematics, where the modulo value isn't necessarily a power of two, can get quite verbose and complicated, and may quickly start to sound quite confusing to anybody with a computing background. Just remember that in practice, essentially all you're doing is working within a fixed-width register, masking some high-bits of the input because their value doesn't matter.
You'll hear this all the time, and it's certainly a very important part of just about all modern crypto systems. All it is, though, is bit-swapping. A simple permutation operation on an 8-bit number might be to swap bit 0 with bit 1, bit 2 with bit 3, bit 4 with bit 5, and bit 6 with bit 7. So 01010101 becomes 10101010.
Whereas you might try to avoid doing tedious operations such as this in software, this kind of operation is very fast when implemented in dedicated hardware, as it's literally just connecting the outputs from one set of gates to the inputs of another in a non-linear order.
Ss-boxx[i]mov (%rbx,%rcx,1), %allookup-tableEasy to implement in C or assembler
S-boxes, or substitution boxes, are another term you'll frequently encounter, and the good news is they are basically lookup tables.
All that you are doing with an s-box is mapping one set of input values to another fixed set of output values. The values may or may not have the same bit-length, but obviously for the operation to be reversible the substitution has to be un-ambiguous.
Fields and finite fields
Just like with modular arithmetic, the concept of finite fields is essentially performing arithmetic operations within a constrained set* of numbers, rather than all of the numbers that exist.
* Note that the word set here is used in the non-technical standard English sense. The same word has a very specific and well-definited meaning in mathematics.
Unlike modular arithmetic, though, the numbers or values that make up a field are not necessarily obvious or easy to work with. If you're working on 8-bit values in an 8-bit register, you probably think of them as 0 to 255, or maybe -127 to 128, and that maps very neatly on to the idea of modulo 256 arithmetic.
If you were doing computations on values in a field with 256 distinct possible values, then you could just as easily store those values as 8 bits in an 8-bit register. However, the normal rules of math wouldn't necessarily apply.
It's broadly similar to the situation if you were to store two 4-bit values in the upper and lower bits of an 8-bit register. You couldn't then expect to get meaningful results out of it by doing shift operations, for example, and then taking the whole 8-bit value. Any specific combination of those eight bits will have a well-defined meaning, but you can't do normal math operations on them. Any operation that you want to do on the data in this register will have it's own set of rules for manipulating the bits.
Our ten-minutes are up, so that's it for today's short introduction. You'll meet the concepts that we've touched on above almost immediately once you start looking at any material to do with digital cryptography, but as you can hopefully see, you don't need to start from zero.