Anatomy of the Heap-based Overflow
The classic heap-based exploit relies upon a macro used in free memory chunk list manipulations in glibc's heap management code. By overflowing a chunk, an attacker can set the values of the fd and bk pointers of the next chunk header in memory. When the unlink macro is subsequently called on the overflown chunk header, a write to a memory location specified by the attacker with a value also specified by the attacker can occur. If the attacker is careful enough to supply the correct values to the chunk list pointers, the attacker can modify a function return address, a GOT or PLT entry, or an application-level pointer in order to subvert the control flow of the process.
Other variants of this attack exist, including using the frontlink macro or exploiting the size fields of memory chunk headers in conjunction with fake chunk headers, but they operate on the same general principles.
Technique
The technique employed in our heap protection patch is an adaptation of canary-based schemes used by various stack protection patches. This is based off of the observation that by placing a canary word in front of the data to be protected, one can determine whether the protected data has been modified before the attacker has a chance to gain control of the process. In the case of a buffer overflow, the canary word must necessarily be modified if the protected data is modified. If the attacker is able to jump over the canary, he is in general able to write to arbitrary memory locations and thus already has the ability to subvert the process' control flow, e.g. by directly attacking the process GOT or PLT.
An important detail to note is that the value of the canary word must be difficult for the attacker to guess. Otherwise, the attacker could simply overwrite the canary with the expected value and thus complete his attack undetected. Some systems have opted to use a series of string-terminating NULL bytes, derived from the observation that if the attacker must include these in his overflow string, he cannot use the standard C library string functions to overwrite the protected data. In our patch, we currently use a random seed which is generated at process start and subsequently protected from further writes using a call to mprotect(). This prevents the attacker from setting the seed to a known value in some way, reducing the canary to a guessable value.
For glibc, we had to modify the structure of a memory chunk in order to implement our scheme. The original structure of a memory chunk in glibc consisted of a previous size field, a size field, and optionally two pointers, bk and fd. These pointers are only in use when the chunk has been freed, as they reside in the user data section of the chunk as an optimization. Otherwise, only the previous size and size fields are used.
The modified structure of a memory chunk has two fields prepended to the original structure, a canary field called magic and a padding field used to enforce the constraint that the size of a memory chunk header must be a power of two.
The remainder of the technique consists of setting canaries for allocated and freed blocks and inserting consistency checks into glibc's heap management code at appropriate points. Consistency checks on the canary are performed before each manipulation of a chunk's pointer fields; thus, an attacker has no opportunity to corrupt memory. Canaries are set after any operation which changes the size or pointer fields of the chunk.
An evaluation of the performance and detection capabilities of the current implementation is available, as well as the latest patches and binary packages.