kernelthread.com

A Taste of Computer Security

© Amit Singh. All Rights Reserved. Written in June 2004


Securing Memory

Secure Programming

Perhaps the most effective defense against a variety of attacks would be something that is subjective, open-ended, hard to attain, costly (in terms of time and money), and clichéd: good design and good programming.

It is usually difficult for a programmer to check for vulnerabilities in his own software.

Secure Languages

A more realistic, though not always practical, option is to use "safe" programming languages. If a language is both type-safe and easy to use, it effectively aids the programmer in writing safer code. An example is that of Java.

Compared to C, the probability of making subtle errors is smaller in Java. The code run by the Java Virtual Machine (JVM) has structural constraints that are verified through theorem proving. An important part of verifying a Java class file is verification of the Java bytecode itself. In this process, each method's code is independently verified: instruction by instruction. If an instruction has operands, they are checked for validity too. For example, the data-flow analysis performed during this verification would make sure that if an instruction uses operands (from the operand stack), there are enough operands on the stack, and that they are of the proper types. The verifier also ensures that code does not end abruptly with respect to execution. There are various other steps in this comprehensive verification process.

Moreover, the "sandboxing" properties of the JVM aid in security as well. Depending on whether it is local or remote code (an application vs. a downloaded applet, say), the restrictions placed by the Java runtime can vary greatly.

Another example of a safe language is ML. There even exist safe(r) dialects of C.

Note that a secure languages does not guarantee security. There could be flaws in the design or implementation of the language, or it may be used insecurely, or used in an insecure context.

While programmers could consciously not use functions that are vulnerable in any reasonable context, one has to contend with the vast bodies of code already existing. Even with automation (such as programmatically searching for vulnerable functions), auditing and reviewing large amounts of code is a difficult task — vulnerabilities can be created when normally secure functions are used in insecure ways, often subtly. For example, it is important to anticipate all kinds of data that will be presented to the program. Suppose you disallow a path separator character in the input ('/' or '\'), in this age of internationalization and Unicode, you must ensure that you are accounting for all representations of the undesirable characters.

There has been a great deal of security-related activity both in academia and the industry in recent years, and rightly so: security is an extremely lucrative and glamorous part of the industry.

Making Buffers Non-executable

A relatively early solution to the stack overflow class of attacks was to make the stack unexecutable. This could be generalized to making any kind of buffers unexecutable, so that even if an attacker manages to introduce rogue code into a program's address space, it simply cannot be executed. Making the entire data segment unexecutable does not work too well though — it is common for operating systems to legitimately generate code dynamically (for performance reasons, say) and place it in a data segment.

Note that we observed earlier that exploitable code does not necessarily have to be "injected" via a payload. Such code may already be present (legitimately) in the program's address space. Nonetheless, making the stack segment unexecutable does make things better, even though the defense is far from perfect.

One minor complication is that some systems might use executable stacks for implementing certain functionality. An often cited example is that of the Linux kernel, which emits signal delivery code on a process's stack (refer to the setup_frame function in arch/i386/kernel/signal.c). The approach used by a popular Linux unexecutable-stack patch is to make the stack executable for the duration of the above operation.

Another example is that of GCC trampolines. A trampoline is a small piece of code that is created at run time when the address of a nested function is taken. It normally resides on the stack, in the frame of the containing function. It is not very common to use this feature of GCC though.

libsafe

Baratloo, Tsai, and Singh [Transparent Run-Time Defense Against Stack Smashing Attacks, 2000] used two dynamically loadable libraries to defend against stack smashing. The first, libsafe, intercepts all library calls to vulnerable functions, and reroutes them to alternate versions that provide the same functionality as the original versions, but additionally ensure that any buffer overflows would be contained within the current stack frame — by estimating a safe upper limit on the size of buffers automatically. In this scheme, local buffers are not allowed to extend beyond the end of the current stack frame. The second library, libverify, implements an idea similar to StackGuard (see below), but instead of introducing the verification code at compile time, libverify injects the code into the process's address space as it starts running. Therefore, libverify can be used with existing executables, without recompiling them.

Both these libraries are used via LD_PRELOAD.

Monitoring Instructions

An approach similar to Java's bytecode verification could be to monitor those instructions — at runtime — that cause transfers of control flow. Although it sounds rather heavy-handed (and expensive performance-wise), Kiriansky, Bruening, and Amarasinghe presented a technique [Program Shepherding, 2002] for verifying every branch instruction to enforce a security policy. Their goal was to prevent the final step of an attack: the transfer of control to malicious code. They used an interpreter for this monitoring, although not via emulation (it would likely be unacceptably slow), but using a dynamic optimizer (RIO), which results in much better performance than would be possible through emulation.

Bounds Checking

An effective, though performance-wise expensive and difficult to implement (for C) technique to prevent buffer overflows is checking bounds on arrays. Note that in order to do range checking on buffers at run time, we would need the size of buffers in the executable. Lhee and Chapin [Type-Assisted Dynamic Buffer Overflow Detection, 2002] modify the GNU C compiler to emit an additional data structure describing the types of automatic and static buffers. With some exceptions, the types of such buffers are known at compile time.

Stack Shield

Stack Shield is a GNU C compiler modification that works by having a function prologue copy the return address to the beginning of the data segment, and having a function epilogue check if the current value of the return address matches the saved one.

FormatGuard

FormatGuard is a GNU C library patch that redefines printf and some related functions (fprintf, sprintf, snprintf, and syslog) to be C preprocessor macros. Each of these macros includes a call to an argument counter, the result of which (the argument count) is passed to a safe wrapper, the __protected_printf function, which determines if the number of % directives is more than the number of provided arguments.

Applications must be re-compiled from source to make use of FormatGuard.

PointGuard

PointGuard is a modification to the GNU C compiler to emit code for encrypting pointer values while they are in memory, and decrypting them just before they are dereferenced. The idea is that pointers are safe while they are in registers (because they are not addressable), and they are safe in memory if encrypted, as it would not be possible for an attacker to corrupt a pointer so that it decrypts to a predictable value. Consequently, PointGuard requires the use of load/store instructions (that is, a pointer is dereferenced only through a register) to be effective.

The encryption scheme is simply a XOR operation with the key, a word sized value generated from a suitable entropy source (such as /dev/urandom). Each process has its own encryption key, which is chosen at exec() time, and is stored on its own read-only page.

Note that statically initialized data is taken care of by re-initializing it, with pointer values encrypted, after the program starts running.

StackGuard

StackGuard is a mechanism that can be built into the GNU C compiler for detecting corrupted control information on the stack (in procedure activation records). StackGuard adds a "canary" word to the stack layout.

The canary is an allusion to the practice among Welsh coal miners of carrying a canary with them as they went down — the canary was more sensitive to poisonous gas than a human being.

The canary holds a special guard value, which may be a terminator character (such as NULL, CR, LF, and EOF), or a random number (as in OpenBSD's ProPolice). Control information may even be XOR-encrypted. A piece of prologue code generates the canary, and a piece of epilogue code verifies it. A recent StackGuard version has included the saved registers and saved frame pointer (in addition to the return address for a procedure) to the set of guarded entities.

Although two implementations of the same idea, StackGuard and ProPolice differ in several subtle ways.

Microsoft's Visual C++.NET has a similar feature, using the /GS switch. It can also compile builds with another runtime checking mechanism enabled (the /RTCs switch) that uses guard blocks of known values around buffers.

RaceGuard

RaceGuard is a kernel enhancement for protecting against vulnerabilities that result from race conditions during temporary file creation — a typical TOCTTOU (Time of Check To Time Of Use) problem. The underlying idea is to distinguish between the following two sequences:

1 Does the file exist? 1 Does the file exists? 2 Create the file 1.5 Attacker creates a link 2 Create the file

In the column on the right, the attacker exploits the race condition by creating, say, a symbolic link, to an already existing sensitive file, or even a non-existent file whose existence could be misused.

RaceGuard uses a per-process filename cache in the kernel. During a request for opening a file, if the result of step 1 is "NO", RaceGuard caches the filename. If step 2 encounters an existing file, and the filename is in the RaceGuard cache, the open attempt is designated as a race attack, and is aborted. If step 2 is successful and there are no conflicts, a matching entry in the RaceGuard cache is evicted.

PaX

PaX is an enhancement for Linux that uses several hardening techniques aimed at preventing address-space (memory) related exploits for a number of processor architectures. In particular, PaX implements non-executability of memory pages and randomization of address space layout.

Depending on whether a platform has a per-page executable bit in hardware, PaX can use or emulate this bit. In the particular case of IA-32, which does not have this bit in hardware (referring to mmap()/mprotect(), both PROT_EXEC and PROT_READ are the same bit), PaX emulates similar functionality by dividing user-level address space (normally 3 GB on Linux/IA-32) into two equal parts, with both the user code segment and the user data segment getting 1.5 GB each. A technique termed VM area mirroring duplicates every executable page in the lower half (user data) to the upper half (user code). Instruction fetch attempts at addresses located in the data segment that do not have any code located at its mirrored address will cause a page fault. PaX handles such page faults, and kills the task.

Another feature of PaX, Address Space Layout Randomization (ASLR), randomizes locations of objects such as the executable image, library images, brk/mmap managed heaps, user and kernel stacks. Data in the kernel can be made non-executable, and some critical kernel objects (the system call table, IDT and GDT on the x86, etc.) can be made read-only, although at the cost of having no loadable kernel module support.

In order to use fully position independent code, PaX uses a special type of ELF binary, ET_DYN, for relocation of the binary at a random location. Such binaries are mmap'd into memory just like regular shared object libraries. Note that this requires recompilation and relinking of all applications.

Red Hat's ExecShield is similar to PaX in many respects. It uses the Position Independent Executable (PIE) format to randomize executable images.

Some relevant links are listed below:

Randomization

Bhatkar, DuVarney, and Sekar [Address Obfuscation: an Efficient Approach to Combat a Broad Range of Memory Error Exploits, 2003] have proposed address obfuscation to randomize the absolute locations of code and data, and to randomize the relative distances between different data items. They use a combination of techniques to achieve this: randomizing the base address of memory regions, permuting the order of variables and routines, and introducing random gaps between objects (for example, random padding into stack frames, between successive malloc requests, between variables in the static area, and even within routines, along with jump instructions to skip over these gaps).

OpenBSD

OpenBSD, with its emphasis on proactive security, uses a combination of several techniques to harden the memory subsystem. Some of these are:

Some other security-related features (and philosophies) of OpenBSD include:

<<< Defeating Memory main Access Control >>>