This collection contains I/O traces taken at the system-call level in 2000 and 2001 as part of a security research project. The traces cover thirteen computers used for software development by CS researchers. The longest trace (machine02) runs from June 2000 to August 2001.

--------------------------------------
NOTICE: These traces have been converted from the Seer format to ASCII for easier analysis and to prevent data decay. They are slightly larger than their binary counterparts but do not require the Seer tools to read. The tools used to convert these traces are available at:

    http://www.lasr.cs.ucla.edu/geoff/seer_traces.html

and in the tools section of the iotta trace repository.

The content of the README.TXT file originally distributed with these traces follows. Since the traces have already been converted, no tools have to be run:

======================================================================
This is the content of the README.TXT

I. Layout of the disks
II. How to run the tool
III. How to read the output ASCII files from the tool
IV. Some example output ASCII of the trace
V. Sanitizing trace

---------------------------------------
I. Layout of the disks:

There are two things:
_ trace files in binary format (sanitized; see section V)
_ the tool to convert from binary to readable ASCII

We stored the trace files in binary format to save space. The traces from each machine are put in a separate folder. In each folder, there are multiple trace files recorded for that machine. For example, in folder machine01, you may see these files:

    machine01.07021720.001.gz
    machine01.07131926.001.gz

The first trace file was started on July 2nd at 17:20.
The second trace file was started on July 13th at 19:26.

There are several reasons why the traces are cut into separate files:
a) When the machine is restarted/rebooted, a new trace file is created.
b) When the trace file gets large.
(001 means the first fragment, 002 means the second fragment)

The tool is stored in the directory "tool".

---------------------------------------
II. How to run the tool:

The tool is stored in the folder "tool".
All trace files are in binary format, so the tool "dumpobs" is needed to convert them into readable ASCII format. Here is the usage for "dumpobs":

    ./dumpobs file1 [file2][.....]

where file1, file2, ... are the binary trace files. The ASCII is written to stdout.

Example:

    ./dumpobs machine01.07131926.001.gz > /tmp/machine01.07131926.001.txt

This converts the binary trace file into ASCII format and writes it to /tmp/machine01.07131926.001.txt

Example:

    ./dumpobs machine01.07021720.001.gz machine01.07131926.001.gz > /tmp/machine01.txt

This converts the binary trace files into ASCII format and stores the result in /tmp/machine01.txt

Note: You need gzcat or zcat in order to run the dumpobs utility. dumpobs calls gzcat, so if you only have zcat, you need to create a soft link named gzcat that points to it.

---------------------------------------
III. How to read the output ASCII files:

a) Overview

Each system call is logged in the trace file. Each is printed on a separate line in the ASCII file. The following is an example line from the ASCII file machine1.06070909.001.txt:

    32 UID 0 PID 812 ?? B 1008288727.376532 execve("/sbin/syslogd") = 0

The above line is the log entry for the "execve" system call.

Sometimes there is a line that reports the status of the machine when it starts up. For example:

    0 RESTART (scratch) at 1008288727 Thu Dec 13 19:12:07 2001

This line is logged when the trace daemon starts (i.e., the machine has just restarted). In this example, the machine was rebooted and the daemon was restarted at timestamp 1008288727 (Thu Dec 13 19:12:07 2001).

b) Format of the output

This part explains each field of a system call recorded in the trace file.

1. Offset of the entry: the offset of this entry (system call) from the beginning of the file.
2. UID: ID of the user that executes the system call.
3. PID: ID of the process that makes the system call.
4. Process Name: the name of the process. Sometimes the field is left as "??". That means the trace utility doesn't know the name of the process. This happens when these processes (like initd) are forked before the utility started.
5. A/B: "A" means the entry is recorded after the system call is executed, whereas "B" means it is recorded before.
6. Timestamp: the time at which the system call is made, in seconds and microseconds (six fractional digits; compare struct timeval in the Unix man pages).
7. System call and the corresponding arguments.
8. The return value of the system call.
9. When a certain system call is repeated, there is an optional field which specifies the number of repetitions.

For example:

    252 UID 0 PID 812 /sbin/syslogd A 1008288727.437706 open("/lib/i686/libc.so.6", O_RDONLY, 5772268) = 3

The offset of this entry is 252 bytes from the beginning of the file.
Root (UID 0) owns this process (PID 812), /sbin/syslogd.
The process executes the "open" system call, and the entry is logged when the system call has finished (A), at timestamp 1008288727.437706.
The system call arguments: open the file /lib/i686/libc.so.6 read-only (O_RDONLY).
The system call returns 3 as the file descriptor for the opened file.

---------------------------------------
IV. Some example output ASCII of the trace:

1) The following is a snapshot of normal trace data when the computer starts up:

    0 RESTART (scratch) at 1008288727 Thu Dec 13 19:12:07 2001
    32 UID 0 PID 812 ?? B 1008288727.376532 execve("/sbin/syslogd") = 0
    76 UID 0 PID 812 /sbin/syslogd A 1008288727.437433 execve("") = 0
    104 UID 0 PID 812 /sbin/syslogd A 1008288727.437576 open("/etc/ld.so.preload", O_RDONLY, 0) = -1 (2)
    160 UID 0 PID 812 /sbin/syslogd A 1008288727.437633 open("/etc/ld.so.cache", O_RDONLY, 78748) = 3
    216 UID 0 PID 812 /sbin/syslogd B 1008288727.437650 close(3, 78748) = 0
    252 UID 0 PID 812 /sbin/syslogd A 1008288727.437706 open("/lib/i686/libc.so.6", O_RDONLY, 5772268) = 3
    308 UID 0 PID 812 /sbin/syslogd A 1008288727.437723 read(3, 1024) = 1024
    344 UID 0 PID 812 /sbin/syslogd B 1008288727.437844 close(3, 5772268) = 0
    380 UID 0 PID 812 /sbin/syslogd A 1008288727.439369 chdir("/") = 0
    412 UID 0 PID 812 /sbin/syslogd A 1008288727.439558 open("/var/run/syslogd.pid", O_RDONLY, 0) = -1 (2)
    472 UID 0 PID 813 /sbin/syslogd A 1008288727.439762 fork(812) = 0
    504 UID 0 PID 813 /sbin/syslogd A 1008288727.442123 open("/var/run/syslogd.pid", O_RDONLY, 0) = -1 (2)
    564 UID 0 PID 813 /sbin/syslogd A 1008288727.442205 open("/var/run/syslogd.pid", O_CREAT|O_RDWR, 0) = 0
    624 UID 0 PID 813 /sbin/syslogd A 1008288727.442593 write(0, 4) = 4
    660 UID 0 PID 813 /sbin/syslogd B 1008288727.442605 close(0, 4) = 0
    696 UID 0 PID 813 /sbin/syslogd A 1008288727.442911 open("/etc/resolv.conf", O_RDONLY, 94) = 0
    752 UID 0 PID 813 /sbin/syslogd A 1008288727.443008 read(0, 4096) = 94
    788 UID 0 PID 813 /sbin/syslogd A 1008288727.443057 read(0, 4096) = 0
    824 UID 0 PID 813 /sbin/syslogd B 1008288727.443077 close(0, 94) = 0
    860 UID 0 PID 813 /sbin/syslogd A 1008288727.443280 open("/etc/nsswitch.conf", O_RDONLY, 1750) = 0
    916 UID 0 PID 813 /sbin/syslogd A 1008288727.443333 read(0, 4096) = 1750
    952 UID 0 PID 813 /sbin/syslogd A 1008288727.443468 read(0, 4096) = 0

2) Examples of Traces When Attack Occurred

The following are examples of traces of a buffer-overflow attack on crond.
a) The daemon crond was overflowed and forked off "/tmp/cron_root" to overwrite the /etc/passwd file:

    785516 UID 0 PID 546 /tmp/cron_root A 1015623540.370271 open("/etc/passwd", O_APPEND|O_CREAT|O_WRONLY, 744) = 4
    785564 UID 0 PID 546 /tmp/cron_root A 1015623540.370459 dup2(4, 1) = 1
    785600 UID 0 PID 546 /tmp/cron_root B 1015623540.370501 close(4, 744) = 0
    785636 UID 0 PID 546 /tmp/cron_root A 1015623540.370659 write(1, 44) = 44

b) After that, it attempted to clean up the evidence by removing all temporary files:

    789296 UID 0 PID 547 /tmp/cron_root B 1015623540.392235 execve("/bin/rm") = 0
    789632 UID 0 PID 547 /bin/rm A 1015623540.395261 lstat("/tmp/cron_echo") = 0
    789676 UID 0 PID 547 /bin/rm A 1015623540.395367 access("/tmp/cron_echo") = 0
    789720 UID 0 PID 547 /bin/rm A 1015623540.395504 unlink("/tmp/cron_echo") = 0
    789764 UID 0 PID 547 /bin/rm A 1015623540.395555 lstat("/tmp/ce") = 0
    789800 UID 0 PID 547 /bin/rm A 1015623540.395594 access("/tmp/ce") = 0
    789836 UID 0 PID 547 /bin/rm A 1015623540.395639 unlink("/tmp/ce") = 0
    789872 UID 0 PID 547 /bin/rm A 1015623540.395681 lstat("/tmp/cron_root") = 0
    789916 UID 0 PID 547 /bin/rm A 1015623540.395719 access("/tmp/cron_root") = 0
    789960 UID 0 PID 547 /bin/rm A 1015623540.395765 unlink("/tmp/cron_root") = 0

---------------------------------------
V. Sanitizing trace:

To protect the privacy of the users, these traces are sanitized so that no information indicates who each user in the trace is.
All system users, such as root, are preserved (not sanitized).
All human users are sanitized.
Some filenames of human users are sanitized, but the file extensions are preserved (for example, IRS-TAX.TXT becomes |j2kn0p|.TXT).

For more information on the sanitizing algorithm, please contact:
Dr. Geoff Kuenning: geoff@cs.hmc.edu
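
---------------------------------------
Appendix (illustrative only, not part of the original distribution):

The field layout described in section III and the file naming described in section I can be sketched in Python as shown here. This is a minimal sketch, not the Seer tools' own parser: the function names parse_trace_line and parse_trace_filename, the regular expressions, and the dictionary keys are all assumptions inferred from the sample lines in this README. Arguments, the timestamp, and any text after the return value (such as "(2)" or a repetition count) are kept as raw strings because this README does not specify their exact format. Process names are assumed to contain no spaces.

```python
import re
from typing import Optional

# Assumed layout, inferred from the sample lines in this README:
#   offset UID uid PID pid name A|B seconds.fraction call(args) = return [extra]
ENTRY_RE = re.compile(
    r"^\s*(?P<offset>\d+)\s+UID\s+(?P<uid>\d+)\s+PID\s+(?P<pid>\d+)\s+"
    r"(?P<name>\S+)\s+(?P<when>[AB])\s+(?P<timestamp>\d+\.\d+)\s+"
    r"(?P<call>\w+)\((?P<args>.*)\)\s+=\s+(?P<ret>-?\d+)(?P<extra>.*)$"
)

# Assumed layout of the start-up line: offset RESTART (kind) at seconds date
RESTART_RE = re.compile(
    r"^\s*(?P<offset>\d+)\s+RESTART\s+\((?P<kind>[^)]*)\)\s+at\s+(?P<timestamp>\d+)"
)

# Assumed file-name pattern: machine.MMDDhhmm.fragment (the year is not encoded)
FILENAME_RE = re.compile(
    r"^(?P<machine>[^.]+)\.(?P<month>\d{2})(?P<day>\d{2})"
    r"(?P<hour>\d{2})(?P<minute>\d{2})\.(?P<fragment>\d{3})"
)


def parse_trace_line(line: str) -> Optional[dict]:
    """Return a dict of fields for one ASCII trace line, or None if unrecognized."""
    m = ENTRY_RE.match(line)
    if m:
        rec = m.groupdict()
        rec["type"] = "syscall"
        for key in ("offset", "uid", "pid", "ret"):
            rec[key] = int(rec[key])
        # "??" means the trace utility did not know the process name.
        if rec["name"] == "??":
            rec["name"] = None
        # Trailing text after the return value is kept verbatim (format unspecified).
        rec["extra"] = rec["extra"].strip()
        return rec
    m = RESTART_RE.match(line)
    if m:
        return {
            "type": "restart",
            "offset": int(m["offset"]),
            "kind": m["kind"],
            "timestamp": m["timestamp"],
        }
    return None


def parse_trace_filename(name: str) -> Optional[dict]:
    """Split a trace file name into machine, start date/time, and fragment number."""
    m = FILENAME_RE.match(name)
    if not m:
        return None
    info = m.groupdict()
    for key in ("month", "day", "hour", "minute", "fragment"):
        info[key] = int(info[key])
    return info


if __name__ == "__main__":
    sample = ('252 UID 0 PID 812 /sbin/syslogd A 1008288727.437706 '
              'open("/lib/i686/libc.so.6", O_RDONLY, 5772268) = 3')
    print(parse_trace_line(sample))
    print(parse_trace_filename("machine01.07021720.001.gz"))
```

The timestamp is left as a string so that no fractional digits are lost to floating-point rounding; convert it only where arithmetic is needed.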