strace: Confirming shred

In this article we present a practical use of strace. shred is a tool used to securely delete files by overwriting them with random data several times. The theory is that when a file is erased normally, a ghost of its data is left behind. Overwriting the file several times is supposed to clear this ghost and render the file unrecoverable. Of course, today we have journaling filesystems, SSDs and backup systems that may all retain copies of the file's data, so shred is less useful than it once was.

If we give the -u and -z arguments to shred it should at least do the following things:

  • Overwrite the entire contents of the file three times (the default).
  • Rename the file multiple times.
  • Truncate the file, and remove it when it is done.

None of these operations can be done purely in userspace; each one requires a system call to the kernel. This makes strace an effective tool for confirming the behaviour.

In this example we run the command shred -u -z /tmp/test under strace. We will start looking at the strace log from the point where all the libraries are set up. The full output is captured in this file if you wish to see it.
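For readers who want to reproduce the trace, here is a minimal sketch of how the command could be launched from Python. The -o option tells strace to write its log to a file; the log name shred.log and both function names are assumptions made for illustration. Note that actually running it destroys the target file.

```python
import subprocess

def strace_argv(target, log_path="shred.log"):
    """Build the argument list: strace -o LOG shred -u -z TARGET."""
    # log_path is a hypothetical file name, not taken from the article.
    return ["strace", "-o", log_path, "shred", "-u", "-z", target]

def trace_shred(target, log_path="shred.log"):
    """Run shred under strace. WARNING: this really shreds the target file."""
    return subprocess.run(strace_argv(target, log_path), check=True)

# Only build the command here; call trace_shred() on a scratch file to run it.
argv = strace_argv("/tmp/test")
print(" ".join(argv))
```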

open("/dev/urandom", O_RDONLY)          = 3
read(3, "\307O\217\234=}\377X\333bo\265]\247\300;\315I\347\246\30\np\256\4e\211V\24:v\366"..., 2048) = 2048
close(3)                                = 0

Here shred opens /dev/urandom, the Linux source of pseudo-random numbers, reads 2048 bytes of randomness (it gets all 2048), then closes the file. We can see that shred reads some randomness, but it's important to note that we cannot know whether it used it. Any algorithm that uses this randomness later runs entirely in userspace, so it won't appear in the strace output.
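The same three system calls can be issued from Python's os module, whose functions are thin wrappers around the kernel calls. This is only a sketch of the traced sequence, not shred's own code; the function name read_seed is made up for the example.

```python
import os

def read_seed(n=2048, source="/dev/urandom"):
    """Mirror the traced open/read/close sequence and return the bytes read."""
    fd = os.open(source, os.O_RDONLY)   # open("/dev/urandom", O_RDONLY)
    try:
        return os.read(fd, n)           # read(3, ..., 2048)
    finally:
        os.close(fd)                    # close(3)

seed = read_seed(32)
print(len(seed))
```

Running this under strace should show the same open (or openat), read and close calls, along with whatever the Python interpreter does on startup.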

open("/tmp/test", O_WRONLY|O_NOCTTY)    = 3
fstat(3, {st_mode=S_IFREG|0664, st_size=62, ...}) = 0
lseek(3, 0, SEEK_SET)                   = 0
write(3, "\275\4\17\241vk6MC#\261\336\313\0069)\177\25%\220ZZ4\224{\210TQ`\266\370c"..., 62) = 62
fdatasync(3)                            = 0

The file we are trying to delete is open()ed. Then fstat() is used to read its characteristics from its file descriptor. Next we lseek() to the start of the file, even though we were likely there already since we just opened it. 62 bytes that appear to be random are written to the file. The old file was exactly 62 bytes, as we saw from the fstat() call, so it is entirely overwritten. As we said earlier, the data appears to be random, but we don't know what algorithm produced it or whether it is derived from the earlier read of /dev/urandom. Finally fdatasync() is called to flush the data out to disk.
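A single overwrite pass like the one above can be sketched as follows. This assumes Linux (os.fdatasync is not available everywhere) and uses os.urandom as a stand-in for the data, since the trace does not reveal how shred generates it. The function names are illustrative.

```python
import os
import tempfile

def overwrite_pass(fd, data):
    """One pass as seen in the trace: seek to start, write, flush to disk."""
    os.lseek(fd, 0, os.SEEK_SET)    # lseek(3, 0, SEEK_SET)
    written = os.write(fd, data)    # write(3, ..., 62); may be short for huge buffers
    os.fdatasync(fd)                # fdatasync(3)
    return written

def overwrite_file_once(path):
    """Open the file, learn its size with fstat, and overwrite every byte once."""
    fd = os.open(path, os.O_WRONLY | os.O_NOCTTY)
    try:
        size = os.fstat(fd).st_size              # st_size=62 in the trace
        return overwrite_pass(fd, os.urandom(size))
    finally:
        os.close(fd)

# Try it on a scratch file of 62 bytes.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"x" * 62)
written = overwrite_file_once(f.name)
size_after = os.path.getsize(f.name)
os.unlink(f.name)
print(written, size_after)
```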

lseek(3, 0, SEEK_SET)                   = 0
write(3, "\0\213\236\251tG\0\0;m\302O\22\221\371\20\250\244R\37w\376f\347\354\267:\375\220\301\377N"..., 62) = 62
fdatasync(3)                            = 0

Once again shred uses lseek() to go back to the start of the file, then makes a write() of 62 bytes of random data. Finally it uses fdatasync() to make sure the bytes are written to disk.

lseek(3, 0, SEEK_SET)                   = 0
write(3, ")B\215J$\372$\222\251Id\327r\0379\236\375\311^\35h\333P\335;\36n\246\24\307\t\354"..., 62) = 62
fdatasync(3)                            = 0

The same thing as last time, with different random data.

lseek(3, 0, SEEK_SET)                   = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 62) = 62
fdatasync(3)                            = 0

The same thing, except the string is all null characters (zero bytes), which strace represents as \0. This is the zero pass requested by the -z option.
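strace prints non-printable bytes as backslash escapes, mostly in octal. Python bytes literals accept the same octal notation, so a fragment of the log can be pasted in and inspected. The sketch below uses the first ten bytes of the first traced write; the helper name is_all_zero is made up for the example.

```python
# First ten bytes of the first traced write, copied from the strace output.
random_prefix = b"\275\4\17\241vk6MC#"

# A 62-byte buffer of null bytes, like the one in the zero pass.
zero_pass = b"\0" * 62

def is_all_zero(buf):
    """True when every byte in buf is 0."""
    return buf.count(0) == len(buf)

print(random_prefix.hex(), is_all_zero(zero_pass), is_all_zero(random_prefix))
```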

lseek(3, 0, SEEK_SET)                   = 0
write(3, ",\217'\364\254\260\362\20d{\2\207\211-\224\300\333:\220k3\331\233|\234m\36\3027\17 S"..., 4096) = 4096
fdatasync(3)                            = 0
lseek(3, 0, SEEK_SET)                   = 0
write(3, "\212\221\304(\315d^\3705:\235\331\343?\340\376\335\214z2+\245\255\331y\226LO<\f\304O"..., 4096) = 4096
fdatasync(3)                            = 0
lseek(3, 0, SEEK_SET)                   = 0
write(3, "z\274\236\n\237E\22\261\6\310e\f\351\326;J U\374*\331{\210`\237|W\321\265\201_V"..., 4096) = 4096
fdatasync(3)                            = 0

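Three writes of random data like last time, except this time 4096 bytes are written. 4096 bytes is a common filesystem block size, so this is likely done to overwrite the whole block the file occupied, including the space past the original 62 bytes. It also masks the original file size.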
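One plausible explanation for the 4096-byte writes, which the trace alone cannot confirm, is that the file size is rounded up to a whole block. A minimal sketch of that arithmetic, assuming a 4096-byte block size (the function name is illustrative):

```python
def round_up_to_block(size, block_size=4096):
    """Smallest multiple of block_size that is >= size (0 stays 0)."""
    remainder = size % block_size
    if remainder == 0:
        return size
    return size + (block_size - remainder)

rounded = round_up_to_block(62)
print(rounded)
```

Under this assumption a 62-byte file is padded by 4034 bytes to reach one full block.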

lseek(3, 0, SEEK_SET)                   = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
fdatasync(3)                            = 0

A final write of 4096 null characters.

ftruncate(3, 0)                         = 0
close(3)                                = 0

shred calls ftruncate() to change the size of the file to zero, effectively making the file empty and discarding all the data it contained. The file is then close()ed.
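The truncate-and-close step maps directly onto os.ftruncate. The following sketch shrinks a scratch file to zero bytes and checks the resulting size; the function name is made up for the example.

```python
import os
import tempfile

def truncate_to_zero(path):
    """Open the file, cut it to zero length, and report the new size."""
    fd = os.open(path, os.O_WRONLY | os.O_NOCTTY)
    try:
        os.ftruncate(fd, 0)              # ftruncate(3, 0)
        return os.fstat(fd).st_size
    finally:
        os.close(fd)                     # close(3)

# Try it on a scratch file of 4096 bytes.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"\0" * 4096)
size_after_truncate = truncate_to_zero(f.name)
os.unlink(f.name)
print(size_after_truncate)
```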

open("/tmp", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_DIRECTORY) = 3

shred opens the /tmp directory. On Linux you cannot guarantee that a rename has been written to disk without opening the directory that contains the file and performing an fsync() or fdatasync() on its file descriptor. You can read about this on the fsync() man page. shred will use this file descriptor to make sure the renames it is about to do get written to the disk.

lstat("/tmp/0000", 0x7ffd7c8d5750)      = -1 ENOENT (No such file or directory)
rename("/tmp/test", "/tmp/0000")        = 0
fdatasync(3)                            = 0

First shred checks for the existence of a file named /tmp/0000 to avoid accidentally overwriting it. It then renames /tmp/test to /tmp/0000 and forces that change to be written to the disk.
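The check, rename and directory sync can be sketched as below. This assumes Linux (os.O_DIRECTORY and os.fdatasync) and is not shred's actual code; the function name durable_rename is illustrative. As in the trace, there is a small window between the lstat() check and the rename() in which another process could create the target name.

```python
import os
import tempfile

def durable_rename(directory, old_name, new_name):
    """Rename a file and flush the containing directory to disk."""
    dir_fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
    try:
        new_path = os.path.join(directory, new_name)
        try:
            os.lstat(new_path)                   # lstat(...) = -1 ENOENT is the good case
        except FileNotFoundError:
            pass
        else:
            raise FileExistsError(new_path)      # refuse to clobber an existing file
        os.rename(os.path.join(directory, old_name), new_path)
        os.fdatasync(dir_fd)                     # make the rename durable
    finally:
        os.close(dir_fd)

# Try it in a scratch directory.
with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "test"), "w").close()
    durable_rename(d, "test", "0000")
    names_after = os.listdir(d)
print(names_after)
```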

lstat("/tmp/000", 0x7ffd7c8d5750)       = -1 ENOENT (No such file or directory)
rename("/tmp/0000", "/tmp/000")         = 0
fdatasync(3)                            = 0

The same as the last, except the file name is one character shorter.

lstat("/tmp/00", 0x7ffd7c8d5750)        = -1 ENOENT (No such file or directory)
rename("/tmp/000", "/tmp/00")           = 0
fdatasync(3)                            = 0
lstat("/tmp/0", 0x7ffd7c8d5750)         = -1 ENOENT (No such file or directory)
rename("/tmp/00", "/tmp/0")             = 0
fdatasync(3)                            = 0

Two more renames, each making the file name one character shorter, until only one character remains.
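The sequence of names in this trace starts at the length of the original name ("test" has four characters) and shrinks by one each time. Whether shred always starts there is an inference from this single trace. A sketch that generates the same sequence, with an illustrative function name and without any handling for names that already exist:

```python
def shrinking_names(length, fill="0"):
    """Names made of `fill`, from `length` characters down to one."""
    return [fill * n for n in range(length, 0, -1)]

names = shrinking_names(len("test"))
print(names)
```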

unlink("/tmp/0")                        = 0
fdatasync(3)                            = 0

Finally the file is deleted using unlink(). This change is forced to disk too, by another fdatasync() on the directory.

close(3)                                = 0
close(1)                                = 0
close(2)                                = 0

The directory is closed, followed by stdout and stderr.

exit_group(0)                           = ?
+++ exited with 0 +++

Finally shred exits.

So it looks like we underestimated shred. It wrote random data to the file six times instead of three. Half of those writes overwrote every byte of the existing file (62 bytes), and the other half wrote 4096 bytes. It did the same with the writes of all zeros, but only once at each size. It then truncated the file and renamed it four times before removing it. After each write, each rename and the final unlink it called fdatasync() to make sure the change was written to disk before continuing.
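Counting calls by hand gets tedious for longer logs. Here is a small sketch that tallies system calls by name from strace output, assuming each call starts its line with the name followed by an opening parenthesis, as in the log above. The sample lines are copied from the trace; the function name is made up for the example.

```python
import re
from collections import Counter

# A syscall line starts with the call name followed by "(".
SYSCALL = re.compile(r"^(\w+)\(")

def count_syscalls(lines):
    """Tally system calls by name, skipping lines such as '+++ exited with 0 +++'."""
    counts = Counter()
    for line in lines:
        match = SYSCALL.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts

sample = r'''
rename("/tmp/test", "/tmp/0000")        = 0
fdatasync(3)                            = 0
rename("/tmp/0000", "/tmp/000")         = 0
fdatasync(3)                            = 0
unlink("/tmp/0")                        = 0
fdatasync(3)                            = 0
+++ exited with 0 +++
'''

counts = count_syscalls(sample.splitlines())
print(counts)
```

To run it on a full log, pass the lines of the saved strace output file instead of the sample.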

Overall shred exceeded our expectations.