
Restoring Deleted Files in Linux from the ext3 Journal

Someone just `rm -rf *`-ed from `/` on a production server.

Fortunately, you have backups. Unfortunately, the server included a database with important business data that was written just before the disaster. That most recent data is not included in the last database backup.

You panic: “This is a Linux server, there’s no Trash, no Recycle Bin, no ‘undelete’…” Is there any chance of recovering all of the data?

Breathe.

Think.

“This is a Linux server… there should be plenty of ways to access those bits and bytes more directly…”

You start to think about trying to `grep` through the block device, but you don’t know exactly what you’re looking for — and much of it is likely to be binary data.

Can ext3 Retrieve the Deleted Files?

Suddenly you remember that ext3’s big advantage over ext2 is its journaled filesystem. It would be unorthodox, but…

If the files you need were accessed recently enough, there’s a chance the block pointers to the files might still exist in the filesystem journal.

While you shut down the wounded VM and attach the virtual disk to another VM to begin investigating, you also search the Internet in the hope that someone besides DenverCoder9 has been in these same circumstances.

Lo and behold! Carlo Wood created just the tool you had imagined! ext3grep can reconstruct (many) deleted files based on entries from the filesystem journal.

ext3 and Deleted Files

Like many UNIX filesystems, ext3 represents files with a data structure called an inode. The inode contains metadata such as which user owns a file and the last time it was modified. It also contains pointers to the “blocks” where the contents of the file actually reside. When a file is deleted from disk, the blocks containing the file contents are not modified immediately; only the inode is changed. (The blocks are simply freed up to be overwritten as space is needed in the future.) On ext2 filesystems, the inode is marked as deleted, but the pointers are left intact. Ext3, by contrast, zeroes out the block pointers in the inode, making it impossible to retrieve the file contents from the deleted inode alone.

Inode Pointer Structure
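To get a feel for why those pointers matter, here is a rough sketch of how many data blocks a single ext2/ext3 inode can address through its 12 direct pointers plus one single-, one double-, and one triple-indirect pointer. The 4096-byte block size in the usage line and the helper name `addressable_blocks` are assumptions for illustration; check your own filesystem's block size before relying on the numbers.

```shell
# Hypothetical helper: count the data blocks one ext2/ext3 inode can point at.
# $1 is the filesystem block size in bytes (assumed, e.g. 4096).
addressable_blocks() {
    block_size=$1
    ptrs=$((block_size / 4))          # each block pointer is 4 bytes
    direct=12                         # pointers stored in the inode itself
    single=$ptrs                      # one block full of pointers
    double=$((ptrs * ptrs))           # pointers to blocks of pointers
    triple=$((ptrs * ptrs * ptrs))    # one more level of indirection
    echo $((direct + single + double + triple))
}

addressable_blocks 4096
```

Only the first 12 blocks hang directly off the inode. Anything larger depends on indirect blocks, which is part of why big files are painful to piece together once the pointers are gone.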


The brutish method of `grep`-ing through the disk directly works for small files where you know some unique contents (accidentally deleted configuration files, for example) because there’s a good chance that the blocks have not yet been overwritten. However, trying to restore large or non-plaintext files via this method (e.g. MySQL binary logs) is a recipe for sorrow.
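For the small-file case, the search might look something like the sketch below. It assumes GNU grep, and `find_offsets` is a made-up helper name; the point is only to show the flags that make `grep` usable on raw binary data.

```shell
# Hypothetical helper: print the byte offset of every occurrence of a known
# string inside a raw image or block device.
#   -a  treat binary data as text
#   -b  print the byte offset
#   -o  print only the match (so -b reports the offset of the match itself)
#   -F  treat the pattern as a fixed string, not a regular expression
find_offsets() {
    image=$1
    needle=$2
    grep -a -b -o -F -- "$needle" "$image" | cut -d: -f1
}
```

Run it against the attached disk (for example `find_offsets /dev/sdb3 'some unique string'`), then copy out the surrounding bytes with a tool such as `dd`. Expect this to be slow on a large partition.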

But if the block pointers are zeroed-out in the inode, how can we reconstruct the blocks into a complete file?

As you might have guessed from the title of this post, the answer is “from the journal”. By default on most modern Linux systems, ext3 is configured to log all metadata changes (like file creations and deletions) to the journal. Carlo Wood’s ext3grep utility facilitates reading these journal entries and using them to reconstruct files. Whether the journal still contains the block pointers we need depends on how big the journal is and how recently the files were last modified. In our case, the database binlogs are there and can be reconstructed in their entirety.

Using the Journal to Restore Files

First we’ll look in the journal for deletion events. If we know the approximate time that tragedy struck our poor server, this step will be much easier. We’ll find a huge glut of deletions and look at the first set, working back until we’re confident that we know the time when the filesystem was in its happier state.

root@datarecovery:~# ext3grep /dev/sdb3 --histogram=dtime --after 1335555802 --before 1335555805
Running ext3grep version 0.10.1
Only show/process deleted entries if they are deleted on or after Fri Apr 27 15:43:22 2012 and before Fri Apr 27 15:43:25 2012.

Number of groups: 156
Minimum / maximum journal block: 1544 / 35884
Loading journal descriptors... sorting... done
The oldest inode block that is still in the journal, appears to be from 1335458082 = Thu Apr 26 12:34:42 2012
Journal transaction 19354469 wraps around, some data blocks might have been lost of this transaction.
Number of descriptors in journal: 30581; min / max sequence numbers: 19354439 / 19359432

Only show/process deleted entries if they are deleted on or after 1335555802 and before 1335555805.
Only showing deleted entries.
Fri Apr 27 15:43:22 2012  1335555802        0
Fri Apr 27 15:43:23 2012  1335555803     1575 ===================================================
Fri Apr 27 15:43:24 2012  1335555804     3029 ====================================================================================================
Fri Apr 27 15:43:25 2012  1335555805
Totals:
1335555802 - 1335555804     4604

Also note that this output tells us the age of the oldest inode block still in the journal. We have a shot at restoring any files accessed after that time, but not necessarily files that were last accessed before it.
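The values passed to --after and --before are Unix timestamps, while the dates in the output above are four hours behind UTC, which suggests ext3grep prints them in the machine's local time zone. Here is a small sketch for converting in both directions. It assumes GNU `date` (the `-d` option is not available in every `date` implementation), and the function names are hypothetical.

```shell
# Hypothetical helpers, assuming GNU date.

# Unix timestamp -> human-readable UTC time
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}

# UTC time string -> Unix timestamp
utc_to_epoch() {
    date -u -d "$1" '+%s'
}

epoch_to_utc 1335555802               # prints 2012-04-27 19:43:22
utc_to_epoch '2012-04-27 19:43:20'    # prints 1335555800
```

Working in UTC avoids surprises when the recovery machine and the damaged server are configured with different time zones.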

If we were looking for a particular file, we could attempt to recover it using --restore-file:

  --restore-file 'path' [--restore-file 'path' ...]
                         Will restore file 'path'. 'path' is relative to the
                         root of the partition and does not start with a '/' (it
                         must be one of the paths returned by --dump-names).
                         The restored directory, file or symbolic link is
                         created in the current directory as 'RESTORED_FILES/path'.

Be prepared to wait. This process attempts to find inodes matching the path provided and (if there are multiple matches) reason about which one to attempt restoring.
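Since --restore-file only accepts paths reported by --dump-names, one way to find candidates is to filter that list first. The sketch below assumes --dump-names writes one path per line to standard output; `list_candidates` and the `mysql-bin` pattern are hypothetical names chosen for illustration.

```shell
# Hypothetical helper: list recoverable paths containing a fixed string.
# Assumes `ext3grep DEVICE --dump-names` prints one path per line.
list_candidates() {
    device=$1
    pattern=$2
    ext3grep "$device" --dump-names | grep -F -- "$pattern"
}

# Example usage (device taken from the session above, pattern is made up):
#   list_candidates /dev/sdb3 mysql-bin
# Each path printed can then be passed to:
#   ext3grep /dev/sdb3 --restore-file 'path'
```

Listing first and restoring second keeps the slow restore step limited to the files you actually care about.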

We can also have ext3grep attempt to recover all the files it can by using --restore-all:

  --restore-all          As --restore-file but attempts to restore everything.
                         The use of --after is highly recommended because the
                         attempt to restore very old files will only result in
                         them being hard linked to a more recently deleted file
                         and as such polute the output.

For example:

root@datarecovery:~# ext3grep /dev/sdb3 --restore-all --after 1335555800
Running ext3grep version 0.10.1
Only show/process deleted entries if they are deleted on or after Fri Apr 27 15:43:20 2012.

Number of groups: 156
Minimum / maximum journal block: 1544 / 35884
Loading journal descriptors... sorting... done
The oldest inode block that is still in the journal, appears to be from 1335458082 = Thu Apr 26 12:34:42 2012
Journal transaction 19354469 wraps around, some data blocks might have been lost of this transaction.
Number of descriptors in journal: 30581; min / max sequence numbers: 19354439 / 19359432
Loading sdb3.ext3grep.stage2..............................

If it works, the files will be restored to ./RESTORED_FILES/.
