• On September 14, 2016
  • By

Recovering Deleted Windows Files [Breakdown and Theory]

When you delete a file on Windows, specifically Windows 10 Professional x64, it is removed from the user’s view and placed in the recycle bin for permanent deletion. Once that permanent deletion is performed, some ask the question “is the file really gone?” The answer is no, not immediately, and even after permanent deletion it’s just marked as unused space and available to be overwritten. Let’s start from the beginning and I’ll clear everything up by the end of this article.

If you’re not familiar with how NTFS works please refer to this breakdown since it is far too complex to cover in this single article, but is important to understand to grasp the breakdown and theory of this post. We’ll be referencing the file system and how it stores and organizes file data toward the end of this post. But to start, we’ll just break down what happens when you simply delete a file.

When you hit delete after highlighting a file and pressing the Delete key, Windows invokes a routine in the UserMode module kernelbase.dll called DeleteFileW. The first thing that happens is that the system must get a handle to the file, it does so by using NtOpenFile to open the existing file, directory, or resource and obtain a handle to operate on. If the file exists, and a valid handle is obtained, the system queries the file attributes using the FileAttributeTagInformation enum value as the FileInformationClass argument with the system routine NtQueryInformationFile. The reason for this call is to check whether the file has a reparse point which is used to locate external information that is associated with the file.

Following the call to NtQueryInformationFile the procedure continues by releasing the native pathname, which could look similar to ??C:UsersDaaxDesktoptest.exe. It does this to ready the file for deletion and cleanup any temporary data that could remain. After calling the appropriate routines, it calls NtSetInformationFile which is used to set the file disposition information to ask that the file be deleted. This request is accepted once it is packaged with an IRP (I/O Request Packet) and sent to the appropriate device object for completion. The routine that completes this is IoCallDriver which sends the IRP off to the device object for processing.
Once all of the above is complete and the file is out of view from the user, the DeletePending field in the _FILE_OBJECT structure associated with the file is set to true and the file is marked for deletion when the last handle to it is released. The actual deletion of the file includes deallocating allocated clusters and the IRP_MJ_CLEANUP is processed by the driver. The processing of IRP_MJ_CLEANUP means that the last handle to the object has been closed but outstanding I/O requests for the file or object may still be queued which means that until an IRP_MJ_CLOSE is processed and accounted for, the file is still accessible to incomplete I/O requests. This means that even after cleanup, data still remains, which is what allows for recovery. In fact, even after the last handle to the file is closed and IRP_MJ_CLOSE is processed, data still remains available for reference on disk. It’s just not available by conventional means of opening a file. I’ll break down how one can unconventionally recover a deleted file, so long as it hasn’t been wiped or overwritten…

Theoretical Recovery

Since I’ve already had experience with the practical side of recovering deleted files I figured I’d start off with a little theory that I had when I first started looking into the possibility of recovering deleted files on an NTFS volume.
I believed it would be possible to iterate the Master File Table on an NTFS volume and use a hex editor to search each of the records in the MFT for specific file format headers such as PNG, MZ, DS, etc. However, this approach would be extremely cumbersome because the average PC user has hundreds of executables on their machine, likely a massive amount of images (PNG or JPEG), and other common file types that have file format headers that I would first look for. If I wanted to recover this file of interest, I’d need to go through each partition on a drive, every sector, every record in the MFT, copy the file data of that record with a common file format header, and create a new file with the same data until I had something more concrete. I’m sure there are other ways to go about it, but I quickly realized that would be ridiculously time-consuming and ineffective.
So I took a more practical approach and wrote a recovery tool that iterates the records in the MFT, stores the information in a neatly formatted way, and provides a rough UI for interacting with the records of interest and allowing me to write the data for the specific record back to a new file for analysis.

Practical Recovery

I’ll briefly lay out the thought process behind creating this file recovery tool. I’ll also note that this is where knowledge of the file system is helpful because even after the file is deleted and the clusters deallocated there is still information left behind. This is why it’s noted as possible to recover deleted files, so long as their cluster hasn’t been overwritten by new data yet. I will briefly mention that the Master File Table contains attributes, flags, and the name of every file in an NTFS volume. Unless the contents of the file have been wiped or overwritten, the data still remains until another file overwrites the cluster.
In theory, and practice (I just don’t have a build worth releasing), you can recover deleted files by first opening and reading the physical drive (\.PhysicalDriveN) to acquire information about the partition tables present on disk. Next, you’d scan the partitions of the physical disk and verify the partition formatting, in this specific case, we’d want NTFS. You could also check the extended partitions, and I would recommend it. After obtaining sector information, one could point to the beginning of the NTFS volume, read the boot sector for Master File Table (MFT) information, load the entire logical MFT cluster, read the first record which will always be the $MFT record, and then extract the information from the record. One could easily create a structure or class to hold all of this information, and it’s all laid out in the MSDN reference linked here
 
I do apologize in advance for any typos or run-ons, it’s rather late and I’m pushing this post out quickly. I’d love to hear your theories, practices, or experiences related to file recovery.
As always: any comments, questions, or feedback are welcome.

Author

Leave a Reply