Today I’m going to talk about an advanced data recovery case. Data recovery is a business that’s traditionally been shrouded in mystery to most IT professionals, and for good reason. For years the big box recovery companies have been using deceptive advertising and outlandish claims to make it seem like they have abilities no one else has. Today, though, I’m going to break down a tough data recovery case and detail exactly what was done to recover the data. This is a case that many companies, including the big box guys, would likely have given up on and deemed unrecoverable. It’s pretty much a worst case scenario barring total platter destruction.
The Advanced Data Recovery Case Study
Diagnosis Phase
Hard drive was delivered to our lab via commercial carrier. The customer described it as not spinning at first; however, after the customer attempted to replace the PCB, it is now clicking and spinning down. The customer provided both the original PCB, with its U12 ROM chip missing, and the replacement board to our lab. The hard drive's labeled specs are as follows:
HDD Specs
Model: WD40EFRX-68WT0N0
Serial: WCC4E………
Firmware: 80.00A80
Capacity: 4TB
DCM: HGNNNTJMGB
Series: WD Red NAS Drive
Interface: SATA III
Country of Manufacture: Thailand
Date of Manufacture: 04 Jun 2014
Because the drive was described as clicking, the first order of business was to perform a visual inspection internally. This is done before power is ever applied to the drive. A drive which is clicking could have physical damage to the read/write heads and be causing physical platter damage with each click.
The visual inspection was performed within an ISO 3 rated clean environment and revealed no visible damage to the heads or platters. It is determined that it is safe to power on the drive for further diagnosis.
Drive is powered on and a sound check is performed using the donor PCB provided by the customer. The drive clearly is spinning up, attempting to access the service area, then spinning down. After a few seconds it repeats this process a few times before stopping and reaching a ready ATA state. Initial diagnosis appears to be failed read/write heads or possibly service area damage.
Although PCB failure seems unlikely, the fact that it wasn’t replaced by our lab makes us uncertain of its functionality. Also, the REV of the board seems unlikely to be correct given the date of manufacture, and the board shows signs of overheating. As a precaution, another board better matched by our company is programmed using the ROM code from the patient ROM. The PCB provided by the customer is then tested on a known good drive to check its functionality. It is determined that the board was indeed overheated and is non-functional.
Despite the new, good PCB, the drive continues to click and spin down. Attempts are then made to gain access to the drive's service area for firmware diagnosis. This included force loading ATA overlays and the modules directory into the PCB RAM, and attempting to read any of the four service area copies. Despite these efforts, none of the SA copies are accessible. This confirms that the drive indeed has failed read/write heads.
Job is quoted as Tier 2 Hardware Level Recovery (should have been Tier 3 as it was a more advanced data recovery case than expected)
Quoted Cost of Recovery: Tier 2 Base ($650) + Over 2TB ($100) + Donor Drive (Provided by Customer)
Total Quote: $750 – Quote Approved by Customer
Customer is informed that there is the possibility of needing multiple donor hard drives to complete the recovery.
Data Recovery Phase
After customer has signed quote the advanced data recovery phase begins. First hard drive is taken into clean room for preamp replacement. New read/write head actuator assembly is safely installed and the original set is put into donor hard drive for safe storage until re-assembly after the project completes.
At this point the drive is connected to PC-3000 for firmware backup and diagnostics. Upon power up the drive is found to still be clicking and spinning down as before. This leads us to suspect service area damage in addition to failed read/write heads. To rule out failure of the read/write heads during the transplant, the ATA overlays and modules directory from the donor hard drive's service area are again force loaded into the PCB’s RAM.
Despite this, the drive is still unable to read the SPT (areal density), which is usually an indicator of failed heads or catastrophic service area damage. At this point many labs would have likely given up and considered the case a lost cause, but not us.
Our next step was to see if we could read service area tracks by using the hot swap method. A donor hard drive with donor PCB is powered up, then put into a special spin down mode. As this is a WD Red drive, the typical standby command does not stop the spindle; it only drops the RPM level. However, using the tech command, the drive is able to spin down, after which the suspend command is also given.
Now the PCB is carefully moved over to the patient hard drive without powering off. A recalibration command is issued, and to our surprise succeeds. This verifies that the heads are able to read servo data on the platters and confirms that the head replacement was a success.
Next, unnecessary heads which don’t contain a service area are disabled in the PCB RAM (to avoid any unnecessary recalibrating). Using read by ABA, the hard drive's service area tracks are read. Both copies contain bad areas; however, by using composite read, a complete composite copy of each track is successfully made. These composite tracks are then loaded onto a donor hard drive for further analysis. All service area modules check out when read by composite, so a good copy of the service area has been obtained.
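To illustrate the idea behind a composite read, here is a minimal sketch in Python. It is not PC-3000 code and not our actual tooling; the function name `composite_track` and the convention of marking an unreadable sector as `None` are assumptions made purely for illustration. The point is simply that two damaged copies can yield one good copy as long as every sector is readable in at least one of them.

```python
from typing import List, Optional, Sequence, Tuple

SECTOR_SIZE = 512  # bytes per sector, assumed for this illustration


def composite_track(
    copies: Sequence[Sequence[Optional[bytes]]],
) -> Tuple[List[Optional[bytes]], List[int]]:
    """Merge several reads of the same track, sector by sector.

    Each copy is a list of sectors; an unreadable sector is None
    (a hypothetical convention). Returns the merged track and the
    indexes of sectors that were unreadable in every copy.
    """
    length = len(copies[0])
    if any(len(copy) != length for copy in copies):
        raise ValueError("all copies must cover the same number of sectors")

    merged: List[Optional[bytes]] = []
    still_bad: List[int] = []
    for index in range(length):
        # Take the first copy in which this sector was readable.
        good = next((c[index] for c in copies if c[index] is not None), None)
        merged.append(good)
        if good is None:
            still_bad.append(index)
    return merged, still_bad


# Two copies of a four-sector track, each with a different bad sector.
s0, s1, s2, s3 = (bytes([value]) * SECTOR_SIZE for value in (1, 2, 3, 4))
copy_a = [s0, None, s2, s3]
copy_b = [s0, s1, None, s3]
track, unreadable = composite_track([copy_a, copy_b])
```

In this example `unreadable` ends up empty, because the bad sectors in the two copies do not overlap. A sector that is bad in every copy stays `None` and is reported back.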
The Advanced Data Recovery Phase
Now’s when it gets really tricky to work on this case. Our first effort is to see if it’s possible to repair the service area of the patient drive using the good copy of the modules we have. Each of the two primary copies from head 0 and head 1 is analyzed individually. The result is that there are a number of damaged firmware modules in both copies. Most modules are simply re-written using write by ABA commands, resulting in a readable copy. However, two modules, module 01 and module 32, remain unreadable despite multiple efforts to re-write them.
Module 32 is easy enough to re-locate, as it can simply be moved to another area of the service area, and the modules directory updated to reflect this. By shrinking the size of another unnecessary log module, space is made and this module is moved. The entry in our modules directory read by composite is also updated to reflect this new location.
Module 01 is another story. As this module is the directory of modules, it cannot be moved by simply making a change in the directory of modules. Normally in such a case a procedure called a smart hot swap would be performed. Basically it’s a procedure where the entire patient service area is written to a donor hard drive, used to load the SA onto the PCB RAM, and then transferred to the patient for reading. However, in this case such a procedure won’t be possible. The drive has approx. 3TB of data on it and bad sectors scattered on all platter surfaces. Each time this series of drive hits a bad sector it’ll access the service area (including the directory) to make log entries. This makes the drive unstable, and it goes right back to clicking and spinning down each time.
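The bookkeeping behind that relocation can be sketched roughly as follows. This is only an illustration of the idea: the real WD modules directory has its own binary layout, and the `ModuleEntry` fields, the sector numbers, and the log module ID `0x99` used here are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ModuleEntry:
    module_id: int
    start: int   # first service area sector of the module (hypothetical units)
    length: int  # module size in sectors


def relocate_into_shrunk_module(
    directory: Dict[int, ModuleEntry], moving_id: int, donor_id: int
) -> None:
    """Shrink the donor module and point the moving module at the freed tail."""
    moving = directory[moving_id]
    donor = directory[donor_id]
    if donor.length <= moving.length:
        raise ValueError("donor module is too small to give up that much space")

    # Give up the last `moving.length` sectors of the donor module...
    donor.length -= moving.length
    # ...and record that the moved module now lives in that freed space.
    moving.start = donor.start + donor.length


# Hypothetical directory: module 0x32 sits on a bad area at sector 1000,
# while an expendable log module (ID 0x99 is made up) occupies 2000..2063.
directory = {
    0x32: ModuleEntry(0x32, start=1000, length=8),
    0x99: ModuleEntry(0x99, start=2000, length=64),
}
relocate_into_shrunk_module(directory, moving_id=0x32, donor_id=0x99)
```

After the call, the log module covers sectors 2000 to 2055 and module 0x32 is recorded at sector 2056. The module data itself still has to be written to that new location on the drive; the sketch only covers the directory update.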
If only a few files were necessary, this procedure might have worked. However the customer is insistent that all the data is needed not just selected files. So another solution is necessary or it would take virtually forever to recover the drive.
It is determined that the location of the modules directory on this model is actually specified in the PCB’s ROM code. Using a technique we devised, a good (modified) copy of the modules directory was written to an area of the service area tracks that was determined to be good in both copies. Afterward the ROM code was modified to direct the board to look for module 01 in the new location.
Success! – The drive is now able to initialize on its own.
Advanced Imaging Phase
It’s now been almost two weeks that we’ve been working on this drive trying various data recovery techniques to get it functional (there’s a lot we did which isn’t mentioned in the summary above). We finally have a functional hard drive and are ready to start imaging.
For the imaging process we elect to use Data Extractor software along with a PC-3000 Express system. This system provides the greatest control over the imaging process. A headmap is created so that data can be selectively imaged by read write head and individual heads can be disabled as necessary.
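Conceptually, a headmap is just a table of LBA ranges tagged with the head that reads them. The sketch below shows how such a table lets an imaging pass skip the ranges belonging to a disabled head. It is not Data Extractor's format or API; the `HeadRange` type, the function names, and the tiny example map are assumptions for illustration only.

```python
from typing import Iterable, List, NamedTuple, Set


class HeadRange(NamedTuple):
    first_lba: int
    last_lba: int  # inclusive
    head: int


def ranges_to_image(
    headmap: Iterable[HeadRange], disabled_heads: Set[int]
) -> List[HeadRange]:
    """Keep only the ranges whose head is still enabled."""
    return [r for r in headmap if r.head not in disabled_heads]


def sectors_deferred(headmap: Iterable[HeadRange], disabled_heads: Set[int]) -> int:
    """Count the sectors that are left for a later pass."""
    return sum(
        r.last_lba - r.first_lba + 1 for r in headmap if r.head in disabled_heads
    )


# Made-up, tiny headmap; a real one has a very large number of ranges.
headmap = [
    HeadRange(0, 999, 0),
    HeadRange(1000, 1999, 5),
    HeadRange(2000, 2999, 6),
    HeadRange(3000, 3999, 1),
]
first_pass = ranges_to_image(headmap, disabled_heads={5, 6})
left_for_later = sectors_deferred(headmap, disabled_heads={5, 6})
```

With heads 5 and 6 disabled, the first pass covers only the ranges for heads 0 and 1, and 2,000 sectors are left for a later pass with those heads imaged individually.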
After the first few million sectors are read, the NTFS $Bitmap file is analyzed to build a map of only the allocated sectors. This will limit the imaging to only the areas where data is actually stored.
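For readers curious how an allocation map like this is built: the NTFS bitmap stores one bit per cluster, least significant bit first, with a set bit meaning the cluster is in use. A minimal Python sketch of turning that bitmap into sector ranges is shown below. The function names, the 8 sectors per cluster, and the partition start at LBA 2048 are assumptions for the example, not values from this case.

```python
from typing import List, Tuple


def allocated_runs(bitmap: bytes, total_clusters: int) -> List[Tuple[int, int]]:
    """Return (first_cluster, cluster_count) for each run of set bits.

    Bit 0 of byte 0 describes cluster 0 (LSB first); 1 means in use.
    """
    runs: List[Tuple[int, int]] = []
    run_start = None
    for cluster in range(total_clusters):
        in_use = (bitmap[cluster // 8] >> (cluster % 8)) & 1
        if in_use and run_start is None:
            run_start = cluster
        elif not in_use and run_start is not None:
            runs.append((run_start, cluster - run_start))
            run_start = None
    if run_start is not None:
        runs.append((run_start, total_clusters - run_start))
    return runs


def runs_to_sectors(
    runs: List[Tuple[int, int]], sectors_per_cluster: int, partition_start_lba: int
) -> List[Tuple[int, int]]:
    """Convert cluster runs to (first_lba, sector_count) on the whole disk."""
    return [
        (partition_start_lba + first * sectors_per_cluster, count * sectors_per_cluster)
        for first, count in runs
    ]


# 16 clusters: clusters 0-3 and 14-15 are allocated.
bitmap = bytes([0b00001111, 0b11000000])
cluster_runs = allocated_runs(bitmap, total_clusters=16)
sector_ranges = runs_to_sectors(
    cluster_runs, sectors_per_cluster=8, partition_start_lba=2048
)
```

Only the resulting sector ranges need to be imaged, which is why this step matters on a drive whose heads may not survive a full pass over every sector.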
At the beginning of the imaging process all 8 heads are reading quite well despite the occasional cluster of bad sectors on each platter. The read timeouts are intentionally kept very low (~300ms) to prevent strain on the heads early on during the imaging. The first 40% of the drive is read without any major incident.
Around 40% into the imaging process head 5 is observed to be reading very slowly. It’s possible that the head is becoming dirty or weak. For the time being it is disabled and reading continues from the other heads. At around 60% head 6 also becomes unstable, struggles to read sectors, and is also disabled. The other heads continue imaging to the end of the data area specified in the NTFS $Bitmap file. Despite the bad sectors, over 99% of the data on the remaining 6 heads is read.
Now heads 5 and 6 are individually imaged so read timeouts can be adjusted to optimize reading from each. Head 5 continued reading although hitting an increasing number of bad sectors until around 55% where it completely died and couldn’t read sectors any longer. Head 6 lasted until around 70% then also failed completely. Heads were then removed from the drive and cleaning of them was performed. Platter surfaces were also wiped down to remove any possible contaminants. Despite this cleaning, no further data could be read with the affected heads.
At this point around 85% of the data has been recovered and a file listing is provided to customer for review. The customer confirms that most data is recovered, however would still like to try for some more of the lost data. Customer agrees to cover the cost of a second donor hard drive which must be purchased at an additional cost of $129. Second donor hard drive is ordered and arrives the following week.
A second head replacement is performed, and again the drive is able to initialize and function on its own. Approximately 30GB of additional data was read from head 5 before it again failed. It’s assumed there must be physical damage to the platter surface which is causing any head reading that surface to prematurely fail. Head 6, however, was able to read the majority of the remaining data previously unread. The result after the second read/write head replacement is that over 95% of all sectors were read successfully from the drive.
Logical recovery process is then performed to extract the files onto the return media. Files affected by bad sectors or unread sectors are also extracted for possible file repair, and are stored in a separate folder on the return media.
Total Time Spent Working on Project: Nearly 1 month of non-continuous work. Approx. 80 man hours actually dedicated to it.
Final Cost to Customer: $750 Labor + $129 for Second Donor Drive + First Donor & Return Media (provided by customer at unknown cost).
Profit to Data Medics: Total Loss of Profit + Additional Labor Cost Losses
Advanced Data Recovery Summary
As you can see, an advanced data recovery case like this is one where everything that could have gone wrong did. The case took around a month to complete when it was anticipated to only require about 1 week. The customer had to shell out over $800, and we all know how bad it sucks to have to pay that to just get your own data back. Yet it’s likely that it cost our company $4000-$5000 in equipment time and labor to get it done. Fortunately advanced data recovery cases aren’t all quite so complex, and the lessons we learned in this case will be useful on future cases. We look at cases like this as an investment in research and development. To actually turn a profit on an advanced case like this we should have quoted more like six or seven thousand dollars. However, as a company that puts its customers first, we stuck to our original quote and accepted the losses.
Unfortunately not all data recovery companies are willing to take on such advanced data recovery challenges. To maintain profitability, they’d have deemed it “unrecoverable” and given up. Some have even been known to sabotage the drives before returning them, thus ensuring no other company can recover the data after they give up on it.
If you have an advanced data recovery case please contact us to see what we can do to help you get back your data. Even if it’s been looked at or worked on by another company we’ll provide a free evaluation to see if there’s anything that can be done to get your files back.
Data Medics | Meeting the Challenge of Advanced Data Recovery