Hard drive failure indicators

Discussion in 'other software & services' started by GroomLake, Dec 19, 2006.

Thread Status:
Not open for further replies.
  1. GroomLake

    GroomLake Registered Member

    Joined:
    Jul 13, 2006
    Posts:
    116
    What are the indicators that a hard drive failure is imminent? And should one run the manufacturer's diagnostic program on a regular basis?
     
  2. GroomLake

    GroomLake Registered Member

    Joined:
    Jul 13, 2006
    Posts:
    116
    In other words, you had better take an emergency backup while you still can.
     
  3. GroomLake

    GroomLake Registered Member

    Joined:
    Jul 13, 2006
    Posts:
    116
    My system started hanging, and the only way to get it going was to press reset and reboot. This would happen every other day or so for a month. I never power down; my drives are up 24x7x365. However, I did power them off for about 12 hours one day. After powering them on, “C” would not boot. I ran Western Digital's diagnostic tool and it reported that there was something wrong with the heads. I swapped out the hard drive, did a restore, and was back in business. I sent the hard drive in for a free replacement. They told me that if I had not shut the drive off, it probably would have had intermittent failures for months on end.
     
  4. GroomLake

    GroomLake Registered Member

    Joined:
    Jul 13, 2006
    Posts:
    116
    Western Digital diagnostic error codes:

    Data Lifeguard Tools 11 Error Codes
    and
    Data Lifeguard Diagnostic Error Codes

    ** If you encounter the same error code more than once after re-testing, we recommend that you create an RMA.

    Error code | Explanation | Definition | Status
    000 | No Errors Found | Successful operation. The drive is defect free. | No Errors
    100 | No Errors Found | Successful operation. The drive is defect free. | No Errors
    101 | Unknown Error | An unknown error has occurred during testing. This may be an anomaly. Check connections and retest. If the error repeats, replace the drive. | Re-Test Drive
    102 | Seek Timeout | A Seek command did not complete in the time allotted for its completion. This may be an anomaly or a defect with the drive. Retest. Replace the drive if the error repeats. | Re-Test Drive
    103 | Write Fault Error | A Write command during the test has failed to complete. This may be due to a media or read/write error. It may also be due to a defective connection. Retest after checking the connections. Replace the drive if the error repeats. | Re-Test Drive
    104 | Drive Not Ready | The drive did not properly respond to test commands. This may be due to a defect with the drive or the drive may not have responded properly due to a bad connection. Check cabling and retest. Replace the drive if the error repeats. | Re-Test Drive
    106 | Track 0 Error | Track 0 was not properly detected. Track 0 on the drive must be accessed to perform parts of various internal tests. Track 0 also holds information about the drive. This error may have internal and external reasons. Retest. Replace the drive if the error repeats. | Re-Test Drive
    107 | Check Sum Error | Accumulated test data on the drives is corrupted. Check cabling & retest. Replace the drive if the error repeats. | Re-Test Drive
    108 | Seek Not Complete | A Seek command did not complete in the time allotted. This may be an anomaly or a defect with the drive. Retest. Replace the drive if the error repeats. | Re-Test Drive
    112 | IRQ Timeout | Interrupt ReQuest. Interrupt signal not received. An interrupt command to perform a specific task failed to complete. This may be due to an internal error or to a failed connection which did not allow the interrupt command to be sent to the drive properly. Check cabling and retest. Replace the drive if the error repeats. | Re-Test Drive
    115 | ICRC Error | Ultra DMA CRC error. Data sent between the host computer and the drive has been corrupted. If the system cannot properly handle a drive running a specific Ultra ATA rate such as ATA100, the data may become corrupted. To run in ATA66 and ATA100 rates, an Ultra ATA 80-conductor cable must be used. Check cabling & retest. May need to run DLGUDMA to set the drive to a slower speed or replace the IDE cable. You may also want to reroute your IDE cable away from sources of electronic bus noise such as your CPU, Power Supply, etc. | Re-Test Drive
    116 | IDNF Error | Address Not Found Error. The Identify Drive Command has not received an acceptable response from the drive. This may be due to a defect. Ensure that you are using the latest version of diagnostic utility and that your cable is in good working condition. Retest. Replace the drive if the error repeats. | Re-Test Drive with latest diagnostic utility
    117 | Uncorrectable ECC Error | Uncorrectable Error Correction Code (ECC) Error. There could be media errors present on this drive. If the automatic repair feature is unable to repair these errors, replace the drive. | Replace drive if unable to correct error
    118 | DAM Error | Data Address Mark (DAM) Error. There may be media errors present on this drive. If the automatic repair feature is unable to repair these errors, replace the drive. | Replace drive if unable to correct error
    120 | Unknown Error | An unknown error has occurred during testing. This may be an anomaly. Check connections and retest. If the error repeats, replace the drive. | Re-Test Drive
    121 | Servo Error | Servo error. This error is most likely an internal malfunction of the drive and not related to the condition of the cables etc. Retest and replace the drive if the error repeats. | Re-test
    132 | Command Error | Command Aborted. Please ensure that you are using the version of diagnostic utility corresponding to either newer or older Western Digital drives. | Re-test Drive with appropriate diagnostic utility
    133 | Illegal ID | FW overlay not found. The information file, which holds the data pertaining to this drive, is corrupted or missing. Replace the drive. | Re-test Drive
    134 | Busy Timeout | Timeout from checking busy bit. The drive has not responded back in the time allotted. This may be due to a defect with the drive or a bad connection. Check cable & retest. Replace the drive if the error repeats. | Re-test Drive
    135 | DRQ Timeout | Timeout from checking Data ReQuest Timeout (DRQ) bit. The drive has not responded back in the time allotted. This may be due to a defect with the drive or a bad connection. Check cable & retest. Replace the drive if the error repeats. | Re-test Drive
    136 | Bad Sector | Sector Marked Bad Error. There may be repairable media errors on a platter. The automatic repair feature can attempt a repair if possible. You may need to rescan to ensure that the repairs were effective. Replace the drive if the repair fails. | Re-test Drive
    137 | Relocated Sector | Sector Relocated. There may be repairable media errors on a platter. The automatic repair feature can attempt a repair if possible. You may need to rescan to ensure that the repairs were effective. Replace the drive if the error repeats. | Re-test Drive
    138 | Still Busy Timeout | Timeout from checking busy bit. The drive has not responded in the time allotted. This may be due to a defect with the drive or a bad connection. Check cable and retest. Replace the drive if the error repeats. | Re-test Drive
    148 | Not Selected | Drive not selected. The drive may not have been accessed properly possibly due to a bad connection. Replace your cable and retest. Replace the drive if the error repeats. | Re-test Drive
    159 | SMART Error | Self Monitoring, Analysis, and Reporting Technology (SMART) Error returned during SMART Status/Self Test Command. The drive is defective. Replace. | Replace Drive
    163 | Unknown Error | Queued command timed out. The command set to be executed has timed out. This may be a drive issue, however it may be related to a defective connection. Replace your cable and retest the drive. Further errors indicate a defective drive. Replace. | Replace Drive
    200 | Drive Not Tested | Supported WD drives are initialized with this status. This is the pretest initialization code which indicates a drive is ready to be tested, but has not yet been tested. | Re-test Drive
    201 | Non-WD Drive | The drive does not have a WD serial number. This error can occur on non-Western Digital drives. It may also happen when the wrong version of diagnostic utility is used. It is also possible to see this error on defective Western Digital drives. | Re-test Drive with appropriate diagnostic utility
    202 | Drive Not Supported | Older Western Digital drives are not supported by the diagnostic utility version being used. Use the appropriate version. Version 4.12 for older drives and version 5.00 for current drives. Non-WD drives are not supported. | Re-test Drive with appropriate diagnostic utility
    204 | Missing Log File | The log file that existed at startup of diagnostic utility has been moved or no longer exists. Create and use a new DLG Tools diskette. | Re-test Drive with appropriate diagnostic utility
    205 | Aborted By User | Test was aborted by the user (Alt-X pressed during a test). | Re-test Drive
    206 | Memory Allocation Error | Unable to allocate memory for program structures. Please ensure that you are not loading any other DOS level files from the floppy prior to running diagnostic utility. This error may also appear if the floppy has been infected by a hidden virus. Check your floppy and retest. | Check Floppy and use appropriate diagnostic utility
    207 | Critical Resource Error | Unable to locate and/or use a system resource (e.g. printer). The printer may not be attached or turned on. This error may also be related to memory issues. Please ensure that you are not loading any other DOS level files from the floppy prior to running diagnostic utility. This error may also appear if the floppy has been infected by a hidden virus. Check your floppy and retest. | Check Floppy and use appropriate diagnostic utility
    209 | Self Test Failed To Run | SMART Self Test failed to start when executed. The drive has not responded to the SMART test request. The drive has failed and must be replaced. | Replace Drive
    210 | Self Test Incomplete | SMART Self Test failed to complete (e.g. timed out). The failure to complete the test indicates a failed drive. Replace. | Replace Drive
    211 | 2-9 Uncorr ECC Errors | Error Correction Code (ECC) 2 through 9. A number of ECC errors (between 2 through 9) have been detected. ECC is a hardware correction technique that corrects errors. If ECC occurs, use Data Lifeguard repair option for additional error correction. Retest the drive with Data Lifeguard. | Re-test and Repair Option diagnostic utility
    212 | 10+ Uncorr ECC Errors | Error Correction Code (ECC) 10+. At least ten ECC errors have been detected. ECC is a hardware correction technique that corrects errors. If ECC occurs, use Data Lifeguard for additional error correction. Re-test the drive with Data Lifeguard. | Re-test Drive
    213 | 2-9 DAM Errors | Data Address Mark (DAM) 2-9. Several instances of information on data positioning and location could not be found. Drive should be replaced. | Replace Drive
    214 | 10+ DAM Errors | Data Address Mark (DAM) 10+. Ten or more instances of information on data positioning and location could not be found. Drive should be replaced. | Replace Drive
    215 | 2-9 IDNF Errors | Identified Data Not Found (IDNF) 2-9. Several instances of information on data positioning and location could not be found. Drive should be replaced. | Replace Drive
    216 | 10+ IDNF Errors | Identified Data Not Found (IDNF) 10+. Ten or more instances of information on data positioning and location could not be found. Drive should be replaced. | Replace Drive
    217 | 2-9 SERVO Errors | SERVO 2-9. SERVO is data on track location. Several instances of track information could not be found. Drive should be replaced. | Replace Drive
    218 | 10+ SERVO Errors | SERVO 10+. SERVO is data on track location. Ten or more instances of track information could not be found. Drive should be replaced. | Replace Drive
    219 | Drive Cable Error | Failure during cable test. The cable is loose, broken, or not plugged in. Recheck your connections and replace the cable. Retest the drive. | Re-test Drive
    220 | Drive is Locked | Security feature of the drive reports locked status. Some vendors use the security feature to ensure the usage of only specific drives in their system, or the drive may have been locked by a user using a third party utility to enable this feature. The same utility and the original code used to lock the drive are necessary to unlock this drive. Please contact the system vendor for the above-mentioned information. | Contact System Vendor
    221 | Test Not Supported | Certain older drives do not support certain SMART Self Tests. Please ensure that you are using the proper version of diagnostic utility. | Re-test with appropriate diagnostic utility
    222 | Drive Failed the Test | The drive has failed the SMART test. Replace the drive. | Replace Drive
    223 | Errors Repaired | Errors found, but have been repaired successfully. There were media errors that were within the repair capabilities of diagnostic utility. The drive should now be defect free. | Test complete Defect Free
    224 | Errors not Repaired | Errors found and have not been repaired. There were too many errors found on this drive to be repaired. Replace the drive. | Replace Drive
    225 | Too Many Errors Found | Error count reached a threshold value. There are too many errors detected on this drive to be repaired. Replace the drive. | Replace Drive
    226 | Sector Relocation Error | Failure to relocate a sector during drive repair. The drive has to be replaced. | Replace Drive
    227 | SMART Not Supported | Self Monitoring, Analysis, and Reporting Technology (SMART). Certain older drives do not support SMART. Please ensure that you are using the appropriate version of diagnostic utility. If the errors continue, replace the drive. | Re-test with appropriate diagnostic utility
    0001 - 0008, 0015 | SMART Error | Self Monitoring, Analysis, and Reporting Technology (SMART) Error returned during SMART Status/Self Test Command. The drive is defective. | Replace Drive
    0009 - 0014 | SMART Error | Self Monitoring, Analysis, and Reporting Technology (SMART) Error returned during SMART Status/Self Test Command. Retest the drive. Replace the drive if the error repeats. | Re-test Drive
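
    As a rough illustration of how the table and the "same code twice after re-testing" rule above might be applied, here is a minimal Python sketch. The dictionary holds only a handful of the codes listed above, and the `recommend` function, its name, and its wording are assumptions made for illustration; none of this is part of Western Digital's tools.

```python
# Illustrative subset of the table above: code -> (explanation, status).
ERROR_CODES = {
    "000": ("No Errors Found", "No Errors"),
    "102": ("Seek Timeout", "Re-Test Drive"),
    "115": ("ICRC Error", "Re-Test Drive"),
    "159": ("SMART Error", "Replace Drive"),
    "222": ("Drive Failed the Test", "Replace Drive"),
    "223": ("Errors Repaired", "Test complete Defect Free"),
}


def recommend(history):
    """Suggest an action from a list of error codes, oldest test run first.

    Hypothetical helper: it only encodes the statuses copied from the table
    and the note that a code repeating after a re-test warrants an RMA.
    """
    if not history:
        return "Run the diagnostic"
    latest = history[-1]
    explanation, status = ERROR_CODES.get(
        latest, ("Unknown code", "Look it up in the table")
    )
    if status == "Replace Drive":
        return f"{latest} ({explanation}): back up now and replace the drive"
    # A "re-test" status that shows up again after re-testing means RMA.
    if status.startswith("Re-") and history.count(latest) > 1:
        return f"{latest} ({explanation}): repeated after re-test, create an RMA"
    return f"{latest} ({explanation}): {status}"


print(recommend(["102"]))
print(recommend(["102", "102"]))
```

    The first call only asks for a re-test; the second reports the repeat and suggests an RMA.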
     
  5. ccsito

    ccsito Registered Member

    Joined:
    Jul 27, 2006
    Posts:
    1,579
    Location:
    Nation's Capital
    There are a few hard drive status tools that you can use on MajorGeeks.com.

    As for running the drive all year, I don't think that is necessary. A few years ago, it was thought that powering down (and up) would put a strain on your PC (especially the motherboard). Leaving your drive constantly on keeps the drive hardware (the read/write head) in constant motion. I think this will generate more problems than if you shut down the PC daily after use. Once I left my system on for several days and I noticed that it was running ScanDisk all by itself. Apparently, the OS detected that the drive had been working for a rather long period of time and that a maintenance check of the drive should be done.

    In your case, since you had a head problem, shutting the system down would not have helped. You probably would have experienced off and on read/write errors.
     
  6. MudCrab

    MudCrab Imaging Specialist

    Joined:
    Nov 3, 2006
    Posts:
    6,483
    Location:
    California
    Most of the hard drives that have failed on me started with having problems when the computer was turned on. It would take several tries to get them to spin up. One just clicked and shut down during use and then began acting strangely. Another one made a very loud squealing noise when starting. I don't think tech support really believed me that it was the drive, but it was and they replaced it. I even recorded it so they could hear it. Very weird.

    I periodically service an old 386 that sits out in a barn. Sometimes the hard drive "freezes" up. A few hard slaps on the side of the case and it's going again.

    The best thing is to have a solid backup strategy in place. That way you don't even have to worry about it.
     
  7. GroomLake

    GroomLake Registered Member

    Joined:
    Jul 13, 2006
    Posts:
    116
    According to Western Digital the worst thing you can do to a drive is power it up from a cold start. Western Digital recommended leaving it on. Give them a call and see what they say. While you are at it order their diagnostic tool.
     
  8. ThunderZ

    ThunderZ Registered Member

    Joined:
    May 1, 2006
    Posts:
    2,459
    Location:
    North central Ohio, U.S.A.

    Many say that the initial start-up "jolt" is harder on an HDD than leaving it spinning constantly. This is the first I have ever heard a manufacturer say so. My 2 towers are on 24/7. The laptop is on most of the time.
    The debate "on or off" continues.
     
  9. GroomLake

    GroomLake Registered Member

    Joined:
    Jul 13, 2006
    Posts:
    116
    I only have cooling problems. Those cheap fans keep going out on me. It is a bother to replace the power supply fan.
     
  10. ThunderZ

    ThunderZ Registered Member

    Joined:
    May 1, 2006
    Posts:
    2,459
    Location:
    North central Ohio, U.S.A.

    Heat is a killer of all PC components and can cause instability of the OS as well. Keep all fans and vents clean and clear by using the cans of compressed air that can be bought almost anywhere. Upgrade your processor fan and even the heat sink. If you do not have one already, add the largest rear case fan you can, turned so it draws the air out. This will help your PS fan and help keep all components cooler as well. Your PC will thank you.
     
  11. GroomLake

    GroomLake Registered Member

    Joined:
    Jul 13, 2006
    Posts:
    116
    A lawyer friend of mine runs with his covers off and a 12 inch fan blowing on the motherboard. He is a problem solver.
     
  12. nadirah

    nadirah Registered Member

    Joined:
    Oct 14, 2003
    Posts:
    3,647
    Interesting topic. My hard disk drive has a strange habit: sometimes when I restart my computer, at the end where the screen goes black, the hard disk stops spinning for a few seconds and then starts spinning again.
    So far, I can't find any problems with it. It's like that by nature. My OEM is Packard Bell and the hard disk is a Maxtor 6Y080L0.
     
  13. Roger_

    Roger_ Registered Member

    Joined:
    May 7, 2006
    Posts:
    89
    Location:
    Portugal
    I had several disk failures before connecting the system to a UPS with AVR (automatic voltage regulation), and I was not running 24/7/365 (I shut the system down every day).
    Through the UPS software I found that every day between 7 and 10 p.m. my energy supplier is putting out voltages well below normal (sometimes around 190V when there should be at least 220V), so that might be the culprit for many failures, and it would be even worse on always-on systems!
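
    To put a number on that sag, here is a small Python sketch. The 220 V nominal and the 190 V reading come from the post above; the sample readings, the 10% tolerance, and the function names are assumptions for illustration only, not real logged data or anything the UPS software provides.

```python
NOMINAL_V = 220.0  # nominal mains voltage mentioned in the post


def sag_percent(measured_v, nominal_v=NOMINAL_V):
    """Return how far below nominal a reading is, in percent (negative if above)."""
    return (nominal_v - measured_v) / nominal_v * 100.0


def flag_undervoltage(readings, tolerance_pct=10.0):
    """Return the (hour, volts) readings that sag by more than tolerance_pct.

    The 10% default is an assumed threshold, not a published limit.
    """
    return [(hour, volts) for hour, volts in readings
            if sag_percent(volts) > tolerance_pct]


# Hypothetical evening readings as (hour of day, volts).
readings = [(18, 221.0), (19, 195.0), (20, 190.0), (21, 200.0), (23, 219.0)]

print(round(sag_percent(190.0), 1))
print(flag_undervoltage(readings))
```

    A 190 V reading works out to roughly 13.6% below the 220 V nominal.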
     
  14. Roger_

    Roger_ Registered Member

    Joined:
    May 7, 2006
    Posts:
    89
    Location:
    Portugal
    I'd add: put the largest possible fan in front of your drive stack, pulling fresh air from the outside over the drives to keep temps down (the case must have the proper ventilation holes, and the intake should be filtered to keep dust off the drives).
    I have solved some friends' HD problems with this technique...
     
  15. Mrkvonic

    Mrkvonic Linux Systems Expert

    Joined:
    May 9, 2005
    Posts:
    8,697
    Hello,

    Solution to good long HDD life:
    Good brand
    Good fans
    Good air circulation - keep the two HDD separate with an empty slot in between if possible.
    Good UPS that regulates the voltage; I'm amazed how many tiny outages and spikes there are daily - UPS is a definite life saver.
    Keep the PC on 24/7.

    Mrk

    P.S. On my biggest gaming rig, the drives have temps of 27 and 33 deg C rather constantly, even in mid summer with temps raging around 40 deg C outside. Cooled by a 120mm fan right on the front of tower, another 120mm in the back, plus extra fans for PSU and other components. Even the network card is fanned.
     
  16. ThunderZ

    ThunderZ Registered Member

    Joined:
    May 1, 2006
    Posts:
    2,459
    Location:
    North central Ohio, U.S.A.
    Have heard pros and cons on direct HDD cooling. The direct passing of air may cause a build-up of static electricity. Have spoken with the techs from my local shop about any and all forms of cooling. While they have not documented it, they feel they see a higher rate of HDD failures when dedicated cooling is in place. I totally agree with placing a quality fan in every available opening designed for one in a case. (I never use anything smaller than a mid-size case when building, and prefer full-size. Size, when it comes to case cooling, does matter.) There are passive HDD coolers (heat pipe technology) available which get pretty good reviews. I plan on using one when I upgrade to the WD Raptor.
     
  17. Cerxes

    Cerxes Registered Member

    Joined:
    Sep 6, 2005
    Posts:
    581
    Location:
    Northern Europe