sieve-x Posted February 18, 2009 (edited)

This thread aims to gather more information and research the current issues for those who repaired and/or updated their affected drives to newer firmware. It also includes an experimental failure analysis for those still running old firmware and power-cycling (despite the risk of drive failure).

WHY
- Check reports of problems after the drive was updated to new firmware (with or without repair).
- Check reports from people who repaired their drives using different methods and are now having issues (ex. BSODs due to read/write errors, bad sectors, corrupted data, unable to format).
- Check the first symptoms before the drive became BSY/0 LBA (ex. failed read/write operations).
- Verify variables (like temperature and short DSTs) that can affect the drive's internal event log count.
- Get a better understanding and more in-depth research of failures/current issues.

MINIMUM REQUIREMENTS
1. Have an affected model: Seagate 7200.11, ES2.1, SV35.3, SV35.4, DiamondMax22, ...
2. Have or had affected firmware: AD14, SD15, SD16, SD17, SD18, SD19, MX15, ...
3. The serial number check at the Seagate website says the drive is affected (use uppercase).

TESTING
If your drive was bricked and then recovered and/or updated to the latest firmware, but you are now experiencing problems (ex. corrupted files, stop 0x0000008E/C000009C BSODs, unable to read some files, etc.), please fill in the appropriate fields below and reply to the topic:

PREVIOUS ISSUE SYMPTOMS: (ex. none, just rebooted and drive became 0 MB)
CURRENT ISSUE SYMPTOMS: (ex. bad sectors, corrupted files)
LISTED AS AFFECTED: Y/N (put Y if the Seagate serial number check tool says it's affected)
REPAIRED: Y/N (if it's a drive recovered from BSY/0 MB)
METHOD: NONE / PAID DR SERVICE / SEAGATE DR (ex. i365) / OTHER: (ex. nickname method)
APPLIED WRONG FIRMWARE BEFORE: Y/N
UPDATED FW AFTER/BEFORE REPAIR: A/B
PREVIOUS FIRMWARE: (ex. SD15)
CURRENT FIRMWARE: (ex. SD1A)
EXTERNAL DRIVE (USB/ESATA/1394): Y/N

Follow the optional steps below to provide a more detailed scope of the problem(s):

1. Get HD Tune 2.55 (for a simple and quick read-only surface scan).

2. Get smartmontools (for dumping S.M.A.R.T attributes and error logs):
   Stable 5.38 release (Win32 at end of page):
   http://smartmontools.sourceforge.net/download.html
   Latest 5.39 (20090303) release from CVS (may be unstable) compiled for Win32:
   http://smartmontools-win32.dyndns.org/smartmontools/
   smartmontools-5.39-0-20090303.win32-setup.exe
   MD5: 1707c505724e71c24fe023b630e7d4fa

3. Create a batch file, copy and paste the content below and save it as smartchk.bat:

   Using smartmontools 5.38 release

   @echo off
   smartctl -s on -a -q noserial /dev/pd0 >> c:\smartchk.txt

   Using smartmontools 5.39 CVS 2009/02/08 (recommended because it provides more information)

   @echo off
   smartctl -s on -a -q noserial /dev/pd0 >> c:\smartchk.txt
   smartctl -l sataphy /dev/pd0 >> c:\smartchk.txt
   smartctl -l scttemp /dev/pd0 >> c:\smartchk.txt
   smartctl -l xerror /dev/pd0 >> c:\smartchk.txt

   Use pd0 for the 1st physical drive, pd1 for the 2nd and so on. Check the numbering under Computer Management > Disk Management. Support for external drives and RAID is limited. Try adding the -d option (option = 3ware, areca, hpt, sat, usbcypress).

4. Run the surface error QUICK scan in HD Tune and provide a tiny screenshot (thumbnail).

5. Run smartchk.bat and provide the information it collects in smartchk.txt.

Links for learning more about S.M.A.R.T and how to interpret attributes/results:
http://en.wikipedia.org/wiki/S.M.A.R.T.
http://www.almico.com/sfarticle.php?id=2
http://www.drivehealth.com/attributes.html
http://www.hdsentinel.com/smart/index.php

S.M.A.R.T DRIVE SELF-TESTS (DST) (fully optional, do them at your own risk)

1. Off-Line Data Collection:
   smartctl -t offline /dev/pdX (wait until it completes)
   smartctl -l selftest /dev/pdX (to read self-test log)

2. Short Self-Test:
   smartctl -t short /dev/pdX (wait until it completes)
   smartctl -l selftest /dev/pdX (to read self-test log)

3. Long Self-Test:
   smartctl -t long /dev/pdX (wait until it completes)
   smartctl -l selftest /dev/pdX (to read self-test log)

4. Conveyance Self-Test (commonly used for testing a new drive for shipping damage):
   smartctl -t conveyance /dev/pdX (wait until it completes)
   smartctl -l selftest /dev/pdX (to read self-test log)

The X in pdX is the physical drive number from 0-99. These are the same drive self-tests (DST) as in Seatools and can be used for defect reallocation. It's possible to run them under Windows, but the drive should be idle (preferably with no mapped letter) because disk activity can cause a test to abort/fail.

NOTE: DST tests don't write anything to existing data and are generally safe (as long as the firmware doesn't have bugs in this respect), but they stress the drive (ie. no need to run a long DST test on a daily basis), and a backup is always recommended before running any kind of test (same goes for Seatools).

EXPERIMENTAL FAILURE ANALYSIS (only for drives still under old firmware affected by BSY/0MB)

Those with an affected drive (model + serial number check on the Seagate website) still working under old firmware (AD14, SD15, SD16, SD17, SD18, SD19, etc) and power-cycling (ie. for a firmware update, or because they do not want to apply new firmware anyway) despite the risks can help a small research effort on failure analysis with the following procedure:

1. Get the latest smartmontools 5.39 from CVS or download it already compiled for Win32:
   http://smartmontools-win32.dyndns.org/smartmontools/
   smartmontools-5.39-0-20090303.win32-setup.exe
   MD5: 1707c505724e71c24fe023b630e7d4fa

2. Create a batch file, copy and paste the content below and save it as c:\smartdmp.bat

   @echo off
   smartctl -s on -a -q noserial /dev/pd0 >> c:\smartlog.txt
   smartctl -l sataphy /dev/pd0 >> c:\smartlog.txt
   smartctl -l scttemp /dev/pd0 >> c:\smartlog.txt
   smartctl -l xerror /dev/pd0 >> c:\smartlog.txt
   smartctl -l gplog,0xa1,0+19 /dev/pd0 >> c:\smartlog.txt

   Use pd0 for the 1st physical drive, pd1 for the 2nd and so on. Check the numbering under Computer Management > Disk Management. Support for external drives and RAID is limited. Try adding the -d option (option = 3ware, areca, hpt, sat, usbcypress).

3. Run smartdmp.bat every time before a shutdown, or set it up to run automatically (ex. use the group policy tool: Start > Run > gpedit.msc, and under Computer Configuration > Windows ... Settings > Scripts click on Shutdown and point it to the batch file). Others may help here.

4. If the drive fails and then gets repaired, please provide the smartlog.txt contents for analysis. This may help to pinpoint a pattern (from raw attribute values, logs, incorrect checksum, etc) for failure prediction and/or a workaround. (IMPORTANT: Only provide the log if your drive freezes. PM or a file hosting service is preferred because the log can become large by the time a drive failure occurs.)

IMPORTANT
- The drive serial number is NOT needed and the dump does not collect it (-q noserial option), but you should check it against the Seagate web tool. Otherwise it may be a different problem.
- All smartctl command-line options are case sensitive and some are release dependent.
- I'm recommending HD Tune 2.55 (not 3.0) over more complex tools (ie. HDDScan, Victoria, etc) because it's very simple to use, does not include any dangerous option (ie. erase/write) and also provides temperature monitoring and a screenshot feature. Use more complex tools at your own discretion.
- You should be aware the drive may already have had some issue before the firmware update or repair.
- Although the procedure can be considered safe, it's provided AS IS without any warranties/assistance.

Edited March 9, 2009 by sieve-x
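For anyone with more than one affected drive, editing pd0 by hand in smartdmp.bat gets tedious. Below is a rough Python sketch that only generates the batch text for a given drive number. The function names (build_dump_commands, build_batch) and the optional device_type parameter are made up for illustration; the smartctl options themselves are exactly the ones listed in the procedure above. It does not run smartctl, it just builds the lines.

```python
def build_dump_commands(drive_index, log_path=r"c:\smartlog.txt", device_type=None):
    """Return the smartctl lines used by smartdmp.bat for one physical drive.

    drive_index -- the X in /dev/pdX (0-99, as stated in the post)
    device_type -- optional value for smartctl's -d option (ex. "sat")
    """
    if not 0 <= drive_index <= 99:
        raise ValueError("drive_index must be between 0 and 99")
    device = f"/dev/pd{drive_index}"
    d_flag = f"-d {device_type} " if device_type else ""
    # Same option sets, in the same order, as the smartdmp.bat content above.
    option_sets = [
        "-s on -a -q noserial",
        "-l sataphy",
        "-l scttemp",
        "-l xerror",
        "-l gplog,0xa1,0+19",
    ]
    return [f"smartctl {d_flag}{opts} {device} >> {log_path}" for opts in option_sets]


def build_batch(drive_index, **kwargs):
    """Return the full batch file text (CRLF line endings for Windows)."""
    lines = ["@echo off"] + build_dump_commands(drive_index, **kwargs)
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    # Print a smartdmp.bat body for the 2nd physical drive (pd1).
    print(build_batch(1))
```

Save the printed text as c:\smartdmp.bat yourself; whether a -d type is needed at all depends on your controller, so treat that parameter as something to experiment with.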
sieve-x Posted February 18, 2009 (edited)

************** THIS IS ONLY AN EXAMPLE *******************

PREVIOUS ISSUE SYMPTOMS: MFT errors, rebooted and became BSY.
CURRENT ISSUE SYMPTOMS: Loud clicking noises (not present before)
LISTED AS AFFECTED: Y
REPAIRED: Y
METHOD: PAID DR
APPLIED WRONG FIRMWARE BEFORE: N
UPDATED FW AFTER/BEFORE REPAIR: A
PREVIOUS FIRMWARE: SD15
CURRENT FIRMWARE: SD1A
EXTERNAL DRIVE (USB/ESATA/1394): N

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.11
Device Model:     ST3500320AS
Firmware Version: SD1A
User Capacity:    500 107 862 016 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Sat Jan 24 17:08:18 2009 RST
SMART support is: Available - device has SMART capability.
                  Enabled status cached by OS, trying SMART RETURN STATUS cmd.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 ( 642) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 117) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x103b) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   117   099   006    Pre-fail  Always       -       147144821
  3 Spin_Up_Time            0x0003   096   094   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       158
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       1
  7 Seek_Error_Rate         0x000f   037   036   030    Pre-fail  Always       -       10810444186323
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       908
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       2
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       207
184 Unknown_Attribute       0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   069   066   045    Old_age   Always       -       31 (Lifetime Min/Max 31/31)
194 Temperature_Celsius     0x0022   031   040   000    Old_age   Always       -       31 (0 11 0 0)
195 Hardware_ECC_Recovered  0x001a   044   032   000    Old_age   Always       -       147144821
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%       871         -
# 2  Extended offline    Completed without error       00%       851         -
# 3  Short offline       Completed without error       00%       801         -
# 4  Short offline       Completed without error       00%         0         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

=====

This example suggests the drive may have suffered shipping or handling damage: Seek_Error_Rate (value 037, worst 036) is very close to its threshold of 030.

Edited February 18, 2009 by sieve-x
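The "too close to threshold" reading in the example above is just VALUE minus THRESH for each attribute row. Here is a small Python sketch of that arithmetic, assuming the attribute table has been saved with one row per line as smartctl normally prints it. The names ROW, parse_attributes and headroom are invented for this illustration, and the regular expression is my own guess at the column layout, not something taken from smartmontools.

```python
import re

# One attribute row: ID, name, flag, VALUE, WORST, THRESH, type, updated,
# when_failed, then the raw value (which may contain spaces).
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+(0x[0-9a-fA-F]+)\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+(\S+)\s+(\S+)\s+(\S+)\s+(.*)$"
)


def parse_attributes(text):
    """Extract attribute rows from smartctl -a style output."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append({
                "id": int(m.group(1)),
                "name": m.group(2),
                "value": int(m.group(4)),
                "worst": int(m.group(5)),
                "thresh": int(m.group(6)),
                "raw": m.group(10).strip(),
            })
    return rows


def headroom(rows):
    """Return (name, VALUE - THRESH) pairs, smallest headroom first.

    Attributes with a threshold of 0 are skipped because they can never
    trip the threshold check.
    """
    pairs = [(r["name"], r["value"] - r["thresh"]) for r in rows if r["thresh"] > 0]
    return sorted(pairs, key=lambda p: p[1])


# Three rows copied from the example report above.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       1
  7 Seek_Error_Rate         0x000f   037   036   030    Pre-fail  Always       -       10810444186323
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       2
"""

if __name__ == "__main__":
    for name, margin in headroom(parse_attributes(SAMPLE)):
        print(f"{name}: {margin} above threshold")
```

The number still needs a human reading: Spin_Retry_Count shows only 3 points of headroom here simply because its threshold (097) sits right under the usual value of 100, while Seek_Error_Rate at 7 points has actually dropped to get there.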
eli2k Posted February 21, 2009

PREVIOUS ISSUE SYMPTOMS: Put into hibernate, brought out of hibernate 1min later, and became BSY.
CURRENT ISSUE SYMPTOMS: none
LISTED AS AFFECTED: Y
REPAIRED: Y
METHOD: RS232 controller to send commands to fix it
APPLIED WRONG FIRMWARE BEFORE: N
UPDATED FW AFTER/BEFORE REPAIR: A
PREVIOUS FIRMWARE: SD15
CURRENT FIRMWARE: SD1A
EXTERNAL DRIVE (USB/ESATA/1394): N

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.11
Device Model:     ST3500320AS
Firmware Version: SD1A
User Capacity:    500,106,780,160 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Sat Feb 21 02:30:14 2009 PST
SMART support is: Available - device has SMART capability.
                  Enabled status cached by OS, trying SMART RETURN STATUS cmd.
SMART support is: Enabled

=== START OF ENABLE/DISABLE COMMANDS SECTION ===
SMART Enabled.

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 ( 650) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 122) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x103b) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   110   099   006    Pre-fail  Always       -       29346917
  3 Spin_Up_Time            0x0003   094   094   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       10
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   100   253   030    Pre-fail  Always       -       47028
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       355
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       144
184 Unknown_Attribute       0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   075   069   045    Old_age   Always       -       25 (Lifetime Min/Max 16/25)
194 Temperature_Celsius     0x0022   025   040   000    Old_age   Always       -       25 (0 14 0 0)
195 Hardware_ECC_Recovered  0x001a   027   022   000    Old_age   Always       -       29346917
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
Gradius2 Posted February 23, 2009

Keep in mind to use the LATEST firmware available, as I reported here:
http://www.msfn.org/board/index.php?showto...st&p=836051