Seagate Barracuda 7200.11 Troubles


That's certainly true for the chance of hitting when you are near the boundary. But if you consider that you are approaching the boundary with three times the speed and there is always a next boundary to hit (initially at 320, then at every multiple of 256), you are getting close to those boundaries three times as often.

In the long run, that amounts to 1/3 (chance of hitting when near) * 3 (frequency of being near the boundary) = 1, so the overall probability would be the same.
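The "1/3 * 3 = 1" argument can be checked by brute force. The sketch below assumes the hypothetical counter discussed in this thread: it grows by a fixed number of log entries per power cycle, and the drive is hit when the counter lands exactly on 320 or on 320 plus any multiple of 256. The function name and its parameters are made up for illustration; nothing here is taken from the actual firmware.

```python
def count_hits(step, cycles, start=0, first=320, period=256):
    """Count power cycles on which the counter lands exactly on a critical value.

    Assumed model: critical values are first, first + period, first + 2*period, ...
    """
    hits = 0
    counter = start
    for _ in range(cycles):
        counter += step  # 'step' log entries are written during this power cycle
        if counter >= first and (counter - first) % period == 0:
            hits += 1
    return hits


cycles = 256 * 3000
for step in (1, 3):
    hits = count_hits(step, cycles)
    print(f"{step} entries/cycle: {hits} hits in {cycles} cycles, "
          f"about 1 hit every {cycles / hits:.1f} cycles")
```

Under these assumptions both step sizes come out at roughly one hit per 256 power cycles in the long run, which is what the argument above predicts.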

Still, not entirely my point, it's a bit hard to explain myself. :unsure:

The 1/3, in the hypothetical system depicted, makes things a bit different depending on the value the counter is initially set to at the factory.

First 3 "critical values" are:

  1. 320 =(320+256*0)
  2. 576 =(320+256*1)
  3. 832 =(320+256*2)

Since 320 cannot be divided by 3, IF the counter is initially set to 0, AND all logs are made in three entries, you have 0% probabilities to hit 320, as

106*3=318 (n=106)

and

107*3=321 (n=107)

Next occurrence is 256 steps away, but since 320+256=576 can be divided by 3, you have 100% probability (read certainty) ;) to hit 576 (with n=192), and so on.

If the counter is initially set to 1, you have 0% probability to hit 320, as well as 0% probability to hit 576, but you will hit 832 (with n=277).

If the counter is initially set to 2, you have 100% probability to hit 320 (with n=106).

If the counter is initially set to 3, same as if it were set to 0, only first hit will come at corresponding n-1: 192-1=191

...and so on.

Of course, just as in the case of a single event log per power cycle (where if you started with value 319 you were dead on first shot), if you start with 317 you are dead, but if you start with 319 you have 171 chances before hitting next "critical value".

Best value to start with is 1 with which you have 277 cycles, but starting with 4, 7, 10 .... will only lessen your cycle life by 1 , 2, 3, etc.

If you are unlucky and start with 2, you have only 106 cycles......

Roughly the difference between 3 and 9 months of "life".
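The cycle counts above can be reproduced with a few lines of code. This is only a sketch of the hypothetical system described in this post: exactly three log entries per power cycle, critical values at 320, 576, 832 and so on. The function name and the `max_cycles` safety limit are invented for the example.

```python
def cycles_until_hit(start, step=3, first=320, period=256, max_cycles=100_000):
    """Return the number of power cycles n after which start + n*step
    lands exactly on a critical value, or None if it never does."""
    for n in range(1, max_cycles + 1):
        counter = start + n * step
        if counter >= first and (counter - first) % period == 0:
            return n
    return None


for start in (0, 1, 2, 3, 4, 317, 319):
    print(f"initial counter {start}: first hit after {cycles_until_hit(start)} cycles")
```

Changing `step` to 1 gives the single-entry case, where a counter starting at 319 is hit on the very first power cycle.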

If the logs are written sometimes in triplets, sometimes in couples, sometimes in single "lines", I don't think there is a way to calculate probabilities, but in any case the probabilities should be lower than what was calculated for all single "lines".

jaclaz

Link to comment
Share on other sites


Still, not entirely my point, it's a bit hard to explain myself. :unsure:

OK, I hope now I've got it :thumbup !

You're saying that there are certain logging patterns, which might decrease the probability of hitting the critical counter values, right?

If so, I'd say, it's entirely possible, though unlikely that such patterns exist. But certainly, we cannot know for sure.

If there is some variance in the number of log entries written per power cycle, the probability of drive failure should be along the curves already presented. The lesser the average number of log entries, the higher the chance of failure, but the overall magnitude does not change much.

On another topic: I happen to own a Dell OEM drive (ST3500620AS). It currently runs Dell's OEM firmware DE12. Dell has issued an updated version DE13 to address the Barracuda bug, but the update's batch file calls the updater's executable with a parameter "-m BRINKS". In contrast, Seagate's SD1A firmware update for that drive is called "MOOSE...". What happens if the Dell folks inadvertently published the wrong version and I incorrectly apply a BRINKS firmware to a MOOSE drive? Will it just stop working or will it get even worse (silently and slowly altering data, for example)?

Link to comment
Share on other sites

Many of you are still making outrageous statements about the depth of the problem. Try getting the exact number of 7200.11 disks seagate shipped rather than relying on a previous post that takes my 1,000,000,000 annual drive forecast for 2014 and dividing by 3 as a starting point. You will find that the actual number of 7200.11s is much smaller.

Secondly, consider that only some of the disks with the same firmware are affected, and that the failure isn't due to mechanical failure. This means the secret sauce relies on something that happened to SOME of the disks either in QC or as part of the manufacturing process. One can independently deduce this by considering that you have to give Seagate the serial number. So think about how many QC test stations there are on the floor and consider that it is more likely that only one of them was configured to leave the trigger code on the disks.

The anger at Seagate (and me for daring to put things in perspective) is misguided. As I posted on the storagesecrets.org site,

"So, as they say, I feel your pain. A "lot" of people are having this issue? Millions? I don't see Dell, HP, SUN, IBM, EMC, and Apple making press releases about how Seagate burned them. You don't think Apple would drop Seagate in a heartbeat if they felt Seagate had a real-world, high-risk problem? Not to say any particular manufacturer is more in-tune with their customer base than others ... but do you think Apple or anybody else would have signed with Fujitsu by now if this was a problem that they were concerned with?

Sorry to be so blunt, but can't you concede that if there was a large-scale risk that was even a fraction of the AFR of disk drives due to head crashes and component failure that the PC vendors wouldn't be setting up major programs to do damage control with their customers? It isn't happening. The only way to explain the quiet from the PC vendors is that the risk is profoundly low.

The dirty little secret: disk vendors tell their top customers about such issues almost as soon as they find them. It is likely everybody knew about it. Put 2 + 2 together and ask why there is no story among the PC vendors, and why nobody jumped from Seagate."

So all of these other vendors HAD to have known about the problem from the beginning. It would not be unreasonable for them to also receive the complete lists of affected serial numbers (But I am not saying they were given the list as fact, it is my opinion that they were given lists of the affected drives that Seagate shipped them).

========

Now for something else ... all you 7200.11 owners, to help counteract the flames many of you sent me in private or online somewhere .. I am not some stealth P/R firm hired by Seagate. Here is a nice little post that shows you that the 7200.11 disks you all have are "rated" for only 2400 hours use per year. http://storagesecrets.org/2009/01/failure-...usage-annually/. These disks are desktop disks and were never designed for 24x7 use, or even high duty. Even though I have some myself, I use them as part of a RAID group, so when one of them dies I will just replace it. If any of you have non-RAIDed (RAID with parity, not RAID0) 7200.11s, then you need to make sure you back up often.

Edited by dlethe
Link to comment
Share on other sites

Hello:

I am new here and am wondering if anyone from Canada is here? I have this exact same problem everyone is having, but this time, Seagate tells me to contact HP, as it's a problem with HP and not related to Seagate.

HP then tells me, they will ship another HD (did not specify which one...wonder why)

Now the important thing is to get the data back. Some data recovery services are asking anywhere between $1000 - 1500.

You guys are great here, and it looks like several choices are available.

My question is (based on the 2 types of issues with the 7200.11):

How do I know which problem I have? The only thing my HDD does is not boot up. It worked up until this morning, and after a reboot, it got stuck at "Boot System Failure". When I ran a diagnostic, it did not find the "HDD".

Also, would the HD doctor mentioned here come with both hardware and software?

Any help would be appreciated.

Thanks

Tony

Edited by tontano68
Link to comment
Share on other sites

dlethe,

few things that i'd like to point out:

1. majority of people here ARE those affected by this problem. that means working today, reboot, and it goes poof.

2. majority of the people here, when contacting Seagate, were told that there were no problems and to send in the drive for RMA. for data recovery please pay. it has since changed, as seagate is now admitting a problem with the hard disk and has asked people to update firmware and contact them for free recovery (incidentally, has anyone actually got their drives repaired for FREE??)

3. it doesn't matter if the problem affects hundreds or thousands or millions of hard disks. what matters is we have 100+ people from all over the world reporting this problem. we are not even talking about those who just RMA-ed their drives thinking it's a 'real' hardware fault. i personally know a friend who just RMA-ed the drive, not expecting it to be a firmware problem. i agree that 100+ is very small, but the fact is that it happened around the same time, plus there must be tons of people out there who just RMA-ed or returned the drives (see news about how certain stores are CLEARING OUT the 7200.11 with deep discounts??) so the 100+ could be just the tip of the iceberg. no one will ever know exactly how many disks were affected, except seagate themselves and i'm 100% sure they will NEVER disclose that info.

4. for you to come in here and defend seagate, it's just plain stupid - especially if you are not paid by them, and you are disclosing your info and business. please do yourself and your future paychecks a favour, and stop coming in here to defend Seagate. NO ONE here has any good opinions of seagate after how they dealt with us and banned us from their forums.

5. if you wish to defend seagate, i suggest you do it in your storagesecrets AND believe me when i say this - do it in the seagate forums. not only will people be more receptive to you, the seagate moderators would probably love you to death.

6. your statements about none of the other major vendors making press releases - don't you think if they really encounter this problem, it will be much better for them to settle this quietly with seagate. why would they want to make a big fuss out of this and THREATEN their own PC/notebook sales? you think people will buy HP/Dell/Apple/<insert whatever brand name here> if that company went public to complain about Seagate OEM disks which they use in their PC/notebooks giving THIS problem?

7. also, do you know which pc/notebook vendors are using the 7200.11 F/W SD15 in their products?? how do you know that they are using this 7200.11 problem batch and not the more stable 7200.10 batches? if they are not using 7200.11 F/W SD15, don't you think that explains the lack of this problem with the major PC/notebook vendors??

8. your statement about 2400 hours use per year - i can fully accept it if the disks start to show more bad sectors, or start to have read/write problems when they are used beyond the 2400 hours but for it to just disappear on next reboot??? surely that 2400 hours has got nothing to do with this problem. besides, 2400 hours is equal to 100 days. we have people whose hard disks failed LESS than 100 days. still a valid point then??

just believe me when i say this - i don't see why anyone would disclose their identity and company, and come out to defend seagate with no clear benefits to themselves/their company. as someone mentioned before, you are just giving yourself and your company a black eye by doing that.

Link to comment
Share on other sites

No, you don't have the exact same problem. You have symptoms of a dead disk drive. The problem could be anything from media failure to bad power supply, chip failure, SATA interface failure, or a bad cable. That is the big point of my earlier posts. People think the boot-of-death issue is the root cause for 7200.11 drive failures. How do you find out? You ask (or pay) for a failure analysis. There is no way to know for sure the root cause of a failure such as yours, where nothing is returned, unless you do a post-mortem .. unless you have a lot of equipment at your house and know how to use it.

Anyway, you can probably get much better pricing for data recovery services. Perhaps under $500. Some services will tell you what is wrong and will quote you recovery price for free, once you send them the disk. If price is too high, then you pay postage and they return it to you.

Link to comment
Share on other sites

hi tony,

these 2 threads have the details on how to fix this - IF AND ONLY IF your hard disk is suffering from this BSY and 0 LBA problem.

http://www.msfn.org/board/index.php?showtopic=128807

http://www.msfn.org/board/index.php?showtopic=129263

You'll need to get those RS232 components to connect to your hard disk to check.

Good luck!

Link to comment
Share on other sites

You're saying that there are certain logging patterns, which might decrease the probability of hitting the critical counter values, right?

If so, I'd say, it's entirely possible, though unlikely that such patterns exist. But certainly, we cannot know for sure.

If there is some variance in the number of log entries written per power cycle, the probability of drive failure should be along the curves already presented. The lesser the average number of log entries, the higher the chance of failure, but the overall magnitude does not change much.

Yep :).

All in all, I would say that most probably chances are a bit lower than what was initially estimated, but still in the same order of magnitude.

Many of you are still making outrageous statements about the depth of the problem. Try getting the exact number of 7200.11 disks seagate shipped rather than relying on a previous post that takes my 1,000,000,000 annual drive forecast for 2014 and dividing by 3 as a starting point. You will find that the actual number of 7200.11s is much smaller.

Just for the record, my previous post took "your" 1,000,000,000 and divided it, for several reasons, by several factors:

3 (maybe current production)

3 (maybe number of drives in the "family")

1.1 (rounding down)=>at this stage the number of drives in the possibly "affected" family is 100,000,000, i.e. 1/10 of "your" 1,000,000,000

10 ("safety factor")=>at this stage the number of drives in the possibly "affected" family is 10,000,000, i.e. 1/100 of "your" 1,000,000,000

500=1/0.002 (to take into account Seagate statement)

(3*3*1.1*500*10)=49,500

i.e. 1,000,000,000/49,500=20,202 => 20,000

In other words, the hypothesis is that throughout 2008 Seagate manufactured between 10,000,000 and 100,000,000 drives of the "family".

Then, the lower number is taken and multiplied by the smallest possible incidence of "affected drives" (per Seagate statement) 0.002, i.e. that 1/500 of the drives in the family may be affected.

Since "some percentage" can mean ANY number <1, the found 20,000 can easily come out by a lesser number of drives "in the family" manufactured multiplied by a higher percentage:

10,000,000*0.002=20,000 (1/500)

5,000,000*0.004=20,000 (1/250)

2,500,000*0.008=20,000 (1/125)

1,000,000*0.02=20,000 (1/50)

while still within the same definition of "some percentage", and at the lower end of it......
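For anyone who wants to play with the assumptions, the chain of divisions above can be written out as a short script. This is just the arithmetic from this post, nothing more: every factor is a guess already listed above, and the variable names and dictionary labels are made up for the example.

```python
from fractions import Fraction
from math import prod

FORECAST = 1_000_000_000  # the annual drive forecast quoted in the thread

# Divisors exactly as listed in the post; Fraction keeps the 1.1 exact.
divisors = {
    "current production": 3,
    "drives in the family": 3,
    "rounding": Fraction(11, 10),
    "safety factor": 10,
    "Seagate statement (1/0.002)": 500,
}

total_divisor = prod(divisors.values())
affected = FORECAST / total_divisor
print(f"combined divisor: {total_divisor}, estimated affected drives: {int(affected)}")

# The same 20,000 from a smaller family and a higher incidence (1 drive in N).
scenarios = [(10_000_000, 500), (5_000_000, 250), (2_500_000, 125), (1_000_000, 50)]
for family, one_in in scenarios:
    print(f"{family:>10} drives, 1/{one_in} affected -> {family // one_in}")
```

Swapping any of the factors shows quickly how much (or how little) the final order of magnitude moves.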

Without official data, and as clearly stated, the above numbers are just speculative, and, while they might be inaccurate, the order of magnitude seems relevant enough to rule out that the 100 to 150 reports here on MSFN represent a significant fraction (1/7 or 1/8) of all affected drives.

jaclaz

Edited by jaclaz
Link to comment
Share on other sites

*Snip the braying and neighing of barnyard animals.*

The anger at Seagate (and me for daring to put things in perspective) is misguided. As I posted on the ONE MONTH OLD site,

...

*Snip more animal sounds.*

Now for something else ... all you 7200.11 owners, to help counteract the flames many of you sent me in privacy or online somewhere .. i am not some stealth P/R firm hired by Seagate...

*Snip.*

Now, see, I just plain don't like you because you come across as a pompous scumbag figurehead.

Maybe if your "website" was more than a month old and filled with something recognizable as more than elaborate filler material, I'd think more of you and your "credentials." Maybe if it was hosted by a company that put more than ads into their WHOIS data. Maybe if you didn't constantly belittle the amount of effort put into this problem by CONSUMERS regardless of the outright failures and lies of the Seagate PR department.

Who knows - maybe I'm just too picky when it comes to people. I'd like to think that people who'd post on this topic would either be people afflicted with the problem, owners of hardware they are concerned will soon fail, or people trying to help by posting constructive information - not people that sit there and link to their month old depository of Seagate handjobs at EVERY POSSIBLE CHANCE.

So either you're being paid or you hope to be paid.

Also, make up your mind. You say on your site, and I quote, "...ships 10,000,000 Barracuda disk drives a month..." - and yet you chide jaclaz for his numbers. Your 120,000,000 drives a year and his 111,111,111 drives a year are remarkably similar, don't you think? Actually, your number is higher. Huh. :blink:

You estimate a 1:65,536 chance at failure. You honestly think there's only about 1,800 drives that will ever be affected by this issue? Seagate wouldn't even have an intern look at a problem that small, let alone stonewall for months, release firmware updates, and offer free data recovery. They'll end up spending a lot more than the $10,000 they made selling those drives.

Anyways.

Link to comment
Share on other sites

If it's true it's a 1:65,536 chance at failure - I think I should go buy some lottery tickets or something. I'm bound to win SOMETHING if I'm that 'lucky'..... For those with two or more 7200.11 bricks, I suggest you STOP reading now and go buy your tickets or enter some contests or whatever! the odds mentioned might just be true....

Link to comment
Share on other sites

2. majority of the people here, when contacting Seagate, were told that there were no problems and to send in the drive for RMA. for data recovery please pay. it has since changed, as seagate is now admitting a problem with the hard disk and has asked people to update firmware and contact them for free recovery (incidentally, has anyone actually got their drives repaired for FREE??)

I did get my drive repaired for free by Seagate data recovery (confirmation on the bill) but still I am waiting to get it back (should arrive within days).

Link to comment
Share on other sites

Also, make up your mind. You say on your site, and I quote, "...ships 10,000,000 Barracuda disk drives a month..." - and yet you chide jaclaz for his numbers. Your 120,000,000 drives a year and his 111,111,111 drives a year are remarkably similar, don't you think? Actually, your number is higher. Huh. :blink:

Well, you cannot do 12*10,000,000, you have to take into account some holidays....;)

jaclaz

Link to comment
Share on other sites

Many of you are still making outrageous statements about the depth of the problem.
I'd say some of us made attempts at judging the affected drive population. They put all their numbers and assumptions on the table for further discussion.

Try getting the exact number of 7200.11 disks seagate shipped
That's not a published number, right? Are you saying this simply to disparage any other attempt to estimate that number?

So think about how many QC test stations there are on the floor and consider that it is more likely that only one of them was configured to leave the trigger code on the disks.
Now you're making wild guesses without any factual basis. You don't know the percentage of test stations writing the "trigger code". That's a number Seagate didn't dare to publish so far and that might be for a reason.

A “lot” of people are having this issue? Millions? I don’t see Dell, HP, SUN, IBM, EMC, and Apple making press releases about how Seagate burned them. You don’t think apple would drop Seagate in a heartbeat if they felt Seagate had a real-world, high-risk problem?
Pure speculation. You claim to be a technical expert but you're making assumptions based on corporate psychology. In addition, you're ignoring two little facts: (1) OEMs may be legally responsible for damages incurred by their customers. (2) There are not that many disk drive manufacturers around that a large scale product buyer would light-heartedly agree to reduce the number of competitors.

The only way to explain the quiet from the PC vendors is that the risk is profoundly low.
That's the only way you can imagine. We might or might not agree. Anyway, it's probably just too early to tell.

So all of these other vendors HAD to have known about the problem from the beginning. It would not be unreasonable for them to also receive the complete lists of affected serial numbers (But I am not saying they were given the list as fact, it is my opinion that they were given lists of the affected drives that Seagate shipped them).
You still believe this though Seagate took several attempts to publish a working online serial number check?

Here is a nice little post that shows you that the 7200.11 disks you all have are "rated" for only 2400 hours use per year.
You are misstating the facts. Seagate simply states the usage patterns employed for AFR and MTBF calculations (2400 power-on-hours, 10,000 start/stop cycles). That does not mean at all that desktop drives have a higher probability of failure when used 4800 hours per year or any other number for that matter. You cannot tell. Seagate didn't publish data for alternative usage patterns. So you're the one spreading FUD here.

BTW, in some respects server disk drives operating 24/7 can have a weaker design compared to desktop, let alone notebook drives: They don't need to withstand the high number of start/stop cycles. So higher price point doesn't necessarily mean more robust design for every usage scenario.

Link to comment
Share on other sites

The 7200.11 MTBF is 750,000 hours and the AFR is 0.34%. These numbers are supposedly based on 2400 hours use per year (that's about 6.5 hours every day over 1 year).

Whether they are calculated based on 2400 hours per year or 8760 hours per year, it doesn't mean anything to us.

Firstly, AFR + MTBF values are for HARDWARE failures. Not firmware defects.

Secondly, assuming the initial 2400 hours usage per year:

750000/2400=312.5 years

1/312.5*100=0.32% (which roughly matches the AFR rate quoted by Seagate at 0.34%).

This means over one year, 0.34% of all 7200.11 drives will fail.

Now assuming we use the hard disks 24x7 or 8760 hours per year.

750000/8760=85.62 years

1/85.62*100 = 1.17%

An increase in failure rates of about 3.5 times. Assuming Seagate does ship 120,000,000 cudas in a year, we're talking about:

408,000 disk failures in one year out of 120,000,000 disks - 0.34% AFR

1,404,000 disk failures in one year out of 120,000,000 disks - 1.17% AFR

while the absolute numbers look huge, they are still a small fraction out of the 120,000,000 disks. Still does not explain how an AFR/MTBF calculation based on 2400 hours is supposed to 'justify' the apparently 'higher rates' of failures.
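For anyone who wants to redo the numbers, here is a minimal sketch of the calculation. It assumes the simple approximation used above, AFR = hours used per year / MTBF, which is reasonable as long as the yearly hours are tiny compared to the MTBF. The fleet size is the 120,000,000 figure quoted in this thread, not an official number, and the function name is made up for the example.

```python
MTBF_HOURS = 750_000
FLEET = 120_000_000  # yearly shipments as assumed in the thread, not official data


def afr_percent(power_on_hours_per_year, mtbf_hours=MTBF_HOURS):
    """Approximate annualized failure rate (%) as hours used per year / MTBF."""
    return power_on_hours_per_year / mtbf_hours * 100


for hours in (2400, 8760):
    afr = afr_percent(hours)
    failures = FLEET * afr / 100
    print(f"{hours} h/year: AFR about {afr:.2f}%, roughly {failures:,.0f} failures per year")
```

Note that this only covers the kind of hardware failures MTBF is quoted for; a firmware defect does not follow this curve at all.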

Besides, as mentioned, MTBF and AFR are quoted for HARDWARE FAILURES. Not firmware defects. If you have a firmware defect with the proper conditions, the AFR is going to be 100% - why do you think Seagate is releasing firmware SD1A to fix this problem?

Link to comment
Share on other sites
