Storage Review Drive Reliability Survey

Adcadet · Dec 3, 2002

hey all -
forgive me if this has been discussed here before. But has SR disclosed what exactly goes into their "analysis engine"?

Ad

Tannin · Dec 3, 2002

Not so far as I know, but I can tell you: nothing of significance bar the pseudo-scientific sound of the phrase "analysis engine", which ranks up there with "ectoplasm" and "astrological conjunction". You can't analyse nonsense, no matter what you do with it.

Adcadet · Dec 4, 2002

I just received an email from Davin tonight, freeing me of any previous obligation to silence.

Some of you may know that I am doing an MPH in epidemiology, and when I started thinking of the SR Drive Reliability Survey (DRS) I realized that my coursework covered the basics of doing this type of survival analysis. I contacted Davin and Eugene and offered my help resurrecting the DRS, which they readily accepted. We had some conversations discussing the best way to run the DRS, and I talked to a few professors to figure out what would be most scientifically valid and robust. I got busy in school, didn't hear from D&E for a while, then began to think that if D&E didn't bother to contact me regarding the DRS, then it wasn't worth working with them on it, as they obviously either (a) didn't care about the DRS, or (b) didn't care about my work towards it. So I didn't bother contacting them, they didn't bother contacting me, and then, months later, I see that SR has the DRS back up.

So... has anybody actually registered their drives at SR and gotten the opportunity to read their material? I'd be curious to see if they go into more depth than what they allude to in the rest of the site. I find it particularly ironic (painful, in fact) that SR claims to be so scientifically rigorous in their performance testing methodology, and yet they make a claim about reliability without showing HOW they came to that conclusion. Unless I'm missing something entirely, they never disclose much about their methods, just referring to it as an "analysis engine." I realize that proprietary methodology has its place, but I never thought I'd see it at SR.

Handruin · Dec 4, 2002

Adcadet said:
I just received an email from Davin tonight, freeing me of any previous obligation to silence.

Some of you may know that I am doing an MPH in epidemiology, and when I started thinking of the SR Drive Reliability Survey (DRS) I realized that my coursework covered the basics of doing this type of survival analysis. I contacted Davin and Eugene and offered my help resurrecting the DRS, which they readily accepted. We had some conversations discussing the best way to run the DRS, and I talked to a few professors to figure out what would be most scientifically valid and robust. I got busy in school, didn't hear from D&E for a while, then began to think that if D&E didn't bother to contact me regarding the DRS, then it wasn't worth working with them on it, as they obviously either (a) didn't care about the DRS, or (b) didn't care about my work towards it. So I didn't bother contacting them, they didn't bother contacting me, and then, months later, I see that SR has the DRS back up.

So... has anybody actually registered their drives at SR and gotten the opportunity to read their material? I'd be curious to see if they go into more depth than what they allude to in the rest of the site. I find it particularly ironic (painful, in fact) that SR claims to be so scientifically rigorous in their performance testing methodology, and yet they make a claim about reliability without showing HOW they came to that conclusion. Unless I'm missing something entirely, they never disclose much about their methods, just referring to it as an "analysis engine." I realize that proprietary methodology has its place, but I never thought I'd see it at SR.

I registered my drives when they re-released the DRS. I found it bothersome to view the reliability ratings. For me, I will continue to value word of mouth over their division-of-failures-over-time method.

I'm guessing that they assign values to the length of time and also some type of gradual curve to the numbers included in the survey. I truthfully don't care about the DRS and what it concludes.

So what does it mean to you when you see that:
"Fireball Plus LM Second quarter 2000 98 percentile" This drive is more reliable than 98% of the drives reviewed???

Should I conclude that: Too bad they don't make this drive any longer... ?

blakerwry · Dec 4, 2002

I thought it was clear... You look up a drive make and model and you will see a percentage. This percentage tells you how this drive ranks compared to the other drives entered into the survey...

example: you look up the Cudda ATA IV and see a percentage of 48%. This means that, based on the survey, this drive was more reliable than 48% of the drives entered in the survey.

AFAIK they have not released their methods for deciding when there is enough of a particular model to enter it into the pool, and have also not released their methods for trying to throw out bad data.
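
To make that reading of the percentage concrete, here is a minimal Python sketch. It is only a guess at the arithmetic, since SR has not published its method. The model names and failure rates are made up for illustration, and `percentile_rank` is a hypothetical helper, not anything from SR.

```python
def percentile_rank(failure_rates, model):
    """Percentage of the OTHER models that have a higher failure rate than `model`.

    failure_rates: dict mapping model name -> observed failure rate (0..1).
    This is an assumed definition of "more reliable than X% of drives".
    """
    others = [rate for name, rate in failure_rates.items() if name != model]
    worse = sum(1 for rate in others if rate > failure_rates[model])
    return 100 * worse / len(others)


# Entirely hypothetical numbers, for illustration only.
rates = {"Model A": 0.02, "Model B": 0.05, "Model C": 0.10,
         "Model D": 0.01, "Model E": 0.07}
print(percentile_rank(rates, "Model B"))
```

Note that this says nothing about how the failure rates themselves were obtained, which is the part that has not been disclosed.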

Adcadet · Dec 4, 2002

blakerwry said:
I thought it was clear... You look up a drive make and model and you will see a percentage. This percentage tells you how this drive ranks compared to the other drives entered into the survey...

example: you look up the Cudda ATA IV and see a percentage of 48%. This means that, based on the survey, this drive was more reliable than 48% of the drives entered in the survey.

AFAIK they have not released their methods for deciding when there is enough of a particular model to enter it into the pool, and have also not released their methods for trying to throw out bad data.

But how do they calculate that Drive X is more reliable than 48% of all drives surveyed? What are the other drives surveyed? How on earth do they factor time into the equation? How can you begin to compare a drive that's been out for one month against a drive that's been out for three years? Using proper survival analysis techniques you often can answer these questions fairly well. But SR never discloses their techniques, and the language they use makes me question the rigor of their analysis.
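
As an illustration of what I mean by proper survival analysis, below is a minimal sketch of the Kaplan-Meier estimator, a standard technique for this kind of data. It is not SR's method (which is undisclosed); the function name and the sample data are hypothetical. The key idea is that a drive still working when last observed is treated as "censored" at that point rather than as a drive that never fails, which is what lets you put a one-month-old model and a three-year-old model on the same footing.

```python
from collections import Counter


def kaplan_meier(observations):
    """Kaplan-Meier survival estimate.

    observations: list of (months_observed, failed) pairs.
      failed=True  -> the drive failed at that month
      failed=False -> the drive was still working when last seen (censored)
    Returns [(month, estimated probability of surviving past that month), ...]
    with one entry per month at which at least one failure occurred.
    """
    failures = Counter(t for t, failed in observations if failed)
    exits = Counter(t for t, _ in observations)  # failures + censorings
    at_risk = len(observations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = failures.get(t, 0)
        if d:
            # Fraction of drives still under observation that survive month t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]  # failed and censored drives leave the risk set
    return curve


# Hypothetical data: five drives of one model.
sample = [(1, False), (2, True), (3, False), (4, True), (5, False)]
for month, s in kaplan_meier(sample):
    print(month, round(s, 3))
```

None of this fixes a bad sample, of course. It only handles the time dimension correctly once you have data worth analyzing.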

blakerwry · Dec 4, 2002

maybe we should push eugene to disclose his secrets.

Adcadet · Dec 4, 2002

blakerwry said:
maybe we should push eugene to disclose his secrets.

I doubt that would get very far. But perhaps SF can issue a position statement saying that full disclosure is the heart of scientific communication, and that SR must disclose its full methodology for the findings to be properly evaluated.

Prof.Wizard · Dec 4, 2002

Tannin said:
Not so far as I know, but I can tell you: nothing of significance bar the pseudo-scientific sound of the phrase "analysis engine", which ranks up there with "ectoplasm" and "astrological conjunction". You can't analyse nonsense, no matter what you do with it.

Your stance against the Reliability Database is notorious... I've followed your arguments and I don't really think there's reason to be so "against" the whole endeavor.

Tannin · Dec 4, 2002

If you don't understand the fundamental absurdity of trying to derive percentage reliability numbers from non-random, self-selecting sample data Prof, you have not 'followed my arguments'.

Tannin · Dec 4, 2002

And another thing: "enough of a particular model" - what hocus-pocus nonsense. There is no such thing as a sufficient quantity of sample when the quality of sample is nonexistent. Pseudo-science, pure and simple.

Buck · Dec 4, 2002

I've been waiting for Tsar Tannin to speak more about this topic. If you want reliable reliability ratings, you need much different information than Eugene is asking for. Only the manufacturer has that data.

Adcadet · Dec 5, 2002

Buck said:
I've been waiting for Tsar Tannin to speak more about this topic. If you want reliable reliability ratings, you need much different information than Eugene is asking for. Only the manufacturer has that data.

I'm sure there's a way to get some good reliability data from somewhere other than the manufacturer.

Anybody here actually interested in doing a REAL reliability study?

Prof.Wizard · Dec 5, 2002

Tannin said:
non-random, self-selecting sample data

Eh?
Why is that data non-random and self-selecting?! I would say quite the opposite.

Prof.Wizard · Dec 5, 2002

BTW, I once had an e-mail from Eugene in which he told me that the 2nd version of the SR Reliability Database (after the "loss") would be built with the guidance and suggestions of a local college's Master's program (I guess in Statistics, Computer Science or something like that).

Excerpt from the mail:
"We're currently redesigning the database under the auspiciouses of a universities' masters program. As a result, much of the proposed data collection is out of my hands right now. I'm not sure whether such a question will be included. Keep in mind, however, that every additional required field from a participant exponentially complicates the survey."

In that e-mail he also responded to your other unfounded criticisms, Tony, regarding SR's methodology practices. I can always forward you that e-mail, Tony. Dated 4th of June 2002.

Tannin · Dec 5, 2002

I confess to having absolutely no idea why an otherwise intelligent person like you would say that, Prof. I daresay I could think of even less random and more self-selecting samples if I put my mind to it, but it wouldn't be easy.

The first thing you must do when choosing a sample is determine the population of interest. In this case, I presume, it's the global population of all hard drives sold. (Though it's possible that SR have a different target population in mind, and very likely, I fear, that they don't actually have any target population defined at all. The inflated claims and error-ridden methodology of the DRS as so far revealed to us make me suspect the worst.)

Then, having identified the target population, you develop a method to sample from it. The sampling method must be completely uncorrelated with the variables of interest if you wish the resultant data to have any validity.

Finally, having identified the target population and selected your sample, you must attempt to extract the relevant information in a way such that the results are unbiased. This (of course) is the hardest task of all, and in the social sciences, one that is never as good as one could wish it to be.

I've not seen evidence to show that the SR DRS has been successful in achieving any one of these three steps, and their attempt to deal with the critical second step has been particularly unscientific.
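
To show why the second step matters so much, here is a small Python simulation. It is a sketch under loudly stated assumptions: the true failure rate and the two reporting probabilities are numbers I have invented purely for illustration, and `simulate_survey` is a hypothetical function, not anything SR uses. All it assumes is that an owner whose drive failed is more likely to bother filling in a survey than an owner whose drive just works, i.e. that the sampling is correlated with the variable of interest.

```python
import random


def simulate_survey(n_drives, true_failure_rate,
                    p_report_if_failed, p_report_if_ok, seed=0):
    """Return the failure rate seen in a self-selected survey.

    Every drive fails with probability `true_failure_rate`. Its owner then
    decides whether to report, with a probability that depends on whether
    the drive failed. All parameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    reported_failed = 0
    reported_ok = 0
    for _ in range(n_drives):
        failed = rng.random() < true_failure_rate
        p_report = p_report_if_failed if failed else p_report_if_ok
        if rng.random() < p_report:
            if failed:
                reported_failed += 1
            else:
                reported_ok += 1
    total = reported_failed + reported_ok
    return reported_failed / total if total else float("nan")


# Invented numbers: 5% of drives really fail, but unhappy owners are
# ten times more likely to respond than happy ones.
observed = simulate_survey(100_000, 0.05, 0.50, 0.05)
print(f"true rate 5.0%, surveyed rate {observed:.1%}")
```

With these made-up inputs the surveyed rate comes out several times higher than the true one, and collecting more responses does nothing to close the gap. If the reporting probabilities also differ from model to model, even the ranking between models is not safe.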

Tannin · Dec 5, 2002

We cross-posted. I'd be interested to read that, Prof, but it would be a clear breach of confidence for you to forward it to me without permission. If SR really had a valid sampling methodology, I should have thought that they would have been only too keen to publish it, so as to deflect the major criticisms that I (and many others with some training in the field) have expressed. But they haven't published it, and the reason they haven't published it is because it doesn't exist.

Prof.Wizard · Dec 5, 2002

Well yes, that's textbook statistics.

But... if we assume that random or frequent users (like you and me) just post the real results for the HDs in their possession, then you can extract a pattern if a fair number of drives have been submitted. Of course, there is always the possibility of a [let's say] Seagate employee coming in and inflating the database with fake working-in-perfection drives... I admit that, but even double-blind drug clinical trials aren't without statistical faults.

Just have faith and the pattern will arrive. Especially for the 75GXP...

I personally submitted 2 REAL results. What about the rest of you?

Adcadet · Dec 5, 2002

Prof.Wizard said:
Well yes, that's textbook statistics.

But... if we assume that random or frequent users (like you and me) just post the real results for the HDs in their possession, then you can extract a pattern if a fair number of drives have been submitted. Of course, there is always the possibility of a [let's say] Seagate employee coming in and inflating the database with fake working-in-perfection drives... I admit that, but even double-blind drug clinical trials aren't without statistical faults.

Just have faith and the pattern will arrive. Especially for the 75GXP...

I personally submitted 2 REAL results. What about the rest of you?

I do believe that if the study is done well enough, you don't actually need a random sample to get meaningful results. But I think there are deeper flaws than that even.

In the old reliability survey at least, methods were used to filter out results that were thought to be bogus. So if one person came and registered 50 drives of a particular model that all failed, that result would likely get thrown out. The idea of a filter was considered key for any revision of the DRS. Of course, D&E are afraid that if they disclose the filter rules, the rules will just be worked around, and they were very coy about giving me the specifics of what they had used before. I seem to remember D&E mentioning a few cases in which an individual entered some drives apparently in an attempt to sabotage the results. I think a filter may have its place, but these things can be taken too far. What if I decided to enter my drives today, which happen to include three 75GXP failures? Would they consider my entry bogus? (I actually did have three 75GXPs fail.) What if I also mention that I had two Viking IIs fail on me (I did)? Would they get thrown out too? Would the Viking II AND 75GXP results be thrown out? OK, what if I then specify a few other drives that haven't failed? Would they all still get thrown out?

The idea of a filter is not inherently wrong (IMO), but until SR publishes the exclusion criteria, they can't claim full disclosure.
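
To illustrate why the exclusion criteria matter, here is a toy filter in Python. To be clear, this rule is something I made up for the sake of argument; SR's actual rules are undisclosed and may look nothing like it. `apply_filter`, the user names, and "Model X" are all hypothetical.

```python
from collections import defaultdict


def apply_filter(submissions, max_failures):
    """Toy exclusion rule (an assumption, NOT SR's actual rule).

    submissions: list of (user, model, failed) tuples.
    If one user reports more than `max_failures` failed drives of the same
    model, all of that user's entries for that model are discarded.
    """
    failures = defaultdict(int)
    for user, model, failed in submissions:
        if failed:
            failures[(user, model)] += 1
    return [(user, model, failed) for user, model, failed in submissions
            if failures[(user, model)] <= max_failures]


# My real experience, plus a hypothetical saboteur.
honest = ([("adcadet", "75GXP", True)] * 3
          + [("adcadet", "Viking II", True)] * 2
          + [("adcadet", "other", False)] * 2)
saboteur = [("troll", "Model X", True)] * 10

strict = apply_filter(honest + saboteur, max_failures=2)
lenient = apply_filter(honest + saboteur, max_failures=49)
print(len(strict), len(lenient))
```

Under the strict threshold the saboteur is caught, but so are my three genuine 75GXP failures. Under the lenient one my data survives, and so do the ten fake entries. Without knowing where the line is drawn, nobody outside SR can tell which of these trade-offs the published numbers reflect.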

Prof.Wizard · Dec 5, 2002

Tannin said:
We cross-posted. I'd be interested to read that, Prof, but it would be a clear breach of confidence for you to forward it to me without permission.

He doesn't say something bad or evil about you. He just counters your positions.
I wouldn't need permission for that, but I would like it to remain FYEO for reasons of fair play.

Prof.Wizard · Dec 5, 2002

Adcadet said:
In the old reliability survey at least, methods were used to filter out results that were thought to be bogus. So if one person came and registered 50 drives of a particular model that all failed, that result would likely get thrown out. The idea of a filter was considered key for any revision of the DRS. Of course, D&E are afraid that if they disclose the filter rules, the rules will just be worked around, and they were very coy about giving me the specifics of what they had used before. I seem to remember D&E mentioning a few cases in which an individual entered some drives apparently in an attempt to sabotage the results. I think a filter may have its place, but these things can be taken too far. What if I decided to enter my drives today, which happen to include three 75GXP failures? Would they consider my entry bogus? (I actually did have three 75GXPs fail.) What if I also mention that I had two Viking IIs fail on me (I did)? Would they get thrown out too? Would the Viking II AND 75GXP results be thrown out? OK, what if I then specify a few other drives that haven't failed? Would they all still get thrown out?

It's amazing, because in the same paragraph you praise and condemn the filter mechanism for the same reason: its inherent workings, which you so clearly explained!

I doubt that three 75GXP failures in a row would trigger the filter though. Maybe 50 yes, but three... bah... I find it pretty safe that these mechanisms remain disclosed only to D&E and probably a handful of Master's students in that collaborating college.

Adcadet · Dec 5, 2002

Prof.Wizard said:
BTW, I once had an e-mail from Eugene in which he told me that the 2nd version of the SR Reliability Database (after the "loss") would be built with the guidance and suggestions of a local college's Master's program (I guess in Statistics, Computer Science or something like that).

Excerpt from the mail:
"We're currently redesigning the database under the auspiciouses of a universities' masters program. As a result, much of the proposed data collection is out of my hands right now. I'm not sure whether such a question will be included. Keep in mind, however, that every additional required field from a participant exponentially complicates the survey."

In that e-mail he also responded to your other unfounded criticisms, Tony, regarding SR's methodology practices. I can always forward you that e-mail, Tony. Dated 4th of June 2002.

Somehow I missed this earlier. I blame it on the spotty internet connection here at work.

I don't want to sound presumptuous (about myself) or condescending (towards Eugene), but I suspect Eugene may be referring to the relationship that I had with SR. At least that is the simplest explanation from what I know. June 4 was after I had stopped communicating with D&E, but it probably wasn't long after. He also makes the relationship seem a bit more than it was (if this is what he's referring to), in that I was the one running the DRS and I was at a University, but the University (U of Minnesota) was not officially involved.

Regarding this part:

"We're currently redesigning the database under the auspiciouses of a universities' masters program. As a result, much of the proposed data collection is out of my hands right now."

this does sort of sound like the situation I had with D&E. At the time I was doing my Master's at a University (in epidemiology, and I still am), and was considered in charge of the data collection.

Adcadet · Dec 5, 2002

Prof.Wizard said:
Adcadet said:

In the old reliability survey at least, methods were used to filter out results that were thought to be bogus. So if one person came and registered 50 drives of a particular model that all failed, that result would likely get thrown out. The idea of a filter was considered key for any revision of the DRS. Of course, D&E are afraid that if they disclose the filter rules, the rules will just be worked around, and they were very coy about giving me the specifics of what they had used before. I seem to remember D&E mentioning a few cases in which an individual entered some drives apparently in an attempt to sabotage the results. I think a filter may have its place, but these things can be taken too far. What if I decided to enter my drives today, which happen to include three 75GXP failures? Would they consider my entry bogus? (I actually did have three 75GXPs fail.) What if I also mention that I had two Viking IIs fail on me (I did)? Would they get thrown out too? Would the Viking II AND 75GXP results be thrown out? OK, what if I then specify a few other drives that haven't failed? Would they all still get thrown out?

It's amazing, because in the same paragraph you praise and condemn the filter mechanism for the same reason: its inherent workings, which you so clearly explained!

I doubt that three 75GXP failures in a row would trigger the filter though. Maybe 50 yes, but three... bah... I find it pretty safe that these mechanisms remain disclosed only to D&E and probably a handful of Master's students in that collaborating college.

I think I'm missing something here. Let me try to address a few things.

I don't think the idea of a filter is inherently bad. In any large study you need to verify that the data is correct (I sometimes hear it called "trimming"). But unless you give the specifics of how this was done, the reader and critics will have no way of knowing if this was implemented properly. If the filter throws out my 3 75GXP failures I would probably not trust the study at all. Until D&E disclose ALL of the methods, we shouldn't trust the DRS to be anything more than voodoo magic. At least that's my view of science.

In regards to my previous post (which I bet I was writing as you were posting your last post, Prof), I doubt there are any Master's students working on the DRS. Again, full disclosure would be nice, rather than just relying on some students (who may not even exist).

blakerwry · Dec 5, 2002

You can double-check whether your data has been entered or thrown out by checking the results of the survey and looking in the comments section for any comments you made when entering the drive. There will be an ID number assigned to each user (this number changes according to what drive model/size you are looking at, so it is hard to track from model to model). But you can easily see (if you entered a bunch of a specific model number) whether the results are there or have been thrown out.

I had 2 1000JB's that were DOA, 1 1200BB that just works, 2 75GXP's that work, 1 ATA IV that works, 1 120GXP that failed, and I think I entered my other 120GXP (the refurbished replacement for the original) as "stopped using".

I really should have returned that refurb... but I was too lazy.

Tannin · Dec 5, 2002

Thanks Constantine, but I think I'd rather not read something that would put me in an awkward position. As a noted critic of the DRS methodology, I'd prefer not to be privy to private communications from supporters of it which I would be honour bound not to disclose. It would tie my hands, and on this particular issue - the issue that prompted me to register with SR and make my very first post back before the dawn of time - I want to be free to go on landing punches. I despise pseudo-science pretending to be the real thing, for in this world the scientific method is the one and only proven way to gather genuine knowledge and improve ourselves, and things which attempt to hitch-hike their way to popular respectability by pretending to a relevance they do not have (such as astrology, creationism, TV show 'opinion polls' and the SR DRS) deserve no mercy until they either (a) start genuinely producing valid results, or (b) cease to exist.

Tannin · Dec 5, 2002

On trimming, Andrew, quite so. It is not just allowable to manually remove outliers, in most situations it is a requirement. (As you know, of course.)

Adcadet · Dec 5, 2002

Tannin said:
Thanks Constantine, but I think I'd rather not read something that would put me in an awkward position. As a noted critic of the DRS methodology, I'd prefer not to be privy to private communications from supporters of it which I would be honour bound not to disclose. It would tie my hands, and on this particular issue - the issue that prompted me to register with SR and make my very first post back before the dawn of time - I want to be free to go on landing punches. I despise pseudo-science pretending to be the real thing, for in this world the scientific method is the one and only proven way to gather genuine knowledge and improve ourselves, and things which attempt to hitch-hike their way to popular respectability by pretending to a relevance they do not have (such as astrology, creationism, TV show 'opinion polls' and the SR DRS) deserve no mercy until they either (a) start genuinely producing valid results, or (b) cease to exist.

Tannin, you are one wild science zealot. And I LOVE you for it.

I entered an "honor-bound" situation when I began consulting with D&E on the DRS. And since then I've said next to nothing on the subject. But I decided that I've been silent long enough, emailed D&E, and told them that I no longer feel an obligation to keep silent. Davin responded without protest, so here I am.

LiamC · Dec 6, 2002

Tannin said:
On trimming, Andrew, quite so. It is not just allowable to manually remove outliers, in most situations it is a requirement. (As you know, of course.)

So Tannin, how much faith do you have in Winbench or Winstone (or SANDRA or SYSMark) scores, for that matter? They all report the extreme outlier as "the score", which IMHO is utter crap, and why I don't do it in my articles.

Tannin · Dec 6, 2002

About as much faith as I have in the SR DRS, Bill. Which is to say not very much.

I should repeat my usual disclaimer here: I don't think the SR DRS is completely without value, and were there not a thousand methodologically clueless fanboys saying how wonderful a thing it is, I'd point out some of its good features myself from time to time. But there is a difference: in the case of Winbench, I don't have anything better, and probably don't have anything else that's even a reasonable substitute to hand. However, in the case of the SR DRS I have two substantially better methods available to me: (a) my own experience with thousands of drives (which gives me very reliable results but only for those drives that we happen to sell in reasonable numbers - so for example, I can't know anything at all about current Maxtor models because we have not sold any), and (b) the experiences of people I know and trust: if a Mercutio, a Skallas, or a Fugushi makes an observation about drive reliability, I take that very seriously.

Notice first the similarity between my method (b) and the SR DRS - that both are based on self-report, experiential data and thus potentially flawed. Second, notice the difference between the SR DRS and my method (b) - that in the case of method (b) I can judge for myself the reliability and the relevance of the comments made by my informant, using the ability to judge human nature and truth or falsehood that, by virtue of being a human myself, I have been learning for the past 43 years.

My biggest single beef with the SR DRS is that, without in any way improving the quality of the data, it takes experiential self-reports and turns them into 'statistics', thus leaning on the centuries of public faith that the scientific endeavour has earned for its credibility, but conspicuously failing to deliver the hard data and quantifiable error proportions that earned that respect for scientific investigation in the first place.

Prof.Wizard · Dec 6, 2002

Would you be happy if there was a disclaimer about your fears, Tannin?

Hey, don't take me for a pseudo-science zealot, guys. I almost have a portrait of Descartes in my room. I like logic and methodology and I detest fake or tentative results... but: the SRD is a unique offering web-wide. And since its loss was one of the few things that really made me unhappy when the incident happened on the 27th of December 2001, I have been strongly in favor of its reappearance in a 2nd version, in a better, more scientific way.

I even considered subscribing here...

http://www.skeptic.com

Adcadet,
you were probably the college student the mail refers to. Epidemiology gives you great insights into Statistics. But unfortunately that can't be proved. The excerpt I posted is almost all Eugene says about the SRD, since the rest advocates the IPEAK methodology.

Adcadet · Dec 7, 2002

Prof.Wizard said:
Adcadet,
you were probably the college student the mail refers to. Epidemiology gives you great insights into Statistics. But unfortunately that can't be proved. The excerpt I posted is almost all Eugene says about the SRD, since the rest advocates the IPEAK methodology.

Indeed. Again, if SR only disclosed their methodology, perhaps it wouldn't be as important.
