The great hard drive cache myth

Tannin

Storage? I am Storage!
Joined
Jan 15, 2002
Messages
4,448
Location
Huon Valley, Tasmania
Website
www.redhill.net.au
OK, now you've gone and pushed one of my buttons.

In another thread i said:
Cliptin said:
There is no technical reason not to increase an ATA drive's disk cache to 64M or 128M really. If they can write firmware algorithms to take advantage, then a high-bandwidth interface becomes more important.

I don't understand much about the internal workings of hard disks, so here goes: wouldn't increasing the on-disk buffer to something that high increase the chances of serious data loss after a power failure?

You say, "not really ... the operating system could just as easily be caching 64 MB worth of data." To which I reply, "but what about a journaling file system?" If it's the OS that's doing the caching, at least it has the chance to manage the journal information such that data loss will be minimized. But if you put that cache on the brainless disk, well ... you're screwed.

Am I close?

Cliptin, please step forward, salute and accept the much-coveted Tannin Order of Merit (with crossed f-nodes). You, sir, have said the first sensible thing about hard drive cache I've read since at least Tuesday.

This is the whole point about hard drive cache. It ain't a cache, it's a buffer.

What's the difference? Cache is a small amount of high-speed storage which is organised in such a way that the outside world (i.e., the external device) "thinks" the entire storage device is actually as fast as the cache is.

A buffer, on the other hand, is a small amount of high-speed storage that provides elasticity to the interface between two different devices.

Hard drives don't have caches - they have buffers.
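
A rough sketch of that distinction in Python may help. The class names (`FifoBuffer`, `LruCache`) and the least-recently-used eviction policy are made up for illustration; nothing here describes how any real drive firmware works.

```python
from collections import OrderedDict, deque


class FifoBuffer:
    """Elastic store between two devices: data goes in one end and out the other, once."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()

    def put(self, block):
        if len(self.queue) >= self.capacity:
            raise RuntimeError("buffer full: the producer has to wait")
        self.queue.append(block)

    def get(self):
        # Data leaves in arrival order and is gone afterwards.
        return self.queue.popleft()


class LruCache:
    """Keeps copies of recently used blocks so repeat requests skip the slow store."""

    def __init__(self, capacity, backing_store):
        self.capacity = capacity
        self.backing_store = backing_store
        self.blocks = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, address):
        if address in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(address)  # mark as most recently used
            return self.blocks[address]
        self.misses += 1
        data = self.backing_store[address]    # the slow path
        self.blocks[address] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)   # evict the least recently used block
        return data
```

The buffer only smooths out a mismatch in timing: every block passes through exactly once. The cache earns its keep only when the same address is asked for again.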

(Huh? Isn't that just two different words for the same thing, Tannin?)

(It's two different purposes, Tea.)

(But it's still the same object: some RAM and a bit of clever firmware. So what if people want to call it cache? It's just a word, it doesn't change the function of the object, or its nature.)

(Yes it does. And what's more, if you misunderstand the purpose of an object (such as a hard drive buffer), you will quite likely go right ahead and misuse that object.)

(You are splitting hairs.)

(No I'm not. Purpose is integral to the object. You have to understand the purpose or else you can't understand the object you're thinking about. And if you can't understand it, then you can't use it properly. Look: what's that thing over there?)

[Image: st157.jpg]


(A hard drive.)

(No it's not.)

(It's an ST-157A, a Seagate 40MB stepper drive. Six heads, IDE interface, Type 17, if I remember correctly.)

(And from that, you learn what?)

(That this is an expensive object with a useful working life of about four years that you can plug into a computer and store data on, that it is delicate and static-sensitive, and that it wasn't one of the great drives when it was new in 1991 and by any rational standards you ought to throw the damn thing away.)

(And that's where you are wrong, Tea. It used to be a hard drive, but like any other class of object, you can't understand it properly unless you look at it in the proper context of its actual purpose. Look again.)

[Image: st157a.jpg]


(It's not a hard drive. It's a doorstop. And a very good doorstop too. It has an estimated useful working life of over fifty years, it cost absolutely nothing at all, it won't ever be plugged into a computer, it is not useful for storing data, it is completely insensitive to static electricity, and you could drop it onto a concrete path without damaging it in the slightest.)

(OK. I take your point. So what you are saying is that the purpose of an object is critical to understanding that object.)

(Yes.)

(And that by treating a hard drive's buffer as if it were actually a cache, we risk making the same sort of mistake that I would make if I were to try to use that Seagate doorstop of yours as a hard drive, and your X15 as a doorstop?)

(Oh my God! Don't even think about it!)

(Oh I won't. I'd hate to see you go down in history as "Tannin the Fratricide".)
 

Tannin

Storage? I am Storage!
Joined
Jan 15, 2002
Messages
4,448
Location
Huon Valley, Tasmania
Website
www.redhill.net.au
Varying the amount of "cache" on a drive has little or no effect on the drive's performance.

At least that's what Storage Review discovered some years ago, using a pair of Maxtor drives. And then more recently they suddenly recanted, this time using their shiny new (and I think rather suspect) methodology.

(But they got two different results, Tannin. Seems to me that while you are quite justified in casting doubt on the claim that a bigger buffer improves performance, you are not justified in then simply deciding that buffer size doesn't matter. To say that, on the basis of two different Storage Review articles which contradict each other, is just as bad as uncritically accepting whatever view happens to be fashionable at the moment.)

Quite right, Tea.

(And I know you think the new SR methodology is suspect. But didn't you always say that the old methodology was even more suspect?)

Indeed I did. I don't trust IPEAK to speak of, but I know Winbench produces bad results as mere routine.

But it seems to me there is a very simple way to estimate the effect of hard drive buffer size on performance: one that requires no measurements whatever, simply observation and a little logic.

(Hmmm?)

Well, you'll grant me that the hard drive market is ultra-competitive?

(Of course.)

And that companies like Seagate and Maxtor pay a lot of money to some very highly skilled engineers who devote their working lives to dreaming up ways to make their drives just that little bit faster than their competitors' products?

(Yes.)

And that a small quantity of RAM is so cheap that if adding an extra 50MB or so would make any really worthwhile difference to the performance of their drives, someone would have done it years ago?

(Hmmmm.)

Well, why wouldn't they? Look Tea, there are really only two possibilities here. We know that no-one is shipping hard drives with any meaningful quantity of buffer RAM. Either (a) adding lots of extra RAM to a hard drive wouldn't help the performance enough to be worth the effort, or (b) the engineers at Seagate and WD and Maxtor and Samsung are all stupid. Got to be one or the other.

(But what about the cost?)

What cost? It's perfectly ordinary RAM, you know, nothing special about it. Good Lord, it wouldn't surprise me to discover that it actually costs them more to buy penny-packet quantities of those tiddly little 2MB and 8MB RAM chips than if they just went straight to the same mass-produced RAM that goes onto DIMMs and plugs into your motherboard.

(You're exaggerating. They probably save about two dollars a drive by using those little chips. A 256 Mb chip costs .. ahh ... about US$5 if I recall correctly, and 256Mb is ... uh ... 32MB. I know those 2MB and 8MB chips they use are specialty items these days, but they still would save a tiny bit on every drive, and if you sell enough drives that adds up to a lot.)

But if a bigger buffer really did anything special for the performance, they could sell the drive for maybe US$25 more. Certainly US$10 more, and that's quite a bit greater than the cost of the bigger buffer.

Q.E.D.
 

CougTek

Hairy Aussie
Joined
Jan 21, 2002
Messages
8,726
Location
Québec, Québec
Tannin said:
Good Lord, it wouldn't surprise me to discover that it actually costs them more to buy penny-packet quantities of those tiddly little 2MB and 8MB RAM chips than if they just went straight to the same mass-produced RAM that goes onto DIMMs and plugs into your motherboard.

(You're exaggerating. They probably save about two dollars a drive by using those little chips. A 256 Mb chip costs .. ahh ... about US$5 if I recall correctly, and 256Mb is ... uh ... 32MB....
The latest prices I saw placed DDR 256Mbit memory chips cheaper than comparable SDRAM, about 50¢ less at ~US$6.50 each. The best deal these days is on 128Mbit SDRAM chips, which sell for less than US$2.50 each. That's still 16MB of RAM, twice as big as the buffer of the special edition Western Digital.
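
For anyone who wants to check the megabit-to-megabyte arithmetic, here is a small Python sketch. The prices are just the rough figures quoted in this thread, and the function name is invented for the example.

```python
def chip_megabytes(megabits):
    """Convert a chip capacity in megabits to megabytes (8 bits per byte)."""
    return megabits / 8


# Prices are the rough figures quoted in this thread, not current data.
chips = {"128 Mbit SDRAM": (128, 2.50), "256 Mbit DDR": (256, 6.50)}

for name, (megabits, usd) in chips.items():
    mb = chip_megabytes(megabits)
    print(f"{name}: {mb:.0f} MB, about US${usd / mb:.2f} per MB")
```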
 

time

Storage? I am Storage!
Joined
Jan 18, 2002
Messages
4,932
Location
Brisbane, Oz
Tannin said:
At least that's what Storage Review discovered some years ago, using a pair of Maxtor drives.
Wasn't it IBM or WD drives, or perhaps one of each? Anyone remember if it was 512kB vs 2MB?
 

CougTek

Hairy Aussie
Joined
Jan 21, 2002
Messages
8,726
Location
Québec, Québec
IIRC, it was Maxtor DiamondMax Plus 40. One OEM version with a 512K buffer and another retail with a 2MB buffer.
 

i

Wannabe Storage Freak
Joined
Feb 10, 2002
Messages
1,080
Tannin said:
OK, now you've gone and pushed one of my buttons.

In another thread i said:
Cliptin said:
There is no technical reason not to increase an ATA drive's disk cache to 64M or 128M really. If they can write firmware algorithms to take advantage, then a high-bandwidth interface becomes more important.

I don't understand much about the internal workings of hard disks, so here goes: wouldn't increasing the on-disk buffer to something that high increase the chances of serious data loss after a power failure?

You say, "not really ... the operating system could just as easily be caching 64 MB worth of data." To which I reply, "but what about a journaling file system?" If it's the OS that's doing the caching, at least it has the chance to manage the journal information such that data loss will be minimized. But if you put that cache on the brainless disk, well ... you're screwed.

Am I close?

Cliptin, please step forward, salute and accept the much-coveted Tannin Order of Merit (with crossed f-nodes). You, sir, have said the first sensible thing about hard drive cache I've read since at least Tuesday.

I am so confused. Does this mean my wild speculations are way off? Can someone tell me what the effect of increased hard disk buffer sizes would be with respect to power failures? Anything to worry about or not?

I need to go back to bed and try reading this discussion again later. My head hurts. :p
 

myself

What is this storage?
Joined
Mar 25, 2002
Messages
29
I think the deduction is that your question is irrelevant, i. You don't have to worry about what might happen, because no hard disk manufacturer is going to try it.

I think.
 

Clocker

Storage? I am Storage!
Joined
Jan 14, 2002
Messages
3,554
Location
USA
It sounds to me like the overall opinion in this thread is that buffer size is irrelevant. I totally disagree with this line of thinking.

I don't think of the buffer on these drives as merely a data reservoir that stores data on a FIFO basis in order to keep the data pump (the drive interface) from starving. To the contrary, I feel that the results shown by Eugene's iPeak testing indicate that caching algorithms can be 'tuned' for the expected operating environment of the disk, allowing more efficient reading and writing on a disk designed for a specific operating environment.

Early on, Eugene and I had several ICQ discussions about iPeak before its official release on SR. It was more of a Socratic discussion where I basically led myself to this conclusion.... I was impressed by the ability of iPeak's measurements to characterize the caching strategy (or buffer optimization mechanics, if you don't want to call it cache). I don't have the details at hand, but I remember that it was totally obvious to me that, for instance, an X15's cache hits characterized a much different target usage pattern than, say, a WD1000JB's. Obviously, the X15 is a drive primarily targeted at server usage and the WD1000JB is a drive targeted at consumer desktop usage. From my review of the iPeak cache hit patterns, it was painfully obvious that the firmware caching strategy for each drive had been tuned to its intended usage. That is why, I believe, a WD1000JB represents a much greater value than an X15. An intelligent buffer optimization pattern coupled with a larger-than-average buffer allows the WD1000JB to perform on par with the X15.

I think there is a point of diminishing returns on buffer size. I don't know what it is, but I suspect that it may be related to the ability of the firmware to effectively use a given amount of cache. I think more cache is better, and as HDD firmware capability improves, cache buffer sizes will gradually increase until we eventually move fully to SSDs.

JMHO. One of the rare times you hear me speak out on a technical issue :)

Clocker
 

P5-133XL

Xmas '97
Joined
Jan 15, 2002
Messages
3,173
Location
Salem, Or
I think this topic has strayed slightly from the original post of the danger of a HD cache to HD benchmarking. I'm going to comment on the original issue of the danger of a HD cache.

First, a read-only cache is not dangerous at all, because the data is still on the HD if the HD shuts down.

There is a distinct difference between a buffer and a cache. A buffer is sequential (typically first in, first out (FIFO)), while a cache is non-sequential. Since we are only dealing with writes, we are basically dealing with in-order delayed writing (buffer) vs. re-ordered elevator seeking (cache).

Buffers are safer than caches when power is removed. The best example I can give you is dealing with directories/FATs/files. Suppose the cache elevator-seeked and wrote out the data in the following order: FAT table, directory table, then file data. If the drive fails in between the FAT table and the directory table, then there will be file system corruption: the FAT points to garbage. If the drive fails between the directory table and the file data, then there will be data corruption.

However, if the drive is buffer driven, then everything will be written out in the correct order: file data first, next the directory entry, followed by the FAT table. If the drive fails at any point before the FAT table data has been written, then the worst that can occur is that there is no file or directory entry, only lost clusters. This assumes that the OS is designed for this type of failure, and all versions of DOS and Windows have been in the past.

The same effect can be shown even with a journaling file system. Elevator write caches can corrupt even a journaling file system, because the log can be updated before the data is written, causing possible corruption upon power loss. If the drives use buffers, then everything gets written out in the correct order and thus the file system can be corrected in case of power failure.
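
The FAT example above can be played out as a toy simulation in Python. This is only a sketch under the assumptions of that example: three writes, a power failure after any number of them, and a made-up consistency rule that metadata must never reach the platter before the thing it points to. The function names are invented for illustration.

```python
def surviving_writes(write_order, completed):
    """Return what is on the platter if power fails after `completed` writes."""
    return set(write_order[:completed])


def is_consistent(on_disk):
    # Metadata must never land before the thing it points to.
    if "fat" in on_disk and "directory" not in on_disk:
        return False
    if "directory" in on_disk and "file data" not in on_disk:
        return False
    return True


def crash_report(write_order):
    """Check consistency for a power failure after 0, 1, 2, ... writes."""
    return [
        is_consistent(surviving_writes(write_order, n))
        for n in range(len(write_order) + 1)
    ]


fifo_order = ["file data", "directory", "fat"]      # the order the OS issued
elevator_order = ["fat", "directory", "file data"]  # reordered, as in the example

print("FIFO buffer:   ", crash_report(fifo_order))
print("Elevator cache:", crash_report(elevator_order))
```

With the in-order buffer every crash point leaves a consistent (if incomplete) disk; with the reordered writes, the two middle crash points do not.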

I am unsure about NT/2000/XP's elevator seeking and how it operates to prevent file system corruption. My best guess is that there are certain files/locations that the OS will not elevator seek, so as to protect the file system integrity.

This is why HD's have write-buffers rather than write-caches and why they are relatively safe even when the power has been interrupted.
 

Tannin

Storage? I am Storage!
Joined
Jan 15, 2002
Messages
4,448
Location
Huon Valley, Tasmania
Website
www.redhill.net.au
The point, gentlemen, should be looming in the dark by now: hard drive cache is a myth. In a manner of speaking, there is no such thing.

Let's approach hard drive "cache" by thinking about a directly similar but better understood cache: that between the CPU and its RAM. This one, of course, is a real cache, not a buffer. I hope that by considering the two "cache" systems side by side, we can begin to see why only one of them is really a cache in the true sense of the word. If I can stay awake that long.

Cougtek, you can help me out here. I can't remember the numbers when it comes to CPUs and memory bandwidths, so I'll just make some broadly appropriate ones up to illustrate my point, and you can come along and correct them for me.

A CPU has three levels of RAM access: I'll use a Pentium III 1000 as my example. It has 32k primary cache (which is the smallest and the fastest), 256k secondary cache (which is bigger and slower), and let's say 256MB of main RAM, which is slowest of all.

(OK, I didn't stay awake that long. :( It's now Tuesday morning.)

Now the structure here is quite clear and logical: you put the smallest, fastest cache as "close" as possible to the CPU. Ideally, of course, you make that cache as large as possible. (Indeed, notice that the AMD Duron has a primary cache that is substantially larger than the secondary, but this is a special case because the Duron's caches are exclusive. Normally it would make no sense to have a secondary cache which is not substantially larger than the primary.)

But what about hard drives?

Under perfect circumstances, the system can access the hard drive's "cache" at about 100MB/sec (i.e., ATA-133 minus a bit to allow for overhead, other PCI bus activity, and the slippery size of these damn decimal megabytes.)

But the system's main memory bandwidth is vastly higher than that: even for PC-133 RAM on a 133MHz bus it's about one gigabyte per second. Main system memory, in other words, is at least eight times faster than the RAM on the hard drive which is usually and quite incorrectly called "cache".
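
The bandwidth comparison is easy to check with a few lines of Python. The one assumption not stated in the post is the 64-bit (8-byte) width of a PC-133 SDRAM data bus; the 100MB/sec drive figure is the estimate given above.

```python
BUS_MHZ = 133            # PC-133 memory clock, one transfer per cycle
BUS_WIDTH_BYTES = 8      # assumption: standard 64-bit SDRAM DIMM data bus

ram_mb_per_s = BUS_MHZ * BUS_WIDTH_BYTES   # theoretical peak, decimal MB/s
ata133_peak_mb_per_s = 133                 # interface ceiling on paper
drive_buffer_mb_per_s = 100                # the "minus a bit" estimate above

print("Main RAM peak:", ram_mb_per_s, "MB/s")
print("vs ATA-133 on paper:", ram_mb_per_s / ata133_peak_mb_per_s)
print("vs realistic drive buffer access:", ram_mb_per_s / drive_buffer_mb_per_s)
```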

That's the bandwidth. For the latency, the difference is even bigger. I don't have the figures, alas, but when you consider the need for a data request to navigate its way through the driver software, through the PCI bus, down the IDE cable to the drive, and then all the way back again, it can't even be of the same order of magnitude.

Caching hard drives is a good idea. But the drive itself is the worst possible place on which to put that cache.
 

time

Storage? I am Storage!
Joined
Jan 18, 2002
Messages
4,932
Location
Brisbane, Oz
I think you are looking at this with the wrong perspective, Tannin.

Firstly, the apparent advantages of the WD JB manifest themselves in reads, not writes. There are certainly seeking tricks they can do in the firmware to assist here, and that may in fact be part of the JB package, but the obvious use to put extra memory to is so called read-ahead caching.

This is not buffering, it is caching of speculative reads.

Your argument about cache proximity is fine up to a point, but it overlooks the fact that excessive speculative fetching by the OS cache will saturate the channel to the drive. That is, it would waste time transferring possibly useless data when it could be more profitably handling the next transaction.

Whereas the embedded drive cache can fully utilize the channel to service read requests while the drive is off doing more read ahead. If the cache is big enough and the caching algorithm good enough, the drive's ability to turn around requests would be limited solely by the channel bandwidth (in an ideal scenario).
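
To make the read-ahead idea concrete, here is a deliberately crude Python model. Everything in it is hypothetical: the class name, the fixed read-ahead depth and the oldest-first discard rule are invented, and it only counts trips to the platter. It does not model the channel being free while the drive reads ahead.

```python
class ReadAheadDrive:
    """Toy drive that fetches a few extra sectors after each requested one."""

    def __init__(self, read_ahead, buffer_sectors):
        self.read_ahead = read_ahead          # extra sectors fetched per miss
        self.buffer_sectors = buffer_sectors  # capacity of the on-drive RAM
        self.buffer = []                      # sector numbers held, oldest first
        self.media_reads = 0                  # times the heads went to the platter
        self.buffer_hits = 0                  # requests served from drive RAM

    def read(self, sector):
        if sector in self.buffer:
            self.buffer_hits += 1
            return
        self.media_reads += 1
        # Fetch the requested sector plus the ones that follow it.
        for s in range(sector, sector + 1 + self.read_ahead):
            if s not in self.buffer:
                self.buffer.append(s)
        # Discard the oldest sectors once the buffer is over capacity.
        overflow = max(0, len(self.buffer) - self.buffer_sectors)
        del self.buffer[:overflow]


drive = ReadAheadDrive(read_ahead=7, buffer_sectors=16)
for sector in range(32):                      # a purely sequential workload
    drive.read(sector)
print(drive.media_reads, "platter reads,", drive.buffer_hits, "buffer hits")
```

A sequential workload is the best case for this kind of speculation; a random one would see little or no benefit from the same buffer.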

What would the speculative reads be targeting? Apart from adjacent tracks, perhaps the drive is smart enough to focus on key parts of the file system? Of course, you could argue that this is only significant because of inefficiencies in Windows' caching strategies. I wonder if the JB might actually perform worse with a different OS?

As an alternative argument, drive cache size needs to increase anyway to better cope with increasing areal density. For example, if an 80GB drive has 40GB platters and about 55000 cylinders, that's about 1.5MB per cylinder. In other words, 2MB can only hold 1 track across all platters.

Not much scope for speculative reads there, and don't forget a portion of that goes to other duties such as write buffering.
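
The cylinder arithmetic, as a quick Python check. The 80GB capacity and 55000 cylinders are the round numbers used above; treating the 2MB buffer as 2 x 2^20 bytes is an assumption of this sketch.

```python
DRIVE_BYTES = 80 * 10**9      # 80GB, decimal as drive makers count it
CYLINDERS = 55_000            # the round figure used above
BUFFER_BYTES = 2 * 2**20      # assumption: 2MB buffer counted in binary megabytes

bytes_per_cylinder = DRIVE_BYTES / CYLINDERS
cylinders_in_buffer = BUFFER_BYTES / bytes_per_cylinder

print(f"{bytes_per_cylinder / 10**6:.2f} MB per cylinder")
print(f"a 2MB buffer holds {cylinders_in_buffer:.2f} cylinders")
```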
 

James

Storage is cool
Joined
Jan 24, 2002
Messages
844
Location
Sydney, Australia
Tannin said:
Caching hard drives is a good idea. But the drive itself is the worst possible place on which to put that cache.
No, no, no. The idea behind the buffer on the hard drive is to collect data either in front of or behind the actual data needed "for free." In other words, it is data that isn't yet required (and may never be), but can be collected for little or no loss of performance.

If all that useless (for the moment) data is transferred to the system, you are sending data you don't need across the PCI bus and therefore potentially increasing congestion and reducing the performance of other devices on the SCSI or IDE bus. If the data does end up being needed, the drive can send it at the full interface speed. If it doesn't, no great loss.

I see the drive buffer's role as being a fairly low-level one. The OS/application makes decisions about what data it is likely to need in the near future based on a set of criteria which is far more complex (and OS-specific, indeed) than can be placed on a piece of drive buffer management firmware. Oracle's caching requirements under Solaris/UltraSPARC are quite different from its caching requirements under Win2K/x86 - and SQL Server would be different again. And those are two applications whose functionality is somewhat similar - imagine the differences between Oracle and Photoshop?

Also, remember - it is a read buffer, not a write buffer/cache (that function, when required, is usually performed by the RAID controller card for obvious reasons).
 

Clocker

Storage? I am Storage!
Joined
Jan 14, 2002
Messages
3,554
Location
USA
James/Time - You did a better job of explaining what I meant. Nice work!

KCH
 

Mercutio

Fatwah on Western Digital
Joined
Jan 17, 2002
Messages
21,637
Location
I am omnipresent
Just to break in for a second: DAMN why can't we have a thread like this on SR?

I'm sure some of the new guys like cas and Sivar would have some cool things to say as well (but not me. I'm not an engineer and don't pretend to be), if they could see this...

OK, please resume arguing about cache performance.
 

P5-133XL

Xmas '97
Joined
Jan 15, 2002
Messages
3,173
Location
Salem, Or
I'm sorry, but as I read my post I realized that I made a misstatement:

P5-133XL said:
This is why HD's have write-buffers rather than write-caches and why they are relatively safe even when the power has been interrupted.

It is obvious to me that many drives now actually use a true write cache rather than just a buffer. There are obvious performance benefits to elevator seeking, and that is undoubtedly why the HD manufacturers made that decision. The reason I believe that HDs are now using caches rather than buffers is the MS Windows shutdown bug. That particular bug can and does cause file system corruption. The specifics of the bug are that with very fast processors (>500MHz) the OS tells the HD to flush its cache; however, the processor is so fast that it can sometimes finish its own shutdown tasks before the HD has finished writing out its cache. The last thing the processor does is instruct the ATX power supply to power down. If the cache has not finished emptying, then corruption of the file system can and does occur.
 

P5-133XL

Xmas '97
Joined
Jan 15, 2002
Messages
3,173
Location
Salem, Or
Mercutio said:
Just to break in for a second: DAMN why can't we have a thread like this on SR?

I'm sure some of the new guys like cas and Sivar would have some cool things to say as well (but not me. I'm not an engineer and don't pretend to be), if they could see this...

I think the ideal would be to bring those people here rather than take the discussion elsewhere.
 

Tannin

Storage? I am Storage!
Joined
Jan 15, 2002
Messages
4,448
Location
Huon Valley, Tasmania
Website
www.redhill.net.au
Mercutio said:
Just to break in for a second: DAMN why can't we have a thread like this on SR? I'm sure some of the new guys like cas and Sivar would have some cool things to say as well.
Actually, that thread is a spin-off of this one, Mercutio. If you read the fine print of Cas' first post at the start of the thread, you will see that:

Over at SR Cas said:
Even before hard drives had integrated controllers, and cache, they had buffers. As was pointed out recently by an esteemed member of our community, on another board, there is a subtle difference between a buffer and a cache.

So, Cas at least is perfectly well aware of this place - aware enough to respond to my thread within a few hours. (Nice to be an "esteemed member", by the way :) )

The way I see it, if Cas or Russofris or Sivar (or whoever) want to participate here by posting as well as reading, then they are very welcome indeed. And if they just want to read without posting, that's just fine too. That's the nice thing about discussion boards: participation is entirely voluntary.

Tim: I do not think it would be appropriate to go sending messages or anything else like that. Give people enough credit to be able to make their own decisions about this stuff. As for posting links, the rule I recommend and follow myself is this: I know that there was some friction between SR and SF some months ago, and so I simply ask myself "if I had in mind a thread at some other place, Realworld Tech or Ace's, say, would I post the link to it?" If I judge the link is relevant by those standards, then I do. If it seems superfluous, then I don't.
 

timwhit

Hairy Aussie
Joined
Jan 23, 2002
Messages
5,278
Location
Chicago, IL
Tannin said:
Tim: I do not think it would be appropriate to go sending messages or anything else like that. Give people enough credit to be able to make their own decisions about this stuff.

I was never told what the official rules are, or whether this board is supposed to be somewhat private or what. I remember references to the friction with SR. The reason that I started posting here was because someone sent me an E-mail inviting me to join.

I'm sure there are several SR regulars that don't know that this place even exists. There are also several readers that don't care to acknowledge it.

I guess it is possible to find this site by searching Google. That is if you make it to the 3rd page after looking at all the other useless links (Searching for "Storage Forum").
 

Tannin

Storage? I am Storage!
Joined
Jan 15, 2002
Messages
4,448
Location
Huon Valley, Tasmania
Website
www.redhill.net.au
No rules to speak of Tim. Just the usual and commonsense rule of courtesy to Webmaster Eugene, I think. There was extensive discussion of this here some time ago, pages and pages of it, possibly back before you joined. It got boring. Let's leave it there.

PS: that "someone" was me, I think. I always enjoyed your posts on Storage Review back in the old days. Still enjoying them now, come to that - if you care to slip back to that 10K IDE thread running at SR right now you'll see that while you were posting this I was over at SR spanking you over something you said about the Barracuda ATA 1. :)

Now, in a minute, I'm going to go home and have something to eat, and then let young Tea off the leash. She says she wants to rip into a few of the (alleged) doubtful claims and (she says) spurious arguments put forward by the unfamiliar and unholy triumvirate of Clocker, Time and James.

I, on the other hand, have the urge to play some Age of Empires. The final decision on the night's entertainment at the Tannin household will require a small and hopefully non-violent family discussion, no doubt.
 

timwhit

Hairy Aussie
Joined
Jan 23, 2002
Messages
5,278
Location
Chicago, IL
Tannin said:
PS: that "someone" was me, I think.

Nope, wrong again Tannin. ‘Twas Mercutio that invited me over here.

Ha, now I got you back for wrecking my argument over on the 10K IDE thread.
 

cas

Learning Storage Performance
Joined
May 14, 2002
Messages
111
Location
Pittsburgh, PA
timwhit said:
The reason that I started posting here was because someone sent me an E-mail inviting me to join.
I had no idea that some people got invitations.

As Tannin observed, I have been aware of SF for some time now. I resisted joining, only through the flawed logic that belonging to one board would consume less of my time, than belonging to two.

In the end, I spend plenty of time following threads on SF, so I might as well join.

As a new member of the board, I would like to thank those who are providing the infrastructure. I will try not to abuse it with my largiloquent posts.
 

Tea

Storage? I am Storage!
Joined
Jan 15, 2002
Messages
3,749
Location
27a No Fixed Address, Oz.
Website
www.redhill.net.au
Hi Cas, nice to see you over here.

Before too long you will no doubt end up like Tannin - totally unable to remember which board he is posting on without stopping to read the URL in the address window. He's made a few howlers already.
 

Mercutio

Fatwah on Western Digital
Joined
Jan 17, 2002
Messages
21,637
Location
I am omnipresent
cas said:
I had no idea that some people got invitations.

As Tannin observed, I have been aware of SF for some time now. I resisted joining, only through the flawed logic that belonging to one board would consume less of my time, than belonging to two.

Invitations were made at the first, almost entirely to SR regulars who had visible email addresses. If you didn't get one, that's probably why. Sorry. The whole matter caused a great deal of frustration on our part. I think it's still a sore point.

I don't think the traffic level is so high here you couldn't keep track. Besides, we have cool threads here, too.
 

cas

Learning Storage Performance
Joined
May 14, 2002
Messages
111
Location
Pittsburgh, PA
P5-133XL said:
It is obvious to me that many drives actually now use a true write cache rather than just a buffer.
This issue has been floating around here for a couple of days.

When I said on SR that “while a cache may be used as a buffer, not all buffers are caches”, I was using industry shorthand. The term cache, is of course short for buffer cache, which literally means a store of buffers. When used in this way, we assume that the buffers are organized such that they contain copies of information from a larger data store. Since any information written to the drive, is a part of that larger store, the cache itself is used to buffer the information. Clear? ;)

Here is what happens in a SCSI system running Windows NT.

By default, regular files opened on NT use a write back caching scheme. When I “write” to a file, I am really copying data to the system’s unified cache. In the unlikely event that I run out of cache space, my writes will block while the cache is flushed to disk, and return only when space is made available.

It’s more likely, that there will be plenty of room in the cache. In this case, dirty pages will accumulate, without actually writing to disk. Once a second IIRC, the lazy writer wakes up and schedules some of the dirty pages to be written out to disk. Eventually, all of your ‘written’ pages will be flushed this way. If you are in a hurry, you can call FlushFileBuffers, which will block until all of the dirty pages associated with your file, have been flushed.
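
As a rough illustration of "flush before you carry on" from application code, here is a Python sketch. The function name `write_durably` is invented for the example. On Windows, Python documents `os.fsync` as using the C runtime's `_commit`, which as far as I know ends up in FlushFileBuffers, but treat that mapping as an assumption rather than gospel.

```python
import os
import tempfile


def write_durably(path, data):
    """Write `data` to `path` and wait until the OS reports it flushed."""
    with open(path, "wb") as f:
        f.write(data)          # lands in Python's own user-space buffer first
        f.flush()              # hand the bytes to the operating system's cache
        os.fsync(f.fileno())   # block until the OS says the file is flushed


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "journal.bin")
    write_durably(target, b"commit record")
    print("wrote", os.path.getsize(target), "bytes")
```

Note that this only covers the operating system's cache. What the drive does with the data once it has it is a separate question.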

There are a couple of things you can do to modify the operation of the lazy writer. If the file in question is just a temporary file, which is about to be read, you probably don’t want to write it to disk at all. If you let NT know this, the lazy writer won’t flush your pages, unless it starts to run out of memory.

If you are concerned about the integrity of your data, you can open the file with FILE_FLAG_WRITE_THROUGH, which disables write back caching on that file. In truth, writes are still cached, but they are flushed before the write call returns.

Or are they?

Just because the device completes a SCSI request, that does not necessarily mean that data has been encoded on the media. As many have observed, the drives themselves may be operating their caches in write back mode. How then can file systems like NTFS be sure that writes are done in the proper order?

If you look carefully at the various SCSI write commands, all but the six-byte command contain an FUA, or Force Unit Access, bit that will override the global cache settings. If this bit is set, the command will not be returned GOOD until the data has been safely stored on the disk.
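
For the curious, this is roughly what the FUA bit looks like in a ten-byte write command, sketched in Python. The helper name `build_write10_cdb` is invented; the layout (opcode 0x2A, FUA as bit 3 of byte 1, big-endian LBA and transfer length) is the WRITE(10) format as I understand it from the SCSI block command documents, so check the spec before relying on it. The sketch only builds the bytes; it does not send anything to a device.

```python
import struct

WRITE_10_OPCODE = 0x2A
FUA_BIT = 0x08   # bit 3 of byte 1 in WRITE(10)


def build_write10_cdb(lba, blocks, fua=False):
    """Build a 10-byte WRITE(10) command descriptor block."""
    flags = FUA_BIT if fua else 0
    # Fields: opcode, flags, 4-byte LBA, reserved/group byte,
    # 2-byte transfer length, control byte. All big-endian.
    return struct.pack(">BBIBHB", WRITE_10_OPCODE, flags, lba, 0, blocks, 0)


cdb = build_write10_cdb(lba=0x1234, blocks=8, fua=True)
print(cdb.hex(" "))
```

The six-byte WRITE has no room for such a flag, which fits the remark above about it being the odd one out.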
 

cas

Learning Storage Performance
Joined
May 14, 2002
Messages
111
Location
Pittsburgh, PA
Mercutio said:
If you didn't get one, that's probably why.
I was just yanking your chain. I knew I could join at any time.

BTW Above you mention that you are not an engineer, but I could have sworn that I once read you briefly had a job as a programmer over a garage or a machine shop.

Am I confusing you with someone else?
 

Mercutio

Fatwah on Western Digital
Joined
Jan 17, 2002
Messages
21,637
Location
I am omnipresent
I have a BSCS and a set of skills that I try very hard to forget about. I always liked the IT (service) side of computer work, and I find real-life programming highly unsatisfying. And yes, I worked as a programmer, in an office I shared with three other people, above a heavy-construction-equipment garage and wash.
 

P5-133XL

Xmas '97
Joined
Jan 15, 2002
Messages
3,173
Location
Salem, Or
cas said:
If you look carefully at the various SCSI write commands. All but the six byte command, contain a FUA or Force Unit Access bit that will override the global cache settings. If this bit is set, the command will not be returned GOOD, until the data has been safely stored on the disk.

If you look at the ATA command set, there isn't a unified method of dealing with a cache. Before ATA-4, all you could do was turn the cache on or off, or set its size. ATA-5 allows for flushing the cache, but it is an optional command. In ATA-6/7 the flush command is mandatory. I looked and saw no ATA equivalent of that SCSI flag.
 

cas

Learning Storage Performance
Joined
May 14, 2002
Messages
111
Location
Pittsburgh, PA
Mercutio said:
I find real-life programming highly unsatisfying. And yes, I worked as a programmer, in an office I shared with three other people, above a heavy-construction-equipment garage and wash.
Tom Demarco would not have approved of those conditions.

I am sorry to hear that you had such a negative experience as a programmer. Having seen the many ways in which you have helped people with tricky IT problems on SR however, I suspect you are right where you belong.
 