Typical symptoms of memory running beyond its capabilities

Jake the Dog

Storage is cool
Joined
Jan 27, 2002
Messages
895
Location
melb.vic.au
hi,

excluding the CPU, or a board's ability to run a very high FSB, what sort of stability issues are typical of RAM being run faster than it can cope with?

I have DDR400 RAM rated at CAS2.5 which I run at CAS2 @ 200MHz (it can and will run CAS2 @ 211MHz - I haven't tried further yet). I have the occasional stability issue (bluescreens), and having considered and tested just about everything else, I'm now looking at the RAM as a source of the bluescreens. I won't get the chance to do some testing tonight, so I thought I'd ask all the well-informed folks in preparation for further problem finding tomorrow.

before people make suggestions regarding other components please note that I have considered and tested the following things:

CPU's ability to run a high FSB
CPU core voltage adjusted (no diff from 1.65-1.85v except temps!)
mobo's ability to run a high FSB
PSU is supplying ample power at correct voltages
CPU and NB are well cooled
CPU & mobo temps are good
PCI/AGP speeds running at 33/66MHz respectively

I'm just looking for some info, help, comments, etc. on RAM issues at high speeds. would increasing DIMM voltage help the RAM run at CAS2 instead of CAS2.5 at its rated speed?

TIA for any help received :)
 

P5-133XL

Xmas '97
Joined
Jan 15, 2002
Messages
3,173
Location
Salem, Or
RAM stability issues can have virtually any symptom because just about everything goes through RAM. That said, they normally show up as blue screens or application errors, but occasionally as file system or drive corruption instead.

Try running memtest86 for an extended amount of time (overnight). It will normally exercise your ram enough to detect RAM problems.
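For anyone wondering what a pattern test of this kind actually does, here is a toy sketch of a "moving inversions" style check. It is only an illustration: `SimulatedMemory` and its stuck bit are made up for the example, the code runs against a simulated buffer rather than physical RAM, and it is not a copy of memtest86's own implementation.

```python
class SimulatedMemory:
    """Hypothetical stand-in for RAM; one cell may have bit 0 stuck at 1."""

    def __init__(self, size, stuck_index=None):
        self.cells = bytearray(size)
        self.stuck_index = stuck_index

    def __len__(self):
        return len(self.cells)

    def __setitem__(self, index, value):
        self.cells[index] = value

    def __getitem__(self, index):
        value = self.cells[index]
        if index == self.stuck_index:
            value |= 0x01  # the faulty cell always reads bit 0 as set
        return value


def moving_inversions(mem, pattern=0x55):
    """Return the sorted indices of cells that read back a wrong value."""
    inverse = pattern ^ 0xFF
    bad = set()

    # Fill every cell with the pattern.
    for i in range(len(mem)):
        mem[i] = pattern

    # Ascending pass: verify the pattern, then write its inverse.
    for i in range(len(mem)):
        if mem[i] != pattern:
            bad.add(i)
        mem[i] = inverse

    # Descending pass: verify the inverse, then restore the pattern.
    for i in reversed(range(len(mem))):
        if mem[i] != inverse:
            bad.add(i)
        mem[i] = pattern

    return sorted(bad)


healthy = moving_inversions(SimulatedMemory(64))
faulty = moving_inversions(SimulatedMemory(64, stuck_index=10))
print("healthy:", healthy)
print("faulty:", faulty)
```

Checking both a pattern and its inverse is what catches a stuck bit: one of the two values always disagrees with it. Marginal RAM tends to fail only intermittently, which is why a long run matters more than a single pass.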
 

time

Storage? I am Storage!
Joined
Jan 18, 2002
Messages
4,932
Location
Brisbane, Oz
I believe you are right, your RAM is the limiting factor. But you would probably have to spend two to four times as much to improve on it, with products such as Corsair XMS.

Be very careful about increasing RAM voltage. I've read that the latest modules don't like anything above 2.5V (to the point of damage), and I received a lecture to that effect from a memory manufacturer engineer. He suggested that overvoltage was a cause for RAM instability, not stability. I suggest you read up on it before trying it ...
 

blakerwry

Storage? I am Storage!
Joined
Oct 12, 2002
Messages
4,203
Location
Kansas City, USA
Website
justblake.com
time said:
I believe you are right, your RAM is the limiting factor. But you would probably have to spend two to four times as much to improve on it, with products such as Corsair XMS.

Be very careful about increasing RAM voltage. I've read that the latest modules don't like anything above 2.5V (to the point of damage), and I received a lecture to that effect from a memory manufacturer engineer. He suggested that overvoltage was a cause for RAM instability, not stability. I suggest you read up on it before trying it ...

I've heard the same thing. Personally, I've never believed that increasing RAM voltage was a smart idea, 1) because they *will* fry and 2) because there's little or no benefit.

Epox made their default RAM voltage on one of their boards higher than the 2.5v default for DDR (I think it was 2.6v). Apparently several people have noticed that setting the voltage back down to 2.5v helps their overclocking attempts.


I agree with P5 on this one completely.
 

Jake the Dog

Storage is cool
Joined
Jan 27, 2002
Messages
895
Location
melb.vic.au
well I did manage to get to my PC. I slowed it down a little (11x200MHz) and ran a Windows MemTest for a few hours. the result: no errors. I need to run it overnight to be sure the RAM is OK but so far it looks good.

I'm wary of increasing DIMM voltage as well, hence I asked the question. I haven't done it so far.

My CPU die temps seem to be higher than other people are reporting despite my semi-capable and lapped HSF. I'm thinking about water cooling now...
 

Jake the Dog

Storage is cool
Joined
Jan 27, 2002
Messages
895
Location
melb.vic.au
btw, I'm not convinced it's the RAM because it's good RAM. it's OCZ DDR400 Basic Series RAM, which, whilst not extreme low latency RAM, is of high quality design, componentry and construction. its SPD timings are 2.5-3-3-9 but I have successfully run this RAM at 2-2-2-5 @ 170MHz and 2-3-3-6 @ 211MHz (haven't tried higher yet). bear in mind that this was all in dual channel mode too.

I'm just going through the motions of eliminating it as a source of my occasional bluescreens.
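To put the CAS settings mentioned in this thread on a common footing, the latency can be converted from clock cycles to nanoseconds. This is just a rough sketch: `cas_latency_ns` is a helper name made up for the example, and it uses the nominal memory clock, ignoring every timing other than CAS.

```python
def cas_latency_ns(cas_cycles, clock_mhz):
    """CAS latency in nanoseconds: cycles multiplied by the clock period."""
    return cas_cycles * 1000.0 / clock_mhz


# Settings quoted in the thread: (CAS cycles, memory clock in MHz).
settings = {
    "SPD rating, CAS2.5 @ 200MHz": (2.5, 200),
    "CAS2 @ 170MHz": (2, 170),
    "CAS2 @ 200MHz": (2, 200),
    "CAS2 @ 211MHz": (2, 211),
}

for label, (cas, mhz) in settings.items():
    print(f"{label}: {cas_latency_ns(cas, mhz):.2f} ns")
```

At the rated CAS2.5 @ 200MHz the chips get 12.5 ns to respond, while CAS2 @ 200MHz gives them only 10 ns, so the tighter setting asks for a noticeably faster response than the SPD rating promises even before the clock is raised.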
 

time

Storage? I am Storage!
Joined
Jan 18, 2002
Messages
4,932
Location
Brisbane, Oz
On a slightly related note, does anyone know a handy utility to read the Athlon on-die temperature, rather than the socket?
 

LiamC

Storage Is My Life
Joined
Feb 7, 2002
Messages
2,016
Location
Canberra
JTD,
try setting the FSB to 200MHz and the memory to 166MHz or 133MHz. Test.

What this will tell you - your board can (or cannot) handle high FSB. Your memory should run fine at 166 (PC2700) speeds.

Next set your FSB to 166MHz and memory to 200MHz. If the problems reoccur, then the problem may be the northbridge. As your memory is rated for 200MHz operation, it is probably the CPU to northbridge link, or the northbridge itself, that is not handling things. I recently had to do these tests with an EPoX 8RDA+ and came to the conclusion that it's the NB that is holding things back. Next item is to lap the NB heatsink and add a fan and see if this improves things. There is a chipset voltage mod doing the rounds as well that may help. This should confirm things. HTH
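The elimination tests above can be written down as a small decision function. This is only a sketch of one way to read the results: the function name, its arguments and the wording of the verdicts are made up for the example, and real instability is rarely this clear-cut.

```python
def likely_culprit(fsb200_mem166_stable, fsb166_mem200_stable,
                   fsb200_mem200_stable):
    """Map the outcome of the three stability tests to a suspect.

    Each argument is True when the system was stable at that
    FSB/memory combination (both in MHz).
    """
    if not fsb200_mem166_stable:
        # Memory is well within spec here, so the fast FSB is suspect.
        return "FSB side: CPU or CPU-to-northbridge link at 200MHz"
    if not fsb166_mem200_stable:
        # FSB is at its rated speed here, so look at the memory side.
        return "memory side: RAM or northbridge memory interface at 200MHz"
    if not fsb200_mem200_stable:
        # Each half works alone; only the combination fails.
        return "northbridge or board traces when both run at 200MHz"
    return "no fault isolated by these tests"


# Example: each half is stable on its own, but 200/200 is not.
print(likely_culprit(True, True, False))
```

A separate memtest86 pass at 200MHz helps split the second verdict further: if the modules test clean, the memory interface of the northbridge becomes the more likely suspect.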
 

time

Storage? I am Storage!
Joined
Jan 18, 2002
Messages
4,932
Location
Brisbane, Oz
Thanks Tea, but with the Epox 8RDA+ that utility shows on-die temp at about 23C, while socket temperature is 41C. :(

LiamC, what is the significance of running the FSB and RAM asynchronously? If the FSB works at 200MHz+, why does failure at 166MHz indicate a Northbridge problem? And above all, how do you arrive at the conclusion that the Northbridge is getting too hot from observing unstable operation at 166MHz?
 

Jake the Dog

Storage is cool
Joined
Jan 27, 2002
Messages
895
Location
melb.vic.au
thanks for your input Liam but I already went through the process you described last week. also, I have already lapped the "fanned" NB HS. (in fact, this was the first thing I did when I got this mobo home!). having done both these things, I've ascertained that this mobo can handle high FSB frequencies (as I mentioned in my first post :p)


time: the Soltek HW monitor displays the CPU die temp and socket temp arse-about. your true on-die temp is thus 41C.
 

LiamC

Storage Is My Life
Joined
Feb 7, 2002
Messages
2,016
Location
Canberra
Time,

What we want to do is run the FSB at its rated speed so we know there is little chance of error. The NB controls access to SB, memory and AGP.

If we have stable operation at stock speeds (which is only an assumption) and the system isn't being run way out of spec, i.e. an 1800MHz processor @ 2200MHz, then instability at overclocked FSB/memory settings means some specific part has failed to keep up.

If you run the FSB @ 200 (stock is 166) but the memory @ 166 or 133, and the system fails, then we can safely assume that the 200MHz FSB is the culprit.

If the system runs flawlessly (as happened to me) then the CPU-NB connection is OK - i.e. data is getting to the NB without corruption.

If memtest86 says the memory itself is fine when running @ 200MHz, then something in the NB memory connection is not holding its end up. The traces on the MB are just that so are unlikely to fail, but it's not unheard of. If it's a dual channel board, check each channel individually. Doesn't eliminate the possibility, but does make it less likely.

So if the memory checks out and the CPU/FSB checks out, then it is either the traces or the NB that is failing the 200/200 test. By running the FSB @ 166 or 133, we are eliminating the possibility that some weird corruption is happening at high bus speeds. My understanding is that high speed circuit design is done in absolute time intervals, i.e. it will take x nanoseconds for the clock pulse to get from A to B. If you raise the FSB, the time available per clock is (in the case of 166FSB v 200FSB) about 17% shorter, because the clock is 20% faster! Maybe the circuits weren't designed to operate that fast? So let's eliminate the possibility - or at least take steps to reduce it. Just trying to be sure.
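The timing argument is easy to check with a couple of lines of arithmetic. A quick sketch, assuming nominal clocks of 166MHz and 200MHz (the helper names are made up for the example):

```python
def period_ns(clock_mhz):
    """Length of one clock cycle in nanoseconds."""
    return 1000.0 / clock_mhz


def period_reduction(old_mhz, new_mhz):
    """Fraction by which the clock period shrinks when the clock is raised."""
    return 1.0 - period_ns(new_mhz) / period_ns(old_mhz)


old_mhz, new_mhz = 166, 200
print(f"{old_mhz}MHz period: {period_ns(old_mhz):.2f} ns")
print(f"{new_mhz}MHz period: {period_ns(new_mhz):.2f} ns")
print(f"period reduction: {period_reduction(old_mhz, new_mhz):.1%}")
```

The period shrinks by a little less than the frequency rises, since the two are reciprocals, but either way a signal path that was designed around a 6 ns window now has roughly 5 ns to settle.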


As for the too hot scenario, you read something into what I wrote that I didn't say. The NB on my EPoX does get hot - very warm in fact. Heat can cause ICs to fail. By raising operating frequencies from 166 to 200, you are causing (some) circuits to switch faster, which generates more heat. I am going to investigate whether this is the case (and the cause of failure). I didn't say it was the cause. I observed the heat simply by putting a finger on the NB heatsink and SB whilst the board was running at various speeds. It does get quite warm.

Have I missed something or jumped to an erroneous conclusion?
 